This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH] Simple optimization for MASK_STORE.


Hi Richard,

I send you updated version of patch which contains fixes you mentioned
and additional early exit in
register_edge_assert_for() for gcond with vector comparison - it tries
to produce assert for
  if (vect != {0,0,0,0}) but can't create such constant. This is not
essential since this is applied to very specialized context.

My answers are below.

2015-11-12 16:58 GMT+03:00 Richard Biener <richard.guenther@gmail.com>:
> On Wed, Nov 11, 2015 at 2:13 PM, Yuri Rumyantsev <ysrumyan@gmail.com> wrote:
>> Richard,
>>
>> What we should do to cope with this problem (structure size increasing)?
>> Should we return to vector comparison version?
>
> Ok, given this constraint I think the cleanest approach is to allow
> integer(!) vector equality(!) compares with scalar result.  This should then
> expand via cmp_optab and not via vec_cmp_optab.

  In fact it is expanded through cbranch_optab since the only use of
such comparison is for masked
store motion
>
> On gimple you can then have
>
>  if (mask_vec_1 != {0, 0, .... })
> ...
>
> Note that a fallback expansion (for optabs.c to try) would be
> the suggested view-conversion (aka, subreg) variant using
> a same-sized integer mode.
>
> Target maintainers can then choose what is a better fit for
> their target (and instruction set as register set constraints may apply).
>
> The patch you posted seems to do this but not restrict the compares
> to integer ones (please do that).
>
>        if (TREE_CODE (op0_type) == VECTOR_TYPE
>           || TREE_CODE (op1_type) == VECTOR_TYPE)
>          {
> -          error ("vector comparison returning a boolean");
> -          debug_generic_expr (op0_type);
> -          debug_generic_expr (op1_type);
> -          return true;
> +         /* Allow vector comparison returning boolean if operand types
> +            are equal and CODE is EQ/NE.  */
> +         if ((code != EQ_EXPR && code != NE_EXPR)
> +             || TREE_CODE (op0_type) != TREE_CODE (op1_type)
> +             || TYPE_VECTOR_SUBPARTS (op0_type)
> +                != TYPE_VECTOR_SUBPARTS (op1_type)
> +             || GET_MODE_SIZE (TYPE_MODE (TREE_TYPE (op0_type)))
> +                != GET_MODE_SIZE (TYPE_MODE (TREE_TYPE (op1_type))))
>
> These are all checked with the useless_type_conversion_p checks done earlier.
>
> As said I'd like to see a
>
>                 || ! VECTOR_INTEGER_TYPE_P (op0_type)

  I added check on VECTOR_BOOLEAN_TYPE_P (op0_type) instead since type
of mask was changed.
>
> check added so we and targets do not need to worry about using EQ/NE vs. CMP
> and worry about signed zeros and friends.
>
> +           {
> +             error ("type mismatch for vector comparison returning a boolean");
> +             debug_generic_expr (op0_type);
> +             debug_generic_expr (op1_type);
> +             return true;
> +           }
>
>
>
> --- a/gcc/tree-ssa-forwprop.c
> +++ b/gcc/tree-ssa-forwprop.c
> @@ -422,6 +422,15 @@ forward_propagate_into_comparison_1 (gimple *stmt,
>           enum tree_code def_code = gimple_assign_rhs_code (def_stmt);
>           bool invariant_only_p = !single_use0_p;
>
> +         /* Can't combine vector comparison with scalar boolean type of
> +            the result and VEC_COND_EXPR having vector type of comparison.  */
> +         if (TREE_CODE (TREE_TYPE (op0)) == VECTOR_TYPE
> +             && INTEGRAL_TYPE_P (type)
> +             && (TREE_CODE (type) == BOOLEAN_TYPE
> +                 || TYPE_PRECISION (type) == 1)
> +             && def_code == VEC_COND_EXPR)
> +           return NULL_TREE;
>
> this hints at larger fallout you paper over here.  So this effectively
> means we're trying combining (vec1 != vec2) != 0 for example
> and that fails miserably?  If so then the solution is to fix whatever
> does not expect this (valid) GENERIC tree.

  I changed it to the following check in combine_cond_expr_cond:
  /* Do not perform combining it types are not compatible.  */
  if (TREE_CODE (TREE_TYPE (op0)) == VECTOR_TYPE
      && !tree_int_cst_equal (TYPE_SIZE (type), TYPE_SIZE (TREE_TYPE (op0))))
    return NULL_TREE;
>
> +  if (ENABLE_ZERO_TEST_FOR_MASK_STORE == 0)
> +    return;
>
> not sure if I like a param more than a target hook ... :/
  I introduced the param instead of a target hook to make this
transformation user visible, i.e. to switch it on/off
for different targets.
>
> +      /* Create vector comparison with boolean result.  */
> +      vectype = TREE_TYPE (mask);
> +      zero = build_zero_cst (TREE_TYPE (vectype));
> +      zero = build_vector_from_val (vectype, zero);
>
> build_zero_cst (vectype);

Done.
>
> +      stmt = gimple_build_cond (EQ_EXPR, mask, zero, NULL_TREE, NULL_TREE);
>
> you can omit the NULL_TREE operands.

  I did not find such definition for it.
>
> +      gcc_assert (vdef && TREE_CODE (vdef) == SSA_NAME);
>
> please omit the assert.

  Done.
>
> +      gimple_set_vdef (last, new_vdef);
>
> do this before you create the PHI.
>
  Done.
> +         /* Put definition statement of stored value in STORE_BB
> +            if possible.  */
> +         arg3 = gimple_call_arg (last, 3);
> +         if (TREE_CODE (arg3) == SSA_NAME && has_single_use (arg3))
> +           {
> ...
>
> is this really necessary?  It looks incomplete to me anyway.  I'd rather have
> a late sink pass if this shows necessary.  Btw,..

 I tried to avoid creation of multiple ne basic blocks for the same
mask and also I don't want to put all semi-hammock guarded by this
mask to separate block to keep it small enough since x86 chips prefer
short branches. Note also that icc does almost the same.
>
> +                it is legal.  */
> +             if (gimple_bb (def_stmt) == bb
> +                 && is_valid_sink (def_stmt, last_store))
>
> with the implementation of is_valid_sink this is effectively
>
>    && (!gimple_vuse (def_stmt)
>           || gimple_vuse (def_stmt) == gimple_vdef (last_store))
I did inlining of correspondent part of "is_valif_sink" to this place.
>
> I still think this "pass" is quite a hack, esp. as it appears as generic
> code in a GIMPLE pass.  And esp. as this hack seems to be needed
> for Haswell only, not Boradwell or Skylake.

This is not truth since for all them this transformation is performed
for skylake and broadwell since both them
belong to HASWELL family.

>
> Thanks,
> Richard.
>

ChangeLog:
2015-11-19  Yuri Rumyantsev  <ysrumyan@gmail.com>

* config/i386/i386.c: Add conditional initialization of
PARAM_ZERO_TEST_FOR_MASK_STORE.
(ix86_expand_branch): Implement vector comparison with boolean result.
* config/i386/i386.h: New macros TARGET_OPTIMIZE_MASK_STORE.
* config/i386/sse.md (define_expand "cbranch<mode>4): Add define-expand
for vector comparion with eq/ne only.
* config/i386/x86-tune.def: New macros X86_TUNE_OPTIMIZE_MASK_STORE.
* fold-const.c (fold_relational_const): Add handling of vector
comparison with boolean result.
* params.def (PARAM_ZERO_TEST_FOR_MASK_STORE): New DEFPARAM.
* params.h (ENABLE_ZERO_TEST_FOR_MASK_STORE): New macros.
* tree-cfg.c (verify_gimple_comparison): Add argument CODE, allow
comparison of vector operands with boolean result for EQ/NE only.
(verify_gimple_assign_binary): Adjust call for verify_gimple_comparison.
(verify_gimple_cond): Likewise.
* tree-ssa-forwprop.c (combine_cond_expr_cond): Do not perform
combining for non-compatible vector types.
* tree-vect-loop.c (is_valid_sink): New function.
(optimize_mask_stores): Likewise.
* tree-vect-stmts.c (vectorizable_mask_load_store): Initialize
has_mask_store field of vect_info.
* tree-vectorizer.c (vectorize_loops): Invoke optimaze_mask_stores for
vectorized loops having masked stores.
* tree-vectorizer.h (loop_vec_info): Add new has_mask_store field and
correspondent macros.
(optimize_mask_stores): Add prototype.
* tree-vrp.c (register_edge_assert_for): Do not handle NAME with vector
type.

gcc/testsuite/ChangeLog:
* gcc.target/i386/avx2-vect-mask-store-move1.c: New test.
* gcc.target/i386/avx2-vect-mask-store-move2.c: Likewise.

Attachment: patch.6
Description: Binary data


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]