Vector Comparison patch

Artem Shinkarov artyom.shinkaroff@gmail.com
Mon Aug 15 17:53:00 GMT 2011


On Mon, Aug 15, 2011 at 3:24 PM, Richard Guenther
<richard.guenther@gmail.com> wrote:
> On Fri, Aug 12, 2011 at 4:03 AM, Artem Shinkarov
> <artyom.shinkaroff@gmail.com> wrote:
>> Hi
>>
>> Here is a completed version of the vector comparison patch we
>> discussed a long time ago here:
>> http://gcc.gnu.org/ml/gcc-patches/2010-08/msg01184.html
>>
>> The patch implements vector comparison according to the OpenCL
>> standard, when the result of the comparison of two vectors is vector
>> of signed integers, where -1 represents true and 0 false.
>>
>> The patch implements vector conditional res = VCOND<V1 ? V2 : V3>
>> which is expanded into:
>> foreach (i in length (V1)) res[i] = V1 == 0 ? V3[i] : V2[i].
>
> Some comments on the patch below.  First, in general I don't see
> why you need a new target hook to specify whether to "vectorize"
> a comparison.  Why are the existing hooks used by the vectorizer
> not enough?
>
> Index: gcc/fold-const.c
> ===================================================================
> --- gcc/fold-const.c    (revision 177665)
> +++ gcc/fold-const.c    (working copy)
> @@ -9073,34 +9073,61 @@ fold_comparison (location_t loc, enum tr
>      floating-point, we can only do some of these simplifications.)  */
>   if (operand_equal_p (arg0, arg1, 0))
>     {
> -      switch (code)
> +      if (TREE_CODE (TREE_TYPE (arg0)) == VECTOR_TYPE)
>        {
> -       case EQ_EXPR:
> -         if (! FLOAT_TYPE_P (TREE_TYPE (arg0))
> -             || ! HONOR_NANS (TYPE_MODE (TREE_TYPE (arg0))))
> -           return constant_boolean_node (1, type);
>
> I think this change should go in a separate patch for improved
> constant folding.  It shouldn't be necessary for enabling vector compares, no?

Unfortunately no, this case must be covered here, otherwise x != x
condition fails.

>
> +      if (TYPE_UNSIGNED (TREE_TYPE (TREE_TYPE (ifexp))))
> +        {
> +          error_at (colon_loc, "vector comparison must be of signed "
> +                              "integer vector type");
> +          return error_mark_node;
> +        }
>
> why that?

Well, later on I rely on this fact. I mean OpenCL says that it should
return -1 in the sense that all bits set. I don't really know, I can
support unsigned masks as well, but wouldn't it just introduce a
source for possible errors. I mean that natural choice for true and
flase is 0 and 1, not 0 and -1. Anyway I don't have a strong opinion
there, and I could easily adjust it if we decide that we want it.

>
> +      if (TYPE_VECTOR_SUBPARTS (type1) != TYPE_VECTOR_SUBPARTS (type2)
> +          || TYPE_VECTOR_SUBPARTS (TREE_TYPE (ifexp))
> +             != TYPE_VECTOR_SUBPARTS (type1))
> +        {
> +          error_at (colon_loc, "vectors of different length found in "
> +                               "vector comparison");
> +          return error_mark_node;
> +        }
>
> I miss verification that type1 and type2 are vector types, or is that done
> elsewhere?  I think type1 and type2 are already verified to be
> compatible (but you might double-check).  At least the above would be
> redundant with

Thanks, type1 and type2 both vectors comparison is missing, going to
be added in the new version of the patch.
>
> +      if (type1 != type2)
> +        {
> +          error_at (colon_loc, "vectors of different types involved in "
> +                               "vector comparison");
> +          return error_mark_node;
> +        }

You are right, what I meant here is TREE_TYPE (type1) != TREE_TYPE
(type2), because vector (4, int) have the same number of elements as
vector (4, float). This would be fixed in the new version.

>
> Joseph may have comments about the fully-fold stuff that follows.
>
> +      /* Currently the expansion of VEC_COND_EXPR does not allow
> +        expessions where the type of vectors you compare differs
> +        form the type of vectors you select from. For the time
> +        being we insert implicit conversions.  */
> +      if ((COMPARISON_CLASS_P (ifexp)
>
> Why only for comparison-class?
Not only, there is || involved:
(COMPARISON_CLASS_P (ifexp)  && TREE_TYPE (TREE_OPERAND (ifexp, 0)) != type1)
|| TREE_TYPE (ifexp) != type1

So if this is a comparison class, we check the first operand, because
the result of the comparison fits, however the operands could not. In
case we have an expression of signed vector, we know that we would
transform it into exp != {0,0,...} in tree-vect-generic.c, but if the
types of operands do not match we convert them.

>
> +          && TREE_TYPE (TREE_OPERAND (ifexp, 0)) != type1)
> +         || TREE_TYPE (ifexp) != type1)
> +       {
> +         tree comp_type = COMPARISON_CLASS_P (ifexp)
> +                          ? TREE_TYPE (TREE_OPERAND (ifexp, 0))
> +                          : TREE_TYPE (ifexp);
> +         tree vcond;
> +
> +         op1 = convert (comp_type, op1);
> +         op2 = convert (comp_type, op2);
> +         vcond = build3 (VEC_COND_EXPR, comp_type, ifexp, op1, op2);
> +         vcond = convert (type1, vcond);
> +         return vcond;
> +       }
> +      else
> +       return build3 (VEC_COND_EXPR, type1, ifexp, op1, op2);
>
> In the end we of course will try to fix the middle-end/backends to
> allow mixed types here as the current restriction doesn't really make sense.

Yes, that would be nice, but these conversions do not really affect
the code generation, so for the time being I think it is fine to have
them.

>
>     case EQ_EXPR:
>     case NE_EXPR:
> +      if (code0 == VECTOR_TYPE && code1 == VECTOR_TYPE)
> +        {
> +          tree intt;
> +          if (TREE_TYPE (type0) != TREE_TYPE (type1))
> +            {
> +              error_at (location, "comparing vectors with different "
> +                                  "element types");
> +              return error_mark_node;
> +            }
> +
> +          if (TYPE_VECTOR_SUBPARTS (type0) != TYPE_VECTOR_SUBPARTS (type1))
> +            {
> +              error_at (location, "comparing vectors with different "
> +                                  "number of elements");
> +              return error_mark_node;
> +            }
>
> as above - compatibility should already be ensured, thus type0 == type1
> here?

Yeah, we know that they are both vector types, but that is about all
we know. Anyhow, all these errors are reachable. As an example see
vector-compare-1.c:
r4 = x > y;  /* { dg-error "comparing vectors with different element types" } */
r8 == r4; /* { dg-error "comparing vectors with different number of
elements"} */

>
> +/* Try a hardware hook for vector comparison or
> +   extract comparison piecewise.  */
> +static tree
> +expand_vector_comparison (gimple_stmt_iterator *gsi, tree type, tree op0,
> +                          tree op1, enum tree_code code)
>
> comments should mention and describe all function arguments.

Ok, coming in the new version of the patch.

> +/* Expand vector condition EXP which should have the form
> +   VEC_COND_EXPR<cond, vec0, vec1> into the following
> +   vector:
> +     {cond[i] != 0 ? vec0[i] : vec1[i], ... }
> +   i changes from 0 to TYPE_VECTOR_SUBPARTS (TREE_TYPE (vec0)).  */
> +static tree
> +earlyexpand_vec_cond_expr (gimple_stmt_iterator *gsi, tree exp)
>
> that would be expand_vec_cond_expr_piecewise, no?

Adjusted.

>
> +  /* Ensure that we will be able to expand vector comparison
> +     in case it is not supported by the architecture.  */
> +  gcc_assert (COMPARISON_CLASS_P (cond));
>
> that looks dangerous to me - did you try
>
>  vec = v1 <= v2;
>  vec2 = vec ? v1 : v2;
>
> without optimization?

Sure, tests should cover this case.
I have this assertion there because only two cases are possible:
1) it is a comparison
2) function callee converted expr to expr != {0,0,...}
So we should be perfectly fine.

>
> +  /* Check if we need to expand vector condition inside of
> +     VEC_COND_EXPR.  */
> +  var = create_tmp_reg (TREE_TYPE (cond), "cond");
> +  new_rhs = expand_vector_comparison (gsi, TREE_TYPE (cond),
> +                                      TREE_OPERAND (cond, 0),
> +                                     TREE_OPERAND (cond, 1),
> +                                      TREE_CODE (cond));
>
> That unconditionally expands, so no need for "Check".

Ok.

>
> +  /* Expand VEC_COND_EXPR into a vector of scalar COND_EXPRs.  */
> +  v = VEC_alloc(constructor_elt, gc, nunits);
> +  for (i = 0; i < nunits;
> +       i += 1, index = int_const_binop (PLUS_EXPR, index, part_width))
> +    {
> +      tree tcond = tree_vec_extract (gsi, inner_type, var, part_width, index);
> +      tree a = tree_vec_extract (gsi, inner_type, vec0, part_width, index);
> +      tree b = tree_vec_extract (gsi, inner_type, vec1, part_width, index);
> +      tree rcond = gimplify_build2 (gsi, NE_EXPR, boolean_type_node, tcond,
> +                                    build_int_cst (inner_type ,0));
>
> I seriously doubt that when expanding this part piecewise expanding
> the mask first in either way is going to be beneficial.  Instead I would
> suggest to "inline" the comparison here.  Thus instead of

Well, the ting is that, if expand_vector_comparison, would insert
builtin there rather than expanding the code piecewise, I'll have to
do the comparison with 0 anyway, because true is expressed as -1
there.

Well, I would hope that in case we have:
c_0 = a_0 > b_0;
d_0 = c_0 != 0;

{d_0, d_1,...}

all the d_n should be constant-folded, or should I pull fold explicitly here?

1) I construct the mask
>
>  mask =
>         = { mask[0] != 0 ? ... }
>
> do
>
>          = { c0[0] < c1[0] ? ..., }
>
> or even expand the ? : using mask operations if we efficiently can
> create that mask.
>

I assume that if we cannot expand VEC_COND_EXPR, then masking the
elements is a problem for us. Otherwise VEC_COND_EXPE expansion has a
bug somewhere. Or I am wrong somewhere?

>
> +  /* Check if VEC_COND_EXPR is supported in hardware within the
> +     given types.  */
> +  if (code == VEC_COND_EXPR)
> +    {
> +      tree exp = gimple_assign_rhs1 (stmt);
> +      tree cond = TREE_OPERAND (exp, 0);
> +
> +      /* If VEC_COND_EXPR is presented as A ? V0 : V1, we
> +        change it to A != {0,0,...} ? V0 : V1  */
> +      if (!COMPARISON_CLASS_P (cond))
> +       TREE_OPERAND (exp, 0) =
> +           build2 (NE_EXPR, TREE_TYPE (cond), cond,
> +                   build_vector_from_val (TREE_TYPE (cond),
> +                     build_int_cst (TREE_TYPE (TREE_TYPE (cond)), 0)));
>
> That looks inefficient as well.  Iff we know that the mask is always
> either {-1, -1 ..} or {0, 0 ...} then we can expand the ? : using
> bitwise operations (see what the i?86 expander does, for example).

This is a requirement of VEC_COND_EXPR, I need to pass 4 parameters,
not 3, that is why I introduce this fake {0,0,..} here.

>
> @@ -471,6 +603,7 @@ expand_vector_operations_1 (gimple_stmt_
>
>   gcc_assert (code != CONVERT_EXPR);
>
> +
>   /* The signedness is determined from input argument.  */
>   if (code == VEC_UNPACK_FLOAT_HI_EXPR
>       || code == VEC_UNPACK_FLOAT_LO_EXPR)
>
> spurious whitespace change.

Fixed.
>
> Index: gcc/tree-cfg.c
> ===================================================================
> --- gcc/tree-cfg.c      (revision 177665)
> +++ gcc/tree-cfg.c      (working copy)
> @@ -3191,6 +3191,38 @@ verify_gimple_comparison (tree type, tre
>       return true;
>     }
>
> +  if (TREE_CODE (op0_type) == VECTOR_TYPE
> +      && TREE_CODE (op1_type) == VECTOR_TYPE
> +      && TREE_CODE (type) == VECTOR_TYPE)
> +    {
>
> this should check TREE_CODE (type) == VECTOR_TYPE only
> and then verify the comparison operands are actually vectors.

Yes, you are right, adjusted.

>
> +      if (TREE_TYPE (op0_type) != TREE_TYPE (op1_type))
> +        {
> +          error ("invalid vector comparison, vector element type mismatch");
> +          debug_generic_expr (op0_type);
> +          debug_generic_expr (op1_type);
> +          return true;
> +        }
>
> this needs to use code similar to the scalar variant,
>
>          !useless_type_conversion_p (op0_type, op1_type)
>          && !useless_type_conversion_p (op1_type, op0_type)
>
> which also makes the first TYPE_VECTOR_SUBPARTS redundant.
>

Fixed.

> +      if (TYPE_VECTOR_SUBPARTS (type) != TYPE_VECTOR_SUBPARTS (op0_type)
> +          && TYPE_PRECISION (TREE_TYPE (op0_type))
> +             != TYPE_PRECISION (TREE_TYPE (type)))
> +        {
> +          error ("invalid vector comparison resulting type");
> +          debug_generic_expr (type);
> +          return true;
> +        }
>
> I think you can drop the TYPE_PRECISION check.  We might want to
> assert that a vector element types precision always matches its
> mode precision (in make_vector_type).

I would leave it for a while. During the optimisation you can
construct some strange things, so I would better make verifier
resistant to the all kind of stuff.

>
> Index: gcc/c-parser.c
> ===================================================================
> --- gcc/c-parser.c      (revision 177665)
> +++ gcc/c-parser.c      (working copy)
> @@ -5337,8 +5337,17 @@ c_parser_conditional_expression (c_parse
>   if (c_parser_next_token_is (parser, CPP_COLON))
>     {
>       tree eptype = NULL_TREE;
> -
> +
>       middle_loc = c_parser_peek_token (parser)->location;
> +
> +      if (TREE_CODE (TREE_TYPE (cond.value)) == VECTOR_TYPE)
>
> watch out for whitespace changes - you add a trailing tab here.

Fixed.

>
> +/* Find target specific sequence for vector comparison of
> +   real-type vectors V0 and V1. Returns variable containing
> +   result of the comparison or NULL_TREE in other case.  */
> +static tree
> +vector_fp_compare (gimple_stmt_iterator *gsi, tree rettype,
> +                   enum machine_mode mode, tree v0, tree v1,
> +                   enum tree_code code)
> +{
> +  enum ix86_builtins fcode;
>
> is there a reason we need this and cannot simply provide expanders
> for the named patterns?  We'd need to give them semantics of
> producing all-ones / all-zero masks of course.  Richard, do you think
> that's sensible?  That way we'd avoid the new target hook and could
> simply do optab queries.

I think I don't really understand the idea. How we are going to
represent the fact that we need to convert a given node to the given
machine instruction? May be you could point where the similar
technique is already used.


The new patch will be tested and submitted here soon.


Thanks,
Artem.

>
> Thanks,
> Richard.
>
>> ChangeLog
>>
>> 2011-08-12 Artjoms Sinkarovs <artyom.shinkaroff@gmail.com>
>>
>>       gcc/
>>       * targhooks.c (default_builtin_vec_compare): New hook.
>>       * targhooks.h (default_builtin_vec_compare): New definition.
>>       * target.def (builtin_vec_compare): New hook.
>>       * target.h: New include (gimple.h).
>>       * fold-const.c
>>       (fold_comparison): Adjust x <cmp> x vector operations.
>>       * c-typeck.c (build_binary_op): Allow vector comparison.
>>       (c_obj_common_truthvalue_conversion): Deny vector comparison
>>       inside of if statement.
>>       (build_conditional_expr): Adjust to build VEC_COND_EXPR.
>>       * tree-vect-generic.c (do_compare): Helper function.
>>       (expand_vector_comparison): Check if hardware comparison
>>       is available, if not expand comparison piecewise.
>>       (expand_vector_operation): Handle vector comparison
>>       expressions separately.
>>       (earlyexpand_vec_cond_expr): Expand vector comparison
>>       piecewise.
>>       * Makefile.in: New dependencies.
>>       * tree-cfg.c (verify_gimple_comparison): Allow vector
>>       comparison operations in gimple.
>>       * c-parser.c (c_parser_conditional_expression): Adjust
>>       to handle VEC_COND_EXPR.
>>       * gimplify.c (gimplify_expr): Adjust to handle VEC_COND_EXPR.
>>       * config/i386/i386.c (vector_fp_compare): Build hardware
>>       specific code for floating point vector comparison.
>>       (vector_int_compare): Build hardware specific code for
>>       integer vector comparison.
>>       (ix86_vectorize_builtin_vec_compare): Implementation of
>>       builtin_vec_compare hook.
>>
>>       gcc/testsuite/
>>       * gcc.c-torture/execute/vector-vcond-1.c: New test.
>>       * gcc.c-torture/execute/vector-vcond-2.c: New test.
>>       * gcc.c-torture/execute/vector-compare-1.c: New test.
>>       * gcc.c-torture/execute/vector-compare-2.c: New test.
>>       * gcc.dg/vector-compare-1.c: New test.
>>       * gcc.dg/vector-compare-2.c: New test.
>>
>>       gcc/doc
>>       * extend.texi: Adjust.
>>       * tm.texi: Adjust.
>>       * tm.texi.in: Adjust.
>>
>>
>> bootstrapped and tested on x86_64_unknown-linux.
>>
>>
>> Thanks,
>> Artem Shinkarov.
>>
>



More information about the Gcc-patches mailing list