This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: Vector Comparison patch


On Mon, Aug 15, 2011 at 6:58 PM, Artem Shinkarov
<artyom.shinkaroff@gmail.com> wrote:
> On Mon, Aug 15, 2011 at 3:24 PM, Richard Guenther
> <richard.guenther@gmail.com> wrote:
>> On Fri, Aug 12, 2011 at 4:03 AM, Artem Shinkarov
>> <artyom.shinkaroff@gmail.com> wrote:
>>> Hi
>>>
>>> Here is a completed version of the vector comparison patch we
>>> discussed a long time ago here:
>>> http://gcc.gnu.org/ml/gcc-patches/2010-08/msg01184.html
>>>
>>> The patch implements vector comparison according to the OpenCL
>>> standard, when the result of the comparison of two vectors is vector
>>> of signed integers, where -1 represents true and 0 false.
>>>
>>> The patch implements vector conditional res = VCOND<V1 ? V2 : V3>
>>> which is expanded into:
>>> foreach (i in length (V1)) res[i] = V1 == 0 ? V3[i] : V2[i].
>>
>> Some comments on the patch below. ?First, in general I don't see
>> why you need a new target hook to specify whether to "vectorize"
>> a comparison. ?Why are the existing hooks used by the vectorizer
>> not enough?
>>
>> Index: gcc/fold-const.c
>> ===================================================================
>> --- gcc/fold-const.c ? ?(revision 177665)
>> +++ gcc/fold-const.c ? ?(working copy)
>> @@ -9073,34 +9073,61 @@ fold_comparison (location_t loc, enum tr
>> ? ? ?floating-point, we can only do some of these simplifications.) ?*/
>> ? if (operand_equal_p (arg0, arg1, 0))
>> ? ? {
>> - ? ? ?switch (code)
>> + ? ? ?if (TREE_CODE (TREE_TYPE (arg0)) == VECTOR_TYPE)
>> ? ? ? ?{
>> - ? ? ? case EQ_EXPR:
>> - ? ? ? ? if (! FLOAT_TYPE_P (TREE_TYPE (arg0))
>> - ? ? ? ? ? ? || ! HONOR_NANS (TYPE_MODE (TREE_TYPE (arg0))))
>> - ? ? ? ? ? return constant_boolean_node (1, type);
>>
>> I think this change should go in a separate patch for improved
>> constant folding. ?It shouldn't be necessary for enabling vector compares, no?
>
> Unfortunately no, this case must be covered here, otherwise x != x
> condition fails.

How does it fail?

>>
>> + ? ? ?if (TYPE_UNSIGNED (TREE_TYPE (TREE_TYPE (ifexp))))
>> + ? ? ? ?{
>> + ? ? ? ? ?error_at (colon_loc, "vector comparison must be of signed "
>> + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?"integer vector type");
>> + ? ? ? ? ?return error_mark_node;
>> + ? ? ? ?}
>>
>> why that?
>
> Well, later on I rely on this fact. I mean OpenCL says that it should
> return -1 in the sense that all bits set. I don't really know, I can
> support unsigned masks as well, but wouldn't it just introduce a
> source for possible errors. I mean that natural choice for true and
> flase is 0 and 1, not 0 and -1. Anyway I don't have a strong opinion
> there, and I could easily adjust it if we decide that we want it.

I think we want to allow both signed and unsigned masks.

>>
>> + ? ? ?if (TYPE_VECTOR_SUBPARTS (type1) != TYPE_VECTOR_SUBPARTS (type2)
>> + ? ? ? ? ?|| TYPE_VECTOR_SUBPARTS (TREE_TYPE (ifexp))
>> + ? ? ? ? ? ? != TYPE_VECTOR_SUBPARTS (type1))
>> + ? ? ? ?{
>> + ? ? ? ? ?error_at (colon_loc, "vectors of different length found in "
>> + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? "vector comparison");
>> + ? ? ? ? ?return error_mark_node;
>> + ? ? ? ?}
>>
>> I miss verification that type1 and type2 are vector types, or is that done
>> elsewhere? ?I think type1 and type2 are already verified to be
>> compatible (but you might double-check). ?At least the above would be
>> redundant with
>
> Thanks, type1 and type2 both vectors comparison is missing, going to
> be added in the new version of the patch.
>>
>> + ? ? ?if (type1 != type2)
>> + ? ? ? ?{
>> + ? ? ? ? ?error_at (colon_loc, "vectors of different types involved in "
>> + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? "vector comparison");
>> + ? ? ? ? ?return error_mark_node;
>> + ? ? ? ?}
>
> You are right, what I meant here is TREE_TYPE (type1) != TREE_TYPE
> (type2), because vector (4, int) have the same number of elements as
> vector (4, float). This would be fixed in the new version.
>
>>
>> Joseph may have comments about the fully-fold stuff that follows.
>>
>> + ? ? ?/* Currently the expansion of VEC_COND_EXPR does not allow
>> + ? ? ? ?expessions where the type of vectors you compare differs
>> + ? ? ? ?form the type of vectors you select from. For the time
>> + ? ? ? ?being we insert implicit conversions. ?*/
>> + ? ? ?if ((COMPARISON_CLASS_P (ifexp)
>>
>> Why only for comparison-class?
> Not only, there is || involved:
> (COMPARISON_CLASS_P (ifexp) ?&& TREE_TYPE (TREE_OPERAND (ifexp, 0)) != type1)
> || TREE_TYPE (ifexp) != type1
>
> So if this is a comparison class, we check the first operand, because
> the result of the comparison fits, however the operands could not. In
> case we have an expression of signed vector, we know that we would
> transform it into exp != {0,0,...} in tree-vect-generic.c, but if the
> types of operands do not match we convert them.

Hm, ok ... let's hope we can sort-out the backend issues before this
patch goes in so we can remove this converting stuff.

>>
>> + ? ? ? ? ?&& TREE_TYPE (TREE_OPERAND (ifexp, 0)) != type1)
>> + ? ? ? ? || TREE_TYPE (ifexp) != type1)
>> + ? ? ? {
>> + ? ? ? ? tree comp_type = COMPARISON_CLASS_P (ifexp)
>> + ? ? ? ? ? ? ? ? ? ? ? ? ?? TREE_TYPE (TREE_OPERAND (ifexp, 0))
>> + ? ? ? ? ? ? ? ? ? ? ? ? ?: TREE_TYPE (ifexp);
>> + ? ? ? ? tree vcond;
>> +
>> + ? ? ? ? op1 = convert (comp_type, op1);
>> + ? ? ? ? op2 = convert (comp_type, op2);
>> + ? ? ? ? vcond = build3 (VEC_COND_EXPR, comp_type, ifexp, op1, op2);
>> + ? ? ? ? vcond = convert (type1, vcond);
>> + ? ? ? ? return vcond;
>> + ? ? ? }
>> + ? ? ?else
>> + ? ? ? return build3 (VEC_COND_EXPR, type1, ifexp, op1, op2);
>>
>> In the end we of course will try to fix the middle-end/backends to
>> allow mixed types here as the current restriction doesn't really make sense.
>
> Yes, that would be nice, but these conversions do not really affect
> the code generation, so for the time being I think it is fine to have
> them.
>
>>
>> ? ? case EQ_EXPR:
>> ? ? case NE_EXPR:
>> + ? ? ?if (code0 == VECTOR_TYPE && code1 == VECTOR_TYPE)
>> + ? ? ? ?{
>> + ? ? ? ? ?tree intt;
>> + ? ? ? ? ?if (TREE_TYPE (type0) != TREE_TYPE (type1))
>> + ? ? ? ? ? ?{
>> + ? ? ? ? ? ? ?error_at (location, "comparing vectors with different "
>> + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?"element types");
>> + ? ? ? ? ? ? ?return error_mark_node;
>> + ? ? ? ? ? ?}
>> +
>> + ? ? ? ? ?if (TYPE_VECTOR_SUBPARTS (type0) != TYPE_VECTOR_SUBPARTS (type1))
>> + ? ? ? ? ? ?{
>> + ? ? ? ? ? ? ?error_at (location, "comparing vectors with different "
>> + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?"number of elements");
>> + ? ? ? ? ? ? ?return error_mark_node;
>> + ? ? ? ? ? ?}
>>
>> as above - compatibility should already be ensured, thus type0 == type1
>> here?
>
> Yeah, we know that they are both vector types, but that is about all
> we know. Anyhow, all these errors are reachable. As an example see
> vector-compare-1.c:
> r4 = x > y; ?/* { dg-error "comparing vectors with different element types" } */
> r8 == r4; /* { dg-error "comparing vectors with different number of
> elements"} */

Ok, I see.

>>
>> +/* Try a hardware hook for vector comparison or
>> + ? extract comparison piecewise. ?*/
>> +static tree
>> +expand_vector_comparison (gimple_stmt_iterator *gsi, tree type, tree op0,
>> + ? ? ? ? ? ? ? ? ? ? ? ? ?tree op1, enum tree_code code)
>>
>> comments should mention and describe all function arguments.
>
> Ok, coming in the new version of the patch.
>
>> +/* Expand vector condition EXP which should have the form
>> + ? VEC_COND_EXPR<cond, vec0, vec1> into the following
>> + ? vector:
>> + ? ? {cond[i] != 0 ? vec0[i] : vec1[i], ... }
>> + ? i changes from 0 to TYPE_VECTOR_SUBPARTS (TREE_TYPE (vec0)). ?*/
>> +static tree
>> +earlyexpand_vec_cond_expr (gimple_stmt_iterator *gsi, tree exp)
>>
>> that would be expand_vec_cond_expr_piecewise, no?
>
> Adjusted.
>
>>
>> + ?/* Ensure that we will be able to expand vector comparison
>> + ? ? in case it is not supported by the architecture. ?*/
>> + ?gcc_assert (COMPARISON_CLASS_P (cond));
>>
>> that looks dangerous to me - did you try
>>
>> ?vec = v1 <= v2;
>> ?vec2 = vec ? v1 : v2;
>>
>> without optimization?
>
> Sure, tests should cover this case.
> I have this assertion there because only two cases are possible:
> 1) it is a comparison
> 2) function callee converted expr to expr != {0,0,...}
> So we should be perfectly fine.
>
>>
>> + ?/* Check if we need to expand vector condition inside of
>> + ? ? VEC_COND_EXPR. ?*/
>> + ?var = create_tmp_reg (TREE_TYPE (cond), "cond");
>> + ?new_rhs = expand_vector_comparison (gsi, TREE_TYPE (cond),
>> + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?TREE_OPERAND (cond, 0),
>> + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? TREE_OPERAND (cond, 1),
>> + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?TREE_CODE (cond));
>>
>> That unconditionally expands, so no need for "Check".
>
> Ok.
>
>>
>> + ?/* Expand VEC_COND_EXPR into a vector of scalar COND_EXPRs. ?*/
>> + ?v = VEC_alloc(constructor_elt, gc, nunits);
>> + ?for (i = 0; i < nunits;
>> + ? ? ? i += 1, index = int_const_binop (PLUS_EXPR, index, part_width))
>> + ? ?{
>> + ? ? ?tree tcond = tree_vec_extract (gsi, inner_type, var, part_width, index);
>> + ? ? ?tree a = tree_vec_extract (gsi, inner_type, vec0, part_width, index);
>> + ? ? ?tree b = tree_vec_extract (gsi, inner_type, vec1, part_width, index);
>> + ? ? ?tree rcond = gimplify_build2 (gsi, NE_EXPR, boolean_type_node, tcond,
>> + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?build_int_cst (inner_type ,0));
>>
>> I seriously doubt that when expanding this part piecewise expanding
>> the mask first in either way is going to be beneficial. ?Instead I would
>> suggest to "inline" the comparison here. ?Thus instead of
>
> Well, the ting is that, if expand_vector_comparison, would insert
> builtin there rather than expanding the code piecewise, I'll have to
> do the comparison with 0 anyway, because true is expressed as -1
> there.
>
> Well, I would hope that in case we have:
> c_0 = a_0 > b_0;
> d_0 = c_0 != 0;
>
> {d_0, d_1,...}
>
> all the d_n should be constant-folded, or should I pull fold explicitly here?
>
> 1) I construct the mask
>>
>> ?mask =
>> ? ? ? ? = { mask[0] != 0 ? ... }
>>
>> do
>>
>> ? ? ? ? ?= { c0[0] < c1[0] ? ..., }
>>
>> or even expand the ? : using mask operations if we efficiently can
>> create that mask.
>>
>
> I assume that if we cannot expand VEC_COND_EXPR, then masking the
> elements is a problem for us. Otherwise VEC_COND_EXPE expansion has a
> bug somewhere. Or I am wrong somewhere?

I think we can always do bitwise operations, so if we can get at the
mask vector we are fine.

I was thinking about how the case of explicitly computing the value of
v1 < v2 into a vector vs. a condition inside a VEC_COND_EXPR should
be handled.  If the target produces a mask of condition codes for
a comparison then it might be able to efficiently expand a VEC_COND_EXPR.
It could as well generate a mask via expanding v1 < v2 ? -1 : 0 then.
A similar case is for AMD XOP which can expand mask ? v1 : v2
with a single instruction (so even without seeing a comparison).

Basically, if we can get at the mask we should use that to do the
vector selection in parallel via (v1 & mask) | (v2 & ~mask).

If we cannot even get at the mask then we can build the result
vector piecewise as { v1[0] < v2[0] ? v1[0] : v2[0], .... } etc.

>>
>> + ?/* Check if VEC_COND_EXPR is supported in hardware within the
>> + ? ? given types. ?*/
>> + ?if (code == VEC_COND_EXPR)
>> + ? ?{
>> + ? ? ?tree exp = gimple_assign_rhs1 (stmt);
>> + ? ? ?tree cond = TREE_OPERAND (exp, 0);
>> +
>> + ? ? ?/* If VEC_COND_EXPR is presented as A ? V0 : V1, we
>> + ? ? ? ?change it to A != {0,0,...} ? V0 : V1 ?*/
>> + ? ? ?if (!COMPARISON_CLASS_P (cond))
>> + ? ? ? TREE_OPERAND (exp, 0) =
>> + ? ? ? ? ? build2 (NE_EXPR, TREE_TYPE (cond), cond,
>> + ? ? ? ? ? ? ? ? ? build_vector_from_val (TREE_TYPE (cond),
>> + ? ? ? ? ? ? ? ? ? ? build_int_cst (TREE_TYPE (TREE_TYPE (cond)), 0)));
>>
>> That looks inefficient as well. ?Iff we know that the mask is always
>> either {-1, -1 ..} or {0, 0 ...} then we can expand the ? : using
>> bitwise operations (see what the i?86 expander does, for example).
>
> This is a requirement of VEC_COND_EXPR, I need to pass 4 parameters,
> not 3, that is why I introduce this fake {0,0,..} here.

Sure, but if you look at expand_vec_cond_expr_p then you don't need
that, and this fake comparison should instead be produced by the
expander (or really avoided by maybe splitting up the named pattern
into two).

It's for sure not necessary for earlyexpand_vec_cond_expr (but instead
makes it less efficient - with just the mask it can do the bitwise
fallback easily).

>>
>> @@ -471,6 +603,7 @@ expand_vector_operations_1 (gimple_stmt_
>>
>> ? gcc_assert (code != CONVERT_EXPR);
>>
>> +
>> ? /* The signedness is determined from input argument. ?*/
>> ? if (code == VEC_UNPACK_FLOAT_HI_EXPR
>> ? ? ? || code == VEC_UNPACK_FLOAT_LO_EXPR)
>>
>> spurious whitespace change.
>
> Fixed.
>>
>> Index: gcc/tree-cfg.c
>> ===================================================================
>> --- gcc/tree-cfg.c ? ? ?(revision 177665)
>> +++ gcc/tree-cfg.c ? ? ?(working copy)
>> @@ -3191,6 +3191,38 @@ verify_gimple_comparison (tree type, tre
>> ? ? ? return true;
>> ? ? }
>>
>> + ?if (TREE_CODE (op0_type) == VECTOR_TYPE
>> + ? ? ?&& TREE_CODE (op1_type) == VECTOR_TYPE
>> + ? ? ?&& TREE_CODE (type) == VECTOR_TYPE)
>> + ? ?{
>>
>> this should check TREE_CODE (type) == VECTOR_TYPE only
>> and then verify the comparison operands are actually vectors.
>
> Yes, you are right, adjusted.
>
>>
>> + ? ? ?if (TREE_TYPE (op0_type) != TREE_TYPE (op1_type))
>> + ? ? ? ?{
>> + ? ? ? ? ?error ("invalid vector comparison, vector element type mismatch");
>> + ? ? ? ? ?debug_generic_expr (op0_type);
>> + ? ? ? ? ?debug_generic_expr (op1_type);
>> + ? ? ? ? ?return true;
>> + ? ? ? ?}
>>
>> this needs to use code similar to the scalar variant,
>>
>> ? ? ? ? ?!useless_type_conversion_p (op0_type, op1_type)
>> ? ? ? ? ?&& !useless_type_conversion_p (op1_type, op0_type)
>>
>> which also makes the first TYPE_VECTOR_SUBPARTS redundant.
>>
>
> Fixed.
>
>> + ? ? ?if (TYPE_VECTOR_SUBPARTS (type) != TYPE_VECTOR_SUBPARTS (op0_type)
>> + ? ? ? ? ?&& TYPE_PRECISION (TREE_TYPE (op0_type))
>> + ? ? ? ? ? ? != TYPE_PRECISION (TREE_TYPE (type)))
>> + ? ? ? ?{
>> + ? ? ? ? ?error ("invalid vector comparison resulting type");
>> + ? ? ? ? ?debug_generic_expr (type);
>> + ? ? ? ? ?return true;
>> + ? ? ? ?}
>>
>> I think you can drop the TYPE_PRECISION check. ?We might want to
>> assert that a vector element types precision always matches its
>> mode precision (in make_vector_type).
>
> I would leave it for a while. During the optimisation you can
> construct some strange things, so I would better make verifier
> resistant to the all kind of stuff.

Ok.

>>
>> Index: gcc/c-parser.c
>> ===================================================================
>> --- gcc/c-parser.c ? ? ?(revision 177665)
>> +++ gcc/c-parser.c ? ? ?(working copy)
>> @@ -5337,8 +5337,17 @@ c_parser_conditional_expression (c_parse
>> ? if (c_parser_next_token_is (parser, CPP_COLON))
>> ? ? {
>> ? ? ? tree eptype = NULL_TREE;
>> -
>> +
>> ? ? ? middle_loc = c_parser_peek_token (parser)->location;
>> +
>> + ? ? ?if (TREE_CODE (TREE_TYPE (cond.value)) == VECTOR_TYPE)
>>
>> watch out for whitespace changes - you add a trailing tab here.
>
> Fixed.
>
>>
>> +/* Find target specific sequence for vector comparison of
>> + ? real-type vectors V0 and V1. Returns variable containing
>> + ? result of the comparison or NULL_TREE in other case. ?*/
>> +static tree
>> +vector_fp_compare (gimple_stmt_iterator *gsi, tree rettype,
>> + ? ? ? ? ? ? ? ? ? enum machine_mode mode, tree v0, tree v1,
>> + ? ? ? ? ? ? ? ? ? enum tree_code code)
>> +{
>> + ?enum ix86_builtins fcode;
>>
>> is there a reason we need this and cannot simply provide expanders
>> for the named patterns? ?We'd need to give them semantics of
>> producing all-ones / all-zero masks of course. ?Richard, do you think
>> that's sensible? ?That way we'd avoid the new target hook and could
>> simply do optab queries.
>
> I think I don't really understand the idea. How we are going to
> represent the fact that we need to convert a given node to the given
> machine instruction? May be you could point where the similar
> technique is already used.

In all places we check optab_handler (op, mode) != CODE_FOR_nothing.
We have eq_optab for example, so optab_handler (eq_optab, V4SImode)
would get you the instruction sequence for a comparison of V4SImode
vectors.  That isn't yet properly defined what it should return.

Otherwise I'd say we should ask the target to expand
v1 < v2 as VEC_COND_EXPR (v1 < v2, -1, 0) instead.  That one could
as well special-case the -1 and 0 result vectors (and maybe it already
does).

Richard.
>
> The new patch will be tested and submitted here soon.
>
>
> Thanks,
> Artem.
>
>>
>> Thanks,
>> Richard.
>>
>>> ChangeLog
>>>
>>> 2011-08-12 Artjoms Sinkarovs <artyom.shinkaroff@gmail.com>
>>>
>>> ? ? ? gcc/
>>> ? ? ? * targhooks.c (default_builtin_vec_compare): New hook.
>>> ? ? ? * targhooks.h (default_builtin_vec_compare): New definition.
>>> ? ? ? * target.def (builtin_vec_compare): New hook.
>>> ? ? ? * target.h: New include (gimple.h).
>>> ? ? ? * fold-const.c
>>> ? ? ? (fold_comparison): Adjust x <cmp> x vector operations.
>>> ? ? ? * c-typeck.c (build_binary_op): Allow vector comparison.
>>> ? ? ? (c_obj_common_truthvalue_conversion): Deny vector comparison
>>> ? ? ? inside of if statement.
>>> ? ? ? (build_conditional_expr): Adjust to build VEC_COND_EXPR.
>>> ? ? ? * tree-vect-generic.c (do_compare): Helper function.
>>> ? ? ? (expand_vector_comparison): Check if hardware comparison
>>> ? ? ? is available, if not expand comparison piecewise.
>>> ? ? ? (expand_vector_operation): Handle vector comparison
>>> ? ? ? expressions separately.
>>> ? ? ? (earlyexpand_vec_cond_expr): Expand vector comparison
>>> ? ? ? piecewise.
>>> ? ? ? * Makefile.in: New dependencies.
>>> ? ? ? * tree-cfg.c (verify_gimple_comparison): Allow vector
>>> ? ? ? comparison operations in gimple.
>>> ? ? ? * c-parser.c (c_parser_conditional_expression): Adjust
>>> ? ? ? to handle VEC_COND_EXPR.
>>> ? ? ? * gimplify.c (gimplify_expr): Adjust to handle VEC_COND_EXPR.
>>> ? ? ? * config/i386/i386.c (vector_fp_compare): Build hardware
>>> ? ? ? specific code for floating point vector comparison.
>>> ? ? ? (vector_int_compare): Build hardware specific code for
>>> ? ? ? integer vector comparison.
>>> ? ? ? (ix86_vectorize_builtin_vec_compare): Implementation of
>>> ? ? ? builtin_vec_compare hook.
>>>
>>> ? ? ? gcc/testsuite/
>>> ? ? ? * gcc.c-torture/execute/vector-vcond-1.c: New test.
>>> ? ? ? * gcc.c-torture/execute/vector-vcond-2.c: New test.
>>> ? ? ? * gcc.c-torture/execute/vector-compare-1.c: New test.
>>> ? ? ? * gcc.c-torture/execute/vector-compare-2.c: New test.
>>> ? ? ? * gcc.dg/vector-compare-1.c: New test.
>>> ? ? ? * gcc.dg/vector-compare-2.c: New test.
>>>
>>> ? ? ? gcc/doc
>>> ? ? ? * extend.texi: Adjust.
>>> ? ? ? * tm.texi: Adjust.
>>> ? ? ? * tm.texi.in: Adjust.
>>>
>>>
>>> bootstrapped and tested on x86_64_unknown-linux.
>>>
>>>
>>> Thanks,
>>> Artem Shinkarov.
>>>
>>
>


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]