Vector Comparison patch

Artem Shinkarov artyom.shinkaroff@gmail.com
Wed Aug 17 17:07:00 GMT 2011


On Wed, Aug 17, 2011 at 3:58 PM, Richard Guenther
<richard.guenther@gmail.com> wrote:
> On Wed, Aug 17, 2011 at 3:30 PM, Artem Shinkarov
> <artyom.shinkaroff@gmail.com> wrote:
>> Hi
>>
>> Several comments before the new version of the patch.
>> 1) x != x
>> I am happy to adjust constant_boolean_node, but look at the code
>> around line 9074 in fold-const.c, you will see that x <op> x
>> elimination, even with adjusted constant_boolean_node, will look about
>> the same as my code. Because I need to check the parameters (!FLOAT_P,
>>  HONOR_NANS) on TREE_TYPE (arg0) not arg0, and I need to construct
>> constant_boolean_node (-1), not 1 in case of true.
>> But I will change constant_boolean_node to accept vector types.
>
> Hm, that should be handled transparently if you look at the defines
> of FLOAT_TYPE_P and the HONOR_* macros.
>

Ok, Currently I have this, what do you think:
      int true_val = TREE_CODE (type) == VECTOR_TYPE ? -1 : 0;
      tree arg0_type = TREE_CODE (type) == VECTOR_TYPE
		       ? TREE_TYPE (TREE_TYPE (arg0)) : TREE_TYPE (arg0);
	switch (code)
	  {
	  case EQ_EXPR:
	    if (! FLOAT_TYPE_P (arg0_type)
		|| ! HONOR_NANS (TYPE_MODE (arg0_type)))
	      return constant_boolean_node (true_val, type);
	    break;

	  case GE_EXPR:
	  case LE_EXPR:
	    if (! FLOAT_TYPE_P (arg0_type)
		|| ! HONOR_NANS (TYPE_MODE (arg0_type)))
	      return constant_boolean_node (true_val, type);
	    return fold_build2_loc (loc, EQ_EXPR, type, arg0, arg1);

	  case NE_EXPR:
	    /* For NE, we can only do this simplification if integer
	       or we don't honor IEEE floating point NaNs.  */
	    if (FLOAT_TYPE_P (arg0_type)
		&& HONOR_NANS (TYPE_MODE (arg0_type)))
	      break;
	    /* ... fall through ...  */
	  case GT_EXPR:
	  case LT_EXPR:
	    return constant_boolean_node (0, type);
	  default:
	    gcc_unreachable ();
	  }

Works fine for both vector and scalar cases.

>>
>> 2) comparison vs vcond
>> v = v1 < v2;
>> v = v1 < v2 ? {-1,...} : {0,...};
>>
>> are not the same.
>> 16,25c16,22
>> <       movdqa  .LC1(%rip), %xmm1
>> <       pshufd  $225, %xmm1, %xmm1
>> <       pshufd  $39, %xmm0, %xmm0
>> <       movss   %xmm2, %xmm1
>> <       pshufd  $225, %xmm1, %xmm1
>> <       pcmpgtd %xmm1, %xmm0
>> <       pcmpeqd %xmm1, %xmm1
>> <       pcmpeqd %xmm1, %xmm0
>> <       pand    %xmm1, %xmm0
>> <       movdqa  %xmm0, -24(%rsp)
>> ---
>>>       pshufd  $39, %xmm0, %xmm1
>>>       movdqa  .LC1(%rip), %xmm0
>>>       pshufd  $225, %xmm0, %xmm0
>>>       movss   %xmm2, %xmm0
>>>       pshufd  $225, %xmm0, %xmm0
>>>       pcmpgtd %xmm0, %xmm1
>>>       movdqa  %xmm1, -24(%rsp)
>>
>> So I would keep the hook, it could be removed at any time when the
>> standard expansion will start to work fine.
>
> Which one is which?

You must be joking. :)
The first one (inefficient) is vec0 > vec1 ? {-1,...} : {0,...}
The second is vec0 > vec1. expand_vec_cond_expr is stupid, which is
fine, but it means that we need to construct it carefully.

> I'd really like to make this patch simpler at first,
> and removing that hook is an obvious thing that _should_ be possible,
> even optimally (by fixing the targets).

Ok, let's remove the hook, then could you provide some more
information rather than we just need to do it?

Simple in this case means inefficient -- I would hope to make it
efficient as well.

>> 3) mask ? vec0 : vec1
>> So no, I don't think we need to convert {3, 4, -1, 5} to {0,0,-1,0}
>> (that would surprise my anyway, I'd have expected {-1,-1,-1,-1} ;)).
>>
>> Does OpenCL somehow support you here?
>>
>> OpenCL says that vector operation mask ? vec0 : vec1 is the same as
>> select (vec0, vec1, mask). The semantics of select operation is the
>> following:
>>
>> gentype select (gentype a, gentype b, igentype c)
>> For each component of a vector type,
>> result[i] = if MSB of c[i] is set ? b[i] : a[i].
>>
>> I am not sure what they really understand using the term MSB. As far
>> as I know MSB is Most Significant Bit, so does it mean that in case of
>> 3-bit integer 100 would trigger true but 011 would be still false...
>
> Yes, MSB is Most Significant Bit - that's a somewhat odd definition ;)
>
>> My reading would be that if all bits set, then take the first element,
>> otherwise the second.
>>
>> It is also confusing when  a ? vec0 : vec1, and a != 0 ? vec0 vec1
>> produce different results. So I would stick to all bits set being true
>> scenario.
>
> For the middle-end part definitely.  Thus I'd simply leave the mask alone.
>

Well, it seems very unnatural to me. In the case of scalars mask ?
val0 : val1 would not work the same way as (mask & val0) | (~mask  &
val1), why should we have the same behaviour for the vector stuff?


>> 4) Backend stuff. Ok, we could always fall back to reject the cases
>> when cond and operands have different type, and then fix the backend.
>>
>> Adjustments are coming.
>>
>>
>> Artem.
>>
>

New issue about transforming cond to cons == {-1, ..} in
expand_vec_cond_expr. When I do this:
  icode = get_vcond_icode (vec_cond_type, mode);
  if (icode == CODE_FOR_nothing)
    return 0;

  /* If OP0 is not a comparison, adjust it by transforming to
     the expression OP0 == {-1, -1, ...}  */
  if (!COMPARISON_CLASS_P (op0))
    op0 = build2 (EQ_EXPR, TREE_TYPE (op0), op0,
		  build_vector_from_val (TREE_TYPE (op0),
		  build_int_cst (TREE_TYPE (TREE_TYPE (op0)), -1)));

I run into the trouble that the constant vector which I insert, cannot
be expanded, and  compiler fails with assertion.

This happens on my machine:
Linux temanbk 2.6.38-gentoo-r4 #3 SMP Mon Aug 8 00:32:30 BST 2011
x86_64 Intel(R) Core(TM)2 Duo CPU T7300 @ 2.00GHz GenuineIntel
GNU/Linux

When I run a comparison of vectors of 64-bit integers. They are
lowered in the veclower, but if I insert them in expand_vec_cond_expr,
I receive an error. However expand_vec_cond_expr_p happily accepts it.


Thanks,
Artem.



More information about the Gcc-patches mailing list