This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: fold x ^ y to 0 if x == y


On 20 July 2016 at 23:07, Prathamesh Kulkarni
<prathamesh.kulkarni@linaro.org> wrote:
> On 20 July 2016 at 16:35, Richard Biener <rguenther@suse.de> wrote:
>> On Wed, 20 Jul 2016, Prathamesh Kulkarni wrote:
>>
>>> On 8 July 2016 at 12:29, Richard Biener <rguenther@suse.de> wrote:
>>> > On Fri, 8 Jul 2016, Richard Biener wrote:
>>> >
>>> >> On Fri, 8 Jul 2016, Prathamesh Kulkarni wrote:
>>> >>
>>> >> > Hi Richard,
>>> >> > For the following test-case:
>>> >> >
>>> >> > int f(int x, int y)
>>> >> > {
>>> >> >    int ret;
>>> >> >
>>> >> >    if (x == y)
>>> >> >      ret = x ^ y;
>>> >> >    else
>>> >> >      ret = 1;
>>> >> >
>>> >> >    return ret;
>>> >> > }
>>> >> >
>>> >> > I was wondering if x ^ y should be folded to 0 since
>>> >> > it's guarded by condition x == y ?
>>> >> >
>>> >> > optimized dump shows:
>>> >> > f (int x, int y)
>>> >> > {
>>> >> >   int iftmp.0_1;
>>> >> >   int iftmp.0_4;
>>> >> >
>>> >> >   <bb 2>:
>>> >> >   if (x_2(D) == y_3(D))
>>> >> >     goto <bb 3>;
>>> >> >   else
>>> >> >     goto <bb 4>;
>>> >> >
>>> >> >   <bb 3>:
>>> >> >   iftmp.0_4 = x_2(D) ^ y_3(D);
>>> >> >
>>> >> >   <bb 4>:
>>> >> >   # iftmp.0_1 = PHI <iftmp.0_4(3), 1(2)>
>>> >> >   return iftmp.0_1;
>>> >> >
>>> >> > }
>>> >> >
>>> >> > The attached patch tries to fold for above case.
>>> >> > I am checking if op0 and op1 are equal using:
>>> >> > if (bitmap_intersect_p (vr1->equiv, vr2->equiv)
>>> >> >    && operand_equal_p (vr1->min, vr1->max)
>>> >> >    && operand_equal_p (vr2->min, vr2->max))
>>> >> >   { /* equal /* }
>>> >> >
>>> >> > I suppose intersection would check if op0 and op1 have equivalent ranges,
>>> >> > and added operand_equal_p check to ensure that there is only one
>>> >> > element within the range. Does that look correct ?
>>> >> > Bootstrap+test in progress on x86_64-unknown-linux-gnu.
>>> >>
>>> >> I think VRP is the wrong place to catch this and DOM should have but it
>>> >> does
>>> >>
>>> >> Optimizing block #3
>>> >>
>>> >> 1>>> STMT 1 = x_2(D) le_expr y_3(D)
>>> >> 1>>> STMT 1 = x_2(D) ge_expr y_3(D)
>>> >> 1>>> STMT 1 = x_2(D) eq_expr y_3(D)
>>> >> 1>>> STMT 0 = x_2(D) ne_expr y_3(D)
>>> >> 0>>> COPY x_2(D) = y_3(D)
>>> >> 0>>> COPY y_3(D) = x_2(D)
>>> >> Optimizing statement ret_4 = x_2(D) ^ y_3(D);
>>> >>   Replaced 'x_2(D)' with variable 'y_3(D)'
>>> >>   Replaced 'y_3(D)' with variable 'x_2(D)'
>>> >>   Folded to: ret_4 = x_2(D) ^ y_3(D);
>>> >> LKUP STMT ret_4 = x_2(D) bit_xor_expr y_3(D)
>>> >>
>>> >> heh, registering both reqivalencies is obviously not going to help...
>>> >>
>>> >> The 2nd equivalence is from doing
>>> >>
>>> >>       /* We already recorded that LHS = RHS, with canonicalization,
>>> >>          value chain following, etc.
>>> >>
>>> >>          We also want to record RHS = LHS, but without any
>>> >> canonicalization
>>> >>          or value chain following.  */
>>> >>       if (TREE_CODE (rhs) == SSA_NAME)
>>> >>         const_and_copies->record_const_or_copy_raw (rhs, lhs,
>>> >>                                                     SSA_NAME_VALUE (rhs));
>>> >>
>>> >> generally recording both is not helpful.  Jeff?  This seems to be
>>> >> r233207 (fix for PR65917) which must have regressed this testcase.
>>> >
>>> > Just verified it works fine on the GCC 5 branch:
>>> >
>>> > Optimizing block #3
>>> >
>>> > 0>>> COPY y_3(D) = x_2(D)
>>> > 1>>> STMT 1 = x_2(D) le_expr y_3(D)
>>> > 1>>> STMT 1 = x_2(D) ge_expr y_3(D)
>>> > 1>>> STMT 1 = x_2(D) eq_expr y_3(D)
>>> > 1>>> STMT 0 = x_2(D) ne_expr y_3(D)
>>> > Optimizing statement ret_4 = x_2(D) ^ y_3(D);
>>> >   Replaced 'y_3(D)' with variable 'x_2(D)'
>>> > Applying pattern match.pd:240, gimple-match.c:11346
>>> > gimple_simplified to ret_4 = 0;
>>> >   Folded to: ret_4 = 0;
>>> I have reported it as PR71947.
>>> Could you help me point out how to fix this ?
>>
>> Not record both equivalences.  This might break the testcase it was
>> introduced for (obviously).  Which is why I CCed Jeff for his opinion.
> Well, folding happens for x - y, if x == y.
>
> int f(int x, int y)
> {
>   int ret;
>   if (x == y)
>     ret = x - y;
>   else
>     ret = 1;
>
>   return ret;
> }
Oops wrong test-case, the dump below is of the following test-case:
int f(int x, int y)
{
  int ret = 10;
  if (x == y)
    ret = x  -  y;
  return ret;
}
>
> For the above test-case, extract_range_from_binary_expr_1()
> determines that range of ret = [0, 0]
> and propagates it.
>
> vrp1 dump:
> f (int x, int y)
> {
>   int ret;
>
>   <bb 2>:
>   if (x_2(D) == y_3(D))
>     goto <bb 3>;
>   else
>     goto <bb 4>;
>
>   <bb 3>:
>   ret_4 = x_2(D) - y_3(D);
>
>   <bb 4>:
>   # ret_1 = PHI <0(3), 10(2)>
>   return ret_1;
>
> }
>
> Then the dce pass removes ret_4 = x_2(D) - y_3(D) since it's
> redundant.
> However it appears vrp fails to notice the equality for the following test-case,
> and sets range for ret to VARYING.
>
> int f(int x, int y, int a, int b)
> {
>   int ret = 10;
>   if (a == x
>       && b == y
>       && a == b)
>     ret = x - y;
>
>   return ret;
> }
>
> Looking at the vrp dump, shows the following form
> after inserting ASSERT_EXPR:
>
> SSA form after inserting ASSERT_EXPRs
> f (int x, int y, int a, int b)
> {
>   int ret;
>   _Bool _1;
>   _Bool _2;
>   _Bool _3;
>
>   <bb 2>:
>   _1 = a_5(D) == x_6(D);
>   _2 = b_7(D) == y_8(D);
>   _3 = _1 & _2;
>   if (_3 != 0)
>     goto <bb 3>;
>   else
>     goto <bb 5>;
>
>   <bb 3>:
>   a_11 = ASSERT_EXPR <a_5(D), a_5(D) == x_6(D)>;
>   x_12 = ASSERT_EXPR <x_6(D), x_6(D) == a_11>;
>   b_13 = ASSERT_EXPR <b_7(D), b_7(D) == y_8(D)>;
>   y_14 = ASSERT_EXPR <y_8(D), y_8(D) == b_13>;
>   if (a_11 == b_13)
>     goto <bb 4>;
>   else
>     goto <bb 5>;
>
>   <bb 4>:
>   ret_9 = x_12 ^ y_14;
>
>   <bb 5>:
>   # ret_4 = PHI <10(2), 10(3), ret_9(4)>
>   return ret_4;
>
> }
>
> Shouldn't there also be a range assertion for a_11 == b_13 ?
>
> Should we modify extract_range_from_binary_expr_1 to handle
> this case for bit_xor_expr so similar to minus_expr case,
> the range [0, 0] would get propagated and make ret = x ^ y redundant
> which will then be removed by dce pass ?
> I have attached untested patch for that.
> Does it look OK ?
> (It also doesn't handle  a == x && b == y && a == b case).
>
> Thanks,
> Prathamesh
>>
>> Richard.
>>
>> --
>> Richard Biener <rguenther@suse.de>
>> SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]