[PATCH PR71347][Partial revert r235513]Compute cost for all uses in group

Bin.Cheng amker.cheng@gmail.com
Mon Jun 20 09:11:00 GMT 2016


On Mon, Jun 20, 2016 at 9:20 AM, Bin.Cheng <amker.cheng@gmail.com> wrote:
> On Mon, Jun 20, 2016 at 9:18 AM, Christophe Lyon
> <christophe.lyon@linaro.org> wrote:
>> On 18 June 2016 at 10:59, Andreas Schwab <schwab@linux-m68k.org> wrote:
>>> Bin Cheng <Bin.Cheng@arm.com> writes:
>>>
>>>> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr71347.c b/gcc/testsuite/gcc.dg/tree-ssa/pr71347.c
>>>> new file mode 100644
>>>> index 0000000..7e5ad49
>>>> --- /dev/null
>>>> +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr71347.c
>>>> @@ -0,0 +1,17 @@
>>>> +/* { dg-do compile } */
>>>> +/* { dg-options "-O2 -fdump-tree-optimized" } */
>>>> +
>>>> +double in;
>>>> +extern void Write (double);
>>>> +void foo (void)
>>>> +{
>>>> +  static double X[9];
>>>> +  int i;
>>>> +        X[1] = in * in;
>>>> +        for (i = 2; i <= 8; i++)
>>>> +            X[i] = X[i - 1] * X[1];
>>>> +        Write (X[5]);
>>>> +}
>>>> +
>>>> +/* Load of X[i - 1] can be omitted by reusing X[i] in previous iteration.  */
>>>> +/* { dg-final { scan-tree-dump-not ".* = MEM.*;" "optimized"} } */
>>>
>>> The test fails on ia64, this is what I get in .optimized:
>>>
>>
>> The test passes on aarch64, but fails on arm targets. Maybe that's
>> easier for Bin to reproduce?
> Hi all,
> Sorry for the inconvenience, I will have a look at the two targets.
Hmm, the failure occurs because post-increment is enabled in IVOPT on
both ia64 and arm.  As a result, IVOPT tends to choose an iv_cand that
is incremented after the first store.  The IVOPT dump is:


  <bb 3>:
  # prephitmp_20 = PHI <pretmp_15(4), _2(2)>
  # prephitmp_22 = PHI <pretmp_21(4), _2(2)>
  # ivtmp.23_16 = PHI <ivtmp.23_8(4), ivtmp.23_23(2)>
  _6 = prephitmp_20 * prephitmp_22;
  _5 = (void *) ivtmp.23_16;
  MEM[base: _5, offset: 0B] = _6;
  ivtmp.23_8 = ivtmp.23_16 + 8;
  if (ivtmp.23_8 != _27)
    goto <bb 4>;
  else
    goto <bb 5>;

  <bb 4>:
  _24 = (void *) ivtmp.23_8;
  _25 = _24 + 18446744073709551608;
  pretmp_15 = MEM[base: _25, offset: 0B];
  pretmp_21 = X[1];
  goto <bb 3>;

Note that the address expressions of the load and the store now have
different forms, though they compute the same value.  I will look into
DOM to see whether it can be improved to handle address expressions in
different forms.  Also, I believe this case had been failing for a long
time before the test was introduced, so I will mark it XFAIL for the
moment.

Thanks,
bin
>
> Thanks,
> bin
>>
>> Christophe
>>
>>> ;; Function foo (foo, funcdef_no=0, decl_uid=1387, cgraph_uid=0, symbol_order=1)
>>>
>>> foo ()
>>> {
>>>   int i;
>>>   static double X[9];
>>>   double in.0_1;
>>>   double in.1_2;
>>>   double _3;
>>>   int _4;
>>>   double _5;
>>>   double _6;
>>>   double _7;
>>>   double _8;
>>>
>>>   <bb 2>:
>>>   in.0_1 = in;
>>>   in.1_2 = in;
>>>   _3 = in.0_1 * in.1_2;
>>>   X[1] = _3;
>>>   i_13 = 2;
>>>   goto <bb 4>;
>>>
>>>   <bb 3>:
>>>   _4 = i_9 + -1;
>>>   _5 = X[_4];
>>>   _6 = X[1];
>>>   _7 = _5 * _6;
>>>   X[i_9] = _7;
>>>   i_15 = i_9 + 1;
>>>
>>>   <bb 4>:
>>>   # i_9 = PHI <i_13(2), i_15(3)>
>>>   if (i_9 <= 8)
>>>     goto <bb 3>;
>>>   else
>>>     goto <bb 5>;
>>>
>>>   <bb 5>:
>>>   _8 = X[5];
>>>   Write (_8);
>>>   return;
>>>
>>> }
>>>
>>>
>>> Andreas.
>>>
>>> --
>>> Andreas Schwab, schwab@linux-m68k.org
>>> GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
>>> "And now for something completely different."


