This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.



[Bug tree-optimization/71237] [7 regression] scev tests failing after pass reorganization


https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71237

--- Comment #1 from Andre Vieira <andre.simoesdiasvieira at arm dot com> ---
So yes, disabling LIM will make the tests "PASS". However, I couldn't find an
option to do this, so I disabled the pass by changing passes.def; that doesn't
sound like a good way to test SCCP.

However, it might be good to point out that, at least for arm-none-eabi and
x86_64-pc-linux-gnu, these tests are no longer testing SCCP: SCCP will not
change this code. I looked at the dumps and compared the assembly at -O2 with
and without '-fno-tree-scev-cprop'.

On arm-none-eabi, it used to be IVOPTS that made the test pass: it would reuse
the same ivtmp for computing the address used by the memory dereference and the
a_p assignment. Now, due to the reordering of LIM, it no longer does this.
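
For readers without the testsuite at hand, the loop under discussion has
roughly the following shape. This is a sketch reconstructed from the dumps in
this comment (8-byte element, store of 100 at offset 4, loop from k while
i <= 999 in steps of k), not a copy of scev-4.c; the struct layout and the
names S, x, y and f are my assumptions.

```c
/* Assumed element type: two ints, so the stride is 8 bytes and the
   stored member sits at offset 4 (matches "step: 8, offset: 4B").  */
typedef struct {
    int x;
    int y;
} S;

int *a_p;
S a[1000];

/* The address &a[i].y is needed twice per iteration: once for the
   a_p assignment and once for the store through it.  The question in
   this bug is whether both uses end up sharing one induction variable.
   The caller must pass k > 0, otherwise the loop does not terminate.  */
void f(int k)
{
    int i;

    for (i = k; i < 1000; i += k) {
        a_p = &a[i].y;
        *a_p = 100;
    }
}
```

After the loop, a_p holds the address computed in the last iteration, which
is why an address calculation can end up after the loop exit.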

On x86_64 I see the following code coming out of the OPTIMIZED dump for the
scev-4.c case:

...
  <bb 4>:
  # ivtmp.10_14 = PHI <_24(3), ivtmp.10_25(4)>
  i_11 = (int) ivtmp.10_14;
  MEM[symbol: a, index: ivtmp.10_14, step: 8, offset: 4B] = 100;
  ivtmp.10_25 = ivtmp.10_14 + _24;
  i_22 = (int) ivtmp.10_25;
  if (i_22 <= 999)
    goto <bb 4>;
  else
    goto <bb 5>;

  <bb 5>:
  _2 = (sizetype) i_11;
  _3 = _2 * 8;
  _10 = _3 + 4;
  _1 = &a + _10;
  a_p = _1;
...

Now yes, the scan-times &a will pass, but that's because the MEM is using
symbol: a instead of base: &a. I'm not sure this qualifies as a proper PASS.
Disabling LIM here the same way I did before, that is, removing the pass_lim
after pass_laddress and before pass_split_crit_edges, generates the following
OPTIMIZED dump:

...
  <bb 4>:
  _16 = (sizetype) k_4(D);
  _15 = _16 * 8;
  _21 = _15 + 4;
  _22 = &a + _21;
  ivtmp.9_14 = (unsigned long) _22;

  <bb 5>:
  # i_11 = PHI <k_4(D)(4), i_8(5)>
  # ivtmp.9_13 = PHI <ivtmp.9_14(4), ivtmp.9_17(5)>
  _1 = (int *) ivtmp.9_13;
  MEM[base: _1, offset: 0B] = 100;
  i_8 = k_4(D) + i_11;
  ivtmp.9_17 = ivtmp.9_13 + _15;
  if (i_8 <= 999)
    goto <bb 5>;
  else
    goto <bb 6>;

  <bb 6>:
  a_p = _1;
...

I prefer this output, since you lose the needless trailing address calculation.
I am not so sure the eventually generated assembly is better in this case,
though. I'll add both as attachments.
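
To make the difference between the two dumps easier to see, here is a
hand-written C rendering of both loop shapes. It is only a sketch: the function
names are mine, the entry guard is an assumption (the dumps elide the blocks
before the loop), and I am assuming _24 is k converted to unsigned long.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct { int x; int y; } S;   /* assumed layout, see static assert */
_Static_assert(sizeof(S) == 8 && offsetof(S, y) == 4, "layout assumed by dumps");

S a[1000];
int *a_p;

/* First dump (LIM in its new position): the store is indexed off the
   symbol a, and the address for a_p is recomputed after the loop.  */
void with_lim(int k)
{
    if (k <= 0 || k > 999)            /* assumed guard, not in the dump */
        return;
    unsigned long step = (unsigned long) k;   /* assumed meaning of _24 */
    unsigned long ivtmp = step;
    int i;
    do {
        i = (int) ivtmp;
        /* MEM[symbol: a, index: ivtmp, step: 8, offset: 4B] = 100 */
        *(int *) ((char *) a + ivtmp * 8 + 4) = 100;
        ivtmp += step;
    } while ((int) ivtmp <= 999);
    /* Trailing address calculation: _2 * 8 + 4 added to &a.  */
    a_p = (int *) ((char *) a + (size_t) i * 8 + 4);
}

/* Second dump (pass_lim removed): one pointer-valued ivtmp serves both
   the store and the final a_p assignment.  */
void without_lim(int k)
{
    if (k <= 0 || k > 999)            /* assumed guard, not in the dump */
        return;
    size_t stride = (size_t) k * 8;                        /* _15 */
    uintptr_t ivtmp = (uintptr_t) ((char *) a + stride + 4); /* _22 */
    int i = k;
    int *p;
    do {
        p = (int *) ivtmp;            /* MEM[base: _1, offset: 0B] */
        *p = 100;
        i = k + i;
        ivtmp += stride;
    } while (i <= 999);
    a_p = p;                          /* no recomputation needed */
}
```

Both functions leave a_p pointing at the last element written; the only
difference is whether the address is carried out of the loop or rebuilt from
i afterwards.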
