Bug 108523 - [13 Regression] -O1 -fcode-hoisting causes long compilation time ?
Summary: [13 Regression] -O1 -fcode-hoisting causes long compilation time ?
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: tree-optimization (show other bugs)
Version: 13.0
: P1 normal
Target Milestone: 13.0
Assignee: Richard Biener
URL:
Keywords: compile-time-hog
Depends on:
Blocks:
 
Reported: 2023-01-24 17:00 UTC by David Binderman
Modified: 2023-01-26 07:40 UTC (History)
2 users (show)

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed: 2023-01-25 00:00:00


Attachments
C source code (23.10 KB, text/plain)
2023-01-24 17:00 UTC, David Binderman
Details

Note You need to log in before you can comment on or make changes to this bug.
Description David Binderman 2023-01-24 17:00:56 UTC
Created attachment 54337 [details]
C source code

The attached C code takes about 0.1 seconds to compile normally:

$ (ulimit -t 300; time ../results.20230124.release/bin/gcc -c -w -O1    -fno-var-tracking bug875.c)

real	0m0.133s
user	0m0.095s
sys	0m0.013s

Add in -fcode-hoisting and it seems to take a lot longer:

$ (ulimit -t 300; time ../results.20230124.release/bin/gcc -c -w -O1  -fcode-hoisting  -fno-var-tracking bug875.c)
gcc: fatal error: Killed signal terminated program cc1
compilation terminated.

real	5m3.155s
user	4m59.955s
sys	0m0.017s

$ ../results.20230124.release/bin/gcc -v
Using built-in specs.
COLLECT_GCC=../results.20230124.release/bin/gcc
COLLECT_LTO_WRAPPER=/home/dcb36/gcc/results.20230124.release/libexec/gcc/x86_64-pc-linux-gnu/13.0.1/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: ../trunk.d1/configure --prefix=/home/dcb36/gcc/results.20230124.release --disable-multilib --disable-bootstrap --with-pkgversion=e304e9283a97e28d --enable-checking=release --enable-languages=c
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 13.0.1 20230124 (experimental) (e304e9283a97e28d)
Comment 1 David Binderman 2023-01-24 17:04:55 UTC
Also won't run to completion with a ulimit of 750 seconds. 

Trying 1200 seconds.
Comment 2 David Binderman 2023-01-24 17:29:37 UTC
Doesn't complete in 1200 seconds.
Comment 3 David Binderman 2023-01-24 17:42:35 UTC
Problem seems to start sometime before git hash g:9b111debbfb79a0a,
dated 20221229.

I'll try a build of a month earlier.
Comment 4 David Binderman 2023-01-24 18:00:27 UTC
Seems to run fine in about 0.1 seconds with g:4d08c674b0114622,
dated 20221129.

That seems to be about 533 commits.

I'll have a go at a git bisect.
Comment 5 David Binderman 2023-01-24 18:25:43 UTC
Current range seems to be g:4d08c674b0114622 .. g:400d9fc1f0433611
which is 133 commits.
Comment 6 David Binderman 2023-01-24 18:40:36 UTC
Git range now seems to be g:4d08c674b0114622 .. g:b2aa75ded65f8c02
which is a range of 33 commits.
Comment 7 David Binderman 2023-01-24 18:56:07 UTC
Git range now seems to be g:4d08c674b0114622 .. g:36cabc257dfb7dd4
which is 8 commits.
Comment 8 David Binderman 2023-01-24 19:41:31 UTC
Commit g:fd8dd6c0384969170e594be34da278a072d5eb76 dated 2022-11-29
looks to be the one that causes the bug.

fd8dd6c0384969170e594be34da278a072d5eb76 is the first bad commit
commit fd8dd6c0384969170e594be34da278a072d5eb76
Author: Richard Biener <rguenther@suse.de>
Date:   Tue Nov 29 12:56:22 2022 +0100


Over to Richard for their best advice.
Comment 9 Richard Biener 2023-01-25 07:35:45 UTC
I will have a look.
Comment 10 Richard Biener 2023-01-25 10:06:00 UTC
We don't converge:

Predication says _347 and _346 are equal on edge 108 -> 20
Setting value number of g_6_lsm.126_456 to _346 (changed)
Iterating to 25 BB20
...
Setting value number of g_6_lsm.126_456 to _347 (changed)
Iterating to 25 BB20
...
Predication says _347 and _346 are equal on edge 108 -> 20
Setting value number of g_6_lsm.126_456 to _346 (changed)
Iterating to 25 BB20
...
Setting value number of g_6_lsm.126_456 to _347 (changed)
Iterating to 25 BB20
...
Comment 11 Richard Biener 2023-01-25 11:19:37 UTC
So the issue is that we oscillate between recognizing the PHI as degenerate
because of an equivalence recorded on the non-backedge(!) to the backedge value
which in turn changes because of this change:

Value numbering stmt = g_6_lsm.126_456 = PHI <g_6_lsm.126_304(19), g_6_lsm.126_338(108)>
Setting value number of g_6_lsm.126_456 to _347
...
Value numbering stmt = _350 = g_6_lsm.126_456;
Setting value number of _350 to _347 (changed)
_347 is available for _347
Value numbering stmt = _352 = _348 ^ _350;
_347 is available for _347
Match-and-simplified _348 ^ _350 to _346
RHS _348 ^ _350 simplified to _346
Setting value number of _352 to _346 (changed)
_346 is available for _346
Value numbering stmt = g_6_lsm.126_304 = _352;
Setting value number of g_6_lsm.126_304 to _346 (changed)
_346 is available for _346
Looking for changed values of backedge 19->20 destination PHIs
Setting value number of g_481.64_448 to g_481.64_448
Setting value number of .MEM_464 to .MEM_409
Predication says _347 and _346 are equal on edge 108 -> 20
Setting value number of g_6_lsm.126_456 to _346 (changed)
Iterating to 25 BB20
...
Value numbering stmt = g_6_lsm.126_456 = PHI <g_6_lsm.126_304(19), g_6_lsm.126_338(108)>
Predication says _347 and _346 are equal on edge 108 -> 20
Setting value number of g_6_lsm.126_456 to _346
...
Value numbering stmt = _350 = g_6_lsm.126_456;
Setting value number of _350 to _346 (changed)
_346 is available for _346
Value numbering stmt = _352 = _348 ^ _350;
_346 is available for _346
Match-and-simplified _348 ^ _350 to _347
RHS _348 ^ _350 simplified to _347
Setting value number of _352 to _347 (changed)
_347 is available for _347
Value numbering stmt = g_6_lsm.126_304 = _352;
Setting value number of g_6_lsm.126_304 to _347 (changed)
_347 is available for _347
Looking for changed values of backedge 19->20 destination PHIs
Setting value number of g_481.64_448 to g_481.64_448
Setting value number of .MEM_464 to .MEM_409
Setting value number of g_6_lsm.126_456 to _347 (changed)
Iterating to 25 BB20

and repeat.

And we have:

  _348 = _346 ^ _347;

and

<bb 18> [local count: 42914375]:
# g_1198.66_449 = PHI <_358(141), 0(17)>
# g_6_lsm.126_131 = PHI <g_6_lsm.126_349(141), g_6_lsm.126_438(17)>
g_481_lsm.127_447 = 1;
_57 = g_6_lsm.126_131;
_58 = _57 ^ _348;
g_6_lsm.126_338 = _58;
if (_346 != _347)
  goto <bb 150>; [5.50%]
else
  goto <bb 108>; [94.50%]

<bb 108> [local count: 40554084]:
goto <bb 20>; [100.00%]


So when the value we compare to oscillates.  Interestingly when I try
to feed a similar case into FRE1 that "works":

int foo (int a, int b, int n)
{
  int val = a;
  if (a == b)
    {
      do
        {
          val = (a ^ b) ^ val;
        }
      while (--n > 0);
    }
  return val;
}

but maybe this is by luck.  So what we need to avoid is for an equivalence
a == b, to oscillate between 'a' and 'b' as result.
Comment 12 Richard Biener 2023-01-25 12:31:42 UTC
I have a patch but the testcase is too large.
Comment 13 GCC Commits 2023-01-25 12:37:41 UTC
The master branch has been updated by Richard Biener <rguenth@gcc.gnu.org>:

https://gcc.gnu.org/g:c29d85359add807200a1a851026b4e4a9d6b714c

commit r13-5348-gc29d85359add807200a1a851026b4e4a9d6b714c
Author: Richard Biener <rguenther@suse.de>
Date:   Wed Jan 25 13:31:46 2023 +0100

    tree-optimization/108523 - fix endless iteration in VN
    
    The following fixes not converging iteration in value-numbering of
    PHI nodes when we use an equivalence to prove the PHI node is
    degenerate.  We have to avoid the situation where we oscillate
    between the two equivalent values because the result is fed back
    via a backedge.
    
            PR tree-optimization/108523
            * tree-ssa-sccvn.cc (visit_phi): Avoid using the exclusive
            backedge value for the result when using predication to
            prove equivalence.
Comment 14 Richard Biener 2023-01-25 12:38:18 UTC
Fixed, but I'll see if somebody comes up with a reduced testcase.
Comment 15 David Binderman 2023-01-25 16:49:00 UTC
(In reply to Richard Biener from comment #14)
> Fixed, but I'll see if somebody comes up with a reduced testcase.

I have a reduction running with cvise.
Comment 16 David Binderman 2023-01-25 17:10:31 UTC
cvise produces:

int g_149, g_167, g_481;
main() {
  int *l_1478 = &g_149;
  *l_1478 ^= g_167;
lbl_1481:
  for (;;) {
    g_481 = 1;
    for (; g_481; g_481 += 1) {
      g_167 ^= *l_1478;
      if (g_149)
        goto lbl_1481;
    }
  }
}
Comment 17 Richard Biener 2023-01-26 07:21:45 UTC
Thanks a lot!
Comment 18 GCC Commits 2023-01-26 07:39:58 UTC
The master branch has been updated by Richard Biener <rguenth@gcc.gnu.org>:

https://gcc.gnu.org/g:1f6d05e9ad858b59b824f57d09400adcb2c5e4ad

commit r13-5378-g1f6d05e9ad858b59b824f57d09400adcb2c5e4ad
Author: Richard Biener <rguenther@suse.de>
Date:   Thu Jan 26 08:38:35 2023 +0100

    tree-optimization/108523 - testcase for the bug
    
    This adds a reduced testcase for the PR.
    
            PR tree-optimization/108523
            * gcc.dg/torture/pr108523.c: New testcase.
Comment 19 Richard Biener 2023-01-26 07:40:34 UTC
Fixed.