Bug 58478 - [4.7 Regression] very slow compilation at -O1 and above on a nested loop
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: tree-optimization
Version: 4.7.3
Importance: P3 normal
Target Milestone: 4.8.0
Assignee: Not yet assigned to anyone
URL:
Keywords: compile-time-hog
Depends on:
Blocks:
 
Reported: 2013-09-19 19:14 UTC by Zhendong Su
Modified: 2014-06-12 13:29 UTC
CC List: 0 users

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed: 2013-09-19 00:00:00


Description Zhendong Su 2013-09-19 19:14:22 UTC
The following testcase takes much longer to compile at -O1 and above than at -O0 using GCC 4.7.3 on x86_64-linux (in both 32-bit and 64-bit modes): about 33 seconds versus 0.05 seconds in the timings below.

It does not seem to affect 4.8 or the current trunk.


$ gcc-4.7 -v
Using built-in specs.
COLLECT_GCC=gcc-4.7
COLLECT_LTO_WRAPPER=/usr/local/gcc-4.7/libexec/gcc/x86_64-unknown-linux-gnu/4.7.3/lto-wrapper
Target: x86_64-unknown-linux-gnu
Configured with: ../gcc-4.7.3/configure --enable-languages=c,c++,objc,obj-c++,fortran,lto --disable-checking --with-gmp=/usr/local/gcc-4.7 --with-mpfr=/usr/local/gcc-4.7 --with-mpc=/usr/local/gcc-4.7 --with-ppl=/usr/local/gcc-4.7 --with-cloog=/usr/local/gcc-4.7 --prefix=/usr/local/gcc-4.7
Thread model: posix
gcc version 4.7.3 (GCC) 
$ time gcc-4.7 -O0 small.c
0.02user 0.02system 0:00.05elapsed 86%CPU (0avgtext+0avgdata 38336maxresident)k
0inputs+32outputs (0major+6471minor)pagefaults 0swaps
$ time gcc-4.7 -O1 small.c
33.43user 0.12system 0:33.64elapsed 99%CPU (0avgtext+0avgdata 660976maxresident)k
0inputs+32outputs (0major+53023minor)pagefaults 0swaps
$ time gcc-4.8 -O1 small.c
0.02user 0.02system 0:00.05elapsed 85%CPU (0avgtext+0avgdata 36704maxresident)k
0inputs+32outputs (0major+6364minor)pagefaults 0swaps
$ time gcc-trunk -O1 small.c
0.04user 0.01system 0:00.07elapsed 82%CPU (0avgtext+0avgdata 42656maxresident)k
0inputs+32outputs (0major+6700minor)pagefaults 0swaps
$ 


-----------------------------


int a, b, c, d, e; 

int main ()
{
  for (a = 0; a < 10; a++)
    for (b = 0; b < 10; b++)
      for (c = 0; c < 10; c++)
	for (d = 0; d < 10; d++)
	  for (e = 0; e < 10; e++)
	    ;
  return 0;
}
Comment 1 Marek Polacek 2013-09-19 19:18:46 UTC
Confirmed.
Comment 2 Zhendong Su 2013-09-19 19:50:37 UTC
(In reply to Marek Polacek from comment #1)
> Confirmed.

That's quick; thanks Marek! 

Please also take a look at bug 58479 when you get a chance.

It's related, as is bug 58318.
Comment 3 Richard Biener 2013-09-20 07:44:38 UTC
A lot of work went into making GCC faster in corner cases for GCC 4.8 (and 4.9).
This bug triggers

 tree reassociation      :  18.28 (96%) usr   0.05 (15%) sys  18.37 (94%) wall    2047 kB ( 3%) ggc

which means the cause is likely gsi_for_stmt being O(n) in 4.7 (it became O(1)
in 4.8), combined with the loops being completely unrolled (and thus larger
BBs).  Indeed, the final cunroll pass unrolls the loop nest completely, leaving
a basic block with ~166000 instructions and no constant propagation before it
(something 4.8 improved on as well).
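
To illustrate why an O(n) statement lookup hurts on a block this large, here is
a minimal sketch in C. It is not GCC code: struct stmt, find_by_scan and
total_scan_cost are hypothetical names, and it assumes, purely for
illustration, that a pass looks up each statement of the block once by walking
from the head of the block.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a statement in a basic block; not GCC's type. */
struct stmt {
    struct stmt *next;
};

/* O(n) lookup: walk the block from its head until `target` is reached.
   Returns the number of nodes visited. */
static size_t find_by_scan(const struct stmt *head, const struct stmt *target)
{
    size_t visited = 0;
    for (const struct stmt *s = head; s != NULL; s = s->next) {
        visited++;
        if (s == target)
            break;
    }
    return visited;
}

/* Total nodes visited when every statement of an n-statement block is
   looked up once by scanning (assumed access pattern, see above). */
static size_t total_scan_cost(size_t n)
{
    assert(n > 0);
    struct stmt *block = calloc(n, sizeof *block);
    assert(block != NULL);

    /* Chain the statements into a singly linked list; the last `next`
       stays NULL because calloc zero-fills the array. */
    for (size_t i = 0; i + 1 < n; i++)
        block[i].next = &block[i + 1];

    size_t total = 0;
    for (size_t i = 0; i < n; i++)
        total += find_by_scan(&block[0], &block[i]);

    free(block);
    return total;
}
```

Under that assumption a block of n statements costs n(n+1)/2 node visits, so
total_scan_cost(1000) returns 500500. For n = 166000 that is roughly
1.4 * 10^10 visits, whereas a constant-time lookup would need only about n.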

I'd say WONTFIX and move on to 4.8.
Comment 4 Richard Biener 2014-06-12 13:29:35 UTC
Fixed for 4.8.0.