The following testcase takes much longer to compile at -O1 and above using GCC 4.7.3 on x86_64-linux (in both 32-bit and 64-bit modes). It does not seem to affect 4.8 and the current trunk.

$ gcc-4.7 -v
Using built-in specs.
COLLECT_GCC=gcc-4.7
COLLECT_LTO_WRAPPER=/usr/local/gcc-4.7/libexec/gcc/x86_64-unknown-linux-gnu/4.7.3/lto-wrapper
Target: x86_64-unknown-linux-gnu
Configured with: ../gcc-4.7.3/configure --enable-languages=c,c++,objc,obj-c++,fortran,lto --disable-checking --with-gmp=/usr/local/gcc-4.7 --with-mpfr=/usr/local/gcc-4.7 --with-mpc=/usr/local/gcc-4.7 --with-ppl=/usr/local/gcc-4.7 --with-cloog=/usr/local/gcc-4.7 --prefix=/usr/local/gcc-4.7
Thread model: posix
gcc version 4.7.3 (GCC)
$ time gcc-4.7 -O0 small.c
0.02user 0.02system 0:00.05elapsed 86%CPU (0avgtext+0avgdata 38336maxresident)k
0inputs+32outputs (0major+6471minor)pagefaults 0swaps
$ time gcc-4.7 -O1 small.c
33.43user 0.12system 0:33.64elapsed 99%CPU (0avgtext+0avgdata 660976maxresident)k
0inputs+32outputs (0major+53023minor)pagefaults 0swaps
$ time gcc-4.8 -O1 small.c
0.02user 0.02system 0:00.05elapsed 85%CPU (0avgtext+0avgdata 36704maxresident)k
0inputs+32outputs (0major+6364minor)pagefaults 0swaps
$ time gcc-trunk -O1 small.c
0.04user 0.01system 0:00.07elapsed 82%CPU (0avgtext+0avgdata 42656maxresident)k
0inputs+32outputs (0major+6700minor)pagefaults 0swaps
$
-----------------------------
int a, b, c, d, e;

int
main ()
{
  for (a = 0; a < 10; a++)
    for (b = 0; b < 10; b++)
      for (c = 0; c < 10; c++)
        for (d = 0; d < 10; d++)
          for (e = 0; e < 10; e++)
            ;
  return 0;
}
Confirmed.
(In reply to Marek Polacek from comment #1)
> Confirmed.

That's quick; thanks Marek! Please also take a look at 58479 when you get a chance. It's related (as well as 58318).
A lot of work was spent to make GCC faster in corner cases for GCC 4.8 (and 4.9). This bug triggers

 tree reassociation : 18.28 (96%) usr 0.05 (15%) sys 18.37 (94%) wall 2047 kB ( 3%) ggc

which means it's likely gsi_for_stmt becoming O(1) in 4.8 vs. being O(n) in 4.7, combined with the loops being completely unrolled (and thus larger BBs). Indeed, the final cunroll pass unrolls the loop nest completely, leaving a basic block with ~166000 instructions and no constant propagation before it (something 4.8 improved on as well).

I'd say WONTFIX and move on to 4.8.
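To illustrate why an O(n) statement lookup hurts so much in one huge basic block, here is a minimal sketch. It is not GCC's code: struct stmt, make_block, steps_to_find, total_steps_linear and total_steps_constant are hypothetical names, and the model assumes exactly one lookup per statement, which may not match what the reassociation pass actually does.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical statement node; a stand-in, not GCC's real gimple type. */
struct stmt {
    struct stmt *next;
};

/* Link n statements into one chain, modelling a single large basic block.
   Returns NULL when n == 0 or on allocation failure. */
static struct stmt *make_block(size_t n)
{
    if (n == 0)
        return NULL;
    struct stmt *s = calloc(n, sizeof *s);
    if (s == NULL)
        return NULL;
    for (size_t i = 0; i + 1 < n; i++)
        s[i].next = &s[i + 1];
    return s;
}

/* O(n) lookup: walk from the head of the block until the target is found.
   Returns the number of nodes visited. */
static size_t steps_to_find(const struct stmt *head, const struct stmt *target)
{
    size_t steps = 0;
    for (const struct stmt *p = head; p != NULL; p = p->next) {
        steps++;
        if (p == target)
            break;
    }
    return steps;
}

/* Nodes visited when each of the n statements is looked up once by
   walking from the head: 1 + 2 + ... + n = n * (n + 1) / 2. */
size_t total_steps_linear(size_t n)
{
    struct stmt *bb = make_block(n);
    if (bb == NULL)
        return 0;
    size_t total = 0;
    for (size_t i = 0; i < n; i++)
        total += steps_to_find(bb, &bb[i]);
    free(bb);
    return total;
}

/* With an O(1) lookup, each of the n lookups costs a single step. */
size_t total_steps_constant(size_t n)
{
    return n;
}
```

Under this model, a block of 166000 statements costs 166000 * 166001 / 2, about 1.4e10 node visits, with the linear lookup, against 166000 with the constant-time one. That is the kind of quadratic blowup that would be consistent with the -O1 timings above.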
Fixed for 4.8.0.