Bug 54337 - Dramatic Compilation slow-down on higher Optimizaitons
Summary: Dramatic Compilation slow-down on higher Optimizaitons
Status: RESOLVED DUPLICATE of bug 46590
Alias: None
Product: gcc
Classification: Unclassified
Component: middle-end (show other bugs)
Version: 4.7.1
: P3 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
Keywords: compile-time-hog
Depends on:
Reported: 2012-08-20 23:02 UTC by nbhargava
Modified: 2012-08-21 08:09 UTC (History)
1 user (show)

See Also:
Target: x86_64-*-*
Known to work:
Known to fail:
Last reconfirmed:

Test Case (1.39 KB, application/octet-stream)
2012-08-20 23:02 UTC, nbhargava
Linux-compatible test case (1.39 KB, application/octet-stream)
2012-08-20 23:21 UTC, nbhargava
Linux-compatible test case (1.39 KB, application/octet-stream)
2012-08-21 00:04 UTC, nbhargava

Note You need to log in before you can comment on or make changes to this bug.
Description nbhargava 2012-08-20 23:02:28 UTC
Created attachment 28061 [details]
Test Case

The attached file contains a few function definitions followed by a main method
with a single line of code duplicated 2500 times.

At optimization levels O0, the code compiles relatively quickly (~8s).
However, at optimization levels O2 and O3, the compilation time jumps to over 30s to compile -- almost a 4x difference.

Interesting things about what speeds up the optimized compilation:

Having an if-statement with an empty body cuts down compilation times on all
optimization levels to under 3s.

This could be related to http://llvm.org/bugs/show_bug.cgi?id=13651, as the problem was discovered in a file that repeats a macro heavily, as these examples do.
Comment 1 Andrew Pinski 2012-08-20 23:07:54 UTC
For me it does not even compile:
t1.cc:44:36: error: ‘numeric_limits’ is not a member of ‘std’
   expected_ss << std::setprecision(std::numeric_limits<float>::digits10);
Comment 2 nbhargava 2012-08-20 23:21:43 UTC
Created attachment 28062 [details]
Linux-compatible test case

The original test case was timed and compiled on a Mac. This one shows a much less extreme version of the problem (only goes from 2.5 seconds to 5 seconds).

The original problem was found as part of the gtest framework and was slow on all architectures. I will try to manufacture a version that better showcases the original problem in the meanwhile.
Comment 3 nbhargava 2012-08-21 00:04:10 UTC
Created attachment 28063 [details]
Linux-compatible test case

$ time g++ -O3 -c t2.cc

real	0m22.806s
user	0m22.390s
sys	0m0.330s

$ time g++ -O0 -c t2.cc

real	0m2.357s
user	0m2.260s
sys	0m0.050s

On both Linux and Mac, this type of discrepancy exists.
Comment 4 Steven Bosscher 2012-08-21 07:00:24 UTC
Can you please provide the output of your "gcc --version" and of a compiler run with the extra flag "-ftime-report"?
Comment 5 Richard Biener 2012-08-21 08:09:55 UTC
Sort-of "confirmed", though you should expect compile-time increases from -O0 to
-O[123] and I don't think the increases are unreasonable, esp. for C++ code which
can see a lot if inlining.

> /usr/bin/time g++-4.7 -S -o /dev/null t.C -O0
2.24user 0.09system 0:02.84elapsed 82%CPU (0avgtext+0avgdata 918736maxresident)k
22384inputs+0outputs (97major+62614minor)pagefaults 0swaps
> /usr/bin/time g++-4.7 -S -o /dev/null t.C -O1
11.24user 0.15system 0:11.43elapsed 99%CPU (0avgtext+0avgdata 1045616maxresident)k
0inputs+0outputs (0major+113102minor)pagefaults 0swaps
> /usr/bin/time g++-4.7 -S -o /dev/null t.C -O2
16.67user 0.38system 0:17.12elapsed 99%CPU (0avgtext+0avgdata 1189056maxresident)k
2504inputs+0outputs (11major+199132minor)pagefaults 0swaps
> /usr/bin/time g++-4.7 -S -o /dev/null t.C -O3
17.79user 0.30system 0:18.20elapsed 99%CPU (0avgtext+0avgdata 1203280maxresident)k
328inputs+0outputs (2major+200860minor)pagefaults 0swaps

-O3 time-report:

 alias stmt walking      :   2.87 (16%) usr   0.02 ( 4%) sys   2.96 (16%) wall       1 kB ( 0%) ggc
 tree PTA                :   4.59 (26%) usr   0.01 ( 2%) sys   4.63 (25%) wall    1313 kB ( 1%) ggc

so I think we have a duplicate for this kind of issue.  SCCVN walks over
non-aliased stores are not limited and can expose quadratic complexity
if there are no aliases in the function that stop the walk.  PR46590
comes to my mind which already tracks this.

*** This bug has been marked as a duplicate of bug 46590 ***