Bug 17278 - [4.0/4.1 Regression] 8% C++ compile-time regression in comparison with 3.4.1 at -O1 optimization level
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: middle-end
Version: 4.0.0
Importance: P2 normal
Target Milestone: 4.0.0
Assignee: Not yet assigned to anyone
URL:
Keywords: compile-time-hog
Depends on: 17707
Blocks: 13776
Reported: 2004-09-02 08:55 UTC by Karel Gardas
Modified: 2005-03-02 21:28 UTC

See Also:
Host: i686-pc-linux-gnu
Target: i686-pc-linux-gnu
Build: i686-pc-linux-gnu
Known to work: 3.4.0
Known to fail: 4.0.0
Last reconfirmed: 2005-02-01 00:21:16


Attachments
gcc 3.5.0 preprocessed typecode.cc file (151.06 KB, application/octet-stream)
2004-09-02 08:58 UTC, Karel Gardas
Disable some expensive passes at -O1 (721 bytes, patch)
2005-01-21 10:57 UTC, Steven Bosscher

Description Karel Gardas 2004-09-02 08:56:00 UTC
Hello,
The attached preprocessed file typecode.ii shows a 40% compile-time regression
when compiled with 3.5.0 in comparison with 3.4.1 at -O1. To be precise, the
40% regression is measured on the non-preprocessed file. When I compile the
file preprocessed by 3.5.0 with 3.4.1, the regression is only about 30%, which
suggests that 3.5.0's libstdc++ library is either bigger or slower to
compile.
Cheers,
Karel
Comment 1 Karel Gardas 2004-09-02 08:58:04 UTC
Created attachment 7021
gcc 3.5.0 preprocessed typecode.cc file
Comment 2 Karel Gardas 2004-09-02 09:00:06 UTC
Here is an analysis done by Steven Bosscher:
http://gcc.gnu.org/ml/gcc/2004-08/msg01602.html
Comment 3 Giovanni Bajo 2004-09-02 11:05:07 UTC
Confirmed. Looks like it's just a matter of tuning which optimizations are 
implied by -O1.
Comment 4 Andrew Pinski 2004-09-02 17:07:05 UTC
Hmm, I disagree with Steven's analysis. I think there are other problems here than just more
optimizations. If I have time, I will look into it in the next few days.
Comment 5 Andrew Pinski 2004-10-03 23:37:50 UTC
Well, now most of the time is spent in cgraph_reset_static_var_maps, which is PR 17707.
Comment 6 Andrew Pinski 2004-10-25 12:55:09 UTC
Rewording the summary because the regression is now only 23%:
File         342-O0  400-O0  Delta%  342-O1  400-O1  Delta%  342-O2  400-O2  Delta%
typecode.cc    9.09    7.65   18.82   13.53   17.73  -23.69   32.95   23.29   41.48
From <http://gcc.gnu.org/ml/gcc/2004-10/msg00952.html>.
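The Delta% columns in this table appear to be computed as (342 time / 400 time - 1) * 100, so a negative value means the 4.0.0 compiler is slower. That formula is inferred from the numbers and is not stated in the thread. A minimal Python sketch (the function name is made up for illustration) that reproduces the three deltas:

```python
def delta_percent(old_time, new_time):
    """Speedup of the new compiler relative to the old one, in percent.

    Assumed formula (inferred from the table, not from the report):
    positive means the new compiler is faster, negative means slower.
    """
    return (old_time / new_time - 1.0) * 100.0


# (342 time, 400 time) pairs for typecode.cc, copied from the table.
timings = {"-O0": (9.09, 7.65), "-O1": (13.53, 17.73), "-O2": (32.95, 23.29)}

for level, (old, new) in timings.items():
    print(f"{level}: {delta_percent(old, new):.2f}%")
```

Under this convention the "23%" in the reworded summary is the -O1 delta, that is, 3.4.x taking about 23.7% less time than 4.0.0.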
Comment 7 Karel Gardas 2004-10-25 13:06:01 UTC
Subject: Re:  [4.0 Regression] 24% C++ compile-time regression
 in comparison with 3.4.1 at -O1 optimization level


Yes, but this only applies to typecode.cc. If you consider ir.cc, you will
need to increase it from 40% to 44%, and since the subject does not mention
typecode.cc, I would consider leaving it at 40% the better option for now...

Cheers,
Karel

Comment 8 Karel Gardas 2004-10-25 13:12:00 UTC
Subject: Re:  [4.0 Regression] 24% C++ compile-time regression
 in comparison with 3.4.1 at -O1 optimization level


Please have a look into http://gcc.gnu.org/bugzilla/show_bug.cgi?id=13776
for preprocessed ir.cc file for your experiments.

Cheers,
Karel

Comment 9 Andrew Pinski 2004-10-25 14:07:58 UTC
For ir.cc, does -fno-threadsafe-statics help? If so, this is not a bug: the C++ front end has changed
to emit more functions, so slower compile times in the middle end and back end are to be
expected.
Comment 10 Karel Gardas 2004-10-26 06:45:42 UTC
Subject: Re:  [4.0 Regression] 24% C++ compile-time regression
 in comparison with 3.4.1 at -O1 optimization level

 Hi,

I have now tested -fno-threadsafe-statics and it does not make much difference,
IMHO:

$ c++  -I../include  -time -O0 -Wall   -DPIC -fPIC  -c ir.cc -o ir.pic.o
# cc1plus 68.57 2.26
# as 5.92 0.27

$ c++  -I../include  -fno-threadsafe-statics -time -O0 -Wall   -DPIC -fPIC  -c ir.cc -o ir.pic.o
# cc1plus 67.94 2.04
# as 5.86 0.26

Cheers,
Karel

Comment 11 Andrew Pinski 2004-11-27 00:39:59 UTC
Does anybody want to do new timings for typecode.ii at -O1? I think that testcase is now fixed.
Comment 12 Serge Belyshev 2004-11-27 03:35:09 UTC
       3.4.4     4.0.0     delta
---------------------------------
-O0      8.2       7.1      -13%
-O1     11.0      16.5       50%
-O2     23.3      21.8       -6%
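Note that this table uses the opposite sign convention from the one in comment 6: here the delta seems to be (4.0.0 time / 3.4.4 time - 1) * 100, so a positive value means 4.0.0 is slower. The short Python sketch below reproduces the column and converts the -23.69 from comment 6 into this convention, which gives roughly 31%. The helper names are illustrative and the formula is inferred from the numbers.

```python
def slowdown_percent(old_time, new_time):
    """Extra compile time of the new compiler, in percent of the old time."""
    return (new_time / old_time - 1.0) * 100.0


def speedup_to_slowdown(speedup_percent):
    """Convert (old/new - 1) * 100 into (new/old - 1) * 100."""
    ratio = 1.0 + speedup_percent / 100.0  # this is old/new
    return (1.0 / ratio - 1.0) * 100.0


# (3.4.4 time, 4.0.0 time) pairs copied from the table.
rows = {"-O0": (8.2, 7.1), "-O1": (11.0, 16.5), "-O2": (23.3, 21.8)}

for level, (old, new) in rows.items():
    print(f"{level}: {slowdown_percent(old, new):+.0f}%")

# The -O1 delta from comment 6, expressed in this table's convention.
print(f"comment 6 at -O1: {speedup_to_slowdown(-23.69):+.1f}%")
```

The two sets of timings come from different baselines and presumably different machines, so the converted figure is only useful for comparing conventions, not for judging progress between the two measurements.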
Comment 13 Andrew Pinski 2004-12-04 17:54:46 UTC
 tree remove redundant PHIs:   0.34 ( 2%) usr   0.02 ( 0%) sys   0.34 ( 1%) wall
 tree SSA rewrite      :   0.42 ( 3%) usr   0.06 ( 1%) sys   0.62 ( 3%) wall
 tree SSA other        :   0.88 ( 6%) usr   0.61 ( 9%) sys   1.39 ( 6%) wall
 tree operand scan     :   0.45 ( 3%) usr   0.64 (10%) sys   1.51 ( 6%) wall
 dominator optimization:   0.86 ( 6%) usr   0.05 ( 1%) sys   0.83 ( 3%) wall
 tree alias analysis   :   0.19 ( 1%) usr   0.01 ( 0%) sys   0.25 ( 1%) wall
 tree PHI insertion    :   0.23 ( 2%) usr   0.02 ( 0%) sys   0.24 ( 1%) wall

 expand                :   0.72 ( 5%) usr   0.08 ( 1%) sys   0.91 ( 4%) wall
I wonder why expand is this slow; the timer might again be counting more than just expand here.
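To compare expand against the tree passes listed in this excerpt, the lines can be parsed and summed. The following Python sketch assumes the excerpt is in GCC's -ftime-report line format (the thread does not say so explicitly); the regular expression and function name are illustrative only.

```python
import re

# Lines copied from the timing excerpt in this comment.
REPORT = """\
 tree remove redundant PHIs:   0.34 ( 2%) usr   0.02 ( 0%) sys   0.34 ( 1%) wall
 tree SSA rewrite      :   0.42 ( 3%) usr   0.06 ( 1%) sys   0.62 ( 3%) wall
 tree SSA other        :   0.88 ( 6%) usr   0.61 ( 9%) sys   1.39 ( 6%) wall
 tree operand scan     :   0.45 ( 3%) usr   0.64 (10%) sys   1.51 ( 6%) wall
 dominator optimization:   0.86 ( 6%) usr   0.05 ( 1%) sys   0.83 ( 3%) wall
 tree alias analysis   :   0.19 ( 1%) usr   0.01 ( 0%) sys   0.25 ( 1%) wall
 tree PHI insertion    :   0.23 ( 2%) usr   0.02 ( 0%) sys   0.24 ( 1%) wall

 expand                :   0.72 ( 5%) usr   0.08 ( 1%) sys   0.91 ( 4%) wall
"""

# name : <usr> (N%) usr  <sys> (N%) sys  <wall> (N%) wall
LINE_RE = re.compile(
    r"^\s*(?P<name>.+?)\s*:\s*"
    r"(?P<usr>[\d.]+)\s*\(\s*\d+%\)\s*usr\s+"
    r"(?P<sys>[\d.]+)\s*\(\s*\d+%\)\s*sys\s+"
    r"(?P<wall>[\d.]+)\s*\(\s*\d+%\)\s*wall\s*$"
)


def parse_report(text):
    """Return {pass name: (usr, sys, wall)} for every line that matches."""
    result = {}
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if m:
            result[m["name"]] = (float(m["usr"]), float(m["sys"]), float(m["wall"]))
    return result


passes = parse_report(REPORT)
# Sum user time over everything except expand.
tree_usr = sum(t[0] for name, t in passes.items() if name != "expand")
print(f"tree passes: {tree_usr:.2f}s usr, expand: {passes['expand'][0]:.2f}s usr")
```

On these numbers, expand alone takes about a fifth of the user time of the seven listed tree passes combined, which is consistent with the suspicion that its timer covers more than expansion.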
Comment 14 Steven Bosscher 2004-12-23 11:28:50 UTC
Karel, your latest comparison is almost a month old (it was
here: http://gcc.gnu.org/ml/gcc/2004-11/msg01157.html), and
we've fixed a few compile time bottlenecks since then.  Can
you spare some cycles and send an updated comparison?  It's
probably still ir.cc where we have regressions, but perhaps
not as bad as before *fingers crossed*  ;-)

Comment 15 Steven Bosscher 2004-12-23 11:30:54 UTC
It's interesting that -O1 is consistently slower than previous
releases.  Perhaps we should turn off some of the more costly
tree passes at -O1, such as iterating in DOM, and the expensive
loop optimizations.  Any thoughts on this?
 
Comment 16 Karel Gardas 2004-12-28 21:00:39 UTC
Subject: Re:  [4.0 Regression] 24% C++ compile-time regression
 in comparison with 3.4.1 at -O1 optimization level


New comparison is here:
http://gcc.gnu.org/ml/gcc/2004-12/msg01157.html

Good work! :-)

Cheers,
Karel
--
Karel Gardas                  kgardas@objectsecurity.com
ObjectSecurity Ltd.           http://www.objectsecurity.com

Comment 17 Andrew Pinski 2004-12-28 22:33:20 UTC
Now only 8%.
Comment 18 Karel Gardas 2004-12-28 22:39:37 UTC
Subject: Re:  [4.0 Regression] 8% C++ compile-time
 regression in comparison with 3.4.1 at -O1 optimization level


On Tue, 28 Dec 2004, pinskia at gcc dot gnu dot org wrote:

> Now only 8%.

True for typecode.cc, but not for ir.cc where there is ~28% regression.

Cheers,
Karel

Comment 19 Andrew Pinski 2004-12-28 22:40:40 UTC
(In reply to comment #18)
> Subject: Re:  [4.0 Regression] 8% C++ compile-time
>  regression in comparison with 3.4.1 at -O1 optimization level
> 
> 
> On Tue, 28 Dec 2004, pinskia at gcc dot gnu dot org wrote:
> 
> > Now only 8%.
> 
> True for typecode.cc, but not for ir.cc where there is ~28% regression.

PR 13776 is keeping track of that regression.
This one is for typecode.cc.
Comment 20 Karel Gardas 2004-12-28 22:42:42 UTC
Subject: Re:  [4.0 Regression] 8% C++ compile-time
 regression in comparison with 3.4.1 at -O1 optimization level

On Tue, 28 Dec 2004, pinskia at gcc dot gnu dot org wrote:

> > On Tue, 28 Dec 2004, pinskia at gcc dot gnu dot org wrote:
> >
> > > Now only 8%.
> >
> > True for typecode.cc, but not for ir.cc where there is ~28% regression.
>
> PR 13776 is keeping track of that regression.
> This one is for typecode.cc.

Err, you are right! Sorry for that.

Karel

Comment 21 Mark Mitchell 2005-01-20 22:58:14 UTC
Does this 8% regression apply to preprocessed source, or only to unpreprocessed
source?  If the latter, then this PR should be closed as WONTFIX; the runtime
library has gotten bigger, and that makes things slower, but nothing is going to
be done about that.
Comment 22 Steven Bosscher 2005-01-21 00:24:57 UTC
Mark, typecode.ii ;-) 
 
So it is preprocessed.  That doesn't mean it's smaller, though: a larger 
library is still a larger library after preprocessing. 
 
Anyway, the problem here is more that compared to gcc 3.x we do a lot 
more work at -O1.  Basically all tree-ssa passes run at -O1, we probably 
should look into disabling a few, and some RTL passes too. 
 
I've suggested a few things before.  Disabling RTL CSE1, gcse, and jump 
threading at -O1, and not allowing tree-ssa DOM to iterate at -O1, would 
make a big difference already, I think.  But apparently nobody cares 
enough to actually try it. 
 
Comment 23 Steven Bosscher 2005-01-21 10:57:24 UTC
Created attachment 8029
Disable some expensive passes at -O1

I'm running a SPECint comparison between GCC-hammer-branch and mainline
with the attached patch applied.
Comment 25 CVS Commits 2005-01-27 16:33:53 UTC
Subject: Bug 17278

CVSROOT:	/cvs/gcc
Module name:	gcc
Changes by:	steven@gcc.gnu.org	2005-01-27 16:32:16

Modified files:
	gcc            : ChangeLog opts.c tree-ssa-dom.c 

Log message:
	PR middle-end/17278
	* opts.c (decode_options): Move flag_thread_jumps from -O1 and
	higher to -O2 and higher.  Likewise for tree PRE.
	* tree-ssa-dom.c (tree_ssa_dominator_optimize): Only iterate at -O2
	and better.

Patches:
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/ChangeLog.diff?cvsroot=gcc&r1=2.7306&r2=2.7307
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/opts.c.diff?cvsroot=gcc&r1=1.90&r2=1.91
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/tree-ssa-dom.c.diff?cvsroot=gcc&r1=2.88&r2=2.89
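
As a rough illustration of what this commit changes, the following Python sketch models which of the affected features are active at each optimization level before and after the patch. It is a toy model built only from the ChangeLog entry above: the function and key names are invented here and do not correspond to GCC's actual code in opts.c or tree-ssa-dom.c, and it assumes DOM iterated at every level from -O1 up before the patch.

```python
def affected_features(opt_level, patched=True):
    """Model the three settings touched by the PR 17278 commit.

    Per the ChangeLog entry, each feature moves from "-O1 and higher"
    to "-O2 and higher". Everything else about -O1 is left out.
    """
    threshold = 2 if patched else 1
    enabled = opt_level >= threshold
    return {
        "thread_jumps": enabled,  # flag_thread_jumps (opts.c)
        "tree_pre": enabled,      # tree PRE (opts.c)
        "dom_iterates": enabled,  # DOM iterating (tree-ssa-dom.c)
    }


for level in (0, 1, 2):
    before = affected_features(level, patched=False)
    after = affected_features(level, patched=True)
    changed = sorted(k for k in before if before[k] != after[k])
    print(f"-O{level}: changed by the patch: {changed or 'nothing'}")
```

In this model only -O1 is affected, which matches the intent of trading some -O1 optimization for compile time while leaving -O2 untouched.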

Comment 26 Steven Bosscher 2005-01-27 16:40:04 UTC
Partially fixed at least. 
 
Karel, new timings?  (This one will probably still be a bit slower, but 
hopefully we've gained a bit again...) 
Comment 27 Karel Gardas 2005-01-31 09:00:47 UTC
Subject: Re:  [4.0 Regression] 8% C++ compile-time
 regression in comparison with 3.4.1 at -O1 optimization level


Hello,

new timings are here: http://gcc.gnu.org/ml/gcc/2005-01/msg01714.html

Actually, typecode.cc went to a ~9% regression at -O1; please read that
report for more information on why.

Cheers,
Karel


Comment 28 Steven Bosscher 2005-02-01 00:22:02 UTC
I have no further ideas for speedups for this bug... 
Comment 29 Kazu Hirata 2005-03-01 15:08:25 UTC
Karel, could you retest the testcase with the gcc-4.0 branch?

Several speed-up patches went in after your last benchmark.

Thanks,
Comment 30 Karel Gardas 2005-03-02 20:05:49 UTC
Subject: Re:  [4.0/4.1 Regression] 8% C++ compile-time
 regression in comparison with 3.4.1 at -O1 optimization level


New results for 4.0.0 20050301 are posted here:
http://gcc.gnu.org/ml/gcc/2005-03/msg00132.html

Cheers,
Karel

Comment 31 Giovanni Bajo 2005-03-02 21:21:00 UTC
At this point, I think we could safely close this and related bugs. Karel 
could continue to periodically test GCC and report new regressions in new 
bugs. I don't think keeping these open brings us any benefit right now. Karel, 
do you agree?
Comment 32 Karel Gardas 2005-03-02 21:25:19 UTC
Subject: Re:  [4.0/4.1 Regression] 8% C++ compile-time
 regression in comparison with 3.4.1 at -O1 optimization level

I agree with Giovanni that both #17278 and #13776 are fixed from the point of
view of MICO compile-time regressions. If you would like to close them,
I'm also for it; just please be careful with #13776, which seems to
"accumulate" more stuff than only MICO-related regressions.

Thanks!

Karel

Comment 33 Andrew Pinski 2005-03-02 21:27:59 UTC
Fixed.
Comment 34 Giovanni Bajo 2005-03-02 21:28:42 UTC
OK let's close this as fixed then. Many thanks to the hard work of the whole 
GCC team!