This is the mail archive of the libstdc++@gcc.gnu.org mailing list for the libstdc++ project.



Re: compile time regressions (was: merging for 3.4)


It looks like this didn't make it to the lists.


To: Gerald Pfeifer <pfeifer@dbai.tuwien.ac.at>
Cc: Eric Christopher <echristo@redhat.com>,  Neil Booth <neil@daikokuya.co.uk>,
	  Jan Hubicka <hubicka@ucw.cz>,  Joe Buck <jbuck@synopsys.com>,
	  Diego Novillo <dnovillo@redhat.com>,
	  Mark Mitchell <mark@codesourcery.com>,
	  Benjamin Kosnik <bkoz@redhat.com>,
	  Gabriel Dos Reis <gdr@integrable-solutions.net>,
	  "pcarlini@unitus.it" <pcarlini@unitus.it>,
	  "libstdc++@gcc.gnu.org" <libstdc++@gcc.gnu.org>, "" <gcc@gcc.gnu.org>
Subject: Re: compile time regressions (was: merging for 3.4)

Gerald Pfeifer <pfeifer@dbai.tuwien.ac.at> writes:

  > On Wed, 4 Dec 2002, Matt Austern wrote:
  > > Are there good test cases for the 3.1 -> 3.3 compile time
  > > regressions?  It would be interesting to study them and find out
  > > just what has gotten slower.
  > 
  > PR 3083 (yes, that old) basically still applies, but has been superseded
  > by PR 8361 (http://gcc.gnu.org/cgi-bin/gnatsweb.pl?cmd=view&pr=8361).
  > 
  > > If the numbers people have been tossing around are real then these are
  > > very serious regressions and we should consider slipping the schedule
  > > rather than releasing a compiler with those sorts of performance
  > > problems.
  > 
  > Here are some timings from that PR:
  >   gcc-3.0.4/bin/g++    45.45 user
  >   gcc-3.1.1/bin/g++    57.79 user  +27%
  >   gcc-3.2.1/bin/g++    59.30 user  +30%
  >   gcc-current/bin/g++  72.29 user  +59%
  > 
  > Devang asked me wrt. to number w/o optimization:

I have some more detailed data generated using -ftime-report for the
file in PR 8361 and also for expr.c and combine.c from GCC (chosen
because they are big).

The machine used for this is a Sun E250 with two 300MHz Ultra-II
processors and 1GB RAM, running solaris-2.7.
The compilers are gcc-3.0.4, gcc-3.1, a gcc from mainline CVS built
sometime at the end of May (I had it lying around), and mainline CVS as
of last Friday.

Here are the user times for the file in PR 8361, compiled with -O2 -ftime-report:

                           3.0.4           3.1 CVS-May     CVS
Execution times (seconds)
 garbage collection    :   21.58         38.19   41.28   44.92
 cfg construction      :                  2.52    1.51    2.60
 cfg cleanup           :                  5.54    4.12    6.33
 trivially dead code   :                          3.83    3.57
 life analysis         :                 10.39   12.99   10.50
 life info update      :                  2.28    3.01    6.07
 preprocessing         :    0.61          0.77    0.54    0.55
 lexical analysis      :    1.44          1.10    1.03    1.55
 parser                :   27.32         29.27   28.93   43.37
 expand                :   14.06         16.95   23.21   31.82
 varconst              :    0.54          0.64    0.64    0.60
 integration           :    4.05          4.17    4.72    9.83
 jump                  :    3.36          2.26    2.77    2.49
 CSE                   :   12.84         25.61   23.82   25.19
 global CSE            :    2.22          4.60    5.37    8.83
 loop analysis         :    2.16          2.56    2.55    2.48
 CSE 2                 :    7.44          8.87    7.87    8.78
 branch prediction     :                          2.97    5.43
 flow analysis         :    2.84          0.78    0.62    0.57
 combiner              :    3.12          4.32    4.43    4.84
 if-conversion         :    0.34          0.16    0.21    0.56
 regmove               :    0.79          0.92    1.06    1.07
 scheduling            :    4.71          6.61    8.76    9.14
 local alloc           :    2.13          3.00    3.30    3.19
 global alloc          :    2.83          5.59    6.23    7.51
 reload CSE regs       :    5.10          5.76    4.74    4.89
 flow 2                :    2.12          0.35    0.49    0.57
 if-conversion 2       :    0.09          0.07    0.08    0.14
 peephole 2            :                  0.80    0.76    0.55
 rename registers      :                  4.51    4.83   11.74
 scheduling 2          :    2.34          2.82    3.43    3.69
 delay branch sched    :    1.49          1.86    2.08    2.27
 reorder blocks        :    0.32          0.29    0.39    0.20
 shorten branches      :    0.18          0.27    0.25    0.30
 final                 :    1.37          1.21    1.24    1.11
 symout                :    0.03          0.03    0.03    0.03
 rest of compilation   :    2.27          2.51    2.99    3.74
 TOTAL                 :  129.70        197.62  217.14  271.08


It looks like there was a significant increase in garbage
collection time. It's strange that GC is so expensive for C++
compilation.
The conclusion of a similar discussion in the summer was that the
new inliner inlines more, hence the increased compilation time.
The parser seems to be getting slower and slower.
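
To put rough numbers on the growth described above, here is a minimal Python sketch that computes the percentage increase from 3.0.4 to current CVS for a few rows of the PR 8361 table. The figures are copied from that table; the names TIMES, percent_increase and growth_since_baseline are made up for this illustration.

```python
# User times in seconds, copied from the -O2 table for the PR 8361 file.
# Column order: 3.0.4, 3.1, CVS-May, CVS.
TIMES = {
    "garbage collection": (21.58, 38.19, 41.28, 44.92),
    "parser": (27.32, 29.27, 28.93, 43.37),
    "expand": (14.06, 16.95, 23.21, 31.82),
    "integration": (4.05, 4.17, 4.72, 9.83),
    "TOTAL": (129.70, 197.62, 217.14, 271.08),
}


def percent_increase(old, new):
    """Relative growth of `new` over `old`, in percent."""
    return (new - old) / old * 100.0


def growth_since_baseline(times):
    """Compare the first column (3.0.4) with the last column (CVS)."""
    return {name: percent_increase(row[0], row[-1]) for name, row in times.items()}


if __name__ == "__main__":
    for name, pct in growth_since_baseline(TIMES).items():
        print(f"{name:20s} {pct:+7.1f}%")
```

By this measure both garbage collection and the total user time roughly double between 3.0.4 and current CVS, while the parser grows by a bit under 60%.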

Why is "rename registers" so expensive? That "rename registers" time
only includes -fcprop-registers (register renaming is not done for
-O2).
"Branch prediction" seems expensive too.


Timings for expr.i (from mainline expr.c)

                            3.0.4      3.1  CVS-May      CVS
Execution times (seconds)
 garbage collection    :     1.51     5.78     6.10     6.58  !!!
 cfg construction      :              1.14     0.48     1.18
 cfg cleanup           :             58.96    14.91    21.55  !!!
 trivially dead code   :                       3.04     3.08
 life analysis         :             10.57    11.52     9.24
 life info update      :              1.57     2.13     8.05
 preprocessing         :     0.39     0.41     0.44     0.43
 lexical analysis      :     0.41     0.39     0.26     0.27
 parser                :     2.67     1.72     1.73     1.91
 expand                :              1.58     1.68     1.82
 varconst              :     0.00              0.01     0.00
 integration           :              0.27     0.24     0.26
 jump                  :     5.78     2.84     2.50     2.67
 CSE                   :     3.93    12.22     9.68     9.67
 global CSE            :     5.82     5.48     6.01     9.58
 loop analysis         :     0.99     1.41     1.20     1.23
 CSE 2                 :     3.26     5.98     4.59     4.66
 branch prediction     :                       2.04     3.62
 flow analysis         :     2.15     0.66     0.38     0.42
 combiner              :     2.63     4.87     4.89     5.08
 if-conversion         :     0.31     0.23     0.20     0.46
 regmove               :     0.57     0.75     0.71     0.69
 scheduling            :     6.26    10.10     8.83    13.24
 local alloc           :     1.79     2.61     2.43     2.23
 global alloc          :     1.98     4.36     4.56     4.80
 reload CSE regs       :     4.48     5.64     4.63     4.58
 flow 2                :     2.01     0.35     0.39     0.56
 if-conversion 2       :     0.05     0.08     0.05     0.17
 peephole 2            :              0.51     0.52     0.49
 rename registers      :              3.85     4.07     7.92
 scheduling 2          :     2.40     1.88     2.08     7.00
 delay branch sched    :     7.97     2.94    14.43    13.67  !!!
 reorder blocks        :     0.22     0.13     0.26     0.33
 shorten branches      :     0.08     0.16     0.19     0.26
 final                 :     0.81     0.74     0.75     0.74
 symout                :     0.01     0.01     0.00     0.02
 rest of compilation   :     0.95     1.37     1.27     1.68
 TOTAL                 :    59.44   151.58   119.25   150.20

GC is slowing down here too... 
Also delay branch sched. Did anything major change in that pass? 
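
One way to see how much the flagged passes weigh in the expr.i numbers is to express each as a share of the TOTAL row. The sketch below does that for three rows copied from the table; the names EXPR_I, EXPR_I_TOTAL and share_of_total are made up for this illustration, and None stands for a column where the table has no entry.

```python
# User times in seconds for expr.i, copied from the table above.
# Column order: 3.0.4, 3.1, CVS-May, CVS. None marks an empty table cell.
EXPR_I = {
    "garbage collection": (1.51, 5.78, 6.10, 6.58),
    "cfg cleanup": (None, 58.96, 14.91, 21.55),
    "delay branch sched": (7.97, 2.94, 14.43, 13.67),
}
EXPR_I_TOTAL = (59.44, 151.58, 119.25, 150.20)


def share_of_total(row, totals):
    """Percentage of the total user time spent in one pass, per compiler."""
    return [
        None if seconds is None else 100.0 * seconds / total
        for seconds, total in zip(row, totals)
    ]


if __name__ == "__main__":
    for name, row in EXPR_I.items():
        cells = [
            "     -" if pct is None else f"{pct:5.1f}%"
            for pct in share_of_total(row, EXPR_I_TOTAL)
        ]
        print(f"{name:20s}", *cells)
```

On these figures cfg cleanup alone accounts for well over a third of the gcc-3.1 total for expr.i, which is most of the difference between the 3.1 and CVS-May totals.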


Timings for combine.i (from mainline combine.c)

                            3.0.4      3.1  CVS-May      CVS
Execution times (seconds)
 garbage collection    :     0.26     3.12     3.09     3.42  !!!
 cfg construction      :              0.66     0.36     0.55
 cfg cleanup           :              4.78     2.82     4.38
 trivially dead code   :                       1.30     1.57
 life analysis         :              4.89     5.25     4.06
 life info update      :              1.29     1.19     2.95  !!!
 preprocessing         :     0.26     0.45     0.25     0.32
 lexical analysis      :     0.41     0.38     0.22     0.24
 parser                :     1.75     1.02     1.01     1.12
 expand                :              1.20     1.20     1.35
 varconst              :     0.00              0.01     0.00
 integration           :              0.10     0.18     0.15
 jump                  :     1.49     0.75     0.57     0.73
 CSE                   :     2.68     6.91     5.39     5.18
 global CSE            :     1.09     1.43     1.51     2.51
 loop analysis         :     0.52     0.75     0.64     0.58
 CSE 2                 :     2.29     3.21     2.44     2.63
 branch prediction     :                       1.57     2.30  !!!
 flow analysis         :     1.08     0.41     0.19     0.19
 combiner              :     1.81     2.70     2.70     2.81
 if-conversion         :     0.21     0.06     0.07     0.26
 regmove               :     0.41     0.37     0.52     0.39
 scheduling            :     2.19     2.61     2.48     2.61
 local alloc           :     0.92     1.28     1.31     1.23
 global alloc          :     1.08     2.39     2.33     2.58
 reload CSE regs       :     2.05     2.27     1.73     1.83
 flow 2                :     0.75     0.20     0.23     0.29
 if-conversion 2       :     0.02     0.02     0.03     0.07
 peephole 2            :              0.31     0.40     0.23
 rename registers      :              2.31     2.22     4.77  !!!
 scheduling 2          :     1.06     1.06     1.23     1.49
 delay branch sched    :     1.07     1.19     1.30     1.27
 reorder blocks        :     0.10     0.04     0.08     0.09
 shorten branches      :     0.10     0.09     0.07     0.14
 final                 :     0.44     0.40     0.47     0.41
 symout                :     0.00              0.01     0.00
 rest of compilation   :     0.52     0.69     0.60     0.85
 TOTAL                 :    24.57    49.38    47.00    55.61

GC and rename registers again. 
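
For anyone who wants to pick out such passes mechanically rather than by eye, here is a small Python sketch that flags passes whose time grew sharply between gcc-3.1 and current CVS. The thresholds (at least 1.5x slower and at least one second in CVS) are arbitrary assumptions chosen for this example, not the criterion behind the "!!!" marks above, and the names COMBINE_I and flag_regressions are made up. Only a handful of combine.i rows are included.

```python
# (3.1, CVS) user times in seconds for a few combine.i passes,
# copied from the table above.
COMBINE_I = {
    "garbage collection": (3.12, 3.42),
    "life info update": (1.29, 2.95),
    "CSE": (6.91, 5.18),
    "global CSE": (1.43, 2.51),
    "rename registers": (2.31, 4.77),
    "if-conversion 2": (0.02, 0.07),
}


def flag_regressions(times, min_ratio=1.5, min_seconds=1.0):
    """Return (pass, ratio) pairs that slowed down by at least min_ratio.

    Passes that take less than min_seconds in the newer compiler are
    skipped so that tiny timings do not dominate the list.
    """
    flagged = []
    for name, (old, new) in times.items():
        if old > 0 and new >= min_seconds and new / old >= min_ratio:
            flagged.append((name, new / old))
    # Largest slowdown first.
    return sorted(flagged, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for name, ratio in flag_regressions(COMBINE_I):
        print(f"{name:20s} {ratio:4.2f}x")
```

With these assumed thresholds the sketch picks out life info update, rename registers and global CSE. Garbage collection is not flagged because its large jump happened between 3.0.4 and 3.1, before the two columns compared here.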


Hope this helps. If there's a desire I could try producing the same
stats for other testcases. 

