This is the mail archive of the mailing list for the GCC project.


Re: [PATCH] New c++ inliner

In article <> you write:

> here's a patch which changes the C++ AST inliner. There are a number of
> changes, and a number of pieces of future work indicated. However, the
> high lights are
> a) no horrible compile time & memory degradation.
> b) produced object code is faster and smaller [...]

Hi Nathan,

Here are my early experiences with your patch (I added it to an
already bootstrapped tree, quickstrapped it and re-installed the
compiler only).  I have a non-public C++ code base that solves a class
of symbolic math problems known as crypto addition (e.g. ``the + earth
+ venus + saturn + uranus = neptune'', find the value of each digit
symbol).  The code was written to be a straightforward CPU-bound use
of STL.  It was never explicitly tuned for any particular

Table Notes: code generator was for i386, actual machine is i686.
-static was used in all cases.  size(1) was used to obtain binary size
of the object file before linking.  Process time is reported in
seconds with built-in time command (u is user time, s is system time).

compiler  options                   compile time  binary size  execution time
2.95.2                              2.7u+0.3s     25283        64.2u
          -O                        3.8u+0.3s     15867        13.8u
          -O2                       5.3u+0.3s     15637        13.4u
          -O3                       5.3u+0.3s     15384        12.8u
3.0                                 8.0u+0.4s     39627        59.7u
          -O                        10.4u+0.4s    23803        11.1u
          -O2                       13.0u+0.3s    23239        10.5u
          -O3                       33.7u+0.8s    64811        10.4u
          -O3 -finline-limit=64000  82.6u+4.4s    111051       9.1u
mainline                            8.9u+0.4s     38735        58.5u
          -O                        11.7u+0.3s    25035        10.8u
          -O2                       15.0u+0.4s    25131        10.3u
          -O3                       44.9u+0.9s    70927        10.0u

IMHO, this is a nice example since both 3.0 and mainline already clearly
beat 2.95.2 in terms of execution time with default parameters (the STL
implementation did change but not radically).  BTW, from past detailed
study, I can confirm that the compile time and binary size bloat is
mainly due to the libstdc++-v2 to libstdc++-v3 transition.  Now, we
turn to recompiling with your proposed patch:

mainline+optimize4.patch            9.0u+0.4s     38735        58.6u
          -O                        9.6u+0.4s     19507        14.2u
          -O2                       12.3u+0.4s    19847        13.6u
          -O3                       12.8u+0.4s    20019        13.5u

I wondered what I would have to do for this example to get back the
execution time seen above for unpatched mainline at -O3 (10.0u).

-O3 --param max-inline-ast=100      12.5u+0.5s    18759        12.5u
-O3 --param max-inline-ast=200      15.5u+0.4s    23843        11.0u
-O3 --param max-inline-ast=400      20.2u+0.4s    33519        10.1u
-O3 --param max-inline-ast=800      21.4u+0.6s    35007        9.4u
-O3 --param max-inline-ast=1600     42.9u+0.7s    66103        8.7u
-O3 --param max-inline-ast=3200     50.0u+1.0s    74723        8.5u
-O3 --param max-inline-ast=6400     76.7u+1.2s    89727        7.9u
-O3 --param max-inline-ast=12800    102.8u+3.5s   107203       8.0u

(Wow.  Even by cranking -finline-limit up, I never saw this code
 perform this well, but maybe I never cranked it high enough.)

I consider that last move to be "negative improvement according to all
metrics", thus I stopped.  I did not attempt to tune with the other new
parameter since this one appeared to be the "big hammer". ;-)

But, then, consider that performance of the code compiled with -O2 was
fairly good before your patch.  How do I get that back?

-O2 --param max-inline-ast=800      15.7u+0.4s    24639        10.6u
-O2 --param max-inline-ast=1600     15.6u+0.5s    24571        10.4u
[higher values of max-inline-ast not seen to hurt compile time or help
 executable run time.]

I do not know that my use of STL in this example is representative of
all other uses, but it might be nice if max-inline-ast could be set by
default to a value that covers the nominal STL cases.  I think you
have max-inline-ast set too low by default.  I look at it this way:
with max-inline-ast={200, 400, 800}, the optimizing and inlining
compiler at -O3 is still 2 to 3 times faster than with the old
algorithm (for this one example; more data is obviously needed, but I
did study results from Gerald's code).
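
The "2 to 3 times faster" figure can be checked against the tables with
a few lines of arithmetic.  The sketch below assumes the comparison is
user compile time at -O3, patched compiler versus unpatched mainline
(44.9u); that reading is an assumption, and the names Sample,
compile_speedup and o3_speedups are made up for the illustration.

```cpp
#include <cassert>
#include <vector>

// One row of the patched -O3 table: the --param value and the user
// compile time in seconds.
struct Sample {
    int max_inline_ast;
    double compile_user_seconds;
};

// Ratio of baseline to patched compile time; a value above 1 means the
// patched compiler is faster.
double compile_speedup(double baseline_seconds, double patched_seconds) {
    assert(patched_seconds > 0.0);
    return baseline_seconds / patched_seconds;
}

// Speedups for max-inline-ast = 200, 400, 800, using figures copied from
// the tables in this message.  Assumption: the baseline is unpatched
// mainline at -O3.
std::vector<double> o3_speedups() {
    const double mainline_o3_seconds = 44.9;
    const std::vector<Sample> patched = {{200, 15.5}, {400, 20.2}, {800, 21.4}};
    std::vector<double> speedups;
    for (const Sample& s : patched)
        speedups.push_back(
            compile_speedup(mainline_o3_seconds, s.compile_user_seconds));
    return speedups;
}
```

Under that assumption the ratios come out to roughly 2.9, 2.2 and 2.1.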

As I told you in earlier private e-mail, thank you for being so
complete in describing how you propose to change the algorithm.  As
usual, great work Nathan.

