This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.
Re: PR 10196 / Re: Inliner parameters
- From: Steven Bosscher <s dot bosscher at student dot tudelft dot nl>
- To: Richard Guenther <rguenth at tat dot physik dot uni-tuebingen dot de>
- Cc: gcc at gcc dot gnu dot org, Mark Mitchell <mark at codesourcery dot com>
- Date: 23 Apr 2003 13:54:43 +0200
- Subject: Re: PR 10196 / Re: Inliner parameters
- References: <Pine.LNX.4.44.0304231303170.18094-100000@bellatrix.tat.physik.uni-tuebingen.de>
On Wed, 23-04-2003, at 13:36, Richard Guenther wrote:
> On Thu, 17 Apr 2003, Richard Guenther wrote:
>
> > On 17 Apr 2003, Steven Bosscher wrote:
> >
> > > On Wed, 16-04-2003, at 23:17, Richard Guenther wrote:
> > > > > When was the last time somebody tried to tune the parameters a bit? Did
> > > > > anyone try the effects of different parameter settings for, say, SPEC
> > > > > and POOMA (and, ideally, on more than one platform)?
> > > >
> > > > I tried various parameters for POOMA to tune the performance of the
> > > > optimized code and the key parameter to change was min-inline-insns.
> > > > This is _way_ too low for POOMA to collapse the expression template
> > > > trees. I need to bump this up to 250 to get good performance. The
> > > > max-inline-insns-single can be dropped to 250 without loss then.
> > >
> > > What happened to the compile times with bigger min-inline-insns?
> >
> > Here are compile time and runtime numbers for my performance testcase using
> > g++-3.3 (GCC) 3.3 20030414 (prerelease) with options -O2 -march=athlon
> > -fomit-frame-pointer -funroll-loops -fno-exceptions --param min-inline-insns=X.
> > Lower numbers for the perf. indicator are better.
> >
> > X        compile-time  performance indicator
> > default  49.50         1.99804e-06
> > 50       50.25         2.26817e-06
> > 100      50.00         1.96918e-06
> > 150      51.00         1.90269e-06
> > 200      58.25         1.83045e-06
> > 250      61.25         1.28309e-06
> > 300      62.75         1.29364e-06
> > default + -Dinline="__inline__ __attribute__((always_inline))"
> >          50.50         1.31171e-06
> > (while the source is not optimized for inline->always_inline
> > transformation)
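
To make the trade-off in the table above easier to read, here is a small sketch that expresses each row relative to the "default" row. The numbers are copied from the table; the function name `relative_to_default` and the dictionary layout are only illustrative choices, not part of any GCC tooling.

```python
# Figures copied from the table above: X -> (compile time, performance indicator).
# Lower is better for both columns.
results = {
    "default": (49.50, 1.99804e-06),
    "50": (50.25, 2.26817e-06),
    "100": (50.00, 1.96918e-06),
    "150": (51.00, 1.90269e-06),
    "200": (58.25, 1.83045e-06),
    "250": (61.25, 1.28309e-06),
    "300": (62.75, 1.29364e-06),
}


def relative_to_default(table):
    """Return {X: (compile-time ratio, indicator ratio)} relative to "default"."""
    base_compile, base_perf = table["default"]
    return {
        x: (compile_time / base_compile, perf / base_perf)
        for x, (compile_time, perf) in table.items()
    }


ratios = relative_to_default(results)
for x, (compile_ratio, perf_ratio) in ratios.items():
    print(f"{x:>7}: compile x{compile_ratio:.2f}, indicator x{perf_ratio:.2f}")
```

With these inputs, min-inline-insns=250 costs roughly 24% more compile time and brings the performance indicator down to about 64% of its default value.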
> >
> > just to show what happens with EH on, for the best param above (250)
> > we get
> >
> > 250 [again, goes into swap... - till then, 3min elapsed]
> > killed it - going to a machine with 2GB ram and more GHz where we can't
> > compare the compile time numbers from above, of course...
> > 250 448.00 [uses 750MB of ram] 5.96641e-07
> > which is more than an order of magnitude worse than without EH on a
> > faster CPU with faster mem... ugh!
>
> With g++-3.3 (GCC) 3.3 20030423 (prerelease) I now get
>
> 250 79.75 [154MB] 1.2667e-06
>
> which is not only a lot better in compile time and in memory usage, but
> also on-par in performance with the -fno-exceptions case.
>
> Just to repeat the -fno-exceptions case, with the new gcc I get
>
> 250 63.75 [153MB] 1.26749e-06
>
> so compile time is still worse for exceptions turned on, but that is to
> be expected anyway. Just for the curious, here are the g++-3.2 (GCC) 3.2.3
> 20030414 (prerelease) numbers:
>
> with exceptions (default inlining params)
>
> default 63.00 [170MB] 1.96574e-06
>
> without exceptions (default inlining params)
>
> default 54.00 [161MB] 1.97593e-06
>
>
> The testcase for PR10196 now shows:
>
> g++-3.3 -fno-exceptions: 11.75s
> g++-3.3 -fexceptions: 14.75s
> g++-3.2 -fno-exceptions: 9.50s
> g++-3.2 -fexceptions: 11.50s
>
> which is _a lot_ better, but still a 19% regression for -fno-exceptions
> and a 22% regression for -fexceptions. But as these numbers are below
> 30%, we can now downgrade the priority of the PR?
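
A regression percentage depends on which timing is taken as the denominator, so here is a short sketch that computes the slowdown both ways from the PR 10196 timings quoted above. The helper names are only illustrative.

```python
# Timings in seconds, copied from the PR 10196 testcase figures above.
timings = {
    "-fno-exceptions": {"3.2": 9.50, "3.3": 11.75},
    "-fexceptions": {"3.2": 11.50, "3.3": 14.75},
}


def slowdown_vs_old(old, new):
    """Extra time as a percentage of the old (baseline) time."""
    return (new - old) / old * 100.0


def slowdown_vs_new(old, new):
    """Extra time as a percentage of the new time."""
    return (new - old) / new * 100.0


for flag, t in timings.items():
    print(f"{flag}: {slowdown_vs_old(t['3.2'], t['3.3']):.1f}% of the 3.2 time, "
          f"{slowdown_vs_new(t['3.2'], t['3.3']):.1f}% of the 3.3 time")
```

The quoted 19% and 22% match the second form (difference divided by the 3.3 time). Measured against the 3.2 baseline, the same timings give about 24% and 28%, which is still below 30%.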
Part of that 30% can probably be explained by PR 8361, but inlining
is still slower, and there should be a PR for that, I think.
So I propose we close PR 10316, and we either close 10196 and open a new
PR for the inliner slowdown, or we leave 10196 open with a remark in the
audit trail. Does that sound OK to you?
Greetz
Steven