Inliner heuristics updated

Jan Hubicka jh@suse.cz
Wed May 25 20:32:00 GMT 2005


> On 5/25/05, Jan Hubicka <jh@suse.cz> wrote:
> > > On 5/25/05, Jan Hubicka <jh@suse.cz> wrote:
> > > > Hi,
> > > > I am about to commit the patch in the attached form (with testcase using
> > > > Janis' new tree-prof scripts, thanks!).
> > > > I did some extra testing on SPEC, Gerald's testcase and tramp3d.  For SPEC
> > > > the new inliner always brings slightly better results at slightly
> > > > smaller compile time; the important differences are only in the
> > > > profile-driven runs I sent last time.
> > >
> > > Note that for testcases like tramp3d with loads of small functions being
> > > inlined anyway, profiling now is several orders of magnitude slower because
> > > even very small functions are instrumented.  This results in 90% of the
> > > generated assembly being long long increments of (redundant) counters.
> > >
> > > I know you are aware of this problem, I just want to remind you that a fix
> > > for this (running some inlining before instrumentation) is necessary before 4.1.
> > 
> > Well, I must admit that I don't consider it a must for 4.1 (adding one
> > counter per function call in the source file seems reasonable), but I do
> > have a patch for this (it basically makes the local inlining pass inline
> > all functions that are small enough to be inlined unconditionally) and I am
> > just going to resurrect it and re-benchmark.  Last time I tried it, it
> > helped your testcase and also reduced memory usage for your testcase as
> > well as Gerald's testcase, as we didn't perform that much inlining.
> > 
> > On the other hand it slowed down bootstrap a bit, as the inliner was run
> > twice over many functions and the inliner is still linear in the size of
> > the source function.  This might've changed, as the inliner is much cheaper
> > now than it used to be on tree-profiling previously (not sure in what
> > direction though), but the linearity in the size of the function being
> > inlined into is quite unnecessary.  If we had a way to point to the call
> > statement from the cgraph node and start inlining without walking the
> > instruction chain, we would be happy (the time complexity would depend on
> > the size of the code actually inlined).
> 
> Would it be possible to enable this first inlining pass only for the
> instrumented compile?  Or would this screw up the profile-using compile?
> Maybe we can condition it on both, -fprofile-generate and -fprofile-use.

We can enable it for -fprofile-generate/-fprofile-use only I guess.  It
would make it more difficult to track down differences between profiled
and unprofiled code but that is not too critical I guess...

Honza
> 
> Richard.
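
For illustration only, the cost argument in the quoted paragraph about
walking the instruction chain can be sketched as a toy model in C.  This is
not GCC code: struct stmt, struct cg_edge, the call_stmt field and all
function names below are hypothetical, invented for this sketch.  It only
contrasts locating a call by a linear walk over the caller's statements with
following a pointer stored on the call-graph edge.

```c
#include <assert.h>
#include <stddef.h>

/* Toy statement in a function body; statements form a singly linked chain.
   Hypothetical type, not GCC's representation. */
struct stmt {
    int is_call;        /* nonzero if this statement is a call */
    int callee_id;      /* toy callee identifier, meaningful if is_call */
    struct stmt *next;
};

/* Toy call-graph edge.  The call_stmt field is the assumed direct pointer
   from the call graph to the call statement. */
struct cg_edge {
    int callee_id;
    struct stmt *call_stmt;
};

/* Locate the call by walking the chain: the number of statements visited
   grows with the size of the function being inlined into. */
struct stmt *find_call_by_walk(struct stmt *body, int callee_id,
                               size_t *visited)
{
    *visited = 0;
    for (struct stmt *s = body; s != NULL; s = s->next) {
        ++*visited;
        if (s->is_call && s->callee_id == callee_id)
            return s;
    }
    return NULL;
}

/* Locate the call through the edge: one step, independent of body size. */
struct stmt *find_call_by_edge(const struct cg_edge *e, size_t *visited)
{
    assert(e != NULL && e->call_stmt != NULL);
    *visited = 1;
    return e->call_stmt;
}

/* Build a chain of n statements in caller-provided storage, with a single
   call to callee_id as the last statement (the worst case for the walk). */
struct stmt *build_body(struct stmt *storage, size_t n, int callee_id)
{
    assert(n > 0);
    for (size_t i = 0; i < n; i++) {
        storage[i].is_call = 0;
        storage[i].callee_id = -1;
        storage[i].next = (i + 1 < n) ? &storage[i + 1] : NULL;
    }
    storage[n - 1].is_call = 1;
    storage[n - 1].callee_id = callee_id;
    return &storage[0];
}
```

In this toy model, a 1000-statement body whose only call is the last
statement costs 1000 visits to find by walking and one visit through the
edge.  The remaining work would then be proportional to the code actually
inlined, which is the property the quoted paragraph asks for.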


