This is the mail archive of the mailing list for the GCC project.


Re: Inliner heuristics updated

On 5/25/05, Jan Hubicka <> wrote:
> > On 5/25/05, Jan Hubicka <> wrote:
> > > Hi,
> > > I am about to commit the patch in the attached form (with testcase using
> > > Janis' new tree-prof scripts, thanks!).
> > > I did some extra testing on SPEC, Gerald's testcase and tramp3d.  For SPEC
> > > the new inliner always brings slightly better results at slightly
> > > smaller compile time; the only important differences are in the
> > > profile-driven runs I sent last time.
> >
> > Note that for testcases like tramp3d with loads of small functions being
> > inlined anyway, profiling now is several orders of magnitude slower because
> > even very small functions are instrumented.  This results in 90% of the
> > generated assembly being long long increments of (redundant) counters.
> >
> > I know you are aware of this problem, I just want to remind you that a fix
> > for this (running some inlining before instrumentation) is necessary before 4.1.
> Well, I must admit that I don't consider it a must for 4.1 (adding one
> counter per function call in the source file seems reasonable), but I do
> have a patch for this (it basically makes the local inlining pass inline
> all functions that are small enough to be inlined unconditionally) and I
> am just going to resurrect it and re-benchmark.  Last time I tried it, it
> helped your testcase and also reduced memory usage for your testcase as
> well as Gerald's testcase, as we didn't perform that much inlining.
> On the other hand it slowed down bootstrap a bit, as the inliner was run
> twice over many functions and the inliner is still linear in the size of
> the source function.  This might've changed as the inliner is much cheaper
> now than it used to be on tree-profiling previously (not sure in what
> direction though), but the linearity in the size of the function being
> inlined into is quite unnecessary.  If we had a way to point to the call
> statement from the cgraph node and start inlining without walking the
> instruction chain, we would be happy (the time complexity would depend on
> the size of the code actually inlined).
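
To make the complexity point above concrete, here is a toy sketch in C.  It
is not GCC code: `struct stmt`, `struct call_edge`, `link_chain`,
`find_call_by_walk` and `find_call_by_edge` are all hypothetical names made
up for illustration.  It only shows that locating a call by walking the
statement chain costs time proportional to the size of the function being
inlined into, while an edge that points directly at the call statement
finds it in one step.

```c
#include <assert.h>
#include <stddef.h>

/* Toy statement in a function body (hypothetical, not a GCC type). */
struct stmt {
    struct stmt *next;
    int callee_id;            /* >= 0 for a call statement, -1 otherwise */
};

/* Toy call-graph edge that remembers where its call statement lives. */
struct call_edge {
    int callee_id;
    struct stmt *call_stmt;   /* direct pointer, no walk needed */
};

/* Link stmts[0..n-1] into a chain containing no calls; returns the head. */
struct stmt *link_chain(struct stmt *stmts, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        stmts[i].next = (i + 1 < n) ? &stmts[i + 1] : NULL;
        stmts[i].callee_id = -1;
    }
    return n ? &stmts[0] : NULL;
}

/* Find the call by walking the chain; *visited counts statements seen. */
struct stmt *find_call_by_walk(struct stmt *head, int callee_id,
                               size_t *visited)
{
    *visited = 0;
    for (struct stmt *s = head; s != NULL; s = s->next) {
        (*visited)++;
        if (s->callee_id == callee_id)
            return s;
    }
    return NULL;
}

/* Find the call through the edge; always a single step. */
struct stmt *find_call_by_edge(const struct call_edge *e, size_t *visited)
{
    *visited = 1;
    return e->call_stmt;
}

/* Self-check: the call is the last of 1000 statements. */
void demo(void)
{
    static struct stmt body[1000];
    struct stmt *head = link_chain(body, 1000);
    body[999].callee_id = 7;
    struct call_edge e = { 7, &body[999] };
    size_t walk_steps, edge_steps;

    assert(find_call_by_walk(head, 7, &walk_steps) == &body[999]);
    assert(find_call_by_edge(&e, &edge_steps) == &body[999]);
    assert(walk_steps == 1000 && edge_steps == 1);
}
```

With the direct pointer, the remaining cost would be that of copying the
callee body, which is the "size of code actually inlined" mentioned above.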

Would it be possible to enable this first inlining pass only for the
instrumented compile?  Or would this break the profile-using compile?
Maybe we can condition it on both -fprofile-generate and -fprofile-use.
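
A minimal sketch of that condition, in C.  The names are hypothetical
(`struct profile_opts` and `run_early_inlining` are not GCC internals), and
it assumes the reason for covering both flags is that the profile-using
compile should see the same function bodies as the instrumented one, so
that the recorded counters still line up.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical option record; these are not GCC's internal variables. */
struct profile_opts {
    bool profile_generate;   /* compiling with -fprofile-generate */
    bool profile_use;        /* compiling with -fprofile-use */
};

/* Run the early inlining pass in either profiling mode, so both compiles
   transform functions the same way; skip it for ordinary compiles. */
bool run_early_inlining(const struct profile_opts *o)
{
    return o->profile_generate || o->profile_use;
}

/* Self-check of the three interesting cases. */
void check_early_inlining(void)
{
    struct profile_opts plain = { false, false };
    struct profile_opts gen   = { true,  false };
    struct profile_opts use   = { false, true  };

    assert(!run_early_inlining(&plain));
    assert(run_early_inlining(&gen));
    assert(run_early_inlining(&use));
}
```

Enabling it only for the instrumented compile would be the
`profile_generate`-only variant of this predicate; whether that is safe is
exactly the open question.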

