This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: Inliner heuristics updated


> On 5/25/05, Jan Hubicka <jh@suse.cz> wrote:
> > Hi,
> > I am about to commit the patch in the attached form (with testcase using
> > Janis' new tree-prof scripts, thanks!).
> > I did some extra testing on SPEC, Gerald testcase and tramp3d.  For SPEC
> > the new inliner always brings slightly better results at slightly
> > smaller compile time, important differences are only in the profile
> > driven runs I sent last time.
> 
> Note that for testcases like tramp3d with loads of small functions being
> inlined anyway, profiling now is several orders of magnitude slower because
> even very small functions are instrumented.  This results in 90% of the
> generated assembly being long long increments of (redundant) counters.
> 
> I know you are aware of this problem, I just want to remind you that a fix
> for this (running some inlining before instrumentation) is necessary before 4.1.

Well, I must admit that I don't consider it a must for 4.1 (adding one
counter per function call in the source file seems reasonable), but I do
have a patch for this (it basically makes the local inlining pass inline
all functions that are small enough to be inlined unconditionally) and I
am just going to resurrect it and re-benchmark.  Last time I tried it, it
helped your testcase and also reduced memory usage for your testcase as
well as Gerald's testcase, as we didn't perform that much inlining.
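
As a rough illustration of why inlining small functions before
instrumentation helps, here is a toy model in C.  It is only a sketch,
not GCC code: the struct, the function name and the size limit
TOY_SMALL_LIMIT are all hypothetical, and it assumes exactly one counter
per call site that survives to instrumentation.

```c
#include <stddef.h>

/* Hypothetical call site: only the callee's size estimate matters here. */
struct toy_call {
    size_t callee_size;
};

/* Hypothetical limit below which a callee is inlined unconditionally. */
#define TOY_SMALL_LIMIT 10

/* Count the profile counters needed, assuming one counter per call site
   that still exists when instrumentation runs.  With early_inline set,
   calls to small callees are assumed to be gone by then. */
size_t toy_counters_needed(const struct toy_call *calls, size_t n,
                           int early_inline)
{
    size_t counters = 0;
    for (size_t i = 0; i < n; i++) {
        if (early_inline && calls[i].callee_size <= TOY_SMALL_LIMIT)
            continue;   /* call site was inlined before instrumentation */
        counters++;
    }
    return counters;
}
```

In this model, code dominated by calls to tiny functions (as in tramp3d)
loses most of its counters once early inlining is enabled.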

On the other hand, it slowed down bootstrap a bit, as the inliner was run
twice over many functions and the inliner is still linear in the size of
the source function.  This might have changed, as the inliner is much
cheaper now than it used to be on tree-profiling (not sure in what
direction, though), but the linearity in the size of the function being
inlined into is quite unnecessary.  If we had a way to point to the call
statement from the cgraph node and start inlining without walking the
instruction chain, we would be happy (the time complexity would depend
on the size of the code actually inlined).
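
The difference between the two lookups can be sketched in C as follows.
This is a toy model under stated assumptions, not the actual cgraph data
structures: struct toy_stmt, struct toy_edge and both functions are
hypothetical names made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Toy statement in a function body's singly linked statement chain. */
struct toy_stmt {
    struct toy_stmt *next;
    int is_call;        /* nonzero if this statement is a call */
    int callee_id;      /* hypothetical callee identifier */
};

/* Toy call graph edge; call_stmt is the hypothetical direct pointer. */
struct toy_edge {
    int callee_id;
    struct toy_stmt *call_stmt;
};

/* Walk the chain to find the call: cost grows with the size of the
   function being inlined into.  *steps receives the number of
   statements visited. */
struct toy_stmt *toy_find_call_by_walk(struct toy_stmt *body, int callee_id,
                                       size_t *steps)
{
    *steps = 0;
    for (struct toy_stmt *s = body; s != NULL; s = s->next) {
        ++*steps;
        if (s->is_call && s->callee_id == callee_id)
            return s;
    }
    return NULL;
}

/* Follow the stored pointer: cost does not depend on the caller's size. */
struct toy_stmt *toy_find_call_by_edge(const struct toy_edge *e)
{
    assert(e != NULL);
    return e->call_stmt;
}
```

The stored pointer is only useful as long as it stays valid when other
statements in the same block are changed, which is the remaining problem
mentioned below.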

I am going to modify cgraph nodes to have a pointer to the statement,
which gets us closer, but then we still hit the problem of not having
something like RBI that would be independent of other instruction
changes in the basic block.

Honza
> 
> Thanks,
> Richard.

