This is the mail archive of the
mailing list for the GCC project.
Re: Inliner heuristics updated
> On 5/26/05, Jan Hubicka <email@example.com> wrote:
> > > Yes. In principle two stages of inlining could even make a better
> > > code-size estimate if we're doing some simple optimization between
> > > them. So adding an extra switch for this early-inlining-of-small-functions
> > > which would be automatically enabled for profile-based optimization
> > > could be useful in general, especially at -O3 to get better size estimates.
> > > Definitely I'd want to do some experiments if you can dig out the code
> > > to do multiple inlining passes.
> > Hi,
> > I am attaching the patch (not 100% polished, a few testsuite failures).
> > Quite amusingly, the performance characteristics have changed since the
> > last time I tested it. Now it seems to be performance-neutral on GCC
> > components (the slowdown was actually caused by a forgotten
> > cgraph_verify_node call), it improves compile times by 12% on Gerald's
> > testcase, but it regresses compile time on tramp3d by 30%. On the
> > other hand it also makes tramp3d 20% faster for me for some reason, so
> > this tradeoff looks sort of reasonable.
> > With -ftree-based-profiling we now need 29s per iteration instead of 6s
> > without, so the slowdown is still quite noticeable, and additionally I
> > noticed that compile times increase from 3m to 10m. This is caused by
> > the fact that we use code size estimates computed before inlining to
> > drive the inliner; I will fix that.
> You mean not re-computing the size estimate, but only using the update
> from the simple formula, or not updating at all? Anyway, this is good news!
Well, the problem is that we compute the size estimates before
instrumenting, so we may end up with considerably more code to inline
than expected...
I need to move the computation into the IPA pass itself now, as is
already done on tree-profiling (but with early optimization and inlining
things get even more twisty, as we probably want to do early inlining
first, profile next, optimize later and re-inline. This effectively means
keeping the cgraph up-to-date across optimization, which is going to be
somewhat more fun; I guess we need to simply rebuild the cgraph node
afterwards :( )
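
To illustrate the stale-estimate problem, here is a toy sketch in Python.
It is not GCC code: the names (Func, INLINE_LIMIT, PROFILE_OVERHEAD,
estimate_sizes, instrument, run) and all the numbers are made up for
illustration, and the "inliner" is just a size threshold. It only shows
why estimates taken before instrumentation can make the inliner accept
bodies that are much bigger than it thinks, and how recomputing them
inside the pass changes the decision.

```python
from dataclasses import dataclass

# All constants below are hypothetical, chosen only to make the effect visible.
INLINE_LIMIT = 10       # callees with an estimate above this are not inlined
PROFILE_OVERHEAD = 4    # extra "instructions" added per basic block by instrumentation


@dataclass
class Func:
    name: str
    insns: int          # current body size in toy "instructions"
    blocks: int         # number of basic blocks
    estimate: int = 0   # cached size estimate consulted by the toy inliner


def estimate_sizes(funcs):
    """Cache the current body size of every function as its estimate."""
    for f in funcs:
        f.estimate = f.insns


def instrument(funcs):
    """Model profile instrumentation: every basic block gets counter code."""
    for f in funcs:
        f.insns += f.blocks * PROFILE_OVERHEAD


def run(recompute_in_pass):
    """Return (names chosen for inlining, real size of the inlined bodies)."""
    funcs = [Func("get", 3, 1), Func("clamp", 8, 3), Func("parse", 40, 12)]
    estimate_sizes(funcs)       # estimates taken before instrumentation
    instrument(funcs)           # bodies grow, cached estimates do not
    if recompute_in_pass:
        estimate_sizes(funcs)   # refresh the estimates inside the inlining pass
    chosen = [f.name for f in funcs if f.estimate <= INLINE_LIMIT]
    growth = sum(f.insns for f in funcs if f.name in chosen)
    return chosen, growth


if __name__ == "__main__":
    print("stale estimates:", run(recompute_in_pass=False))
    print("fresh estimates:", run(recompute_in_pass=True))
```

With the stale estimates the toy inliner still accepts "clamp", although
its instrumented body is well over the limit; with refreshed estimates
only "get" qualifies.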