This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: [PATCH] Drop callee function size limits for IPA inlining


> Hi,
> 
> On Fri, 18 Feb 2011, Jan Hubicka wrote:
> 
> > > > > Removing the size limits that prevent inlining will push unit 
> > > > > growth down, which in turn penalizes cases that really do need it.
> > > > 
> > > > Easy: scale growth limits with estimated usefulness of an edge.  
> > > > "this edge is totally boring, so inlining it must not grow the unit 
> > > > by even 1%, but that edge there, it's really really hot (or we have 
> > > > reason to believe that inlining it will do wonders and magic), so 
> > > > allow a unit growth of 100% for it."  The hard part obviously is 
> > > > estimating the usefulness, but we knew that from the start :)
> > 
> > When you derive the usefulness of a call from a local property, why 
> > relate it to a percentage of a global property, like unit size?
> 
> Because usefulness is not only a local property.  It becomes global by 
> comparing it to usefulness of other calls, at which point you're able to 
> say "this call is more useful to inline than that one".  Apart from that I 

This is when you decide to consistently mix badness, usefulness (decision
whether you want or do not want to inline) and unit growth together.

The SPEC2k6 results with Richi's patch arrived today.  On SPECint we don't see
much code size growth.  This is (as I saw earlier) because on really big units
we tend to find so many inline candidates (useful calls) that we eventually
always bail out on the unit growth limits.  This happens on all the big programs
I tested: Mozilla, GCC, etc.

On SPECfp there is code size growth, as the tests tend to be smaller (at least
smaller in absolute number of functions, due to their Fortran-ish organization).

We have c-ray with its relatively large raysphere (polyhedron's fatigue is the
same case, as analyzed in one of the PRs).  We declare the decision on whether
to inline it to be a global property.  We set growth to 30%, which is safely big
enough to get raysphere inlined in c-ray.

However, if you now make c-ray not a stupid benchmark but part of a big program,
then raysphere won't be inlined with LTO, since the other useful calls in the
big program, leading to smaller callees, will win.  Badness is more or less a
function of the callee size, with a few extra local hints based on the caller
side, so it is quite safe to assume that the fibheap queue is ordered by the
size of the function and that we cut off much earlier than the current default
of 40 instructions in size.
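
To illustrate the effect described above, here is a toy model, not GCC's actual
inliner.  It assumes that badness is exactly the callee size, that inlining one
edge grows the unit by exactly the callee size, and that a plain binary heap
stands in for the fibheap.  The function name greedy_inline, every callee name
other than raysphere, and all sizes are made up for the illustration.

```python
import heapq

def greedy_inline(edges, unit_size, growth_percent):
    """Return the names of the call edges a toy greedy inliner would inline.

    edges: list of (callee_name, callee_size) pairs, one per call edge.
    Assumptions (not GCC's real heuristics): badness == callee size, and
    inlining an edge grows the unit by exactly the callee size.
    """
    budget = unit_size * growth_percent // 100
    queue = [(size, name) for name, size in edges]  # smallest badness first
    heapq.heapify(queue)
    inlined, grown = [], 0
    while queue:
        size, name = heapq.heappop(queue)
        if grown + size > budget:
            break  # bail out on the unit growth limit
        grown += size
        inlined.append(name)
    return inlined

# c-ray alone: a small unit with a few call edges.
small = [("raysphere", 200), ("dot", 10), ("normalize", 20)]
inlined_small = greedy_inline(small, unit_size=1000, growth_percent=30)

# The same edges inside a big program with many small, "useful" callees.
big = small + [("helper%d" % i, 10) for i in range(5000)]
inlined_big = greedy_inline(big, unit_size=100000, growth_percent=30)
```

With these made-up numbers raysphere is inlined in the small unit, but in the
big unit the growth budget is used up by size-10 callees before the queue ever
reaches it, even though the relative growth limit is the same 30%.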

So it seems that we will end up with a "benchmark trick" that makes the small
program work, but we will also end up with cases where LTO makes the final
program smaller.

This observation was also one of the reasons for actually tuning down the auto
inlining insns for 4.5 and 4.6 compared to earlier releases.  Perhaps it was not
the coolest idea, but it seems to me that to make c-ray-like code fast not by
accident (i.e. so that it remains consistently fast even in a bigger
application) there is no shortcut except making the inliner realize that
inlining raysphere is a really good idea after all...

> don't see what's illogical about local properties influencing global ones.  
> For instance a very simple relation would be "for functions declared 
> inline we accept a 100% unit growth, for functions only auto-inlined but 
> not declared so, we only accept 10%".

Yes, it makes sense to do that.  I was considering this but it seemed better
to avoid another tunable parameter.

This won't help in the case of Fortran and c-ray.
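
The "very simple relation" quoted above could be sketched roughly as follows.
This is only an illustration under stated assumptions: the 100% and 10% figures
come from the quoted suggestion, while the function names and the way the limit
is checked are hypothetical and do not correspond to existing GCC code or
parameters.

```python
def growth_limit_percent(declared_inline):
    """Allowed unit growth for a callee, per the quoted suggestion."""
    # 100% for functions declared inline, 10% for auto-inlined ones.
    return 100 if declared_inline else 10

def may_inline(unit_size, grown_so_far, callee_size, declared_inline):
    """Hypothetical check: would inlining keep growth within the limit?"""
    limit = unit_size * growth_limit_percent(declared_inline) // 100
    return grown_so_far + callee_size <= limit
```

For a unit of size 1000 that has already grown by 50, a callee of size 100 would
be accepted if declared inline and rejected if only an auto-inline candidate.
This adds exactly the kind of extra tunable mentioned above.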

Honza
> 
> > But let's develop some infrastructure so we can experiment easily and 
> > then we will see.
> 
> Indeed.
> 
> 
> Ciao,
> Michael.

