This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]

Re: Sigh. Inlining heuristics.

To: Linus Torvalds <torvalds at transmeta dot com>
Subject: Re: Sigh. Inlining heuristics.
From: Daniel Berlin <dan at cgsoftware dot com>
Date: Tue, 10 Jul 2001 01:47:04 -0400
Cc: dan at cgsoftware dot com, gcc at gcc dot gnu dot org
References: <200107100519.f6A5JOD11200@penguin.transmeta.com>

Linus Torvalds <torvalds@transmeta.com> writes:

> Why do you care about the relative size of the inliner vs the inlinee AT
> ALL?

Because for non-statics, we'll end up inlining huge functions into
small ones, up to 15k insns, making your small

int foo(int a, int b)
{
        bar(a, b);
}
into a 15000 insn function.

Which is bad.

I thought about the other option, inlining everything where the call
overhead is the main cost (which must be what carlo meant), until we
hit a certain factor of code growth, but this had problems that made
it more annoying to implement, and i wasn't up for it tonight.
> 
> If you have function (a) that calls function (b), it doesn't really
> _matter_ whether (a) is smaller than (b) or not.
Sure.

> 
> The only thing that matters is the transformation you do: turn a
> "call/ret" into a inlined copy of (b).  The size of function (a) never
> enters the picture. 
> 
> Therefore you should compare the size of (b) with the size of the call
> itself, not with the size of (a).  And notice that the "size of the
> call" is more than just the call instruction, it is
>  - the call instruction
>  - the return
>  - the saving/restoring of registers around the call
>  - the argument setup
>  - the argument loading in the function
> 
> If you're looking for a heuristic for the "size" for the call, assume
> something like (5+2*nr-of-arguments) instructions.  You can tweak it
> later. 
This won't work strictly as written, because inline calls in arguments as well.

but only *after* we've decided whether to inline a function or not.

And I can't easily get the number of statements in the argument setup,
i have to walk the entire tree and count the greatest depth
(DECL_NUM_STMTS only works on function decls. I thought i was golden
till I got the tree check for doing it on args_init)

I thought of this one already, and would love to use it, but i wanted
a start, and figured it would spur others to improve it.

> 
> So why not make the heuristic be:
>  - if the target function tree is less than N times the size of the
>    call tree, then inline it.

Add another constraint saying we don't want to grow the code by more
than a certain factor, and you've got the original idea i started
with, but couldn't easily implement.
You'll note that the half no one argued with, is the controlling of
the code growth. That part is easy.
> 
> where "N" would be some small integer (and we all know that all random
> small integers have the initial value "3", don't we?). 
> 
> And don't allow this to percolate.  Remember: the disadvantage of
> inlining is code bloat and bad icache, so it is _NOT_ valid to just
> recursively use the same heuristic, because that would cause exponential
> code growth (BAD!) with only linear performance improvement.  That's a
> bad trade-off. You allow _linear_ code growth only, up to the factor
> N.
Right. Which is one of the problems with what we have now.  We just
have the "any function can grow to 15k insns" rule.

> 
> So once you've used up a factor of 2 for a call-site that got inlined,
> you can't just recursively use another factor of N - you now only have a
> factor of "N-2" left.

This part was already handled. This is why the code growth
heuristic I proposed is one everyone seemed to be okay with, and is
based on the size of the original function, before we started
inlining, not the size of the function *as* we are inlining it.

> 
> So assuming a call cost of roughtly 10 instructions on average, you
> should hesitate to inline anything that grows to more than 30
> instructions in size. 
Which is why i found it growing a 300 insn function to 15k a bit
ridiculous (I also found the part where we had found 128 functions to
inline humorous, but not quite as much. 128 functions is C++ for
you. 15k insns is just plain dumb).
> 
> Unless the user _tells_ you to inline, of course.  The programmer may
> know that the function is only called once, and is worth inlining.  So
> if the user specifically asks for inlining, you might bump the "max
> instructions" up a bit - say from 30 to 100 instructions.
> 
> An inlining limit of 15,000 instructions is just completely
> ridiculous. 

> 
> 			Linus

-- 
"It's a good apartment because they allow pets.  I have a
Shetland pony named Nikkie.  Last summer Nikkie was involved in
a bizarre electrolysis accident.  All her hair was removed
except for her tail.  Now I rent her out to Hare Krishna family
picnics.
"-Steven Wright

References:
- Re: Sigh. Inlining heuristics.
  - From: Linus Torvalds

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]