This is the mail archive of the
gcc@gcc.gnu.org
mailing list for the GCC project.
Re: Sigh. Inlining heuristics.
- To: Linus Torvalds <torvalds at transmeta dot com>
- Subject: Re: Sigh. Inlining heuristics.
- From: Daniel Berlin <dan at cgsoftware dot com>
- Date: Tue, 10 Jul 2001 01:47:04 -0400
- Cc: dan at cgsoftware dot com, gcc at gcc dot gnu dot org
- References: <200107100519.f6A5JOD11200@penguin.transmeta.com>
Linus Torvalds <torvalds@transmeta.com> writes:
> Why do you care about the relative size of the inliner vs the inlinee AT
> ALL?
Because for non-statics, we'll end up inlining huge functions into
small ones, up to 15k insns, making your small
int foo(int a, int b)
{
bar(a, b);
}
into a 15000 insn function.
Which is bad.
I thought about the other option, inlining everything where the call
overhead is the main cost (which must be what carlo meant), until we
hit a certain factor of code growth, but this had problems that made
it more annoying to implement, and i wasn't up for it tonight.
>
> If you have function (a) that calls function (b), it doesn't really
> _matter_ whether (a) is smaller than (b) or not.
Sure.
>
> The only thing that matters is the transformation you do: turn a
> "call/ret" into a inlined copy of (b). The size of function (a) never
> enters the picture.
>
> Therefore you should compare the size of (b) with the size of the call
> itself, not with the size of (a). And notice that the "size of the
> call" is more than just the call instruction, it is
> - the call instruction
> - the return
> - the saving/restoring of registers around the call
> - the argument setup
> - the argument loading in the function
>
> If you're looking for a heuristic for the "size" for the call, assume
> something like (5+2*nr-of-arguments) instructions. You can tweak it
> later.
This won't work strictly as written, because inline calls in arguments as well.
but only *after* we've decided whether to inline a function or not.
And I can't easily get the number of statements in the argument setup,
i have to walk the entire tree and count the greatest depth
(DECL_NUM_STMTS only works on function decls. I thought i was golden
till I got the tree check for doing it on args_init)
I thought of this one already, and would love to use it, but i wanted
a start, and figured it would spur others to improve it.
>
> So why not make the heuristic be:
> - if the target function tree is less than N times the size of the
> call tree, then inline it.
Add another constraint saying we don't want to grow the code by more
than a certain factor, and you've got the original idea i started
with, but couldn't easily implement.
You'll note that the half no one argued with, is the controlling of
the code growth. That part is easy.
>
> where "N" would be some small integer (and we all know that all random
> small integers have the initial value "3", don't we?).
>
> And don't allow this to percolate. Remember: the disadvantage of
> inlining is code bloat and bad icache, so it is _NOT_ valid to just
> recursively use the same heuristic, because that would cause exponential
> code growth (BAD!) with only linear performance improvement. That's a
> bad trade-off. You allow _linear_ code growth only, up to the factor
> N.
Right. Which is one of the problems with what we have now. We just
have the "any function can grow to 15k insns" rule.
>
> So once you've used up a factor of 2 for a call-site that got inlined,
> you can't just recursively use another factor of N - you now only have a
> factor of "N-2" left.
This part was already handled. This is why the code growth
heuristic I proposed is one everyone seemed to be okay with, and is
based on the size of the original function, before we started
inlining, not the size of the function *as* we are inlining it.
>
> So assuming a call cost of roughtly 10 instructions on average, you
> should hesitate to inline anything that grows to more than 30
> instructions in size.
Which is why i found it growing a 300 insn function to 15k a bit
ridiculous (I also found the part where we had found 128 functions to
inline humorous, but not quite as much. 128 functions is C++ for
you. 15k insns is just plain dumb).
>
> Unless the user _tells_ you to inline, of course. The programmer may
> know that the function is only called once, and is worth inlining. So
> if the user specifically asks for inlining, you might bump the "max
> instructions" up a bit - say from 30 to 100 instructions.
>
> An inlining limit of 15,000 instructions is just completely
> ridiculous.
>
> Linus
--
"It's a good apartment because they allow pets. I have a
Shetland pony named Nikkie. Last summer Nikkie was involved in
a bizarre electrolysis accident. All her hair was removed
except for her tail. Now I rent her out to Hare Krishna family
picnics.
"-Steven Wright