Inlining heuristics

Tue May 13 19:45:00 GMT 2003

I'll stick my neck out and give my understanding of inlining issues as they
came up in Gambit-C, a Scheme->C compiler.  Because working with lots of
little functions is something like "the lisp way", inlining is quite important
here for performance, and my guess is that similar issues come up in
heavily templatized C++ code.  Marc Feeley, the author of Gambit-C, certainly
understands the specifics of these issues much better than I do.

1.	Bottom up inlining is (more or less) equivalent to compressing the
	call graph into layers.  This is not a bad strategy.  Gambit-C used
	to use the top-down strategy but changed because of unpredictable
	behavior.

2.	Gambit-C uses a single heuristic---a parameter determines how much
	expansion is allowed by inlining a function.  In the example that
	Michael Matz gave, where a short, frequently-called, function
	very infrequently calls a much larger function, this heuristic
	would not allow the large function to be inlined into the small one;
	on the other hand, the small one would still be inlined into its
	callers, since it would not increase the size of the callers very
	much.

The heuristic is applied at the expression level.  The default inlining-limit
is 300%, i.e., inlining is allowed to triple the size of the function.
For heavily numerical code I usually use 500% or 1000%, with carefully
hand-tuned kernels set to 0%, since I've usually written them with as
much inlining as I want from the beginning.  (Loops in Scheme are just function
calls, so loop unrolling is achieved via the inlining mechanism for function
calls.)  Inlining control is pretty much at the block level in Gambit-C; this
would seem to be hard to do with the current machinery in gcc.

Perhaps some experiments could be attempted with C/C++ to see if heuristics
like this could be helpful.

Brad