This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: Memory footprint/compile time explosion caused by the tree inliner


On Thursday, April 17, 2003, at 03:15 PM, Eric Botcazou wrote:
PR optimization/10160 reports a huge compile time regression (4h30min vs a
few minutes) on the 3.3 branch wrt the 3.2 branch for LyX/QT on Sparc.

We fix this, then break it, then fix it, then break it. :-( We are doing something wrong.


Let me explain, in C++, there are no non-inlined functions. In C++, there are no function bigger than 5 lines. Now, choose your inlining strategy.

Hint, inlining all function under 6 lines doesn't work. Hint, inlining all functions that are inline doesn't work.

Possible solutions, don't allow more than 15x growth in caller, don't allow caller to grow past 1,000 stmts.

The
time is mostly (85%) spent in the scheduler but the regression is caused by
the tree inliner.

It would be nice if the scheduler works with very large functions. It is ok to miss some of the edges, as long as the creamy center is nice.


For example, one constructor grows from 128 stmts to 15437 stmts (154370
insns according to the fixed conversion factor), that is more than 100
times. Given that the scheduler is at least O(n^2), the game is over.



The problem is the new heuristics of the tree inliner:


int sum_insns = (id ? id->inlined_stmts : 0) * INSNS_PER_STMT
+ currfn_insns;
/* In the extreme case that we have exceeded the recursive inlining
limit by a huge factor (128), we just say no. Should not happen
in real life. */

This comment it wrong. The comment should read, this happens all the time in C++.


if (sum_insns > MAX_INLINE_INSNS * 128)

Suggestion:


	if (sum_insns > MAX_INLINE_INSNS * 128
	    || sum_insns > 15 * (currfn_insn - (id ? id->inlined_stmts : 0)))

Better, make the 15 a parameter. Try values 2-20 on real C++ code, benchmark the benefit v compile time, set value near knee. I think the relative cap is better than an absolute cap, as large bodied functions (aka C), the limit will be higher, and in small bodied functions, the limit will be lower.

:-(

Thanks for tacking this down, wanna try testing out the code above and find a reasonable constant and submit it?


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]