[PATCH] ipa-inline: Adjust condition for caller_growth_limits

Jan Hubicka hubicka@ucw.cz
Wed Jan 8 10:03:00 GMT 2020


> 
> Thanks.  So the caller could be {hot, cold} + {large, small}, and the same for the callee.  That
> gives up to 4 * 4 = 16 combinations.  I agree that it is hard to define "useful",
> and "useful" certainly doesn't always reflect performance improvements. :)
> 
> My case is A1(1) calls A2(2), A2(2) calls A3(3).  A1, A2, A3 are
> specialized cloned nodes with different inputs; they are all hot, called once,
> and each has about 1000+ insns.  By default, the large-function-growth/insns limits are
> both not reached, so A1 will inline A2 and A2 will inline A3, which is 40% slower than no-inline.
> 
> If I adjust large-function-growth/insns to allow
> A2 to inline A3 only, the performance is 20% slower than no-inline.
> All the inlinings are generated in functions called once.

I see. I assume that this is exchange.  What is difficult for GCC in
exchange is the large loop nest.  GCC generally assumes that what is
inside a deeper loop nest is more important, and if I recall correctly
there are 10 nested loops wrapping the recursive call.

The basic observation is that for every self-recursive function the
combined frequency of all self-recursive calls must be less than the entry
block frequency, or the recursion tree will never end.
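
To make the observation concrete, here is a toy calculation (not GCC
code; the function name and the ratio r are made up for illustration).
If r is the combined frequency of the self-recursive calls divided by
the entry block frequency, one outside call leads to roughly
1 + r + r^2 + ... activations, which is finite only for r < 1:

```python
def expected_invocations(recursive_ratio):
    """Expected activations of a self-recursive function per outside call.

    recursive_ratio is a hypothetical input: the sum of the frequencies of
    all self-recursive call sites divided by the entry block frequency.
    """
    if recursive_ratio < 0:
        raise ValueError("frequency ratio cannot be negative")
    if recursive_ratio >= 1:
        # Geometric series 1 + r + r^2 + ... diverges: the profile claims
        # the recursion tree never ends.
        return float("inf")
    # Closed form of the geometric series for r < 1.
    return 1.0 / (1.0 - recursive_ratio)
```

So a profile in which the recursive calls together are as frequent as
the entry block (r >= 1) describes a function that never returns.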

Some time ago we added PRED_LOOP_EXIT_WITH_RECURSION,
PRED_RECURSIVE_CALL, PRED_LOOP_GUARD_WITH_RECURSION, which make loops
leading to recursion less likely to iterate. But this may not be enough
to get the profile correct.

I wonder if we can not help the situation by extending
estimate_bb_frequencies to simply sum the frequencies of recursive calls
and, if they exceed the entry block frequency, forcibly scale down the
corresponding BBs accordingly (which would leave the profile locally
inconsistent, but I am not sure how to do much better - one could identify
control dependencies and drop probabilities, but after that one would need
to re-propagate the loop nest, I guess).

This may 1) make the inliner less eager to perform the inline
         2) make the tree optimizers do less damage to the outer
            loops if inlining happens.
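
A rough sketch of the scaling idea, as a toy model in Python rather than
a patch against estimate_bb_frequencies.  The function name, the dict
representation of BB frequencies and the max_ratio cap are all
assumptions made for illustration; the idea above only says to scale
down when the sum exceeds the entry block frequency:

```python
def scale_recursive_blocks(entry_freq, block_freqs, recursive_call_blocks,
                           max_ratio=0.9):
    """Return a copy of block_freqs with recursive-call BBs scaled down.

    block_freqs: dict mapping a BB name to its estimated frequency.
    recursive_call_blocks: one BB name per self-recursive call site.
    max_ratio: hypothetical cap on (sum of recursive call frequencies) /
               entry_freq; it must stay below 1 for the recursion to end.
    """
    total = sum(block_freqs[bb] for bb in recursive_call_blocks)
    limit = max_ratio * entry_freq
    scaled = dict(block_freqs)
    if total <= limit:
        return scaled  # profile already allows the recursion to terminate
    scale = limit / total
    # Only the BBs containing recursive calls are touched, so the result is
    # locally inconsistent with the surrounding loop nest.
    for bb in set(recursive_call_blocks):
        scaled[bb] = block_freqs[bb] * scale
    return scaled
```

For example, with entry frequency 1.0 and two recursive calls in BBs of
frequency 6.0 and 2.0, both BBs get multiplied by 0.9 / 8.0 while the
enclosing loop body keeps its old frequency.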
Honza


