This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



Re: [RFC/patch] Callgraph based inlining heuristics


Richard Guenther wrote:

Steven Bosscher wrote:


Jan Hubicka wrote:


(that is not the case of attribute leafify proposed - doing the
data structure in full generality allowing each particular walk in the
callgraph to be either collapsed or not is probably too expensive so
attribute leafify can be implemented as a special case).


Assuming we really want such an attribute...



Yes, we really want such an attribute. There are cases where you benefit


I know that _you_ really want it because you want your application to work well right now, and I understand that. What I am afraid of is that with this attribute we implement yet another very ugly (IMHO) "feature" and are stuck with it forever... And "leafify" only papers over the real problem; it would be better to find a way to fix that problem instead. Of course that fix will not be as easy to implement as "leafify". The Dark Side is easier, more seductive...

very much from cse/gcse if you remove all calls inside a loop by
leafifying it for my scientific C++ application (POOMA based). Also
the loop optimizer could probably do better in this case (it doesn't, but
that's another problem). And I never want the compiler to do so much
inlining without telling it explicitly (of course some profile feedback
on the resulting asm code speed/size would really cut it).
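
To make the use case in the quoted text concrete, here is a minimal sketch of a loop whose calls all get inlined on explicit request. It uses the `flatten` function attribute documented by later GCC releases (every call inside the marked function is inlined, if possible); treating it as equivalent to the leafify attribute proposed in this thread is an assumption, and `scale`, `shift` and `sum_kernel` are hypothetical names invented for the illustration.

```c
#include <assert.h>

/* Hypothetical helpers; without inlining each one is a call in the loop. */
static double scale(double x) { return 2.0 * x; }
static double shift(double x) { return x + 1.0; }

/* Assumption: `flatten` stands in for the proposed leafify attribute.
   It asks the compiler to inline every call in this function body,
   so the loop below ends up free of calls when inlining succeeds. */
#if defined(__GNUC__)
__attribute__((flatten))
#endif
double sum_kernel(const double *a, int n)
{
    assert(a != 0 && n >= 0);
    double s = 0.0;
    for (int i = 0; i < n; i++)
        s += shift(scale(a[i]));   /* candidates for inlining, then cse/gcse */
    return s;
}
```

The attribute changes only how the function is compiled, not what it computes, so the result is the same with or without it.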

That is what we should be aiming for instead: find a way to include some kind of profile information to guide inlining. What I would really like to see is that, in addition to callgraph based inlining (unit-at-a-time), we could also decide to expand calls inline later in the compilation process, when we discover that a call is in a hot zone of the code. The problem is that this requires a tree CFG (i.e. tree-ssa) and profile information in that tree CFG. I don't know if this is feasible at all, because we still don't maintain the CFG across all passes, let alone when expanding trees to RTL, and I don't have a clue about how GCC collects and loads profile information. But IIRC Honza was doing some work in that area as well?
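
A rough sketch of the kind of decision described above, assuming profile counts are already attached to call sites. Nothing here is GCC code: `struct call_site`, `should_inline_late`, the 10% hotness cutoff and the size budget are all hypothetical and only illustrate "inline late when the call turns out to be hot".

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record for one call site (not a GCC data structure). */
struct call_site {
    const char   *callee;       /* name of the called function */
    unsigned long exec_count;   /* profile count of the block holding the call */
    unsigned      callee_size;  /* estimated callee size, in instructions */
};

/* Return 1 if the call should be expanded inline late in compilation:
   the call site must be hot relative to the hottest block in the unit
   (max_count) and the callee must fit within size_budget.
   The 0.1 cutoff is an arbitrary assumption for the sketch. */
int should_inline_late(const struct call_site *cs,
                       unsigned long max_count, unsigned size_budget)
{
    assert(cs != NULL);
    if (max_count == 0)
        return 0;               /* no profile data: leave the call alone */
    double heat = (double)cs->exec_count / (double)max_count;
    return heat >= 0.1 && cs->callee_size <= size_budget;
}
```

A real implementation would also have to weigh code growth across the whole unit; this sketch looks at one call site in isolation.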

Gr.
Steven


