This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



Re: [tree-ssa, RFC] CFG transparent RTL expansion


On Fri, 2003-12-19 at 07:13, Jan Hubicka wrote:

> > 1 - Where do you propose to emit the measurement code, and where do you
> > plan to read it in? Presumably they have to be the same location... I
> > assume the measurement code is currently emitted in RTL land, probably
> > early in the RTL cycle? Would we be doing that in the front end, or in
> > SSA land? Or just before SSA, after the CFG is created?

> 
> I plan to move it to a very early point in the SSA compilation path, just
> after the early cleanups (ideally DOM and DCE) are done, so we don't end up
> profiling an unnecessarily complicated CFG.


> > 
> > 2 - Does profiling work on abnormal edges?  I.e., are throw edges
> > annotated, or are they guessed/ignored/calculated when possible?
> 
> It works on noncritical abnormal edges; for critical abnormal edges we
> guess. But of course the idea is that once we make EH edges redirectable
> (and thus not abnormal), abnormal edges will be a real anomaly and very
> few of them will be critical (setjmp edges are not, and computed jump
> edges also won't create critical ones as long as we stay with the one
> computed jump per function trick we do at SSA now).
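
For readers following the critical-edge point: an edge is critical when its source block has more than one successor and its destination block has more than one predecessor. A counter cannot be placed on such an edge without splitting it, and an abnormal edge generally cannot be split, which is presumably why those counts are guessed. A minimal sketch, assuming a CFG held as a plain dict of successor lists (a hypothetical representation, not GCC's own data structures):

```python
# Hypothetical CFG representation: block name -> list of successor blocks.
# Illustration only; this is not how GCC stores basic blocks or edges.

def critical_edges(succs):
    """Return edges (src, dst) where src has more than one successor
    and dst has more than one predecessor."""
    pred_count = {}
    for src, dsts in succs.items():
        for dst in dsts:
            pred_count[dst] = pred_count.get(dst, 0) + 1
    return [
        (src, dst)
        for src, dsts in succs.items()
        for dst in dsts
        if len(dsts) > 1 and pred_count[dst] > 1
    ]

cfg = {
    "entry": ["a", "b"],
    "a": ["join"],
    "b": ["join", "exit"],
    "join": ["exit"],
    "exit": [],
}
edges = critical_edges(cfg)
```

Here both edges leaving "b" are critical, because "b" branches two ways and each target is also reached from another block. The edges leaving "entry" are not, since "a" and "b" each have a single predecessor.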


> > 
> > Simplistic example, but let's say profiling indicates that we ought to
> > inline a(z) when we are processing b. How does that profile information
> > from a() get propagated into the code which is inlined into b? There is
> > no CFG for a(), just the trees. Do you have to read the info for a() in
> > from the profile source each time it's inlined, or how does that work?
> 
> The inlining problem is not dealt with by this particular patch; it solves
> just the SSA<->RTL interface. You know what my longer-term plan is, but
> let's try not to run into very deep details of the inlining interface
> right now.
> 

I realize this. The main purpose of this patch is to enable other
things, so I think we need to discuss those other things in order to
make decisions on whether this is the best approach.


> This is a common problem with updating the profile (not only after inlining
> but after any code specialization - unrolling, tracing or whatever). 
> 
> The idea is that when such a conflict is noticed, the profile gets updated
> in a somewhat partly incorrect way (such that all basic blocks directly
> dominated by the edge being removed get their frequencies subtracted). The
> optimizations expect such partially invalidated profiles and must behave
> sanely in their presence.
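
One possible reading of that update rule, sketched under stated assumptions: the CFG is a dict of successor lists, block counts are a dict of integers, and "dominated by the edge" means the blocks that can only be reached from the entry by crossing that edge. All names here are hypothetical and the sketch is not taken from the patch.

```python
# Hypothetical sketch of the "partly incorrect" profile update described above.

def blocks_dominated_by_edge(succs, entry, edge):
    """Blocks that can only be reached from entry by crossing `edge`."""
    def reachable(skip):
        seen, stack = {entry}, [entry]
        while stack:
            src = stack.pop()
            for dst in succs[src]:
                if (src, dst) != skip and dst not in seen:
                    seen.add(dst)
                    stack.append(dst)
        return seen
    return reachable(None) - reachable(edge)

def update_counts_for_removed_edge(succs, counts, entry, edge, edge_count):
    """Subtract the removed edge's count from each block it dominated,
    clamping at zero. Blocks past the dominated region are left alone."""
    for block in blocks_dominated_by_edge(succs, entry, edge):
        counts[block] = max(0, counts[block] - edge_count)

cfg = {
    "entry": ["a", "b"],
    "a": ["c"],
    "b": ["join"],
    "c": ["c", "join"],   # "c" is a small loop
    "join": [],
}
counts = {"entry": 100, "a": 60, "b": 40, "c": 300, "join": 100}
update_counts_for_removed_edge(cfg, counts, "entry", ("entry", "a"), 60)
```

After the update "a" drops to 0, but the loop block "c" keeps a count of 240 and "join" still reads 100. That is the kind of partially invalidated profile later passes would have to tolerate.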



Clearly, as long as the CFG exists, that's where the information ought to
be stored. The only real question I want to deal with is this: should the CFG
be kept after SSA right through to RTL, or, when the CFG is destroyed,
should the information be attached to the trees somehow and then used
during expansion to annotate the new CFG that RTL creates, and/or used to
annotate the CFG for trees when the function is inlined? I think that's
fundamentally where we have decisions to make, so I would like to work
through the various differences. I'm also about to go on vacation, so the
more I have to think about the better :-)

If we annotate the trees with profiling info when we are done with them,
reasonably correct branch information can be propagated into the inlined
code. If the CFG is the only place the information is stored, we'd also
have to keep the CFG for a() around in order to put its information
into b() when it's inlined. So we wouldn't just be keeping the trees for
inlineable functions around, we'd have to keep objectified CFGs as
well. Don't read that as me saying I'm religiously opposed to keeping CFGs
right through to RTL. It's merely an observation.


> The profile becomes less exact during the optimization process, but overall
> most compilers (ORC and IMPACT, for instance) are able to keep it in good
> enough shape by such simplistic methods to bring benefits for BB reordering.
> Most questions about the profile are easy, like "is this loop executed
> insanely many times?" or "is this basic block hot?"
> These questions are relatively stable WRT slight degenerations.
> 

Sure, but if we don't bring an inlinable function's profile into the new
function when we inline it, we are throwing away reasonably correct
information right off the bat. If you have all the probabilities for a()'s
branches, you ought to bring those into the code you create when you
inline a() into b(). I'd like to make sure we do that.
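
A minimal sketch of what bringing a()'s profile into b() could look like, assuming per-block execution counts and a simple scale-by-call-site rule. The function, block names and numbers are made up for illustration; nothing here is taken from the patch under discussion.

```python
# Hypothetical sketch: scale the callee's block counts so that its entry
# count matches the count of the call site it is inlined into.

def scale_callee_profile(callee_counts, callee_entry, call_site_count):
    """Return callee block counts rescaled for one inlined copy.
    Integer arithmetic, so results are rounded down."""
    entry_count = callee_counts[callee_entry]
    if entry_count == 0:
        # No profile data for the callee: nothing meaningful to scale.
        return {block: 0 for block in callee_counts}
    return {
        block: count * call_site_count // entry_count
        for block, count in callee_counts.items()
    }

# Profile measured for a() over the whole run: the "then" arm is taken 90%.
a_counts = {"a.entry": 1000, "a.then": 900, "a.else": 100, "a.exit": 1000}

# Suppose the call to a() inside b() executed 50 times.
inlined = scale_callee_profile(a_counts, "a.entry", 50)
```

The inlined copy keeps the 90/10 branch split from a() while its absolute counts fit the call site in b(). The scaling assumes a() behaves the same at every call site, which is only an approximation.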

Yes, the degenerations don't affect loop stuff so much as they do the
lower level bits like register allocation and branch reversal/prediction
at the RTL level, which will be the farthest away consumers of this
information, the side effects of which we can't predict, just measure :-)

So we presumably have to read the information back in during the second
compilation at the same point the information was written out during the
first one, or we don't get a good mapping of block execution counts,
right? And that's somewhere in the SSA optimizer.
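
To illustrate why the two points have to line up, here is a small sketch in which counters are stored by block position and only mapped back when the CFG has the same shape as when they were written. The signature check and all names are assumptions made for the example, not a description of the existing profile file format.

```python
# Hypothetical sketch: counts are written by block position, so they only
# map back correctly onto a CFG with the same shape.

def cfg_signature(succs):
    """A comparable description of the CFG shape (block order and edges)."""
    return tuple((block, tuple(dsts)) for block, dsts in succs.items())

def write_profile(succs, counts):
    """First compilation: record the CFG shape and the counts in block order."""
    return {
        "signature": cfg_signature(succs),
        "counts": [counts[block] for block in succs],
    }

def read_profile(succs, profile):
    """Second compilation: map counts back, or give up if the CFG differs."""
    if profile["signature"] != cfg_signature(succs):
        return None
    return dict(zip(succs, profile["counts"]))

cfg_at_write = {"entry": ["a", "b"], "a": ["exit"], "b": ["exit"], "exit": []}
profile = write_profile(cfg_at_write, {"entry": 10, "a": 7, "b": 3, "exit": 10})

# Same point in the pipeline: same CFG, counts map back cleanly.
same_point = read_profile(cfg_at_write, profile)

# A later point, after some pass has removed block "b": no valid mapping.
cfg_later = {"entry": ["a"], "a": ["exit"], "exit": []}
later_point = read_profile(cfg_later, profile)
```

Reading at a different point fails in this sketch because the block list no longer matches, which is the mapping problem raised above.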

> > 
> > 4 - Does anyone other than me find the idea of inlining optimized trees
> > appealing? I understand that's not on the table right now because there
> > are difficulties with EH, at the very least. I would think we could
> > encapsulate that info somehow, but clearly that's a problem that would
> > need to be solved.
> 
> Sure it is appealing.  In order to do something sane, you need to do
> some analyses that are not doable without CFG/early cleanups.
> As pointed out already, the benefit of inlining is estimated by the
> execution time of the callee, while the cost is the size.  At the moment we
> have only the second piece of information, and our strategy is "do as much
> inlining as the size constraint allows and let's hope that very fast
> functions will get inlined too then".
> 
> For instance, my current recursive inlining code would do a much better job
> if we were able to discover tail calls.  Another example is partial
> inlining, where functions like
> if (test)
>    common fast path
> else
>    something large
> 
> can be inlined as
> 
> if (test)
>   fast path
> else
>   do_something_large ()
> 
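
The partial inlining transformation quoted above can be sketched as a before/after pair. The functions below are invented for illustration: f() stands in for a callee with a cheap common path and a large rare path, and the "after" caller is written by hand to show the intended result.

```python
# Hypothetical before/after illustration of partial inlining.

def do_something_large(x):
    # Stands in for the large, rarely executed part of the callee.
    return sum(i * x for i in range(1000))

def f(x):
    # Original callee: a cheap test guards a fast path and a large slow path.
    if x >= 0:
        return x + 1
    else:
        return do_something_large(x)

def caller_before(x):
    # Before: an ordinary call to f().
    return f(x) * 2

def caller_after_partial_inlining(x):
    # After: the test and the fast path are inlined into the caller,
    # while the large part stays out of line behind a call.
    if x >= 0:
        y = x + 1
    else:
        y = do_something_large(x)
    return y * 2
```

Both callers compute the same result. The second avoids call overhead on the common path without copying the large body into every caller.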

OK, so we're on the same page there, someday we want to inline optimized
functions :-)


So where do you envision the inliner then? It would have to be done
after the profiling information is read in, in order to make use of it.
Presumably immediately after. You also plan to create a CFG-aware
inliner and objectify the CFG, and do the inlining from that? And then
the inliner would work on SSA, right after DCE or some such place? So the
inliner will be an SSA inliner, or would we go out of SSA and back into
SSA at that point?

Andrew



