This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: [tree-profiling-branch PATCH] Function cloning + IPCP extension (RESUBMISSION)


Steven Bosscher <stevenb@suse.de> wrote on 11/10/2004 18:03:51:

> On Monday 11 October 2004 16:26, Razya Ladelsky wrote:
> > the existing infrastructure. This should prove to be very useful, and
> > easily portable to low-level Gimple/CFG/Tree-SSA etc.  This has already
> > raised critical issues such as "how to represent the outcome of IPA?" -
> > for CP we can insert a statement to indicate that a parameter is known
> > to hold a constant value, but for other analyses (value prop, alias
> > analysis) this does not work.
> 
> I do agree it is useful, but I doubt it will be so easy to port to
> low level GIMPLE.  But we'll see, perhaps it will indeed not be that
> hard.
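To make the cloning + specialization idea concrete, here is a toy sketch (Python used purely for illustration; all names are invented and nothing here resembles GCC's actual IR or passes): when every call site passes the same constant for a parameter, IPCP can materialize a specialized clone of the callee for that constant.

```python
# Toy sketch of IPCP + cloning -- illustrative only, not GCC code.

def find_constant_params(call_sites):
    """For each (callee, param index), find the constant passed at
    every call site, or drop it if call sites disagree or pass
    non-constant arguments."""
    seen = {}
    for callee, args in call_sites:
        for i, arg in enumerate(args):
            key = (callee, i)
            if key not in seen:
                seen[key] = arg
            elif seen[key] != arg:
                seen[key] = None  # conflicting values: not a constant
    return {k: v for k, v in seen.items() if isinstance(v, int)}

def clone_and_specialize(func_name, param_index, const_value):
    """Materialize a clone specialized for param == const_value."""
    return f"{func_name}.constprop.{param_index}_{const_value}"

# "f" always receives 4 as its first argument; "g" does not.
call_sites = [("f", [4, "x"]), ("f", [4, "y"]), ("g", [1]), ("g", [2])]
constants = find_constant_params(call_sites)
clones = {callee: clone_and_specialize(callee, i, v)
          for (callee, i), v in constants.items()}
```

The point is only the shape of the analysis: a flow-insensitive scan of call sites suffices, which is why it should port to any representation that exposes calls and arguments.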
> 
> 
> > > To quote myself from "http://gcc.gnu.org/projects/tree-profiling.html":
> > >
> > > ==============================================================
> > > (...) plans exist for reorganisation of the compiler passes as follows:
> > >
> > > 1. For each SCC in the call graph in reverse topological order
> > >       do simple intraprocedural optimizations in SSA form
> > >       (basically, run anything that is cheap and can not
> > >       create overlapping live ranges, but will improve the
> > >       results of the next step);
> > >       collect intraprocedural information necessary for
> > >       interprocedural data flow analyses;
> > >       store the function's intermediate representation somewhere (*)
> >
> > Are you convinced that tree-ssa/flow-sensitive/global intraprocedural
> > analysis is necessary?  Most of the benefit we know of can be obtained
> > using scalable flow-insensitive analysis.
> 
> My reasoning is: If we're going to have to traverse each function
> to gather function-local information, then why not clean it up first?
> 

We are not sure how much there is to clean up at such an early stage,
and it does not seem enough to justify transforming to tree-ssa.

> 
> > Most of the benefit we know of can be obtained using scalable
> > flow-insensitive analysis.
> 
> It would still be flow-insensitive in the above scheme (except maybe
> that you get some flow-sensitive stuff for free if we can collect
> local information on the function in SSA form).
> 

Again, it's a cost-performance issue; tree-ssa can improve performance
(and simplify the intraprocedural stage) but seems too costly for the
whole program.

> 
> > Your scheme implies that we hold all functions in Tree-ssa form, right?
> 
> I hope so, but I don't know if that is necessary or beneficial (or cost
> effective, if you like).  I don't know how expensive it would be to take
> each function out of SSA after step 1, and rewrite it a second time
> for step 3 (i.e. post-IPA).  We should definitely keep the function body
> in low-GIMPLE, i.e. even if we'd take the function out of SSA form, you
> wouldn't run TER during out-of-ssa.
> 
> 
> > > 2. Perform interprocedural analyses and optimizations (at least
> > >    constant propagation, side-effects analysis, and function cloning
> > >    and specialization).
> >
> > Where does the inliner kick in?
> 
> I believe cloning and inliner should be the last IPO passes to run,
> cloning before inlining, or simultaneously (dunno what works, or if
> it makes a difference).  You definitely want to be able to inline
> a clone, of course.
> 
> 

We agree; inlining does not affect IPA, but:

> > IPA can open-up new
> > opportunities/priorities
> > for inlining
> 
> That is what Zadeck said too.  I was surprised by that a bit, I had
> expected the only new inlining opportunities to come from cloning
> after IPCP, and simplifying the clone.  That should be pretty easy
> to keep track of.
> 
> Perhaps we need something completely different.  You have more
> experience with this than I do - so what do you think the framework
> for IPA should look like?
>

Jan's framework seems reasonable to us for whole program:
- Gathering intraprocedural properties (on low gimple).
- Performing IPA (recording its decisions/results).
  Making inlining decisions.
- Transformation function by function, starting with materializing the
  clones and inlining.
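A toy sketch of those three phases (all names invented for illustration; these are not GCC APIs, and the "inlining policy" is deliberately trivial):

```python
# Hypothetical sketch of the three-phase whole-program framework:
# summarize, decide, then transform function by function.

def summarize(body):
    # Phase 1 helper: record only the facts IPA will need later.
    return {"calls": body["calls"], "size": body["size"]}

def ipa_decide(summaries, call_graph):
    # Phase 2: decide from summaries + call graph alone, recording
    # decisions instead of transforming anything yet.  Toy policy:
    # plan to inline any callee whose summarized size is small.
    decisions = {}
    for caller, callees in call_graph.items():
        for callee in callees:
            if summaries[callee]["size"] <= 10:
                decisions.setdefault(caller, []).append(("inline", callee))
    return decisions

def apply_decisions(body, decs):
    # Phase 3: transform one function at a time per recorded decisions.
    return dict(body, inlined=[c for op, c in decs if op == "inline"])

def whole_program_compile(functions, call_graph):
    summaries = {n: summarize(b) for n, b in functions.items()}
    decisions = ipa_decide(summaries, call_graph)
    return {n: apply_decisions(b, decisions.get(n, []))
            for n, b in functions.items()}

functions = {"main": {"calls": ["helper"], "size": 50},
             "helper": {"calls": [], "size": 5}}
call_graph = {"main": ["helper"], "helper": []}
out = whole_program_compile(functions, call_graph)
```

Note that phase 2 never looks at a function body, only at the summaries and the call graph; that is what makes the scheme attractive for whole-program compilation.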
 
> 
> > (it already does so now!).
> 
> How often?  Do you have an example?
> You still haven't shown any numbers :-(
> 
> > IPA (and the inliner) need access to all functions together; the
> > important point to decide is the representation in which all functions
> > will be held together.
> 
> IPA would only need all functions together if you indeed would not just
> have a phase before IPA to collect global data.  If you have a pre-pass
> that collects all global data, IPA would work with just that data, and
> you would not need to have all functions together.  This does impose a
> few limitations, but it also removes the requirement that you have to be
> able to hold all functions together.  You can just dump them somewhere
> and do IPA based only on the call graph and information you've gathered
> locally in each function before dumping it.
> 

Agreed.
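As a small illustration of that scheme (hypothetical names, not GCC code): a pre-pass records, per function, whether it writes memory locally; the interprocedural phase then propagates "may write memory" over the call graph alone, never touching function bodies.

```python
# Toy sketch of summary-based side-effects IPA: the analysis needs
# only the call graph plus per-function local summaries.

def propagate_side_effects(local_writes, call_graph):
    """local_writes[f] is True if f itself writes memory.
    Returns the set of functions that may write memory, directly
    or through transitive callees, via a simple fixed point."""
    may_write = {f for f, w in local_writes.items() if w}
    changed = True
    while changed:
        changed = False
        for caller, callees in call_graph.items():
            if caller not in may_write and any(c in may_write
                                               for c in callees):
                may_write.add(caller)
                changed = True
    return may_write

# "log" writes memory itself; "compute" and "main" inherit that
# transitively; "pure" is unaffected.
local_writes = {"main": False, "log": True, "compute": False, "pure": False}
call_graph = {"main": ["compute"], "compute": ["log"], "log": [], "pure": []}
impure = propagate_side_effects(local_writes, call_graph)
```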

> At least, that was the idea.
> 
> If you're going to run some kind of iterating IPA like Honza mentioned,
> then you of course do need all functions readily available.  But would
> that be possible in general for *very* large applications, and what
> does it buy us compared to the scheme I sketched?  For example, does
> IPA make new inlining opportunities available often enough to justify
> keeping all functions around at all times?
> 
> To me, it appears to be a similar problem as that of classic GCSE: You
> can always do another iteration to find more redundancies, but in most
> cases the costs outweigh the benefit.  For IPA, I would *hope* that a
> scheme where you collect local data once and do IPA once would catch
> the vast majority of IPO opportunities.  If that is indeed the case,
> then those last few inlining opportunities and that extra bit of side
> effects info that you get from some kind of iterative IPA is maybe not 
> worth the trouble.
> But since both Zadeck and you argue for keeping all functions around, I
> guess it *is* worth the trouble from your points of view.
>

We are not suggesting iterative IPA, only that IPA be done before
inlining.
 
> Zadeck mentioned that he could have several tens of thousands of Java
> classes in memory and do full inter-class inlining.  That's mighty
> impressive, but not something GCC would be capable of anytime soon,
> unless we manage to radically change our data structures to be far more
> efficient.
> 
> Oh well...  I guess I really do not have enough experience with IPA in
> production compilers to make any sensible comments on all this.  Hope
> you do ;-)
> If you have numbers or some references to papers on this subject, that
> would be much appreciated.
>

We'll send you pointers to relevant papers shortly.
 
> > AFAWCS, porting our IP/CP from trees to lowered Gimple + CFG seems
> > pretty straightforward because it's flow insensitive (and we'll
> > definitely do that as soon as it's available for us!).
> 
> Great.
> 
> 
> > Cloning function CFGs should be done very similarly to the way they
> > are inlined; in fact, maybe the inliner should first clone the callee,
> > and then tailor it into the caller.
> 
> Agreed.  I've always considered inlining to be just a special case of
> cloning, conceptually.
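Conceptually, then, inlining could be sketched as cloning followed by tailoring: clone the callee body with formals renamed to the actuals, then splice the clone into the caller in place of the call. A toy illustration over an invented tuple IR (nothing like GIMPLE):

```python
# Toy sketch: inlining as a special case of cloning.

def clone_body(body, rename):
    """Copy a callee body, substituting formal parameters."""
    return [tuple(rename.get(x, x) for x in stmt) for stmt in body]

def inline_call(caller_body, call_index, callee_params, callee_body):
    call = caller_body[call_index]          # e.g. ("call", "f", "a", "b")
    rename = dict(zip(callee_params, call[2:]))
    clone = clone_body(callee_body, rename)
    # Tailor the clone into the caller in place of the call statement.
    return caller_body[:call_index] + clone + caller_body[call_index + 1:]

callee_params = ["x", "y"]
callee_body = [("add", "t", "x", "y"), ("ret", "t")]
caller = [("call", "f", "a", "b"), ("print", "t")]
inlined = inline_call(caller, 0, callee_params, callee_body)
```

Plain cloning is just the first half (clone + rename) without the splice, which is why sharing the machinery seems natural.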
> 
> Gr.
> Steven
> 
> 
> 

