This is the mail archive of the gcc-patches@gcc.gnu.org
mailing list for the GCC project.
- From: Jan Hubicka <jh at suse dot cz>
- To: Kenneth Zadeck <zadeck at naturalbridge dot com>
- Cc: Jan Hubicka <jh at suse dot cz>, Steven Bosscher <stevenb at suse dot de>,Razya Ladelsky <RAZYA at il dot ibm dot com>, gcc-patches at gcc dot gnu dot org,hubicka at ucw dot cz, Mircea Namolaru <NAMOLARU at il dot ibm dot com>,Ayal Zaks <ZAKS at il dot ibm dot com>, "Berlin, Daniel" <dberlin at dberlin dot org>,"Novillo, Diego" <dnovillo at redhat dot com>
- Date: Tue, 12 Oct 2004 16:31:57 +0200
- Subject: Re: IPA
- References: <OF5CDC40B7.BBE1ECAB-ONC2256F29.0053D8D3-C2256F29.0053FD4F@il.ibm.com> <firstname.lastname@example.org> <20041010205940.GB912@kam.mff.cuni.cz> <email@example.com> <20041010232538.GD912@kam.mff.cuni.cz> <416BE79F.firstname.lastname@example.org>
> Jan Hubicka wrote:
> >>On Sunday 10 October 2004 22:59, Jan Hubicka wrote:
> >>>While I personally agree with the scheme you outlined (and also
> >>>described it in the GCC Summit paper), the requirement of not loading
> >>>the whole program into memory at once (so relying only on global data
> >>>while doing IPA and performing the actual transformations later)
> >>>actually puts some stress on the implementation of IPA optimizers
> >>>(i.e. optimizers done after inlining need to deal with the clones and
> >>>it is difficult to iterate optimizations like we commonly do for local
> >>>ones).
> >>In my mental model, inlining and cloning would be one of the
> >>last optimizations to do on the call graph. Wouldn't it be a
> >>simple matter of adding the clones to the call graph and moving
> >>on as if it is a normal compilation? At least you would not
> >>actually do any IPA on clones and functions after inlining.
> >>That would require re-analyzing the function to get the global
> >>data right, wouldn't it?
> >The idea is that inlining (like any code specialization transformation)
> >makes analysis more precise (i.e. once you constant propagate an operand
> >into a function argument the body might become very trivial), so one way
> >to cope with this is to simply re-throw the expanded CFG after inlining
> >to the analysers again to gain extra precision.
> >One possibility is to update the local information once you decide to do
> >the cloning (i.e. to decide in an informed way that function X is good
> >to clone when its argument is constant, you need to know anyway that it
> >will simplify after you do so).
> I had a lot of problems getting this to work. I think that you need to
> either work on the code after inlining or before cloning. The patch
> that I posted, and am still waiting for comments on, introduces the
> concept of a "master clone": one unique clone among all of the clones
> for a given procedure where you can hang the info. My propagation also
> gloms all of the edges for all of the clones into one bundle so you
> think you are working on the graph before any cloning has occurred.
> The problem with the current cgraphunit is that there is no time when
> all of the inlining has been done, because once one procedure has
> everything inlined into it, it is immediately compiled. This problem
> has been solved on the profiling and struct branches, and there is a
> plan to get this back into mainline after the freeze is over.
I am not quite sure about the struct branch, but with tree profiling,
there is still inlining done just before compiling the function. The
difference is only that you will have the CFG built in the early stages
of compilation, so each function body will already be in low GIMPLE and
past the early optimizations once the IPA analysis takes place (so, for
instance, the inliner can already use the profile read back into the
compiler or built via guessing).
Concerning the IPA analysis, it seems to me that the presence of clones
should not be too major a pain. I plan to implement a pass manager
that will have three hooks for each optimization. One will be called
for each function body once the function is finalized and is supposed
to compute local properties; later you will get called to compute the
global info during cgraph_optimize; and then you will get called for
each body to perform the changes.
If you don't get any benefits from inlining, you can register yourself
before the inliner and you won't see the clones in the global phase. If
you get some more info from the expanded callgraph, you register
yourself after that pass and you should be able to propagate the info
across the cloned nodes as if you were already analysing the function
body after inlining.
I.e. in most cases we will have the local info shared across the clones
and we will analyze it locally before the inlining decisions take place,
while the global info might be different for each clone, but I guess I
have to think more about this.
> >>>In fact Kenneth already suggested in private mail to
> >>>not use this approach and instead load everything in the memory at once
> >>>and do kind of iteration on the IPA.
> >>Can you show some pseudo-algorithm of how this would work?
> >I believe Kenny's idea is to do early optimization, IPA,
> >cloning/inlining, IPA again on the already inlined and compiled
> >functions, and finally compile the function.
> My plan should really be called the Berlin, Novillo, Zadeck plan. I
> outlined this to Jan. It is a good one for single files, but it is not
> the right one for "whole programs".
> In this plan, a modest "whole compilation unit" analysis is first
> performed before inlining is done. Next, each procedure is inlined and
> then compiled down past the point where SSA form is built, to the point
> where some of the reducing optimizations are performed (i.e. constant
> propagation, dead code elimination, value numbering). These
> optimizations are going to extract most of the benefit from the
> inlining and may delete enough code to really affect the
> interprocedural info. Then we apply our best aliasing algorithm over
> the entire compilation unit and restart the single-procedure
> compilations with that information.
> This will be too bulky for the whole program plan, but it is really
> doable in the framework of today's build environments, and it should
> get us real improvements over what we have today.
> I am a big advocate of the whole world at link time plan but I
> understand the political problems with this.
I am trying to keep the organization of the current GCC implementation
of the IPA optimizers in a shape that will scale up to the link time
optimization approach, and I really hope that sooner or later we will
find a way to avoid the political issues with this scheme.