


Re: IPA




Jan Hubicka wrote:

On Sunday 10 October 2004 22:59, Jan Hubicka wrote:


While I personally agree with the scheme you outlined (and also
described it in the GCC Summit paper), the requirement of not loading
the whole program into memory at once (so relying only on global data
while doing IPA and performing the actual transformations later)
actually puts some stress on the implementation of the IPA optimizers
(i.e. optimizers run after inlining need to deal with the clones, and
it is difficult to iterate optimizations the way we commonly do for
local optimizations).


In my mental model, inlining and cloning would be one of the
last optimizations to do on the call graph. Wouldn't it be a
simple matter of adding the clones to the call graph and moving
on as if it were a normal compilation? At least you would not
actually do any IPA on clones and functions after inlining.
That would require re-analyzing the function to get the global
data right, wouldn't it?



The idea is that inlining (like any code specialization transformation) makes analysis more precise (i.e. once you constant propagate an operand into a function argument, the function might become very trivial), so one way to cope with this is to simply re-run the analysers on the expanded CFG after inlining to gain extra precision.

One possibility is to update the local information once you decide to do
the cloning (i.e. to decide in an informed way that function X is worth
cloning when its argument is constant, you need to know anyway that it
will simplify after you do so).
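
To make the point concrete, a tiny hypothetical C example (the names are made up, not from any patch): once a constant argument has been propagated into a clone, the body collapses, and the analysers see something much simpler than the original function.

/* Original function: the analysis has to assume stride can be anything.  */
static int
scale (int x, int stride)
{
  if (stride == 0)
    return 0;
  return x / stride;
}

/* Clone specialized for the constant argument stride == 4: after
   constant propagation and dead code elimination the body is trivial,
   but that extra precision only shows up once the clone exists.  */
static int
scale_clone_4 (int x)
{
  return x / 4;
}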




I had a lot of problems getting this to work. I think that you need to work on the code either after inlining or before cloning. The patch that I posted (and am still waiting for comments on) introduces the concept of a "master clone": one unique clone among all of the clones for a given procedure that you can hang the info off of. My propagation also gloms all of the edges for all of the clones into one bundle, so you think you are working on the graph before any cloning has occurred.
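
To sketch what that looks like (this is only a hypothetical illustration in C, not the actual cgraph data structures from the patch): each clone points at its master clone, the propagation info lives only on the master, and the propagator walks the edges of all clones of a procedure as one bundle.

/* Hypothetical sketch only -- not the real cgraph structures.  */

struct clone_node;

struct call_edge
{
  struct clone_node *callee;      /* clone being called */
  struct call_edge *next;         /* next outgoing edge of this clone */
};

struct clone_node
{
  struct clone_node *master;      /* the unique "master clone" */
  struct clone_node *next_clone;  /* other clones of the same procedure */
  struct call_edge *callees;      /* outgoing call edges of this clone */
  void *ipa_info;                 /* propagation info; only on the master */
};

/* Visit the outgoing edges of every clone of NODE's procedure, so the
   propagator sees one bundle of edges, as if no cloning had happened.  */
static void
walk_all_clone_edges (struct clone_node *node,
                      void (*visit) (struct call_edge *))
{
  struct clone_node *c;
  struct call_edge *e;

  for (c = node->master; c; c = c->next_clone)
    for (e = c->callees; e; e = e->next)
      visit (e);
}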

The problem with the current cgraphunit is that there is no point at which all of the inlining has been done, because once one procedure has everything inlined into it, it is immediately compiled. This problem has been solved on the profiling and struct branches, and there is a plan to get this back into mainline after the freeze is over.



In fact Kenneth already suggested in private mail not to use this
approach, and instead to load everything into memory at once and do a
kind of iteration on the IPA.


Can you show some pseudo-algorithm of how this would work?



I believe Kenny's idea is to do early optimization, then IPA, then
cloning/inlining, then IPA again on the already inlined and partially
compiled functions, and finally compile the functions.




My plan should really be called the Berlin, Novillo, Zadeck plan; I outlined this to Jan. It is a good one for single files, but it is not the right one for "whole programs".

In this plan, a modest "whole compilation unit" analysis is first performed before inlining is done. Next, each procedure is inlined and then compiled down past the point where SSA form is built, to the point where some of the reducing optimizations are performed (i.e. constant prop, dead code, value numbering). These optimizations are going to extract most of the benefit from the inlining and may delete enough code to really affect the interprocedural info. Then we apply our best aliasing algorithm over the entire compilation unit and restart the single-procedure compilations with that information.
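
As a rough pseudo-algorithm of that plan (written as hypothetical C; every name below is made up, only the ordering of the phases is the point):

/* Hypothetical driver sketch of the plan above; none of these
   functions exist, only the phase ordering matters.  */

struct procedure;
struct unit
{
  int n_procs;
  struct procedure **procs;
};

void ipa_summary_analyze (struct unit *);    /* modest whole-unit analysis */
void inline_and_clone (struct procedure *);
void build_ssa (struct procedure *);
void run_reducing_opts (struct procedure *); /* const prop, dce, value numbering */
void ipa_alias_analyze (struct unit *);      /* best aliasing, whole unit */
void finish_compilation (struct procedure *);

static void
compile_unit (struct unit *u)
{
  int i;

  /* 1. Modest whole-compilation-unit analysis before any inlining.  */
  ipa_summary_analyze (u);

  /* 2. Inline/clone, build SSA, and run the cheap reducing
     optimizations so most of the benefit of inlining shows up and
     dead code disappears from the interprocedural picture.  */
  for (i = 0; i < u->n_procs; i++)
    {
      inline_and_clone (u->procs[i]);
      build_ssa (u->procs[i]);
      run_reducing_opts (u->procs[i]);
    }

  /* 3. Apply the best aliasing algorithm over the whole, now
     simplified, compilation unit.  */
  ipa_alias_analyze (u);

  /* 4. Restart the per-procedure compilations with that information.  */
  for (i = 0; i < u->n_procs; i++)
    finish_compilation (u->procs[i]);
}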

This will be too bulky for the whole-program plan, but it is really doable in the framework of today's build environments, and it should get us real improvements over what we have today.

I am a big advocate of the whole-world-at-link-time plan, but I understand the political problems with this.


I would like to hear some opinions
here.


Hmm, obviously it depends on how much we can improve memory
efficiency; there is obviously a lot of room for improvement
there if we could move to saner data structures. Everyone knows
we're a *little* memory wasteful right now ;-)



My major concern is that even if we are not even a little wasteful, we
will have scalability problems for very large compilation units, which
are going to become more common.


But even then, I wonder if you could take the source of a very
large code base (say, GCC itself, or the Linux kernel) and do
link-time whole-program optimization on it if you have to keep
everything in memory.



GCC or Linux is still small. Think of OpenOffice in 10 years ;)


And iterating (or perhaps worklist-based?) IPA doesn't sound very
attractive either. I suppose you could offer it as an option at
some very high optimization level, but does it really pay off in
general to do IPA so aggressively? Is the benefit large enough to
justify a complicated and probably compile-time-expensive framework
like that?



A worklist is probably not a good idea. I had in mind simply re-running the passes; iteration was probably not the best way to describe it ;)

Honza


In any case, we'll be stuck with doing everything in memory at first
anyway, so we'll have the opportunity to experiment a bit, and that
will be interesting.  Of course, making pre-inlining optimizations
possible is the most important thing for now...

Gr.
Steven






