This is the mail archive of the mailing list for the GCC project.


Re: [RFC] WHOPR - A whole program optimizer framework for GCC

Diego Novillo wrote:
> On 12/18/07 08:29, Jan Hubicka wrote:
>> Doing call graph changes should not be that hard (I was trying to keep
>> a similar design in mind when implementing it, even if we stepped away
>> from the plan in some cases, like reorganizing passes from vertical to
>> horizontal order). The nearest problem I see is merging different
>> declarations of units read back. I have a prototype implementation of a
>> DECL merging pass done from my trip this week and hope to have it
>> working at least for --combine and C during Christmas.
> Cool.  Yeah, that is going to be one of the main things we need to
> continue.  For the next little while I will be working on finishing
> tuples; most of what remains are mechanical changes to get bootstraps
> going.  I will then work on tuning RTL generation.
> Since we have these two ongoing branches (LTO and tuples) that will be
> used by whole program optimizer, I think we need to coordinate a
> little bit.  I wrote up a wiki page to keep all these things linked
> from one place.
> I started a very incomplete implementation plan that I would like
> folks to help fill in.
> Ken/Nathan, what are the major issues still missing in LTO?  I wrote
> up a couple, but I'm sure you guys have a much more complete list.
> Jan, wrt the optimization plan coming out of the analysis phase, and
> the various pieces of header/summary information, what do you think
> are the major pieces we need?
> In terms of branch mechanics, I'm initially tempted to do this
> implementation on a branch separate from tuples and lto.  This will
> allow us to merge both lto and tuples separately, as the rest of the
> optimizer is still a long ways away.  What do folks think?
> Thanks.  Diego.
I am hoping that in the next couple of days, Nathan and I will be able
to say that we have completed the work that Codesourcery/NaturalBridge
contracted to do with IBM.  Completion means that we are able to compile
and run the C language SPEC 2000 benchmarks in LTO mode, as well as
compile all of the gcc compiler itself (this does not include the runtime).

There are still many open issues that we are hoping the community
will address.
(The next four items are considered general cleanups/improvements
independent of LTO and would be welcomed as changes to the trunk when
stage 1 opens.  However, a complete LTO implementation depends on them
being completed):

1) Removal of the rest of the lang hooks.
2) Removal of support for non-file-at-a-time mode (I believe that IanT
has a patch for this.)
3) Removal of any remaining places where the front ends directly
generate rtl.
4) Gimplifying static initializers at the same time as everything else.

When these 4 items are done, it will be possible to consider making
lto work with other front ends.

There are still LTO items that do not work with the C front end.  Most
of these involve extensions to C.

1) We do not handle types that reference local variables, such as
arrays that are sized by the parameter to a function.
2) Nested functions.
3) Attributes associated with types, like packed.

(1) may be hard. The rest are a simple matter of programming.
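
To make items (1) and (3) concrete, here is a small sketch of the kind
of C source involved.  It is only an illustration and is not taken from
the LTO branch; the names row_bytes, sum_matrix, plain and packed_rec
are made up for this example.  Nested functions (item 2) are left out
because they are a GNU C extension that not every compiler accepts.

```c
#include <assert.h>
#include <stddef.h>

/* Item 1: a type that references a local variable.  The typedef's
   size depends on the parameter n, so the type cannot be described
   without referring to a function-local entity. */
size_t row_bytes(int n)
{
    assert(n > 0);            /* a non-positive VLA size is undefined */
    typedef int row_t[n];     /* variably modified type */
    return sizeof(row_t);     /* evaluated at run time */
}

/* Item 1 again: the type of parameter m refers to the parameters
   rows and cols. */
int sum_matrix(int rows, int cols, int m[rows][cols])
{
    assert(rows > 0 && cols > 0);
    int total = 0;
    for (int i = 0; i < rows; i++)
        for (int j = 0; j < cols; j++)
            total += m[i][j];
    return total;
}

/* Item 3: an attribute attached to a type.  The packed attribute
   removes the padding that struct plain normally gets, so layout
   information must travel with the type. */
struct plain { char tag; int value; };
struct __attribute__((packed)) packed_rec { char tag; int value; };
```

On common targets struct packed_rec is smaller than struct plain, which
is exactly the sort of per-type detail that has to survive being
written out and read back.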

There is also still the problem that it is difficult to separate the
LTO type information from the debugging information.

There are a large number of things that need to change to make lto/whopr
a reality.  Many of them are addressed in the Google document.

I personally was planning to start restructuring the ipa passes and
serializing the cgraph.  I was waiting for Honza to get back to being
regularly available so that we could work on that together.  The current
code does not need to serialize the cgraph: since it loads all functions
into memory, the call graph is just rebuilt as each function is loaded.
This obviously needs to be changed before we can talk at all about
distributing the compilation.

I personally think that the most pressing problems are

1) making lto/whopr work in the presence of modules that do not fit
perfectly together because of type or function argument mismatches.  I
think that this will be a challenging problem that will require a lot of
thought and code.  The easy case of just dying when things do not match
up is ok, but it is unlikely that lto/whopr will be a generally useful
tool unless it can swallow any existing program and at least try to do
something good with it.
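
As a rough sketch of the difference between "dying" and "swallowing" a
mismatch, consider the toy check below.  Everything in it is
hypothetical: struct decl_summary, decls_match and plan_merge are
invented for this example and do not correspond to any data structure
or function in GCC, and comparing type names as strings is a stand-in
for real type comparison.  The assumed tolerant policy is simply to
leave mismatched declarations unmerged rather than abort.

```c
#include <string.h>

#define MAX_ARGS 4

/* Hypothetical, heavily simplified summary of one function
   declaration as seen by one translation unit. */
struct decl_summary {
    const char *name;
    const char *return_type;
    int nargs;                       /* must be <= MAX_ARGS */
    const char *arg_types[MAX_ARGS];
};

/* Return 1 if two declarations of the same symbol agree on the
   return type and on every argument type, 0 otherwise. */
int decls_match(const struct decl_summary *a,
                const struct decl_summary *b)
{
    if (strcmp(a->return_type, b->return_type) != 0)
        return 0;
    if (a->nargs != b->nargs)
        return 0;
    for (int i = 0; i < a->nargs; i++)
        if (strcmp(a->arg_types[i], b->arg_types[i]) != 0)
            return 0;
    return 1;
}

enum merge_action { MERGE_DECLS, KEEP_SEPARATE };

/* Assumed tolerant policy: merge when the declarations agree,
   otherwise keep them apart instead of giving up on the program. */
enum merge_action plan_merge(const struct decl_summary *a,
                             const struct decl_summary *b)
{
    return decls_match(a, b) ? MERGE_DECLS : KEEP_SEPARATE;
}
```

For example, a unit declaring f as taking an int and another declaring
it as taking a long would get KEEP_SEPARATE here.  What the optimizer
may still safely do across such a pair is the hard part, and this
sketch does not address it.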

2) making the front end specific aliasing information available in some
language independent manner to the back ends.  Gcc is basically a C
compiler with a bunch of other front ends grafted onto it.  While it
makes many accommodations to the requirements of other languages, it
rarely does things to take advantage of the "higher level" information
that is available in non-C languages.  Toon's paper at last year's
summit is a good example of exactly how badly we do, and the problem is
likely to only get worse with LTO/whopr as the lang hooks go away.
While the last section of the whopr document pays some lip service to
this, we as a community have never really addressed the issue of how we
could expand/change our internal representation to accommodate the high
level features supported by the non-C front ends.
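
One familiar instance of such "higher level" aliasing information is
the Fortran rule that dummy arguments that are modified may not alias
each other.  C can state the same fact explicitly with restrict, which
makes the effect easy to show.  The sketch below is only an analogy
chosen for this illustration; the function names are made up, and it
says nothing about how the information should be represented inside
the compiler.

```c
#include <stddef.h>

/* The restrict qualifiers promise that dst and src do not overlap,
   so the compiler is free to reorder or vectorize the loads and
   stores.  Calling this with overlapping arrays is undefined. */
void scale_add(size_t n, double *restrict dst,
               const double *restrict src, double k)
{
    for (size_t i = 0; i < n; i++)
        dst[i] += k * src[i];
}

/* Same loop without the promise: a store to dst[i] might change a
   later src[j], so the compiler has to be conservative unless it
   can prove otherwise. */
void scale_add_may_alias(size_t n, double *dst,
                         const double *src, double k)
{
    for (size_t i = 0; i < n; i++)
        dst[i] += k * src[i];
}
```

A front end that knows its arguments cannot alias has the information
of the first version for free; if it cannot pass that down in a
language independent way, the back end is left compiling the second.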

I have not looked at the code for Diego's tuplization.  The wiki does
not indicate that there is any semantic difference between gimple trees
and gimple tuples; it just appears to be a well designed data structure
cleanup.  It would be nice if this next round of intermediate code was
able to represent some of the information that we will be throwing away
along with the lang hooks.

Both of these are very hard problems and they are likely to require the
same level of commitment that will be required to make whopr work.  It
is not that I think that making lto/whopr work in a distributed
environment is not an important problem; it is just that I think that we
need to make LTO produce good code on real programs first.
