This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.


Re: [RFC] WHOPR - A whole program optimizer framework for GCC


On Dec 19, 2007 1:41 PM, Kenneth Zadeck <zadeck@naturalbridge.com> wrote:

> I am hoping that in the next couple of days, Nathan and I will be able
> to say that we have completed the work that Codesourcery/NaturalBridge
> contracted to do with IBM.  Completion means that we are able to compile
> and run the C language spec 2000 benchmarks in LTO mode, as well as
> compile all of the gcc compiler itself (this does not include the runtime).

Sounds good.  Do you folks have some criteria for merging into
mainline?  I ran the C testsuite today, forcing every single test to
use -flto.  There were just under 8,000 testsuite failures (FAIL +
UNRESOLVED), which is about a 16% failure rate.

We (Google) plan to keep working on those failures and getting the C++
front end in shape.  We (GCC) should probably figure out a set of
criteria to consider merging the branch into mainline.  Should we
shoot for being able to bootstrap with -flto enabled?  I would at
least like to be able to pass all the testsuites with -flto enabled.

> There are still many open issues that we are hoping that the community
> would address

Thanks.  I've added some of these items to the implementation plan on
the wiki page.  The rest were already there.  Please take a look and
add to or modify the list.

> I personally was planning to start restructuring the ipa passes and
> serializing the cgraph.

Great.  Those are items under the WPA phase. If you have a list of
things to be done besides the ones that are already there, could you
add them?  The more specific we are in this list, the easier it will
be for folks to pick up stuff to do.

> I personally think that the most pressing problems are
>
> 1) making lto/whopr work in the presence of modules that do not fit
> perfectly together, because of type or function argument mismatches.

Agreed.
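
To make the kind of mismatch concrete, here is a minimal sketch of a
cross-module consistency check.  It is only an illustration under
stated assumptions: the dictionary-based symbol tables, the signature
tuples and the name find_mismatches are all hypothetical and do not
correspond to anything in the LTO branch.

```python
from collections import defaultdict


def find_mismatches(modules):
    """Report symbols whose declared signatures differ between modules.

    `modules` maps a module name to a dict of symbol -> signature, where a
    signature is a hashable tuple (return_type, (arg_types...)).
    This representation is an assumption made for illustration only.
    """
    seen = defaultdict(dict)  # symbol -> {signature: [modules declaring it]}
    for mod, symbols in modules.items():
        for sym, sig in symbols.items():
            seen[sym].setdefault(sig, []).append(mod)
    # A symbol is a problem when more than one distinct signature was seen.
    return {sym: sigs for sym, sigs in seen.items() if len(sigs) > 1}


# Hypothetical input: f is declared with different argument lists.
modules = {
    "a.o": {"f": ("int", ("int", "int"))},
    "b.o": {"f": ("int", ("long",)), "g": ("void", ())},
    "c.o": {"g": ("void", ())},
}
mismatches = find_mismatches(modules)
```

A real implementation would have to decide what to do with each
mismatch (diagnose it, merge the types, or refuse to inline across
it); the sketch only detects them.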

> that is available in non C languages.  Toon's paper at last year's
> summit is a good example of exactly how badly we do, and the problem is
> likely to only get worse with LTO/whopr as the lang hooks go away.

Are you talking about aliasing or things like high-level array operations?

> While the last section of the whopr pays some lip service to this

Well, no.  That only addresses some of the aliasing problems.
Representing high-level concepts like array/vector arithmetic or class
hierarchies is not something we have done well in GIMPLE.  In terms of
whole program optimization, we will be interested in addressing class
hierarchy optimizations.

> a community have never really addressed the issues of how we could
> expand/change our internal representation to accommodate the high level
> features supported by the non c frontends.

We have for concurrency, with the extensions to support OpenMP, which
are useful in contexts like auto-parallelization.  But in general, we
don't transfer things like array syntax or class hierarchies very
well.

Now, adding high-level concepts to an IL is usually expensive in
several ways.  Beyond arrays and class hierarchies, do you see any
other high-level concept worth transferring into GIMPLE?  I wouldn't
want to represent very many high-level concepts in GIMPLE.

> The wiki does not indicate that there is any semantic difference between gimple trees
> and gimple tuples

Right, there isn't.  The work on tuples is orthogonal and can go
in/out at any time.  It's just mechanically big, as it changes data
structures used by most of the compiler.  All this work can proceed in
parallel.

> Both of these are very hard problems and they are likely to require the
> same level of commitment that will be required to make Whopr work.  It
> is not that i think that making lto/whopr work in a distributed
> environment is not an important problem, it is just that i think that we
> need to make LTO produce good code on real programs first.

Oh, absolutely.  The design simply allows the first (LGEN) and last
stage (LTRANS) to operate in a distributed environment.  The initial
implementation can simply assume a shared file system.  Distribution
can be added later.  The only important constraint is to avoid
implementation decisions that will prevent processing massively large
applications.
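
As a rough sketch of that staging: LGEN and LTRANS are written as
independent per-unit jobs, so they could later be farmed out to other
machines, and a thread pool stands in for distribution here.  It
assumes the WPA stage in the middle runs as a single step, and the
data shapes, the round-robin partitioning and every function name are
hypothetical, not the actual design.

```python
from concurrent.futures import ThreadPoolExecutor


def lgen(source):
    """LGEN: process one translation unit on its own (parallelizable)."""
    unit, functions = source
    return {"unit": unit, "functions": list(functions)}


def wpa(summaries, num_partitions):
    """WPA: single step that sees every summary and splits the work."""
    all_funcs = sorted(
        (fn, s["unit"]) for s in summaries for fn in s["functions"]
    )
    partitions = [[] for _ in range(num_partitions)]
    for i, item in enumerate(all_funcs):
        # Round-robin partitioning, chosen only to keep the sketch short.
        partitions[i % num_partitions].append(item)
    return partitions


def ltrans(partition):
    """LTRANS: transform one partition on its own (parallelizable)."""
    return [f"{unit}:{fn}" for fn, unit in partition]


def whopr(sources, num_partitions=2):
    # The pool stands in for a distributed environment.
    with ThreadPoolExecutor() as pool:
        summaries = list(pool.map(lgen, sources))
        partitions = wpa(summaries, num_partitions)
        return list(pool.map(ltrans, partitions))


# Hypothetical input: two translation units and their functions.
sources = [("a.c", ["main", "helper"]), ("b.c", ["util"])]
objects = whopr(sources)
```

Nothing in lgen or ltrans looks at another unit or partition, which is
the property that keeps distribution possible later.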


Thanks.  Diego.

