This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.

Re: [RFC] WHOPR - A whole program optimizer framework for GCC

From: Kenneth Zadeck <zadeck at naturalbridge dot com>
To: Diego Novillo <dnovillo at google dot com>
Cc: Jan Hubicka <hubicka at ucw dot cz>, gcc at gcc dot gnu dot org, Nathan Froyd <froydnj at codesourcery dot com>, Mark Mitchell <mark at codesourcery dot com>
Date: Wed, 19 Dec 2007 22:48:08 -0500
Subject: Re: [RFC] WHOPR - A whole program optimizer framework for GCC
References: <47603F3C.2090808@google.com> <20071218132914.GB12527@atrey.karlin.mff.cuni.cz> <476946B8.9030409@google.com> <476965CC.5050301@naturalbridge.com> <b798aad50712191909q45e9a867ra722db2f2729b405@mail.gmail.com>

Diego Novillo wrote:
> On Dec 19, 2007 1:41 PM, Kenneth Zadeck <zadeck@naturalbridge.com> wrote:
>
>   
>> I am hoping that in the next couple of days, Nathan and I will be able
>> to say that we have completed the work that Codesourcery/NaturalBridge
>> contracted to do with IBM.  Completion means that we are able to compile
>> and run the C language SPEC 2000 benchmarks in LTO mode, as well as
>> compile all of the gcc compiler itself (this does not include the runtime).
>>     
>
> Sounds good.  Do you folks have some criteria for merging into
> mainline?  I ran the C testsuite today by forcing every single test to
> use -flto, there were about <8,000 testsuite failures (FAIL +
> UNRESOLVED), which is about a 16% failure rate.
>   
I never tried such a test.  My list of things to do was based on the
execute tests, which I went through case by case.  It is hard to say
what that 16% really means without doing the same for the full run.
It would not surprise me if there are other issues, but it could also
be that the failures already on my list are simply overrepresented in
the test suite.
 
The "last" bug that Nathan and I are working on is that local statics
are not handled correctly.
The hope is that this will be fixed tomorrow.

An idea that has been kicked around is that once lto is good enough to
replace the old -combine, we should remove -combine and replace it
with lto.  I have not really thought through the details of this, but
given that -combine is (I believe) a C-only feature, having lto be
C-only is not that big a deal.  Certainly, having no extra regressions
in the C testsuite is a requirement.


> We (Google) plan to keep working on those failures and getting the C++
> front end in shape.  We (GCC) should probably figure out a set of
> criteria to consider merging the branch into mainline.  Should we
> shoot for being able to bootstrap with -flto enabled?  I would at
> least like to be able to pass all the testsuites with -flto enabled.
>
>   
My guess is that you are not going to get C++ working until all of the
lang hooks are properly resolved.  On the lto branch, some of these
lang hooks were resolved by simply assuming C.

>> There are still many open issues that we are hoping that the community
>> would address
>>     
>
> Thanks.  I've added some of these items to the implementation plan on
> the wiki page.  The rest were already there, please take a look and
> add/modify to the list.
>
>   
>> I personally was planning to start restructuring the ipa passes and
>> serializing the cgraph.
>>     
>
> Great.  Those are items under the WPA phase. If you have a list of
> things to be done besides the ones that are already there, could you
> add them?  The more specific we are in this list, the easier it will
> be for folks to pick up stuff to do.
>
>   
sure
>> I personally think that the most pressing problems are
>>
>> 1) making lto/whopr work in the presence of modules that do not fit
>> perfectly together, because of type or function argument mismatches.
>>     
>
> Agreed.
>
>   
>> that is available in non-C languages.  Toon's paper at last year's
>> summit is a good example of exactly how badly we do, and the problem is
>> likely to only get worse with LTO/whopr as the lang hooks go away.
>>     
>
> Are you talking about aliasing or things like high-level array operations?
>
>   
Arrays and type hierarchy games are certainly the two that come to
mind.  Strings are also a possibility.  Many languages do magical things
with strings that go way beyond what one can do with arrays.
>> While the last section of the whopr pays some lip service to this
>>     
>
> Well, no.  That only addresses some of the aliasing problems.
> Representing high-level concepts like array/vector arithmetic or class
> hierarchies is not something we have done well in GIMPLE.  In terms of
> whole program optimization, we will be interested in addressing class
> hierarchy optimizations.
>
>   
>> a community have never really addressed the issues of how we could
>> expand/change our internal representation to accommodate the high level
>> features supported by the non-C front ends.
>>     
>
> We have for concurrency with the extensions to support OpenMP which
> are useful in contexts like auto-parallelism.  But in general, we
> don't transfer some things like array syntax or class hierarchies very
> well.
>
> Now, adding high-level concepts to an IL is usually expensive in
> several ways.  Beyond arrays and class hierarchies, do you see any
> other high-level concept worth transferring into GIMPLE?  I wouldn't
> want to represent very many high-level concepts in GIMPLE.
>
>   
>> The wiki does not indicate that there is any semantic difference between gimple trees
>> and gimple tuples
>>     
>
> Right, there isn't.  The work on tuples is orthogonal and can go
> in/out at any time.  It's just mechanically big, as it changes data
> structures used by most of the compiler.  All this work can proceed in
> parallel.
>
>   
>> Both of these are very hard problems and they are likely to require the
>> same level of commitment that will be required to make Whopr work.  It
>> is not that i think that making lto/whopr work in a distributed
>> environment is not an important problem, it is just that i think that we
>> need to make LTO produce good code on real programs first.
>>     
>
> Oh, absolutely.  The design simply allows the first (LGEN) and last
> stage (LTRANS) to operate in a distributed environment.  The initial
> implementation can simply assume a shared file system.  Distribution
> can be added later.  The only important parameter is to avoid
> implementation decisions that will prevent processing massively large
> applications.
>
>
> Thanks.  Diego.
>   
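For readers following the thread, here is a toy Python model of the three stages named above (LGEN, WPA, LTRANS), with a temporary directory standing in for the shared file system.  It is only a sketch and is not taken from the lto branch: the summary file format, the function names, and the "drop functions nobody references" decision are all assumptions made for illustration.  The point it tries to show is that the first and last stages work on one unit at a time, so they could be distributed, while the middle stage needs to see every summary.

```python
import json
import tempfile
from pathlib import Path


def lgen(name, functions, shared_dir):
    """LGEN stand-in (hypothetical): write one summary file per unit.

    `functions` maps each function defined in the unit to the list of
    functions it references.
    """
    path = Path(shared_dir) / f"{name}.summary.json"
    path.write_text(json.dumps({"unit": name, "functions": functions}))
    return path


def wpa(summary_paths):
    """WPA stand-in (hypothetical): read every summary and collect the
    names referenced anywhere in the program.  This is not a real
    reachability analysis, just the simplest whole-program fact.
    """
    referenced = set()
    for path in summary_paths:
        summary = json.loads(Path(path).read_text())
        for callees in summary["functions"].values():
            referenced.update(callees)
    return referenced


def ltrans(summary_path, referenced, roots=("main",)):
    """LTRANS stand-in (hypothetical): per unit, keep only the functions
    that are roots or are referenced somewhere in the program.
    """
    summary = json.loads(Path(summary_path).read_text())
    return sorted(
        f for f in summary["functions"] if f in referenced or f in roots
    )


def run_pipeline(units):
    """Run the three stages in order over a shared directory."""
    with tempfile.TemporaryDirectory() as shared_dir:
        # Per-unit stage: each call is independent of the others.
        paths = {n: lgen(n, funcs, shared_dir) for n, funcs in units.items()}
        # Whole-program stage: needs all summaries at once.
        referenced = wpa(paths.values())
        # Per-unit stage again, using the whole-program result.
        return {n: ltrans(p, referenced) for n, p in paths.items()}


if __name__ == "__main__":
    units = {
        "a": {"main": ["helper"], "unused_a": []},
        "b": {"helper": [], "unused_b": ["helper"]},
    }
    print(run_pipeline(units))
```

In this model only `wpa` has to read everything, and it reads small summaries rather than whole units, which is one way to avoid a design decision that would rule out very large applications.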

kenny

References:
- [RFC] WHOPR - A whole program optimizer framework for GCC
  - From: Diego Novillo
- Re: [RFC] WHOPR - A whole program optimizer framework for GCC
  - From: Jan Hubicka
- Re: [RFC] WHOPR - A whole program optimizer framework for GCC
  - From: Diego Novillo
- Re: [RFC] WHOPR - A whole program optimizer framework for GCC
  - From: Kenneth Zadeck
- Re: [RFC] WHOPR - A whole program optimizer framework for GCC
  - From: Diego Novillo