This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



Re: GVN-PRE sucks -- well, for SPECCFP2000 mgrid anyway




Daniel Berlin wrote:


On Aug 15, 2004, at 8:05 PM, Mark Mitchell wrote:


Daniel Berlin wrote:

Yeah.
I'll see if I can come up with some simple heuristics that help here, without hurting anything else too much.


I wonder if it's better to restrain PRE, or let it go ahead, and then have the register allocator move computations back into the loop.

It could also split the live ranges, and solve the problem that way.
This is what I would suggest, but realistically, it's going to be easier for us to restrain PRE in our current infrastructure. That, and nobody ever seems to want to improve the register allocator :).
Though at some point, all our optimizations are going to start causing real problems if our RA can't split live ranges.
On the other hand, we're also going to need this type of register pressure calculation for unrolling and whatnot.


It's also not going to restrain PRE all *that* much in most cases.
This loop/function seems to be a very bad case. Other compilers I can get debug info from all estimate the register pressure at ~25 integer registers and ~5 floating-point registers.


That's going to cause spilling on almost all platforms without PRE pulling anything out.

The only real problem with restraining PRE is that it would be far better for PRE to pull it out, and something later to decide which computations to move back in. It might be the case that it makes more sense to keep some of the address arithmetic pulled out, at the cost of moving some other calculation back into the loop. Who knows.

This is handled differently in different compilers. There are basically two schools of thought here:

1) Let PRE be PRE and have the register allocator "rematerialize" the guys that have been pulled out of the loop.

2) Make each of the optimizations have a cost model that factors in the way that several transformations affect "max live", i.e. the number of items that will need to be kept in registers at any point in the program.
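
To make "max live" concrete, here is a minimal sketch in Python. It is not how GCC or any other compiler computes register pressure; the interval representation, the function name max_live, and all of the program points are assumptions made up for illustration. Each value is modelled as a (start, end) live interval, and the sketch counts how many intervals overlap at the worst point, before and after three hypothetical values are hoisted out of a loop.

```python
def max_live(intervals):
    """Return the peak number of simultaneously live values.

    Each interval is (start, end): the value is live at every program
    point p with start <= p < end. This is a toy model, not a real
    dataflow-based liveness analysis.
    """
    events = []
    for start, end in intervals:
        events.append((start, 1))    # value becomes live
        events.append((end, -1))     # value dies
    # Tuples sort by point first; at equal points -1 sorts before +1,
    # so a value that dies at p does not overlap one that starts at p.
    events.sort()
    live = peak = 0
    for _point, delta in events:
        live += delta
        peak = max(peak, live)
    return peak


# Hypothetical loop body covering program points 10..20.
loop_locals = [(10, 14), (12, 16), (15, 20)]

# Hypothetical hoisted values: computed before the loop and kept live
# across the whole loop body.
hoisted = [(0, 20), (1, 20), (2, 20)]

before_pre = max_live(loop_locals)
after_pre = max_live(loop_locals + hoisted)
print(before_pre, after_pre)
```

In this made-up example every hoisted value adds one to the pressure inside the loop, which is the effect a school-2 cost model would have to predict and a school-1 register allocator would have to undo.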

I have always been in favor of the first school, but there really are two sides to this issue. In the gcc world there are two parts of the debate. The first is actually making the decision and the second is getting the community to stick by it. This is one of those places where the cats really do need to march in lock step, because this becomes not just an issue with PRE but with every optimization that can move code around.

The truth is that without any technical debate I can see plan 1 becoming the only one that will be workable in gcc because the management problems become intractable (I am still adding up the hours from the last attempt to get a single cat turned around.) You are basically going to have to get everyone to try and make their transformations pressure sensitive.

The technical issues are that it is easier to write all the transformations so that they assume that they have infinite registers and then split things up later. Plan 2 runs into problems because the first optimizations to run get to use up all of the registers and the rest of the optimizations get constrained even if they would have been more profitable.
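
The ordering problem with plan 2 can be sketched with a toy model. Everything below is an assumption for illustration: the pass names, the (benefit, registers) pairs, and the budget of 4 registers are invented, and no real compiler accounts for pressure this simply. The first function lets passes claim registers in the order they run; the second picks the best set of candidates regardless of which pass proposed them (a plain 0/1 knapsack).

```python
def first_come(passes, budget):
    """Accept candidates in pass order while they still fit the budget."""
    gained = 0
    for _name, candidates in passes:
        for benefit, regs in candidates:
            if regs <= budget:
                budget -= regs
                gained += benefit
    return gained


def best_overall(passes, budget):
    """Best total benefit over all candidates, ignoring pass order."""
    best = [0] * (budget + 1)   # best[b] = max benefit using at most b registers
    for _name, candidates in passes:
        for benefit, regs in candidates:
            # Iterate downwards so each candidate is used at most once.
            for b in range(budget, regs - 1, -1):
                best[b] = max(best[b], best[b - regs] + benefit)
    return best[budget]


# Hypothetical candidates: (estimated benefit, extra registers needed).
passes = [
    ("early_pass", [(3, 2), (2, 2)]),
    ("late_pass", [(10, 3)]),
]

greedy_gain = first_come(passes, budget=4)
ideal_gain = best_overall(passes, budget=4)
print(greedy_gain, ideal_gain)
```

Here the early pass spends the whole budget on two small wins, so the later and more profitable transformation is rejected. Under plan 1 no pass is constrained up front; the cost is that the register allocator has to make the final choice.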

This will really become an issue as we start tuning the loop transformations. It really is easier to work in a world where you have ejected everything from a loop before you decide if it is profitable to do something like loop switching, which will end up changing all of your register pressure estimates anyway. It is hard to see how any splitting decision that Danny makes now will be correct after the loops are switched.

I realize that I am being unsympathetic to Danny's "no one wants to tackle the register allocator" argument. However, it may just be time to light a fire under Richard Henderson and have him do what he did at IBM.



I've copied Kenny here to see if he happens to know of any magic bullets for this kind of thing.


(Kenny, the issue is that PRE is pulling so much stuff out of loops that we get register pressure, and hence bad code.)

--
Mark Mitchell
CodeSourcery, LLC
(916) 791-8304
mark@codesourcery.com


