This is the mail archive of the
gcc@gcc.gnu.org
mailing list for the GCC project.
Re: An unusual Performance approach using Synthetic registers
- From: Daniel Berlin <dberlin at dberlin dot org>
- To: Andy Walker <ja_walker at earthlink dot net>
- Cc: Chris Lattner <sabre at nondot dot org>, gcc at gcc dot gnu dot org
- Date: Sun, 29 Dec 2002 14:47:42 -0500
- Subject: Re: An unusual Performance approach using Synthetic registers
On Sunday, December 29, 2002, at 12:29 PM, Andy Walker wrote:
On Sunday 29 December 2002 02:34 am, Chris Lattner wrote:
<snip>
However, the true problem is that GCC performs most of its scalar
optimizations on the RTL representation, which operates on values in
physical registers. For an machine with few physical registers, such
as
X86, many scalar values are spilled to the stack, impeding
optimization.
The fix for this particular problem, IMHO, is not to increase the
number
of registers, for this is only a bandaid. Instead, optimizations
should
be performed on a representation which is not limited by the number of
registers in the target: in GCC this would be the tree
representation, or
hopefully soon the GENERIC and GIMPLE representations. The idea here
is
that you perform optimizations on a representation exposing an
infinite
virtual register set, so scalar stay scalars, and optimizations are
effective despite the number of physical registers in the target.
I keep in mind that at some point the theory hits the silicon. As an
old
assembler programmer, I always have in the back of my head the
question:
"What will this look like in instructions or in memory?"
My approach may allow for an effectively infinite register set.
Which we have (Chris is wrong, we don't perform optimizations on
physical registers, we perform it on an infinite register set), and the
register allocator simply thunks the infinite number down to the real
number we have on the architecture.
For each
subroutine, the compiler figures out how many registers are needed.
Right
now, in GCC, the next step after adding up the number of registers is
to call
Daniel Berlin's implementation of Color-Graphing.
First, this is incorrect.
Right now use an ad-hoc register allocator that isn't all that good
unless you specifically use -fnew-ra.
Second, Michael Matz is responsible for more of the new allocator than
me (i'm just trying to give credit where it's due).
Instead of Color-Graphing, just allocate that number of Synthetic
registers
as part of the frame for that subroutine. If you need one hundred,
use one
hundred, and get on with it.
I have some questions about whether an effectively infinite number of
Synthetic registers is a good idea.
I will not look into that until I have some timing and memory results
using a
statically fixed number of Synthetic registers.
Andy