This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Re: RFC: Patch to align spills beyond what the stack supports

From: Jeff Law <law at redhat dot com>
To: Steve Ellcey <sellcey at imgtec dot com>, gcc-patches at gcc dot gnu dot org
Date: Mon, 11 May 2015 14:48:17 -0600
Subject: Re: RFC: Patch to align spills beyond what the stack supports
Authentication-results: sourceware.org; auth=none
References: <446d4a11-c8d4-4d75-85d0-e71a0023767c at BAMAIL02 dot ba dot imgtec dot org>

On 05/07/2015 11:52 AM, Steve Ellcey wrote:

I would like to get some feedback on an idea of how to spill registers
that require (or perhaps only prefer for performance reasons) an alignment
greater than that supported by the stack.

If you look at how GCC supports local variables with alignment requirements
greater than the stack supports, there are two methods.  One is to dynamically
realign the stack and the other is to alloca a block of space on the stack
and create a pointer to an aligned address within that space.
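
As an illustration of the second method, here is a minimal source-level
sketch, not GCC's internal implementation.  It over-allocates a block with
alloca and rounds the address up to the wanted boundary.  WANTED_ALIGN,
align_up and aligned_block_fits are hypothetical names chosen for this
example, and the 16-byte figure is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define WANTED_ALIGN 16u  /* assumed: alignment wanted for a 16-byte vector */

/* Round ADDR up to the next multiple of ALIGN, which must be a power of two. */
static uintptr_t align_up(uintptr_t addr, uintptr_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr + align - 1) & ~(align - 1);
}

/* Carve an aligned block of SIZE bytes out of a larger alloca'd block.
   Returns nonzero if the aligned block lies fully inside the raw block. */
int aligned_block_fits(size_t size)
{
    /* Over-allocate by WANTED_ALIGN - 1 so an aligned start always exists,
       whatever alignment the stack itself happens to provide. */
    size_t padded = size + WANTED_ALIGN - 1;
    unsigned char *raw = __builtin_alloca(padded);
    unsigned char *aligned =
        (unsigned char *) align_up((uintptr_t) raw, WANTED_ALIGN);

    return (uintptr_t) aligned % WANTED_ALIGN == 0
           && aligned >= raw
           && aligned + size <= raw + padded;
}
```

The stack pointer's own alignment is never changed here; only the pointer
into the block is aligned, which is why this method does nothing for spill
slots the register allocator places elsewhere in the frame.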

The advantage of the first approach is that in addition to aligning local
variables, aligning the stack will align spill slots since they are also
stored on the stack.  The disadvantage with this approach is that it is
complicated, involves a fair amount of target-dependent changes, and risks
breaking a target's ABI if not done very carefully.  The x86 target is the
only one that has implemented this approach as far as I can see.

The advantage of the second approach is that it is target-independent and
doesn't risk breaking a target's ABI.  The disadvantage is that it doesn't
help with aligning spill slots since it isn't actually changing the stack's
alignment.

After spending some time trying to implement the dynamic stack alignment
method for MIPS, I decided to try the second approach and extend the alloca
method to spills.

My approach was to create a tree pass that tried to guess at how many
aligned spill slots might be needed in a routine and then allocate an
aligned local variable to hold those spills.  This variable would get the
needed alignment via the existing alloca method used for local variables.

This guessing of how many spills we may need is the main drawback to my
approach because we have no way of actually knowing how many (if any) spill
slots we will end up needing.  The advantage of this approach is that
the only real target-dependent code I needed to create was to set a hard
register that could point to the spill variable so that the lra-spill code
could use it when needed for spills.  I didn't implement anything for
non-LRA register allocation.

I have been thinking that it might be possible to tag the alloca that
allocates the spill area somehow so that lra-spills could increase (or
decrease) it if needed but I haven't investigated that idea in detail yet.

What do people think about this idea?  Attached is a patch I created that
implements this idea for MIPS.  I am still doing some testing, mostly with
MSA vector registers (a MIPS feature not yet checked in).  Reading and writing
these 16-byte registers on a 16-byte boundary is more efficient than on an
8-byte boundary, but the O32 MIPS ABI only supports 8-byte stack alignment.

Is this an approach that might be approved for checkin with some further
work?

I think the first approach is generally better -- again, you're doing things fairly early in the pipeline to attack a problem that is much later in the pipeline. While you've probably minimized many of the ways we can lose with this approach today, it still strikes me as likely to be fragile over time.

For example, what happens if some later pass splits one (or more) of the original pseudos, then scheduling interleaves their lifetimes so that they can't share a stack slot. Boom, the estimations aren't going to work well and the user is going to have to start using PARAMs to avoid ICEs/asserts. Not good.
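
The failure mode described above can be sketched with a toy model.  This is
an illustration only, not GCC code: the live_range struct, the function
names, and the interval numbers are all invented for the example, and it
assumes every spill has the same size, so the number of slots needed is the
largest number of ranges live at the same point.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical live range of a spilled pseudo: [start, end) in
   instruction order. */
struct live_range { int start; int end; };

/* Slots needed = maximum number of ranges live at one point.  The
   maximum overlap always occurs at the start of some range, so it is
   enough to check those points. */
int slots_needed(const struct live_range *r, size_t n)
{
    int best = 0;
    for (size_t i = 0; i < n; i++) {
        int live = 0;
        for (size_t j = 0; j < n; j++)
            if (r[j].start <= r[i].start && r[i].start < r[j].end)
                live++;
        if (live > best)
            best = live;
    }
    return best;
}

/* What an early estimate might see: two pseudos with disjoint
   lifetimes, which can share one slot. */
int slots_at_estimate_time(void)
{
    const struct live_range before[] = { {0, 4}, {4, 8} };
    return slots_needed(before, 2);
}

/* After a later pass splits one pseudo and scheduling interleaves the
   lifetimes, all three ranges overlap and none can share. */
int slots_after_split_and_schedule(void)
{
    const struct live_range after[] = { {0, 6}, {2, 5}, {4, 8} };
    return slots_needed(after, 3);
}
```

In this made-up case an area sized for one slot at estimate time would need
three slots by the time spilling actually happens.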

The dynamic realignment is somewhat painful, but it attacks the problem in the right place. There may be bits of it that we should be generalizing to make it easier to implement (since I don't think this is a long term problem that will be isolated to x86 and MIPS).


jeff

References:
- RFC: Patch to align spills beyond what the stack supports
  - From: Steve Ellcey
