This is the mail archive of the
mailing list for the GCC project.
Re: [PATCH 2/2][v3] Drop excess size used for run time allocated stack variables.
- From: Dominik Vogt <vogt at linux dot vnet dot ibm dot com>
- To: Jeff Law <law at redhat dot com>
- Cc: gcc-patches at gcc dot gnu dot org, Andreas Krebbel <krebbel at linux dot vnet dot ibm dot com>
- Date: Fri, 22 Jul 2016 10:55:22 +0100
- Subject: Re: [PATCH 2/2][v3] Drop excess size used for run time allocated stack variables.
- Authentication-results: sourceware.org; auth=none
- References: <20160429221242.GA2205@linux.vnet.ibm.com> <email@example.com> <20160503141753.GA17351@linux.vnet.ibm.com> <20160525133054.GA6938@linux.vnet.ibm.com> <20160525133241.GB6938@linux.vnet.ibm.com> <firstname.lastname@example.org> <20160623095741.GA9623@linux.vnet.ibm.com> <email@example.com>
- Reply-to: vogt at linux dot vnet dot ibm dot com
On Thu, Jul 21, 2016 at 02:07:05PM -0600, Jeff Law wrote:
> On 06/23/2016 03:57 AM, Dominik Vogt wrote:
> >>and use that rather than rounding size up to an alignment boundary.
> >Not exactly. Consider the unpatched code. At the beginning we
> >have some amount of space to be allocated on the stack at runtime
> >("SSIZE"), some requested alignment for it ("SALIGN").
> >get_dynamic_stack_size() first calculates the space needed for run
> >time alignment:
> > SIZE = SSIZE + SALIGN - 1
> >Then it calls round_push() to add *another* chunk of memory to the
> >allocation size to be able to align it to the required stack slot
> >alignment ("SLOTALIGN") at run time.
> > SIZE = SIZE + SLOTALIGN - 1
> > = SSIZE + (SALIGN - 1) + (SLOTALIGN - 1)
> >Now it has added two chunks of memory but alignment is only done
> >once. With the patch it just adds the maximum of (SALIGN - 1) and
> >(SLOTALIGN - 1), not both. Thinking about it, the "round_push"
> >stuff is a very complicated way of saying "add max(A, B)".
> Now I see it. Thanks, that helped a ton.
> >I'd volunteer to clean this up more, but preferrably when the two
> >pending patches are in. The current code is a real brain-twister.
> I'd be all for such cleanups after we wrap up the pending patches.
> It's certainly a rats nest of code right now.
> This patch is fine for the trunk. Thanks for your patience.
Actually I was goind to abandon the patch in its current state.
:-) We talked about it internally and concluded that the problem
is really this:
* get_dynamic_stack_size is passed a SIZE of a data block (which
is allocated elsewhere), the SIZE_ALIGN of the SIZE (i.e. the
alignment of the underlying memory units (e.g. 32 bytes split
into 4 times 8 bytes = 64 bit alignment) and the
REQUIRED_ALIGN of the data portion of the allocated memory.
* Assuming the function is called with SIZE = 2, SIZE_ALIGN = 8
and REQUIRED_ALIGN = 64 it first adds 7 bytes to SIZE -> 9.
This is what is needed to have two bytes 8-byte-aligned at some
memory location without any known alignment.
* Finally round_push is called to round up SIZE to a multiple of
the stack slot size.
The key to understanding this is that the function assumes that
STACK_DYNMAIC_OFFSET is completely unknown at the time its called
and therefore it does not make assumptions about the alignment of
STACKPOINTER + STACK_DYNMAIC_OFFSET. The latest patch simply
hard-codes that SP + SDO is supposed to be aligned to at least
stack slot size (and does that in a very complicated way). Since
there is no guarantee that this is the case on all targets, the
patch is broken. It may miscalculate a SIZE that is too small in
However, on many targets there is some guarantee about the
alignment of SP + SDO even if the actual value of SDO is unknown.
On s390x it's always 8-byte-aligned (stack slot size). So the
right fix should be to add knowledge about the target's guaranteed
alignment of SP + SDO to the function. I'm right now testing a
much simpler patch that uses
REGNO_POINTER_ALIGN(VIRTUAL_STACK_DYNAMIC_REGNUM) as the
Dominik ^_^ ^_^