This is the mail archive of the
mailing list for the GCC project.
Re: [PATCH 2/2][v3] Drop excess size used for run time allocated stack variables.
- From: Jeff Law <law at redhat dot com>
- To: vogt at linux dot vnet dot ibm dot com, gcc-patches at gcc dot gnu dot org, Andreas Krebbel <krebbel at linux dot vnet dot ibm dot com>
- Date: Thu, 21 Jul 2016 14:07:05 -0600
- Subject: Re: [PATCH 2/2][v3] Drop excess size used for run time allocated stack variables.
- Authentication-results: sourceware.org; auth=none
- References: <20160429221242.GA2205@linux.vnet.ibm.com> <firstname.lastname@example.org> <20160503141753.GA17351@linux.vnet.ibm.com> <20160525133054.GA6938@linux.vnet.ibm.com> <20160525133241.GB6938@linux.vnet.ibm.com> <email@example.com> <20160623095741.GA9623@linux.vnet.ibm.com>
On 06/23/2016 03:57 AM, Dominik Vogt wrote:
* explow.c (round_push): Use know adjustment.
(allocate_dynamic_stack_space): Pass known adjustment to round_push.
* gcc.dg/pr50938.c: New test.
>From 4296d353e1d153b5b5ee435a44cae6117bf2fff0 Mon Sep 17 00:00:00 2001
From: Dominik Vogt <firstname.lastname@example.org>
Date: Fri, 29 Apr 2016 08:36:59 +0100
Subject: [PATCH 2/2] Drop excess size used for run time allocated stack
The present calculation sometimes led to more stack memory being used than
necessary with alloca. First, (STACK_BOUNDARY -1) would be added to the
size = plus_constant (Pmode, size, extra);
size = force_operand (size, NULL_RTX);
Then round_push was called and added another (STACK_BOUNDARY - 1) before
rounding down to a multiple of STACK_BOUNDARY. On s390x this resulted in
adding 14 before rounding down for "x" in the test case pr36728-1.c.
round_push() now takes an argument to inform it about what has already been
added to size.
gcc/explow.c | 45 +++++++++++++++++++++---------------
gcc/testsuite/gcc.dg/pr50938.c | 52 ++++++++++++++++++++++++++++++++++++++++++
2 files changed, 79 insertions(+), 18 deletions(-)
create mode 100644 gcc/testsuite/gcc.dg/pr50938.c
diff --git a/gcc/explow.c b/gcc/explow.c
index 09a0330..85596e2 100644
@@ -949,24 +949,30 @@ anti_adjust_stack (rtx adjust)
/* Round the size of a block to be pushed up to the boundary required
- by this machine. SIZE is the desired size, which need not be constant. */
+ by this machine. SIZE is the desired size, which need not be constant.
+ ALREADY_ADDED is the number of units that have already been added to SIZE
+ for other alignment reasons.
-round_push (rtx size)
+round_push (rtx size, int already_added)
- rtx align_rtx, alignm1_rtx;
+ rtx align_rtx, add_rtx;
|| crtl->preferred_stack_boundary == MAX_SUPPORTED_STACK_ALIGNMENT)
int align = crtl->preferred_stack_boundary / BITS_PER_UNIT;
+ int add;
if (align == 1)
+ add = (align > already_added) ? align - already_added - 1 : 0;
if (CONST_INT_P (size))
- HOST_WIDE_INT new_size = (INTVAL (size) + align - 1) / align * align;
+ HOST_WIDE_INT new_size = (INTVAL (size) + add) / align * align;
if (INTVAL (size) != new_size)
size = GEN_INT (new_size);
So presumably the idea here is when the requested SIZE would require
allocating additional space to first see if the necessary space is
already available inside ALREADY_ADDED
and use that rather than rounding size up to an alignment boundary.
Not exactly. Consider the unpatched code. At the beginning we
have some amount of space to be allocated on the stack at runtime
("SSIZE"), some requested alignment for it ("SALIGN").
get_dynamic_stack_size() first calculates the space needed for run
SIZE = SSIZE + SALIGN - 1
Then it calls round_push() to add *another* chunk of memory to the
allocation size to be able to align it to the required stack slot
alignment ("SLOTALIGN") at run time.
SIZE = SIZE + SLOTALIGN - 1
= SSIZE + (SALIGN - 1) + (SLOTALIGN - 1)
Now it has added two chunks of memory but alignment is only done
once. With the patch it just adds the maximum of (SALIGN - 1) and
(SLOTALIGN - 1), not both. Thinking about it, the "round_push"
stuff is a very complicated way of saying "add max(A, B)".
Now I see it. Thanks, that helped a ton.
I'd be all for such cleanups after we wrap up the pending patches. It's
certainly a rats nest of code right now.
I'd volunteer to clean this up more, but preferrably when the two
pending patches are in. The current code is a real brain-twister.
This patch is fine for the trunk. Thanks for your patience.