This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH 2/2][v3] Drop excess size used for run time allocated stack variables.


On 05/25/2016 07:32 AM, Dominik Vogt wrote:
On Wed, May 25, 2016 at 02:30:54PM +0100, Dominik Vogt wrote:
> On Tue, May 03, 2016 at 03:17:53PM +0100, Dominik Vogt wrote:
> > Version two of the patch including a test case.
> >
> > On Mon, May 02, 2016 at 09:10:25AM -0600, Jeff Law wrote:
> > > On 04/29/2016 04:12 PM, Dominik Vogt wrote:
> > > >The attached patch removes excess stack space allocation with
> > > >alloca in some situations.  Plese check the commit message in the
> > > >patch for details.
> >
> > > However, I would strongly recommend some tests, even if they are
> > > target specific.  You can always copy pr36728-1 into the s390x
> > > directory and look at size of the generated stack.  Simliarly for
> > > pr50938 for x86.
> >
> > However, x86 uses the "else" branch in round_push, i.e. it uses
> > "virtual_preferred_stack_boundary_rtx" to calculate the number of
> > bytes to add for stack alignment.  That value is unknown at the
> > time round_push is called, so the test case fails on such targets,
> > and I've no idea how to fix this properly.
>
> Third version of the patch with the suggested cleanup in the first
> patch and the functional stuff in the second one.  The first patch
> is based on Jeff's draft with the change suggested by Eric and
> more cleanup added by me.
This is the updated funtional patch.  Re-tested with limited
effort, i.e. tested and bootstrapped on s390x biarch (but did not
look for performance regressions compared to version 2 of the
patch).

Ciao

Dominik ^_^  ^_^

-- Dominik Vogt IBM Germany


0002-v3-ChangeLog


gcc/ChangeLog

	* explow.c (round_push): Use know adjustment.
	(allocate_dynamic_stack_space): Pass known adjustment to round_push.
gcc/testsuite/ChangeLog

	* gcc.dg/pr50938.c: New test.


0002-v3-Drop-excess-size-used-for-run-time-allocated-stack-v.patch


From 4296d353e1d153b5b5ee435a44cae6117bf2fff0 Mon Sep 17 00:00:00 2001
From: Dominik Vogt <vogt@linux.vnet.ibm.com>
Date: Fri, 29 Apr 2016 08:36:59 +0100
Subject: [PATCH 2/2] Drop excess size used for run time allocated stack
 variables.

The present calculation sometimes led to more stack memory being used than
necessary with alloca.  First, (STACK_BOUNDARY -1) would be added to the
allocated size:

  size = plus_constant (Pmode, size, extra);
  size = force_operand (size, NULL_RTX);

Then round_push was called and added another (STACK_BOUNDARY - 1) before
rounding down to a multiple of STACK_BOUNDARY.  On s390x this resulted in
adding 14 before rounding down for "x" in the test case pr36728-1.c.

round_push() now takes an argument to inform it about what has already been
added to size.
---
 gcc/explow.c                   | 45 +++++++++++++++++++++---------------
 gcc/testsuite/gcc.dg/pr50938.c | 52 ++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 79 insertions(+), 18 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/pr50938.c

diff --git a/gcc/explow.c b/gcc/explow.c
index 09a0330..85596e2 100644
--- a/gcc/explow.c
+++ b/gcc/explow.c
@@ -949,24 +949,30 @@ anti_adjust_stack (rtx adjust)
 }

 /* Round the size of a block to be pushed up to the boundary required
-   by this machine.  SIZE is the desired size, which need not be constant.  */
+   by this machine.  SIZE is the desired size, which need not be constant.
+   ALREADY_ADDED is the number of units that have already been added to SIZE
+   for other alignment reasons.
+*/

 static rtx
-round_push (rtx size)
+round_push (rtx size, int already_added)
 {
-  rtx align_rtx, alignm1_rtx;
+  rtx align_rtx, add_rtx;

   if (!SUPPORTS_STACK_ALIGNMENT
       || crtl->preferred_stack_boundary == MAX_SUPPORTED_STACK_ALIGNMENT)
     {
       int align = crtl->preferred_stack_boundary / BITS_PER_UNIT;
+      int add;

       if (align == 1)
 	return size;

+      add = (align > already_added) ? align - already_added - 1 : 0;
+
       if (CONST_INT_P (size))
 	{
-	  HOST_WIDE_INT new_size = (INTVAL (size) + align - 1) / align * align;
+	  HOST_WIDE_INT new_size = (INTVAL (size) + add) / align * align;

 	  if (INTVAL (size) != new_size)
 	    size = GEN_INT (new_size);
So presumably the idea here is when the requested SIZE would require allocating additional space to first see if the necessary space is already available inside ALREADY_ADDED and use that rather than rounding size up to an alignment boundary.

I can see how that works in the sense of avoiding allocating extra space. What I'm struggling with is how do we know the space actually allocated is going to have the right alignment?

What am I missing here?

jeff




Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]