This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

[PATCH]: Stack alignment - take 2


> Right.  This is because you need to keep the stack aligned, so you can
> never have an empty frame.  Which brings us back to a variant of the
> previous poster's question: is the stack pointer supposed to be
> aligned, or not?

Here is a second attempt to eliminate wasted space in the PA frame
due to unnecessary rounding and rounding to an alignment larger than
necessary.  I have introduced a macro STACK_ALIGNMENT_NEEDED which
allows a port to handle stack alignment itself.  The rounding of the
PA frame size is now done entirely in the backend in compute_frame_size.

This saves space in a number of ways.  First, it allows us to define
STARTING_FRAME_OFFSET to 8 in the 64-byte port and save 8 bytes in
every frame.  Secondly, the overall alignment in assign_stack_local_1
was being done to STACK_BOUNDARY.  This is set to a double word
boundary on the PA.  However, the register save block that follows
the locals in the PA frame only needs to be aligned to word boundary,
so we save a word about half the time by aligning to a word boundary.

I have also improved compute_frame_size.  We no longer allocate a
slot for %r3 when it is used as the frame pointer.  Finally, we don't
need to do a double word alignment before the floating point register
save block if we don't save any floating point registers.  The
alignment of the outgoing argument block is determined by the final
rounding to PREFERRED_STACK_BOUNDARY done at the end of compute_frame_size.

The preferred stack boundary on the 32-bit ports is 64 bytes.  If
unnecessary rounding just happens bump the frame size over the 64
byte boundary, then a lot of stack space could be wasted.

The patch has been tested on hppa2.0-hp-hpux11.11, hppa64-hp-hpux11.11,
and hppa-unknown-linux-gnu with no regressions.

In response to the question, the PA requires strict alignment of all
instructions and data.  Any misalignment of the stack pointer would
cause immediate problems in a non-trivial program.  So, the stack
pointer must be aligned.  However, we don't want assign_stack_local_1
to do it for us as it can't provide the alignment we need.

Ok for main?

Dave
-- 
J. David Anglin                                  dave dot anglin at nrc-cnrc dot gc dot ca
National Research Council of Canada              (613) 990-0752 (FAX: 952-6605)

2003-03-01  John David Anglin  <dave dot anglin at nrc-cnrc dot gc dot ca>

	* function.c (STACK_ALIGNMENT_NEEDED): New macro.  Default to 1.
	(assign_stack_local_1): Perform overall stack alignment only when
	STACK_ALIGNMENT_NEEDED is non-zero.
	* doc/tm.texi (STACK_ALIGNMENT_NEEDED): Document.

	* pa.c (compute_frame_size): Rename fsize to size.  Account for
	alignment to a word boundary before general register save block.  Only
	account for double-word alignment before floating point register save
	block if one or more are saved.  Don't allocate space for %r3 when
	frame pointer is needed.
	(hppa_expand_prologue): Include alignment to word boundary in local
	frame size.
	* pa.h (STARTING_FRAME_OFFSET): Define to 8 on both 32 and 64-bit ports.
	(STACK_ALIGNMENT_NEEDED): Define.

Index: function.c
===================================================================
RCS file: /cvsroot/gcc/gcc/gcc/function.c,v
retrieving revision 1.404
diff -u -3 -p -r1.404 function.c
--- function.c	28 Feb 2003 07:06:33 -0000	1.404
+++ function.c	28 Feb 2003 19:21:47 -0000
@@ -70,6 +70,10 @@ Software Foundation, 59 Temple Place - S
 #define LOCAL_ALIGNMENT(TYPE, ALIGNMENT) ALIGNMENT
 #endif
 
+#ifndef STACK_ALIGNMENT_NEEDED
+#define STACK_ALIGNMENT_NEEDED 1
+#endif
+
 /* Some systems use __main in a way incompatible with its use in gcc, in these
    cases use the macros NAME__MAIN to give a quoted symbol and SYMBOL__MAIN to
    give the same symbol without quotes for an alternative entry point.  You
@@ -566,16 +570,27 @@ assign_stack_local_1 (mode, size, align,
   frame_off = STARTING_FRAME_OFFSET % frame_alignment;
   frame_phase = frame_off ? frame_alignment - frame_off : 0;
 
-  /* Round frame offset to that alignment.
-     We must be careful here, since FRAME_OFFSET might be negative and
-     division with a negative dividend isn't as well defined as we might
-     like.  So we instead assume that ALIGNMENT is a power of two and
-     use logical operations which are unambiguous.  */
+  /* Round the frame offset to the specified alignment.  The default is
+     to always honor requests to align the stack but a port may choose to
+     do its own stack alignment by defining STACK_ALIGNMENT_NEEDED.  */
+  if (STACK_ALIGNMENT_NEEDED
+      || mode != BLKmode
+      || size != 0)
+    {
+      /*  We must be careful here, since FRAME_OFFSET might be negative and
+	  division with a negative dividend isn't as well defined as we might
+	  like.  So we instead assume that ALIGNMENT is a power of two and
+	  use logical operations which are unambiguous.  */
 #ifdef FRAME_GROWS_DOWNWARD
-  function->x_frame_offset = FLOOR_ROUND (function->x_frame_offset - frame_phase, alignment) + frame_phase;
+      function->x_frame_offset
+	= (FLOOR_ROUND (function->x_frame_offset - frame_phase, alignment)
+	   + frame_phase);
 #else
-  function->x_frame_offset = CEIL_ROUND (function->x_frame_offset - frame_phase, alignment) + frame_phase;
+      function->x_frame_offset
+	= (CEIL_ROUND (function->x_frame_offset - frame_phase, alignment)
+	   + frame_phase);
 #endif
+    }
 
   /* On a big-endian machine, if we are allocating more space than we will use,
      use the least significant bytes of those that are allocated.  */
Index: doc/tm.texi
===================================================================
RCS file: /cvsroot/gcc/gcc/gcc/doc/tm.texi,v
retrieving revision 1.204
diff -u -3 -p -r1.204 tm.texi
--- doc/tm.texi	25 Feb 2003 17:06:40 -0000	1.204
+++ doc/tm.texi	28 Feb 2003 19:21:48 -0000
@@ -2816,6 +2816,16 @@ value @code{STARTING_FRAME_OFFSET}.
 @c i'm not sure if the above is still correct.. had to change it to get
 @c rid of an overfull.  --mew 2feb93
 
+ at findex STACK_ALIGNMENT_NEEDED
+ at item STACK_ALIGNMENT_NEEDED
+Define to zero to disable final alignment of the stack during reload.
+The non-zero default for this macro is suitable for most ports.
+
+On ports where @code{STARTING_FRAME_OFFSET} is non-zero or where there
+is a register save block following the local block that doesn't require
+alignment to @code{STACK_BOUNDARY}, it may be beneficial to disable
+stack alignment and do it in the backend.
+
 @findex STACK_POINTER_OFFSET
 @item STACK_POINTER_OFFSET
 Offset from the stack pointer register to the first location at which
Index: config/pa/pa.c
===================================================================
RCS file: /cvsroot/gcc/gcc/gcc/config/pa/pa.c,v
retrieving revision 1.199
diff -u -3 -p -r1.199 pa.c
--- config/pa/pa.c	26 Feb 2003 16:29:22 -0000	1.199
+++ config/pa/pa.c	28 Feb 2003 19:21:48 -0000
@@ -3195,15 +3195,24 @@ compute_frame_size (size, fregs_live)
      int size;
      int *fregs_live;
 {
-  int i, fsize;
-
-  /* Space for frame pointer + filler. If any frame is allocated
-     we need to add this in because of STARTING_FRAME_OFFSET.
-
-     Similar code also appears in hppa_expand_prologue.  Change both
-     of them at the same time.  */
-  fsize = size + (size || frame_pointer_needed ? STARTING_FRAME_OFFSET : 0);
+  int freg_saved = 0;
+  int i, j;
 
+  /* The code in hppa_expand_prologue and hppa_expand_epilogue must
+     be consistent with the rounding and size calculation done here.
+     Change them at the same time.  */
+
+  /* We do our own stack alignment.  First, round the size of the
+     stack locals up to a word boundary.  */
+  size = (size + UNITS_PER_WORD - 1) & ~(UNITS_PER_WORD - 1);
+
+  /* Space for previous frame pointer + filler.  If any frame is
+     allocated, we need to add in the STARTING_FRAME_OFFSET.  We
+     waste some space here for the sake of HP compatibility.  The
+     first slot is only used when the frame pointer is needed.  */
+  if (size || frame_pointer_needed)
+    size += STARTING_FRAME_OFFSET;
+  
   /* If the current function calls __builtin_eh_return, then we need
      to allocate stack space for registers that will hold data for
      the exception handler.  */
@@ -3213,41 +3222,49 @@ compute_frame_size (size, fregs_live)
 
       for (i = 0; EH_RETURN_DATA_REGNO (i) != INVALID_REGNUM; ++i)
 	continue;
-      fsize += i * UNITS_PER_WORD;
+      size += i * UNITS_PER_WORD;
     }
 
   /* Account for space used by the callee general register saves.  */
-  for (i = 18; i >= 3; i--)
+  for (i = 18, j = frame_pointer_needed ? 4 : 3; i >= j; i--)
     if (regs_ever_live[i])
-      fsize += UNITS_PER_WORD;
-
-  /* Round the stack.  */
-  fsize = (fsize + 7) & ~7;
+      size += UNITS_PER_WORD;
 
   /* Account for space used by the callee floating point register saves.  */
   for (i = FP_SAVED_REG_LAST; i >= FP_SAVED_REG_FIRST; i -= FP_REG_STEP)
     if (regs_ever_live[i]
-	|| (! TARGET_64BIT && regs_ever_live[i + 1]))
+	|| (!TARGET_64BIT && regs_ever_live[i + 1]))
       {
-	if (fregs_live)
-	  *fregs_live = 1;
+	freg_saved = 1;
 
 	/* We always save both halves of the FP register, so always
 	   increment the frame size by 8 bytes.  */
-	fsize += 8;
+	size += 8;
       }
 
+  /* If any of the floating registers are saved, account for the
+     alignment needed for the floating point register save block.  */
+  if (freg_saved)
+    {
+      size = (size + 7) & ~7;
+      if (fregs_live)
+	*fregs_live = 1;
+    }
+
   /* The various ABIs include space for the outgoing parameters in the
-     size of the current function's stack frame.  */
-  fsize += current_function_outgoing_args_size;
+     size of the current function's stack frame.  We don't need to align
+     for the outgoing arguments as their alignment is set by the final
+     rounding for the frame as a whole.  */
+  size += current_function_outgoing_args_size;
 
   /* Allocate space for the fixed frame marker.  This space must be
      allocated for any function that makes calls or otherwise allocates
      stack space.  */
-  if (!current_function_is_leaf || fsize)
-    fsize += TARGET_64BIT ? 16 : 32;
+  if (!current_function_is_leaf || size)
+    size += TARGET_64BIT ? 16 : 32;
 
-  return ((fsize + PREFERRED_STACK_BOUNDARY / 8 - 1)
+  /* Finally, round to the preferred stack boundary.  */
+  return ((size + PREFERRED_STACK_BOUNDARY / 8 - 1)
 	  & ~(PREFERRED_STACK_BOUNDARY / 8 - 1));
 }
 
@@ -3313,8 +3330,8 @@ pa_output_function_prologue (file, size)
 void
 hppa_expand_prologue ()
 {
-  int size = get_frame_size ();
   int merge_sp_adjust_with_store = 0;
+  int size = get_frame_size ();
   int i, offset;
   rtx insn, tmpreg;
 
@@ -3322,13 +3339,12 @@ hppa_expand_prologue ()
   fr_saved = 0;
   save_fregs = 0;
 
-  /* Allocate space for frame pointer + filler. If any frame is allocated
-     we need to add this in because of STARTING_FRAME_OFFSET.
-
-     Similar code also appears in compute_frame_size.  Change both
-     of them at the same time.  */
-  local_fsize = size + (size || frame_pointer_needed
-			? STARTING_FRAME_OFFSET : 0);
+  /* Compute total size for frame pointer, filler, locals and rounding to
+     the next word boundary.  Similar code appears in compute_frame_size
+     and must be changed in tandem with this code.  */
+  local_fsize = (size + UNITS_PER_WORD - 1) & ~(UNITS_PER_WORD - 1);
+  if (local_fsize || frame_pointer_needed)
+    local_fsize += STARTING_FRAME_OFFSET;
 
   actual_fsize = compute_frame_size (size, &save_fregs);
 
Index: config/pa/pa.h
===================================================================
RCS file: /cvsroot/gcc/gcc/gcc/config/pa/pa.h,v
retrieving revision 1.184
diff -u -3 -p -r1.184 pa.h
--- config/pa/pa.h	26 Feb 2003 16:29:22 -0000	1.184
+++ config/pa/pa.h	28 Feb 2003 19:21:48 -0000
@@ -676,10 +676,17 @@ extern struct rtx_def *hppa_pic_save_rtx
 /* Offset within stack frame to start allocating local variables at.
    If FRAME_GROWS_DOWNWARD, this is the offset to the END of the
    first local allocated.  Otherwise, it is the offset to the BEGINNING
-   of the first local allocated.  The start of the locals must lie on
-   a STACK_BOUNDARY or else the frame size of leaf functions will not
-   be zero.  */
-#define STARTING_FRAME_OFFSET (TARGET_64BIT ? 16 : 8)
+   of the first local allocated.
+
+   On the 32-bit ports, we reserve one slot for the previous frame
+   pointer and one fill slot.  The fill slot is for compatibility
+   with HP compiled programs.  On the 64-bit ports, we reserve one
+   slot for the previous frame pointer.  */
+#define STARTING_FRAME_OFFSET 8
+
+/* Define STACK_ALIGNMENT_NEEDED to zero to disable final alignment
+   of the stack.  The default is to align it to STACK_BOUNDARY.  */
+#define STACK_ALIGNMENT_NEEDED 0
 
 /* If we generate an insn to push BYTES bytes,
    this says how many the stack pointer really advances by.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]