This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

RFC: Patch to allow spill slot alignment greater than the stack alignment


I have spent some time trying to do dynamic stack alignment on MIPS and had
considerable trouble.  The problems are mainly due to the dwarf based stack
unwinding and setjmp/longjmp usages where the code does not go through the
normal function prologue and epilogue code.

Since the only need I have for a dynamically aligned stack is to ensure
that MSA registers (128 bits long, wanting 128 bit alignment) get spilled
to an aligned address on an ABI where the stack is only guaranteed to be
64 bit aligned, I decided to try an approach based on how local variables
are aligned beyond the supported stack alignment.  I allocate the space with
padding and then create an aligned pointer into that space.  Since I want
to do this in lra_spill and I can't allocate a new register to use at
that point what I do is create an 'aligned spill base register' based
on the frame_register in expand_prologue, and then in assign_mem_slot,
if I want an alignment greater than what the stack supports, I allocate
extra space and change the MEM reference for the spill from 'frame_register
+ offset' to 'aligned_spill_base_register + offset'.

The main advantage to this approach over dynamically aligning the stack
is that by not changing the real stack (or frame) pointer there is
minimal chance of breaking the ABI and there are no changes needed to
the dwarf unwind code.  The main disadvantage is that I am padding each
individual spill so I am wasting more space than absolutely required.
It should be possible to address this by putting all the aligned spills
together and sharing the padding but I would like to leave that for a
future improvement.  

In the mean time I would like to get some comments on this approach and
see what people think.  Does this seem like a reasonable approach to
allowing for aligned spills beyond what the stack supports?

I have tested this on MIPS, forcing 64 bit registers to be aligned,
even though the stack supports this natively as a way to test the code
before MSA is checked in and it seems to be working with no regressions.
I do get scan failures in the gcc.target/mips section because I am forcing
GCC to setup $16 as the 'aligned spill base register' in all routines
as a stress test but I do not get any new execution failures.

Steve Ellcey
sellcey@imgtec.com


2015-10-02  Steve Ellcey  <sellcey@imgtec.com>

	* config/mips/mips.c (mips_attribute_table): Add align_spills
	and no_align_spills attribute entries.
	(mips_cfun_has_msa_p): New function.
	(setup_align_basereg_p): Ditto.
	(mips_align_spill_p): Ditto..
	(mips_align_spill_base_regnum): Ditto..
	(mips_align_spill_maximum): Ditto..
	(mips_expand_prologue): Setup aligned spill register when needed.
	(mips_conditional_register_usage): Reserve aligned spill register when
	needed.
	(TARGET_ALIGN_SPILL_P): Set to mips_align_spill_p.
	(TARGET_ALIGN_SPILL_BASE_REGNUM): Set to mips_align_spill_base_regnum.
	(TARGET_ALIGN_SPILL_MAXIMUM): Set to mips_align_spill_maximum.
	* config/mips/mips.opt (malign-spills): New option.
	* df-scan.c (df_get_regular_block_artificial_uses): Check for usage
	of aligned spill register.
	(df_get_eh_block_artificial_uses): Ditto.
	* hooks.c (hook_uint_void_invalid_regnum): New function.
	* hooks.h (hook_uint_void_invalid_regnum): Ditto.
	* lra-spills.c (align_slot_address): New function.
	(assign_mem_slot): Align slot address if necessary.
	* target.def (align_spill_p): New hook.
	(align_spill_base_regnum): New hook.
	(align_spill_maximum): New hook.
	* tm.texi.in (TARGET_ALIGN_SPILL_P): New hook.
	(TARGET_ALIGN_SPILL_BASE_REGNUM): Ditto.
	(TARGET_ALIGN_SPILL_MAXIMUM): Ditto.
	* tm.texi: Regenerate.


diff --git a/gcc/config/mips/mips.c b/gcc/config/mips/mips.c
index 0e0ecf2..84c3724 100644
--- a/gcc/config/mips/mips.c
+++ b/gcc/config/mips/mips.c
@@ -773,6 +773,8 @@ static const struct attribute_spec mips_attribute_table[] = {
     mips_handle_use_shadow_register_set_attr, false },
   { "keep_interrupts_masked",	0, 0, false, true,  true, NULL, false },
   { "use_debug_exception_return", 0, 0, false, true,  true, NULL, false },
+  { "align_spills", 0, 0, true, false, false, NULL, false },
+  { "no_align_spills", 0, 0, true, false, false, NULL, false },
   { NULL,	   0, 0, false, false, false, NULL, false }
 };
 
@@ -1607,6 +1609,49 @@ mips_merge_decl_attributes (tree olddecl, tree newdecl)
 			   DECL_ATTRIBUTES (newdecl));
 }
 
+/* Determine if a function may use MSA registers and thus need
+   aligned spills.  */
+
+static bool
+mips_cfun_has_msa_p (void)
+{
+  /* For now, for testing, assume all functions use MSA
+     (and thus may need aligned spills ).  */
+#if 0
+  if (!cfun || !TARGET_MSA)
+    return FALSE;
+
+  for (insn = get_insns (); insn; insn = NEXT_INSN (insn))
+    {
+      if (MSA_SUPPORTED_MODE_P (GET_MODE (insn)))
+        return TRUE;
+    }
+  return FALSE;
+#else
+  return TRUE;
+#endif
+}
+
+/* Determine whether or not we want to use aligned spills in a function.  */
+
+static bool
+setup_align_basereg_p (void)
+{
+  bool want_alignment = TARGET_ALIGN_SPILLS && mips_cfun_has_msa_p ();
+  if (current_function_decl)
+    {
+      tree attr = DECL_ATTRIBUTES (current_function_decl);
+      if (lookup_attribute ("align_spills", attr))
+	want_alignment = mips_cfun_has_msa_p ();
+      else if (lookup_attribute ("no_align_spills", attr))
+	want_alignment = false;
+
+      if (mips_get_compress_mode (current_function_decl) & MASK_MIPS16)
+	want_alignment = false;
+    }
+  return want_alignment;
+}
+
 /* Implement TARGET_CAN_INLINE_P.  */
 
 static bool
@@ -10371,6 +10416,32 @@ mips_cfun_might_clobber_call_saved_reg_p (unsigned int regno)
   return false;
 }
 
+/* Implement TARGET_ALIGN_SPILL_P.  */
+
+static bool
+mips_align_spill_p (machine_mode mode)
+{
+  if (setup_align_basereg_p () && (GET_MODE_SIZE (mode) > 4))
+    return true;
+  return false;
+}
+
+/* Implement TARGET_ALIGN_SPILL_BASE_REGNUM.  */
+
+static unsigned int
+mips_align_spill_base_regnum (void)
+{
+  return setup_align_basereg_p () ? (GP_REG_FIRST + 16) : INVALID_REGNUM;
+}
+
+/* Implement TARGET_ALIGN_SPILL_MAXIMUM.  */
+
+static unsigned int
+mips_align_spill_maximum (void)
+{
+  return 128;
+}
+
 /* Return true if the current function must save register REGNO.  */
 
 static bool
@@ -10395,6 +10466,9 @@ mips_save_reg_p (unsigned int regno)
   if (regno == RETURN_ADDR_REGNUM && crtl->calls_eh_return)
     return true;
 
+  if (regno == targetm.align_spill_base_regnum ())
+    return true;
+
   return false;
 }
 
@@ -11819,6 +11893,32 @@ mips_expand_prologue (void)
 	}
     }
 
+  /* Setup aligned spill register based on the frame register if we may
+     need to spill registers with an alignment greater than the stack
+     alignment.  */
+
+  if (setup_align_basereg_p ())
+    {
+      rtx align_spill_rtx = gen_rtx_REG (Pmode,
+					 targetm.align_spill_base_regnum ());
+      rtx (*and_insn) (rtx, rtx, rtx);
+      rtx insn;
+      unsigned int alignment;
+
+      emit_insn (gen_blockage ());
+      and_insn = (Pmode == SImode) ? gen_andsi3 : gen_anddi3;
+      alignment = targetm.align_spill_maximum ();
+      /* Verify that alignment is a power of 2.  */
+      gcc_assert (alignment == (alignment & -alignment));
+      mips_emit_move (MIPS_PROLOGUE_TEMP (Pmode),
+		      GEN_INT (-alignment));
+      insn = and_insn (align_spill_rtx,
+        frame_pointer_needed ? hard_frame_pointer_rtx : stack_pointer_rtx,
+        MIPS_PROLOGUE_TEMP (Pmode));
+      RTX_FRAME_RELATED_P (emit_insn (insn)) = 1;
+      emit_insn (gen_blockage ());
+    }
+
   mips_emit_loadgp ();
 
   /* Initialize the $gp save slot.  */
@@ -18372,6 +18472,14 @@ mips_conditional_register_usage (void)
       AND_COMPL_HARD_REG_SET (operand_reg_set,
 			      reg_class_contents[(int) MD_REGS]);
     }
+
+  if (setup_align_basereg_p ())
+    {
+      fixed_regs[targetm.align_spill_base_regnum ()] = 1;
+      call_used_regs[targetm.align_spill_base_regnum ()] = 1;
+      call_really_used_regs[targetm.align_spill_base_regnum ()] = 1;
+    }
+
   /* $f20-$f23 are call-clobbered for n64.  */
   if (mips_abi == ABI_64)
     {
@@ -20169,6 +20277,15 @@ mips_ira_change_pseudo_allocno_class (int regno, reg_class_t allocno_class)
 #undef TARGET_HARD_REGNO_SCRATCH_OK
 #define TARGET_HARD_REGNO_SCRATCH_OK mips_hard_regno_scratch_ok
 
+#undef TARGET_ALIGN_SPILL_P
+#define TARGET_ALIGN_SPILL_P mips_align_spill_p
+
+#undef TARGET_ALIGN_SPILL_BASE_REGNUM
+#define TARGET_ALIGN_SPILL_BASE_REGNUM mips_align_spill_base_regnum
+
+#undef TARGET_ALIGN_SPILL_MAXIMUM
+#define TARGET_ALIGN_SPILL_MAXIMUM mips_align_spill_maximum
+
 struct gcc_target targetm = TARGET_INITIALIZER;
 
 #include "gt-mips.h"
diff --git a/gcc/config/mips/mips.opt b/gcc/config/mips/mips.opt
index 84887d1..b903720 100644
--- a/gcc/config/mips/mips.opt
+++ b/gcc/config/mips/mips.opt
@@ -54,6 +54,10 @@ mabicalls
 Target Report Mask(ABICALLS)
 Generate code that can be used in SVR4-style dynamic objects
 
+malign-spills
+Common Report Mask(ALIGN_SPILLS)
+Allow register spills with an alignment greater than the stack supports.
+
 mmad
 Target Report Var(TARGET_MAD)
 Use PMC-style 'mad' instructions
diff --git a/gcc/df-scan.c b/gcc/df-scan.c
index eea93df..5b1e228 100644
--- a/gcc/df-scan.c
+++ b/gcc/df-scan.c
@@ -3405,6 +3405,9 @@ df_get_regular_block_artificial_uses (bitmap regular_block_artificial_uses)
     {
       if (frame_pointer_needed)
 	bitmap_set_bit (regular_block_artificial_uses, HARD_FRAME_POINTER_REGNUM);
+      if (targetm.align_spill_base_regnum () != INVALID_REGNUM)
+	bitmap_set_bit (regular_block_artificial_uses,
+			targetm.align_spill_base_regnum ());
     }
   else
     /* Before reload, there are a few registers that must be forced
@@ -3475,6 +3478,10 @@ df_get_eh_block_artificial_uses (bitmap eh_block_artificial_uses)
       if (FRAME_POINTER_REGNUM != ARG_POINTER_REGNUM
 	  && fixed_regs[ARG_POINTER_REGNUM])
 	bitmap_set_bit (eh_block_artificial_uses, ARG_POINTER_REGNUM);
+      if (targetm.align_spill_base_regnum () != INVALID_REGNUM
+	  && fixed_regs[targetm.align_spill_base_regnum ()])
+	bitmap_set_bit (eh_block_artificial_uses,
+			targetm.align_spill_base_regnum ());
     }
 }
 
diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index eb495a8..3364e1a 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -11555,6 +11555,24 @@ If defined, this function returns an appropriate alignment in bits for an atomic
 ISO C11 requires atomic compound assignments that may raise floating-point exceptions to raise exceptions corresponding to the arithmetic operation whose result was successfully stored in a compare-and-exchange sequence.  This requires code equivalent to calls to @code{feholdexcept}, @code{feclearexcept} and @code{feupdateenv} to be generated at appropriate points in the compare-and-exchange sequence.  This hook should set @code{*@var{hold}} to an expression equivalent to the call to @code{feholdexcept}, @code{*@var{clear}} to an expression equivalent to the call to @code{feclearexcept} and @code{*@var{update}} to an expression equivalent to the call to @code{feupdateenv}.  The three expressions are @code{NULL_TREE} on entry to the hook and may be left as @code{NULL_TREE} if no code is required in a particular place.  The default implementation leaves all three expressions as @code{NULL_TREE}.  The @code{__atomic_feraiseexcept} function from @code{libatomic} may be of use as part of the code generated in @code{*@var{update}}.
 @end deftypefn
 
+@deftypefn {Target Hook} bool TARGET_ALIGN_SPILL_P (machine_mode @var{mode})
+This function returns true if a register of type @var{mode} should be
+aligned beyond what the stack frame supports.  It should only return true
+in functions where @code{align_spill_base_regnum} has been defined.
+@end deftypefn
+
+@deftypefn {Target Hook} {unsigned int} TARGET_ALIGN_SPILL_BASE_REGNUM (void)
+This function returns the register number of a register set up in the
+fuction prologue to be a secondary stack pointer that has an alignment of
+@var{align_spill_maximum}.  It will be used in place of the frame pointer
+when spilling registers for which @var{align_spill_p} is true.
+@end deftypefn
+
+@deftypefn {Target Hook} {unsigned int} TARGET_ALIGN_SPILL_MAXIMUM (void)
+This function returns the alignment given to registers for which
+@var{align_spill_p} is true.
+@end deftypefn
+
 @deftypefn {Target Hook} void TARGET_RECORD_OFFLOAD_SYMBOL (tree)
 Used when offloaded functions are seen in the compilation unit and no named
 sections are available.  It is called once for each symbol that must be
diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
index 92835c1..283b21e 100644
--- a/gcc/doc/tm.texi.in
+++ b/gcc/doc/tm.texi.in
@@ -8205,6 +8205,12 @@ and the associated definitions of those functions.
 
 @hook TARGET_ATOMIC_ASSIGN_EXPAND_FENV
 
+@hook TARGET_ALIGN_SPILL_P
+
+@hook TARGET_ALIGN_SPILL_BASE_REGNUM
+
+@hook TARGET_ALIGN_SPILL_MAXIMUM
+
 @hook TARGET_RECORD_OFFLOAD_SYMBOL
 
 @hook TARGET_OFFLOAD_OPTIONS
diff --git a/gcc/hooks.c b/gcc/hooks.c
index 0fb9add..309a4c1 100644
--- a/gcc/hooks.c
+++ b/gcc/hooks.c
@@ -27,6 +27,7 @@
 #include "coretypes.h"
 #include "tm.h"
 #include "hooks.h"
+#include "rtl.h"
 
 /* Generic hook that does absolutely zappo.  */
 void
@@ -481,3 +482,10 @@ void
 hook_void_gcc_optionsp (struct gcc_options *opts ATTRIBUTE_UNUSED)
 {
 }
+
+/* Generic hook that takes no arguments and returns INVALID_REGNUM.  */
+unsigned int
+hook_uint_void_invalid_regnum (void)
+{
+  return INVALID_REGNUM;
+}
diff --git a/gcc/hooks.h b/gcc/hooks.h
index c3d4bd3..6071612 100644
--- a/gcc/hooks.h
+++ b/gcc/hooks.h
@@ -110,4 +110,5 @@ extern const char *hook_constcharptr_const_rtx_insn_null (const rtx_insn *);
 extern const char *hook_constcharptr_const_tree_const_tree_null (const_tree, const_tree);
 extern const char *hook_constcharptr_int_const_tree_null (int, const_tree);
 extern const char *hook_constcharptr_int_const_tree_const_tree_null (int, const_tree, const_tree);
+extern unsigned int hook_uint_void_invalid_regnum (void);
 #endif
diff --git a/gcc/lra-spills.c b/gcc/lra-spills.c
index a210c41..925fa26 100644
--- a/gcc/lra-spills.c
+++ b/gcc/lra-spills.c
@@ -137,6 +137,51 @@ static struct slot *slots;
 /* The number of the stack slots currently existing.  */
 static int slots_num;
 
+/* Modify a padded spill slot reference to have the correct offset and
+   base register to ensure that it is aligned.  */
+
+static rtx
+align_slot_address (rtx orig_slot, machine_mode mode)
+{
+  HOST_WIDE_INT orig_offset, new_offset, unit_alignment;
+  rtx addr_expr, basereg_rtx, offset_rtx, new_addr, new_slot;
+  rtx aligned_basereg = gen_rtx_REG (Pmode, targetm.align_spill_base_regnum ());
+  lra_assert (MEM_P (orig_slot));
+  addr_expr = XEXP (orig_slot, 0);
+  if (GET_CODE (addr_expr) == PLUS)
+    {
+      basereg_rtx = XEXP (addr_expr, 0);
+      offset_rtx = XEXP (addr_expr, 1);
+      lra_assert (CONST_INT_P (offset_rtx));
+      orig_offset = INTVAL (offset_rtx);
+    }
+  else
+    {
+      lra_assert (REG_P (addr_expr));
+      basereg_rtx = addr_expr;
+      orig_offset = 0;
+    }
+  lra_assert (basereg_rtx == frame_pointer_rtx);
+
+  /* Increase offset to account for difference between spill basereg and
+     frame pointer. */
+  unit_alignment = targetm.align_spill_maximum () / BITS_PER_UNIT;
+  new_offset = orig_offset + unit_alignment
+		- (MAX_SUPPORTED_STACK_ALIGNMENT / BITS_PER_UNIT);
+
+  /* Increase offset if necessary to ensure that it maintains alignment.  */
+  if (new_offset % unit_alignment > 0)
+    new_offset = new_offset + unit_alignment - (new_offset % unit_alignment);
+
+  lra_assert ((new_offset % unit_alignment) == 0);
+
+  new_addr = plus_constant (Pmode, aligned_basereg,
+			    new_offset);
+  new_slot = gen_rtx_MEM (mode, new_addr);
+  MEM_NOTRAP_P (new_slot) = 1;
+  return new_slot;
+}
+
 /* Set up memory of the spilled pseudo I.  The function can allocate
    the corresponding stack slot if it is not done yet.	*/
 static void
@@ -169,11 +214,25 @@ assign_mem_slot (int i)
   else
     {
       rtx stack_slot;
+      if (targetm.align_spill_p (mode))
+	{
+	  lra_assert (targetm.align_spill_base_regnum () != INVALID_REGNUM);
+	  lra_assert (inherent_align <= targetm.align_spill_maximum ());
+	  lra_assert (min_align <= targetm.align_spill_maximum ());
+	  /* Pad space being allocated to account for the possible difference
+             between the frame_register and the base register used for aligned
+	     spills.  We may also need to increase the offset returned by
+	     assign_stack_local to maintain that alignment.  */
+	  unsigned int pad = 2 * (targetm.align_spill_maximum ()
+			      - MAX_SUPPORTED_STACK_ALIGNMENT) / BITS_PER_UNIT;
+	  x = assign_stack_local (BLKmode, total_size + pad, -1);
+	  x = align_slot_address (x, mode);
+	}
+      else
+	x = assign_stack_local (mode, total_size,
+				min_align > inherent_align
+				|| total_size > inherent_size ? -1 : 0);
 
-      /* No known place to spill from => no slot to reuse.  */
-      x = assign_stack_local (mode, total_size,
-			      min_align > inherent_align
-			      || total_size > inherent_size ? -1 : 0);
       stack_slot = x;
       /* Cancel the big-endian correction done in assign_stack_local.
 	 Get the address of the beginning of the slot.	This is so we
diff --git a/gcc/target.def b/gcc/target.def
index f330709..daf7e26 100644
--- a/gcc/target.def
+++ b/gcc/target.def
@@ -5728,6 +5728,30 @@ DEFHOOK
  void, (tree *hold, tree *clear, tree *update),
  default_atomic_assign_expand_fenv)
 
+DEFHOOK
+(align_spill_p,
+ "This function returns true if a register of type @var{mode} should be\n\
+aligned beyond what the stack frame supports.  It should only return true\n\
+in functions where @code{align_spill_base_regnum} has been defined.",
+ bool, (machine_mode mode),
+ hook_bool_mode_false)
+
+DEFHOOK
+(align_spill_base_regnum,
+ "This function returns the register number of a register set up in the\n\
+fuction prologue to be a secondary stack pointer that has an alignment of\n\
+@var{align_spill_maximum}.  It will be used in place of the frame pointer\n\
+when spilling registers for which @var{align_spill_p} is true.",
+ unsigned int, (void),
+ hook_uint_void_invalid_regnum)
+
+DEFHOOK
+(align_spill_maximum,
+ "This function returns the alignment given to registers for which\n\
+@var{align_spill_p} is true.",
+ unsigned int, (void),
+ hook_uint_void_0)
+
 /* Leave the boolean fields at the end.  */
 
 /* True if we can create zeroed data by switching to a BSS section



Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]