PATCH: PR target/40838: gcc shouldn't assume that the stack is aligned

H.J. Lu hjl.tools@gmail.com
Tue Oct 20 21:15:00 GMT 2009


On Mon, Oct 19, 2009 at 6:25 PM, Michael Matz <matz@suse.de> wrote:
> Hi,
>
> On Mon, 19 Oct 2009, H.J. Lu wrote:
>
>> > He tries to solve this by pessimistically assuming that potentially
>> > everything imaginable could go on stack.  What I don't understand is
>> > why we don't instead track hard_stack_alignment in assign_*_temp
>> > (where we then assume that the stack will be aligned perfectly), and
>> > expand stack realignment code _after_ having expanded everything else
>> > (plus examined local variables for the possibility of generating spill
>> > slots).
>>
>> Vectorizer may not call assign_*_temp at all. Instead, x86 backend may
>> call gen_reg_rtx to generate pseudo registers when expanding vector
>> statement.
>
> If that (and not assign_*_temp) is the problem, then it's obvious that
> fiddling with the vectorizer doesn't solve it.  The expanders can create
> new pseudos for whatever they see fit.  For expanding vector statements,
> for expanding block moves, for expanding string compares, for expanding
> additions, for anything.  Mucking around with the vectorizer won't solve
> the problem.
>
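
The point above, that any expander may create pseudos, suggests recording
the alignment at every allocation point rather than in one pass.  A minimal
sketch of that running-maximum idea follows.  The names frame_info,
note_alignment, new_pseudo and new_stack_slot are hypothetical stand-ins,
not GCC functions; in the patch the state lives in crtl and the updates sit
in gen_reg_rtx, assign_stack_local_1 and friends.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-function state (crtl in GCC).  */
struct frame_info
{
  unsigned int hard_stack_alignment;	/* Largest alignment seen, in bits.  */
};

/* Record one alignment request, keeping the running maximum.  */
static void
note_alignment (struct frame_info *fi, unsigned int align_bits)
{
  assert (fi != NULL);
  if (fi->hard_stack_alignment < align_bits)
    fi->hard_stack_alignment = align_bits;
}

/* Hypothetical allocation points.  Each one reports the alignment of
   what it creates, so no caller can bypass the bookkeeping.  */
static void
new_pseudo (struct frame_info *fi, unsigned int mode_align_bits)
{
  note_alignment (fi, mode_align_bits);
}

static void
new_stack_slot (struct frame_info *fi, unsigned int slot_align_bits)
{
  note_alignment (fi, slot_align_bits);
}
```

Starting from a zeroed frame_info, requesting a 32-bit slot, a 128-bit
pseudo and a 64-bit slot leaves hard_stack_alignment at 128, regardless of
the order of the requests.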

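Here is a new patch.  I added hard_stack_alignment to rtl_data, moved the
update_stack_boundary call from gimple_expand_cfg into
expand_stack_alignment so that it runs after RTL expansion, and updated
ix86_function_ok_for_sibcall to use the new
ix86_minimum_incoming_stack_boundary.  Any comments?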
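
As a rough illustration of how the incoming boundary is chosen, here is a
simplified, standalone sketch of the first part of
ix86_minimum_incoming_stack_boundary.  The struct boundary_inputs and both
function names are hypothetical; the real function reads GCC globals and
crtl, and also handles the force_align_arg_pointer attribute,
parm_stack_boundary and main, which are left out here.

```c
#include <stdbool.h>

/* Hypothetical bundle of the values the real function reads from
   globals.  All boundaries are in bits.  */
struct boundary_inputs
{
  unsigned int user_boundary;		/* -mincoming-stack-boundary=, 0 if unset.  */
  unsigned int default_boundary;	/* Default incoming boundary.  */
  unsigned int min_boundary;		/* Stand-in for MIN_STACK_BOUNDARY.  */
  unsigned int hard_stack_alignment;	/* Largest hard alignment recorded.  */
  bool target_64bit;
  bool force_align_arg_pointer;		/* -mstackrealign.  */
};

/* Return the incoming stack boundary to assume.  SIBCALL is true when
   the caller is deciding whether a sibcall is safe.  */
static unsigned int
minimum_incoming_boundary (const struct boundary_inputs *in, bool sibcall)
{
  /* A boundary given on the command line always wins.  */
  if (in->user_boundary)
    return in->user_boundary;

  /* 32-bit with -mstackrealign: fall back to the minimum boundary for
     the sibcall check, or when something needs 128-bit alignment.  */
  if (!in->target_64bit
      && in->force_align_arg_pointer
      && (sibcall || in->hard_stack_alignment == 128))
    return in->min_boundary;

  return in->default_boundary;
}

/* A sibcall only keeps the incoming alignment, so it is rejected when
   that alignment may be below the preferred boundary.  */
static bool
sibcall_ok (const struct boundary_inputs *in, unsigned int preferred)
{
  return minimum_incoming_boundary (in, true) >= preferred;
}
```

With a 32-bit target, -mstackrealign, a default of 128 and a minimum of 32,
a function whose hard alignment stays at 64 keeps the 128-bit assumption
for its own frame, while the sibcall check still sees 32 and refuses the
sibcall.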

Thanks.


-- 
H.J.
---
gcc/

2009-10-20  H.J. Lu  <hongjiu.lu@intel.com>

	PR target/40838
	* cfgexpand.c (get_decl_align_unit): Update hard_stack_alignment.
	(expand_one_var): Likewise.
	(gimple_expand_cfg): Initialize hard_stack_alignment to 0. Move
	update_stack_boundary call to ...
	(expand_stack_alignment): Here.

	* emit-rtl.c (gen_reg_rtx): Update hard_stack_alignment.
	* function.c (assign_stack_local_1): Likewise.
	(assign_parms): Likewise.
	(locate_and_pad_parm): Likewise.

	* function.h (rtl_data): Add hard_stack_alignment.

	* config/i386/i386.c (ix86_minimum_incoming_stack_boundary): New.
	(override_options): Don't check ix86_force_align_arg_pointer here.
	(ix86_function_ok_for_sibcall): Use it.
	(ix86_update_stack_boundary): Likewise.

	* config/i386/i386.h (STACK_REALIGN_DEFAULT): Update comments.

gcc/testsuite/

2009-10-20  H.J. Lu  <hongjiu.lu@intel.com>

	PR target/40838
	* gcc.target/i386/incoming-6.c: New.
	* gcc.target/i386/incoming-7.c: Likewise.
	* gcc.target/i386/incoming-8.c: Likewise.
	* gcc.target/i386/incoming-9.c: Likewise.
	* gcc.target/i386/incoming-10.c: Likewise.
	* gcc.target/i386/incoming-11.c: Likewise.
	* gcc.target/i386/incoming-12.c: Likewise.
	* gcc.target/i386/incoming-13.c: Likewise.
	* gcc.target/i386/incoming-14.c: Likewise.
	* gcc.target/i386/incoming-15.c: Likewise.
-------------- next part --------------
diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
index acd70c1..05df1fd 100644
--- a/gcc/cfgexpand.c
+++ b/gcc/cfgexpand.c
@@ -247,6 +247,9 @@ get_decl_align_unit (tree decl)
 	  gcc_assert(!crtl->stack_realign_processed);
           crtl->stack_alignment_estimated = align;
 	}
+
+      if (crtl->hard_stack_alignment < align)
+	crtl->hard_stack_alignment = align;
     }
 
   /* stack_alignment_needed > PREFERRED_STACK_BOUNDARY is permitted.
@@ -994,6 +997,9 @@ expand_one_var (tree var, bool toplevel, bool really_expand)
           gcc_assert(!crtl->stack_realign_processed);
 	  crtl->stack_alignment_estimated = align;
 	}
+
+      if (crtl->hard_stack_alignment < align)
+	crtl->hard_stack_alignment = align;
     }
 
   if (TREE_CODE (origvar) == SSA_NAME)
@@ -3472,6 +3478,19 @@ expand_stack_alignment (void)
   gcc_assert (crtl->stack_alignment_needed
 	      <= crtl->stack_alignment_estimated);
 
+  /* Call update_stack_boundary here again to update incoming stack
+     boundary.  It may set incoming stack alignment to a different
+     value after RTL expansion.  TARGET_FUNCTION_OK_FOR_SIBCALL may
+     use the minimum incoming stack alignment to check if it is OK
+     to perform sibcall optimization since sibcall optimization will
+     only align the outgoing stack to incoming stack boundary.  */
+  if (targetm.calls.update_stack_boundary)
+    targetm.calls.update_stack_boundary ();
+
+  /* The incoming stack frame has to be aligned at least at
+     parm_stack_boundary.  */
+  gcc_assert (crtl->parm_stack_boundary <= INCOMING_STACK_BOUNDARY);
+
   /* Update crtl->stack_alignment_estimated and use it later to align
      stack.  We check PREFERRED_STACK_BOUNDARY if there may be non-call
      exceptions since callgraph doesn't collect incoming stack alignment
@@ -3564,6 +3583,7 @@ gimple_expand_cfg (void)
   crtl->max_used_stack_slot_alignment = STACK_BOUNDARY;
   crtl->stack_alignment_estimated = STACK_BOUNDARY;
   crtl->preferred_stack_boundary = STACK_BOUNDARY;
+  crtl->hard_stack_alignment = 0;
   cfun->cfg->max_jumptable_ents = 0;
 
 
@@ -3626,23 +3646,6 @@ gimple_expand_cfg (void)
   if (crtl->stack_protect_guard)
     stack_protect_prologue ();
 
-  /* Update stack boundary if needed.  */
-  if (SUPPORTS_STACK_ALIGNMENT)
-    {
-      /* Call update_stack_boundary here to update incoming stack
-	 boundary before TARGET_FUNCTION_OK_FOR_SIBCALL is called.
-	 TARGET_FUNCTION_OK_FOR_SIBCALL needs to know the accurate
-	 incoming stack alignment to check if it is OK to perform
-	 sibcall optimization since sibcall optimization will only
-	 align the outgoing stack to incoming stack boundary.  */
-      if (targetm.calls.update_stack_boundary)
-	targetm.calls.update_stack_boundary ();
-      
-      /* The incoming stack frame has to be aligned at least at
-	 parm_stack_boundary.  */
-      gcc_assert (crtl->parm_stack_boundary <= INCOMING_STACK_BOUNDARY);
-    }
-
   expand_phi_nodes (&SA);
 
   /* Register rtl specific functions for cfg.  */
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 73913b8..3db9485 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -1905,6 +1905,7 @@ static bool ix86_valid_target_attribute_p (tree, tree, tree, int);
 static bool ix86_valid_target_attribute_inner_p (tree, char *[]);
 static bool ix86_can_inline_p (tree, tree);
 static void ix86_set_current_function (tree);
+static unsigned int ix86_minimum_incoming_stack_boundary (bool);
 
 static enum calling_abi ix86_function_abi (const_tree);
 
@@ -3239,12 +3240,10 @@ override_options (bool main_args_p)
   if (ix86_force_align_arg_pointer == -1)
     ix86_force_align_arg_pointer = STACK_REALIGN_DEFAULT;
 
+  ix86_default_incoming_stack_boundary = PREFERRED_STACK_BOUNDARY;
+
   /* Validate -mincoming-stack-boundary= value or default it to
      MIN_STACK_BOUNDARY/PREFERRED_STACK_BOUNDARY.  */
-  if (ix86_force_align_arg_pointer)
-    ix86_default_incoming_stack_boundary = MIN_STACK_BOUNDARY;
-  else
-    ix86_default_incoming_stack_boundary = PREFERRED_STACK_BOUNDARY;
   ix86_incoming_stack_boundary = ix86_default_incoming_stack_boundary;
   if (ix86_incoming_stack_boundary_string)
     {
@@ -4277,7 +4276,8 @@ ix86_function_ok_for_sibcall (tree decl, tree exp)
 
   /* If we need to align the outgoing stack, then sibcalling would
      unalign the stack, which may break the called function.  */
-  if (ix86_incoming_stack_boundary < PREFERRED_STACK_BOUNDARY)
+  if (ix86_minimum_incoming_stack_boundary (true)
+      < PREFERRED_STACK_BOUNDARY)
     return false;
 
   if (decl)
@@ -8196,37 +8196,57 @@ find_drap_reg (void)
     }
 }
 
-/* Update incoming stack boundary and estimated stack alignment.  */
+/* Return minimum incoming stack alignment.  */
 
-static void
-ix86_update_stack_boundary (void)
+static unsigned int
+ix86_minimum_incoming_stack_boundary (bool sibcall)
 {
+  unsigned int incoming_stack_boundary;
+
   /* Prefer the one specified at command line. */
-  ix86_incoming_stack_boundary 
-    = (ix86_user_incoming_stack_boundary
-       ? ix86_user_incoming_stack_boundary
-       : ix86_default_incoming_stack_boundary);
+  if (ix86_user_incoming_stack_boundary)
+    incoming_stack_boundary = ix86_user_incoming_stack_boundary;
+  /* In 32bit, use MIN_STACK_BOUNDARY for incoming stack boundary if
+     -mstackrealign is used and it is called to check if sibcall is
+     OK or hard stack alignment is 128bit.  */
+  else if (!TARGET_64BIT
+	   && ix86_force_align_arg_pointer
+	   && (sibcall || crtl->hard_stack_alignment == 128))
+    incoming_stack_boundary = MIN_STACK_BOUNDARY;
+  else
+    incoming_stack_boundary = ix86_default_incoming_stack_boundary;
 
   /* Incoming stack alignment can be changed on individual functions
      via force_align_arg_pointer attribute.  We use the smallest
      incoming stack boundary.  */
-  if (ix86_incoming_stack_boundary > MIN_STACK_BOUNDARY
+  if (incoming_stack_boundary > MIN_STACK_BOUNDARY
       && lookup_attribute (ix86_force_align_arg_pointer_string,
 			   TYPE_ATTRIBUTES (TREE_TYPE (current_function_decl))))
-    ix86_incoming_stack_boundary = MIN_STACK_BOUNDARY;
+    incoming_stack_boundary = MIN_STACK_BOUNDARY;
 
   /* The incoming stack frame has to be aligned at least at
      parm_stack_boundary.  */
-  if (ix86_incoming_stack_boundary < crtl->parm_stack_boundary)
-    ix86_incoming_stack_boundary = crtl->parm_stack_boundary;
+  if (incoming_stack_boundary < crtl->parm_stack_boundary)
+    incoming_stack_boundary = crtl->parm_stack_boundary;
 
   /* Stack at entrance of main is aligned by runtime.  We use the
      smallest incoming stack boundary. */
-  if (ix86_incoming_stack_boundary > MAIN_STACK_BOUNDARY
+  if (incoming_stack_boundary > MAIN_STACK_BOUNDARY
       && DECL_NAME (current_function_decl)
       && MAIN_NAME_P (DECL_NAME (current_function_decl))
       && DECL_FILE_SCOPE_P (current_function_decl))
-    ix86_incoming_stack_boundary = MAIN_STACK_BOUNDARY;
+    incoming_stack_boundary = MAIN_STACK_BOUNDARY;
+
+  return incoming_stack_boundary;
+}
+
+/* Update incoming stack boundary and estimated stack alignment.  */
+
+static void
+ix86_update_stack_boundary (void)
+{
+  ix86_incoming_stack_boundary
+    = ix86_minimum_incoming_stack_boundary (false);
 
   /* x86_64 vararg needs 16byte stack alignment for register save
      area.  */
diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h
index 33a5077..22187a9 100644
--- a/gcc/config/i386/i386.h
+++ b/gcc/config/i386/i386.h
@@ -706,9 +706,7 @@ enum target_cpu_default
    generate an alternate prologue and epilogue that realigns the
    runtime stack if nessary.  This supports mixing codes that keep a
    4-byte aligned stack, as specified by i386 psABI, with codes that
-   need a 16-byte aligned stack, as required by SSE instructions.  If
-   STACK_REALIGN_DEFAULT is 1 and PREFERRED_STACK_BOUNDARY_DEFAULT is
-   128, stacks for all functions may be realigned.  */
+   need a 16-byte aligned stack, as required by SSE instructions.  */
 #define STACK_REALIGN_DEFAULT 0
 
 /* Boundary (in *bits*) on which the incoming stack is aligned.  */
diff --git a/gcc/emit-rtl.c b/gcc/emit-rtl.c
index b868298..c1ca40e 100644
--- a/gcc/emit-rtl.c
+++ b/gcc/emit-rtl.c
@@ -872,6 +872,9 @@ gen_reg_rtx (enum machine_mode mode)
       unsigned int min_align = MINIMUM_ALIGNMENT (NULL, mode, align);
       if (crtl->stack_alignment_estimated < min_align)
 	crtl->stack_alignment_estimated = min_align;
+
+      if (crtl->hard_stack_alignment < min_align)
+	crtl->hard_stack_alignment = min_align;
     }
 
   if (generating_concat_p
diff --git a/gcc/function.c b/gcc/function.c
index 35c0cfd..eb8ecfe 100644
--- a/gcc/function.c
+++ b/gcc/function.c
@@ -356,6 +356,10 @@ assign_stack_local_1 (enum machine_mode mode, HOST_WIDE_INT size,
 		}
 	    }
 	}
+
+      if (!crtl->stack_realign_processed
+	  && crtl->hard_stack_alignment < alignment_in_bits)
+	crtl->hard_stack_alignment = alignment_in_bits;
     }
 
   if (crtl->stack_alignment_needed < alignment_in_bits)
@@ -3166,6 +3170,9 @@ assign_parms (tree fndecl)
 	      gcc_assert (!crtl->stack_realign_processed);
 	      crtl->stack_alignment_estimated = align;
 	    }
+
+	  if (crtl->hard_stack_alignment < align)
+	    crtl->hard_stack_alignment = align;
 	}
 	
       if (cfun->stdarg && !TREE_CHAIN (parm))
@@ -3223,6 +3230,9 @@ assign_parms (tree fndecl)
 		  gcc_assert (!crtl->stack_realign_processed);
 		  crtl->stack_alignment_estimated = align;
 		}
+
+	      if (crtl->hard_stack_alignment < align)
+		crtl->hard_stack_alignment = align;
 	    }
 	} 
     }
@@ -3538,6 +3548,10 @@ locate_and_pad_parm (enum machine_mode passed_mode, tree type, int in_regs,
 			  && crtl->stack_realign_needed);
 	    }
 	}
+
+      if (!crtl->stack_realign_processed
+	  && crtl->hard_stack_alignment < boundary)
+	crtl->hard_stack_alignment = boundary;
     }
 
   /* Remember if the outgoing parameter requires extra alignment on the
diff --git a/gcc/function.h b/gcc/function.h
index 4825d16..8b57c9a 100644
--- a/gcc/function.h
+++ b/gcc/function.h
@@ -341,6 +341,9 @@ struct GTY(()) rtl_data {
         local stack.  */
   unsigned int stack_alignment_estimated;
 
+  /* The largest hard alignment on the stack.  */
+  unsigned int hard_stack_alignment;
+
   /* For reorg.  */
 
   /* If some insns can be deferred to the delay slots of the epilogue, the
diff --git a/gcc/testsuite/gcc.target/i386/incoming-10.c b/gcc/testsuite/gcc.target/i386/incoming-10.c
new file mode 100644
index 0000000..31d9e61
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/incoming-10.c
@@ -0,0 +1,19 @@
+/* PR target/40838 */
+/* { dg-do compile { target { { ! *-*-darwin* } && ilp32 } } } */
+/* { dg-options "-w -mstackrealign -fomit-frame-pointer -O3 -march=barcelona -mpreferred-stack-boundary=4" } */
+
+struct s {
+	int x[8];
+};
+
+void g(struct s *);
+
+void f()
+{
+	int i;
+	struct s s;
+	for (i = 0; i < sizeof(s.x) / sizeof(*s.x); i++) s.x[i] = 0;
+	g(&s);
+}
+
+/* { dg-final { scan-assembler "andl\[\\t \]*\\$-16,\[\\t \]*%esp" } } */
diff --git a/gcc/testsuite/gcc.target/i386/incoming-11.c b/gcc/testsuite/gcc.target/i386/incoming-11.c
new file mode 100644
index 0000000..e5787af
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/incoming-11.c
@@ -0,0 +1,18 @@
+/* PR target/40838 */
+/* { dg-do compile { target { { ! *-*-darwin* } && ilp32 } } } */
+/* { dg-options "-w -mstackrealign -fomit-frame-pointer -O3 -march=barcelona -mpreferred-stack-boundary=4" } */
+
+void g();
+
+int p[100];
+int q[100];
+
+void f()
+{
+	int i;
+	for (i = 0; i < 100; i++) p[i] = 0;
+	g();
+	for (i = 0; i < 100; i++) q[i] = 0;
+}
+
+/* { dg-final { scan-assembler "andl\[\\t \]*\\$-16,\[\\t \]*%esp" } } */
diff --git a/gcc/testsuite/gcc.target/i386/incoming-12.c b/gcc/testsuite/gcc.target/i386/incoming-12.c
new file mode 100644
index 0000000..d7ef103
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/incoming-12.c
@@ -0,0 +1,20 @@
+/* PR target/40838 */
+/* { dg-do compile { target { { ! *-*-darwin* } && ilp32 } } } */
+/* { dg-options "-w -mstackrealign -O2 -msse2 -mpreferred-stack-boundary=4" } */
+
+typedef int v4si __attribute__ ((vector_size (16)));
+
+struct x {
+       v4si v;
+       v4si w;
+};
+
+void y(void *);
+
+v4si x(void)
+{
+       struct x x;
+       y(&x);
+}
+
+/* { dg-final { scan-assembler "andl\[\\t \]*\\$-16,\[\\t \]*%esp" } } */
diff --git a/gcc/testsuite/gcc.target/i386/incoming-13.c b/gcc/testsuite/gcc.target/i386/incoming-13.c
new file mode 100644
index 0000000..bbc8993
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/incoming-13.c
@@ -0,0 +1,15 @@
+/* PR target/40838 */
+/* { dg-do compile { target { { ! *-*-darwin* } && ilp32 } } } */
+/* { dg-options "-w -mstackrealign -O2 -mpreferred-stack-boundary=4" } */
+
+extern double y(double *s3);
+
+extern double s1, s2;
+
+double x(void)
+{
+  double s3 = s1 + s2;
+  return y(&s3);
+}
+
+/* { dg-final { scan-assembler-not "andl\[\\t \]*\\$-16,\[\\t \]*%esp" } } */
diff --git a/gcc/testsuite/gcc.target/i386/incoming-14.c b/gcc/testsuite/gcc.target/i386/incoming-14.c
new file mode 100644
index 0000000..d27179d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/incoming-14.c
@@ -0,0 +1,15 @@
+/* PR target/40838 */
+/* { dg-do compile { target { { ! *-*-darwin* } && ilp32 } } } */
+/* { dg-options "-w -mstackrealign -O2 -mpreferred-stack-boundary=4" } */
+
+extern int y(int *s3);
+
+extern int s1, s2;
+
+int x(void)
+{
+  int s3 = s1 + s2;
+  return y(&s3);
+}
+
+/* { dg-final { scan-assembler-not "andl\[\\t \]*\\$-16,\[\\t \]*%esp" } } */
diff --git a/gcc/testsuite/gcc.target/i386/incoming-15.c b/gcc/testsuite/gcc.target/i386/incoming-15.c
new file mode 100644
index 0000000..e6a1749
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/incoming-15.c
@@ -0,0 +1,15 @@
+/* PR target/40838 */
+/* { dg-do compile { target { { ! *-*-darwin* } && ilp32 } } } */
+/* { dg-options "-w -mstackrealign -O2 -mpreferred-stack-boundary=4" } */
+
+extern long long y(long long *s3);
+
+extern long long s1, s2;
+
+long long x(void)
+{
+  long long s3 = s1 + s2;
+  return y(&s3);
+}
+
+/* { dg-final { scan-assembler-not "andl\[\\t \]*\\$-16,\[\\t \]*%esp" } } */
diff --git a/gcc/testsuite/gcc.target/i386/incoming-6.c b/gcc/testsuite/gcc.target/i386/incoming-6.c
new file mode 100644
index 0000000..5cc4ab3
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/incoming-6.c
@@ -0,0 +1,17 @@
+/* PR target/40838 */
+/* { dg-do compile { target { { ! *-*-darwin* } && ilp32 } } } */
+/* { dg-options "-w -mstackrealign -O2 -msse2 -mpreferred-stack-boundary=4" } */
+
+typedef int v4si __attribute__ ((vector_size (16)));
+
+extern v4si y(v4si *s3);
+
+extern v4si s1, s2;
+
+v4si x(void)
+{
+  v4si s3 = s1 + s2;
+  return y(&s3);
+}
+
+/* { dg-final { scan-assembler "andl\[\\t \]*\\$-16,\[\\t \]*%esp" } } */
diff --git a/gcc/testsuite/gcc.target/i386/incoming-7.c b/gcc/testsuite/gcc.target/i386/incoming-7.c
new file mode 100644
index 0000000..cdd6037
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/incoming-7.c
@@ -0,0 +1,16 @@
+/* PR target/40838 */
+/* { dg-do compile { target { { ! *-*-darwin* } && ilp32 } } } */
+/* { dg-options "-w -mstackrealign -O2 -msse2 -mpreferred-stack-boundary=4" } */
+
+typedef int v4si __attribute__ ((vector_size (16)));
+
+extern v4si y(v4si, v4si, v4si, v4si, v4si);
+
+extern v4si s1, s2;
+
+v4si x(void)
+{
+  return y(s1, s2, s1, s2, s2);
+}
+
+/* { dg-final { scan-assembler "andl\[\\t \]*\\$-16,\[\\t \]*%esp" } } */
diff --git a/gcc/testsuite/gcc.target/i386/incoming-8.c b/gcc/testsuite/gcc.target/i386/incoming-8.c
new file mode 100644
index 0000000..2dd8800
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/incoming-8.c
@@ -0,0 +1,18 @@
+/* PR target/40838 */
+/* { dg-do compile { target { { ! *-*-darwin* } && ilp32 } } } */
+/* { dg-options "-w -mstackrealign -O3 -msse2 -mpreferred-stack-boundary=4" } */
+
+float
+foo (float f)
+{
+  float array[128];
+  float x;
+  int i;
+  for (i = 0; i < sizeof(array) / sizeof(*array); i++)
+    array[i] = f;
+  for (i = 0; i < sizeof(array) / sizeof(*array); i++)
+    x += array[i];
+  return x;
+}
+
+/* { dg-final { scan-assembler "andl\[\\t \]*\\$-16,\[\\t \]*%esp" } } */
diff --git a/gcc/testsuite/gcc.target/i386/incoming-9.c b/gcc/testsuite/gcc.target/i386/incoming-9.c
new file mode 100644
index 0000000..e43cbd6
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/incoming-9.c
@@ -0,0 +1,18 @@
+/* PR target/40838 */
+/* { dg-do compile { target { { ! *-*-darwin* } && ilp32 } } } */
+/* { dg-options "-w -mstackrealign -O3 -mno-sse -mpreferred-stack-boundary=4" } */
+
+float
+foo (float f)
+{
+  float array[128];
+  float x;
+  int i;
+  for (i = 0; i < sizeof(array) / sizeof(*array); i++)
+    array[i] = f;
+  for (i = 0; i < sizeof(array) / sizeof(*array); i++)
+    x += array[i];
+  return x;
+}
+
+/* { dg-final { scan-assembler-not "andl\[\\t \]*\\$-16,\[\\t \]*%esp" } } */

