This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: [PATCH] i386: Always set cfun->machine->max_used_stack_alignment


On Sat, Aug 04, 2018 at 11:48:15PM +0200, Uros Bizjak wrote:
> On Sat, Aug 4, 2018 at 9:49 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
> > On Sat, Aug 4, 2018 at 12:09 PM, Uros Bizjak <ubizjak@gmail.com> wrote:
> >> On Sat, Aug 4, 2018 at 3:59 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
> >>> On Sat, Aug 4, 2018 at 3:42 AM, Uros Bizjak <ubizjak@gmail.com> wrote:
> >>>> On Fri, Aug 3, 2018 at 12:55 AM, H.J. Lu <hongjiu.lu@intel.com> wrote:
> >>>>> We should always set cfun->machine->max_used_stack_alignment if the
> >>>>> maximum stack slot alignment may be greater than 64 bits.
> >>>>>
> >>>>> Tested on i686 and x86-64.  OK for master and backport for GCC 8?
> >>>>
> >>>> Can you explain why 64 bits, and what this value represents? Is this
> >>>> value the same for 64bit and 32bit targets?
> >>>>
> >>>> Should crtl->max_used_stack_slot_alignment be compared to
> >>>> STACK_BOUNDARY or even MIN_STACK_BOUNDARY instead?
> >>>
> >>> In this case, we don't need to realign the incoming stack since both
> >>> crtl->max_used_stack_slot_alignment and crtl->preferred_stack_boundary
> >>> are 128 bits.  We don't compute the largest alignment of stack slots to
> >>> keep the stack frame properly aligned for it.  Normally that is OK.  But
> >>> if the largest alignment of stack slots is > 64 bits and we don't keep
> >>> the stack frame properly aligned, we will get a segfault if aligned
> >>> vector loads/stores are used on these unaligned stack slots.  My patch
> >>> computes the largest alignment of the stack slots that are actually
> >>> used if the largest alignment of the stack slots allocated is > 64 bits,
> >>> which is the smallest alignment for misaligned load/store.
> >>
> >> Does > 64 bits also hold for 32 bit targets, and -mabi=ms? I think
> >> that we need to compare to STACK_BOUNDARY instead:
> >
> > The 64-bit threshold is independent of any psABI and of 32-bit vs. 64-bit
> > targets.  cfun->machine->max_used_stack_alignment is used to decide how
> > the stack frame should be aligned.  It is always safe to compute it.  I used
> >
> > else if (crtl->max_used_stack_slot_alignment > 64)
> >
> > to compute cfun->machine->max_used_stack_alignment only if
> > we have to.
> >
> >> --cut here--
> >> Index: config/i386/i386.c
> >> ===================================================================
> >> --- config/i386/i386.c  (revision 263307)
> >> +++ config/i386/i386.c  (working copy)
> >> @@ -13281,8 +13281,7 @@
> >>           recompute_frame_layout_p = true;
> >>         }
> >>      }
> >> -  else if (crtl->max_used_stack_slot_alignment
> >> -          > crtl->preferred_stack_boundary)
> >> +  else if (crtl->max_used_stack_slot_alignment > STACK_BOUNDARY)
> >>      {
> >
> > This isn't correct.  cygming.h has
> >
> > #define STACK_BOUNDARY (TARGET_64BIT && ix86_abi == MS_ABI ? 128 :
> > BITS_PER_WORD)
> >
> > darwin.h has
> >
> > #undef STACK_BOUNDARY
> > #define STACK_BOUNDARY \
> >  ((profile_flag || (TARGET_64BIT && ix86_abi == MS_ABI)) \
> >   ? 128 : BITS_PER_WORD)
> >
> > i386.h has
> >
> > /* Boundary (in *bits*) on which stack pointer should be aligned.  */
> > #define STACK_BOUNDARY \
> >  (TARGET_64BIT && ix86_abi == MS_ABI ? 128 : BITS_PER_WORD)
> >
> > If STACK_BOUNDARY is 128 and max_used_stack_slot_alignment is 128,
> > we will get a segfault when a 128-bit aligned load/store is generated on
> > a misaligned stack slot.
> >
> >>        /* We don't need to realign stack.  But we still need to keep
> >>          stack frame properly aligned to satisfy the largest alignment
> >> --cut here--
> >>
> >> (The testcase works OK with -mabi=ms, which somehow suggests that we
> >> don't need realignment in this case).
> >
> > We may not hit a 128-bit aligned load/store on a misaligned stack slot in
> > this case.  That doesn't mean it won't happen.
> >
> > else if (crtl->max_used_stack_slot_alignment > 64)
> >
> > is the correct thing to do here.
> 
> OK, but please add a comment, so in the future we will still know the
> purpose of the magic number.
> 

Like this?

H.J.
---
cfun->machine->max_used_stack_alignment is used to decide how the stack
frame should be aligned.  This is independent of any psABI and of 32-bit
vs. 64-bit.  It is always safe to compute max_used_stack_alignment.  We
compute it only if a 128-bit aligned load/store may be generated on a
misaligned stack slot, which would lead to a segfault.

gcc/

	PR target/86386
	* config/i386/i386.c (ix86_finalize_stack_frame_flags): Set
	cfun->machine->max_used_stack_alignment if needed.

gcc/testsuite/

	PR target/86386
	* gcc.target/i386/pr86386.c: New file.
---
 gcc/config/i386/i386.c                  | 14 +++++++------
 gcc/testsuite/gcc.target/i386/pr86386.c | 26 +++++++++++++++++++++++++
 2 files changed, 34 insertions(+), 6 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr86386.c

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index ee409cfe7e4..cf8c33bd909 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -13281,12 +13281,14 @@ ix86_finalize_stack_frame_flags (void)
 	  recompute_frame_layout_p = true;
 	}
     }
-  else if (crtl->max_used_stack_slot_alignment
-	   > crtl->preferred_stack_boundary)
-    {
-      /* We don't need to realign stack.  But we still need to keep
-	 stack frame properly aligned to satisfy the largest alignment
-	 of stack slots.  */
+  else if (crtl->max_used_stack_slot_alignment > 64)
+    {
+      /* We don't need to realign stack.  max_used_stack_alignment is
+	 used to decide how stack frame should be aligned.  This is
+	 independent of any psABIs nor 32-bit vs 64-bit.  It is always
+	 safe to compute max_used_stack_alignment.  We compute it only
+	 if 128-bit aligned load/store may be generated on misaligned
+	 stack slot which will lead to segfault.   */
       if (ix86_find_max_used_stack_alignment (stack_alignment, true))
 	cfun->machine->max_used_stack_alignment
 	  = stack_alignment / BITS_PER_UNIT;
diff --git a/gcc/testsuite/gcc.target/i386/pr86386.c b/gcc/testsuite/gcc.target/i386/pr86386.c
new file mode 100644
index 00000000000..a67cf45444e
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr86386.c
@@ -0,0 +1,26 @@
+/* PR target/86386 */
+/* { dg-do run { target { avx_runtime && int128 } } } */
+/* { dg-options "-Os -fno-tree-dce -mstringop-strategy=vector_loop -mavx" } */
+
+unsigned c, d, e, f;
+
+unsigned __attribute__((noipa))
+foo (unsigned char g, unsigned short h, unsigned i, unsigned long long j,
+     unsigned char k, unsigned short l, unsigned m, unsigned __int128 n)
+{
+  __builtin_memset (&e, 0, 3);
+  n <<= m;
+  __builtin_memcpy (&m, 2 + (char *) &n, 1);
+  m >>= 0;
+  d ^= __builtin_mul_overflow (l, n, &m);
+  return m;
+}
+
+int
+main ()
+{
+  unsigned __int128 x = foo (0, 0, 0, 0, 0, 4, 1, 3);
+  if (x != 24)
+    __builtin_abort ();
+  return 0;
+}
-- 
2.17.1
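
For readers following the thread, here is a minimal standalone sketch (not GCC
code) of the three conditions that were discussed.  The struct, its field
layout and the function names are hypothetical and exist only for this
illustration; only the three comparisons themselves are taken from the thread.
It models the case described above, where the used stack slot alignment,
crtl->preferred_stack_boundary and STACK_BOUNDARY are all 128 bits.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the three values compared in the thread.
   All alignments are in bits.  This is not a GCC data structure.  */
struct frame_info
{
  unsigned max_used_stack_slot_alignment;
  unsigned preferred_stack_boundary;
  unsigned stack_boundary;
};

/* Condition used before the patch.  */
static bool
old_check (const struct frame_info *f)
{
  return f->max_used_stack_slot_alignment > f->preferred_stack_boundary;
}

/* Alternative suggested during review.  */
static bool
stack_boundary_check (const struct frame_info *f)
{
  return f->max_used_stack_slot_alignment > f->stack_boundary;
}

/* Condition used by the patch.  */
static bool
patched_check (const struct frame_info *f)
{
  return f->max_used_stack_slot_alignment > 64;
}

/* All three values are 128 bits: only the patched condition asks for
   max_used_stack_alignment to be computed.  */
static void
check_all_128 (void)
{
  struct frame_info f = { 128, 128, 128 };
  assert (!old_check (&f));
  assert (!stack_boundary_check (&f));
  assert (patched_check (&f));
}
```

Calling check_all_128 () from a small main should complete without tripping
an assertion.  The sketch only models which branch is taken; it says nothing
about whether a misaligned 128-bit access is actually emitted in a given
function.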

