This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH] Improve i?86/x86_64 prologue_and_epilogue for leaf functions (PR target/59501)


On Fri, Dec 20, 2013 at 8:06 AM, Jakub Jelinek <jakub@redhat.com> wrote:
> Hi!
>
> Honza recently changed the i?86 backend, so that it often doesn't
> do -maccumulate-outgoing-args by default on x86_64.
> Unfortunately, on some of the here included testcases this regressed
> quite a bit the generated code.  As AVX vectors are used, the dynamic
> realignment code needs to assume e.g. that some of them will need to be
> spilled, and for -mno-accumulate-outgoing-args the code needs to set
> need_drap early as well.  But in when emitting the prologue/epilogue,
> if need_drap is set, we don't perform the optimization for leaf functions
> which have zero size stack frame, thus we end up with uselessly doing
> dynamic stack realignment, setting up DRAP that nothing uses and later on
> restore everything back.
>
> This patch improves it, if the DRAP register isn't live at the start of
> entry bb successor and we aren't going to realign the stack, we don't
> need DRAP at all, and even if we need DRAP register, that can't be the sole
> reason for doing stack realignment, the prologue code is able to set up DRAP
> even without dynamic stack realignment.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
>
> 2013-12-20  Jakub Jelinek  <jakub@redhat.com>
>
>         PR target/59501
>         * config/i386/i386.c (ix86_save_reg): Don't return true for drap_reg
>         if !crtl->stack_realign_needed.
>         (ix86_finalize_stack_realign_flags): If drap_reg isn't live on entry
>         and stack_realign_needed will be false, clear drap_reg and need_drap.
>         Optimize leaf functions that don't need stack frame even if
>         crtl->need_drap.
>
>         * gcc.target/i386/pr59501-1.c: New test.
>         * gcc.target/i386/pr59501-1a.c: New test.
>         * gcc.target/i386/pr59501-2.c: New test.
>         * gcc.target/i386/pr59501-2a.c: New test.
>         * gcc.target/i386/pr59501-3.c: New test.
>         * gcc.target/i386/pr59501-3a.c: New test.
>         * gcc.target/i386/pr59501-4.c: New test.
>         * gcc.target/i386/pr59501-4a.c: New test.
>         * gcc.target/i386/pr59501-5.c: New test.
>         * gcc.target/i386/pr59501-6.c: New test.
>
>
> --- gcc/testsuite/gcc.target/i386/pr59501-4a.c.jj       2013-12-20 12:19:20.603212859 +0100
> +++ gcc/testsuite/gcc.target/i386/pr59501-4a.c  2013-12-20 12:23:33.647881672 +0100
> @@ -0,0 +1,8 @@
> +/* PR target/59501 */
> +/* { dg-do compile { target { ! ia32 } } } */
> +/* { dg-options "-O2 -mavx -maccumulate-outgoing-args" } */
> +
> +#include "pr59501-3a.c"
> +
> +/* Verify no dynamic realignment is performed.  */
> +/* { dg-final { scan-assembler-not "and\[^\n\r]*sp" { xfail *-*-* } } } */
>

Since DRAP isn't used with -maccumulate-outgoing-args, pr59501-4a.c was
xfailed due to stack frame access via frame pointer instead of DARP.
This patch finds the maximum stack alignment from the stack frame access
instructions and avoids stack realignment if stack alignment needed is
less than incoming stack boundary.

I am testing this patch.  OK for trunk if there is no regression?


-- 
H.J.
From d9844cba6ce20498aab42a32927230cb7b56475d Mon Sep 17 00:00:00 2001
From: "H.J. Lu" <hjl.tools@gmail.com>
Date: Fri, 7 Jul 2017 06:01:16 -0700
Subject: [PATCH] i386: Avoid stack realignment if possible

Since DRAP isn't used with -maccumulate-outgoing-args, pr59501-4a.c was
xfailed due to stack frame access via frame pointer instead of DARP.
This patch finds the maximum stack alignment from the stack frame access
instructions and avoids stack realignment if stack alignment needed is
less than incoming stack boundary.

gcc/

	PR target/59501
	* config/i386/i386.c (ix86_finalize_stack_realign_flags): Don't
	realign stack if stack alignment needed is less than incoming
	stack boundary.

gcc/testsuite/

	PR target/59501
	* gcc.target/i386/pr59501-4a.c: Remove xfail.
---
 gcc/config/i386/i386.c                     | 83 +++++++++++++++++++-----------
 gcc/testsuite/gcc.target/i386/pr59501-4a.c |  2 +-
 2 files changed, 55 insertions(+), 30 deletions(-)

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index b041524..0c61998 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -14161,6 +14161,10 @@ ix86_finalize_stack_realign_flags (void)
       add_to_hard_reg_set (&set_up_by_prologue, Pmode, ARG_POINTER_REGNUM);
       add_to_hard_reg_set (&set_up_by_prologue, Pmode,
 			   HARD_FRAME_POINTER_REGNUM);
+
+      unsigned int stack_alignment = 0;
+      bool require_stack_frame = false;
+
       FOR_EACH_BB_FN (bb, cfun)
         {
           rtx_insn *insn;
@@ -14169,43 +14173,64 @@ ix86_finalize_stack_realign_flags (void)
 		&& requires_stack_frame_p (insn, prologue_used,
 					   set_up_by_prologue))
 	      {
-		if (crtl->stack_realign_needed != stack_realign)
-		  recompute_frame_layout_p = true;
-		crtl->stack_realign_needed = stack_realign;
-		crtl->stack_realign_finalized = true;
-		if (recompute_frame_layout_p)
-		  ix86_compute_frame_layout ();
-		return;
+		require_stack_frame = true;
+
+		/* Find the maximum stack alignment.  */
+		subrtx_iterator::array_type array;
+		FOR_EACH_SUBRTX (iter, array, PATTERN (insn), ALL)
+		  if ((MEM_P (*iter)))
+		    {
+		      unsigned int alignment = MEM_ALIGN (*iter);
+		      if (alignment > stack_alignment)
+			stack_alignment = alignment;
+		    }
 	      }
 	}
 
-      /* If drap has been set, but it actually isn't live at the start
-	 of the function, there is no reason to set it up.  */
-      if (crtl->drap_reg)
+      if (require_stack_frame)
 	{
-	  basic_block bb = ENTRY_BLOCK_PTR_FOR_FN (cfun)->next_bb;
-	  if (! REGNO_REG_SET_P (DF_LR_IN (bb), REGNO (crtl->drap_reg)))
+	  /* Stack frame is required.  If stack alignment needed is less
+	     than incoming stack boundary, don't realign stack.  */
+	  stack_realign = incoming_stack_boundary < stack_alignment;
+	  if (!stack_realign)
 	    {
-	      crtl->drap_reg = NULL_RTX;
-	      crtl->need_drap = false;
+	      crtl->max_used_stack_slot_alignment
+		= incoming_stack_boundary;
+	      crtl->stack_alignment_needed
+		= incoming_stack_boundary;
 	    }
 	}
       else
-	cfun->machine->no_drap_save_restore = true;
-
-      frame_pointer_needed = false;
-      stack_realign = false;
-      crtl->max_used_stack_slot_alignment = incoming_stack_boundary;
-      crtl->stack_alignment_needed = incoming_stack_boundary;
-      crtl->stack_alignment_estimated = incoming_stack_boundary;
-      if (crtl->preferred_stack_boundary > incoming_stack_boundary)
-	crtl->preferred_stack_boundary = incoming_stack_boundary;
-      df_finish_pass (true);
-      df_scan_alloc (NULL);
-      df_scan_blocks ();
-      df_compute_regs_ever_live (true);
-      df_analyze ();
-      recompute_frame_layout_p = true;
+	{
+	  /* If drap has been set, but it actually isn't live at the
+	     start of the function, there is no reason to set it up.  */
+	  if (crtl->drap_reg)
+	    {
+	      basic_block bb = ENTRY_BLOCK_PTR_FOR_FN (cfun)->next_bb;
+	      if (! REGNO_REG_SET_P (DF_LR_IN (bb),
+				     REGNO (crtl->drap_reg)))
+		{
+		  crtl->drap_reg = NULL_RTX;
+		  crtl->need_drap = false;
+		}
+	    }
+	  else
+	    cfun->machine->no_drap_save_restore = true;
+
+	  frame_pointer_needed = false;
+	  stack_realign = false;
+	  crtl->max_used_stack_slot_alignment = incoming_stack_boundary;
+	  crtl->stack_alignment_needed = incoming_stack_boundary;
+	  crtl->stack_alignment_estimated = incoming_stack_boundary;
+	  if (crtl->preferred_stack_boundary > incoming_stack_boundary)
+	    crtl->preferred_stack_boundary = incoming_stack_boundary;
+	  df_finish_pass (true);
+	  df_scan_alloc (NULL);
+	  df_scan_blocks ();
+	  df_compute_regs_ever_live (true);
+	  df_analyze ();
+	  recompute_frame_layout_p = true;
+	}
     }
 
   if (crtl->stack_realign_needed != stack_realign)
diff --git a/gcc/testsuite/gcc.target/i386/pr59501-4a.c b/gcc/testsuite/gcc.target/i386/pr59501-4a.c
index 5c3cb68..908c7f4 100644
--- a/gcc/testsuite/gcc.target/i386/pr59501-4a.c
+++ b/gcc/testsuite/gcc.target/i386/pr59501-4a.c
@@ -5,4 +5,4 @@
 #include "pr59501-3a.c"
 
 /* Verify no dynamic realignment is performed.  */
-/* { dg-final { scan-assembler-not "and\[^\n\r]*sp" { xfail *-*-* } } } */
+/* { dg-final { scan-assembler-not "and\[^\n\r]*sp" } } */
-- 
2.9.4


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]