This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

[PATCH] Use fld b; fld a; instead of fld a; fld b; fxch %st(1) in reg-stack (PR target/70465)


Hi!

This patch adds a little optimization, if %st and %st(1) are results of
memory loads and we need to exchange them, it is shorter (and on older
machines probably also cheaper) to swap the two loads.

This triggers 26783 in x86_64-linux and i686-linux bootstraps+regtests.
Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Initially I thought I should also adjust debug insns in between i1 and i2
if they refer to i387 stack registers, but further look at regstack shows
that nothing does that and that such debug insns are wrong-debug for
multiple reasons.  In particular, e.g. emit_swap_insn alone emits the
fxch right after i1 or at the start of bb, without trying to adjust
debug insns in between that and insn (we should swap %st with the other reg
in those).  More importantly, at least if the debug regnos actually mean
what they stand in the assembly (i.e. the first one %st, the second one
%st(1) etc.), then we would need to ensure that every insn that pops or
pushes anything onto the i387 stack should update all DWARF expressions
that refer to those registers.  If that is really the case, the easiest
solution for this might be either to reset all debug insns that refer to %st
to %st(7) registers (easy), or replace all those registers in debug insns
with debug temporaries (8 debug temporaries for each pre-regstack register)
and then on i387 pushes or pops emit debug insns for those temporaries,
changing how they are mapped to the actual hw registers and let
var-tracking.c handle the rest.  I guess I should file a PR about this.

2017-01-25  Jakub Jelinek  <jakub@redhat.com>

	PR target/70465
	* reg-stack.c (emit_swap_insn): Instead of fld a; fld b; fxchg %st(1);
	emit fld b; fld a; if possible.

	* gcc.target/i386/pr70465.c: New test.

--- gcc/reg-stack.c.jj	2017-01-25 11:38:54.924320927 +0100
+++ gcc/reg-stack.c	2017-01-25 12:12:08.459045376 +0100
@@ -887,6 +887,52 @@ emit_swap_insn (rtx_insn *insn, stack_pt
 	  && REG_P (i1src) && REGNO (i1src) == FIRST_STACK_REG
 	  && find_regno_note (i1, REG_DEAD, FIRST_STACK_REG) == NULL_RTX)
 	return;
+
+      if (REG_P (i1dest)
+	  && REGNO (i1dest) == FIRST_STACK_REG
+	  && MEM_P (SET_SRC (i1set))
+	  && !side_effects_p (SET_SRC (i1set))
+	  && hard_regno == FIRST_STACK_REG + 1
+	  && i1 != BB_HEAD (current_block))
+	{
+	  rtx_insn *i2 = NULL;
+	  rtx i2set;
+	  rtx_insn *tmp = PREV_INSN (i1);
+	  rtx_insn *limit = PREV_INSN (BB_HEAD (current_block));
+	  while (tmp != limit)
+	    {
+	      if (LABEL_P (tmp)
+		  || CALL_P (tmp)
+		  || NOTE_INSN_BASIC_BLOCK_P (tmp)
+		  || (NONJUMP_INSN_P (tmp)
+		      && stack_regs_mentioned (tmp)))
+		{
+		  i2 = tmp;
+		  break;
+		}
+	      tmp = PREV_INSN (tmp);
+	    }
+	  if (i2 != NULL_RTX
+	      && (i2set = single_set (i2)) != NULL_RTX)
+	    {
+	      /* Instead of fld a; fld b; fxch %st(1); just
+		 use fld b; fld a; if possible.  */
+	      rtx i2dest = *get_true_reg (&SET_DEST (i2set));
+	      if (REG_P (i2dest)
+		  && REGNO (i2dest) == FIRST_STACK_REG
+		  && MEM_P (SET_SRC (i2set))
+		  && !side_effects_p (SET_SRC (i2set))
+		  && !modified_between_p (SET_SRC (i1set), i2, i1))
+		{
+		  remove_insn (i1);
+		  SET_PREV_INSN (i1) = NULL_RTX;
+		  SET_NEXT_INSN (i1) = NULL_RTX;
+		  set_block_for_insn (i1, NULL);
+		  emit_insn_before (i1, i2);
+		  return;
+		}
+	    }
+	}
     }
 
   /* Avoid emitting the swap if this is the first register stack insn
--- gcc/testsuite/gcc.target/i386/pr70465.c.jj	2017-01-25 12:24:25.183041154 +0100
+++ gcc/testsuite/gcc.target/i386/pr70465.c	2017-01-25 12:23:59.000000000 +0100
@@ -0,0 +1,12 @@
+/* PR target/70465 */
+/* { dg-do compile { target ia32 } } */
+/* { dg-options "-O2 -mfpmath=387 -fomit-frame-pointer" } */
+/* { dg-final { scan-assembler-not "fxch\t%st.1" } } */
+
+double
+atan2 (double y, double x)
+{
+  double res = 0.0;
+  asm ("fpatan" : "=t" (res) : "u" (y), "0" (x) : "st(1)");
+  return res;
+}

	Jakub


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]