[PATCH] Fix PR middle-end/28690, indexed load/store performance + reload bug

Peter Bergner bergner@vnet.ibm.com
Fri Jan 5 21:11:00 GMT 2007


On Tue, Dec 05, 2006 at 02:23:29PM +0100, Paolo Bonzini wrote:
> This patch does not necessarily work.  You should change other users of
> commutative_operand_precedence to agree with the new definition of
> swap_commutative_operands_p.
> 
> The issue of the relative order of objects is quite thorny.  The problem
> is that GCC prefers to have (set X (op X Y)) rather than (set X (op Y
> X)).  This blocks the possibility to give a different precedence to MEMs
>  and REGs (and in turn, this would allow to have a different precedence
> for REGs depending on REG_POINTER).  I think that this patch is not
> going to be safe unless the underlying problems with (set X (op Y X))
> are fixed.
[snip]
> If you can change s_p_m_o_d_c and still have no problems with PR26847,
> go ahead.
> 
> (BTW, remember I can help with x86 SPEC testing if you need it).

Ok, taking your sugestion from:

  http://gcc.gnu.org/ml/gcc-patches/2006-12/msg00289.html

and changing the three users of commutative_operand_precedence, I have the
following patch which I bootstrapped.  My regtesting showed no additional
errors.  However, this was before I noticed the the 64-bit libgcc wasn't
installed due to the libgcc changes.o all we know is that there were no
32-bit testsuite regressions.  I'm rerunning the testsuite now to check
the 64-bit results.

Anyway, I tried to be careful to preserve the previous behavior of 
simplify_plus_minus_op_data_cmp and swap_commutative_operands_with_target
wrt the special cases they were trying to handle at the same time as
attempting to optimize for the indexed load/store case.  This patch seems
to handle all the previous test cases that we handled before (ie, those
test cases where the pointer is not marked "artifical") as well as Pat's
last test case in Comment #32 in PR28690 which we were not handling with
the older patch.

How does this look?  One thing I can think of is that maybe we should
be testing for target != op0 in swap_commutative_operands_with_target.
Once my current regression test runs are over, I'll bootstrap with
that additional change.

Peter





Index: optabs.c
===================================================================
--- optabs.c	(revision 120449)
+++ optabs.c	(working copy)
@@ -1211,9 +1211,13 @@ swap_commutative_operands_with_target (r
   /* With equal precedence, both orders are ok, but it is better if the
      first operand is TARGET, or if both TARGET and OP0 are pseudos.  */
   if (target == 0 || REG_P (target))
-    return (REG_P (op1) && !REG_P (op0)) || target == op1;
+    return (REG_P (op1) && !REG_P (op0)) || target == op1
+	    || (REG_P (op0) && !REG_POINTER (op0)
+		&& REG_P (op1) && REG_POINTER (op1));
   else
-    return rtx_equal_p (op1, target);
+    return rtx_equal_p (op1, target)
+	   || (REG_P (op0) && !REG_POINTER (op0)
+	       && REG_P (op1) && REG_POINTER (op1));
 }
 
 
Index: rtlanal.c
===================================================================
--- rtlanal.c	(revision 120449)
+++ rtlanal.c	(working copy)
@@ -2835,8 +2835,18 @@ commutative_operand_precedence (rtx op)
 int
 swap_commutative_operands_p (rtx x, rtx y)
 {
-  return (commutative_operand_precedence (x)
-	  < commutative_operand_precedence (y));
+  int precedence_x = commutative_operand_precedence (x);
+  int precedence_y = commutative_operand_precedence (y);
+
+  /* Everything else being equal, preference registers with pointers
+     over registers without pointers.  This is desirable for performance
+     reasons on indexed load/store instructions for some processors.  */
+  if ((precedence_x == precedence_y)
+      && REG_P (x) && !REG_POINTER (x)
+      && REG_P (y) && REG_POINTER (y))
+    return 1;
+
+  return precedence_x < precedence_y;
 }
 
 /* Return 1 if X is an autoincrement side effect and the register is
Index: tree-ssa-address.c
===================================================================
--- tree-ssa-address.c	(revision 120449)
+++ tree-ssa-address.c	(working copy)
@@ -125,7 +125,7 @@ gen_addr_rtx (rtx symbol, rtx base, rtx 
   if (base)
     {
       if (*addr)
-	*addr = gen_rtx_PLUS (Pmode, *addr, base);
+	*addr = simplify_gen_binary (PLUS, Pmode, base, *addr);
       else
 	*addr = base;
     }
Index: simplify-rtx.c
===================================================================
--- simplify-rtx.c	(revision 120449)
+++ simplify-rtx.c	(working copy)
@@ -3205,11 +3205,18 @@ simplify_plus_minus_op_data_cmp (const v
   if (result)
     return result;
 
-  /* Group together equal REGs to do more simplification.  */
+  /* Group together equal REGs to do more simplification.
+     As a special case, we group REG_POINTERS before other regs.  */
   if (REG_P (d1->op) && REG_P (d2->op))
-    return REGNO (d1->op) - REGNO (d2->op);
+    {
+      if (REG_POINTER (d1->op) && REG_POINTER (d2->op)
+	  || !REG_POINTER (d1->op) && !REG_POINTER (d2->op))
+	return REGNO (d1->op) - REGNO (d2->op);
+      else
+	return REG_POINTER (d2->op);
+    }
   else
-    return 0;
+    return REG_P (d2->op) && REG_POINTER (d2->op);
 }
 
 static rtx



More information about the Gcc-patches mailing list