[PATCH] Improved target tuning in simplify-rtx.c

Roger Sayle roger@eyesopen.com
Mon Jun 14 21:40:00 GMT 2004


The following patch makes use of the backend's rtx_cost function when
deciding whether or not to apply transformations in simplify-rtx.c.
The affected parts of simplify_binary_operation below implement the
transformation of "x+x" to "x<<1" in the RTL optimizers.  This is
reasonable on the x86, as although Pentium4 shifts are more expensive
than additions, shift by one is special cased by i386.md's ashlsi3
which then emits an addition.  However investigating this revealed
some fragile assumptions being made in the RTL optimizers.

It turns out that the "X*C + X", and "X*C - X" classes of transformations
in simplify_binary_operation already use a very simple "cost" model to
decide whether to apply these transformations.  The test is that we
refuse to apply this transformation if the resulting tree contains a
multiplication and the original tree did not.  The heuristic works
moderately, well but ignores the costs of shifts and other operations,
and overlooks that some multiplications "X*4 + X" can be more efficient
than others, such as "X*5".

Instrumenting a bootstrap of all default languages on i686-pc-linux-gnu
produces the following table for this transformation in the <PLUS> section
of simplify_binary_operation.

             Mult    Mult   Cost    Cost
Rank  Count  Before  After  Before  After  CVS   Patch  Diff
   1  17741  No      Yes       8      16   No    No
   2   2392  No      Yes      24      24   No    Yes    +
   3    846  No      No       20      12   Yes   Yes
   4    736  No      Yes      16      20   No    No
   5    484  Yes     Yes       4      16   Yes   No     -
   6    466  No      Yes      12      20   No    No
   7    392  No      Yes      20      24   No    No
   8    170  No      No        4       4   Yes   Yes
   9     89  No      Yes      32      28   No    Yes    +
  10     89  No      Yes      28      28   No    Yes    +
  11     84  No      Yes      40      32   No    Yes    +
  12     84  No      Yes      28      20   No    Yes    +
  13     42  Yes     Yes      36      16   Yes   Yes
  15     38  Yes     Yes      20      16   Yes   Yes
  16     30  No      Yes      52      40   No    Yes    +
  17     30  No      Yes      26      25   No    Yes    +
  18     24  No      Yes      36      32   No    Yes    +
  19     22  No      Yes      64      44   No    Yes    +
  20     18  Yes     Yes      24      16   Yes   Yes
  21     18  Yes     No       20       4   Yes   Yes
  22     17  No      Yes      48      36   No    Yes    +
  23      8  No      Yes      12      16   No    No
  25      6  No      Yes      84      56   No    Yes    +
  26      6  No      Yes      72      48   No    Yes    +
  27      6  No      Yes      68      48   No    Yes    +
  28      6  No      Yes      44      36   No    Yes    +
  29      4  No      Yes      88      56   No    Yes    +
  30      4  No      No        8       8   Yes   Yes
  31      3  No      Yes      34      29   No    Yes    +
  32      2  No      Yes      42      33   No    Yes    +


This transformation is applicable about 28K times.  The most common
case occurs 17741 times, which attempts to turn a tree without a
multiplication costing 8, into a tree with a multiplication that
costs 16.  Both mainline CVS and the patch below agree this isn't
beneficial.  But for many of the cases above, the exisiting heuristic
of not introducing multiplications doesn't accurately reflect the
benefit as estimated by the target's costs.


The following patch has been tested on i686-pc-linux-gnu with a full
"make bootstrap", all default languages, and regression tested with a
top-level "make -k check" with no new regressions.

Ok for mainline?


2004-06-14  Roger Sayle  <roger@eyesopen.com>

	* simplify-rtx.c (simplify_binary_operation) <PLUS, MINUS>: Use
	rtx_cost instead of "had_mult" to determine whether the transformed
	expression is cheaper than the original.


Index: simplify-rtx.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/simplify-rtx.c,v
retrieving revision 1.192
diff -c -3 -p -r1.192 simplify-rtx.c
*** simplify-rtx.c	27 May 2004 08:28:29 -0000	1.192
--- simplify-rtx.c	14 Jun 2004 01:43:27 -0000
*************** simplify_binary_operation (enum rtx_code
*** 1471,1483 ****
  	     if the multiplication is written as a shift.  If so, we can
  	     distribute and make a new multiply, shift, or maybe just
  	     have X (if C is 2 in the example above).  But don't make
! 	     real multiply if we didn't have one before.  */

  	  if (! FLOAT_MODE_P (mode))
  	    {
  	      HOST_WIDE_INT coeff0 = 1, coeff1 = 1;
  	      rtx lhs = op0, rhs = op1;
- 	      int had_mult = 0;

  	      if (GET_CODE (lhs) == NEG)
  		coeff0 = -1, lhs = XEXP (lhs, 0);
--- 1471,1482 ----
  	     if the multiplication is written as a shift.  If so, we can
  	     distribute and make a new multiply, shift, or maybe just
  	     have X (if C is 2 in the example above).  But don't make
! 	     something more expensive than we had before.  */

  	  if (! FLOAT_MODE_P (mode))
  	    {
  	      HOST_WIDE_INT coeff0 = 1, coeff1 = 1;
  	      rtx lhs = op0, rhs = op1;

  	      if (GET_CODE (lhs) == NEG)
  		coeff0 = -1, lhs = XEXP (lhs, 0);
*************** simplify_binary_operation (enum rtx_code
*** 1485,1491 ****
  		       && GET_CODE (XEXP (lhs, 1)) == CONST_INT)
  		{
  		  coeff0 = INTVAL (XEXP (lhs, 1)), lhs = XEXP (lhs, 0);
- 		  had_mult = 1;
  		}
  	      else if (GET_CODE (lhs) == ASHIFT
  		       && GET_CODE (XEXP (lhs, 1)) == CONST_INT
--- 1484,1489 ----
*************** simplify_binary_operation (enum rtx_code
*** 1502,1508 ****
  		       && GET_CODE (XEXP (rhs, 1)) == CONST_INT)
  		{
  		  coeff1 = INTVAL (XEXP (rhs, 1)), rhs = XEXP (rhs, 0);
- 		  had_mult = 1;
  		}
  	      else if (GET_CODE (rhs) == ASHIFT
  		       && GET_CODE (XEXP (rhs, 1)) == CONST_INT
--- 1500,1505 ----
*************** simplify_binary_operation (enum rtx_code
*** 1515,1523 ****

  	      if (rtx_equal_p (lhs, rhs))
  		{
  		  tem = simplify_gen_binary (MULT, mode, lhs,
! 					GEN_INT (coeff0 + coeff1));
! 		  return (GET_CODE (tem) == MULT && ! had_mult) ? 0 : tem;
  		}
  	    }

--- 1512,1522 ----

  	      if (rtx_equal_p (lhs, rhs))
  		{
+ 		  rtx orig = gen_rtx_PLUS (mode, op0, op1);
  		  tem = simplify_gen_binary (MULT, mode, lhs,
! 					     GEN_INT (coeff0 + coeff1));
! 		  return rtx_cost (tem, SET) <= rtx_cost (orig, SET)
! 			 ? tem : 0;
  		}
  	    }

*************** simplify_binary_operation (enum rtx_code
*** 1626,1638 ****
  	     if the multiplication is written as a shift.  If so, we can
  	     distribute and make a new multiply, shift, or maybe just
  	     have X (if C is 2 in the example above).  But don't make
! 	     real multiply if we didn't have one before.  */

  	  if (! FLOAT_MODE_P (mode))
  	    {
  	      HOST_WIDE_INT coeff0 = 1, coeff1 = 1;
  	      rtx lhs = op0, rhs = op1;
- 	      int had_mult = 0;

  	      if (GET_CODE (lhs) == NEG)
  		coeff0 = -1, lhs = XEXP (lhs, 0);
--- 1625,1636 ----
  	     if the multiplication is written as a shift.  If so, we can
  	     distribute and make a new multiply, shift, or maybe just
  	     have X (if C is 2 in the example above).  But don't make
! 	     something more expensive than we had before.  */

  	  if (! FLOAT_MODE_P (mode))
  	    {
  	      HOST_WIDE_INT coeff0 = 1, coeff1 = 1;
  	      rtx lhs = op0, rhs = op1;

  	      if (GET_CODE (lhs) == NEG)
  		coeff0 = -1, lhs = XEXP (lhs, 0);
*************** simplify_binary_operation (enum rtx_code
*** 1640,1646 ****
  		       && GET_CODE (XEXP (lhs, 1)) == CONST_INT)
  		{
  		  coeff0 = INTVAL (XEXP (lhs, 1)), lhs = XEXP (lhs, 0);
- 		  had_mult = 1;
  		}
  	      else if (GET_CODE (lhs) == ASHIFT
  		       && GET_CODE (XEXP (lhs, 1)) == CONST_INT
--- 1638,1643 ----
*************** simplify_binary_operation (enum rtx_code
*** 1657,1663 ****
  		       && GET_CODE (XEXP (rhs, 1)) == CONST_INT)
  		{
  		  coeff1 = INTVAL (XEXP (rhs, 1)), rhs = XEXP (rhs, 0);
- 		  had_mult = 1;
  		}
  	      else if (GET_CODE (rhs) == ASHIFT
  		       && GET_CODE (XEXP (rhs, 1)) == CONST_INT
--- 1654,1659 ----
*************** simplify_binary_operation (enum rtx_code
*** 1670,1678 ****

  	      if (rtx_equal_p (lhs, rhs))
  		{
  		  tem = simplify_gen_binary (MULT, mode, lhs,
  					     GEN_INT (coeff0 - coeff1));
! 		  return (GET_CODE (tem) == MULT && ! had_mult) ? 0 : tem;
  		}
  	    }

--- 1666,1676 ----

  	      if (rtx_equal_p (lhs, rhs))
  		{
+ 		  rtx orig = gen_rtx_MINUS (mode, op0, op1);
  		  tem = simplify_gen_binary (MULT, mode, lhs,
  					     GEN_INT (coeff0 - coeff1));
! 		  return rtx_cost (tem, SET) <= rtx_cost (orig, SET)
! 			 ? tem : 0;
  		}
  	    }


Roger
--
Roger Sayle,                         E-mail: roger@eyesopen.com
OpenEye Scientific Software,         WWW: http://www.eyesopen.com/
Suite 1107, 3600 Cerrillos Road,     Tel: (+1) 505-473-7385
Santa Fe, New Mexico, 87507.         Fax: (+1) 505-473-0833



More information about the Gcc-patches mailing list