This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
[PATCH] Improved target tuning in simplify-rtx.c
- From: Roger Sayle <roger at eyesopen dot com>
- To: gcc-patches at gcc dot gnu dot org
- Date: Mon, 14 Jun 2004 12:53:48 -0600 (MDT)
- Subject: [PATCH] Improved target tuning in simplify-rtx.c
The following patch makes use of the backend's rtx_cost function when
deciding whether or not to apply transformations in simplify-rtx.c.
The affected parts of simplify_binary_operation below implement the
transformation of "x+x" to "x<<1" in the RTL optimizers. This is
reasonable on the x86, as although Pentium4 shifts are more expensive
than additions, shift by one is special cased by i386.md's ashlsi3
which then emits an addition. However investigating this revealed
some fragile assumptions being made in the RTL optimizers.
It turns out that the "X*C + X", and "X*C - X" classes of transformations
in simplify_binary_operation already use a very simple "cost" model to
decide whether to apply these transformations. The test is that we
refuse to apply this transformation if the resulting tree contains a
multiplication and the original tree did not. The heuristic works
moderately, well but ignores the costs of shifts and other operations,
and overlooks that some multiplications "X*4 + X" can be more efficient
than others, such as "X*5".
Instrumenting a bootstrap of all default languages on i686-pc-linux-gnu
produces the following table for this transformation in the <PLUS> section
of simplify_binary_operation.
Mult Mult Cost Cost
Rank Count Before After Before After CVS Patch Diff
1 17741 No Yes 8 16 No No
2 2392 No Yes 24 24 No Yes +
3 846 No No 20 12 Yes Yes
4 736 No Yes 16 20 No No
5 484 Yes Yes 4 16 Yes No -
6 466 No Yes 12 20 No No
7 392 No Yes 20 24 No No
8 170 No No 4 4 Yes Yes
9 89 No Yes 32 28 No Yes +
10 89 No Yes 28 28 No Yes +
11 84 No Yes 40 32 No Yes +
12 84 No Yes 28 20 No Yes +
13 42 Yes Yes 36 16 Yes Yes
15 38 Yes Yes 20 16 Yes Yes
16 30 No Yes 52 40 No Yes +
17 30 No Yes 26 25 No Yes +
18 24 No Yes 36 32 No Yes +
19 22 No Yes 64 44 No Yes +
20 18 Yes Yes 24 16 Yes Yes
21 18 Yes No 20 4 Yes Yes
22 17 No Yes 48 36 No Yes +
23 8 No Yes 12 16 No No
25 6 No Yes 84 56 No Yes +
26 6 No Yes 72 48 No Yes +
27 6 No Yes 68 48 No Yes +
28 6 No Yes 44 36 No Yes +
29 4 No Yes 88 56 No Yes +
30 4 No No 8 8 Yes Yes
31 3 No Yes 34 29 No Yes +
32 2 No Yes 42 33 No Yes +
This transformation is applicable about 28K times. The most common
case occurs 17741 times, which attempts to turn a tree without a
multiplication costing 8, into a tree with a multiplication that
costs 16. Both mainline CVS and the patch below agree this isn't
beneficial. But for many of the cases above, the exisiting heuristic
of not introducing multiplications doesn't accurately reflect the
benefit as estimated by the target's costs.
The following patch has been tested on i686-pc-linux-gnu with a full
"make bootstrap", all default languages, and regression tested with a
top-level "make -k check" with no new regressions.
Ok for mainline?
2004-06-14 Roger Sayle <roger@eyesopen.com>
* simplify-rtx.c (simplify_binary_operation) <PLUS, MINUS>: Use
rtx_cost instead of "had_mult" to determine whether the transformed
expression is cheaper than the original.
Index: simplify-rtx.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/simplify-rtx.c,v
retrieving revision 1.192
diff -c -3 -p -r1.192 simplify-rtx.c
*** simplify-rtx.c 27 May 2004 08:28:29 -0000 1.192
--- simplify-rtx.c 14 Jun 2004 01:43:27 -0000
*************** simplify_binary_operation (enum rtx_code
*** 1471,1483 ****
if the multiplication is written as a shift. If so, we can
distribute and make a new multiply, shift, or maybe just
have X (if C is 2 in the example above). But don't make
! real multiply if we didn't have one before. */
if (! FLOAT_MODE_P (mode))
{
HOST_WIDE_INT coeff0 = 1, coeff1 = 1;
rtx lhs = op0, rhs = op1;
- int had_mult = 0;
if (GET_CODE (lhs) == NEG)
coeff0 = -1, lhs = XEXP (lhs, 0);
--- 1471,1482 ----
if the multiplication is written as a shift. If so, we can
distribute and make a new multiply, shift, or maybe just
have X (if C is 2 in the example above). But don't make
! something more expensive than we had before. */
if (! FLOAT_MODE_P (mode))
{
HOST_WIDE_INT coeff0 = 1, coeff1 = 1;
rtx lhs = op0, rhs = op1;
if (GET_CODE (lhs) == NEG)
coeff0 = -1, lhs = XEXP (lhs, 0);
*************** simplify_binary_operation (enum rtx_code
*** 1485,1491 ****
&& GET_CODE (XEXP (lhs, 1)) == CONST_INT)
{
coeff0 = INTVAL (XEXP (lhs, 1)), lhs = XEXP (lhs, 0);
- had_mult = 1;
}
else if (GET_CODE (lhs) == ASHIFT
&& GET_CODE (XEXP (lhs, 1)) == CONST_INT
--- 1484,1489 ----
*************** simplify_binary_operation (enum rtx_code
*** 1502,1508 ****
&& GET_CODE (XEXP (rhs, 1)) == CONST_INT)
{
coeff1 = INTVAL (XEXP (rhs, 1)), rhs = XEXP (rhs, 0);
- had_mult = 1;
}
else if (GET_CODE (rhs) == ASHIFT
&& GET_CODE (XEXP (rhs, 1)) == CONST_INT
--- 1500,1505 ----
*************** simplify_binary_operation (enum rtx_code
*** 1515,1523 ****
if (rtx_equal_p (lhs, rhs))
{
tem = simplify_gen_binary (MULT, mode, lhs,
! GEN_INT (coeff0 + coeff1));
! return (GET_CODE (tem) == MULT && ! had_mult) ? 0 : tem;
}
}
--- 1512,1522 ----
if (rtx_equal_p (lhs, rhs))
{
+ rtx orig = gen_rtx_PLUS (mode, op0, op1);
tem = simplify_gen_binary (MULT, mode, lhs,
! GEN_INT (coeff0 + coeff1));
! return rtx_cost (tem, SET) <= rtx_cost (orig, SET)
! ? tem : 0;
}
}
*************** simplify_binary_operation (enum rtx_code
*** 1626,1638 ****
if the multiplication is written as a shift. If so, we can
distribute and make a new multiply, shift, or maybe just
have X (if C is 2 in the example above). But don't make
! real multiply if we didn't have one before. */
if (! FLOAT_MODE_P (mode))
{
HOST_WIDE_INT coeff0 = 1, coeff1 = 1;
rtx lhs = op0, rhs = op1;
- int had_mult = 0;
if (GET_CODE (lhs) == NEG)
coeff0 = -1, lhs = XEXP (lhs, 0);
--- 1625,1636 ----
if the multiplication is written as a shift. If so, we can
distribute and make a new multiply, shift, or maybe just
have X (if C is 2 in the example above). But don't make
! something more expensive than we had before. */
if (! FLOAT_MODE_P (mode))
{
HOST_WIDE_INT coeff0 = 1, coeff1 = 1;
rtx lhs = op0, rhs = op1;
if (GET_CODE (lhs) == NEG)
coeff0 = -1, lhs = XEXP (lhs, 0);
*************** simplify_binary_operation (enum rtx_code
*** 1640,1646 ****
&& GET_CODE (XEXP (lhs, 1)) == CONST_INT)
{
coeff0 = INTVAL (XEXP (lhs, 1)), lhs = XEXP (lhs, 0);
- had_mult = 1;
}
else if (GET_CODE (lhs) == ASHIFT
&& GET_CODE (XEXP (lhs, 1)) == CONST_INT
--- 1638,1643 ----
*************** simplify_binary_operation (enum rtx_code
*** 1657,1663 ****
&& GET_CODE (XEXP (rhs, 1)) == CONST_INT)
{
coeff1 = INTVAL (XEXP (rhs, 1)), rhs = XEXP (rhs, 0);
- had_mult = 1;
}
else if (GET_CODE (rhs) == ASHIFT
&& GET_CODE (XEXP (rhs, 1)) == CONST_INT
--- 1654,1659 ----
*************** simplify_binary_operation (enum rtx_code
*** 1670,1678 ****
if (rtx_equal_p (lhs, rhs))
{
tem = simplify_gen_binary (MULT, mode, lhs,
GEN_INT (coeff0 - coeff1));
! return (GET_CODE (tem) == MULT && ! had_mult) ? 0 : tem;
}
}
--- 1666,1676 ----
if (rtx_equal_p (lhs, rhs))
{
+ rtx orig = gen_rtx_MINUS (mode, op0, op1);
tem = simplify_gen_binary (MULT, mode, lhs,
GEN_INT (coeff0 - coeff1));
! return rtx_cost (tem, SET) <= rtx_cost (orig, SET)
! ? tem : 0;
}
}
Roger
--
Roger Sayle, E-mail: roger@eyesopen.com
OpenEye Scientific Software, WWW: http://www.eyesopen.com/
Suite 1107, 3600 Cerrillos Road, Tel: (+1) 505-473-7385
Santa Fe, New Mexico, 87507. Fax: (+1) 505-473-0833