This is the mail archive of the
`gcc-patches@gcc.gnu.org`
mailing list for the GCC project.

Index Nav: | [Date Index] [Subject Index] [Author Index] [Thread Index] | |
---|---|---|

Message Nav: | [Date Prev] [Date Next] | [Thread Prev] [Thread Next] |

Other format: | [Raw text] |

*From*: Roger Sayle <roger at eyesopen dot com>*To*: gcc-patches at gcc dot gnu dot org*Date*: Mon, 14 Jun 2004 12:53:48 -0600 (MDT)*Subject*: [PATCH] Improved target tuning in simplify-rtx.c

The following patch makes use of the backend's rtx_cost function when deciding whether or not to apply transformations in simplify-rtx.c. The affected parts of simplify_binary_operation below implement the transformation of "x+x" to "x<<1" in the RTL optimizers. This is reasonable on the x86, as although Pentium4 shifts are more expensive than additions, shift by one is special cased by i386.md's ashlsi3 which then emits an addition. However investigating this revealed some fragile assumptions being made in the RTL optimizers. It turns out that the "X*C + X", and "X*C - X" classes of transformations in simplify_binary_operation already use a very simple "cost" model to decide whether to apply these transformations. The test is that we refuse to apply this transformation if the resulting tree contains a multiplication and the original tree did not. The heuristic works moderately, well but ignores the costs of shifts and other operations, and overlooks that some multiplications "X*4 + X" can be more efficient than others, such as "X*5". Instrumenting a bootstrap of all default languages on i686-pc-linux-gnu produces the following table for this transformation in the <PLUS> section of simplify_binary_operation. Mult Mult Cost Cost Rank Count Before After Before After CVS Patch Diff 1 17741 No Yes 8 16 No No 2 2392 No Yes 24 24 No Yes + 3 846 No No 20 12 Yes Yes 4 736 No Yes 16 20 No No 5 484 Yes Yes 4 16 Yes No - 6 466 No Yes 12 20 No No 7 392 No Yes 20 24 No No 8 170 No No 4 4 Yes Yes 9 89 No Yes 32 28 No Yes + 10 89 No Yes 28 28 No Yes + 11 84 No Yes 40 32 No Yes + 12 84 No Yes 28 20 No Yes + 13 42 Yes Yes 36 16 Yes Yes 15 38 Yes Yes 20 16 Yes Yes 16 30 No Yes 52 40 No Yes + 17 30 No Yes 26 25 No Yes + 18 24 No Yes 36 32 No Yes + 19 22 No Yes 64 44 No Yes + 20 18 Yes Yes 24 16 Yes Yes 21 18 Yes No 20 4 Yes Yes 22 17 No Yes 48 36 No Yes + 23 8 No Yes 12 16 No No 25 6 No Yes 84 56 No Yes + 26 6 No Yes 72 48 No Yes + 27 6 No Yes 68 48 No Yes + 28 6 No Yes 44 36 No Yes + 29 4 No Yes 88 56 No Yes + 30 4 No No 8 8 Yes Yes 31 3 No Yes 34 29 No Yes + 32 2 No Yes 42 33 No Yes + This transformation is applicable about 28K times. The most common case occurs 17741 times, which attempts to turn a tree without a multiplication costing 8, into a tree with a multiplication that costs 16. Both mainline CVS and the patch below agree this isn't beneficial. But for many of the cases above, the exisiting heuristic of not introducing multiplications doesn't accurately reflect the benefit as estimated by the target's costs. The following patch has been tested on i686-pc-linux-gnu with a full "make bootstrap", all default languages, and regression tested with a top-level "make -k check" with no new regressions. Ok for mainline? 2004-06-14 Roger Sayle <roger@eyesopen.com> * simplify-rtx.c (simplify_binary_operation) <PLUS, MINUS>: Use rtx_cost instead of "had_mult" to determine whether the transformed expression is cheaper than the original. Index: simplify-rtx.c =================================================================== RCS file: /cvs/gcc/gcc/gcc/simplify-rtx.c,v retrieving revision 1.192 diff -c -3 -p -r1.192 simplify-rtx.c *** simplify-rtx.c 27 May 2004 08:28:29 -0000 1.192 --- simplify-rtx.c 14 Jun 2004 01:43:27 -0000 *************** simplify_binary_operation (enum rtx_code *** 1471,1483 **** if the multiplication is written as a shift. If so, we can distribute and make a new multiply, shift, or maybe just have X (if C is 2 in the example above). But don't make ! real multiply if we didn't have one before. */ if (! FLOAT_MODE_P (mode)) { HOST_WIDE_INT coeff0 = 1, coeff1 = 1; rtx lhs = op0, rhs = op1; - int had_mult = 0; if (GET_CODE (lhs) == NEG) coeff0 = -1, lhs = XEXP (lhs, 0); --- 1471,1482 ---- if the multiplication is written as a shift. If so, we can distribute and make a new multiply, shift, or maybe just have X (if C is 2 in the example above). But don't make ! something more expensive than we had before. */ if (! FLOAT_MODE_P (mode)) { HOST_WIDE_INT coeff0 = 1, coeff1 = 1; rtx lhs = op0, rhs = op1; if (GET_CODE (lhs) == NEG) coeff0 = -1, lhs = XEXP (lhs, 0); *************** simplify_binary_operation (enum rtx_code *** 1485,1491 **** && GET_CODE (XEXP (lhs, 1)) == CONST_INT) { coeff0 = INTVAL (XEXP (lhs, 1)), lhs = XEXP (lhs, 0); - had_mult = 1; } else if (GET_CODE (lhs) == ASHIFT && GET_CODE (XEXP (lhs, 1)) == CONST_INT --- 1484,1489 ---- *************** simplify_binary_operation (enum rtx_code *** 1502,1508 **** && GET_CODE (XEXP (rhs, 1)) == CONST_INT) { coeff1 = INTVAL (XEXP (rhs, 1)), rhs = XEXP (rhs, 0); - had_mult = 1; } else if (GET_CODE (rhs) == ASHIFT && GET_CODE (XEXP (rhs, 1)) == CONST_INT --- 1500,1505 ---- *************** simplify_binary_operation (enum rtx_code *** 1515,1523 **** if (rtx_equal_p (lhs, rhs)) { tem = simplify_gen_binary (MULT, mode, lhs, ! GEN_INT (coeff0 + coeff1)); ! return (GET_CODE (tem) == MULT && ! had_mult) ? 0 : tem; } } --- 1512,1522 ---- if (rtx_equal_p (lhs, rhs)) { + rtx orig = gen_rtx_PLUS (mode, op0, op1); tem = simplify_gen_binary (MULT, mode, lhs, ! GEN_INT (coeff0 + coeff1)); ! return rtx_cost (tem, SET) <= rtx_cost (orig, SET) ! ? tem : 0; } } *************** simplify_binary_operation (enum rtx_code *** 1626,1638 **** if the multiplication is written as a shift. If so, we can distribute and make a new multiply, shift, or maybe just have X (if C is 2 in the example above). But don't make ! real multiply if we didn't have one before. */ if (! FLOAT_MODE_P (mode)) { HOST_WIDE_INT coeff0 = 1, coeff1 = 1; rtx lhs = op0, rhs = op1; - int had_mult = 0; if (GET_CODE (lhs) == NEG) coeff0 = -1, lhs = XEXP (lhs, 0); --- 1625,1636 ---- if the multiplication is written as a shift. If so, we can distribute and make a new multiply, shift, or maybe just have X (if C is 2 in the example above). But don't make ! something more expensive than we had before. */ if (! FLOAT_MODE_P (mode)) { HOST_WIDE_INT coeff0 = 1, coeff1 = 1; rtx lhs = op0, rhs = op1; if (GET_CODE (lhs) == NEG) coeff0 = -1, lhs = XEXP (lhs, 0); *************** simplify_binary_operation (enum rtx_code *** 1640,1646 **** && GET_CODE (XEXP (lhs, 1)) == CONST_INT) { coeff0 = INTVAL (XEXP (lhs, 1)), lhs = XEXP (lhs, 0); - had_mult = 1; } else if (GET_CODE (lhs) == ASHIFT && GET_CODE (XEXP (lhs, 1)) == CONST_INT --- 1638,1643 ---- *************** simplify_binary_operation (enum rtx_code *** 1657,1663 **** && GET_CODE (XEXP (rhs, 1)) == CONST_INT) { coeff1 = INTVAL (XEXP (rhs, 1)), rhs = XEXP (rhs, 0); - had_mult = 1; } else if (GET_CODE (rhs) == ASHIFT && GET_CODE (XEXP (rhs, 1)) == CONST_INT --- 1654,1659 ---- *************** simplify_binary_operation (enum rtx_code *** 1670,1678 **** if (rtx_equal_p (lhs, rhs)) { tem = simplify_gen_binary (MULT, mode, lhs, GEN_INT (coeff0 - coeff1)); ! return (GET_CODE (tem) == MULT && ! had_mult) ? 0 : tem; } } --- 1666,1676 ---- if (rtx_equal_p (lhs, rhs)) { + rtx orig = gen_rtx_MINUS (mode, op0, op1); tem = simplify_gen_binary (MULT, mode, lhs, GEN_INT (coeff0 - coeff1)); ! return rtx_cost (tem, SET) <= rtx_cost (orig, SET) ! ? tem : 0; } } Roger -- Roger Sayle, E-mail: roger@eyesopen.com OpenEye Scientific Software, WWW: http://www.eyesopen.com/ Suite 1107, 3600 Cerrillos Road, Tel: (+1) 505-473-7385 Santa Fe, New Mexico, 87507. Fax: (+1) 505-473-0833

**Follow-Ups**:**Re: [PATCH] Improved target tuning in simplify-rtx.c***From:*Eric Christopher

**Re: [PATCH] Improved target tuning in simplify-rtx.c***From:*tm_gccmail

Index Nav: | [Date Index] [Subject Index] [Author Index] [Thread Index] | |
---|---|---|

Message Nav: | [Date Prev] [Date Next] | [Thread Prev] [Thread Next] |