This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
[PATCH] Fold integer divisions to arithmetic shifts
- From: Dmitry Melnik <dm at ispras dot ru>
- To: gcc-patches at gcc dot gnu dot org
- Date: Thu, 14 Oct 2010 19:58:18 +0400
- Subject: [PATCH] Fold integer divisions to arithmetic shifts
We've found that in Webkit code the following pattern is frequently used to
implement arithmetic shift right by 4 bits:
r = (v& (~15)) / 16;
This is done to comply with C99 standard, which defines arithmetic
shifts in C as implementation-dependent.
Unfortunately, GCC can't transform this code back directly into an
arithmetic shift, but it rather expresses division through arithmetic
shift as follows:
unsigned r, v = ...;
t = (v&& (~15));
if( t> 0 )
r = t>> 4;
else
r = (t + 15)>> 4;
To fix this, we wrote a new transformation to fold-const.c.
It recognizes such pattern as an expression for arithmetic shift.
Original source code:
int foo(int a)
{
int x = (a& (~15)) / 16;
return x;
}
Original output:
bic r0, r0, #15
add r3, r0, #15
cmp r0, #0
movlt r0, r3
mov r0, r0, asr #4
Output with new transformation:
mov r0, r0, asr #4
We have tested the patch on Webkit. It have shown very small
(about 0.5%) speedup on SunSpider JavaScript test, and 12Kb
(0.16%) code size reduction of the stripped 7.5Mb shared library
binary.
Bootstrapped and regtested on x86-64-linux.
OK for trunk?
--
Best regards,
Dmitry
2010-10-13 Dmitry Melnik <dm@ispras.ru>
gcc/
* fold-const.c (fold_binary_loc): New transformation.
gcc/testsuite/
* gcc.target/arm/20101013-1.c: New test.
diff --git a/gcc/fold-const.c b/gcc/fold-const.c
index 8146920..de0f350 100644
--- a/gcc/fold-const.c
+++ b/gcc/fold-const.c
@@ -11599,6 +11599,22 @@ fold_binary_loc (location_t loc,
return NULL_TREE;
case TRUNC_DIV_EXPR:
+ /* Optimize (X & (-A)) / A where A is a power of 2,
+ to X >> log2(A) */
+ if (TREE_CODE (arg0) == BIT_AND_EXPR
+ && !TYPE_UNSIGNED (type)
+ && integer_pow2p (arg1) && tree_int_cst_sgn (arg1) > 0)
+ {
+ tree sum = fold_binary_loc (loc, PLUS_EXPR, TREE_TYPE (arg1),
+ arg1, TREE_OPERAND (arg0, 1));
+ if (sum && integer_zerop (sum))
+ return fold_build2_loc (loc, RSHIFT_EXPR, type,
+ TREE_OPERAND (arg0, 0),
+ build_int_cst (NULL_TREE,
+ exact_log2 (TREE_INT_CST_LOW (arg1))));
+ }
+
+ /* Fall thru */
case FLOOR_DIV_EXPR:
/* Simplify A / (B << N) where A and B are positive and B is
a power of 2, to A >> (N + log2(B)). */
diff --git a/gcc/testsuite/gcc.target/arm/20101013-1.c b/gcc/testsuite/gcc.target/arm/20101013-1.c
new file mode 100644
index 0000000..34b19d7
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/20101013-1.c
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+int foo(int a)
+{
+ int x = (a & (~15)) / 16;
+ return x;
+}
+
+/* { dg-final { scan-assembler-not "cmp" } } */