This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

[PATCH] Fold integer divisions to arithmetic shifts


We've found that in Webkit code the following pattern is frequently used to
implement arithmetic shift right by 4 bits:

r = (v& (~15)) / 16;

This is done to comply with C99 standard, which defines arithmetic
shifts in C as implementation-dependent.
Unfortunately, GCC can't transform this code back directly into an
arithmetic shift, but it rather expresses division through arithmetic
shift as follows:

       unsigned r, v = ...;
       t = (v&&  (~15));
       if( t>  0 )
         r = t>>  4;
       else
         r = (t + 15)>>  4;

To fix this, we wrote a new transformation to fold-const.c.
It recognizes such pattern as an expression for arithmetic shift.

Original source code:

     int foo(int a)
     {
       int x = (a&  (~15)) / 16;
       return x;
     }

Original output:

       bic     r0, r0, #15
       add     r3, r0, #15
       cmp     r0, #0
       movlt   r0, r3
       mov     r0, r0, asr #4

Output with new transformation:

mov r0, r0, asr #4


We have tested the patch on Webkit. It have shown very small (about 0.5%) speedup on SunSpider JavaScript test, and 12Kb (0.16%) code size reduction of the stripped 7.5Mb shared library binary. Bootstrapped and regtested on x86-64-linux.

OK for trunk?


-- Best regards, Dmitry


2010-10-13  Dmitry Melnik  <dm@ispras.ru>

gcc/
	* fold-const.c (fold_binary_loc): New transformation.

gcc/testsuite/
	* gcc.target/arm/20101013-1.c: New test.

diff --git a/gcc/fold-const.c b/gcc/fold-const.c
index 8146920..de0f350 100644
--- a/gcc/fold-const.c
+++ b/gcc/fold-const.c
@@ -11599,6 +11599,22 @@ fold_binary_loc (location_t loc,
       return NULL_TREE;

     case TRUNC_DIV_EXPR:
+      /* Optimize (X & (-A)) / A where A is a power of 2,
+        to X >> log2(A) */
+      if (TREE_CODE (arg0) == BIT_AND_EXPR
+         && !TYPE_UNSIGNED (type)
+         && integer_pow2p (arg1) && tree_int_cst_sgn (arg1) > 0)
+       {
+         tree sum = fold_binary_loc (loc, PLUS_EXPR, TREE_TYPE (arg1),
+                                     arg1, TREE_OPERAND (arg0, 1));
+         if (sum && integer_zerop (sum))
+           return fold_build2_loc (loc, RSHIFT_EXPR, type,
+                         TREE_OPERAND (arg0, 0),
+                         build_int_cst (NULL_TREE,
+                               exact_log2 (TREE_INT_CST_LOW (arg1))));
+       }
+
+      /* Fall thru */
     case FLOOR_DIV_EXPR:
       /* Simplify A / (B << N) where A and B are positive and B is
         a power of 2, to A >> (N + log2(B)).  */
diff --git a/gcc/testsuite/gcc.target/arm/20101013-1.c b/gcc/testsuite/gcc.target/arm/20101013-1.c
new file mode 100644
index 0000000..34b19d7
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/20101013-1.c
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+int foo(int a)
+{
+  int x = (a & (~15)) / 16;
+  return x;
+}
+
+/* { dg-final { scan-assembler-not "cmp" } } */

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]