This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



[PATCH/RFC] Simplify wrapped RTL op


Hi,

as announced in the wrapped-binop gimple patch mail, on s390 we still
emit odd code in front of loops:

  void v1 (unsigned long *in, unsigned long *out, unsigned int n)
  {
    int i;
    for (i = 0; i < n; i++)
    {
      out[i] = in[i];
    }
  }

   -->

   aghi    %r1,-8
   srlg    %r1,%r1,3
   aghi    %r1,1

doloop creates this after obtaining niter from the loop as n - 1, or
"n * 8 - 8" with a step width of 8.  Since s390's doloop pattern
compares against 1, we add 1 to niter, which results in the code above.

When going a similar route as with the gimple patch, something like

diff --git a/gcc/simplify-rtx.c b/gcc/simplify-rtx.c
index 9359a3cdb4d..9c06c9b6ee9 100644
--- a/gcc/simplify-rtx.c
+++ b/gcc/simplify-rtx.c
@@ -2364,6 +2364,24 @@ simplify_binary_operation_1 (enum rtx_code code, machine_mode mode,
                                                           in1, in2));
        }

+      /* Transform (plus (lshiftrt (plus A C1) C2) C3) to (lshiftrt A C2)
+         if C1 == -C3 * (1 << C2).  */
+      if (CONST_SCALAR_INT_P (op1)
+         && GET_CODE (op0) == LSHIFTRT
+         && CONST_SCALAR_INT_P (XEXP (op0, 1))
+         && GET_CODE (XEXP (op0, 0)) == PLUS
+         && CONST_SCALAR_INT_P (XEXP (XEXP (op0, 0), 1)))
+       {
+         rtx c3 = op1;
+         rtx c2 = XEXP (op0, 1);
+         rtx c1 = XEXP (XEXP (op0, 0), 1);
+
+         rtx a = XEXP (XEXP (op0, 0), 0);
+
+         if (-INTVAL (c3) * (1 << INTVAL (c2)) == INTVAL (c1))
+           return simplify_gen_binary (LSHIFTRT, mode, a, c2);
+       }
+
       /* (plus (comparison A B) C) can become (neg (rev-comp A B)) if
         C is 1 and STORE_FLAG_VALUE is -1 or if C is -1 and STORE_FLAG_VALUE
         is 1.  */

helps immediately, yet overflow/range information is not considered.  Do
we somehow guarantee that the niter-related expressions we create up to
doloop do not overflow?  I did not notice anything to that effect when
looking through the code.  Granted, the simplification seems oddly
specific and is probably not useful for a wide range of targets and
situations.
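
To make the overflow concern concrete, here is a small stand-alone C
sketch (not part of the patch; the function names are made up for
illustration, and 64-bit unsigned arithmetic is assumed to mirror the
s390 registers).  It compares the shape doloop emits today with the
shape the simplification would produce:

```c
#include <stdbool.h>
#include <stdint.h>

/* The shape currently emitted: ((x - 8) >> 3) + 1.
   x - 8 wraps around when x < 8.  */
static uint64_t
unsimplified (uint64_t x)
{
  return ((x - 8) >> 3) + 1;
}

/* The shape the proposed simplification would emit: x >> 3.  */
static uint64_t
simplified (uint64_t x)
{
  return x >> 3;
}

/* True if both shapes yield the same value for X.  */
static bool
shapes_agree (uint64_t x)
{
  return unsimplified (x) == simplified (x);
}
```

For x = 8, 16 or 800 both shapes agree (1, 2 and 100).  For x = 0 or
x = 7 the subtraction wraps, so the first shape yields 2^61 while the
second yields 0.  The rewrite is therefore only safe if something
guarantees x >= 8, which is exactly the range information the patch does
not look at.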


Another approach would be to store "niter+1" (== n) when niter (== n-1)
is calculated and, when we need to do the increment, use the niter+1
that we already have without needing to simplify ((n * 8 - 8) >> 3) + 1.
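
As a rough illustration of that idea, the following toy model caches the
incremented count next to niter.  Everything here is hypothetical: the
struct and function names are invented for this mail and do not
correspond to GCC's actual iteration-count data structures, and the loop
is assumed to run at least once.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical record; GCC's real descriptor looks different.  */
struct toy_niter
{
  uint64_t niter;           /* Number of latch executions, n - 1.  */
  uint64_t niter_plus_one;  /* Cached n, kept from when niter was built.  */
  bool plus_one_valid;      /* False if no cached value is available.  */
};

/* Build the record from the scaled count n * 8 with a step width of 8.
   Assumes scaled >= 8, i.e. at least one iteration.  */
static struct toy_niter
toy_compute_niter (uint64_t scaled)
{
  struct toy_niter d;
  d.niter = (scaled - 8) >> 3;
  d.niter_plus_one = scaled >> 3;
  d.plus_one_valid = true;
  return d;
}

/* Counter for a doloop pattern that compares against 1: use the cached
   value if present, otherwise fall back to incrementing niter.  */
static uint64_t
toy_doloop_count (const struct toy_niter *d)
{
  return d->plus_one_valid ? d->niter_plus_one : d->niter + 1;
}
```

With scaled = 800 this gives niter = 99 and a doloop counter of 100,
taken directly from the cached field, so no later simplification of the
incremented expression is needed.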

Any comments on this?

FWIW, the patch above bootstraps and passes the test suite without
regressions on s390.

Regards
 Robin

