This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



[PATCH] Simplifying RTL shifts to narrower modes


The following patch addresses one aspect of PR middle-end/19154 which
is a possible performance regression for AVR on mainline.

The following testcase from that PR demonstrates the inefficiency:

char g(char c)
{
  return (c & 8) != 0;
}

which currently generates the following assembler:

g:      clr r25
        sbrc r24,7
        com r25
        lsr r25
        ror r24
        lsr r25
        ror r24
        lsr r25
        ror r24
        clr r25
        sbrc r24,7
        com r25
        andi r24,lo8(1)
        andi r25,hi8(1)
        ret

The code has been transformed into the equivalent "return (c >> 3) & 1",
but unfortunately the shift itself is being performed in HImode and not
QImode.
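
As a sanity check on that equivalence, here is a minimal sketch (not part
of the patch) that compares the two source forms over every 8-bit value.
The function names are invented for illustration, and int8_t stands in
for AVR's signed char.

```c
#include <assert.h>
#include <stdint.h>

/* The testcase as written in the PR. */
int bit3_test(int8_t c)
{
  return (c & 8) != 0;
}

/* The form the compiler rewrites it into.  Only bit 3 survives the
   "& 1", so the answer is the same whether the host compiler shifts
   negative values arithmetically or logically.  */
int bit3_shift(int8_t c)
{
  return (c >> 3) & 1;
}

/* Returns the number of 8-bit inputs on which the two forms disagree. */
int count_bit3_mismatches(void)
{
  int bad = 0;
  for (int v = -128; v <= 127; v++)
    if (bit3_test((int8_t) v) != bit3_shift((int8_t) v))
      bad++;
  return bad;
}

/* Aborts if the two forms ever differ. */
void check_bit3(void)
{
  assert(count_bit3_mismatches() == 0);
}
```

Since both forms only ever look at bit 3, nothing about the computation
itself requires a 16-bit shift.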

There are several places where this mis-optimization can be tackled.
The patch below implements the fix in simplify-rtx.c, but I'm also
investigating a related patch to do this in fold_single_bit_test,
as well as possible tweaks to the AVR backend.  In combine and the
other RTL optimizers, we should be able to convert

  (subreg:QI (lshiftrt:HI (sign_extend:HI (reg:QI)) (const_int 3)))

into the equivalent

  (ashiftrt:QI (reg:QI) (const_int 3))


Whilst adding this to simplify_subreg, I decided to implement all six
combinations (three types of shift * two types of extension).  With
this patch, we now generate a much improved (but still not perfect)
sequence on AVR for the above example:


g:      asr r24
        asr r24
        asr r24
        clr r25
        sbrc r24,7
        com r25
        andi r24,lo8(1)
        andi r25,hi8(1)
        ret
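
The six shift/extension combinations mentioned above can be checked
exhaustively outside the compiler.  The following is only a sketch under
stated assumptions, not code from the patch: all helper names are
hypothetical, the narrow mode is modelled as 8 bits and the wide mode as
16 bits (exactly twice as wide, which satisfies the width test used for
the sign_extend case), and shift counts run from 0 to 7 to mirror the
"INTVAL (XEXP (op, 1)) < GET_MODE_BITSIZE (outermode)" test.

```c
#include <assert.h>
#include <stdint.h>

/* 16-bit "wide mode" operations.  Unsigned arithmetic is used throughout so
   the results do not depend on how the host shifts negative values. */
uint16_t sext16(uint8_t x) { return (x & 0x80u) ? (uint16_t) (0xFF00u | x) : x; }
uint16_t zext16(uint8_t x) { return x; }
uint16_t lshr16(uint16_t w, int n) { return (uint16_t) (w >> n); }
uint16_t ashr16(uint16_t w, int n)
{
  /* Replicate the sign bit into the n vacated high bits. */
  uint16_t fill = (w & 0x8000u) ? (uint16_t) ~(0xFFFFu >> n) : 0;
  return (uint16_t) ((w >> n) | fill);
}
uint16_t shl16(uint16_t w, int n) { return (uint16_t) ((uint32_t) w << n); }

/* 8-bit "narrow mode" operations. */
uint8_t lshr8(uint8_t x, int n) { return (uint8_t) (x >> n); }
uint8_t ashr8(uint8_t x, int n)
{
  uint8_t fill = (x & 0x80u) ? (uint8_t) ~(0xFFu >> n) : 0;
  return (uint8_t) ((x >> n) | fill);
}
uint8_t shl8(uint8_t x, int n) { return (uint8_t) (x << n); }

/* Models the lowpart subreg, i.e. truncation to the narrow mode. */
uint8_t lowpart8(uint16_t w) { return (uint8_t) w; }

/* Returns how many (value, count) pairs break any of the six identities. */
int count_narrowing_mismatches(void)
{
  int bad = 0;
  for (int v = 0; v < 256; v++)
    for (int n = 0; n < 8; n++)
      {
        uint8_t x = (uint8_t) v;
        /* Right shifts of a sign-extended value narrow to ashiftrt. */
        bad += lowpart8(lshr16(sext16(x), n)) != ashr8(x, n);
        bad += lowpart8(ashr16(sext16(x), n)) != ashr8(x, n);
        /* Right shifts of a zero-extended value narrow to lshiftrt. */
        bad += lowpart8(lshr16(zext16(x), n)) != lshr8(x, n);
        bad += lowpart8(ashr16(zext16(x), n)) != lshr8(x, n);
        /* Left shifts narrow to ashift for either extension. */
        bad += lowpart8(shl16(sext16(x), n)) != shl8(x, n);
        bad += lowpart8(shl16(zext16(x), n)) != shl8(x, n);
      }
  return bad;
}

/* Aborts if any of the six identities fails. */
void check_narrowing(void)
{
  assert(count_narrowing_mismatches() == 0);
}
```

Note that a logical right shift of a sign-extended value narrows to an
arithmetic shift: the bits shifted into the low byte are copies of the
original sign bit, which is what the narrow ashiftrt produces.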

Narrowing the mode of operators in simplify-rtx.c seems reasonable
from a middle-end perspective, but I'm posting this RFA to enquire of
the backend folks if this sort of transformation is likely to cause
problems for targets with PARTIAL_REGISTER_STALLs etc...  Presumably
this should already be handled by .md file patterns and constraints,
and the target's rtx_costs, but I didn't feel confident enough to commit
this patch without at least enquiring first.

The following patch has been tested on i686-pc-linux-gnu with a full
"make bootstrap", all default languages, and regression tested with a
top-level "make -k check" with no new failures.

Ok for mainline?



2005-01-06  Roger Sayle  <roger@eyesopen.com>

	* simplify-rtx.c (simplify_subreg): Simplify truncations of shifts
	of sign or zero extended values.


Index: simplify-rtx.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/simplify-rtx.c,v
retrieving revision 1.220
diff -c -3 -p -r1.220 simplify-rtx.c
*** simplify-rtx.c	4 Jan 2005 15:44:13 -0000	1.220
--- simplify-rtx.c	6 Jan 2005 03:04:54 -0000
*************** simplify_subreg (enum machine_mode outer
*** 3829,3834 ****
--- 3829,3879 ----
  	return CONST0_RTX (outermode);
      }

+   /* Simplify (subreg:QI (lshiftrt:SI (sign_extend:SI (x:QI)) C), 0) into
+      (ashiftrt:QI (x:QI) C), where C is a suitable small constant and
+      the outer subreg is effectively a truncation to the original mode.  */
+   if ((GET_CODE (op) == LSHIFTRT
+        || GET_CODE (op) == ASHIFTRT)
+       && SCALAR_INT_MODE_P (outermode)
+       && (2 * GET_MODE_BITSIZE (outermode)) <= GET_MODE_BITSIZE (innermode)
+       && GET_CODE (XEXP (op, 1)) == CONST_INT
+       && GET_CODE (XEXP (op, 0)) == SIGN_EXTEND
+       && GET_MODE (XEXP (XEXP (op, 0), 0)) == outermode
+       && INTVAL (XEXP (op, 1)) < GET_MODE_BITSIZE (outermode)
+       && subreg_lsb_1 (outermode, innermode, byte) == 0)
+     return simplify_gen_binary (ASHIFTRT, outermode,
+ 				XEXP (XEXP (op, 0), 0), XEXP (op, 1));
+
+   /* Likewise (subreg:QI (lshiftrt:SI (zero_extend:SI (x:QI)) C), 0) into
+      (lshiftrt:QI (x:QI) C), where C is a suitable small constant and
+      the outer subreg is effectively a truncation to the original mode.  */
+   if ((GET_CODE (op) == LSHIFTRT
+        || GET_CODE (op) == ASHIFTRT)
+       && SCALAR_INT_MODE_P (outermode)
+       && GET_MODE_BITSIZE (outermode) < GET_MODE_BITSIZE (innermode)
+       && GET_CODE (XEXP (op, 1)) == CONST_INT
+       && GET_CODE (XEXP (op, 0)) == ZERO_EXTEND
+       && GET_MODE (XEXP (XEXP (op, 0), 0)) == outermode
+       && INTVAL (XEXP (op, 1)) < GET_MODE_BITSIZE (outermode)
+       && subreg_lsb_1 (outermode, innermode, byte) == 0)
+     return simplify_gen_binary (LSHIFTRT, outermode,
+ 				XEXP (XEXP (op, 0), 0), XEXP (op, 1));
+
+   /* Likewise (subreg:QI (ashift:SI (zero_extend:SI (x:QI)) C), 0) into
+      (ashift:QI (x:QI) C), where C is a suitable small constant and
+      the outer subreg is effectively a truncation to the original mode.  */
+   if (GET_CODE (op) == ASHIFT
+       && SCALAR_INT_MODE_P (outermode)
+       && GET_MODE_BITSIZE (outermode) < GET_MODE_BITSIZE (innermode)
+       && GET_CODE (XEXP (op, 1)) == CONST_INT
+       && (GET_CODE (XEXP (op, 0)) == ZERO_EXTEND
+ 	  || GET_CODE (XEXP (op, 0)) == SIGN_EXTEND)
+       && GET_MODE (XEXP (XEXP (op, 0), 0)) == outermode
+       && INTVAL (XEXP (op, 1)) < GET_MODE_BITSIZE (outermode)
+       && subreg_lsb_1 (outermode, innermode, byte) == 0)
+     return simplify_gen_binary (ASHIFT, outermode,
+ 				XEXP (XEXP (op, 0), 0), XEXP (op, 1));
+
    return NULL_RTX;
  }


Roger
--

