This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
Re: [PATCH][AArch64] Split X-reg UBFIZ into W-reg LSL when possible
- From: Kyrill Tkachov <kyrylo dot tkachov at foss dot arm dot com>
- To: James Greenhalgh <james dot greenhalgh at arm dot com>
- Cc: GCC Patches <gcc-patches at gcc dot gnu dot org>, Marcus Shawcroft <marcus dot shawcroft at arm dot com>, Richard Earnshaw <Richard dot Earnshaw at arm dot com>, nd at arm dot com
- Date: Fri, 16 Dec 2016 12:21:55 +0000
- Subject: Re: [PATCH][AArch64] Split X-reg UBFIZ into W-reg LSL when possible
- References: <5849294D.6040003@foss.arm.com> <20161215115618.GA14881@arm.com>
On 15/12/16 11:56, James Greenhalgh wrote:
> On Thu, Dec 08, 2016 at 09:35:09AM +0000, Kyrill Tkachov wrote:
>> Hi all,
>>
>> Similar to the previous patch this transforms X-reg UBFIZ instructions into
>> W-reg LSL instructions when the UBFIZ operands add up to 32, so we can take
>> advantage of the implicit zero-extension to DImode when writing to a
>> W-register.
>>
>> This is done by splitting the existing *andim_ashift<mode>_bfi pattern into
>> its two SImode and DImode specialisations and changing the DImode pattern
>> into a define_insn_and_split that splits into a zero-extended SImode ashift
>> when the operands match up.
>>
>> So for the code in the testcase we generate:
>>   LSL W0, W0, 5
>>
>> instead of:
>>   UBFIZ X0, X0, 5, 27
>>
>> Bootstrapped and tested on aarch64-none-linux-gnu.
>>
>> Since we're in stage 3 perhaps this is not for GCC 6, but it is fairly low
>> risk. I'm happy for it to wait for the next release if necessary.
>
> My comments on the previous patch also apply here. This patch should only
> need to add one new split pattern.
>
> Thanks,
> James

Thanks, here is the version adding just a single define_split.
Bootstrapped and tested on aarch64-none-linux-gnu.
Ok?
Thanks,
Kyrill
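
For anyone reading along, the equivalence the split relies on can be sketched in plain C. This is only an illustration of the arithmetic, not code from the patch: the helper names ubfiz_x and lsl_w are made up here, and they model the instruction semantics as I understand them (UBFIZ inserts a zero-extended bitfield at a given bit position; a write to a W-register zeroes the upper 32 bits of the X-register).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of UBFIZ Xd, Xn, #lsb, #width: take the low `width`
   bits of x, place them at bit position `lsb`, zero everything else.  */
uint64_t
ubfiz_x (uint64_t x, unsigned lsb, unsigned width)
{
  assert (width >= 1 && lsb + width <= 64);
  uint64_t field = (width == 64) ? x : (x & ((UINT64_C (1) << width) - 1));
  return field << lsb;
}

/* Hypothetical model of LSL Wd, Wn, #shift: a 32-bit shift of the low half
   of x, followed by the implicit zero-extension into the full X-register.  */
uint64_t
lsl_w (uint64_t x, unsigned shift)
{
  assert (shift < 32);
  uint32_t low = (uint32_t) x;
  return (uint64_t) (uint32_t) (low << shift);
}
```

When lsb + width is 32, the field ends exactly at bit 31, so ubfiz_x (x, 5, 27) and lsl_w (x, 5) should agree for every x, and both match (x << 5) & 0xffffffff from the testcase.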

2016-12-16  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>

	* config/aarch64/aarch64.md: New define_split above bswap<mode>2.

2016-12-16  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>

	* gcc.target/aarch64/ubfiz_lsl_1.c: New test.

diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 5a40ee6abd5e123116aaaa478dced2207dd59478..b0f7bcbb84159fc8c0c733d0b40f2f08eea241a9 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -4454,6 +4454,24 @@ (define_insn "*andim_ashift<mode>_bfiz"
[(set_attr "type" "bfx")]
)
+;; When the bit position and width of the equivalent extraction add up to 32
+;; we can use a W-reg LSL instruction, taking advantage of the implicit
+;; zero-extension to the X-reg when writing a W-reg.
+(define_split
+ [(set (match_operand:DI 0 "register_operand")
+ (and:DI (ashift:DI (match_operand:DI 1 "register_operand")
+ (match_operand 2 "const_int_operand"))
+ (match_operand 3 "const_int_operand")))]
+ "aarch64_mask_and_shift_for_ubfiz_p (DImode, operands[3], operands[2])
+ && (INTVAL (operands[2]) + popcount_hwi (INTVAL (operands[3])))
+ == GET_MODE_BITSIZE (SImode)"
+ [(set (match_dup 0)
+ (zero_extend:DI (ashift:SI (match_dup 4) (match_dup 2))))]
+ {
+ operands[4] = gen_lowpart (SImode, operands[1]);
+ }
+)
+
(define_insn "bswap<mode>2"
[(set (match_operand:GPI 0 "register_operand" "=r")
(bswap:GPI (match_operand:GPI 1 "register_operand" "r")))]
diff --git a/gcc/testsuite/gcc.target/aarch64/ubfiz_lsl_1.c b/gcc/testsuite/gcc.target/aarch64/ubfiz_lsl_1.c
new file mode 100644
index 0000000000000000000000000000000000000000..d3fd3f234f2324d71813298210fdcf0660ac45b4
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/ubfiz_lsl_1.c
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+/* Check that an X-reg UBFIZ can be simplified into a W-reg LSL. */
+
+long long
+f2 (long long x)
+{
+ return (x << 5) & 0xffffffff;
+}
+
+/* { dg-final { scan-assembler "lsl\tw" } } */
+/* { dg-final { scan-assembler-not "ubfiz\tx" } } */
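
As a rough standalone illustration of the condition on the define_split above, the check can be approximated in C on plain integers. This is a sketch under assumptions: mask_and_shift_for_ubfiz_p below is a hypothetical stand-in for aarch64_mask_and_shift_for_ubfiz_p (the real helper works on rtx operands, and my reading of it is that it requires the mask to be a contiguous run of ones starting at the shift amount), can_split_to_w_lsl is an invented name, and __builtin_popcountll is the GCC/Clang builtin standing in for popcount_hwi.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for aarch64_mask_and_shift_for_ubfiz_p (DImode, ...):
   the mask must be a contiguous run of ones beginning at bit `shift`,
   with no bits set below it.  */
bool
mask_and_shift_for_ubfiz_p (uint64_t mask, unsigned shift)
{
  if (shift >= 64)
    return false;
  uint64_t field = (mask >> shift) + 1;
  bool contiguous = field != 0 && (field & (field - 1)) == 0;
  bool low_bits_clear = (mask & ((UINT64_C (1) << shift) - 1)) == 0;
  return contiguous && low_bits_clear;
}

/* Mirrors the two-part split condition: a valid UBFIZ mask/shift pair whose
   shift amount plus field width equals the SImode bit size (32).
   This sketch assumes a nonzero mask.  */
bool
can_split_to_w_lsl (uint64_t mask, unsigned shift)
{
  assert (mask != 0);
  return mask_and_shift_for_ubfiz_p (mask, shift)
	 && shift + (unsigned) __builtin_popcountll (mask) == 32;
}
```

For example, the mask corresponding to UBFIZ X0, X0, 5, 27 is 0xffffffe0 (27 ones starting at bit 5), so can_split_to_w_lsl (0xffffffe0, 5) should hold, while a narrower field such as 0xffe0 with shift 5 should not.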