[PATCH][AArch64] Split X-reg UBFX into W-reg LSR when possible
James Greenhalgh
james.greenhalgh@arm.com
Fri Dec 16 16:36:00 GMT 2016
On Fri, Dec 16, 2016 at 12:21:52PM +0000, Kyrill Tkachov wrote:
>
> On 15/12/16 11:53, James Greenhalgh wrote:
> >On Thu, Dec 08, 2016 at 09:35:08AM +0000, Kyrill Tkachov wrote:
> >>Hi all,
> >>
> >>In this patch we split X-register UBFX instructions that extract up to the
> >>edge of a W-register into a W-register LSR instruction. So for the example in
> >>the testcase instead of:
> >>UBFX X0, X0, 24, 8
> >>
> >>we'd generate:
> >>LSR w0, w0, 24
> >>
> >>An LSR is a simpler instruction and there's a higher chance that it can be
> >>combined with other instructions.
> >>
> >>To do this the patch separates the sign_extract case from the zero_extract
> >>case in the *<optab><mode> ANY_EXTRACT pattern and further splits the
> >>SImode/DImode patterns from the resulting zrero_extract pattern.
> >>The DImode zero_extract pattern then becomes a define_insn_and_split that
> >>splits into a zero_extend of an lshiftrt when the bitposition and width of
> >>the zero_extract add up to 32.
> >>
> >>Bootstrapped and tested on aarch64-none-linux-gnu.
> >>
> >>Since we're in stage 3 perhaps this is not for GCC 6, but it is fairly low
> >>risk. I'm happy for it to wait for the next release if necessary.
> >I'm OK with the idea, and I'm OK taking this in for Stage 3, but I'm not
> >convinced by the implementation.
> >
> >>2016-12-08 Kyrylo Tkachov <kyrylo.tkachov@arm.com>
> >>
> >> * config/aarch64/aarch64.md (*<optab><mode>): Split into...
> >> (*extv<mode>): ...This...
> >> (*extzvsi): ...This...
> >> (*extzvdi:): ... And this. Add splitting to lshiftrt when possible.
> >Why do we want to to it this way, rather than simply defining a single
> >"split" which works in the case you're trying to catch.
> >
> >i.e. (untested)
> >
> >(define_split
> > [(set (match_operand:DI 0 "register_operand")
> > (zero_extract:DI (match_operand:DI 1 "register_operand")
> > (match_operand 2
> > "aarch64_simd_shift_imm_offset_di")
> > (match_operand 3
> > "aarch64_simd_shift_imm_di")))]
> > "IN_RANGE (INTVAL (operands[2]) + INTVAL (operands[3]),
> > 1, GET_MODE_BITSIZE (DImode) - 1)
> > && (INTVAL (operands[2]) + INTVAL (operands[3]))
> > == GET_MODE_BITSIZE (SImode)"
> > [(set (match_dup 0)
> > (zero_extend:DI (lshiftrt:SI (match_dup 4) (match_dup 3))))]
> > {
> > operands[4] = gen_lowpart (SImode, operands[1]);
> > }
> >)
> >
> >Thanks,
> >James
>
> Yes, that works and is simpler.
> Is this ok then?
OK with the typo fixed.
Thanks,
James
> 2016-12-16 Kyrylo Tkachov <kyrylo.tkachov@arm.com>
>
> * config/aarch64/aarch64.md: New define_split above insv<mode>.
>
> 2016-12-16 Kyrylo Tkachov <kyrylo.tkachov@arm.com>
>
> * gcc.target/aarch64/ubfx_lsr_1.c: New test.
> diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
> index 65ea04587442b0cab18b1e4652537524b82d5b86..5a40ee6abd5e123116aaaa478dced2207dd59478 100644
> --- a/gcc/config/aarch64/aarch64.md
> +++ b/gcc/config/aarch64/aarch64.md
> @@ -4340,6 +4340,26 @@ (define_insn "*<optab><mode>"
> [(set_attr "type" "bfx")]
> )
>
> +;; When the bitposition and width add up to 32 we can use a W-reg LSR
s/bitposition/bit position/
More information about the Gcc-patches
mailing list