[PATCH][AArch64] Split X-reg UBFX into W-reg LSR when possible

James Greenhalgh james.greenhalgh@arm.com
Fri Dec 16 16:36:00 GMT 2016


On Fri, Dec 16, 2016 at 12:21:52PM +0000, Kyrill Tkachov wrote:
> 
> On 15/12/16 11:53, James Greenhalgh wrote:
> >On Thu, Dec 08, 2016 at 09:35:08AM +0000, Kyrill Tkachov wrote:
> >>Hi all,
> >>
> >>In this patch we split X-register UBFX instructions that extract up to the
> >>edge of a W-register into a W-register LSR instruction. So for the example in
> >>the testcase instead of:
> >>UBFX    X0, X0, 24, 8
> >>
> >>we'd generate:
> >>LSR     w0, w0, 24
> >>
> >>An LSR is a simpler instruction and there's a higher chance that it can be
> >>combined with other instructions.
> >>
> >>To do this the patch separates the sign_extract case from the zero_extract
> >>case in the *<optab><mode> ANY_EXTRACT pattern and further splits the
> >>SImode/DImode patterns from the resulting zrero_extract pattern.
> >>The DImode zero_extract pattern then becomes a define_insn_and_split that
> >>splits into a zero_extend of an lshiftrt when the bitposition and width of
> >>the zero_extract add up to 32.
> >>
> >>Bootstrapped and tested on aarch64-none-linux-gnu.
> >>
> >>Since we're in stage 3 perhaps this is not for GCC 6, but it is fairly low
> >>risk.  I'm happy for it to wait for the next release if necessary.
> >I'm OK with the idea, and I'm OK taking this in for Stage 3, but I'm not
> >convinced by the implementation.
> >
> >>2016-12-08  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>
> >>
> >>     * config/aarch64/aarch64.md (*<optab><mode>): Split into...
> >>     (*extv<mode>): ...This...
> >>     (*extzvsi): ...This...
> >>     (*extzvdi:): ... And this.  Add splitting to lshiftrt when possible.
> >Why do we want to to it this way, rather than simply defining a single
> >"split" which works in the case you're trying to catch.
> >
> >i.e. (untested)
> >
> >(define_split
> >    [(set (match_operand:DI 0 "register_operand")
> >	(zero_extract:DI (match_operand:DI 1 "register_operand")
> >			 (match_operand 2
> >			   "aarch64_simd_shift_imm_offset_di")
> >			 (match_operand 3
> >			   "aarch64_simd_shift_imm_di")))]
> >   "IN_RANGE (INTVAL (operands[2]) + INTVAL (operands[3]),
> >	     1, GET_MODE_BITSIZE (DImode) - 1)
> >    && (INTVAL (operands[2]) + INTVAL (operands[3]))
> >	== GET_MODE_BITSIZE (SImode)"
> >   [(set (match_dup 0)
> >	(zero_extend:DI (lshiftrt:SI (match_dup 4) (match_dup 3))))]
> >   {
> >     operands[4] = gen_lowpart (SImode, operands[1]);
> >   }
> >)
> >
> >Thanks,
> >James
> 
> Yes, that works and is simpler.
> Is this ok then?

OK with the typo fixed.

Thanks,
James

> 2016-12-16  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>
> 
>     * config/aarch64/aarch64.md: New define_split above insv<mode>.
> 
> 2016-12-16  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>
> 
>     * gcc.target/aarch64/ubfx_lsr_1.c: New test.

> diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
> index 65ea04587442b0cab18b1e4652537524b82d5b86..5a40ee6abd5e123116aaaa478dced2207dd59478 100644
> --- a/gcc/config/aarch64/aarch64.md
> +++ b/gcc/config/aarch64/aarch64.md
> @@ -4340,6 +4340,26 @@ (define_insn "*<optab><mode>"
>    [(set_attr "type" "bfx")]
>  )
>  
> +;; When the bitposition and width add up to 32 we can use a W-reg LSR

s/bitposition/bit position/



More information about the Gcc-patches mailing list