[PATCH][AArch64] Add combine pattern for storing lane zero of a vector
James Greenhalgh
james.greenhalgh@arm.com
Fri Jun 2 13:52:00 GMT 2017
On Fri, Apr 21, 2017 at 09:39:29AM +0100, Kyrill Tkachov wrote:
> Hi all,
>
> Consider the code:
> typedef long long v2di __attribute__ ((vector_size (16)));
> void
> store_laned (v2di x, long long *y)
> {
> y[0] = x[1];
> y[3] = x[0];
> }
>
> AArch64 GCC will generate:
> store_laned:
> umov x1, v0.d[0]
> st1 {v0.d}[1], [x0]
> str x1, [x0, 24]
> ret
>
> It moves the zero lane into a core register and does a scalar store when instead it could have used a scalar FP store
> that supports the required addressing mode:
> store_laned:
> st1 {v0.d}[1], [x0]
> str d0, [x0, 24]
> ret
>
> Combine already tries to match this pattern:
>
> Trying 10 -> 11:
> Failed to match this instruction:
> (set (mem:DI (plus:DI (reg/v/f:DI 76 [ y ])
> (const_int 24 [0x18])) [1 MEM[(long long int *)y_4(D) + 24B]+0 S8 A64])
> (vec_select:DI (reg/v:V2DI 75 [ x ])
> (parallel [
> (const_int 0 [0])
> ])))
>
> but we don't match it in the backend. It's not hard to add it, so this patch does that for all the relevant vector modes.
> With this patch we generate the second sequence above. In SPEC2006 it eliminates some address computation instructions,
> because we use the more expressive STR instead of ST1, and it eliminates moves to the integer registers, because we
> can store the D-reg directly.
Good spot!
> Bootstrapped and tested on aarch64-none-linux-gnu.
>
> Ok for trunk?
OK.
Thanks,
James
> 2017-04-21 Kyrylo Tkachov <kyrylo.tkachov@arm.com>
>
> * config/aarch64/aarch64-simd.md (aarch64_store_lane0<mode>):
> New pattern.
>
> 2017-04-21 Kyrylo Tkachov <kyrylo.tkachov@arm.com>
>
> * gcc.target/aarch64/store_lane0_str_1.c: New test.
>
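For readers who want to try the lane-zero store discussed above, here is a minimal sketch in C. It repeats store_laned from the quoted message and adds one hypothetical variant, store_lane0_s, for 32-bit lanes. The v4si typedef, the function name, and the offset are assumptions chosen for illustration; this is not the contents of the new test file.

```c
/* GCC vector extension types: 128-bit vectors with 64-bit and 32-bit lanes.
   v2di is taken from the quoted message; v4si is an assumed extra mode.  */
typedef long long v2di __attribute__ ((vector_size (16)));
typedef int v4si __attribute__ ((vector_size (16)));

/* The example from the quoted message: lane 1 goes to y[0],
   lane 0 goes to y[3] (byte offset 24).  */
void
store_laned (v2di x, long long *y)
{
  y[0] = x[1];
  y[3] = x[0];
}

/* Hypothetical variant: store lane 0 of a 32-bit-lane vector at a
   nonzero offset (y[5], byte offset 20).  */
void
store_lane0_s (v4si x, int *y)
{
  y[5] = x[0];
}
```

Compiling this with an AArch64 GCC at -O2 -S and inspecting the assembly should show whether the lane-zero stores become plain FP-register STR instructions. The exact output depends on the compiler version and options, so treat that as something to check rather than a guaranteed result.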