[PATCH][AArch64] Add combine pattern for storing lane zero of a vector

James Greenhalgh james.greenhalgh@arm.com
Fri Jun 2 13:52:00 GMT 2017


On Fri, Apr 21, 2017 at 09:39:29AM +0100, Kyrill Tkachov wrote:
> Hi all,
> 
> Consider the code:
> typedef long long v2di __attribute__ ((vector_size (16)));
> void
> store_laned (v2di x, long long *y)
> {
>   y[0] = x[1];
>   y[3] = x[0];
> }
> 
> AArch64 GCC will generate:
> store_laned:
>         umov    x1, v0.d[0]
>         st1     {v0.d}[1], [x0]
>         str     x1, [x0, 24]
>         ret
> 
> It moves lane zero into a core register and does a scalar integer store, when it could instead use a scalar FP store
> that supports the required addressing mode:
> store_laned:
>         st1     {v0.d}[1], [x0]
>         str     d0, [x0, 24]
>         ret
> 
> Combine already tries to match this pattern:
> 
> Trying 10 -> 11:
> Failed to match this instruction:
> (set (mem:DI (plus:DI (reg/v/f:DI 76 [ y ])
>             (const_int 24 [0x18])) [1 MEM[(long long int *)y_4(D) + 24B]+0 S8 A64])
>     (vec_select:DI (reg/v:V2DI 75 [ x ])
>         (parallel [
>                 (const_int 0 [0])
>             ])))
> 
> but we don't match it in the backend. It's not hard to add, so this patch does that for all the relevant vector modes.
> With this patch we generate the second sequence above. In SPEC2006 this eliminates some address computation
> instructions, because we can use the more expressive STR instead of ST1, and some moves to the integer registers,
> because we can store the D-reg directly.
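
For anyone wanting to try the case described above, here is a minimal sketch of a standalone file. It is not taken from the patch or from its new test. store_laned is copied from the quoted mail; the v4si typedef, store_lane0_si, and check_lane0_stores are hypothetical names added for illustration, and the noinline attributes are only there so each function is emitted on its own. Compiling it with an AArch64 GCC at -O2 -S should show whether the lane-zero stores use STR from the FP/SIMD register; the exact output depends on the compiler version.

```c
#include <assert.h>

/* GCC generic vector types.  v2di is from the quoted mail; v4si is an
   assumed 32-bit-element counterpart added for illustration.  */
typedef long long v2di __attribute__ ((vector_size (16)));
typedef int v4si __attribute__ ((vector_size (16)));

/* Same function as in the quoted mail: lane 1 goes to y[0],
   lane 0 goes to y[3] (byte offset 24).  */
__attribute__ ((noinline)) void
store_laned (v2di x, long long *y)
{
  y[0] = x[1];
  y[3] = x[0];
}

/* Hypothetical 32-bit variant: lane 0 stored at a non-zero offset
   (byte offset 20).  */
__attribute__ ((noinline)) void
store_lane0_si (v4si x, int *y)
{
  y[5] = x[0];
}

/* Behavioural check only: confirms which lanes land where.  It says
   nothing about which instructions the compiler picked.  */
void
check_lane0_stores (void)
{
  long long d[4] = { 0, 0, 0, 0 };
  int s[8] = { 0, 0, 0, 0, 0, 0, 0, 0 };
  v2di a = { 11, 22 };
  v4si b = { 7, 8, 9, 10 };

  store_laned (a, d);
  store_lane0_si (b, s);

  assert (d[0] == 22 && d[3] == 11);
  assert (d[1] == 0 && d[2] == 0);
  assert (s[5] == 7);
}
```

The check function is only there so the file can also be linked and run on any target that supports the vector extension; the interesting part is the assembly generated for the two store functions.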

Good spot!

> Bootstrapped and tested on aarch64-none-linux-gnu.
> 
> Ok for trunk?

OK.

Thanks,
James

> 2017-04-21  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>
> 
> 	* config/aarch64/aarch64-simd.md (aarch64_store_lane0<mode>):
> 	New pattern.
> 
> 2017-04-21  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>
> 
> 	* gcc.target/aarch64/store_lane0_str_1.c: New test.
> 



