[PATCH][AArch64] Use Q-reg loads/stores in movmem expansion

Kyrill Tkachov kyrylo.tkachov@foss.arm.com
Fri Dec 21 12:42:00 GMT 2018

Hi all,

Our movmem expansion currently emits TImode loads and stores when copying 128-bit chunks.
This generates X-register LDP/STP sequences because X-registers are the preferred register class for that mode.

For the purpose of copying memory, however, we want to prefer Q-registers.
This uses one fewer register per 128-bit chunk, which helps with register pressure.
It also allows 256-bit and larger copies to be merged into Q-reg LDP/STP pairs, further reducing code size.

The implementation of that is easy: we just use a 128-bit vector mode (V4SImode in this patch)
rather than a TImode.
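
At the source level the two modes correspond to types GCC already exposes: unsigned __int128 has TImode on 64-bit targets, and a 16-byte vector of int has V4SImode. The sketch below only illustrates that correspondence and is not code from the patch. The helper names copy16_ti and copy16_v4si are made up for illustration, and which registers the compiler actually picks for each helper depends on the compiler version and flags.

```c
#include <string.h>

/* 16-byte vector of four ints: GCC gives this type V4SImode.  */
typedef int v4si __attribute__ ((vector_size (16)));

/* Hypothetical helper: copy one 128-bit chunk through a TImode value.  */
void
copy16_ti (void *dst, const void *src)
{
  unsigned __int128 tmp;
  memcpy (&tmp, src, sizeof tmp);
  memcpy (dst, &tmp, sizeof tmp);
}

/* Hypothetical helper: copy one 128-bit chunk through a V4SImode value.  */
void
copy16_v4si (void *dst, const void *src)
{
  v4si tmp;
  memcpy (&tmp, src, sizeof tmp);
  memcpy (dst, &tmp, sizeof tmp);
}
```

Both helpers move the same 16 bytes; only the mode of the temporary differs, which is the distinction the expansion change relies on.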

With this patch the testcase:
#define N 8
int src[N], dst[N];

void
foo (void)
{
   __builtin_memcpy (dst, src, N * sizeof (int));
}

generates:
         adrp    x1, src
         add     x1, x1, :lo12:src
         adrp    x0, dst
         add     x0, x0, :lo12:dst
         ldp     q1, q0, [x1]
         stp     q1, q0, [x0]

instead of:
         adrp    x1, src
         add     x1, x1, :lo12:src
         adrp    x0, dst
         add     x0, x0, :lo12:dst
         ldp     x2, x3, [x1]
         stp     x2, x3, [x0]
         ldp     x2, x3, [x1, 16]
         stp     x2, x3, [x0, 16]

Bootstrapped and tested on aarch64-none-linux-gnu.
I hope this is a small enough change for GCC 9.
One could argue that it finishes the work done this cycle to support Q-register LDP/STPs.

I've seen this give about a 1.8% improvement on 541.leela_r on Cortex-A57, with the other changes in SPEC2017 in the noise,
but there is a reduction in code size everywhere (due to more LDP/STP-Q pairs being formed).

Ok for trunk?


2018-12-21  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>

     * config/aarch64/aarch64.c (aarch64_expand_movmem): Use V4SImode for
     128-bit moves.

2018-12-21  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>

     * gcc.target/aarch64/movmem-q-reg_1.c: New test.
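
The attached test is not reproduced in this message. As a rough idea of what a test of this kind could look like, here is a sketch built from the testcase above and the usual DejaGnu directives. It is an assumption about the shape of such a test, not the contents of movmem-q-reg_1.c; the -O2 option and the exact scan patterns are guesses.

```c
/* { dg-do compile } */
/* { dg-options "-O2" } */

#define N 8
int src[N], dst[N];

void
foo (void)
{
  __builtin_memcpy (dst, src, N * sizeof (int));
}

/* Expect Q-register pairs and no X-register pairs for the copy.  */
/* { dg-final { scan-assembler {ldp\tq[0-9]+, q[0-9]+} } } */
/* { dg-final { scan-assembler {stp\tq[0-9]+, q[0-9]+} } } */
/* { dg-final { scan-assembler-not {ldp\tx[0-9]+, x[0-9]+} } } */
/* { dg-final { scan-assembler-not {stp\tx[0-9]+, x[0-9]+} } } */
```

The scan-assembler-not lines guard against falling back to the X-register sequence, which would still copy the data correctly.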
-------------- next part --------------
A non-text attachment was scrubbed...
Name: movmem-q.patch
Type: text/x-patch
Size: 1703 bytes
Desc: not available
URL: <http://gcc.gnu.org/pipermail/gcc-patches/attachments/20181221/7901417d/attachment.bin>
