[PATCH][AArch64] Add vec_shr pattern for 64-bit vectors using ush{l,r}; enable tests.

Alan Lawrence alan.lawrence@arm.com
Fri Nov 14 15:43:00 GMT 2014


Following the recent vectorizer changes that perform reductions via shifts,
AArch64 will now reduce loops such as this:

unsigned char in[8] = {1, 3, 5, 7, 9, 11, 13, 15};

int
main (unsigned char argc, char **argv)
{
   unsigned char prod = 1;

   /* Prevent constant propagation of the entire loop below.  */
   asm volatile ("" : : : "memory");

   for (unsigned char i = 0; i < 8; i++)
     prod *= in[i];

   if (prod != 17)
       __builtin_printf("Failed %d\n", prod);

   return 0;
}

using an 'ext' instruction from aarch64_expand_vec_perm_const:

main:
         adrp    x0, .LANCHOR0
         movi    v2.2s, 0    <=== note reg used here
         ldr     d1, [x0, #:lo12:.LANCHOR0]
         ext     v0.8b, v1.8b, v2.8b, #4
         mul     v1.8b, v1.8b, v0.8b
         ext     v0.8b, v1.8b, v2.8b, #2
         mul     v0.8b, v1.8b, v0.8b
         ext     v2.8b, v0.8b, v2.8b, #1
         mul     v0.8b, v0.8b, v2.8b
         umov    w1, v0.b[0]

The 'ext' works for both 64-bit and 128-bit vectors, but for 64-bit vectors
we can do slightly better using ushr; this patch improves the above to:

main:
         adrp    x0, .LANCHOR0
         ldr     d0, [x0, #:lo12:.LANCHOR0]
         ushr    d1, d0, 32
         mul     v0.8b, v0.8b, v1.8b
         ushr    d1, d0, 16
         mul     v0.8b, v0.8b, v1.8b
         ushr    d1, d0, 8
         mul     v0.8b, v0.8b, v1.8b
         umov    w1, v0.b[0]
	...

Tested with bootstrap + check-gcc on aarch64-none-linux-gnu.
Cross-testing of check-gcc on aarch64_be-none-elf in progress.

Ok if no regressions on big-endian?

Cheers,
--Alan

gcc/ChangeLog:

	* config/aarch64/aarch64-simd.md (vec_shr<mode>): New.

gcc/testsuite/ChangeLog:

	* lib/target-supports.exp
	(check_effective_target_whole_vector_shift): Add aarch64{,_be}.
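For reference, a 64-bit vec_shr pattern along these lines might look roughly
as follows. This is a sketch, not the actual patch hunk: the mode iterator,
unspec name, and attribute values are my assumptions, and the real pattern
also has to get the big-endian lane order right (where the shift direction
effectively flips, hence ush{l,r} in the subject):

```
;; Sketch only -- not the actual hunk.  A whole-vector shift of a
;; 64-bit vector can be emitted as a plain 64-bit shift of the d reg.
(define_insn "vec_shr_<mode>"
  [(set (match_operand:VD 0 "register_operand" "=w")
        (unspec:VD [(match_operand:VD 1 "register_operand" "w")
                    (match_operand:SI 2 "immediate_operand" "i")]
                   UNSPEC_VEC_SHR))]
  "TARGET_SIMD"
  {
    if (BYTES_BIG_ENDIAN)
      return "shl %d0, %d1, %2";
    else
      return "ushr %d0, %d1, %2";
  }
  [(set_attr "type" "neon_shift_imm")]
)
```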
