This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


[PATCH][AArch64]Add vec_shr pattern for 64-bit vectors using ush{l,r}; enable tests.


Following the recent vectorizer changes that implement reductions via shifts, AArch64 now reduces loops such as this:

unsigned char in[8] = {1, 3, 5, 7, 9, 11, 13, 15};

int
main (unsigned char argc, char **argv)
{
  unsigned char prod = 1;

  /* Prevent constant propagation of the entire loop below.  */
  asm volatile ("" : : : "memory");

  for (unsigned char i = 0; i < 8; i++)
    prod *= in[i];

  if (prod != 17)
      __builtin_printf("Failed %d\n", prod);

  return 0;
}

using an 'ext' instruction from aarch64_expand_vec_perm_const:

main:
        adrp    x0, .LANCHOR0
        movi    v2.2s, 0    <=== note reg used here
        ldr     d1, [x0, #:lo12:.LANCHOR0]
        ext     v0.8b, v1.8b, v2.8b, #4
        mul     v1.8b, v1.8b, v0.8b
        ext     v0.8b, v1.8b, v2.8b, #2
        mul     v0.8b, v1.8b, v0.8b
        ext     v2.8b, v0.8b, v2.8b, #1
        mul     v0.8b, v0.8b, v2.8b
        umov    w1, v0.b[0]

The 'ext' approach works for both 64-bit and 128-bit vectors, but for 64-bit vectors we can do slightly better using ushr, which shifts in zeroes itself and so needs no separate zero register (v2 above). This patch improves the above to:

main:
        adrp    x0, .LANCHOR0
        ldr     d0, [x0, #:lo12:.LANCHOR0]
        ushr    d1, d0, 32
        mul     v0.8b, v0.8b, v1.8b
        ushr    d1, d0, 16
        mul     v0.8b, v0.8b, v1.8b
        ushr    d1, d0, 8
        mul     v0.8b, v0.8b, v1.8b
        mul     v0.8b, v0.8b, v1.8b
        umov    w1, v0.b[0]
	...
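
For anyone who wants to convince themselves that the shift sequence computes the same reduction, here is a rough scalar model in C. It is only an illustration and not part of the patch: the helper names mul_v8b and reduce_mul_v8b are made up for this sketch, the 64-bit vector is modelled as a uint64_t, and little-endian lane order is assumed (lane 0 in the low byte).

```c
#include <stdint.h>

/* Hypothetical helper: lane-wise multiply of two vectors of eight 8-bit
   lanes, each packed into a uint64_t.  Models "mul vD.8b, vN.8b, vM.8b".  */
static uint64_t
mul_v8b (uint64_t a, uint64_t b)
{
  uint64_t r = 0;
  for (int lane = 0; lane < 8; lane++)
    {
      uint8_t x = (uint8_t) (a >> (8 * lane));
      uint8_t y = (uint8_t) (b >> (8 * lane));
      /* Each lane wraps modulo 256, as in the vector instruction.  */
      r |= (uint64_t) (uint8_t) (x * y) << (8 * lane);
    }
  return r;
}

/* Hypothetical helper: multiply-reduce eight bytes using whole-vector
   right shifts by 32, 16 and 8 bits, mirroring the ushr/mul sequence.
   Assumes little-endian lane order.  */
static uint8_t
reduce_mul_v8b (const uint8_t in[8])
{
  uint64_t v = 0;
  for (int lane = 0; lane < 8; lane++)
    v |= (uint64_t) in[lane] << (8 * lane);

  v = mul_v8b (v, v >> 32);   /* lanes 0-3 now hold in[i] * in[i+4]  */
  v = mul_v8b (v, v >> 16);   /* lanes 0-1 hold products of four inputs  */
  v = mul_v8b (v, v >> 8);    /* lane 0 holds the product of all eight  */
  return (uint8_t) v;         /* corresponds to "umov w1, v0.b[0]"  */
}
```

Calling reduce_mul_v8b on the array from the test case gives 17, the value the test checks for. The upper lanes end up multiplied by the zeroes shifted in, which is harmless because only lane 0 is read at the end.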

Tested with bootstrap + check-gcc on aarch64-none-linux-gnu.
Cross-testing of check-gcc on aarch64_be-none-elf in progress.

Ok if no regressions on big-endian?

Cheers,
--Alan

gcc/ChangeLog:

	* config/aarch64/aarch64-simd.md (vec_shr<mode>): New.

gcc/testsuite/ChangeLog:

	* lib/target-supports.exp
	(check_effective_target_whole_vector_shift): Add aarch64{,_be}.

