[AArch64/ARM 2/3] Rewrite AArch64 UZP Intrinsics using __builtin_shuffle

Marcus Shawcroft marcus.shawcroft@gmail.com
Wed Apr 23 18:00:00 GMT 2014


On 27 March 2014 17:25, Alan Lawrence <alan.lawrence@arm.com> wrote:
> This patch replaces the temporary inline assembler for vuzp_* in arm_neon.h
> with equivalent calls to __builtin_shuffle.  These are matched by
> aarch64_expand_vec_perm_const{,_1} to output (generally) the same assembler
> instructions.  That is, except for two-element vectors, where ZIP, UZP and
> TRN instructions all have the same effect; gcc's backend chooses to output
> ZIP so this patch also updates the 3 affected tests.
>
> Regression tested on aarch64-none-elf and aarch64_be-none-elf; the tests
> from the first patch still pass, modulo the updates herein.
>
> gcc/testsuite/ChangeLog:
> 2014-03-27  Alan Lawrence  <alan.lawrence@arm.com>
>
>         * gcc.target/aarch64/vuzps32_1.c: Expect zip1/2 insn rather than
>         uzp1/2.
>         * gcc.target/aarch64/vuzpu32_1.c: Likewise.
>         * gcc.target/aarch64/vuzpf32_1.c: Likewise.
>
> gcc/ChangeLog:
> 2014-03-27  Alan Lawrence  <alan.lawrence@arm.com>
>
>         * config/aarch64/arm_neon.h (vuzp1_f32, vuzp1_p8, vuzp1_p16,
>         vuzp1_s8, vuzp1_s16, vuzp1_s32, vuzp1_u8, vuzp1_u16, vuzp1_u32,
>         vuzp1q_f32, vuzp1q_f64, vuzp1q_p8, vuzp1q_p16, vuzp1q_s8,
>         vuzp1q_s16, vuzp1q_s32, vuzp1q_s64, vuzp1q_u8, vuzp1q_u16,
>         vuzp1q_u32, vuzp1q_u64, vuzp2_f32, vuzp2_p8, vuzp2_p16, vuzp2_s8,
>         vuzp2_s16, vuzp2_s32, vuzp2_u8, vuzp2_u16, vuzp2_u32, vuzp2q_f32,
>         vuzp2q_f64, vuzp2q_p8, vuzp2q_p16, vuzp2q_s8, vuzp2q_s16,
>         vuzp2q_s32, vuzp2q_s64, vuzp2q_u8, vuzp2q_u16, vuzp2q_u32,
>         vuzp2q_u64): Replace temporary asm with __builtin_shuffle.

OK /Marcus
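
For readers of the archive, here is a rough sketch of the kind of rewrite
the quoted description talks about. It is not the text of the patch: the
sketch_* names and the local vector typedefs are hypothetical stand-ins for
the arm_neon.h intrinsics and types, it relies on the GCC-specific
__builtin_shuffle extension, and the masks assume little-endian lane
numbering. It also shows why ZIP and UZP coincide for two-element vectors:
the selection masks come out identical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for int32x4_t/uint32x4_t and int32x2_t/uint32x2_t,
   built with GCC's generic vector extension so this compiles on any target.  */
typedef int32_t  v4si  __attribute__ ((vector_size (16)));
typedef uint32_t v4usi __attribute__ ((vector_size (16)));
typedef int32_t  v2si  __attribute__ ((vector_size (8)));
typedef uint32_t v2usi __attribute__ ((vector_size (8)));

/* UZP1: even-indexed elements of the concatenation a:b.  */
static inline v4si
sketch_uzp1q_s32 (v4si a, v4si b)
{
  return __builtin_shuffle (a, b, (v4usi) {0, 2, 4, 6});
}

/* UZP2: odd-indexed elements of the concatenation a:b.  */
static inline v4si
sketch_uzp2q_s32 (v4si a, v4si b)
{
  return __builtin_shuffle (a, b, (v4usi) {1, 3, 5, 7});
}

/* Two-element UZP1: even elements of {a0, a1, b0, b1}, i.e. {a0, b0}.  */
static inline v2si
sketch_uzp1_s32 (v2si a, v2si b)
{
  return __builtin_shuffle (a, b, (v2usi) {0, 2});
}

/* Two-element ZIP1: interleave the low halves, i.e. {a0, b0}.
   The mask is the same as for UZP1 above, so the backend is free to
   emit either instruction.  */
static inline v2si
sketch_zip1_s32 (v2si a, v2si b)
{
  return __builtin_shuffle (a, b, (v2usi) {0, 2});
}

/* Small self-check of the two-element equivalence.  */
static inline void
sketch_self_check (void)
{
  v2si a = {10, 11};
  v2si b = {20, 21};
  v2si u = sketch_uzp1_s32 (a, b);
  v2si z = sketch_zip1_s32 (a, b);
  assert (u[0] == z[0] && u[1] == z[1]);
}
```

With four elements the UZP masks differ from the ZIP ones, so only the
two-element variants (the three tests updated above) change the expected
instruction.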