This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
Re: [AArch64/ARM 2/3] Reimplement AArch64 TRN intrinsics with __builtin_shuffle
- From: Marcus Shawcroft <marcus dot shawcroft at gmail dot com>
- To: Alan Lawrence <alan dot lawrence at arm dot com>
- Cc: "gcc-patches at gcc dot gnu dot org" <gcc-patches at gcc dot gnu dot org>
- Date: Thu, 24 Apr 2014 09:03:47 +0100
- Subject: Re: [AArch64/ARM 2/3] Reimplement AArch64 TRN intrinsics with __builtin_shuffle
- References: <533594A3 dot 8070207 at arm dot com> <533596FA dot 8080407 at arm dot com>
On 28 March 2014 15:36, Alan Lawrence <alan.lawrence@arm.com> wrote:
> This patch replaces the temporary inline assembler for vtrn[q]_* in
> arm_neon.h with equivalent calls to __builtin_shuffle. These are matched by
> existing patterns in aarch64.c (aarch64_expand_vec_perm_const_1), outputting
> the same assembler instructions. For two-element vectors, ZIP, UZP and TRN
> instructions all have the same effect, and the backend chooses to output
> ZIP, so this patch also updates the 3 affected tests.
>
> Regression-tested on aarch64-none-elf and aarch64_be-none-elf; the tests
> from the first patch still pass, modulo the updates herein.
>
> gcc/testsuite/ChangeLog:
> 2014-03-28 Alan Lawrence <alan.lawrence@arm.com>
>
> * gcc.target/aarch64/vtrns32.c: Expect zip[12] insn rather than trn[12].
> * gcc.target/aarch64/vtrnu32.c: Likewise.
> * gcc.target/aarch64/vtrnf32.c: Likewise.
>
> gcc/ChangeLog:
> 2014-03-28 Alan Lawrence <alan.lawrence@arm.com>
>
> * config/aarch64/arm_neon.h (vtrn1_f32, vtrn1_p8, vtrn1_p16, vtrn1_s8,
> vtrn1_s16, vtrn1_s32, vtrn1_u8, vtrn1_u16, vtrn1_u32, vtrn1q_f32,
> vtrn1q_f64, vtrn1q_p8, vtrn1q_p16, vtrn1q_s8, vtrn1q_s16, vtrn1q_s32,
> vtrn1q_s64, vtrn1q_u8, vtrn1q_u16, vtrn1q_u32, vtrn1q_u64, vtrn2_f32,
> vtrn2_p8, vtrn2_p16, vtrn2_s8, vtrn2_s16, vtrn2_s32, vtrn2_u8,
> vtrn2_u16, vtrn2_u32, vtrn2q_f32, vtrn2q_f64, vtrn2q_p8, vtrn2q_p16,
> vtrn2q_s8, vtrn2q_s16, vtrn2q_s32, vtrn2q_s64, vtrn2q_u8, vtrn2q_u16,
> vtrn2q_u32, vtrn2q_u64): Replace temporary asm with __builtin_shuffle.
OK

/Marcus
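
For readers following the thread, here is a minimal sketch of the technique the patch description refers to: expressing a TRN-style permute as a call to __builtin_shuffle on GCC generic vectors. It is not the code from the patch. The typedefs v4si and v2si and the function names trn1_v4si, trn2_v4si, trn1_v2si and trn2_v2si are hypothetical, chosen only for illustration, and the sketch makes no attempt to model any endianness handling that the real arm_neon.h may need. It relies on GCC's vector extensions, so it assumes a GCC-compatible compiler that provides __builtin_shuffle.

```c
/* Illustrative only: generic GCC vectors, not the arm_neon.h types.  */
typedef int v4si __attribute__ ((vector_size (16)));
typedef int v2si __attribute__ ((vector_size (8)));

/* With two inputs of N lanes, mask indices 0..N-1 select from the
   first vector and N..2N-1 select from the second.  */

/* TRN1 on four lanes: even-numbered lanes of a and b, interleaved,
   giving {a0, b0, a2, b2}.  */
v4si
trn1_v4si (v4si a, v4si b)
{
  const v4si mask = { 0, 4, 2, 6 };
  return __builtin_shuffle (a, b, mask);
}

/* TRN2 on four lanes: odd-numbered lanes, giving {a1, b1, a3, b3}.  */
v4si
trn2_v4si (v4si a, v4si b)
{
  const v4si mask = { 1, 5, 3, 7 };
  return __builtin_shuffle (a, b, mask);
}

/* On two lanes TRN1 yields {a0, b0}.  ZIP1 (interleave the low halves)
   and UZP1 (even elements of the concatenation a0 a1 b0 b1) describe
   the same result, so all three share the mask {0, 2}.  */
v2si
trn1_v2si (v2si a, v2si b)
{
  const v2si mask = { 0, 2 };
  return __builtin_shuffle (a, b, mask);
}

/* Likewise TRN2, ZIP2 and UZP2 on two lanes all yield {a1, b1}.  */
v2si
trn2_v2si (v2si a, v2si b)
{
  const v2si mask = { 1, 3 };
  return __builtin_shuffle (a, b, mask);
}
```

Because the two-lane masks are identical for all three permute families, a backend matching constant permutes is free to pick any of the three instructions. That is consistent with the description above, where the 32-bit two-lane tests are updated to expect zip[12].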