This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH][AArch64][2/2] (Re)Implement vcopy<q>_lane<q> intrinsics
- From: James Greenhalgh <james dot greenhalgh at arm dot com>
- To: Kyrill Tkachov <kyrylo dot tkachov at foss dot arm dot com>
- Cc: GCC Patches <gcc-patches at gcc dot gnu dot org>, Marcus Shawcroft <marcus dot shawcroft at arm dot com>, Richard Earnshaw <Richard dot Earnshaw at arm dot com>, <nd at arm dot com>
- Date: Tue, 14 Jun 2016 10:38:36 +0100
- Subject: Re: [PATCH][AArch64][2/2] (Re)Implement vcopy<q>_lane<q> intrinsics
- References: <5756FCD3 dot 5050009 at foss dot arm dot com>
On Tue, Jun 07, 2016 at 05:56:51PM +0100, Kyrill Tkachov wrote:
> Hi all,
>
> This is the second part of James's patch from:
> https://gcc.gnu.org/ml/gcc-patches/2013-09/msg01068.html
> separated out. It reimplements the vcopyq_lane* intrinsics in C and
> adds implementations of the other missing vcopy<q>_lane<q> intrinsics.
>
> The differences from that patch are in the use of __aarch64_vset_lane_any and
> __aarch64_vget_lane_any rather than the typed variants of these that were
> used back in 2013 (and don't exist anymore).
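For readers unfamiliar with these intrinsics, the get-lane/set-lane composition described above can be modelled in portable C. This is only a sketch of the semantics, not the arm_neon.h code from the patch: the struct types and every model_* name are hypothetical stand-ins, and the run-time asserts stand in for lane-bounds checking that this model assumes.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for a 64-bit vector of two int32 lanes and a
   128-bit vector of four int32 lanes.  Not the real NEON types.  */
typedef struct { int32_t lane[2]; } model_int32x2;
typedef struct { int32_t lane[4]; } model_int32x4;

/* Model of a "get lane" helper: read one element of a 128-bit vector.  */
static int32_t
model_getq_lane (model_int32x4 v, int lane)
{
  assert (lane >= 0 && lane < 4);
  return v.lane[lane];
}

/* Model of a "set lane" helper: return a copy of V with one element
   replaced by ELEM.  V is passed by value, so the caller's copy is
   left unchanged.  */
static model_int32x2
model_set_lane (int32_t elem, model_int32x2 v, int lane)
{
  assert (lane >= 0 && lane < 2);
  v.lane[lane] = elem;
  return v;
}

/* Model of vcopy_laneq_s32 (a, lane1, b, lane2): copy lane LANE2 of the
   128-bit vector B into lane LANE1 of the 64-bit vector A.  */
static model_int32x2
model_vcopy_laneq_s32 (model_int32x2 a, int lane1,
                       model_int32x4 b, int lane2)
{
  return model_set_lane (model_getq_lane (b, lane2), a, lane1);
}
```

The other variants differ only in element type and in whether each operand is a 64-bit or a 128-bit vector (the "q" suffixes), so the same two-step shape applies to all of them.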
>
> The testcase is also adjusted for the ABI change in GCC 5 where integer x1
> vectors are now passed and returned in SIMD registers.
>
> The vcopy_laneq_f64 test in the testcase is XFAILed for now because GCC
> doesn't currently generate the optimal DUP instruction for it but instead
> emits a UMOV to a scalar register followed by an FMOV. This is a GCC 7
> regression tracked by PR 71307 and, I think, unrelated to this patch.
>
> Bootstrapped and tested on aarch64-none-linux-gnu. Also tested on
> aarch64_be-none-elf.
>
> Ok for trunk?
Again, this looks OK to me, but as it is based on my code I can't approve
it within the spirit of the write access policies. Please wait for Marcus
or Richard to take a look.
Thanks,
James
>
> Thanks,
> Kyrill
>
> 2016-06-07 Kyrylo Tkachov <kyrylo.tkachov@arm.com>
> James Greenhalgh <james.greenhalgh@arm.com>
>
> * config/aarch64/arm_neon.h (vcopyq_lane_f32, vcopyq_lane_f64,
> vcopyq_lane_p8, vcopyq_lane_p16, vcopyq_lane_s8, vcopyq_lane_s16,
> vcopyq_lane_s32, vcopyq_lane_s64, vcopyq_lane_u8, vcopyq_lane_u16,
> vcopyq_lane_u32, vcopyq_lane_u64): Reimplement in C.
> (vcopy_lane_f32, vcopy_lane_f64, vcopy_lane_p8, vcopy_lane_p16,
> vcopy_lane_s8, vcopy_lane_s16, vcopy_lane_s32, vcopy_lane_s64,
> vcopy_lane_u8, vcopy_lane_u16, vcopy_lane_u32, vcopy_lane_u64,
> vcopy_laneq_f32, vcopy_laneq_f64, vcopy_laneq_p8, vcopy_laneq_p16,
> vcopy_laneq_s8, vcopy_laneq_s16, vcopy_laneq_s32, vcopy_laneq_s64,
> vcopy_laneq_u8, vcopy_laneq_u16, vcopy_laneq_u32, vcopy_laneq_u64,
> vcopyq_laneq_f32, vcopyq_laneq_f64, vcopyq_laneq_p8, vcopyq_laneq_p16,
> vcopyq_laneq_s8, vcopyq_laneq_s16, vcopyq_laneq_s32, vcopyq_laneq_s64,
> vcopyq_laneq_u8, vcopyq_laneq_u16, vcopyq_laneq_u32, vcopyq_laneq_u64):
> New intrinsics.
>
> 2016-06-07 Kyrylo Tkachov <kyrylo.tkachov@arm.com>
> James Greenhalgh <james.greenhalgh@arm.com>
>
> * gcc.target/aarch64/vect_copy_lane_1.c: New test.