This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.

Re: [PATCH][AArch64][2/2] (Re)Implement vcopy<q>_lane<q> intrinsics


Ping.
Richard, Marcus, do you have any feedback on this?

https://gcc.gnu.org/ml/gcc-patches/2016-06/msg00503.html

Thanks,
Kyrill

On 14/06/16 10:38, James Greenhalgh wrote:
On Tue, Jun 07, 2016 at 05:56:51PM +0100, Kyrill Tkachov wrote:
Hi all,

This is the second part of James's patch from:
https://gcc.gnu.org/ml/gcc-patches/2013-09/msg01068.html
separated out. It reimplements the vcopyq_lane* intrinsics in C and
adds implementations of the other missing vcopy<q>_lane<q> intrinsics.

The difference from that patch is the use of __aarch64_vset_lane_any and
__aarch64_vget_lane_any rather than the typed variants that were used back
in 2013 (and no longer exist).
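
For readers unfamiliar with these intrinsics, the get-then-set composition
described above can be modelled in portable C. This is only a sketch of the
semantics, not the arm_neon.h code from the patch: the struct types and the
model_* helpers are hypothetical stand-ins for float32x2_t, float32x4_t,
__aarch64_vget_lane_any and __aarch64_vset_lane_any, and the run-time asserts
stand in for the lane-range checks that the real intrinsics are assumed to
perform at compile time.

```c
#include <assert.h>

/* Hypothetical stand-ins for float32x2_t (64-bit) and float32x4_t (128-bit).  */
typedef struct { float lane[2]; } model_f32x2;
typedef struct { float lane[4]; } model_f32x4;

/* Models the "get lane" step: read one element of the 64-bit source.  */
static float
model_vget_lane_f32x2 (model_f32x2 v, int lane)
{
  assert (lane >= 0 && lane < 2);
  return v.lane[lane];
}

/* Models the "set lane" step: return a copy of VEC with one element
   replaced by ELEM.  VEC is passed by value, so the caller's copy is
   left untouched.  */
static model_f32x4
model_vset_lane_f32x4 (float elem, model_f32x4 vec, int lane)
{
  assert (lane >= 0 && lane < 4);
  vec.lane[lane] = elem;
  return vec;
}

/* Models vcopyq_lane_f32 (a, lane1, b, lane2): the result is A with
   lane LANE1 replaced by lane LANE2 of B.  The "q" after "vcopy" means
   the destination is a 128-bit vector; "lane" without a trailing "q"
   means the source is a 64-bit vector.  */
static model_f32x4
model_vcopyq_lane_f32 (model_f32x4 a, int lane1, model_f32x2 b, int lane2)
{
  return model_vset_lane_f32x4 (model_vget_lane_f32x2 (b, lane2), a, lane1);
}
```

With a = {1, 2, 3, 4} and b = {10, 20}, model_vcopyq_lane_f32 (a, 2, b, 1)
yields {1, 2, 20, 4}. The other three shapes (vcopy_lane, vcopy_laneq,
vcopyq_laneq) would differ only in the widths of the two vectors.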

The testcase is also adjusted for the ABI change in GCC 5 where integer x1
vectors are now passed and returned in SIMD registers.

The vcopy_laneq_f64 test in the testcase is XFAILed because GCC currently
doesn't generate the optimal DUP instruction for it, emitting a UMOV to a
scalar register followed by an FMOV instead. This is a GCC 7 regression
tracked by PR 71307, and I think it is unrelated to this patch.
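
To illustrate why a single lane-extracting instruction should be enough for
that case: float64x1_t has only lane 0, so the copy overwrites the whole
destination and the result depends only on the selected lane of the source.
The following is a portable sketch of that behaviour, not the testcase
itself; the struct types and the model_ function name are hypothetical
stand-ins for float64x1_t, float64x2_t and vcopy_laneq_f64.

```c
#include <assert.h>

/* Hypothetical stand-ins for float64x1_t (64-bit) and float64x2_t (128-bit).  */
typedef struct { double lane[1]; } model_f64x1;
typedef struct { double lane[2]; } model_f64x2;

/* Models vcopy_laneq_f64 (a, lane1, b, lane2).  The destination has a
   single lane, so LANE1 can only be 0 and every element of A is
   replaced: the result is just lane LANE2 of B.  */
static model_f64x1
model_vcopy_laneq_f64 (model_f64x1 a, int lane1, model_f64x2 b, int lane2)
{
  assert (lane1 == 0);
  assert (lane2 >= 0 && lane2 < 2);
  a.lane[lane1] = b.lane[lane2];
  return a;
}
```

Whatever value a holds on entry, the result equals b.lane[lane2], which is
why moving the element through a scalar register is unnecessary work.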

Bootstrapped and tested on aarch64-none-linux-gnu.  Also tested on
aarch64_be-none-elf.

Ok for trunk?

Again, this looks OK to me, but as it is based on my code I can't approve
it within the spirit of the write access policies. Please wait for Marcus
or Richard to take a look.

Thanks,
James

Thanks,
Kyrill

2016-06-07  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>
             James Greenhalgh  <james.greenhalgh@arm.com>

     * config/aarch64/arm_neon.h (vcopyq_lane_f32, vcopyq_lane_f64,
     vcopyq_lane_p8, vcopyq_lane_p16, vcopyq_lane_s8, vcopyq_lane_s16,
     vcopyq_lane_s32, vcopyq_lane_s64, vcopyq_lane_u8, vcopyq_lane_u16,
     vcopyq_lane_u32, vcopyq_lane_u64): Reimplement in C.
     (vcopy_lane_f32, vcopy_lane_f64, vcopy_lane_p8, vcopy_lane_p16,
     vcopy_lane_s8, vcopy_lane_s16, vcopy_lane_s32, vcopy_lane_s64,
     vcopy_lane_u8, vcopy_lane_u16, vcopy_lane_u32, vcopy_lane_u64,
     vcopy_laneq_f32, vcopy_laneq_f64, vcopy_laneq_p8, vcopy_laneq_p16,
     vcopy_laneq_s8, vcopy_laneq_s16, vcopy_laneq_s32, vcopy_laneq_s64,
     vcopy_laneq_u8, vcopy_laneq_u16, vcopy_laneq_u32, vcopy_laneq_u64,
     vcopyq_laneq_f32, vcopyq_laneq_f64, vcopyq_laneq_p8, vcopyq_laneq_p16,
     vcopyq_laneq_s8, vcopyq_laneq_s16, vcopyq_laneq_s32, vcopyq_laneq_s64,
     vcopyq_laneq_u8, vcopyq_laneq_u16, vcopyq_laneq_u32, vcopyq_laneq_u64):
     New intrinsics.

2016-06-07  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>
             James Greenhalgh  <james.greenhalgh@arm.com>

     * gcc.target/aarch64/vect_copy_lane_1.c: New test.
