[PATCH 2/3][AArch64] Extend aarch64_simd_vec_set pattern, replace asm for vld1_lane
Alan Lawrence
alan.lawrence@arm.com
Fri Nov 14 10:46:00 GMT 2014
The vld1_lane intrinsic is currently implemented using inline asm. This patch
replaces that with a scalar load and a straightforward use of vset_lane (which
gives us correct big-endian lane-flipping in a simple manner).
Naively this would produce assembler along the lines of (for vld1_lane_u8):
        ldrb    w0, [x0]
        ins     v0.b[5], w0
Hence, the patch also extends the aarch64_simd_vec_set pattern, adding a variant
that reads from a memory operand, producing the expected:
        ld1     {v0.b}[5], [x0]
...and thus we'll also get that assembler from a programmer writing natively in
GCC vector extensions and not using intrinsics :).
I've also added a testcase, as existing tests in aarch64 and advsimd-intrinsics
seemed only to cover vld{2,3,4}_lane, not vld1_lane.
gcc/ChangeLog:
* config/aarch64/aarch64-simd.md (aarch64_simd_vec_set<mode>): Add
variant reading from memory and assembling to ld1.
* config/aarch64/arm_neon.h (vld1_lane_f32, vld1_lane_f64, vld1_lane_p8,
vld1_lane_p16, vld1_lane_s8, vld1_lane_s16, vld1_lane_s32,
vld1_lane_s64, vld1_lane_u8, vld1_lane_u16, vld1_lane_u32,
vld1_lane_u64, vld1q_lane_f32, vld1q_lane_f64, vld1q_lane_p8,
vld1q_lane_p16, vld1q_lane_s8, vld1q_lane_s16, vld1q_lane_s32,
vld1q_lane_s64, vld1q_lane_u8, vld1q_lane_u16, vld1q_lane_u32,
vld1q_lane_u64): Replace asm with vset_lane and pointer dereference.
gcc/testsuite/ChangeLog:
* gcc.target/aarch64/vld1_lane.c: New test.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: vld1_lane.patch
Type: text/x-patch
Size: 30889 bytes
Desc: not available
URL: <http://gcc.gnu.org/pipermail/gcc-patches/attachments/20141114/30a7047a/attachment.bin>