[PATCH 2/3][AArch64] Extend aarch64_simd_vec_set pattern, replace asm for vld1_lane
Alan Lawrence
alan.lawrence@arm.com
Fri Nov 14 10:46:00 GMT 2014
The vld1_lane intrinsic is currently implemented using inline asm. This patch
replaces that with a scalar load and a straightforward use of vset_lane (which
gives us correct big-endian lane-flipping in a simple manner).
Naively this would produce assembler along the lines of (for vld1_lane_u8):
        ldrb    w0, [x0]
        ins     v0.b[5], w0
Hence, the patch also extends the aarch64_simd_vec_set pattern, adding a variant
that reads from a memory operand, producing the expected:
        ld1     {v0.b}[5], [x0]
...and thus we'll also get that assembler from a programmer writing natively in
GCC vector extensions and not using intrinsics :).
I've also added a testcase, as existing tests in aarch64 and advsimd-intrinsics
seemed only to cover vld{2,3,4}_lane, not vld1_lane.
gcc/ChangeLog:
* config/aarch64/aarch64-simd.md (aarch64_simd_vec_set<mode>): Add
variant reading from memory and assembling to ld1.
* config/aarch64/arm_neon.h (vld1_lane_f32, vld1_lane_f64, vld1_lane_p8,
vld1_lane_p16, vld1_lane_s8, vld1_lane_s16, vld1_lane_s32,
vld1_lane_s64, vld1_lane_u8, vld1_lane_u16, vld1_lane_u32,
vld1_lane_u64, vld1q_lane_f32, vld1q_lane_f64, vld1q_lane_p8,
vld1q_lane_p16, vld1q_lane_s8, vld1q_lane_s16, vld1q_lane_s32,
vld1q_lane_s64, vld1q_lane_u8, vld1q_lane_u16, vld1q_lane_u32,
vld1q_lane_u64): Replace asm with vset_lane and pointer dereference.
gcc/testsuite/ChangeLog:
* gcc.target/aarch64/vld1_lane.c: New test.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: vld1_lane.patch
Type: text/x-patch
Size: 30889 bytes
Desc: not available
URL: <http://gcc.gnu.org/pipermail/gcc-patches/attachments/20141114/30a7047a/attachment.bin>