


Re: [PATCH][AArch64]Fix ICE at -O0 on vld1_lane intrinsics


Ping.

Alan Lawrence wrote:
The vld1_lane intrinsics ICE at -O0 because they contain a call to the vset_lane intrinsics, through which the lane index is not constant-propagated. (They are fine at -O1 and higher.) This patch fixes the ICE by replacing that call with a macro.
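
For illustration, the shape of the change in each affected intrinsic is roughly as follows (a simplified sketch rather than the exact header text, using vld1_lane_s32 as a representative; the __aarch64_vset_lane_any macro is described below):

/* Sketch only -- simplified from the real arm_neon.h definitions.  */
__extension__ static __inline int32x2_t __attribute__ ((__always_inline__))
vld1_lane_s32 (const int32_t *__src, int32x2_t __vec, const int __lane)
{
  /* Previously:  return vset_lane_s32 (*__src, __vec, __lane);
     At -O0 the lane index is not constant-propagated through that call,
     so the constant-lane handling inside vset_lane_s32 sees a variable
     and the compiler ICEs.  Expanding a macro here instead leaves no
     call for the constant to have to propagate through.  */
  return __aarch64_vset_lane_any (*__src, __vec, __lane);
}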

Rather than defining many individual macros __aarch64_vset(q?)_lane_[uspf](8|16|32|64), this introduces a __AARCH64_NUM_LANES macro using sizeof(), so that a single __aarch64_vset_lane_any macro handles all variants (with bounds-checking and endianness-flipping). This leaves less room for error than writing the number of lanes for each variant by hand, as before.
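
The lane-count macro is essentially just a sizeof ratio; a minimal sketch of the idea (not necessarily the exact patch text):

/* Number of lanes in a vector, derived from its type rather than spelled
   out by hand for each variant.  */
#define __AARCH64_NUM_LANES(__v)  (sizeof (__v) / sizeof (__v[0]))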

The endianness-flipping is also factored out into a separate macro, __aarch64_lane; I intend to reuse this for vget_lane in a follow-up patch.
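
A sketch of how the two macros fit together (assuming big-endian is detected via __AARCH64EB__, and with the lane bounds check elided since its exact form isn't shown here):

/* Map an architectural lane number to a GCC vector index; on big-endian
   the two numberings run in opposite directions.  */
#ifdef __AARCH64EB__
#define __aarch64_lane(__vec, __idx)  (__AARCH64_NUM_LANES (__vec) - 1 - (__idx))
#else
#define __aarch64_lane(__vec, __idx)  (__idx)
#endif

/* One setter for every element type and width: check the index (elided),
   flip it for endianness, then store through a vector subscript.  */
#define __aarch64_vset_lane_any(__elem, __vec, __index)           \
  __extension__                                                   \
  ({                                                              \
    /* lane bounds check on __index goes here */                  \
    __vec[__aarch64_lane (__vec, __index)] = (__elem);            \
    __vec;                                                        \
  })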

Tested with check-gcc on aarch64-none-elf and aarch64_be-none-elf (including a new test that FAILs without this patch).

Ok for trunk?


gcc/ChangeLog:

	* config/aarch64/arm_neon.h (__AARCH64_NUM_LANES, __aarch64_lane *2):
	New.
	(__aarch64_vset_lane_any): Redefine using the above, same for BE and LE.
	(vset_lane_f32, vset_lane_f64, vset_lane_p8, vset_lane_p16,
	vset_lane_s8, vset_lane_s16, vset_lane_s32, vset_lane_s64,
	vset_lane_u8, vset_lane_u16, vset_lane_u32, vset_lane_u64): Remove
	number of lanes.
	(vld1_lane_f32, vld1_lane_f64, vld1_lane_p8, vld1_lane_p16,
	vld1_lane_s8, vld1_lane_s16, vld1_lane_s32, vld1_lane_s64,
	vld1_lane_u8, vld1_lane_u16, vld1_lane_u32, vld1_lane_u64): Call
	__aarch64_vset_lane_any rather than vset_lane_xxx.

gcc/testsuite/ChangeLog:

	* gcc.target/aarch64/vld1_lane-o0.c: New test.
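
The new test itself is not reproduced here, but the scenario it needs to cover is simply a vld1_lane call compiled at -O0; a hypothetical reduced form (details are mine, not the actual testcase) would look something like:

/* Hypothetical reduction -- not the actual gcc.target/aarch64/vld1_lane-o0.c.  */
/* { dg-do compile } */
/* { dg-options "-O0" } */

#include <arm_neon.h>

int32x2_t
foo (const int32_t *p, int32x2_t v)
{
  return vld1_lane_s32 (p, v, 1);   /* previously ICEd at -O0 */
}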


