[PATCH ARM] PR/61062 Fix arm_neon.h ZIP/UZP/TRN for bigendian

Alan Lawrence alan.lawrence@arm.com
Wed May 14 13:52:00 GMT 2014


Hi,

Due to differences in how the ARM C Language Extensions and gcc's vector 
extensions deal with indices within vectors, the __builtin_shuffle masks used to 
implement the ZIP, UZP and TRN Neon Intrinsics in arm_neon.h are correct only 
for little-endian. (The problem on bigendian has recently been revealed by new 
tests in gcc.target/arm/simd/.)

This patch corrects the indices using "#ifdef __ARM_BIG_ENDIAN" through 
arm_neon.h. I've tested all the arm-specific tests (arm.exp acle.exp aapcs.exp 
simd.exp neon.exp) on both arm-none-eabi and armeb-none-eabi and there are no 
regressions, and on armeb-none-eabi this patch fixes FAIL -> PASS for 
simd/v{uzp,zip,trn}*_1.c.

Note the patch also modifies gcc.target/arm/pr48252.c. A bit of diving into the 
history of this test reveals
    *the test was first written in the days when the arm_neon.h implementation 
used builtins such as __builtin_neon_vzipv8qi (which were thus correct for 
bigendian).
    *In SVN rev 189294, ZIP intrinsics were rewritten to use __builtin_shuffle 
(with little-endian masks); this broke pr48252.c on bigendian, but this was not 
detected until...
    *In SVN rev 191200, in which pr48252.c was modified to expect different 
results according to endianness - that is, "fixing" the test to match the broken 
implementation. (I have verified that this updated test failed on the original 
__builtin_neon_vzipv8qi implementation.)

The fix to pr48252.c here largely reverts to the original form although keeps 
the (correct, proper) use of vld1.

gcc/ChangeLog:

	* config/arm/arm_neon.h (vtrn_s8, vtrn_s16, vtrn_u8, vtrn_u16, vtrn_p8,
	vtrn_p16, vtrn_s32, vtrn_f32, vtrn_u32, vtrnq_s8, vtrnq_s16, vtrnq_s32,
	vtrnq_f32, vtrnq_u8, vtrnq_u16, vtrnq_u32, vtrnq_p8, vtrnq_p16, vzip_s8,
	vzip_s16, vzip_u8, vzip_u16, vzip_p8, vzip_p16, vzip_s32, vzip_f32,
	vzip_u32, vzipq_s8, vzipq_s16, vzipq_s32, vzipq_f32, vzipq_u8,
	vzipq_u16, vzipq_u32, vzipq_p8, vzipq_p16, vuzp_s8, vuzp_s16, vuzp_s32,
	vuzp_f32, vuzp_u8, vuzp_u16, vuzp_u32, vuzp_p8, vuzp_p16, vuzpq_s8,
	vuzpq_s16, vuzpq_s32, vuzpq_f32, vuzpq_u8, vuzpq_u16, vuzpq_u32,
	vuzpq_p8, vuzpq_p16): Correct mask for bigendian.

gcc/testsuite/ChangeLog:

	* gcc.target/arm/pr48252.c (main): Expect same result as endian-neutral.

Cheers, Alan
-------------- next part --------------
A non-text attachment was scrubbed...
Name: arm_neon_bigendian.patch
Type: text/x-patch
Size: 44892 bytes
Desc: not available
URL: <http://gcc.gnu.org/pipermail/gcc-patches/attachments/20140514/5244abd2/attachment.bin>


More information about the Gcc-patches mailing list