int64x1_t, uint64x1_t and float64x1_t in AArch64 arm_neon.h shall be vector types. However currently they are treated as different names for the corresponding C/C++ types: int64_t, uint64_t and double in the GCC AArch64 backend. Function parameters of such types shall be passed in NEON SIMD registers and the names of C++ functions with parameters of any of those types shall be mangled properly. The following test cases demonstrate the issue: ---------------- test.c ---------------- /* Parameter passing issue */ #include "arm_neon.h" int64x1_t aaaa; uint64x1_t bbbb; float64x1_t cccc; void ffff (int64x1_t); void gggg (uint64x1_t); void hhhh (float64x1_t); void test (void) { ffff (aaaa); gggg (bbbb); hhhh (cccc); } ---------------- CUT ---------------- 'aaaa', 'bbbb' and 'cccc' shall be passed in register 'd0', however 'aaaa' and 'bbbb' are currently passed in 'x0' instead. 'cccc' is correctly passed in 'd0', but only by chance, as parameters of double are passed in FP registers which overlay with SIMD registers. ---------------- test.cpp ---------------- /* C++ name mangling issue */ #include "arm_neon.h" void ffff (int64x1_t aaaa) {} void gggg (uint64x1_t bbbb) {} void hhhh (float64x1_t cccc) {} ---------------- CUT ---------------- Instead of the following expected mangled names: _Z4ffff11__Int64x1_t _Z4gggg12__Uint64x1_t _Z4hhhh13__Float64x1_t the function names are currently mangled to _Z4ffffl _Z4ggggm _Z4hhhhd
Confirmed. The problem is: typedef int64_t int64x1_t; typedef int32_t int32x1_t; typedef int16_t int16x1_t; typedef int8_t int8x1_t; typedef double float64x1_t; typedef uint64_t uint64x1_t; typedef uint32_t uint32x1_t; typedef uint16_t uint16x1_t; typedef uint8_t uint8x1_t; Those all should add __attribute__((vector_size(sizeof(type) )) to them to make them a vector of one element.
Apart from the parameter passing and C++ name mangling issues, there is also an issue w.r.t. the implicit conversion between the scalar types and their vector-type peers. For intrinsics code to be portable, conversion between neon intrinsics vector types and the scalar types should be done via the corresponding vcreat and vget_lane intrinsics. Currently, arm_neon.h has the following typedefs: typedef int64_t int64x1_t; typedef double float64x1_t; typedef uint64_t uint64x1_t; which have unfortunately allowed variables of e.g. type int64_t and type int64x1_t, to be interoperable. User code replying on this mistake may encounter compilation errors after the fix.
Author: alalaw01 Date: Mon Jun 23 12:46:52 2014 New Revision: 211892 URL: https://gcc.gnu.org/viewcvs?rev=211892&root=gcc&view=rev Log: PR/60825 Make float64x1_t in arm_neon.h a proper vector type gcc/ChangeLog: PR target/60825 * config/aarch64/aarch64.c (aarch64_simd_mangle_map): Add entry for V1DFmode. * config/aarch64/aarch64-builtins.c (aarch64_simd_builtin_type_mode): add V1DFmode (BUILTIN_VD1): New. (BUILTIN_VD_RE): Remove. (aarch64_init_simd_builtins): Add V1DF to modes/modenames. (aarch64_fold_builtin): Update reinterpret patterns, df becomes v1df. * config/aarch64/aarch64-simd-builtins.def (create): Make a v1df variant but not df. (vreinterpretv1df*, vreinterpret*v1df): New. (vreinterpretdf*, vreinterpret*df): Remove. * config/aarch64/aarch64-simd.md (aarch64_create, aarch64_reinterpret*): Generate V1DFmode pattern not DFmode. * config/aarch64/iterators.md (VD_RE): Include V1DF, remove DF. (VD1): New. * config/aarch64/arm_neon.h (float64x1_t): typedef with gcc extensions. (vcreate_f64): Remove cast, use v1df builtin. (vcombine_f64): Remove cast, get elements with gcc vector extensions. (vget_low_f64, vabs_f64, vceq_f64, vceqz_f64, vcge_f64, vgfez_f64, vcgt_f64, vcgtz_f64, vcle_f64, vclez_f64, vclt_f64, vcltz_f64, vdup_n_f64, vdupq_lane_f64, vld1_f64, vld2_f64, vld3_f64, vld4_f64, vmov_n_f64, vst1_f64): Use gcc vector extensions. (vget_lane_f64, vdupd_lane_f64, vmulq_lane_f64, ): Use gcc extensions, add range check using __builtin_aarch64_im_lane_boundsi. (vfma_lane_f64, vfmad_lane_f64, vfma_laneq_f64, vfmaq_lane_f64, vfms_lane_f64, vfmsd_lane_f64, vfms_laneq_f64, vfmsq_lane_f64): Fix type signature, use gcc vector extensions. (vreinterpret_p8_f64, vreinterpret_p16_f64, vreinterpret_f32_f64, vreinterpret_f64_f32, vreinterpret_f64_p8, vreinterpret_f64_p16, vreinterpret_f64_s8, vreinterpret_f64_s16, vreinterpret_f64_s32, vreinterpret_f64_s64, vreinterpret_f64_u8, vreinterpret_f64_u16, vreinterpret_f64_u32, vreinterpret_f64_u64, vreinterpret_s8_f64, vreinterpret_s16_f64, vreinterpret_s32_f64, vreinterpret_s64_f64, vreinterpret_u8_f64, vreinterpret_u16_f64, vreinterpret_u32_f64, vreinterpret_u64_f64): Use v1df builtin not df. gcc/testsuite/ChangeLog: * g++.dg/abi/mangle-neon-aarch64.C: Also test mangling of float64x1_t. * gcc.target/aarch64/aapcs/test_64x1_1.c: New test. * gcc.target/aarch64/aapcs/func-ret-64x1_1.c: New test. * gcc.target/aarch64/simd/ext_f64_1.c (main): Compare vector elements. * gcc.target/aarch64/vadd_f64.c: Rewrite with macro to use vector types. * gcc.target/aarch64/vsub_f64.c: Likewise. * gcc.target/aarch64/vdiv_f.c (INDEX*, RUN_TEST): Remove indexing scheme as now the same for all variants. * gcc.target/aarch64/vrnd_f64_1.c (compare_f64): Return float64_t not float64x1_t. Added: trunk/gcc/testsuite/gcc.target/aarch64/aapcs64/func-ret-64x1_1.c trunk/gcc/testsuite/gcc.target/aarch64/aapcs64/test_64x1_1.c Modified: trunk/gcc/ChangeLog trunk/gcc/config/aarch64/aarch64-builtins.c trunk/gcc/config/aarch64/aarch64-simd-builtins.def trunk/gcc/config/aarch64/aarch64-simd.md trunk/gcc/config/aarch64/aarch64.c trunk/gcc/config/aarch64/arm_neon.h trunk/gcc/config/aarch64/iterators.md trunk/gcc/testsuite/ChangeLog trunk/gcc/testsuite/g++.dg/abi/mangle-neon-aarch64.C trunk/gcc/testsuite/gcc.target/aarch64/simd/ext_f64_1.c trunk/gcc/testsuite/gcc.target/aarch64/vadd_f64.c trunk/gcc/testsuite/gcc.target/aarch64/vdiv_f.c trunk/gcc/testsuite/gcc.target/aarch64/vrnd_f64_1.c trunk/gcc/testsuite/gcc.target/aarch64/vsub_f64.c
Author: alalaw01 Date: Mon Jun 23 14:07:42 2014 New Revision: 211894 URL: https://gcc.gnu.org/viewcvs?rev=211894&root=gcc&view=rev Log: PR/60825 Make {int,uint}64x1_t in arm_neon.h a proper vector type gcc/ChangeLog: PR target/60825 * config/aarch64/aarch64-builtins.c (aarch64_types_unop_qualifiers): Ignore third operand if present by marking qualifier_internal. * config/aarch64/aarch64-simd-builtins.def (abs): Comment. * config/aarch64/arm_neon.h (int64x1_t, uint64x1_t): Typedef to GCC vector extension. (aarch64_vget_lane_s64, aarch64_vdup_lane_s64, arch64_vdupq_lane_s64, aarch64_vdupq_lane_u64): Remove macro. (vqadd_s64, vqadd_u64, vqsub_s64, vqsub_u64, vqneg_s64, vqabs_s64, vcreate_s64, vcreate_u64, vreinterpret_s64_f64, vreinterpret_u64_f64, vcombine_u64, vbsl_s64, vbsl_u64, vceq_s64, vceq_u64, vceqz_s64, vceqz_u64, vcge_s64, vcge_u64, vcgez_s64, vcgt_s64, vcgt_u64, vcgtz_s64, vcle_s64, vcle_u64, vclez_s64, vclt_s64, vclt_u64, vcltz_s64, vdup_n_s64, vdup_n_u64, vld1_s64, vld1_u64, vmov_n_s64, vmov_n_u64, vqdmlals_lane_s32, vqdmlsls_lane_s32, vqdmulls_lane_s32, vqrshl_s64, vqrshl_u64, vqrshl_u64, vqshl_s64, vqshl_u64, vqshl_n_s64, vqshl_n_u64, vqshl_n_s64, vqshl_n_u64, vqshlu_n_s64, vrshl_s64, vrshl_u64, vrshr_n_s64, vrshr_n_u64, vrsra_n_s64, vrsra_n_u64, vshl_n_s64, vshl_n_u64, vshl_s64, vshl_u64, vshr_n_s64, vshr_n_u64, vsli_n_s64, vsli_n_u64, vsqadd_u64, vsra_n_s64, vsra_n_u64, vsri_n_s64, vsri_n_u64, vst1_s64, vst1_u64, vtst_s64, vtst_u64, vuqadd_s64): Wrap existing logic in GCC vector extensions (vpaddd_s64, vaddd_s64, vaddd_u64, vceqd_s64, vceqd_u64, vceqzd_s64 vceqzd_u64, vcged_s64, vcged_u64, vcgezd_s64, vcgtd_s64, vcgtd_u64, vcgtzd_s64, vcled_s64, vcled_u64, vclezd_s64, vcltd_s64, vcltd_u64, vcltzd_s64, vqdmlals_s32, vqdmlsls_s32, vqmovnd_s64, vqmovnd_u64 vqmovund_s64, vqrshld_s64, vqrshld_u64, vqrshrnd_n_s64, vqrshrnd_n_u64, vqrshrund_n_s64, vqshld_s64, vqshld_u64, vqshld_n_u64, vqshrnd_n_s64, vqshrnd_n_u64, vqshrund_n_s64, vrshld_u64, vrshrd_n_u64, vrsrad_n_u64, vshld_n_u64, vshld_s64, vshld_u64, vslid_n_u64, vsqaddd_u64, vsrad_n_u64, vsrid_n_u64, vsubd_s64, vsubd_u64, vtstd_s64, vtstd_u64): Fix type signature. (vabs_s64): Use GCC vector extensions; call __builtin_aarch64_absdi. (vget_high_s64, vget_high_u64): Reimplement with GCC vector extensions. (__GET_LOW, vget_low_u64): Wrap result using vcreate_u64. (vget_low_s64): Use __GET_LOW macro. (vget_lane_s64, vget_lane_u64, vdupq_lane_s64, vdupq_lane_u64): Use gcc vector extensions, add call to __builtin_aarch64_lane_boundsi. (vdup_lane_s64, vdup_lane_u64,): Add __builtin_aarch64_lane_bound_si. (vdupd_lane_s64, vdupd_lane_u64): Fix type signature, add __builtin_aarch64_lane_boundsi, use GCC vector extensions. (vcombine_s64): Use GCC vector extensions; remove cast. (vqaddd_s64, vqaddd_u64, vqdmulls_s32, vqshld_n_s64, vqshlud_n_s64, vqsubd_s64, vqsubd_u64, vrshld_s64, vrshrd_n_s64, vrsrad_n_s64, vshld_n_s64, vshrd_n_s64, vslid_n_s64, vsrad_n_s64, vsrid_n_s64): Fix type signature; remove cast. gcc/testsuite/ChangeLog: * g++.dg/abi/mangle-neon-aarch64.C (f22, f23): New tests of [u]int64x1_t. * gcc.target/aarch64/aapcs64/func-ret-64x1_1.c: Add {u,}int64x1 cases. * gcc.target/aarch64/aapcs64/test_64x1_1.c: Likewise. * gcc.target/aarch64/scalar_intrinsics.c (test_vaddd_u64, test_vaddd_s64, test_vceqd_s64, test_vceqzd_s64, test_vcged_s64, test_vcled_s64, test_vcgezd_s64, test_vcged_u64, test_vcgtd_s64, test_vcltd_s64, test_vcgtzd_s64, test_vcgtd_u64, test_vclezd_s64, test_vcltzd_s64, test_vqaddd_u64, test_vqaddd_s64, test_vqdmlals_s32, test_vqdmlsls_s32, test_vqdmulls_s32, test_vuqaddd_s64, test_vsqaddd_u64, test_vqmovund_s64, test_vqmovnd_s64, test_vqmovnd_u64, test_vsubd_u64, test_vsubd_s64, test_vqsubd_u64, test_vqsubd_s64, test_vshld_s64, test_vshld_u64, test_vrshld_s64, test_vrshld_u64, test_vshrd_n_s64, test_vshrd_n_u64, test_vsrad_n_s64, test_vsrad_n_u64, test_vrshrd_n_s64, test_vrshrd_n_u64, test_vrsrad_n_s64, test_vrsrad_n_u64, test_vqrshld_s64, test_vqrshld_u64, test_vqshlud_n_s64, test_vqshld_s64, test_vqshld_u64, test_vqshld_n_u64, test_vqshrund_n_s64, test_vqrshrund_n_s64, test_vqshrnd_n_s64, test_vqshrnd_n_u64, test_vqrshrnd_n_s64, test_vqrshrnd_n_u64, test_vshld_n_s64, test_vshdl_n_u64, test_vslid_n_s64, test_vslid_n_u64, test_vsrid_n_s64, test_vsrid_n_u64): Fix signature to match intrinsic. (test_vabs_s64): Remove. (test_vaddd_s64_2, test_vsubd_s64_2): Use force_simd. (test_vdupd_lane_s64): Rename to... (test_vdupd_laneq_s64): ...and remove a call to force_simd. (test_vdupd_lane_u64): Rename to... (test_vdupd_laneq_u64): ...and remove a call to force_simd. (test_vtst_s64): Rename to... (test_vtstd_s64): ...and change int64x1_t to int64_t. (test_vtst_u64): Rename to... (test_vtstd_u64): ...and change uint64x1_t to uint64_t. * gcc.target/aarch64/singleton_intrinsics_1.c: New file. * gcc.target/aarch64/vdup_lane_1.c, gcc.target/aarch64/vdup_lane_2.c: Remove out-of-bounds tests. * gcc.target/aarch64/vneg_s.c (INDEX*, RUN_TEST): Remove INDEX macro. Added: trunk/gcc/testsuite/gcc.target/aarch64/singleton_intrinsics_1.c Modified: trunk/gcc/ChangeLog trunk/gcc/config/aarch64/aarch64-builtins.c trunk/gcc/config/aarch64/aarch64-simd-builtins.def trunk/gcc/config/aarch64/aarch64.c trunk/gcc/config/aarch64/arm_neon.h trunk/gcc/testsuite/ChangeLog trunk/gcc/testsuite/g++.dg/abi/mangle-neon-aarch64.C trunk/gcc/testsuite/gcc.target/aarch64/aapcs64/func-ret-64x1_1.c trunk/gcc/testsuite/gcc.target/aarch64/aapcs64/test_64x1_1.c trunk/gcc/testsuite/gcc.target/aarch64/scalar_intrinsics.c trunk/gcc/testsuite/gcc.target/aarch64/simd/ext_s64.x trunk/gcc/testsuite/gcc.target/aarch64/simd/ext_u64.x trunk/gcc/testsuite/gcc.target/aarch64/vdup_lane_1.c trunk/gcc/testsuite/gcc.target/aarch64/vdup_lane_2.c trunk/gcc/testsuite/gcc.target/aarch64/vneg_s.c
GCC 4.9.1 has been released.
GCC 4.9.2 has been released.
GCC 4.9.3 has been released.
Fixed so closing.