This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [patch][aarch64]: add intrinsics for vld1(q)_x4 and vst1(q)_x4


Hi all,

On 8/6/19 10:51 AM, Richard Earnshaw (lists) wrote:
On 18/07/2019 18:18, James Greenhalgh wrote:
> On Mon, Jun 10, 2019 at 06:21:05PM +0100, Sylvia Taylor wrote:
>> Greetings,
>>
>> This patch adds the intrinsic functions for:
>> - vld1_<mode>_x4
>> - vst1_<mode>_x4
>> - vld1q_<mode>_x4
>> - vst1q_<mode>_x4
>>
>> Bootstrapped and tested on aarch64-none-linux-gnu.
>>
>> Ok for trunk? If yes, I don't have any commit rights, so can someone
>> please commit it on my behalf.
>
> Hi,
>
> I'm concerned by this strategy for implementing the arm_neon.h builtins:
>
>> +__extension__ extern __inline int8x8x4_t
>> +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
>> +vld1_s8_x4 (const int8_t *__a)
>> +{
>> +  union { int8x8x4_t __i; __builtin_aarch64_simd_xi __o; } __au;
>> +  __au.__o
>> +    = __builtin_aarch64_ld1x4v8qi ((const __builtin_aarch64_simd_qi *) __a);
>> +  return __au.__i;
>> +}
>
> As far as I know this is undefined behaviour in C++11. This was the best
> resource I could find pointing to the relevant standards paragraphs.
>
> https://stackoverflow.com/questions/11373203/accessing-inactive-union-member-and-undefined-behavior
>
> That said, GCC explicitly allows it, so maybe this is fine?
>
> https://gcc.gnu.org/onlinedocs/gcc-9.1.0/gcc/Optimize-Options.html#Type-punning
>
> Can anyone from the languages side chime in on whether we're exposing
> undefined behaviour (in either C or C++) here?

Yes, this is a GNU extension.  My only question is whether or not this
can be disabled within GCC if you're trying to check for strict
standards conformance of your code?  And if so, is there a way of making
sure that this header still works in that case?  A number of GNU
extensions can be protected with __extension__ but it's not clear how
that could be applied in this case.  Perhaps the outer __extension__ on
the function will already do that.

It should still work. The only relevant flag is -fstrict-aliasing and it is documented to preserve this case:

https://gcc.gnu.org/onlinedocs/gcc-9.2.0/gcc/Optimize-Options.html#Optimize-Options

Note that we've already been using this idiom in arm_neon.h since 2014 [1] and it's worked fine.

Thanks,

Kyrill

[1] http://gcc.gnu.org/r209880



R.



>
> Thanks,
> James
>
>
>
>>
>> Cheers,
>> Syl
>>
>> gcc/ChangeLog:
>>
>> 2019-06-10  Sylvia Taylor <sylvia.taylor@arm.com>
>>
>>       * config/aarch64/aarch64-simd-builtins.def:
>>       (ld1x4): New.
>>       (st1x4): Likewise.
>>       * config/aarch64/aarch64-simd.md:
>>       (aarch64_ld1x4<VALLDIF:mode>): New pattern.
>>       (aarch64_st1x4<VALLDIF:mode>): Likewise.
>>       (aarch64_ld1_x4_<mode>): Likewise.
>>       (aarch64_st1_x4_<mode>): Likewise.
>>       * config/aarch64/arm_neon.h:
>>       (vld1_s8_x4): New function.
>>       (vld1q_s8_x4): Likewise.
>>       (vld1_s16_x4): Likewise.
>>       (vld1q_s16_x4): Likewise.
>>       (vld1_s32_x4): Likewise.
>>       (vld1q_s32_x4): Likewise.
>>       (vld1_u8_x4): Likewise.
>>       (vld1q_u8_x4): Likewise.
>>       (vld1_u16_x4): Likewise.
>>       (vld1q_u16_x4): Likewise.
>>       (vld1_u32_x4): Likewise.
>>       (vld1q_u32_x4): Likewise.
>>       (vld1_f16_x4): Likewise.
>>       (vld1q_f16_x4): Likewise.
>>       (vld1_f32_x4): Likewise.
>>       (vld1q_f32_x4): Likewise.
>>       (vld1_p8_x4): Likewise.
>>       (vld1q_p8_x4): Likewise.
>>       (vld1_p16_x4): Likewise.
>>       (vld1q_p16_x4): Likewise.
>>       (vld1_s64_x4): Likewise.
>>       (vld1_u64_x4): Likewise.
>>       (vld1_p64_x4): Likewise.
>>       (vld1q_s64_x4): Likewise.
>>       (vld1q_u64_x4): Likewise.
>>       (vld1q_p64_x4): Likewise.
>>       (vld1_f64_x4): Likewise.
>>       (vld1q_f64_x4): Likewise.
>>       (vst1_s8_x4): Likewise.
>>       (vst1q_s8_x4): Likewise.
>>       (vst1_s16_x4): Likewise.
>>       (vst1q_s16_x4): Likewise.
>>       (vst1_s32_x4): Likewise.
>>       (vst1q_s32_x4): Likewise.
>>       (vst1_u8_x4): Likewise.
>>       (vst1q_u8_x4): Likewise.
>>       (vst1_u16_x4): Likewise.
>>       (vst1q_u16_x4): Likewise.
>>       (vst1_u32_x4): Likewise.
>>       (vst1q_u32_x4): Likewise.
>>       (vst1_f16_x4): Likewise.
>>       (vst1q_f16_x4): Likewise.
>>       (vst1_f32_x4): Likewise.
>>       (vst1q_f32_x4): Likewise.
>>       (vst1_p8_x4): Likewise.
>>       (vst1q_p8_x4): Likewise.
>>       (vst1_p16_x4): Likewise.
>>       (vst1q_p16_x4): Likewise.
>>       (vst1_s64_x4): Likewise.
>>       (vst1_u64_x4): Likewise.
>>       (vst1_p64_x4): Likewise.
>>       (vst1q_s64_x4): Likewise.
>>       (vst1q_u64_x4): Likewise.
>>       (vst1q_p64_x4): Likewise.
>>       (vst1_f64_x4): Likewise.
>>       (vst1q_f64_x4): Likewise.
>>
>> gcc/testsuite/ChangeLog:
>>
>> 2019-06-10  Sylvia Taylor <sylvia.taylor@arm.com>
>>
>>       * gcc.target/aarch64/advsimd-intrinsics/vld1x4.c: New test.
>>       * gcc.target/aarch64/advsimd-intrinsics/vst1x4.c: New test.
>



Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]