[patch][aarch64]: add intrinsics for vld1(q)_x4 and vst1(q)_x4

James Greenhalgh james.greenhalgh@arm.com
Mon Aug 19 16:44:00 GMT 2019


On Thu, Aug 15, 2019 at 12:28:27PM +0100, Kyrill Tkachov wrote:
> Hi all,
> 
> On 8/6/19 10:51 AM, Richard Earnshaw (lists) wrote:
> On 18/07/2019 18:18, James Greenhalgh wrote:
> > On Mon, Jun 10, 2019 at 06:21:05PM +0100, Sylvia Taylor wrote:
> >> Greetings,
> >>
> >> This patch adds the intrinsic functions for:
> >> - vld1_<mode>_x4
> >> - vst1_<mode>_x4
> >> - vld1q_<mode>_x4
> >> - vst1q_<mode>_x4
> >>
> >> Bootstrapped and tested on aarch64-none-linux-gnu.
> >>
> >> Ok for trunk? If yes, I don't have any commit rights, so can someone
> >> please commit it on my behalf.
> >
> > Hi,
> >
> > I'm concerned by this strategy for implementing the arm_neon.h builtins:
> >
> >> +__extension__ extern __inline int8x8x4_t
> >> +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> >> +vld1_s8_x4 (const int8_t *__a)
> >> +{
> >> +  union { int8x8x4_t __i; __builtin_aarch64_simd_xi __o; } __au;
> >> +  __au.__o
> >> +    = __builtin_aarch64_ld1x4v8qi ((const __builtin_aarch64_simd_qi *) __a);
> >> +  return __au.__i;
> >> +}
> >
> > As far as I know this is undefined behaviour in C++11. This was the best
> > resource I could find pointing to the relevant standards paragraphs.
> >
> >    https://stackoverflow.com/questions/11373203/accessing-inactive-union-member-and-undefined-behavior
> >
> > That said, GCC explicitly allows it, so maybe this is fine?
> >
> >    https://gcc.gnu.org/onlinedocs/gcc-9.1.0/gcc/Optimize-Options.html#Type-punning
> >
> > Can anyone from the languages side chime in on whether we're exposing
> > undefined behaviour (in either C or C++) here?
> 
> Yes, this is a GNU extension.  My only question is whether or not this
> can be disabled within GCC if you're trying to check for strict
> standards conformance of your code?  And if so, is there a way of making
> sure that this header still works in that case?  A number of GNU
> extensions can be protected with __extension__ but it's not clear how
> that could be applied in this case.  Perhaps the outer __extension__ on
> the function will already do that.
> 
> 
> It should still work. The only relevant flag is -fstrict-aliasing and it is
> documented to preserve this case:
> 
> https://gcc.gnu.org/onlinedocs/gcc-9.2.0/gcc/Optimize-Options.html#Optimize-Options
> 
> Note that we've already been using this idiom in arm_neon.h since 2014 [1]
> and it's worked fine.

Based on that input, this is OK for trunk.

Thanks,
James

> 
> Thanks,
> 
> Kyrill
> 
> [1] http://gcc.gnu.org/r209880
> 
> 
> 
> R.



More information about the Gcc-patches mailing list