This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.

Re: [PATCH, rs6000] 2/3 Add x86 SSE <xmmintrin.h> intrinsics to GCC PPC64LE taget

From: Segher Boessenkool <segher at kernel dot crashing dot org>
To: Steven Munroe <munroesj at linux dot vnet dot ibm dot com>
Cc: gcc-patches <gcc-patches at gcc dot gnu dot org>, David Edelsohn <dje dot gcc at gmail dot com>
Date: Fri, 18 Aug 2017 18:50:44 -0500
Subject: Re: [PATCH, rs6000] 2/3 Add x86 SSE <xmmintrin.h> intrinsics to GCC PPC64LE taget
Authentication-results: sourceware.org; auth=none
References: <1502915740.16102.62.camel@oc7878010663> <20170817052841.GH13471@gate.crashing.org> <1503020434.7915.65.camel@oc7878010663>

On Thu, Aug 17, 2017 at 08:40:34PM -0500, Steven Munroe wrote:
> > > +/* Convert the lower SPFP value to a 32-bit integer according to the current
> > > +   rounding mode.  */
> > > +extern __inline int __attribute__((__gnu_inline__, __always_inline__, __artificial__))
> > > +_mm_cvtss_si32 (__m128 __A)
> > > +{
> > > +  __m64 res = 0;
> > > +#ifdef _ARCH_PWR8
> > > +  __m128 vtmp;
> > > +  __asm__(
> > > +      "xxsldwi %x1,%x2,%x2,3;\n"
> > > +      "xscvspdp %x1,%x1;\n"
> > > +      "fctiw  %1,%1;\n"
> > > +      "mfvsrd  %0,%x1;\n"
> > > +      : "=r" (res),
> > > +	"=&wi" (vtmp)
> > > +      : "wa" (__A)
> > > +      : );
> > > +#endif
> > > +  return (res);
> > > +}
> > 
> > Maybe it could do something better than return the wrong answer for non-p8?
> 
> Ok this gets tricky. Before _ARCH_PWR8 the vector to scalar transfer
> would go through storage. But that is not the worst of it.

Float to int conversion goes through storage on older systems, too.

> The semantics of cvtss require rint or llrint. But __builtin_rint will
> generate a call to libm unless we assert -ffast-math.

Yeah, we should fix that some day.  If we can.

> And we don't have
> builtins to generate fctiw/fctid directly.

Yup.  Well, __builtin_rint*, but that currently calls out to libm.

> So I will add the #else using __builtin_rint if that libm dependency is
> ok (this will pop in the DG test for older machines).

Another option is to not support this intrinsic for < POWER8.

I don't have a big (or well-informed) opinion on which is best, but I
doubt always returning 0 is the best we can do ;-)
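
For illustration only, here is a rough sketch of what a libm-free fallback for the non-POWER8 path might look like. It is not what either author proposed in this thread: it swaps in the classic "add and subtract 2^23" trick, which makes the FPU round a float to an integer value in whatever rounding mode is currently active. The name cvtss_si32_sketch is hypothetical, the function takes the already-extracted low float rather than an __m128, and honoring non-default rounding modes reliably with GCC is assumed to need -frounding-math.

```c
#include <limits.h>

/* Hypothetical stand-in for a non-POWER8 path: round the low float to a
   32-bit int using the current rounding mode, without calling libm.  */
static inline int
cvtss_si32_sketch (float x)
{
  const float two23 = 8388608.0f;	/* 2^23: floats at or above this have no fraction bits.  */

  /* NaN or outside the int range: x86 cvtss2si yields 0x80000000.  */
  if (!(x >= -2147483648.0f && x < 2147483648.0f))
    return INT_MIN;

  if (x > -two23 && x < two23)
    {
      /* volatile keeps the two steps from being folded into "x".  */
      volatile float t;
      if (x >= 0.0f)
	{
	  t = x + two23;	/* fraction is rounded away here */
	  t = t - two23;
	}
      else
	{
	  t = x - two23;
	  t = t + two23;
	}
      return (int) t;
    }

  /* Already an integer value that fits in int.  */
  return (int) x;
}
```

In the default round-to-nearest-even mode, ties go to the even neighbour, matching what fctiw and cvtss2si do in that mode. Whether something like this beats simply calling __builtin_rint and accepting the libm dependency is a separate question.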


Segher
