PATCH: Enable FTZ/DAZ for SSE via fast math

Wed Aug 10 14:19:00 GMT 2005

On Wed, Aug 10, 2005 at 07:09:04AM -0700, H. J. Lu wrote:
> On Tue, Aug 09, 2005 at 02:58:51PM -0700, Richard Henderson wrote:
> > On Tue, Aug 09, 2005 at 02:30:46PM -0700, H. J. Lu wrote:
> > > There is a minor problem. How can I add crtfastmath.o for SSE targets
> > > only? 
> > 
> > You don't.  You either add code to detect sse, or you make the
> > spec depend on -mfpmath=sse.
> > 
> 
> Here is the patch to enable FTZ/DAZ for SSE via fast math. There are
> no regressions on Linux/x86_64 or Linux/ia32. The performance of one
> FP benchmark on EM64T is more than doubled with -ffast-math.

Not all i?86 CPUs support the cpuid instruction.
Please look at
gcc/testsuite/gcc.dg/i386-cpuid.h
for the ugly details.

> +static void __attribute__((constructor))
> +set_fast_math (void)
> +{
> +  /* Check if SSE is available.  */
> +  unsigned int eax, ebx, ecx, edx;
> +  asm volatile ("xchgl %%ebx, %1; cpuid; xchgl %%ebx, %1"
> +		: "=a" (eax), "=r" (ebx), "=c" (ecx), "=d" (edx)
> +		: "0" (1));
> +
> +  if (edx & (1 << 25))
> +    {
> +      unsigned int mxcsr = __builtin_ia32_stmxcsr ();
> +      mxcsr |= MXCSR_DAZ | MXCSR_FTZ;
> +      __builtin_ia32_ldmxcsr (mxcsr);
> +    }
> +}
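
For illustration, a guarded variant might look roughly like the sketch
below. It is not the code from the patch or from i386-cpuid.h. It assumes
a compiler that ships <cpuid.h> (current GCC and Clang do), whose
__get_cpuid() returns 0 when the requested leaf is unavailable, including
on CPUs without cpuid at all. The helper names have_sse, read_mxcsr and
enable_ftz_daz are made up for this example, and the MXCSR bit values are
the architectural ones (DAZ is bit 6, FTZ is bit 15). It also assumes the
CPU accepts the DAZ bit; very early SSE parts may not, and a complete
version would check MXCSR_MASK first.

```c
#if defined(__i386__) || defined(__x86_64__)
#include <cpuid.h>

#define MXCSR_DAZ (1u << 6)   /* denormals-are-zero */
#define MXCSR_FTZ (1u << 15)  /* flush-to-zero */

/* Return 1 if cpuid is usable and leaf 1 reports SSE, else 0.  */
static int
have_sse (void)
{
  unsigned int eax, ebx, ecx, edx;

  /* __get_cpuid returns 0 if leaf 1 is not supported.  */
  if (!__get_cpuid (1, &eax, &ebx, &ecx, &edx))
    return 0;
  return (edx & bit_SSE) != 0;
}

/* Read MXCSR.  Only call this after have_sse () returned 1.  */
static unsigned int
read_mxcsr (void)
{
  unsigned int mxcsr;
  __asm__ volatile ("stmxcsr %0" : "=m" (mxcsr));
  return mxcsr;
}

/* Set FTZ and DAZ if SSE is present.  Return 1 if the bits were set.  */
static int
enable_ftz_daz (void)
{
  unsigned int mxcsr;

  if (!have_sse ())
    return 0;
  mxcsr = read_mxcsr () | MXCSR_DAZ | MXCSR_FTZ;
  __asm__ volatile ("ldmxcsr %0" : : "m" (mxcsr));
  return 1;
}
#else
/* Not an x86 target: nothing to detect, nothing to set.  */
#define MXCSR_DAZ 0u
#define MXCSR_FTZ 0u
static int have_sse (void) { return 0; }
static unsigned int read_mxcsr (void) { return 0; }
static int enable_ftz_daz (void) { return 0; }
#endif
```

Inline asm is used for stmxcsr/ldmxcsr here only so that the file builds
without -msse on ia32. In a startup file, enable_ftz_daz () could be
called from a constructor in the same way as set_fast_math above.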

	Jakub