This is the mail archive of the
fortran@gcc.gnu.org
mailing list for the GNU Fortran project.
Re: [PATCH, libgfortran]: Use __builtin_ia32_{stmxcsr,ldmxcsr} intrinsics in config/fpu-i387.h
On Wed, Sep 5, 2012 at 11:30 PM, Uros Bizjak <ubizjak@gmail.com> wrote:
>> * config/fpu-387.h (set_fpu): Use __builtin_ia32_stmxcsr and
>> __builtin_ia32_ldmxcsr intrinsics.
>>
>> Tested on x86_64-pc-linux-gnu {,-m32}, committed to mainline and 4.7 branch.
>
> I forgot that these builtins are enabled for SSE only (and x86_64
> bootstrap enables SSE2 by default), so the following addition is needed:
>
> --cut here--
> Index: config/fpu-387.h
> ===================================================================
> --- config/fpu-387.h (revision 190992)
> +++ config/fpu-387.h (working copy)
> @@ -96,7 +96,11 @@
> #define _FPU_MASK_UM 0x10
> #define _FPU_MASK_PM 0x20
>
> -void set_fpu (void)
> +void
> +#ifndef __SSE__
> +__attribute__((__target__("sse")))
> +#endif
> +set_fpu (void)
> {
> unsigned short cw;
>
> --cut here--
>
> Re-tested on x86_64-pc-linux-gnu and committed.
... Not really. The "sse" target attribute also enables cmove, which
should not be used on plain x86_32.
In the end, let's revert to assembly, with the following change that
was intended from the beginning:
Index: config/fpu-387.h
===================================================================
--- config/fpu-387.h (revision 190992)
+++ config/fpu-387.h (working copy)
@@ -112,7 +112,7 @@
if (options.fpe & GFC_FPE_UNDERFLOW) cw &= ~_FPU_MASK_UM;
if (options.fpe & GFC_FPE_INEXACT) cw &= ~_FPU_MASK_PM;
- asm volatile ("fldcw %0" : : "m" (cw));
+ asm volatile ("%vstmxcsr %0" : "=m" (cw_sse));
if (has_sse())
{
@@ -131,6 +131,6 @@
if (options.fpe & GFC_FPE_UNDERFLOW) cw_sse &= ~(_FPU_MASK_UM << 7);
if (options.fpe & GFC_FPE_INEXACT) cw_sse &= ~(_FPU_MASK_PM << 7);
- __builtin_ia32_ldmxcsr (cw_sse);
+ asm volatile ("%vldmxcsr %0" : : "m" (cw_sse));
}
}
Sorry for the trouble,
Uros.
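
For readers following the "<< 7" in the patch context: the exception mask
bits in MXCSR sit seven bit positions above the corresponding bits in the
x87 control word, so the same _FPU_MASK_* constants can be reused for both.
The sketch below only illustrates that bit arithmetic in portable C. It does
not touch the real registers, and the DEMO_* names and demo_unmask_sse() are
hypothetical, not part of libgfortran.

```c
/* x87 control-word exception mask bits.  UM and PM match the values in
   the patch context; the other four follow the standard x87 layout.  */
#define DEMO_FPU_MASK_IM 0x01
#define DEMO_FPU_MASK_DM 0x02
#define DEMO_FPU_MASK_ZM 0x04
#define DEMO_FPU_MASK_OM 0x08
#define DEMO_FPU_MASK_UM 0x10
#define DEMO_FPU_MASK_PM 0x20

/* Hypothetical stand-ins for the GFC_FPE_* option flags (assumed values,
   chosen for this example only).  */
enum
{
  DEMO_FPE_INVALID   = 1 << 0,
  DEMO_FPE_DENORMAL  = 1 << 1,
  DEMO_FPE_ZERO      = 1 << 2,
  DEMO_FPE_OVERFLOW  = 1 << 3,
  DEMO_FPE_UNDERFLOW = 1 << 4,
  DEMO_FPE_INEXACT   = 1 << 5
};

/* Given a current MXCSR value, clear the mask bit of every exception
   requested in FPE so that the exception would trap.  A cleared mask
   bit means "unmasked".  The x87 mask constants are shifted left by 7
   to reach the MXCSR mask field (bits 7..12).  */
unsigned int
demo_unmask_sse (unsigned int mxcsr, unsigned int fpe)
{
  if (fpe & DEMO_FPE_INVALID)   mxcsr &= ~(DEMO_FPU_MASK_IM << 7);
  if (fpe & DEMO_FPE_DENORMAL)  mxcsr &= ~(DEMO_FPU_MASK_DM << 7);
  if (fpe & DEMO_FPE_ZERO)      mxcsr &= ~(DEMO_FPU_MASK_ZM << 7);
  if (fpe & DEMO_FPE_OVERFLOW)  mxcsr &= ~(DEMO_FPU_MASK_OM << 7);
  if (fpe & DEMO_FPE_UNDERFLOW) mxcsr &= ~(DEMO_FPU_MASK_UM << 7);
  if (fpe & DEMO_FPE_INEXACT)   mxcsr &= ~(DEMO_FPU_MASK_PM << 7);
  return mxcsr;
}
```

Starting from the power-on MXCSR value 0x1f80 (all six exceptions masked),
requesting underflow and inexact traps clears bits 11 and 12 and yields
0x0780. In set_fpu the value would come from stmxcsr and go back via ldmxcsr,
and only when SSE is present at run time.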