This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


[PATCH] -mfpmath=sse should disable x87 intrinsics


The following simple patch addresses a performance regression I
observed when upgrading an x86_64/amd64 box from gcc 3.3.3 to gcc 3.4.3.
It turns out that x86_64 uses FPMATH_SSE by default, and the
regression was inadvertently caused by GCC's improved support for
inline FP intrinsics on x87.

The slowdown is caused by the shuffling of floating point values
between SSE registers and the x87 registers required by these
intrinsics.  Additionally, glibc for x86_64 contains fast and
often more accurate implementations of these math routines in
its libm.

The obvious, but perhaps controversial, patch below is to only allow
x87 intrinsics when fpmath is either "387", "387,sse" or "sse,387"
and disable them when fpmath is "sse".
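
To make the rule above concrete, here is a minimal standalone sketch in C
of the decision the patch makes.  The enum values, the mask value and the
two helper functions are assumptions made up for illustration; only the
names FPMATH_387, FPMATH_SSE and MASK_NO_FANCY_MATH_387 come from the
patch, and GCC's real definitions live in its i386 backend and may differ.

```c
#include <stdbool.h>

/* Illustrative stand-ins for GCC's internal definitions.  The numeric
   values here are assumptions, not copied from the GCC sources.  */
enum fpmath_unit
{
  FPMATH_387 = 1,
  FPMATH_SSE = 2
};

#define MASK_NO_FANCY_MATH_387 0x1

/* Hypothetical helper mirroring the patched check: if the selected
   fpmath units do not include the 387, set the "no fancy math" bit.  */
static int
apply_fpmath_policy (int fpmath, int target_flags)
{
  if (! (fpmath & FPMATH_387))
    target_flags |= MASK_NO_FANCY_MATH_387;
  return target_flags;
}

/* Hypothetical helper: true if x87 intrinsics remain enabled for the
   given fpmath setting, starting from otherwise empty target flags.  */
static bool
x87_intrinsics_allowed (int fpmath)
{
  return ! (apply_fpmath_policy (fpmath, 0) & MASK_NO_FANCY_MATH_387);
}
```

With these definitions, x87_intrinsics_allowed returns true for
FPMATH_387 and for FPMATH_387 | FPMATH_SSE (the "387", "387,sse" and
"sse,387" cases) and false for FPMATH_SSE alone.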

On an Athlon64 3500+ running SuSE Linux v9.1, this patch produces
a consistent 4.5% performance improvement in whetstone (before:
1980.084MFLOPS, 1979.810MFLOPS, after: 2069.737MFLOPS, 2069.148MFLOPS),
when compiled with "-O2 -ffast-math".  Although this change makes
"sense", I'm also cautious of microbenchmarks, so if anyone could
benchmark this change on SPEC (or elsewhere) with "-ffast-math
-mfpmath=sse" that would be great.  Clearly, the decision to use
"-mfpmath=sse,387" vs. "-mfpmath=sse" will be influenced by this tweak.


The following patch has been tested on both i686-pc-linux-gnu and
x86_64-unknown-linux-gnu with a full "make bootstrap", all default
languages, and regression tested with a top-level "make -k check"
with no new failures.

Ok for mainline?



2004-11-23  Roger Sayle  <roger@eyesopen.com>

	* config/i386/i386.c (override_options): Disable x87 fancy math
	intrinsics if -mfpmath= doesn't include 387 (default on x86_64).


Index: config/i386/i386.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/i386/i386.c,v
retrieving revision 1.742
diff -c -3 -p -r1.742 i386.c
*** config/i386/i386.c	23 Nov 2004 01:22:58 -0000	1.742
--- config/i386/i386.c	23 Nov 2004 04:42:33 -0000
*************** override_options (void)
*** 1548,1553 ****
--- 1548,1557 ----
  	error ("bad value (%s) for -mfpmath= switch", ix86_fpmath_string);
      }

+   /* If fpmath doesn't include 387, disable use of x87 intrinsics.  */
+   if (! (ix86_fpmath & FPMATH_387))
+     target_flags |= MASK_NO_FANCY_MATH_387;
+
    /* It makes no sense to ask for just SSE builtins, so MMX is also turned
       on by -msse.  */
    if (TARGET_SSE)

Roger
--
Roger Sayle,                         E-mail: roger@eyesopen.com
OpenEye Scientific Software,         WWW: http://www.eyesopen.com/
Suite 1107, 3600 Cerrillos Road,     Tel: (+1) 505-473-7385
Santa Fe, New Mexico, 87507.         Fax: (+1) 505-473-0833

