This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.
Re: What is acceptable for -ffast-math? A numerical viewpoint
- To: gcc at gcc dot gnu dot org
- Subject: Re: What is acceptable for -ffast-math? A numerical viewpoint
- From: Jonathan Thornburg <jthorn at galileo dot thp dot univie dot ac dot at>
- Date: Fri, 3 Aug 2001 18:16:56 +0200
- Cc: Jonathan Thornburg <jthorn at thp dot univie dot ac dot at>
- References: <url:http://gcc.gnu.org/ml/gcc/2001-08/msg00009.html> <url:http://gcc.gnu.org/ml/gcc/2001-07/msg02141.html>
In message <url:http://gcc.gnu.org/ml/gcc/2001-08/msg00009.html>,
Wolfgang Bangerth <wolfgang dot bangerth at iwr dot uni-heidelberg dot de>
wrote:
> Due to repeated demand: here is the opinion of a numerical analyst. To
> further define my position: there is some kind of schism in the numerics
> society, the "traditionalists" doing Fortran with highly tuned codes and
> relatively simple data structures, and the "modernists" using extremely
> large C++ programs on complex data structures. I belong to the latter
> group and can only speak for them.
> My opinion is that -ffast-math could well include slight deviations from
> IEEE modes for denormals, rounding modes, associative redistribution, etc.
> So, concluding: if you have programs that run for several days, you'd be
> happy if you could cut that time by some hours using optimizations that
> - still do what common sense would dictate (IEEE is not pure common sense,
> but rather a definition), i.e. swapping operands would be allowed, but
> replacing NaNs by zero would not
> - can be switched off in case I know I have to treat boundary cases
> I myself would certainly appreciate a compiler turning a*c+b*c into
> (a+b)*c, I'd definitely not be shocked about that.
I normally refrain from posting "me too"s, but in this case
Wolfgang has expressed my thoughts so precisely that I'd like to
chime in. I too do number-crunching for a living, with large C++
programs running for hours to weeks on a mix of workstations and
supercomputers, using a mix of simple and complex data structures.
I normally enable -ffast-math in my programs. I would be delighted
to see gcc -ffast-math turn a*c + b*c into (a+b)*c if this ran
faster. I have no qualms about underflows being flushed to zero,
and I suspect I could live fairly well with even old Cray arithmetic.
I would have major problems with 2.0/3.0 evaluating to 0.5, though. :)
It's worth pointing out that a lot of my sort of computing involves
machine-generated C/C++ code (usually generated by a symbolic algebra
system like Maple, Macsyma, or Mathematica). Often various limitations
of the symbolic code mean that the generated C code is seriously ugly,
with even "obvious inefficiencies" like dividing by 5.0 instead of
multiplying by 0.2, etc. So if I grant license to do this by saying
-ffast-math, I would like very much for the compiler to (eg) convert
(...)/5.0 into 0.2*(...). My philosophy is that if I care about
exact bit-for-bit rounding, then I shouldn't use -ffast-math.
Having optimization change program results is a trickier case.
I guess I'd like this to be user-controllable. I'd certainly like
the _option_ of cranking maximum gigaflops for my black hole simulations
(with -super-duper-optimize maybe giving slightly different results
from -g), but I'd also like the option of retaining identical results
(presumably at some performance penalty) for debugging.
Another "interesting" property of my codes is that they often include
particular subsections where I _do_ need either bit-for-bit rounding,
or at least something fairly close to it. For example, sometimes I
fake quad precision with Keith Briggs' C++ doubledouble class.
And I often use Brent's ZEROIN routine, which (eg) infinite-loops on
x86 if compiled with gcc unless -ffloat-store is specified.
Presently I handle this via Makefile hackery to remove -ffast-math
and (on x86) add -ffloat-store as needed (at a per-compilation-unit
granularity). But it would sometimes be useful if I could get this
on a function or even block granularity, e.g. with #pragma or some
moral equivalent (__careful-math or whatever). That way only the
minimal necessary section of code would have to pay the performance
penalties of "being careful".
If -ffast-math is *not* enabled, then I think gcc needs to be a lot
more careful with this sort of thing. I view the absence of -ffast-math
as saying that the user _does_ care about getting every last IEEE bit
right, at least modulo x86 -ffloat-store weirdness. Perhaps this is
too conservative, and we need to generalize this into a "math mode"
flag which can be "fast", "exact", or "normal", the latter being
a compromise position (suitable for a default setting) which would
hopefully give most of the performance of "fast" while (eg) not
using the associative law.
-- Jonathan Thornburg <firstname.lastname@example.org>
Max-Planck-Institut fuer Gravitationsphysik (Albert-Einstein-Institut),
Golm, Germany http://www.aei.mpg.de/~jthorn/home.html
"Space travel is utter bilge" -- common misquote of UK Astronomer Royal
Richard Woolley's remarks of 1956
"All this writing about space travel is utter bilge. To go to the
moon would cost as much as a major war." -- what he actually said