This is the mail archive of the
mailing list for the GCC project.
Re: x86 FP optimizer behaviour
- From: Gabriel Paubert <paubert at iram dot es>
- To: James Macnicol <jamesm at ee dot adfa dot edu dot au>
- Cc: <gcc at gcc dot gnu dot org>
- Date: Thu, 22 Nov 2001 13:50:44 +0100 (CET)
- Subject: Re: x86 FP optimizer behaviour
On Thu, 22 Nov 2001, James Macnicol wrote:
> I have spent today beating my head against a brick wall with
> the following code (made somewhat more concise than what I started
> with). On the surface it looks like an optimizer bug: the value
> printed out is different (and wrong) if you compile with -O than
> without. Notice in the second case however, that the problem goes
> away if you move the printf statement to be before the comparison that
> fails. So I'm not sure what the actual problem is given the presence
> of the printf shouldn't affect anything at all (surely the optimizer
> can't do anything with it) ....
Welcome to the wonderful world of numerical analysis with randomly varying
precision depending on compiler options, version or vendor, the phase of
the moon, and other well-controlled variables :-)
In your case -ffloat-store might help (or using SSE2 on a Pentium IV, but
I'm not sure that it's fully implemented), but not always. The problem is
that, without the printf statement, the compiler keeps extra precision
in the internal registers (64-bit mantissa versus 53), and you just hit
a case where rounding makes a difference when the value is stored to
memory across the function call.
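
A minimal sketch of the effect, not your actual code: it uses an explicit
long double to stand in for a value that stays in an x87 register, and a
store to a double to stand in for the spill to memory. The function names
are made up for illustration, and the two results only differ on targets
where long double is wider than double (true for x87, not everywhere).

```c
#include <float.h>

/* Value kept "in the register": the sum is held in long double, which on
   x87 has a 64-bit mantissa, so the tiny addend survives. */
int greater_in_wide_register(void)
{
    volatile long double one = 1.0L;
    volatile long double tiny = LDBL_EPSILON; /* smallest step above 1.0L */
    long double sum = one + tiny;
    return sum > one;
}

/* Value "stored to memory": the same sum is rounded to a 53-bit double
   first. If long double is wider than double, the addend is rounded away
   and the comparison changes. */
int greater_after_store_to_double(void)
{
    volatile long double one = 1.0L;
    volatile long double tiny = LDBL_EPSILON;
    volatile double stored = (double)(one + tiny);
    return stored > 1.0;
}
```

With a 64-bit extended mantissa the first function should report "greater"
and the second should not, which is the same kind of flip you see when the
printf moves.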
If -ffloat-store does not help, it is also possible to set the x87 FPU to
53-bit precision mode. Unfortunately none of the portable glibc functions
allows you to do it, so you have to modify the control word yourself (or at
least the control word field in the structure used by fesetenv/fegetenv).
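
A sketch of modifying the control word yourself, assuming glibc on x86:
it relies on the non-portable <fpu_control.h> macros (_FPU_GETCW,
_FPU_SETCW, _FPU_EXTENDED, _FPU_DOUBLE), and the two function names are
made up for illustration. On any other platform both functions just
return -1.

```c
#include <stdlib.h> /* any libc header; makes __GLIBC__ visible on glibc */

#if defined(__GLIBC__) && (defined(__i386__) || defined(__x86_64__))
#include <fpu_control.h>
#define HAVE_X87_CW 1
#else
#define HAVE_X87_CW 0
#endif

/* Switch the x87 precision-control field to 53-bit (double) mantissa.
   Returns 0 on success, -1 if this platform is not supported here. */
int set_x87_double_precision(void)
{
#if HAVE_X87_CW
    fpu_control_t cw;
    _FPU_GETCW(cw);
    /* Clear both precision-control bits, then select double precision. */
    cw = (fpu_control_t)((cw & ~_FPU_EXTENDED) | _FPU_DOUBLE);
    _FPU_SETCW(cw);
    return 0;
#else
    return -1;
#endif
}

/* Read back the precision-control bits, or -1 if unsupported. */
int x87_precision_control(void)
{
#if HAVE_X87_CW
    fpu_control_t cw;
    _FPU_GETCW(cw);
    return (int)(cw & _FPU_EXTENDED);
#else
    return -1;
#endif
}
```

Call set_x87_double_precision() once at the start of main(), before any
floating-point work. Note that this only narrows the mantissa; the
exponent range of the x87 registers stays extended.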
At least this gives you results which have the same precision across
architectures using IEEE floating-point (if you tell the compiler to avoid
fused multiply-add). But the results are still not always the same because
the valid exponent range differs (other machines might get
underflows/overflows/denormals, treat denormals as zeros for speed, etc).
However, I'm not even sure that these are the right answers. Floating
point is inherently inexact, and having code which depends on precision
down to the last few least significant bits is bad.
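
One common way to avoid depending on the last bits is to compare with a
relative tolerance instead of testing for exact equality or a strict
ordering. This is only a sketch: the helper names and the choice of
tolerance are assumptions, and the right tolerance depends on your
algorithm.

```c
/* Absolute value without pulling in libm. */
static double abs_double(double x)
{
    return x < 0.0 ? -x : x;
}

/* Returns 1 if a and b agree to within rel_tol relative to the larger
   magnitude of the two, 0 otherwise. */
int nearly_equal(double a, double b, double rel_tol)
{
    double abs_a = abs_double(a);
    double abs_b = abs_double(b);
    double scale = abs_a > abs_b ? abs_a : abs_b;
    return abs_double(a - b) <= rel_tol * scale;
}
```

For example, nearly_equal(x, y, 1e-12) treats x and y as equal when they
differ only in the last few bits of a double.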
In short, GCC is right and the variations you see are expected.