This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.
Re: FWD: FLOATING-POINT CONSISTENCY, -FFLOAT-STORE, AND X86
- To: "Harvey J. Stein" <hjstein at bfr dot co dot il>, "moshier at mediaone dot net" <moshier at mediaone dot net>
- Subject: Re: FWD: FLOATING-POINT CONSISTENCY, -FFLOAT-STORE, AND X86
- From: "Geert Bosch" <bosch at gnat dot com>
- Date: Tue, 15 Dec 1998 10:46:09 +0100
- Cc: "egcs at cygnus dot com" <egcs at cygnus dot com>, "hjstein at bfr dot co dot il" <hjstein at bfr dot co dot il>, "tprince at cat dot e-mail dot com" <tprince at cat dot e-mail dot com>
- Reply-To: "Geert Bosch" <bosch at gnat dot com>
On 14 Dec 1998 11:51:23 +0200, Harvey J. Stein wrote:
> Reasonable floating point code should expect that reordering
> operations will produce slightly different results due to round off
> error, and should be tolerant of the optimizer doing such. Especially
> given how little control the programmer has over exactly how
> computations are ordered.
Many useful fpt algorithms rely on the order of operations being honored,
and a compiler that evaluates B + (A - B) as (B + A) - B, or even as A,
is seriously broken for numerical work.
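
As a rough illustration of why the order matters, here is a minimal C sketch.
It assumes IEEE 754 double precision with round-to-nearest-even. The function
names and the input values are hypothetical, chosen only to make the effect
visible. The volatile temporaries are there so that each intermediate is
rounded to double even on targets that keep wider values in registers.

```c
#include <stdio.h>

/* Evaluates b + (a - b) in exactly the order written. */
double sum_as_written(double a, double b)
{
    volatile double diff = a - b;      /* volatile forces a store rounded to double */
    volatile double result = b + diff;
    return result;
}

/* Evaluates the reassociated form (b + a) - b. */
double sum_reassociated(double a, double b)
{
    volatile double sum = b + a;       /* a can be lost here if b is much larger */
    volatile double result = sum - b;
    return result;
}

/* Prints both results for one hypothetical input pair. */
void show_reassociation(void)
{
    double a = 1.0;
    double b = 9007199254740992.0;     /* 2^53; the next double above it is 2^53 + 2 */

    printf("b + (a - b) = %g\n", sum_as_written(a, b));
    printf("(b + a) - b = %g\n", sum_reassociated(a, b));
}
```

Under those assumptions the form as written returns a (1), while the
reassociated form returns 0, because 2^53 + 1 is not representable and
rounds back to 2^53.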
Having spills to memory retain full precision is very useful as this allows
one to prove much more about fpt code. Here is an example of what I mean,
using a decimal fpt type with 4 digits for extended precision in registers
and 3 for the in-memory precision of a variable. (Examples using binary
64-bit and 80-bit fpt types are similar but harder to read.)
Calculate S = (10.0 + 0.454) - (0.454 + 10.0), spilling one partial sum to T.

Case 1)                 Case 2)                 Case 3)
    10.0                    10.0                    10.0
     0.454 +                 0.454 +                 0.454 +
    -----                   ------                  ------
    10.5                    10.45                   10.45
T = 10.5                T = 10.45               T = 10.4

     0.454                   0.454                   0.454
    10.0  +                 10.00 +                 10.00 +
    -----                   ------                  ------
    10.5  - << 10.5         10.45 - << 10.45        10.45 - << 10.45
    -----                   -----                   -----
S =  0.00               S =  0.00               S = -0.05

Case 1 does not use extended registers and rounds at every operation.
This is completely IEEE conformant behaviour.
Case 2 uses extended registers and the same precision for the spilled value.
This is not IEEE-conformant, but guarantees consistent rounding behaviour.
In particular the relative error is never more than that of case 1.
For most algorithms this will work fine, since double rounding will only
occur on the final assignment. This is not ideal, but the worst case is
now one double rounding per statement instead of one per operation.
If assignments are forced to go to memory (using volatile variables, for
example), fpt behaviour is independent of the optimization level.
Case 3 uses extended registers, but a lower precision for the spilled value.
This is the worst case and is what is causing problems right now.
The intermediate values while evaluating the expression may be subject
to double rounding errors. People who care about right answers often
turn off optimization, but ironically this only makes the problems worse!
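
The three cases can be reproduced with a small decimal simulation. This is
only a sketch under my own assumptions: values are held as integer
thousandths, rounding is round-half-to-even (which matches 10.45 being
spilled as 10.4), and the names round_sig, spill_difference and show_cases
are made up for the illustration.

```c
#include <stdio.h>
#include <stdlib.h>

/* Values are integer thousandths: 10.0 is 10000 and 0.454 is 454. */

/* Rounds v to `digits` significant decimal digits, ties to even (assumed). */
long round_sig(long v, int digits)
{
    long sign = v < 0 ? -1 : 1;
    long mag = labs(v);
    long divisor = 1;
    int count = 0;

    for (long t = mag; t > 0; t /= 10)
        count++;                       /* number of decimal digits in mag */
    for (int i = digits; i < count; i++)
        divisor *= 10;                 /* 10^(count - digits), or 1 if nothing to drop */

    long q = mag / divisor;
    long r = mag % divisor;
    if (2 * r > divisor || (2 * r == divisor && (q & 1)))
        q++;                           /* round up, or break a tie toward even */
    return sign * q * divisor;
}

/* Computes S = (a + b) - (b + a) with the first sum spilled to T.
   reg_digits is the register precision, mem_digits the precision of T. */
long spill_difference(long a, long b, int reg_digits, int mem_digits)
{
    long t = round_sig(round_sig(a + b, reg_digits), mem_digits); /* spilled */
    long second = round_sig(b + a, reg_digits);                   /* stays in a register */
    return round_sig(t - second, reg_digits);
}

/* Prints S for the three cases of the example. */
void show_cases(void)
{
    long a = 10000, b = 454;           /* 10.0 and 0.454 */

    printf("case 1: S = %.2f\n", spill_difference(a, b, 3, 3) / 1000.0);
    printf("case 2: S = %.2f\n", spill_difference(a, b, 4, 4) / 1000.0);
    printf("case 3: S = %.2f\n", spill_difference(a, b, 4, 3) / 1000.0);
}
```

With 3/3 and 4/4 digits the two sums agree and S is 0.00. With 4 digits in
the register and 3 in memory, T is rounded twice (10.454 to 10.45 to 10.4)
while the second sum stays at 10.45, which gives S = -0.05.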
Regards,
Geert