This is the mail archive of the
gcc@gcc.gnu.org
mailing list for the GCC project.
Re: Floating point optimization?
- To: slouken at devolution dot com
- Subject: Re: Floating point optimization?
- From: craig at jcb-sc dot com
- Date: 30 Apr 1999 14:17:49 -0000
- Cc: egcs at egcs dot cygnus dot com, highlander at lokigames dot com, smariotti at activision dot com, rmyers at activision dot com
- Cc: craig at jcb-sc dot com
- References: <E10d9Hu-0001zp-00@roboto.devolution.com>
>Why are they different?
Short answer: because you're using floating-point arithmetic.
Longer answer: see the node "Floating-point Errors" in the g77 documentation.
Use the most up-to-date version at:
<http://egcs.cygnus.com/onlinedocs/g77_toc.html>
Also see the description of the `-ffloat-store' option in the gcc
documentation, *and* in the g77 documentation, which explains further
what is going on.
Once you've read all that, and all the docs to which it points,
you'll have a better understanding of what's happening.
>An analysis of the assembly output shows that the -O0 version has
> fstps -4(%ebp)
> flds -4(%ebp)
>between the floating point adds and the floating point compare.
>I don't know assembly well enough to know what these are doing.
They're spilling a value computed to 80 bits of precision to a
temporary that holds only 32 bits (the fstps), then loading it
back (the flds), thus losing precision. In this case, that extra
precision, when *not* lost (as in the optimized code), causes the
program to behave differently than you expect. Sometimes it's
the other way around, depending on the expectations of the programmer.
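
To make the spill concrete, here is a rough C sketch of the same
effect. It is only an illustration, not the original poster's code:
the helper names are made up, `long double' stands in for the 80-bit
x87 register (an assumption that holds on typical x86 gcc targets),
and the `volatile' temporary plays the part of the 32-bit stack slot.

```c
#include <stdio.h>

/* Hypothetical helper: mimic the fstps/flds pair by forcing the value
   through a 32-bit temporary. "volatile" keeps the compiler from
   optimizing the store away. */
static float spill_to_float(long double wide)
{
    volatile float tmp = (float)wide;
    return tmp;
}

/* Returns 1 if a + b, computed in long double, changes when it is
   spilled to a float temporary, and 0 if it survives intact. */
static int sum_changes_when_spilled(float a, float b)
{
    long double wide = (long double)a + (long double)b;
    return (long double)spill_to_float(wide) != wide;
}

/* Print one case that loses bits and one that does not. */
static void show_spill_effect(void)
{
    printf("0.1f + 0.2f changes:  %d\n",
           sum_changes_when_spilled(0.1f, 0.2f));
    printf("0.5f + 0.25f changes: %d\n",
           sum_changes_when_spilled(0.5f, 0.25f));
}
```

Calling show_spill_effect() from main() should report a change for
0.1f + 0.2f, whose exact sum needs more mantissa bits than a float
has, and no change for 0.5f + 0.25f, which fits exactly. A compare
done before the spill and one done after it can therefore disagree.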
>The result, however, is that non-optimized code results in a different
>branch of the logic being taken from the optimized code.
That'll always be possible in languages like C and Fortran, unless
you're using a compiler that guarantees Java-like predictability
(which would mean it'd produce slower code than most C and Fortran
compilers).
But we already know that gcc (egcs) compounds the problem by not
using the floating-point stack on the x86 as it was meant to be used.
It does that mainly because the resulting code runs much faster.
So programmers who care about consistency of intermediate results
to more than 32 or 64 bits of precision have to write their code
differently or (for now) use a different compiler. gcc doesn't
offer any option that solves this problem one way (everything
computed in 32/64 bits, as declared) or the other (every value
computed in 80 bits guaranteed to be preserved that way, even when
spilled from the FP stack to memory).
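
As one possible reading of "write their code differently", a
programmer who wants the declared 32-bit precision can force each
intermediate result through a `volatile' variable of that type before
comparing it. The sketch below is an assumption about how that might
look, not a documented gcc recipe, and the function name is
hypothetical. It costs a store and a load per intermediate, which is
roughly what `-ffloat-store' asks the compiler to do for you.

```c
#include <stdio.h>

/* Hypothetical helper: compare a + b against an expected value after
   forcing the sum to be rounded to float. The volatile store makes
   the rounding happen at every optimization level, so the compare
   sees the same 32-bit value with or without -O. */
static int sum_equals_as_float(float a, float b, float expected)
{
    volatile float sum = a + b;   /* rounded to 32 bits here */
    return sum == expected;
}

/* Print the result of the compare for one sample input. */
static void show_consistent_compare(void)
{
    printf("0.1f + 0.2f == 0.3f as float: %d\n",
           sum_equals_as_float(0.1f, 0.2f, 0.3f));
}
```

This only pins down the one value that is stored. Any expression
that is still evaluated wholly in registers can carry extra bits, so
every intermediate that feeds a compare needs the same treatment.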
(I believe this problem affects the m68k target as well.)
tq vm, (burley)