This is the mail archive of the
gcc@gcc.gnu.org
mailing list for the GCC project.
Re: FWD: FLOATING-POINT CONSISTENCY, -FFLOAT-STORE, AND X86
- To: Joe Buck <jbuck at Synopsys dot COM>
- Subject: Re: FWD: FLOATING-POINT CONSISTENCY, -FFLOAT-STORE, AND X86
- From: Jeffrey A Law <law at hurl dot cygnus dot com>
- Date: Tue, 15 Dec 1998 11:12:01 -0700
- cc: bosch at gnat dot com, hjstein at bfr dot co dot il, moshier at mediaone dot net, egcs at cygnus dot com, tprince at cat dot e-mail dot com
- Reply-To: law at cygnus dot com
In message <199812151728.JAA00746@yamato.synopsys.com> you write:
>
> > Many useful fpt algorithms rely on ordering of operations to be honored,
> > and a compiler evaluating B + (A - B) as (B + A) - B or even as A
> > is seriously broken for numerical stuff.
>
> gcc is not broken in that way: parentheses prevent reordering for FP
> operations.
Actually, having recently looked into reassociation optimizations I'll chime
in with a minor clarification.
GCC does not record parens anywhere in its tree or RTL structures. We prevent
these transformations across parens by simply never performing them on
floating point values.
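
To make the quoted concern concrete, here is a minimal C sketch (not GCC
code) comparing B + (A - B) evaluated as written against the algebraic
simplification to A. The function names are hypothetical, and the volatile
temporary is an assumption added only so the intermediate is rounded to
double even on targets that keep temporaries in wider registers.

```c
#include <stdio.h>

/* Evaluate b + (a - b) in the order written.  The difference is stored
   through a volatile double so it is rounded to double precision before
   the addition (an assumption made to keep the demo reproducible). */
double as_written(double a, double b)
{
    volatile double diff = a - b;
    return b + diff;
}

/* What an algebraic simplification of b + (a - b) would compute. */
double simplified(double a, double b)
{
    (void)b;   /* b cancels algebraically, so it is unused here */
    return a;
}

/* Print both results for one pair of inputs. */
void show_difference(double a, double b)
{
    printf("as written: %g, simplified: %g\n",
           as_written(a, b), simplified(a, b));
}
```

With IEEE double arithmetic, calling show_difference(1.0, 1e17) should print
different values for the two forms, because 1.0 is smaller than half the
spacing between doubles near 1e17 and is lost when a - b is rounded.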
At some point we'll need to be able to perform more fine grained tests, both
to help the floating point issues and to deal with overflow issues in languages
like Ada.
For FP, we would like the ability to reassociate some expressions. Take

    (a * b * c * d) * e

Right now we'll generate

    t1 = a * b;
    t2 = t1 * c;
    t3 = t2 * d;
    t4 = t3 * e;

Note the dependency of each insn on the previous insn. This can be a major
performance penalty -- especially on targets which have dual FP units or where
an fpmul isn't especially fast (a data dependency stall at each step).

    t1 = a * b;
    t2 = c * d;
    t3 = t1 * t2;
    t4 = t3 * e;

is a much better (and safe as far as I know) sequence. The first two insns
are totally independent, which at a minimum removes one of the three stalls
caused by data dependencies. For a target with a pipelined FPU or
dual FPUs the second sequence will be significantly faster.
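
For anyone who wants to experiment, the two sequences can be written as
ordinary C functions. This is only a sketch with hypothetical function names.
It relies on the compiler not reassociating FP multiplies on its own (as
described above for GCC), and it makes no claim about whether the two
orderings round identically for arbitrary inputs.

```c
/* Serial chain: every multiply depends on the result of the previous one. */
double product_chain(double a, double b, double c, double d, double e)
{
    double t1 = a * b;
    double t2 = t1 * c;
    double t3 = t2 * d;
    return t3 * e;
}

/* Reassociated form: t1 and t2 do not depend on each other, so a
   pipelined or dual-issue FPU can overlap them. */
double product_paired(double a, double b, double c, double d, double e)
{
    double t1 = a * b;
    double t2 = c * d;
    double t3 = t1 * t2;
    return t3 * e;
}
```

For small integer-valued inputs such as (2, 3, 4, 5, 6) every intermediate
product is exactly representable, so both functions return 720. Timing the
two in a loop on a given target is one way to see how much the dependency
chain costs there.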
For integer, we need to know where the parens are to preserve integer overflow
semantics in languages like Ada for similar transformations.
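
A rough illustration of the integer case, written in C rather than Ada: the
hypothetical helpers below model arithmetic that reports overflow instead of
wrapping, which is an assumption standing in for a language-level overflow
check. With them, (a + b) - c and a + (b - c) can differ in whether an
overflow is reported at all, which is why the paren placement matters.

```c
#include <limits.h>
#include <stdbool.h>

/* Checked add: returns false instead of overflowing. */
bool checked_add(int a, int b, int *out)
{
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b))
        return false;
    *out = a + b;
    return true;
}

/* Checked subtract: returns false instead of overflowing. */
bool checked_sub(int a, int b, int *out)
{
    if ((b > 0 && a < INT_MIN + b) || (b < 0 && a > INT_MAX + b))
        return false;
    *out = a - b;
    return true;
}

/* (a + b) - c, evaluated as parenthesized. */
bool sum_then_sub(int a, int b, int c, int *out)
{
    int t;
    return checked_add(a, b, &t) && checked_sub(t, c, out);
}

/* a + (b - c), the reassociated form. */
bool diff_then_add(int a, int b, int c, int *out)
{
    int t;
    return checked_sub(b, c, &t) && checked_add(a, t, out);
}
```

With a = INT_MAX and b = c = 1, sum_then_sub reports an overflow in the
intermediate a + b, while diff_then_add succeeds and yields INT_MAX.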
jeff