This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.


Re: FWD: FLOATING-POINT CONSISTENCY, -FFLOAT-STORE, AND X86

To: Joe Buck <jbuck at Synopsys dot COM>
Subject: Re: FWD: FLOATING-POINT CONSISTENCY, -FFLOAT-STORE, AND X86
From: Jeffrey A Law <law at hurl dot cygnus dot com>
Date: Tue, 15 Dec 1998 11:12:01 -0700
cc: bosch at gnat dot com, hjstein at bfr dot co dot il, moshier at mediaone dot net, egcs at cygnus dot com, tprince at cat dot e-mail dot com
Reply-To: law at cygnus dot com


  In message <199812151728.JAA00746@yamato.synopsys.com> you write:
  > 
  > > Many useful fpt algorithms rely on ordering of operations to be honored, 
  > > and a compiler evaluating  B + (A - B) as (B + A) - B or even as A 
  > > is seriously broken for numerical stuff.
  > 
  > gcc is not broken in that way: parentheses prevent reordering for FP
  > operations.
Actually, having recently looked into reassociation optimizations I'll chime
in with a minor clarification.


GCC does not record parentheses anywhere in its tree or RTL structures.  We
avoid reordering across parentheses by simply never performing these
transformations on floating-point values.
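
To make the concern concrete, here is a minimal C sketch of the two
evaluation orders from the quoted text.  The function names are made up for
illustration, and the volatile temporaries are only there to force each
intermediate to be rounded to double.  It assumes IEEE 754 double precision
with round-to-nearest-even.

```c
/* B + (A - B), evaluated in the order the parentheses require. */
double as_written(double a, double b)
{
    volatile double diff = a - b;   /* rounded to double here */
    volatile double sum = b + diff;
    return sum;
}

/* (B + A) - B, the reordered form a careless compiler might produce. */
double reordered(double a, double b)
{
    volatile double sum = b + a;    /* may lose A's low-order bits */
    volatile double diff = sum - b;
    return diff;
}
```

With a = 1.0 and b = 9007199254740992.0 (2^53), as_written returns 1.0 while
reordered returns 0.0, because 2^53 + 1 is not representable in a double and
rounds back to 2^53.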

At some point we'll need to be able to perform finer-grained tests, both
to help with the floating-point issues and to deal with overflow issues in
languages like Ada.

For FP, we would like the ability to reassociate some expressions.  Take
(a * b * c * d) * e

Right now we'll generate

t1 = a * b;
t2 = t1 * c;
t3 = t2 * d;
t4 = t3 * e;

Note the dependency of each insn on the previous insn.  This can be a major
performance penalty -- especially on targets which have dual FP units or where
an fpmul isn't incredibly fast (data dependency stalls at each step).

t1 = a * b;
t2 = c * d;
t3 = t1 * t2;
t4 = t3 * e;


This is a much better (and safe, as far as I know) sequence.  The first two
insns are totally independent, which at a minimum removes one of the three
stalls due to data dependency.  For a target with a pipelined FPU or
dual FPUs, the second sequence will be significantly faster.
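
For anyone who wants to experiment, the two sequences can be written as plain
C functions along these lines.  The names product_chain and product_paired
are hypothetical, and this is only a sketch of the two orderings, not what
GCC emits.

```c
/* Serial chain: every multiply waits for the result of the previous one. */
double product_chain(double a, double b, double c, double d, double e)
{
    double t1 = a * b;
    double t2 = t1 * c;
    double t3 = t2 * d;
    double t4 = t3 * e;
    return t4;
}

/* Reassociated: t1 and t2 do not depend on each other. */
double product_paired(double a, double b, double c, double d, double e)
{
    double t1 = a * b;
    double t2 = c * d;
    double t3 = t1 * t2;
    double t4 = t3 * e;
    return t4;
}
```

For inputs whose intermediate products are exactly representable (small
integers, say) both functions return the same value.  The sketch does not
check whether the two orderings round identically for arbitrary inputs.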


For integer, we need to know where the parens are to preserve integer overflow
semantics in languages like Ada for similar transformations.
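
The integer case can be sketched in C with an explicit overflow check
standing in for a language-level check.  The helper checked_add and the two
sum functions are hypothetical names, and this only models the idea of a
checked addition; it is not how any Ada front end implements it.

```c
#include <limits.h>
#include <stdbool.h>

/* Returns false instead of overflowing; writes the sum to *out on success. */
bool checked_add(int x, int y, int *out)
{
    if ((y > 0 && x > INT_MAX - y) || (y < 0 && x < INT_MIN - y))
        return false;
    *out = x + y;
    return true;
}

/* (a + b) + c, in the order the source wrote it. */
bool sum_as_written(int a, int b, int c, int *out)
{
    int t;
    return checked_add(a, b, &t) && checked_add(t, c, out);
}

/* a + (b + c), the reassociated form. */
bool sum_reassociated(int a, int b, int c, int *out)
{
    int t;
    return checked_add(b, c, &t) && checked_add(a, t, out);
}
```

With a = INT_MAX, b = 1, c = -1 the first form reports an overflow and the
second does not, so reassociating changes whether the check fires.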


jeff

Follow-Ups:
- Re: FWD: FLOATING-POINT CONSISTENCY, -FFLOAT-STORE, AND X86
  - From: Sylvain Pion

References:
- Re: FWD: FLOATING-POINT CONSISTENCY, -FFLOAT-STORE, AND X86
  - From: Joe Buck
