[Bug target/27827] [4.0/4.1 Regression] gcc 4 produces worse x87 code on all platforms than gcc 3
paolo dot bonzini at lu dot unisi dot ch
Thu Aug 10 14:29:00 GMT 2006
------- Comment #61 from paolo dot bonzini at lu dot unisi dot ch 2006-08-10 14:28 -------
> Making vectorization depend on a flag that says it is allowed to violate IEEE
> is therefore a killer for me (and most knowledgeable fp guys). This is ironic,
> since vectorization of sums (as in GEMM) is usually implemented as scalar
> expansion on the accumulators
GCC performs the transformation that Dorit explained. The result may
not be IEEE-compliant if zeros are involved and you expect the zero to
have a particular sign.
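To make the signed-zero point concrete, here is a minimal C sketch. It is a
hand-written model of a reduction split across two partial accumulators, not
the code GCC actually emits; the function names and the two-lane shape are
assumptions made purely for illustration.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

/* Strict left-to-right sum, in the order the source code spells out. */
double sum_in_order(double init, const double *a, size_t n)
{
    double s = init;
    for (size_t i = 0; i < n; i++)
        s += a[i];
    return s;
}

/* Hypothetical model of a 2-lane reduction (an assumption, not GCC output):
   the partial accumulators start at +0.0 and are folded into init last. */
double sum_two_lanes(double init, const double *a, size_t n)
{
    double acc0 = 0.0, acc1 = 0.0;
    size_t i = 0;
    for (; i + 1 < n; i += 2) {
        acc0 += a[i];
        acc1 += a[i + 1];
    }
    if (i < n)              /* leftover element when n is odd */
        acc0 += a[i];
    return init + (acc0 + acc1);
}

/* Both sums compare equal to 0.0, but the sign of the zero differs
   (IEEE 754 arithmetic, default round-to-nearest mode). */
void check_signed_zero(void)
{
    const double a[4] = { -0.0, -0.0, -0.0, -0.0 };
    assert(signbit(sum_in_order(-0.0, a, 4)));    /* -0.0 */
    assert(!signbit(sum_two_lanes(-0.0, a, 4)));  /* +0.0 */
}
```

The in-order sum only ever adds -0 to -0, which stays -0. The two-lane
version adds -0 to a +0 accumulator, which gives +0, and the sign is lost.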
> and this not only produces an IEEE-compliant answer
The IEEE standard mandates particular rules for performing operations on
infinities, NaNs, signed zeros, denormals, ... The C standard, by
mandating no reassociation, ensures that you don't mess with NaNs,
infinities, and signed zeros. As soon as you perform reassociation,
there is *no way* you can be sure that you get IEEE-compliant math.
+Inf + (1 / +0) = Inf, +Inf + (1 / -0) = NaN.
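The same thing as a small C sketch, plus one regrouping example of my own
where an intermediate overflow makes (a + b) + c and a + (b + c) differ. The
function names are made up for illustration, and the overflow case assumes
that double arithmetic is really evaluated in double precision (it would not
show up with x87 extended-precision temporaries).

```c
#include <assert.h>
#include <float.h>
#include <math.h>

/* +Inf + (1 / zero): the sign of the zero decides between Inf and NaN. */
double inf_plus_reciprocal(double zero)
{
    return INFINITY + 1.0 / zero;
}

/* Two groupings of the same three-term sum. */
double left_to_right(double a, double b, double c) { return (a + b) + c; }
double regrouped(double a, double b, double c)     { return a + (b + c); }

/* Assumes IEEE 754 doubles with no excess precision (FLT_EVAL_METHOD == 0). */
void check_infinity_examples(void)
{
    assert(isinf(inf_plus_reciprocal(0.0)));    /* +Inf + +Inf = +Inf */
    assert(isnan(inf_plus_reciprocal(-0.0)));   /* +Inf + -Inf = NaN  */

    /* DBL_MAX + DBL_MAX overflows to +Inf before c can cancel it ... */
    assert(isinf(left_to_right(DBL_MAX, DBL_MAX, -DBL_MAX)));
    /* ... while b + c cancels to 0 first in the other grouping. */
    assert(regrouped(DBL_MAX, DBL_MAX, -DBL_MAX) == DBL_MAX);
}
```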
> but it is *more* accurate for almost all data.
http://citeseer.ist.psu.edu/589698.html is an example of a paper that
shows FP code that avoids accuracy problems. Any kind of reassociation
will break that code, and lower its accuracy. That's why reassociation
is an "unsafe" math optimization.
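As an illustration of code that depends on the exact order of operations,
here is a sketch of classic compensated (Kahan) summation. I picked it as a
well-known example; it is not necessarily the algorithm from the paper linked
above. The check at the end assumes plain double-precision evaluation without
x87 excess precision.

```c
#include <assert.h>
#include <stddef.h>

/* Plain accumulation: small terms can be rounded away entirely. */
double naive_sum(const double *a, size_t n)
{
    double s = 0.0;
    for (size_t i = 0; i < n; i++)
        s += a[i];
    return s;
}

/* Kahan summation: c carries the rounding error of each addition. */
double kahan_sum(const double *a, size_t n)
{
    double sum = 0.0, c = 0.0;
    for (size_t i = 0; i < n; i++) {
        double y = a[i] - c;
        double t = sum + y;
        /* Algebraically (t - sum) - y is zero, so a compiler that is
           allowed to reassociate may delete the correction. Numerically
           it is the rounding error of sum + y. */
        c = (t - sum) - y;
        sum = t;
    }
    return sum;
}

/* 1.0 followed by ten terms of 1e-16; the exact sum is 1 + 1e-15.
   Assumes IEEE 754 doubles with no excess precision. */
void check_kahan_example(void)
{
    double a[11];
    a[0] = 1.0;
    for (size_t i = 1; i < 11; i++)
        a[i] = 1e-16;
    assert(naive_sum(a, 11) == 1.0);  /* every small term is lost */
    assert(kahan_sum(a, 11) > 1.0);   /* the correction recovers them */
}
```

If the compiler simplifies c to zero, kahan_sum degenerates into naive_sum,
which is exactly the loss of accuracy described above.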
If you want a -freassociate-fp option, open an enhancement PR and somebody
might be more than happy to separate reassociation from the other
effects of -funsafe-math-optimizations.
(Independent of this, you should also open a separate PR for ATLAS
vectorization, because that would not be a regression and would not be
on x87) :-)