This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
Re: RFC: x86: Emit rsqrt with only -ffast-math
- From: Geert Bosch <bosch at adacore dot com>
- To: Michael Matz <matz at suse dot de>
- Cc: gcc-patches at gcc dot gnu dot org
- Date: Wed, 4 Nov 2009 21:33:19 -0500
- Subject: Re: RFC: x86: Emit rsqrt with only -ffast-math
- References: <Pine.LNX.4.64.0911031425320.15566@wotan.suse.de>
On Nov 3, 2009, at 08:35, Michael Matz wrote:
> * replacing sqrt(r) with r*[rsqrt(r)+NR] breaks aermod of polyhedron
Wouldn't you want to do NR on the whole expression directly?
This way the final accuracy will have twice the number of bits
of the initial rsqrt estimate. Doing the multiplication after
the Newton-Raphson iteration will cost an additional bit of
precision.
Something like:
float sqrt (float r) {
  float q = rsqrt (r);        // approx. 1.0 / sqrt (r)
  float y = q * r;            // approx. sqrt (r)
  float e = y * y - r;        // y == sqrt (r + e), with e small
  return y - e / (2.0 * y);   // Newton-Raphson correction
}
If the initial approximation y is sufficiently close, you
might be able to get away with y - e * rcp(2.0 * y).
Basically, if y is already correct to 12 bits, the division
result needs only to be accurate to 12 bits itself, to get
close to 24 bits of final accuracy.
Anyway, I think the proposed replacement needs a bit
more discussion and numerical analysis.
Using reciprocal approximations does not in itself mean
reduced accuracy. IA-64, for example, relies on these
techniques to implement correctly rounded and efficient
division and square-root operations.
The question here is what our speed/accuracy trade-off is.
-Geert
PS. It would be interesting to know the problematic sqrt
evaluations in the aermod test case...