This is the mail archive of the
mailing list for the GCC project.
Re: [PATCH] Constant folding for cabs and fabs.
- From: Roger Sayle <roger at eyesopen dot com>
- To: "Joseph S. Myers" <jsm28 at cam dot ac dot uk>
- Cc: Brad Lucier <lucier at math dot purdue dot edu>, Richard Henderson <rth at redhat dot com>, <gcc-patches at gcc dot gnu dot org>
- Date: Thu, 5 Jun 2003 15:41:51 -0600 (MDT)
- Subject: Re: [PATCH] Constant folding for cabs and fabs.
On Thu, 5 Jun 2003, Joseph S. Myers wrote:
> On Thu, 5 Jun 2003, Roger Sayle wrote:
> > The main hunks are to evaluate cabs of complex constants at compile-time,
> > the extra precision of GCC's internal floating point representation means
> > that its always safe to do this. [Glibc is able to use sqrt(x*x + y*y)
> Could you give more detail of why this is safe (especially given that I
> don't recall an answer to the questions about the correctness of the last
> bit in GCC's sqrt for 106-bit mantissas)? I.e., a description of the
> algorithms, precisions involved (maximum floating point format precision,
> minimum internal precision) and theoretical justification.
The most important issue to remember is that unlike sqrt where the
relevant standards require 0.5 ulp for float and double, C99 cabs
and hypot only require 1 ulp precision. In GCC real.c terms this is
the difference with "perfect" and wildly inaccurate. Getting perfect
results is difficult, but its probably even harder to do worse than
0.6 ulp on simple functions given GCC's 160 bit mantissae.
Taken from libc/sysdeps/ieee754/dbl-64/e_hypot.c:
If (assume round-to-nearest) z=x*x+y*y has error less
than sqrt(2)/2 ulp, than sqrt(z) has error less than
1 ulp (exercise).
Hence without additional precision naive squaring and adding can't be
used. See the above file for a suitable implementation. However it
requires only a little additional precision to provide the appropriate
guarantees for a 1 ulp function. Hence, glibc's x87 implementation
can use the simple algorithm due to its internal 80 bit precision.
In practice, only an additional two or three bits are required.
I think we need a bit of perspective here. real_sqrt is perfectly
rounded to 0.5 ulp when using IEEE singles and doubles. And
admittedly there is a precision issue with native floating point
formats with >= 79 bit mantissa for their long doubles. Currently
the only platforms with this issue are those that use ibm_extended_format
or mips-extended_format with its 106 bit mantissa or IEEE quad format
with is 113 bit mantissae: On these platforms in exceptionally rare
cases it is possible to incorrectly round the last bit with a probability
of approximately 2^-54 or 2^-47 respectively, which is less than
.0000000000000071. In *all* remaining cases, we get the perfectly
rounded result to 0.5 ulp. Hence we achieve 0.5+epsilon ulp for some
small epsilon << 0.0001! The requirement for a conforming cabsl
implementation is sqrt(2)/2 (which is only 0.7071 ulp) prior to sqrt,
when GCC is actually better than 0.5001...
I'm unaware of a default run-time implementation, either in hardware
or via software emulation, that provides anywhere near this level of
accuracy by default for cabs/hypot on any of GCC's supported platforms.
Indeed, relatively few platforms even provide full IEEE long double
support at all.
Of course, I'll be delighted for anyone to submit patches to
improve upon GCC's mathematical function implementations. The
non-zero epsilon still leaves room for improvement, even though
we are already ridiculously more accurate than required by even
the strictest language standards. Indeed I tip my hat to anyone
worthy of the challenge :>
[Perhaps Brad will correct, confirm or refute the above arguments?]
I'm more concerned about monotonicity requirements. Even if GCC
always produced perfect results, we still couldn't use them if the
system library isn't as accurate. And worse still, GCC is competing
in a environment where commercial compilers, such as Intel's and
Microsoft's, enable GCC's equivalent of -ffast-math by default...