This is the mail archive of the
`gcc-patches@gcc.gnu.org`
mailing list for the GCC project.

Index Nav: | [Date Index] [Subject Index] [Author Index] [Thread Index] | |
---|---|---|

Message Nav: | [Date Prev] [Date Next] | [Thread Prev] [Thread Next] |

Other format: | [Raw text] |

*From*: Roger Sayle <roger at eyesopen dot com>*To*: Richard Guenther <rguenth at tat dot physik dot uni-tuebingen dot de>, Paolo Carlini <pcarlini at suse dot de>*Cc*: gcc-patches at gcc dot gnu dot org, libstdc++ <libstdc++ at gcc dot gnu dot org>*Date*: Sat, 13 Mar 2004 05:49:23 -0700 (MST)*Subject*: Re: 11706 vs ([lno] Canonical iv creation)

On Sat, 13 Mar 2004, Richard Guenther wrote: > > Really interesting issue... > > > > Anyway, from a practical point of view I'm not really worried, since: > > > > 1- Glibc uses internally the very same algorithm, at least on x86 (just > > checked). > > > > 2- Which requires O(log n) multiplications and is discussed by Knuth in > > Sec. 4.6.3 of the second volume. You can find it almost *everywhere*, > > for instance also in SGI's power. > > So there is no point in not optimizing ::pow(x, 4) without -ffast-math? There is a potential accuracy problem, which is why GCC doesn't inline ::pow(x,4) without -ffast-math. To demonstrate via an example, consider a decimal floating point format whose mantissa is only a single digit. The correct, perfectly rounded result for pow(4,4) is 3e2, as 256 correctly rounded to nearest is 300. However, x*x produces the result 2e1 (as 16 rounds to 20), and this value squared results in 4e2. [Interestingly, for this example (((x*x)*x)*x) with three multiplications/roundings produces a better result for this particular example, but the general case error analysis is beyond me] Basically, "pow" can only be guaranteed to be rounded perfectly when implemented using multiplications, if only a single multiplication (i.e. rounding step) is required. Hence exponents -2, -1, 0, 1 and 2 of ::pow are the only ones optimized without -ffast-math. The great philosophical issue is GCC's decision that if it can't produce a perfectly rounded 0.5 ulp result itself it should call the system library, which theoretically may implement a more accurate algorithm. Often in practice, however, the system library isn't perfect and uses the same flawed method that would be used by -ffast-math. A good example are "sin" and "cos" on x86, where GCC knows that using the x87's fsin and fcos instructions produces significant errors for some arguments, so call the libm versions rather than use these opcode inline. Most system libraries [glibc, libc, newlib, msvcrt etc...] favour speed over accuracy and therefore use the same instructions that GCC was concerned about! As for implementing pow(3) using the binary method for constructing addition chains *everywhere*, these actually require more multiplications that the method GCC uses to inline exponentiation by a constant, see: http://gcc.gnu.org/ml/gcc-patches/2003-06/msg02447.html. Compare std::pow(x,15) which uses 6 multiplications with ::pow(x,15) which uses only 5. The binary method is a reasonable choice where the exponent isn't known ahead of time (for small calculations), but GCC has the compile-time luxury for constant powers or choosing a more efficient addition chain. A quick peek at OpenSSL and similar cryptographic software, for example, shows that raising very large numbers to large (almost) prime exponents almost never use the binary method. A 1024-bit RSA encryption/decryption exponentiation can typically be done in far fewer than the 1536 multiplications required on average by the binary method. Indeed the section of Knuth quoted above describes the NP-complete addition chain problem, and the "obvious" non-optimality of the binary method. Whether libstdc++-v3 can be more pragmatic than GCC about system library accuracy, and even whether -ffast-math is a reasonable default for the compiler (as its is with several commercial compilers), are larger arguments. I doubt anyone would practically use a method more accurate than two multiplications to implement pow(x,4). So I agree with Paolo's disclaimer "from a practical point of view I'm not really worried.." :> I hope this helps, Roger --

**Follow-Ups**:**Re: 11706 vs ([lno] Canonical iv creation)***From:*Richard Guenther

**References**:**Re: 11706 vs ([lno] Canonical iv creation)***From:*Richard Guenther

Index Nav: | [Date Index] [Subject Index] [Author Index] [Thread Index] | |
---|---|---|

Message Nav: | [Date Prev] [Date Next] | [Thread Prev] [Thread Next] |