This is the mail archive of the
mailing list for the GNU Fortran project.
Re: [RFC] Quad-float support, round 4
On Mon, Sep 13, 2010 at 08:17:56PM +0200, FX wrote:
> (Yeah, Intel compiler doesn't like real(kind=10)...)
> Again, comparing __float128 to double:
> SQRT is slower by 160, SIN is slower by only 6, COS by 10,
> ASINH by merely 10 and ERFC by 6 again. I'm actually amazed
> at how it does that!
First, I agree that worrying about performance at this point
should not be a priority. Perhaps, someone like Tim Prince
might step forward to lend a hand in optimizing the code.
I took a peek at sinq.c and sinq_kernel.c. These are doing
quite a bit of the arithmetic in __float128 precision. It's
possible that Intel might be doing some things in double with
the FPU. One possible optimization (that would need testing!)
is reducing the argument to the range [0,2*pi); call this
arg. Now, split arg into 3 pieces: the top 38 bits b1, the middle 38
bits b2, and finally the last 37 bits b3 (38 + 38 + 37 = 113, the
width of the __float128 significand), where b1, b2, and b3
are doubles and the splitting is exact.
  arg      = b1 + b2 + b3
  sin(arg) = sin(b1) * cos(b2 + b3) + cos(b1) * sin(b2 + b3)
           = sin(b1) * [cos(b2) * cos(b3) - sin(b2) * sin(b3)]
           + cos(b1) * [sin(b2) * cos(b3) + cos(b2) * sin(b3)]
So, you need 6 calls to sin() and cos() and a handful of
__float128 basic arithmetic operations. While the splitting
of arg into the pieces and trig identities are exact, I would
need to play around to determine the accuracy of the above