This is the mail archive of the
mailing list for the GCC project.
Re: sin/cos via SSE2, and an alignment bug (was Re: sqrt via SSE2)
> > > There's then the issue, though it's probably more one for the next glibc
> > > release after gcc-3.1 appears, of whether a sin() implementation using
> > > SSE2 code and a suitable rational-function approximation could get adequate
> > > results in less than the 190-or-so cycles that fsin takes: I'm pretty sure
> > > it's possible, even given that the necessary two divides can't take less
> > > than 70 ticks and that one might want a table-lookup for argument reduction.
> > Yes, we need to address this issue eventually.
> Annoyingly, whilst I've quickly cobbled together a strategy that ought to
> work for sin() -- a degree-4 Padé approximation is accurate to within
> 7.5e-18 in [0 .. Pi/32], a 64-V2DF lookup table and a trig identity extend
> to [0 .. 2*Pi], and I trust that
> MOVSD twopi, XMM0
> DIVSD XMM0, XMM1 -- divide by 53-bit-precision 2*PI
> CVTPD2DQ XMM1, XMM2 -- round to nearest integer
> CVTDQ2PD XMM2, XMM3 -- bring back to a double
> MULSD XMM0, XMM3 -- multiply by 53-bit-precision 2*PI
> SUBSD XMM3, XMM1 -- and get the remainder
> is good-enough argument reduction for -ffast-math -- the actual
> implementation really wants to be written using the SSE2 built-ins which at
> present don't exist.
> So I'll put that on a back-burner for the moment and continue bug-hunting:
> I've got a rather suspicious problem at the moment where the use of unions
> containing attribute(("V4SI")) elements either crashes the compiler in
> expr.c, or generates code which uses MOVAPD on non-16-byte-aligned objects
> and segfaults.
Can you show me the testcase? Note that gcc does not properly align the stack
frame of main () if your runtime does not. Many Linux distros ship a glibc
compiled by gcc 2.95, which miscompiles it in such a way that the stack is
misaligned on entry.
Stack alignment works for nested functions, but only as long as the stack is
not misaligned again by a call through some callback from 2.95-compiled code.
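
One quick way to see whether an object really ended up 16-byte aligned at run time is to test its address before any aligned SSE2 move touches it. The helpers below are a minimal sketch; is_aligned and misalignment are hypothetical names, not part of gcc or glibc.

```c
#include <stddef.h>
#include <stdint.h>

/* Return nonzero if p is aligned to `alignment` bytes.
 * `alignment` must be a power of two (16 for SSE2 aligned moves). */
static int is_aligned(const void *p, size_t alignment)
{
    return ((uintptr_t)p & (alignment - 1)) == 0;
}

/* Return how many bytes p lies past the previous `alignment` boundary. */
static size_t misalignment(const void *p, size_t alignment)
{
    return (size_t)((uintptr_t)p & (alignment - 1));
}
```

Printing misalignment(&local, 16) for a vector local in main () and in a callback would show whether the stack was already off on entry.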
> I'll bring in a more complete report tomorrow if the problem still exists
> in -20020218, but for the moment see PR C/5680.