This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.
Re: GCC viciously beaten by ICC in trig test!
- From: Dan Nicolaescu <dann at godzilla dot ics dot uci dot edu>
- To: Stelios Xanthakis <sxanth at ceid dot upatras dot gr>
- Cc: Roger Sayle <roger at eyesopen dot com>, gcc at gcc dot gnu dot org
- Date: Mon, 15 Mar 2004 10:21:26 -0800
- Subject: Re: GCC viciously beaten by ICC in trig test!
- References: <Pine.GSO.4.21.0403151400100.29891-100000@zenon.ceid.upatras.gr>
Stelios Xanthakis <sxanth@ceid.upatras.gr> writes:
> On Sun, 14 Mar 2004, Dan Nicolaescu wrote:
>
> > Roger Sayle <roger@eyesopen.com> writes:
> > > fsin
> > > fmul %st(0), %st
> >
> > Intel 8.0 (that was used in the original test) generates something
> > very different:
Please be careful when snipping; the essential part that you deleted
is this:
        call      __libm_sse2_sincos            #7.15
                                # LOE ebp esi edi xmm0 xmm1
..B1.4:                         # Preds ..B1.1
i.e. ICC 8 generates a call to an SSE library function instead of
using the fsin instruction. Given that this changed from ICC 7 to ICC
8, the library function is probably faster.
> > mulsd %xmm1, %xmm1 #10.25
> > mulsd %xmm0, %xmm0 #10.15
> > addsd %xmm1, %xmm0 #10.25
> > movsd %xmm0, (%esp) #10.25
> > fldl (%esp) #10.25
> >
>
> Does -mfpmath=sse fix this?
> Can the processor in question do SSE for doubles?
>
> In my experience, "-mfpmath=sse -fsingle-precision-constant"
> generates much faster code for a raytracer I have here.
See above.