This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.
Re: GCC Benchmarks (coybench), AMD64 and i686, 14 August 2004
- From: Uros Bizjak <uros at kss-loka dot si>
- To: gcc at gcc dot gnu dot org, dberlin at dberlin dot org
- Date: Wed, 18 Aug 2004 12:03:47 +0200
- Subject: Re: GCC Benchmarks (coybench), AMD64 and i686, 14 August 2004
Without -xN, icc generates:
# parameter 1: 8 + %ebx
..B3.1: # Preds ..B3.0
pushl %ebx #62.1
movl %esp, %ebx #62.1
andl $-8, %esp #62.1
fldl 8(%ebx) #61.15
fldl PI2 #63.28
fcom %st(1) #63.28
fnstsw %ax #63.28
sahf #63.28
ja .L9 # Prob 50% #63.28
fst %st(1) #63.28
.L9: #
fstp %st(0) #63.28
fsincos #63.18
fxch %st(1) #63.18
fadd %st(0), %st #63.18
fmulp %st, %st(1) #63.47
movl %ebx, %esp #63.47
popl %ebx #63.47
ret #63.47
.align 4,0x90
# LOE
icc transforms the dv() function from:
{
return 2.0 * sin (((x < PI2) ? x : PI2)) * cos (((x < PI2) ? x : PI2));
}
into:
{
double tmp = (x < PI2) ? x : PI2;
return 2.0 * sin (tmp) * cos (tmp);
}
By introducing a temporary variable, gcc is able to produce even better code
for the second line:
.L4:
fsincos
fmulp %st, %st(1)
fadd %st(0), %st
ret
... and somewhat unoptimized code for the first line (jumps!):
dv:
fldl .LC0
fldl 4(%esp)
fld %st(1)
fxch %st(1)
fcom %st(2)
fnstsw %ax
fstp %st(2)
sahf
ja .L6
fstp %st(0)
jmp .L4
.p2align 4,,7
.L6:
fstp %st(1)
.L4:
It looks like gcc does not detect that the arguments to sin() and cos() are
actually the same.
Uros.