This is the mail archive of the
gcc@gcc.gnu.org
mailing list for the GCC project.
Re: FP compares and TARGET_SSE_MATH
Jan Hubicka wrote:
Do you have any benchmarks that suggest that avoiding the mixture of
both is a loss? I briefly benchmarked this on SPECfp at the time I was
implementing this, and using both sets was a win at that time, but times
might have changed.
I have tried some FP benchmarks with -mno-sse, but -mno-sse also disables
the cvtt* patterns, so I don't trust the results. But consider this simple
testcase:
double test (double a, double b) {
  if (a > b)
    return a;
  else
    return b;
}
With '-O2 -S -march=pentium4 -mfpmath=sse -ffast-math', it compiles to:
test:
        pushl   %ebp
        movl    %esp, %ebp
        subl    $8, %esp
        fldl    8(%ebp)
        movsd   16(%ebp), %xmm0
        movsd   %xmm0, -8(%ebp)
        fldl    -8(%ebp)
        fcomi   %st(1), %st
        fcmovb  %st(1), %st
        fstp    %st(1)
        leave
        ret
I also sent a patch to teach regclass to discover the dependencies (i.e. to
avoid putting register X in x87 when it is used in a comparison operator
with register Y that needs to live in SSE). Perhaps we might think
about a solution in this direction. (I originally gave up mostly because
new-RA seemed to make progress.)
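The dependency idea described above can be illustrated with a small sketch. This is not the regclass patch and not GCC code: the names (enum unit, require_unit, note_compare, unit_for) and the union-find approach are assumptions made purely for illustration. Pseudo registers that meet in a compare are tied into one group, and a group inherits the unit that any of its members is forced into.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical unit preference; NOT GCC's real enum reg_class. */
enum unit { UNIT_ANY, UNIT_X87, UNIT_SSE };

#define MAX_PSEUDOS 16

static int parent[MAX_PSEUDOS];         /* union-find forest over pseudos */
static enum unit required[MAX_PSEUDOS]; /* requirement, valid at group roots */

/* Every pseudo starts alone in its own group with no requirement. */
void init_pseudos(void)
{
    for (int i = 0; i < MAX_PSEUDOS; i++) {
        parent[i] = i;
        required[i] = UNIT_ANY;
    }
}

/* Return the representative of r's group (with path halving). */
int find_root(int r)
{
    while (parent[r] != r) {
        parent[r] = parent[parent[r]];
        r = parent[r];
    }
    return r;
}

/* Record that pseudo r must live in unit u.
   Returns false if its group is already pinned to the other unit. */
bool require_unit(int r, enum unit u)
{
    int root = find_root(r);
    if (required[root] != UNIT_ANY && required[root] != u)
        return false;
    required[root] = u;
    return true;
}

/* Record a compare between pseudos a and b and tie their groups together.
   Returns false, leaving the groups separate, when they are pinned to
   different units; in that case a move through memory is unavoidable. */
bool note_compare(int a, int b)
{
    int ra = find_root(a), rb = find_root(b);
    if (ra == rb)
        return true;
    if (required[ra] != UNIT_ANY && required[rb] != UNIT_ANY
        && required[ra] != required[rb])
        return false;
    enum unit merged = required[ra] != UNIT_ANY ? required[ra] : required[rb];
    parent[rb] = ra;
    required[ra] = merged;
    return true;
}

/* Unit chosen for pseudo r: the requirement of its whole group. */
enum unit unit_for(int r)
{
    return required[find_root(r)];
}
```

In this sketch, after init_pseudos(), calling require_unit(1, UNIT_SSE) and then note_compare(0, 1) makes unit_for(0) report UNIT_SSE, so both compare operands end up in the same unit and no store/reload between x87 and SSE is needed for that compare.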
Would this patch also solve situations like this one, when -mfpmath=387 is chosen:
 80a0f0a:  d9 03                 flds   (%ebx)
 80a0f0c:  d9 5c 24 4c           fstps  0x4c(%esp,1)
 80a0f10:  f3 0f 10 44 24 4c     movss  0x4c(%esp,1),%xmm0
 80a0f16:  0f 2f 00              comiss (%eax),%xmm0
 80a0f19:  0f 43 c3              cmovae %ebx,%eax
And similarly for -mfpmath=sse:
 804d112:  dd 43 28              fldl   0x28(%ebx)
 804d115:  f2 0f 11 04 24        movsd  %xmm0,(%esp,1)
 804d11a:  dd 04 24              fldl   (%esp,1)
 804d11d:  31 c0                 xor    %eax,%eax
 804d11f:  df f1                 fcomip %st(1),%st
 804d121:  0f 95 c0              setne  %al
These code fragments are from PovRay, where:
grep fcomi povray.sse | wc -l
668
grep comis povray.387 | wc -l
414
Uros.