[Bug rtl-optimization/19780] Floating point computation far slower for -mfpmath=sse

Thu Apr 5 18:40:00 GMT 2007

------- Comment #20 from ubizjak at gmail dot com  2007-04-05 19:39 -------
(In reply to comment #19)
> what are you using for a compiler? Im using a mainline from mid march, and 

gcc version 4.3.0 20070404 (experimental) on i686-pc-linux-gnu

> with
> it, my .optimized files diff exactly the same, and I get the aforementioned
> time differences in the executables.

This is because -march=pentium4 enables all sse builtins in both cases.

> (sse.c and sse-bad.c are same, just different names to get different output
> files)
> 
> 2007-03-13/gcc> diff sse.c sse-bad.c
> 
> 2007-03-13/gcc>./xgcc -B./ sse.c -fdump-tree-optimized -O3 -march=pentium4 -o
> sse
> 
> 2007-03-13/gcc>./xgcc -B./ sse-bad.c -fdump-tree-optimized -O3 -march=pentium4
> -mfpmath=sse -o sse-bad

This is a known effect of SFmode SSE being slower than SFmode x87. But again,
you have enabled sse(2) builtins due to -march=pentium4.

Please try to compile using only "-O2" and "-O2 -msse". x87 math will be used
in both cases, but .optimized will show the difference. You can also try to
compile with and without -ffast-math.
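
To double-check which configuration a given set of flags actually selects, a
small helper along these lines may be useful. It is only a sketch: the function
names are made up for illustration, and it relies on GCC's predefined macros
__SSE__ (SSE instruction set enabled, e.g. by -msse or -march=pentium4) and
__SSE_MATH__ (SSE selected for SFmode math, e.g. by -mfpmath=sse).

```c
#include <stdio.h>

/* Hypothetical helper: 1 if the SSE instruction set (and thus the SSE
   builtins) is enabled for this compilation, 0 otherwise. */
static int sse_isa_enabled(void)
{
#ifdef __SSE__
    return 1;
#else
    return 0;
#endif
}

/* Hypothetical helper: 1 if SSE is used for single-precision math,
   0 if the compiler falls back to x87 (or the target is not x86). */
static int sse_math_selected(void)
{
#ifdef __SSE_MATH__
    return 1;
#else
    return 0;
#endif
}

/* Write a one-line summary into buf; returns what snprintf returns. */
static int describe_fp_config(char *buf, size_t len)
{
    return snprintf(buf, len, "sse_isa=%d sse_math=%d",
                    sse_isa_enabled(), sse_math_selected());
}
```

Calling describe_fp_config() from main() and printing the buffer should, on
i686, report sse_isa=0 for plain "-O2" and sse_isa=1 for "-O2 -msse", with
sse_math=0 in both cases unless -mfpmath=sse is also given.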

IMO it is not acceptable for tree dumps to depend on target compile flags in
any way...

-- 

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19780