This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.
[Bug target/27827] gcc 4 produces worse x87 code on all platforms than gcc 3
- From: "uros at kss-loka dot si" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: 1 Jun 2006 08:43:34 -0000
- Subject: [Bug target/27827] gcc 4 produces worse x87 code on all platforms than gcc 3
- References: <bug-27827-12761@http.gcc.gnu.org/bugzilla/>
- Reply-to: gcc-bugzilla at gcc dot gnu dot org
------- Comment #9 from uros at kss-loka dot si 2006-06-01 08:43 -------
The benchmark run on a Pentium 4 (3.2 GHz, 800 MHz FSB, 32-bit):
vendor_id : GenuineIntel
cpu family : 15
model : 2
model name : Intel(R) Pentium(R) 4 CPU 3.20GHz
stepping : 9
cpu MHz : 3191.917
cache size : 512 KB
shows even more interesting results:
gcc version 3.4.6
vs.
gcc version 4.2.0 20060601 (experimental)
-fomit-frame-pointer -O -msse2 -mfpmath=sse
GCC 3.x performance:
./xmm_gcc
ALGORITHM     NB   REPS        TIME      MFLOPS
=========  =====  =====  ==========  ==========
  atlasmm     60   1000       0.162     2664.87
GCC 4.x performance:
./xmm_gc4
ALGORITHM     NB   REPS        TIME      MFLOPS
=========  =====  =====  ==========  ==========
  atlasmm     60   1000       0.164     2633.13
and
-fomit-frame-pointer -O -mfpmath=387
GCC 3.x performance:
./xmm_gcc
ALGORITHM     NB   REPS        TIME      MFLOPS
=========  =====  =====  ==========  ==========
  atlasmm     60   1000       0.160     2697.37
GCC 4.x performance:
./xmm_gc4
ALGORITHM     NB   REPS        TIME      MFLOPS
=========  =====  =====  ==========  ==========
  atlasmm     60   1000       0.164     2633.15
There is a small performance drop on gcc-4.x, but nothing critical.
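To put a number on "small", here is a minimal sketch that computes the relative drop from the MFLOPS figures above. The dictionary layout and the helper name relative_drop are only illustrative and are not part of the benchmark; the numbers are copied from the tables in this comment.

```python
# MFLOPS figures copied from the Pentium 4 tables in this comment.
results = {
    "-msse2 -mfpmath=sse": {"gcc-3.4.6": 2664.87, "gcc-4.2.0": 2633.13},
    "-mfpmath=387": {"gcc-3.4.6": 2697.37, "gcc-4.2.0": 2633.15},
}


def relative_drop(old_mflops, new_mflops):
    """Fractional throughput loss when going from old_mflops to new_mflops."""
    return (old_mflops - new_mflops) / old_mflops


for flags, mflops in results.items():
    drop = relative_drop(mflops["gcc-3.4.6"], mflops["gcc-4.2.0"])
    print(f"{flags}: gcc-4.2.0 is {drop:.2%} slower than gcc-3.4.6")
```

With these inputs the drop comes out at roughly 1.2% for SSE math and 2.4% for x87 math. These are single runs, so differences of this size may well be within measurement noise.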
I can confirm that the code indeed runs >50% slower on a 64-bit Athlon. Perhaps
the problem is the order of instructions (Software Optimization Guide for AMD
Athlon 64, Section 10.2). The gcc-3.4 code resembles the guide's example of how
things should be done, and the gcc-4.2 code resembles its example of how things
should _NOT_ be done.
BTW: Did you try running the benchmark on an AMD target with -march=k8? The
effects of this flag are devastating on a Pentium 4 CPU:
-O -msse2 -mfpmath=sse -march=k8
GCC 3.x performance:
./xmm_gcc
ALGORITHM     NB   REPS        TIME      MFLOPS
=========  =====  =====  ==========  ==========
  atlasmm     60   1000       0.836      516.79
GCC 4.x performance:
./xmm_gc4
ALGORITHM     NB   REPS        TIME      MFLOPS
=========  =====  =====  ==========  ==========
  atlasmm     60   1000       0.287     1504.66
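For scale, a rough sketch of the slowdown factor that -march=k8 causes on this Pentium 4, relative to the earlier -msse2 -mfpmath=sse runs. The helper name slowdown_factor is only illustrative. Note the assumption: the earlier runs also used -fomit-frame-pointer, which is not listed for the -march=k8 runs, so the two flag sets are not strictly identical.

```python
# MFLOPS from the -msse2 -mfpmath=sse tables in this comment:
# "baseline" is the earlier run, "k8" is the run with -march=k8 added.
# Assumption: the missing -fomit-frame-pointer in the k8 run is ignored here.
sse_mflops = {
    "gcc-3.4.6": {"baseline": 2664.87, "k8": 516.79},
    "gcc-4.2.0": {"baseline": 2633.13, "k8": 1504.66},
}


def slowdown_factor(baseline_mflops, tuned_mflops):
    """How many times slower the tuned build runs than the baseline build."""
    return baseline_mflops / tuned_mflops


for compiler, mflops in sse_mflops.items():
    factor = slowdown_factor(mflops["baseline"], mflops["k8"])
    print(f"{compiler}: {factor:.2f}x slower with -march=k8")
```

On these figures the gcc-3.4.6 binary slows down by about a factor of 5.2 and the gcc-4.2.0 binary by about a factor of 1.75.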
--
uros at kss-loka dot si changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
     Ever Confirmed|0                           |1
   Last reconfirmed|0000-00-00 00:00:00         |2006-06-01 08:43:34
               date|                            |
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=27827