This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



RE: 3.0.1 performance


interesting results.
on the topic of performance on numerical/fp intensive code, i think another
good benchmark of practical code might be dave eberly's lib:
http://www.magic-software.com/

and this lib is pretty commonly used in the game development community.

see sleep;
-akbar A.



-----Original Message-----
From: gcc-owner@gcc.gnu.org [mailto:gcc-owner@gcc.gnu.org]On Behalf Of
Kurt Garloff
Sent: Friday, August 17, 2001 2:17 AM
To: gcc@gcc.gnu.org
Subject: 3.0.1 performance


Hi,

doing some performance evaluation of the 3.0.1 prerelease (20010816)
gives rather disappointing results :-(
The code used is the benchmark from TBCI NumLib 2.2.1
http://pebbles.e-technik.uni-dortmund.de/numlib/
(If you want to test yourself, please ask me to send you an updated
 Makefile.)
Doxygen documentation of this code is also available.

The code is heavily inlined templated numerical code representing Vectors
and Matrices and including solvers and similar stuff.

Here are the results:

3.0
---
garloff@pckurt:~/Physics/numerix-2.0-gcc3/bench $ time make OPT=1 OPT_ARCH=1
CC="gcc -V 3.0" CXX="g++ -V 3.0" SYSDEP=bin-ix86-300 -j2
[...]
real    2m32.318s	user    3m48.110s	sys     0m7.130s

garloff@pckurt:~/Physics/numerix-2.0-gcc3/bench $
bin-ix86-300/libbench_double
=============================================================
TBCI NumLib (2.2.1) benchmark 1.35 (Aug 17 2001 10:40:53)
[...]
Total for ALL           : 20.394s e, 20.330s u+s
=============================================================
=> libbench SPEC values :  0.853      0.855

This value is compared with the same benchmark on this reference machine:
[Athlon1-700, 256MB PC-133, Linux-2.4, TBCI 2.1.2, gcc-3.0.0, OPT=1
OPT_ARCH=1]
(OPT_ARCH evaluates to -march=athlon there)

Flags are:
Compiler: Reading specs from
 /raid/gcc300/lib/gcc-lib/i686-pc-linux-gnu/3.0/specs ../configure
 --with-gcc-version-trigger=/raid/egcs/gcc/version.c
 --host=i686-pc-linux-gnu
 --with-system-zlib --with-gnu-ld --with-gnu-as --enable-libstdcxx-v3
 --prefix=/raid/gcc300 --enable-haifa --enable-threads=posix Thread model:
 posix gcc driver version 3.0.1 20010816 (prerelease) executing gcc version
3.0 g++: No input files
Bench flags:  -O2  -ffast-math -felide-constructors -O3 -fschedule-insns2
 -funroll-loops -freduce-all-givs -frerun-loop-opt -fno-inline-functions
 -fstrict-aliasing   -mcpu=pentiumpro -march=pentiumpro -D__GNUC_SUBVER__=0
Machine info: pckurt.casa-etp.nl Intel Intel 686 686 Pentium III
 (Coppermine) Pentium III (Coppermine) [8] [8] st.6 st.6 708.115 MHz 708.115
 MHz 1412.30 Bogos 1415.57 Bogos 627 MB Linux 2.4.7-SMP SMP
Fri Aug 17 10:46:29 CEST 2001 up 1 day, 6:53, 10 users, load: 2.64, 2.81,
2.74

3.0.1
-----
garloff@pckurt:~/Physics/numerix-2.0-gcc3/bench $ time make OPT=1 OPT_ARCH=1
SYSDEP=bin-ix86-301 -j2
real    1m56.750s	user    2m49.840s	sys     0m3.970s

garloff@pckurt:~/Physics/numerix-2.0-gcc3/bench $
bin-ix86-301/libbench_double
[...]
Total for ALL           : 54.430s e, 54.390s u+s
=============================================================
=> libbench SPEC values :  0.319      0.320

For reference:
2.95.3 (SuSE)
------
garloff@pckurt:~/Physics/numerix-2.0-gcc295/bench $ time make OPT=1
OPT_ARCH=1 -j2
real    2m24.167s	user    3m35.440s	sys     0m5.790s

garloff@pckurt:~/Physics/numerix-2.0-gcc295/bench $ bin-ix86/libbench_double
[...]
Total for ALL           : 20.148s e, 20.010s u+s
=============================================================
=> libbench SPEC values :  0.863      0.869


Conclusion:
===========
While compile time with 3.0.1 improved by a factor of 1.34 over 3.0
(1.27 over 2.95.3), run-time performance of the generated code dropped
dramatically, by a factor of 2.67 compared to 3.0 (2.71 compared to 2.95.3).
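
For readers who want to check these factors, here is a minimal sketch in
Python that recomputes them from the figures quoted above. Which columns
were used for the conclusion is an assumption on my part: the "user" build
times reproduce 1.34 and 1.27, and the first libbench SPEC column reproduces
2.67 and 2.71 (the elapsed benchmark totals give about 2.67 and 2.70). The
function and variable names are made up for this illustration.

```python
import re

def to_seconds(stamp: str) -> float:
    """Convert a shell `time` stamp such as '2m32.318s' to seconds."""
    m = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", stamp)
    if m is None:
        raise ValueError(f"unrecognised time stamp: {stamp!r}")
    return int(m.group(1)) * 60 + float(m.group(2))

# Figures copied from the runs quoted in this message.
BUILD_USER = {"3.0": "3m48.110s", "3.0.1": "2m49.840s", "2.95.3": "3m35.440s"}
BUILD_REAL = {"3.0": "2m32.318s", "3.0.1": "1m56.750s", "2.95.3": "2m24.167s"}
BENCH_ELAPSED = {"3.0": 20.394, "3.0.1": 54.430, "2.95.3": 20.148}
SPEC = {"3.0": (0.853, 0.855), "3.0.1": (0.319, 0.320), "2.95.3": (0.863, 0.869)}

def build_speedup(times, baseline, candidate="3.0.1"):
    """How many times faster the candidate compiler builds the benchmark."""
    return to_seconds(times[baseline]) / to_seconds(times[candidate])

def runtime_slowdown(baseline, candidate="3.0.1"):
    """How many times longer the candidate's binary runs (elapsed totals)."""
    return BENCH_ELAPSED[candidate] / BENCH_ELAPSED[baseline]

def spec_ratio(baseline, candidate="3.0.1", column=0):
    """Ratio of libbench SPEC values (higher SPEC value means faster code)."""
    return SPEC[baseline][column] / SPEC[candidate][column]

if __name__ == "__main__":
    for base in ("3.0", "2.95.3"):
        print(f"3.0.1 vs {base}:")
        print(f"  build speedup (user time): {build_speedup(BUILD_USER, base):.2f}")
        print(f"  build speedup (real time): {build_speedup(BUILD_REAL, base):.2f}")
        print(f"  run-time slowdown (elapsed): {runtime_slowdown(base):.2f}")
        print(f"  run-time slowdown (SPEC):    {spec_ratio(base):.2f}")
```

Note that the builds ran with -j2, so the "real" build times give smaller
speedups than the "user" times do.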

Looking at the detailed results, the vector operations seem to have almost
the same performance, whereas the matrix operations are much slower, and
thus the solvers are also very slow.

Is this problem known?

I've not been following the discussions on the gcc list too closely over the
last weeks, but I suspect the inlining limitations. I guess we've gone too far
there. Or did the meaning of -fno-inline-functions change, so that it no longer
prevents only automatic inlining but all inlining?

I have not yet played with parameters such as plain -O3 (is it usable in
3.0.1?) or -finline-limit. Should I?
Are there other things I should try?
Suggestions, patches, command-line parameters, ... welcome.

(Please Cc: me when answering, I'm not subscribed to the gcc list.)

Regards,
--
Kurt Garloff                   <kurt@garloff.de>         [Eindhoven, NL]
Physics: Plasma simulations  <K.Garloff@Phys.TUE.NL>  [TU Eindhoven, NL]
Linux: SCSI, Security          <garloff@suse.de>    [SuSE Nuernberg, DE]
 (See mail header or public key servers for PGP2 and GPG public keys.)

