This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.
Re: Gcc 3.1 performance regressions with respect to 2.95.3
- From: Jan Hubicka <jh at suse dot cz>
- To: Peter Schmid <schmid at snake dot iap dot physik dot tu-darmstadt dot de>
- Cc: gcc at gcc dot gnu dot org, libstdc++ at gcc dot gnu dot org
- Date: Tue, 12 Mar 2002 11:29:37 +0100
- Subject: Re: Gcc 3.1 performance regressions with respect to 2.95.3
- References: <Pine.LNX.4.30.0203120018400.27249-100000@snake.iap.physik.tu-darmstadt.de>
> I ran the bench++ test suite
> <www.research.att.com/~orost/bench_plus_plus.html> on my system.
> My system setup is:
> Pentium II running at 350 MHz, Linux 2.4.18, SuSE 7.3, glibc 2.2.4 +
> patches, Binutils 020305, gcc 3.1 0020306 on the 686-pc-linux-gnu target.
>
> I compared the performances of the binaries compiled by gcc 3.1 and gcc
> 2.95 at the -O2 optimization level. Unfortunately, there are
> regressions. 48 of the 121 tests ran faster when compiled by gcc
> 2.95.3. 30 of these ran more than 15 % faster.
>
> Gcc 3.1 is slower in these areas: E (exception handling), L (loop
> overhead), G (I/O) and S (Stepanov).
This looks like a very interesting benchmark. We now have periodic
testing of basic code generation for the C/Fortran/C++ tests in SPEC2000,
where 3.1 is quite superior now.
This benchmark appears to test more aspects of a C++ compiler, such as
exception handling, which SPEC ignores completely. The slowdowns in EH are
to be expected, but I am quite surprised about the loop overhead. Does the
benchmark contain some inner loops that slow down and are simple enough to
be examined by hand?
Perhaps we can think about running periodic testing of this or a similar
C++-centric benchmark, as C++ performance is much less tested right now than
C is.
OK, time to take a look at the referenced page :)
Honza
>
> Hope this helps,
>
> Peter Schmid
>
> RELATIVE TIMES (Pentium II, 350 MHz; both compilers at -O2)
> TEST NAME    gcc 2.95.3    gcc 3.1
> ---------    ----------    -------
> A000091 1.00 0.94
> A000092 1.00 0.89
> A000094a 1.00 1.01 *
> A000094b 1.00 0.93
> A000094c 1.00 0.70
>
> A000094d 1.00 0.99
> A000094e 1.00 1.05 *
> A000094f 1.00 0.78
> A000094g 1.00 0.96
> A000094h 1.00 0.75
>
> A000094i 1.00 0.66
> A000094j 1.00 1.38 * +
> A000094k 1.00 0.60
> B000002b 1.00 0.94
> B000003b 1.00 0.76
>
> B000004b 1.00 0.95
> B000010 1.00 0.94
> B000011 1.00 0.86
> B000013 1.00 0.69
> D000001 1.00 0.96
>
> D000002 1.00 1.00
> D000003 1.00 0.99
> D000004 1.00 1.00
> D000005 1.00 0.31
> D000006 1.00 0.99
>
> E000001 1.00 2.95 * +
> E000002 1.00 1.54 * +
> E000003 1.00 1.34 * +
> E000004 1.00 1.11 *
> E000006 1.00 -- (does not compile)
>
> E000007 1.00 1.93 * +
> E000008 1.00 1.25 * +
> F000001 1.00 0.18
> F000002 1.00 1.48 * +
> F000003 1.00 0.71
>
> F000004 1.00 0.76
> F000005 1.00 0.92
> F000006 1.00 0.50
> F000007 1.00 0.84
> F000008 1.00 0.61
>
> G000001 1.00 2.99 * +
> G000002 1.00 0.64
> G000003 1.00 0.12
> G000004 1.00 0.46
> G000005 1.00 4.89 * +
>
> G000006 1.00 1.61 * +
> G000007 1.00 3.11 * +
> H000001 1.00 0.45
> H000002 1.00 0.50
> H000003 1.00 0.76
>
> H000004 1.00 0.75
> H000005 1.00 0.00
> H000006 1.00 0.24
> H000007 1.00 2.07 * +
> H000008 1.00 0.35
>
> H000009 1.00 0.55
> L000001 1.00 1.13 *
> L000002 1.00 1.13 *
> L000003 1.00 1.23 * +
> L000004 1.00 1.17 *
>
> O000001a 1.00 0.78
> O000001b 1.00 0.77
> O000002a 1.00 0.79
> O000002b 1.00 0.96
> O000003a 1.00 0.90
>
> O000003b 1.00 0.81
> O000004a 1.00 1.07 *
> O000004b 1.00 1.07 *
> O000005a 1.00 0.80
> O000005b 1.00 0.36
>
> O000006a 1.00 0.66
> O000006b 1.00 0.66
> O000007a 1.00 1.05 *
> O000007b 1.00 1.25 * +
> O000008a 1.00 1.05 *
>
> O000008b 1.00 0.93
> O000009a 1.00 0.62
> O000009b 1.00 0.61
> O000010a 1.00 1.04 *
> O000010b 1.00 0.95
>
> O000011a 1.00 1.22 * +
> O000011b 1.00 0.85
> O000012a 1.00 0.91
> O000012b 1.00 0.99
> P000001 1.00 0.68
>
> P000002 1.00 0.66
> P000003 1.00 0.65
> P000004 1.00 nan
> P000005 1.00 1.16 * +
> P000006 1.00 1.04 *
>
> P000007 1.00 0.96
> P000008 1.00 1.30 * +
> P000010 1.00 1.09 *
> P000011 1.00 1.08 *
> P000012 1.00 0.94
>
> P000013 1.00 1.00
> P000020 1.00 0.89
> P000021 1.00 0.62
> P000022 1.00 0.62
> P000023 1.00 1.16 * +
>
> S000001a 1.00 0.67
> S000001b 1.00 0.75
> S000002a 1.00 0.89
> S000002b 1.00 1.01 *
> S000003a 1.00 0.55
>
> S000003b 1.00 0.98
> S000004a 1.00 1.04 *
> S000004b 1.00 1.23 * +
> S000005a 1.00 1.37 * +
> S000005b 1.00 1.50 * +
>
> S000005c 1.00 1.05 *
> S000005d 1.00 1.47 * +
> S000005e 1.00 1.24 * +
> S000005f 1.00 1.53 * +
> S000005g 1.00 1.24 * +
>
> S000005h 1.00 1.55 * +
> S000005i 1.00 1.04 *
> S000005j 1.00 1.47 * +
> S000005k 1.00 1.21 * +
> S000005l 1.00 1.50 * +
>
> S000005m 1.00 1.52 * +
>
> a000090.cpp Measure clock resolution by second differences
> a000091.cpp Dhrystone
> a000092.cpp Whetstone
> a000094a..k Hennessy benchmarks ***+
>
> This group of tests measures the performance of some
> real (useful) C++ code, including a tracker algorithm,
> an Orbit calculation, a Kalman filter, and a Centroid
> algorithm. Here is where other small useful benchmarks
> should be added. Please send ideas to
> "joseph.orost@att.com".
> b000002b.cpp Tracker: float
> b000003b.cpp Tracker: double
> b000004b.cpp Tracker: float & int
> b000010.cpp Orbit
> b000011.cpp Kalman
> b000013.cpp Centroid
>
> This group of tests measures dynamic allocation related
> timing.
> d000001.cpp malloc & free: 1000 ints
> d000002.cpp malloc & init & free: 1000 ints
> d000003.cpp new & delete: 1000 ints
> d000004.cpp new & init & delete: 1000 ints
> d000005.cpp alloca: 1000 ints (optional test)
> d000006.cpp alloca & init: 1000 ints (optional test)
>
> This group of tests measures exception related timing.
> For historical reasons, e000005 is missing.
> e000001.cpp Local exception caught * +
> e000002.cpp Class method exception caught * +
> e000003.cpp Procedure exception caught: 3-deep * +
> e000004.cpp Procedure exception caught: 4-deep *
> e000006.cpp Declared Procedure exception caught: 4-deep
> e000007.cpp Procedure exception caught: 4-deep re-thrown at each
> level * +
> e000008.cpp Procedure exception 4-deep: Implemented using
> setjmp/longjmp * +
>
> This group of tests measures coding style related timing.
> f000001.cpp Boolean assignment
> f000002.cpp Boolean if * +
> f000003.cpp 2-way if/else
> f000004.cpp 2-way switch
> f000005.cpp 10-way if/else
> f000006.cpp 10-way switch
> f000007.cpp 10-way sparse switch
> f000008.cpp 10-way virtual function call
>
> This group of tests measures I/O related timing.
> g000001.cpp iostream.getline: 20 char buffer * +
> g000002.cpp iostream.>> : 20 chars in loop
> g000003.cpp iostream.<< : 20 char buffer
> g000004.cpp iostream.<< : 20 chars in loop
> g000005.cpp istrstream.>> : int * +
> g000006.cpp istrstream.>> : float * +
> g000007.cpp fstream.open/fstream.close * +
>
> This group of tests measures machine level features.
> h000001.cpp packed bit arrays
> h000002.cpp unpacked bit arrays
> h000003.cpp packed bit ops in loop
> h000004.cpp unpacked bit ops in loop
> h000005.cpp int conversion
> h000006.cpp 10-float conversion
> h000007.cpp bit-fields * +
> h000008.cpp bit-fields and packed bit array
> h000009.cpp pack and unpack class objects
>
> This group of tests measures loop overhead related timing.
> l000001.cpp "for" loop *
> l000002.cpp "while" loop *
> l000003.cpp inf. loop w/break * +
> l000004.cpp 5-iteration loop *
>
> This group of tests measures optimizer performance.
> o000001a.cpp Constant Propagation (including math functions)
> o000001b.cpp " Hand Optimized
> o000002a.cpp Local Common Sub-expression (including math functions)
> o000002b.cpp " Hand Optimized
> o000003a.cpp Global Common Sub-expression
> o000003b.cpp " Hand Optimized
> o000004a.cpp Unnecessary Copy *
> o000004b.cpp " Hand Optimized *
> o000005a.cpp Code Motion (including math functions)
> o000005b.cpp " Hand Optimized
> o000006a.cpp Induction Variable
> o000006b.cpp " Hand Optimized
> o000007a.cpp Reduction in Strength (including math functions) *
> o000007b.cpp " Hand Optimized * +
> o000008a.cpp Dead Code *
> o000008b.cpp " Hand Optimized
> o000009a.cpp Loop Jamming
> o000009b.cpp " Hand Optimized
> o000010a.cpp Redundant Code *
> o000010b.cpp " Hand Optimized
> o000011a.cpp Unreachable Code * +
> o000011b.cpp " Hand Optimized
> o000012a.cpp String Ops
> o000012b.cpp " Hand Optimized
>
> This group of tests measures procedure call related timing.
> There is no test p000009, nor 14 thru 19.
> p000001.cpp Procedure Call: No Args
> p000002.cpp Procedure Call: No Args: Catches Exceptions
> p000003.cpp Static Class Method Call: No Args: Catches Exceptions
> p000004.cpp Inline Procedure Call: No Args
> p000005.cpp Static Class Method Call: 1-int Arg: Catches Exceptions * +
> p000006.cpp Static Class Method Call: 1-int *Arg: Catches Exceptions *
> p000007.cpp Static Class Method Call: 1-int &Arg: Catches Exceptions
> p000008.cpp Procedure Call: No Parameters: Called thru pointer,
> Catches Exceptions * +
> p000010.cpp Procedure Call: 10-int Args: Catches Exceptions *
> p000011.cpp Procedure Call: 20-int Args: Catches Exceptions *
> p000012.cpp Procedure Call: 10-(3-int) Args: Catches Exceptions
> p000013.cpp Procedure Call: 20-(3-int) Args: Catches Exceptions
> p000020.cpp Class Method Call: 1-"this" Arg: Catches Exceptions
> p000021.cpp Virtual Class Method Call: 1-"this" Arg: Catches Exceptions
> p000022.cpp Virtual Const Class Method Call: 1-"this" Arg: Catches Exceptions
> p000023.cpp Same as p000022: called in loop to see if lookup is
> optimized * +
>
> This group of tests measures object oriented style vs.
> C style.
> s000001a.cpp Max: C++ Style
> s000001b.cpp Max: C Style
> s000002a.cpp Matrix: C++ Style
> s000002b.cpp Matrix: C Style +
> s000003a.cpp Iterator: C++ Style
> s000003b.cpp Iterator: C Style
> s000004a.cpp Complex: C++ Style *
> s000004b.cpp Complex: C Style * +
> s000005a.cpp Stepanov: C++ Style Abstraction Level 12 * +
> s000005b.cpp Stepanov: C++ Style Abstraction Level 11 * +
> s000005c.cpp Stepanov: C++ Style Abstraction Level 10 *
> s000005d.cpp Stepanov: C++ Style Abstraction Level 9 * +
> s000005e.cpp Stepanov: C++ Style Abstraction Level 8 * +
> s000005f.cpp Stepanov: C++ Style Abstraction Level 7 * +
> s000005g.cpp Stepanov: C++ Style Abstraction Level 6 * +
> s000005h.cpp Stepanov: C++ Style Abstraction Level 5 * +
> s000005i.cpp Stepanov: C++ Style Abstraction Level 4 * +
> s000005j.cpp Stepanov: C++ Style Abstraction Level 3 * +
> s000005k.cpp Stepanov: C++ Style Abstraction Level 2 * +
> s000005l.cpp Stepanov: C++ Style Abstraction Level 1 * +
> s000005m.cpp Stepanov: C++ Style Abstraction Level 0 * +