This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.


Re: GCC Benchmarks (coybench), AMD64 and i686, 14 August 2004

From: Scott Robert Ladd <coyote at coyotegulch dot com>
To: "Kaveh R. Ghazi" <ghazi at caip dot rutgers dot edu>
Cc: arnaud dot desitter at ouce dot ox dot ac dot uk, gcc at gcc dot gnu dot org
Date: Wed, 18 Aug 2004 12:43:25 -0400
Subject: Re: GCC Benchmarks (coybench), AMD64 and i686, 14 August 2004
References: <411F794B.8090704@coyotegulch.com> <200408180133.i7I1XkkY006915@caip.rutgers.edu> <01da01c48524$20a5a520$d92601a3@ouce.ox.ac.uk> <200408181400.i7IE01G4002722@caip.rutgers.edu>

Kaveh R. Ghazi wrote:

 > >icc -E -dM -O3 -xN qq.c | grep INLINE
 > #define __NO_MATH_INLINES 1
 > #define __NO_STRING_INLINES 1
 > #define __NO_INLINE__ 1


Well, I for one think this is *highly* unfair. :-)
If icc gets to avoid using glibc's macros, then so should we!

This issue has arisen before. Let's throw all these suggestions together (including the suggested changes in "mole"), run some compiles with GCC 3.5-20040814, and see what we get:

A = gcc -o bench_3.5_O3_p4 -lrt -lm -std=gnu99
        -O3
        -march=pentium4
        *.c

B = gcc -o bench_3.5_O3_p4_fm -lrt -lm -std=gnu99
        -O3
        -march=pentium4
        -ffast-math
        *.c

C = gcc -o bench_3.5_all -lrt -lm -std=gnu99
        -O3
        -march=pentium4
        -ffast-math
        -mfpmath=sse
        -D__NO_MATH_INLINES
        -D__NO_STRING_INLINES
        -D__NO_INLINE
        *.c

icc = icc -o iccbench -O3 -xN -tpp7 -ipo -lm -lrt *.c

              A      B      C     icc
            -----  -----  -----  -----
     alma:   43.2   22.2   23.7   13.2
     arco:   27.4   27.2   27.2   20.5
      evo:   43.4   42.0   63.8   29.8
      fft:   27.7   27.5   28.4   30.4
     huff:   13.9   13.1   13.2   16.4
      lin:   20.2   19.6   19.8   19.2
     mat1:    7.7    7.5    7.4    7.4
     mole:    8.8    6.7    6.8    2.1
     tree:   26.0   25.7   25.8   28.8
    -----   -----  -----  -----  -----
    total:  218.4  191.7  216.2  167.8

Hmmmm... I don't see where adding the -D__NO_??? options helped GCC -- in fact, those options severely hurt run time on the evo test (42.0 for B versus 63.8 for C)!

Now people know why I don't specify all those #defines when I run my tests; I haven't seen a measurable gain in generated code speed from their use.
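As a rough cross-check of the B and C columns, here is a minimal Python sketch that computes the per-benchmark change when going from configuration B to configuration C. The figures are copied from the table above; the function and variable names are made up for illustration. Note that C differs from B by -mfpmath=sse as well as the three -D options, so this arithmetic alone cannot say which flag is responsible for a given change.

```python
# Times copied from the results table (B and C columns only).
# B = -O3 -march=pentium4 -ffast-math
# C = B plus -mfpmath=sse and the three -D__NO_* defines
TIMES_B_C = {
    "alma": (22.2, 23.7),
    "arco": (27.2, 27.2),
    "evo":  (42.0, 63.8),
    "fft":  (27.5, 28.4),
    "huff": (13.1, 13.2),
    "lin":  (19.6, 19.8),
    "mat1": (7.5, 7.4),
    "mole": (6.7, 6.8),
    "tree": (25.7, 25.8),
}


def percent_change(before, after):
    """Relative change from `before` to `after` in percent (positive = slower)."""
    return (after - before) / before * 100.0


def changes_b_to_c(times=TIMES_B_C):
    """Map each benchmark name to its percent change from B to C."""
    return {name: percent_change(b, c) for name, (b, c) in times.items()}


if __name__ == "__main__":
    for name, pct in changes_b_to_c().items():
        print(f"{name:>5}: {pct:+6.1f}%")
```

Only mat1 comes out (marginally) faster under C in this comparison; evo shows by far the largest slowdown.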

I note that ICC wins 4 benchmarks decisively, including three that are wholly of my own design and completely original (alma, arco, evo). Furthermore, the code changes in "mole" helped both gcc and icc.
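One way to make "decisively" concrete is to divide the best GCC time for each benchmark by the icc time. The sketch below does that with the table's figures. The 1.25 cutoff is an arbitrary assumption chosen for illustration, not a threshold used in the original tests, and the names are hypothetical.

```python
# Times copied from the results table: benchmark -> (A, B, C, icc).
TIMES = {
    "alma": (43.2, 22.2, 23.7, 13.2),
    "arco": (27.4, 27.2, 27.2, 20.5),
    "evo":  (43.4, 42.0, 63.8, 29.8),
    "fft":  (27.7, 27.5, 28.4, 30.4),
    "huff": (13.9, 13.1, 13.2, 16.4),
    "lin":  (20.2, 19.6, 19.8, 19.2),
    "mat1": (7.7, 7.5, 7.4, 7.4),
    "mole": (8.8, 6.7, 6.8, 2.1),
    "tree": (26.0, 25.7, 25.8, 28.8),
}

# Assumption: call a win "decisive" when icc is at least 25% faster
# than the best of the three GCC configurations.
DECISIVE_RATIO = 1.25


def icc_advantage(times=TIMES):
    """Best GCC time divided by icc time; values above 1 mean icc was faster."""
    return {name: min(a, b, c) / icc for name, (a, b, c, icc) in times.items()}


def decisive_icc_wins(threshold=DECISIVE_RATIO, times=TIMES):
    """Benchmarks where icc beats the best GCC run by at least `threshold`."""
    return sorted(n for n, r in icc_advantage(times).items() if r >= threshold)


if __name__ == "__main__":
    for name, ratio in icc_advantage().items():
        print(f"{name:>5}: best gcc / icc = {ratio:.2f}")
    print("decisive icc wins:", ", ".join(decisive_icc_wins()))
```

With this cutoff the four benchmarks selected are alma, arco, evo and mole, while fft, huff and tree come out with a ratio below 1, meaning the best GCC build was faster there.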

I'm likely to toss out mat1 and mole soon, replacing them with a wavelet transform and a particle system computation.

More food for thought, I hope.

--
Scott Robert Ladd
Coyote Gulch Productions (http://www.coyotegulch.com)
Software Invention for High-Performance Computing

