spec2k comparison of gcc 4.1 and 4.2 on AMD K8

Vladimir N. Makarov vmakarov@redhat.com
Sun Feb 25 21:21:00 GMT 2007


Serge Belyshev wrote:

>I have compared the 4.1.2 release (r121943) with three revisions of 4.2 on spec2k
>on a 2GHz AMD Athlon64 box (in 64bit mode); detailed results are below.
>
>In short, current 4.2 performs just as well as 4.1 on this target,
>with the exception of a huge 80% win on 178.galgel. All other differences
>lie almost in the noise.
>
>results:
>
>first number in each column is a runtime difference in %
>between corresponding 4.2 revision and 4.1.2 (+ is better, - is worse).
>
>second number is a +- confidence interval, i.e. according to my results,
>current 4.2 does (82.0+-1.7)% better than 4.1.2 on 178.galgel.
>
>(note some results are clearly noisy, but I've tried hard to avoid this --
>I did three runs on a completely idle machine, wasting 14 hours of machine time in total).
>
>  
>
I run SPEC2000 several times per week and always look at 3 runs (to be 
sure that nothing went wrong), but I have never seen such big 
"confidence" intervals (as I understand it, that is the difference 
between the max and min of 3 runs divided by the score).  I should 
acknowledge, though, that I have never run SPEC2000 on AMD machines, and 
some processors produce smaller "confidence intervals" than others.  
There are tests like art for which the difference between min and max 
can be big, but the geometric mean makes the effect of such differences 
smaller in the overall score.  If the machine has only 512 MB of memory 
(even though they write that it is enough for SPEC2000), the scores for 
some benchmark programs may be unstable.  Also, if the middle score (of 
3 runs) for base or peak is bigger on one program even though the best 
(max) scores for peak and base are the same, usually the opposite 
happens on another benchmark program, so this also makes the overall 
score smoother.
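
To illustrate the point about the geometric mean, here is a minimal 
Python sketch.  It assumes the spread is (max - min) of 3 runs divided 
by the middle score, which is only my reading of the "confidence" 
interval.  The benchmark names and scores are made up for illustration; 
they are not measurements.  For simplicity it computes one overall score 
per run, which is not how SPEC itself combines runs.

```python
from math import prod


def relative_spread(scores):
    """Return (max - min) / middle score, in percent, for one set of runs."""
    ordered = sorted(scores)
    middle = ordered[len(ordered) // 2]
    return 100.0 * (ordered[-1] - ordered[0]) / middle


def geometric_mean(values):
    """Return the geometric mean of a list of positive numbers."""
    return prod(values) ** (1.0 / len(values))


# Hypothetical scores for 3 runs of 4 benchmark programs (not real data).
runs = {
    "bench_a": [1000.0, 1002.0, 998.0],
    "bench_b": [1500.0, 1490.0, 1505.0],
    "bench_c": [800.0, 880.0, 840.0],  # one deliberately noisy program
    "bench_d": [1200.0, 1201.0, 1199.0],
}

# Spread of each program taken on its own.
per_program = {name: relative_spread(s) for name, s in runs.items()}

# Overall score of each run: geometric mean over all programs in that run.
overall = [geometric_mean([s[i] for s in runs.values()]) for i in range(3)]
overall_spread = relative_spread(overall)

for name, spread in per_program.items():
    print(f"{name}: {spread:.2f}%")
print(f"overall: {overall_spread:.2f}%")
```

With these made-up numbers the noisy program has a spread of several 
percent, while the spread of the overall score is a fraction of that, 
because the other programs are stable.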

So I trust the overall SPEC2000 score, and by my evaluation the 
measurement error of the overall score for Core2 Duo (which I usually 
use for SPEC2000) is +-0.3%.  It would be better if you posted the 
overall scores too (but probably something went wrong on your machine 
during the run).

Still, I must say you did a really big job, thank you.

>r117890 -- 4.2 just before DannyB's aliasing fixes
>r117891 -- 4.2 with aliasing fixes.
>r122236 -- 4.2 current.
>
>CINT2000         r117890         r117891         r122236
>
>164.gzip        -4.2 1.7        -4.2 1.2        -4.0 1.3
>175.vpr          1.7 2.6         1.4 2.3         1.1 2.5
>176.gcc         -0.5 0.8        -0.8 1.1        -1.2 4.0
>181.mcf         -0.4 2.0        -0.1 2.1        -0.6 2.7
>186.crafty      -0.4 6.4        -1.3 7.0         0.8 4.4
>197.parser       0.7 1.3         0.8 1.5        -0.3 1.6
>252.eon          8.8 3.7        10.6 9.4         6.9 4.7
>253.perlbmk      2.7 1.0         3.4 1.4         3.0 1.9
>254.gap         -0.6 0.5        -0.5 0.4        -0.4 0.6
>255.vortex       1.3 0.9         1.2 1.2         1.4 1.1
>256.bzip2        0.6 1.6         0.9 1.6         0.4 1.7
>300.twolf        0.1 4.5         0.8 1.4        -0.6 2.0
>
>
>CFP2000
>
>168.wupwise      0.2 22.0        0.1 22.1        2.2 13.6
>171.swim        -0.1 0.7        -0.3 0.1        -0.3 0.2
>172.mgrid       -6.3 0.4        -6.1 0.4        -6.6 0.3
>173.applu       -0.1 0.8         0.1 0.9        -0.4 0.1
>177.mesa         6.9 15.1        7.2 15.1        3.9 5.3
>178.galgel      80.8 1.7        80.9 2.0        82.0 1.7
>179.art          0.8 8.9        -1.6 8.1        -0.3 5.1
>183.equake      -0.9 1.0        -0.8 0.9        -0.9 0.9
>187.facerec      2.7 0.7         2.9 0.8         3.0 0.6
>188.ammp        -0.4 0.5        -0.1 1.0        -0.5 0.7
>189.lucas       -0.8 0.5        -0.7 0.6        -0.4 0.6
>191.fma3d        1.1 2.1        -0.9 2.3        -1.0 2.2
>200.sixtrack    -0.7 0.4        -0.7 0.5        -1.3 0.4
>301.apsi        -3.0 1.4        -2.7 1.1        -3.1 0.3
>
>
>remarks:
>
>1. big jump on 178.galgel can be seen here too:
>   http://www.suse.de/~aj/SPEC/amd64/CFP/sandbox-britten/178_galgel_big.png
>
>2. even though I did three runs, most of the difference is noise,
>   which means that one should treat single-run spec results with a *big* grain of salt.
>
>3. on this AMD K8 machine the difference between 4.2 with aliasing fixes and 4.2 w/o
>   aliasing fixes lies completely in the noise (modulo small 2% 191.fma3d regression).
>  
>


