This is the mail archive of the fortran@gcc.gnu.org mailing list for the GNU Fortran project.



Re: Polyhedron benchmark on Opteron


François-Xavier Coudert wrote:

> I wanted to report some results for the Polyhedron benchmark** on
> Opteron (hardware details at the bottom of this mail). I used gfortran
> mainline (4.2.0 on 2006-09-28) and Intel 9.1.037 for comparison.
> Options used are :
>  * gfortran -march=k8 -ffast-math -funroll-loops -static -O3
>  * ifort -O3 -xW -ipo -static -V
>
>
> Unfortunately, there are also tests for which Intel is a clear winner:
>
>  -- fatigue, by 22%
>  -- gas_dyn, by 115%

I used a Core 2 Duo at 2.93 GHz with 4 GB of DDR2-667 and a somewhat older version of gfortran 4.2.
Options used are:
 * gfortran -funroll-loops -ftree-vectorize -pg
 * ifort -xW -fp-model precise -pg


gfortran profile of fatigue:
Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total
 time   seconds   seconds    calls   s/call   s/call  name
 52.75      5.49     5.49 28446735     0.00     0.00  __perdida_m__perdida
 34.01      9.03     3.54 31712641     0.00     0.00  __perdida_m__generalized_hookes_law
 13.07     10.39     1.36        1     1.36    10.41  MAIN__
  0.19     10.41     0.02  1443280     0.00     0.00  __perdida_m__damage_rate


ifort profile:
 65.84      6.32     6.32 28446735     0.00     0.00  perdida_m_mp_perdida_
 15.00      7.76     1.44 31712641     0.00     0.00  perdida_m_mp_generalized_hookes_law_
 14.48      9.15     1.39        1     1.39     9.18  MAIN__
  2.19      9.36     0.21                             cos.L
  2.08      9.56     0.20                             sin.L


So gfortran loses time only in generalized_hookes_law (3.54 s of self time versus 1.44 s with ifort); in perdida itself it is ahead (5.49 s versus 6.32 s).

gfortran profile of gas_dyn:
  %   cumulative   self              self     total
 time   seconds   seconds    calls   s/call   s/call  name
 86.71      5.15     5.15    10002     0.00     0.00  eos_
  6.40      5.53     0.38    10001     0.00     0.00  chozdt_
  3.70      5.75     0.22  1434725     0.00     0.00  area_
  3.20      5.94     0.19        1     0.19     5.94  MAIN__
  0.00      5.94     0.00    50000     0.00     0.00  drag_

ifort:
 39.53      0.83     0.83    10002     0.00     0.00  eos_
 28.57      1.43     0.60    10001     0.00     0.00  chozdt_
 15.24      1.75     0.32  1434730     0.00     0.00  area_
  5.71      1.87     0.12        1     0.12     1.87  MAIN__
  3.33      1.94     0.07                             for_write_seq_fmt_xmit


By fixing those silly loops where the last value is set to the next-to-last value inside the loop, rather than afterwards, gfortran can chop off 0.30 seconds. That leaves the monster array assignment with vector sqrt in eos as the one remaining performance difference.
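
To make the loop fix concrete, here is a minimal sketch of the pattern I mean. It is not the gas_dyn source: the program name, the array names (u, a, b) and the averaging in the loop body are made up for illustration. The only point is that the boundary copy inside the loop gives the same final result as a single copy after the loop, so it can be moved out by hand.

```fortran
! Hypothetical illustration only: names and loop body are assumptions,
! not code taken from gas_dyn.
program hoist_boundary
  implicit none
  integer, parameter :: n = 1000
  real :: u(n), a(n), b(n)
  integer :: i

  do i = 1, n
     u(i) = real(i)
  end do

  ! Before: the last element is copied from the next-to-last one
  ! on every iteration, although only the final copy matters.
  a = 0.0
  do i = 1, n - 1
     a(i) = 0.5 * (u(i) + u(i+1))
     a(n) = a(n-1)
  end do

  ! After: the same boundary copy, done once after the loop.
  b = 0.0
  do i = 1, n - 1
     b(i) = 0.5 * (u(i) + u(i+1))
  end do
  b(n) = b(n-1)

  ! Both variants perform the same arithmetic, so the arrays match exactly.
  if (any(a /= b)) then
     print *, 'results differ'
  else
     print *, 'results identical'
  end if
end program hoist_boundary
```

Something like "gfortran -O3 -ftree-vectorize hoist_boundary.f90" should build it, assuming the code is saved under that (made-up) file name. The second form does one store instead of n-1 redundant ones and leaves a simpler loop body for the optimizer; how much that gains in a real code depends on the loop.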

