This is the mail archive of the fortran@gcc.gnu.org mailing list for the GNU Fortran project.
Re: Polyhedron benchmark on Opteron
- From: Tim Prince <timothyprince at sbcglobal dot net>
- To: François-Xavier Coudert <fxcoudert at gmail dot com>
- Cc: Fortran List <fortran at gcc dot gnu dot org>, franke dot daniel at gmail dot com, burnus at net-b dot de
- Date: Sun, 01 Oct 2006 14:04:47 -0700
- Subject: Re: Polyhedron benchmark on Opteron
- References: <19c433eb0609290713x64f74089m45e5ea291343e1d7@mail.gmail.com>
- Reply-to: tprince at myrealbox dot com
François-Xavier Coudert wrote:
> Unfortunately, there are also tests for which Intel is a clear winner:
> -- aermod, by 44%
> -- air, by 30%
On these benchmarks, the glibc x86-64 math library and the memmove/memset
functions are slower than their counterparts supplied with ifort (even with
ifort options that prevent substitution of SVML vector functions), as the
gprof flat profiles show:
bash-3.1$ head -25 air*.pg
==> airgf.pg <==
Flat profile:
Each sample counts as 0.01 seconds.
     % cumulative     self               self    total
  time    seconds  seconds     calls   s/call   s/call  name
 50.72       4.59     4.59         1     4.59     7.36  MAIN__
  9.28       5.43     0.84     24395     0.00     0.00  derivy_
  8.51       6.20     0.77     24395     0.00     0.00  derivx_
  7.51       6.88     0.68                              __ieee754_pow
  5.52       7.38     0.50      3484     0.00     0.00  state_
  5.19       7.85     0.47                              __exp1
  4.31       8.24     0.39      3485     0.00     0.00  fvsplty2_
  2.10       8.43     0.19                              pow
  2.10       8.62     0.19      3485     0.00     0.00  fvspltx2_
  0.66       8.68     0.06                              isnan
  0.55       8.73     0.05                              __ieee754_log
  0.44       8.77     0.04                              __ieee754_exp
  0.44       8.81     0.04                              __printf_fp
  0.33       8.84     0.03      3484     0.00     0.00  aexit_
  0.33       8.87     0.03                              __mpn_mul_1
  0.22       8.89     0.02      3484     0.00     0.00  botwall_
  0.22       8.91     0.02                              __ieee754_log10l
  0.22       8.93     0.02                              finite
  0.11       8.94     0.01    276740     0.00     0.00  fd_alloc_w_at
  0.11       8.95     0.01     90029     0.00     0.00  output_float
==> airif.pg <==
Flat profile:
Each sample counts as 0.01 seconds.
     % cumulative     self               self    total
  time    seconds  seconds     calls   s/call   s/call  name
 57.63       4.57     4.57         1     4.57     7.32  MAIN__
 11.85       5.51     0.94     24395     0.00     0.00  derivx_
 10.59       6.35     0.84     24395     0.00     0.00  derivy_
  7.06       6.91     0.56      3484     0.00     0.00  state_
  5.80       7.37     0.46                              pow.L
  3.66       7.66     0.29      3485     0.00     0.00  fvsplty2_
  0.88       7.73     0.07      3485     0.00     0.00  fvspltx2_
  0.63       7.78     0.05                              write
  0.38       7.81     0.03                              cvtas_t_to_a
  0.25       7.83     0.02      3484     0.00     0.00  aexit_
  0.25       7.85     0.02                              exp.L
  0.13       7.86     0.01      3484     0.00     0.00  botwall_
  0.13       7.87     0.01      3484     0.00     0.00  inlet_
  0.13       7.88     0.01      3484     0.00     0.00  topwall_
  0.13       7.89     0.01                              cvt_ieee_t_to_text_ex
  0.13       7.90     0.01                              for__interp_fmt
  0.13       7.91     0.01                              log.A
  0.13       7.92     0.01                              matherr
  0.13       7.93     0.01                              pow
  0.00       7.93     0.00        23     0.00     0.00  spectop_
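As a rough cross-check of the two air profiles above, the following Python sketch adds up the self seconds of the rows that look like math-library entries. The figures are copied from the profiles; which symbols count as "math library" is an assumption made for this tally, and both listings are cut off at 25 lines, so the sums are lower bounds.

```python
# Self seconds copied from the two 'air' flat profiles above.
# ASSUMPTION: the grouping of symbols into "math library" is a judgment
# call for this tally, not something stated in the profiles themselves.
GFORTRAN_AIR_MATH = {
    "__ieee754_pow": 0.68,
    "__exp1": 0.47,
    "pow": 0.19,
    "isnan": 0.06,
    "__ieee754_log": 0.05,
    "__ieee754_exp": 0.04,
    "__ieee754_log10l": 0.02,
    "finite": 0.02,
}

IFORT_AIR_MATH = {
    "pow.L": 0.46,
    "exp.L": 0.02,
    "log.A": 0.01,
    "matherr": 0.01,
    "pow": 0.01,
}


def total_self_seconds(rows):
    """Sum the 'self seconds' column for the given profile rows."""
    return round(sum(rows.values()), 2)


gf_math = total_self_seconds(GFORTRAN_AIR_MATH)
if_math = total_self_seconds(IFORT_AIR_MATH)

print(f"gfortran math self time: {gf_math:.2f} s")
print(f"ifort math self time:    {if_math:.2f} s")
print(f"difference:              {gf_math - if_math:.2f} s")
```

With this grouping the difference is close to the gap between the cumulative times at the 25-line cutoff (8.95 s for gfortran versus 7.93 s for ifort), which is consistent with the math library accounting for most of the difference on air.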
bash-3.1$ head -25 aer*.pg
==> aermgf.pg <==
Flat profile:
Each sample counts as 0.01 seconds.
     % cumulative     self               self    total
  time    seconds  seconds     calls   s/call   s/call  name
 21.60       9.30     9.30                              __ieee754_powf
  7.56      12.56     3.26                              memmove
  6.30      15.27     2.71  78026930     0.00     0.00  anyavg_
  5.90      17.81     2.54 325550061     0.00     0.00  gintrp_
  5.37      20.12     2.31                              memset
  4.73      22.15     2.04  30838444     0.00     0.00  sigz_
  4.24      23.98     1.83 382411111     0.00     0.00  _gfortran_copy_string
  4.03      25.71     1.74  99400982     0.00     0.00  locate_
  3.23      27.10     1.39                              __ieee754_expf
  3.02      28.40     1.30                              hasmntopt
  2.92      29.66     1.26  69603722     0.00     0.00  _gfortrani_compare_string
  2.72      30.83     1.17  20043492     0.00     0.00  iblval_
  2.16      31.76     0.93                              __profile_frequency
  2.15      32.68     0.93  30838444     0.00     0.00  rmssig_
  1.97      33.53     0.85   8796344     0.00     0.00  refl_ht_
  1.78      34.30     0.77                              powf
  1.41      34.90     0.61  30485138     0.00     0.00  szsfcl_
  1.10      35.38     0.48  13774613     0.00     0.00  sigy_
  0.82      35.73     0.36                              _gfortran_string_repeat
  0.81      36.08     0.35   8817433     0.00     0.00  vrtsbl_
==> aermif.pg <==
Flat profile:
Each sample counts as 0.01 seconds.
     % cumulative     self               self    total
  time    seconds  seconds     calls   s/call   s/call  name
 10.81       2.37     2.37  78028283     0.00     0.00  anyavg_
 10.26       4.62     2.25                              powf.L
  9.85       6.78     2.16  99401989     0.00     0.00  locate_
  7.20       8.36     1.58  30838467     0.00     0.00  sigz_
  7.20       9.94     1.58 321952483     0.00     0.00  gintrp_
  4.97      11.03     1.09  20043496     0.00     0.00  iblval_
  4.38      11.99     0.96                              __getclktck
  3.97      12.86     0.87   8796380     0.00     0.00  refl_ht_
  3.83      13.70     0.84  30838467     0.00     0.00  rmssig_
  3.56      14.48     0.78                              __profile_frequency
  2.69      15.07     0.59  13774543     0.00     0.00  sigy_
  2.07      15.53     0.46                              for_cpstr
  1.69      15.90     0.37  30485161     0.00     0.00  szsfcl_
  1.55      16.24     0.34                              _intel_fast_memcmp
  1.50      16.57     0.33   8797813     0.00     0.00  aer_achi_
  1.23      16.84     0.27  21904741     0.00     0.00  heff_
  1.09      17.08     0.24   8622155     0.00     0.00  pwidth_
  1.09      17.32     0.24   3220978     0.00     0.00  unlump_
  1.05      17.55     0.23   6956351     0.00     0.00  position_
  1.00      17.77     0.22   8817469     0.00     0.00  vrtsbl_
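The aermod profiles can be tallied in a similar way. The Python sketch below groups the powf/expf rows and the memmove/memset rows and expresses each group as a share of the time listed before the 25-line cutoff. The numbers are copied from the two profiles above; the group names and the choice of rows are assumptions made for illustration, and the shares are relative to the truncated listings, not to the full run time.

```python
# Self seconds copied from the two 'aermod' flat profiles above.
# ASSUMPTION: the "math" and "memory" groupings are illustrative choices.
AERMOD_GFORTRAN = {
    "math": {"__ieee754_powf": 9.30, "__ieee754_expf": 1.39, "powf": 0.77},
    "memory": {"memmove": 3.26, "memset": 2.31},
}
AERMOD_IFORT = {
    "math": {"powf.L": 2.25},
    "memory": {},  # no memmove/memset rows appear in the listed ifort lines
}

# Cumulative seconds on the last listed row of each profile.
GFORTRAN_LISTED_SECONDS = 36.08
IFORT_LISTED_SECONDS = 17.77


def group_totals(profile):
    """Return the summed self seconds for each group of rows."""
    return {name: round(sum(rows.values()), 2) for name, rows in profile.items()}


def share_percent(seconds, listed_seconds):
    """Express a time as a percentage of the listed cumulative time."""
    return 100.0 * seconds / listed_seconds


gf_groups = group_totals(AERMOD_GFORTRAN)
if_groups = group_totals(AERMOD_IFORT)

for label, groups, listed in (
    ("gfortran", gf_groups, GFORTRAN_LISTED_SECONDS),
    ("ifort", if_groups, IFORT_LISTED_SECONDS),
):
    for name, seconds in groups.items():
        pct = share_percent(seconds, listed)
        print(f"{label:8s} {name:6s} {seconds:6.2f} s  {pct:5.1f}% of listed time")
```

On these figures the powf/expf and memmove/memset rows together account for roughly 17 s of the 36 s listed for gfortran, against 2.25 s of powf.L in the ifort listing.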