[graphite] Remove checks for flag_loop_strip_mine, flag_loop_block, flag_loop_interchange in graphite_trans_bb_block.
Jack Howarth
howarth@bromo.msbb.uc.edu
Fri Aug 15 14:26:00 GMT 2008
Tobias,
I am seeing the following results with the proposed patch from
http://gcc.gnu.org/ml/gcc-patches/2008-08/msg01000.html
applied over gcc trunk, together with the previous graphite patch and
all of the subsequent changes from the graphite branch.
================================================================================
Date & Time : 15 Aug 2008 1:33:29
Test Name : gfortran_lin_p4_graphite123
Compile Command : gfortran -ffast-math -funroll-loops -O3 -floop-block -floop-interchange -floop-strip-mine -fgraphite %n.f90 -o %n
Benchmarks : ac aermod air capacita channel doduc fatigue gas_dyn induct linpk mdbx nf protein rnflow test_fpu tfft
Maximum Times : 2000.0
Target Error % : 0.100
Minimum Repeats : 10
Maximum Repeats : 100
Benchmark  Compile  Executable  Ave Run   Number   Estim
Name        (secs)     (bytes)   (secs)  Repeats   Err %
---------  -------  ----------  -------  -------  ------
ac            5.17       10000    15.16       10  0.0275
aermod       26.97           0    -1.00       10  0.0275
air           4.91       10000     9.14       10  0.0699
capacita      3.58       10000    56.19       17  0.0885
channel       1.63       10000     3.79       10  0.0898
doduc        11.31       10000    48.37       10  0.0342
fatigue       5.80       10000    13.05       10  0.0421
gas_dyn       5.35       10000    11.10       35  0.0778
induct        9.24       10000    36.33       10  0.0151
linpk         1.78       10000    26.37       10  0.0286
mdbx          3.45       10000    15.65       10  0.0432
nf            4.99       10000    30.53       18  0.0950
protein      11.55       10000    51.00       10  0.0634
rnflow       12.10       10000    42.72       10  0.0565
test_fpu     11.19       10000    14.14       12  0.0962
tfft          1.00       10000     2.86       13  0.0983
Geometric Mean Execution Time = 23.45 seconds
================================================================================
================================================================================
Date & Time : 15 Aug 2008 2:54:19
Test Name : gfortran_lin_p4_graphite
Compile Command : gfortran -ffast-math -funroll-loops -O3 -fgraphite %n.f90 -o %n
Benchmarks : ac aermod air capacita channel doduc fatigue gas_dyn induct linpk mdbx nf protein rnflow test_fpu tfft
Maximum Times : 2000.0
Target Error % : 0.100
Minimum Repeats : 10
Maximum Repeats : 100
Benchmark  Compile  Executable  Ave Run   Number   Estim
Name        (secs)     (bytes)   (secs)  Repeats   Err %
---------  -------  ----------  -------  -------  ------
ac            2.17       10000    15.16       12  0.0286
aermod       26.31           0    -1.00       12  0.0286
air           4.87       10000     9.17       12  0.0910
capacita      2.93       10000    56.51       18  0.0903
channel       1.38       10000     3.79       10  0.0541
doduc        11.29       10000    48.38       14  0.0983
fatigue       4.69       10000    13.02       10  0.0552
gas_dyn       4.71       10000    11.23       26  0.0765
induct        8.24       10000    36.33       10  0.0201
linpk         1.45       10000    26.40       12  0.0322
mdbx          3.08       10000    15.66       10  0.0936
nf            4.03       10000    30.49       10  0.0832
protein       9.85       10000    51.04       10  0.0640
rnflow       10.74       10000    42.70       10  0.0625
test_fpu      9.99       10000    14.16       12  0.0739
tfft          0.95       10000     2.86       13  0.0784
Geometric Mean Execution Time = 23.47 seconds
================================================================================
These results are from i686-apple-darwin9. I am surprised that I am
not seeing any major performance change, one way or the other, with
the new loop optimizations. Does this match what you are seeing with
the Polyhedron 2005 benchmarks?
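
For anyone who wants to check the "no major change" reading of the two tables above, here is a minimal Python sketch that computes the per-benchmark difference in average run time. The names `with_loop_flags`, `graphite_only` and `relative_change` are made up for illustration; the numbers are copied from the "Ave Run (secs)" columns. aermod is left out on the assumption that its -1.00 run time and 0-byte executable mean it did not produce a usable result in either run.

```python
# Average run times in seconds, copied from the two tables above.
# aermod is omitted (assumption: -1.00 s / 0 bytes means no valid run).
with_loop_flags = {  # -floop-block -floop-interchange -floop-strip-mine -fgraphite
    "ac": 15.16, "air": 9.14, "capacita": 56.19, "channel": 3.79,
    "doduc": 48.37, "fatigue": 13.05, "gas_dyn": 11.10, "induct": 36.33,
    "linpk": 26.37, "mdbx": 15.65, "nf": 30.53, "protein": 51.00,
    "rnflow": 42.72, "test_fpu": 14.14, "tfft": 2.86,
}
graphite_only = {  # -fgraphite alone
    "ac": 15.16, "air": 9.17, "capacita": 56.51, "channel": 3.79,
    "doduc": 48.38, "fatigue": 13.02, "gas_dyn": 11.23, "induct": 36.33,
    "linpk": 26.40, "mdbx": 15.66, "nf": 30.49, "protein": 51.04,
    "rnflow": 42.70, "test_fpu": 14.16, "tfft": 2.86,
}


def relative_change(new, old):
    """Percentage change of new relative to old (negative means faster)."""
    return (new - old) / old * 100.0


changes = {
    name: relative_change(with_loop_flags[name], graphite_only[name])
    for name in graphite_only
}

# Print the benchmarks ordered by the size of the change, largest first.
for name, pct in sorted(changes.items(), key=lambda kv: abs(kv[1]), reverse=True):
    print(f"{name:10s} {pct:+6.2f}%")

largest = max(changes, key=lambda name: abs(changes[name]))
```

With these figures the largest difference is gas_dyn, and every benchmark moves by less than 1.5% in either direction.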
Jack