[PATCH][RFC] Add an early loop unrolling pass, address PRs 18754 and 34223
Dominique Dhumieres
dominiq@lps.ens.fr
Wed Apr 23 15:57:00 GMT 2008
I have applied the patch on i686-apple-darwin9 (Core2Duo, 2.16 GHz). For the
Polyhedron tests, the gain is marginal for ac.f90:
before: 12.67s, after: 12.27s
almost a factor of 2 (about 1.7) for induct.f90:
before: 60.94s, after: 35.84s
and slightly slower (within the upper bound of the noise):
capacita.f90, before: 55.05s, after: 55.42s
and
protein.f90, before: 46.05s, after: 46.46s.
There are still some problems with my hand-optimized variants of
induct.f90 (replacement of the dot-products by the sums of
their nonzero products):
induct_v2.f90, before: 33.54s, after: 58.41s
induct_v3.f90, before: 32.73s, after: 38.75s
(with the previous attempt, see comment #11 in pr34265, the timings were:
induct_v2.f90, before: 35.07s, after: 60.08s
induct_v3.f90, before: 34.40s, after: 58.64s
so induct_v3.f90 is now better handled.)
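For reference, the before/after ratios can be recomputed from the timings quoted above. The following is only a small sketch in Python; the dictionary layout and the helper name `speedup` are illustrative choices, not part of the patch or of the Polyhedron suite, and the numbers are simply the ones reported in this message for the new patch.

```python
# Timings in seconds as (before, after), copied from this message.
timings = {
    "ac.f90": (12.67, 12.27),
    "induct.f90": (60.94, 35.84),
    "capacita.f90": (55.05, 55.42),
    "protein.f90": (46.05, 46.46),
    "induct_v2.f90": (33.54, 58.41),
    "induct_v3.f90": (32.73, 38.75),
}


def speedup(before, after):
    """Hypothetical helper: before/after ratio; above 1 means the patched build is faster."""
    return before / after


# Ratio per test; values below 1 are slowdowns with the patch applied.
ratios = {name: speedup(b, a) for name, (b, a) in timings.items()}

for name, ratio in ratios.items():
    print(f"{name:15s} {ratio:5.2f}")
```

A ratio close to 1 (as for capacita.f90 and protein.f90) is within the noise, while a ratio well below 1 (as for induct_v2.f90) marks a real regression.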
I confirm that the new patch has negligible impact on the compilation
time (if anything, compilation is faster) and that gfortran.dg/array_1.f90
fails at -O3 as reported in comment #27 of pr34265. Regtesting is now in
progress.
Dominique