[PATCH][RFC] Add an early loop unrolling pass, address PRs 18754 and 34223

Dominique Dhumieres dominiq@lps.ens.fr
Wed Apr 23 15:57:00 GMT 2008


I have applied the patch on i686-apple-darwin9 (Core2Duo 2.16GHz). For the
Polyhedron tests, the gain is marginal for ac.f90:

before: 12.67s, after: 12.27s

almost a factor of 2 (about 1.7x) for induct.f90:

before: 60.94s, after: 35.84s

and slightly slower (within the upper bound of the noise):

capacita.f90, before: 55.05s, after: 55.42s

and

protein.f90, before: 46.05s, after: 46.46s.

There are still some problems with my hand-optimized variants of
induct.f90 (replacement of the dot-products by the sums of
their nonzero products):

induct_v2.f90, before: 33.54s, after: 58.41s
induct_v3.f90, before: 32.73s, after: 38.75s

(With the previous attempt, see comment #11 in pr34265, the timings were:

induct_v2.f90, before: 35.07s, after: 60.08s
induct_v3.f90, before: 34.40s, after: 58.64s

so induct_v3.f90 is now better handled.)
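
To illustrate what "replacement of the dot-products by the sums of their
nonzero products" means, here is a minimal sketch in C. It is not the
actual induct.f90 code: the 3-element vectors, the assumption that the
third component of one operand is always zero, and the function names
dot_generic and dot_xy_only are all hypothetical, chosen only to show the
shape of the transformation.

```c
#include <stddef.h>

/* Generic dot product: every component is multiplied and accumulated,
   including products that are known in advance to be zero. */
double dot_generic(const double *a, const double *b, size_t n)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += a[i] * b[i];
    return sum;
}

/* Hand-specialized variant for the ASSUMED case where a[2] is always
   zero: only the two nonzero products are summed, and no loop remains. */
double dot_xy_only(const double a[3], const double b[3])
{
    return a[0] * b[0] + a[1] * b[1];
}
```

For finite inputs with a[2] == 0 the two functions return the same
value; the second one simply skips the multiplication and addition
that would contribute nothing.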

I confirm that the new patch has negligible impact on the compilation
time (if anything, compilation is faster) and that gfortran.dg/array_1.f90
fails at -O3 as reported in comment #27 of pr34265. Regtesting is now in
progress.

Dominique


