[Bug tree-optimization/14741] graphite with loop blocking and interchanging doesn't optimize a matrix multiplication loop
dominiq at lps dot ens.fr
gcc-bugzilla@gcc.gnu.org
Fri Sep 11 21:55:00 GMT 2015
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=14741
--- Comment #35 from Dominique d'Humieres <dominiq at lps dot ens.fr> ---
I get
[Book15] f90/bug% /opt/gcc/gcc6p-227264p1/bin/gfortran -Ofast pr14741.f90 -floop-interchange -march=native -Wa,-q
[Book15] f90/bug% time a.out
0.48728300000000002 10.239999999999826
0.491u 0.006s 0:00.50 98.0% 0+0k 0+0io 0pf+0w
[Book15] f90/bug% /opt/gcc/gcc6p-227383p1/bin/gfortran -Ofast pr14741.f90 -floop-interchange -march=native -Wa,-q
[Book15] f90/bug% time a.out
1.4271590000000001 10.239999999999826
1.430u 0.008s 0:01.44 99.3% 0+0k 0+0io 0pf+0w
i.e., r227264 (or 4.8, 4.9, and 5.2) is ~3 times faster than r227383.
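
For reference, the ratio can be checked from the two timings printed by the test program. This is only a quick sketch: the variable names are mine, and it assumes the first number on each output line is the elapsed time in seconds.

```python
# Elapsed times copied from the two runs above (assumed to be seconds).
t_r227264 = 0.48728300000000002  # built with gcc6p-227264p1
t_r227383 = 1.4271590000000001   # built with gcc6p-227383p1

# Slowdown of r227383 relative to r227264.
slowdown = t_r227383 / t_r227264
print(f"r227383 / r227264 = {slowdown:.2f}")
```

The result is a little under 3, consistent with the "~3 times" figure.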