This is the mail archive of the
gcc-bugs@gcc.gnu.org
mailing list for the GCC project.
[Bug libfortran/51119] MATMUL slow for large matrices
- From: "Joost.VandeVondele at mat dot ethz.ch" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: Tue, 24 Nov 2015 17:45:25 +0000
- Subject: [Bug libfortran/51119] MATMUL slow for large matrices
- Auto-submitted: auto-generated
- References: <bug-51119-4 at http dot gcc dot gnu dot org/bugzilla/>
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=51119
--- Comment #29 from Joost VandeVondele <Joost.VandeVondele at mat dot ethz.ch> ---
(In reply to Thomas Koenig from comment #27)
> (In reply to Joost VandeVondele from comment #22)
> If the compiler turns out not to be reasonably smart, file a bug report :-)
What is needed for large matrices (in my opinion) is some simple loop tiling,
as can, in principle, be achieved with graphite: this is my PR14741.
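
To make the loop tiling mentioned above concrete, here is a minimal hand-written sketch in C. It is not the libgfortran implementation and not what graphite would generate; the name matmul_tiled and the tile edge TILE = 64 are assumptions made for illustration only, and a real tile size would have to be tuned to the cache sizes. Matrices are square and stored column-major, as in Fortran.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical tile edge, chosen for illustration; tune to the caches. */
enum { TILE = 64 };

static size_t min_sz(size_t a, size_t b) { return a < b ? a : b; }

/* C += A * B for n x n column-major matrices: element (i,j) is at [i + j*n].
   The three outer loops walk over tiles, the three inner loops work
   inside one tile so that the touched parts of A, B and C stay in cache. */
void matmul_tiled(size_t n, const double *a, const double *b, double *c)
{
    assert(a != NULL && b != NULL && c != NULL);

    for (size_t jj = 0; jj < n; jj += TILE)
        for (size_t kk = 0; kk < n; kk += TILE)
            for (size_t ii = 0; ii < n; ii += TILE) {
                size_t jmax = min_sz(jj + TILE, n);
                size_t kmax = min_sz(kk + TILE, n);
                size_t imax = min_sz(ii + TILE, n);

                for (size_t j = jj; j < jmax; j++)
                    for (size_t k = kk; k < kmax; k++) {
                        double bkj = b[k + j * n];
                        /* Unit-stride inner loop over i: easy to vectorize. */
                        for (size_t i = ii; i < imax; i++)
                            c[i + j * n] += a[i + k * n] * bkj;
                    }
            }
}
```

The caller is expected to zero c first if a plain product rather than an accumulation is wanted.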
Good vectorization, which gcc already does well, just requires the proper
compiler options for the matmul implementation, i.e. '-O3 -march=native
-ffast-math'. However, this would require the Fortran runtime to be compiled
with such options, or at least a way to provide specialized (avx2 etc)
routines.
There is, however, the related PR25621 (inner loop of matmul), where an
unusual flag combination helps (-fvariable-expansion-in-unroller -funroll-loops).
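
As a rough illustration of what those flags aim for, assuming the inner loop in question is a sum reduction: unrolling with several independent accumulators breaks the dependency chain on a single sum. The C sketch below does this by hand; the function name dot_4acc and the choice of four accumulators are assumptions for illustration, not taken from the PR. Note that splitting the sum changes the floating-point summation order, which is presumably why the compiler only does this on its own with relaxed floating-point options.

```c
#include <stddef.h>

/* Dot product with four independent partial sums, so consecutive
   multiply-adds do not all wait on one accumulator. */
double dot_4acc(size_t n, const double *x, const double *y)
{
    double s0 = 0.0, s1 = 0.0, s2 = 0.0, s3 = 0.0;
    size_t i = 0;

    /* Main loop, unrolled by four. */
    for (; i + 4 <= n; i += 4) {
        s0 += x[i]     * y[i];
        s1 += x[i + 1] * y[i + 1];
        s2 += x[i + 2] * y[i + 2];
        s3 += x[i + 3] * y[i + 3];
    }

    /* Remainder loop for the last n % 4 elements. */
    for (; i < n; i++)
        s0 += x[i] * y[i];

    return (s0 + s1) + (s2 + s3);
}
```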
I think external blas and inlining of small matmuls are good things, but I
would expect the default library implementation to reach at least 50% of peak
(for e.g. a 4000x4000 matrix), which is not all that hard. Actually, it would
be worth an experiment, but a Fortran loop nest which implements a matmul,
compiled with ifort, would presumably reach that or higher :-).
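
For reference, "fraction of peak" for a square matmul can be estimated with the usual 2*n^3 flop count. The small C sketch below shows the arithmetic; the function names and the peak rate passed in are assumptions for illustration, and the peak figure has to come from the actual machine.

```c
/* Floating-point operations in an n x n by n x n matmul:
   n^3 multiplies plus n^3 adds. */
double matmul_flops(double n)
{
    return 2.0 * n * n * n;
}

/* Achieved fraction of peak, given the measured wall time in seconds
   and the machine's peak rate in flop/s (caller-supplied assumption). */
double fraction_of_peak(double n, double seconds, double peak_flops_per_s)
{
    return matmul_flops(n) / seconds / peak_flops_per_s;
}
```

For n = 4000 this is 1.28e11 operations, so on a hypothetical core with a 16 Gflop/s peak, reaching 50% of peak means finishing in about 16 seconds.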
These slides show how to reach 90% of peak:
http://wiki.cs.utexas.edu/rvdg/HowToOptimizeGemm/
The code is actually not too ugly, and I think there is no need for the
explicit vector intrinsics with gcc.
I believe I once had a bug report open for small matrices, but this might have
been partly fixed in the meantime.