This is the mail archive of the
gcc-bugs@gcc.gnu.org
mailing list for the GCC project.
[Bug fortran/68600] New: Inlined MATMUL is too slow.
- From: "dominiq at lps dot ens.fr" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: Sat, 28 Nov 2015 18:00:44 +0000
- Subject: [Bug fortran/68600] New: Inlined MATMUL is too slow.
- Auto-submitted: auto-generated
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68600
Bug ID: 68600
Summary: Inlined MATMUL is too slow.
Product: gcc
Version: 6.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: fortran
Assignee: unassigned at gcc dot gnu.org
Reporter: dominiq at lps dot ens.fr
Target Milestone: ---
Expected results: the inlined MATMUL should be
(1) at least as fast as the MATMUL from the library when compiled
with the same options (-O2 -ftree-vectorize -funroll-loops);
(2) at least as fast as dgemm from LAPACK when compiled
with the same options.
Options tested:
(a) -O2 -ftree-vectorize -funroll-loops -fno-frontend-optimize, i.e., MATMUL
from the library;
(b) -O2 -ftree-vectorize -funroll-loops, i.e., inlined MATMUL;
(c) -Ofast -march=native -funroll-loops;
(d) -O2 -ftree-vectorize.
Performance in Gflop/s on a Core i7 at 2.8 GHz (turbo 3.8 GHz) shows that neither
(1) nor (2) holds. Comparing columns 4 and 6, the two dgemm columns, gives an
idea of the timing accuracy.
                     (a)              (b)
 Size   Loops   Matmul   dgemm   Matmul   dgemm
===============================================
    2  200000    0.360   0.218    0.723   0.221
    4  200000    1.246   0.959    1.379   0.969
    8  200000    2.098   2.396    2.186   2.385
   16  200000    3.748   3.648    2.920   3.645
   32  200000    5.386   5.406    3.096   5.418
   64   30757    6.364   6.385    3.220   6.494
  128    3829    6.362   6.760    3.256   6.702
  256     477    6.515   6.527    3.164   6.444
  512      59    6.313   6.634    3.189   6.675
 1024       7    4.796   4.842    2.935   4.853
 2048       1    4.026   4.032    2.824   3.996
 4096       1    3.355   3.467    2.652   3.475

                     (c)              (d)
 Size   Loops   Matmul   dgemm   Matmul   dgemm
===============================================
    2  200000    0.403   0.172    0.919   0.204
    4  200000    0.956   0.799    1.668   1.104
    8  200000    1.796   2.089    2.060   2.310
   16  200000    2.948   4.297    2.253   3.475
   32  200000    4.119   6.219    2.049   4.229
   64   30757    5.174   7.652    2.268   4.464
  128    3829    5.042   6.985    2.371   4.353
  256     477    5.052   6.492    2.423   4.696
  512      59    5.136   6.738    2.421   4.704
 1024       7    3.978   5.075    2.361   4.012
 2048       1    3.476   4.304    2.372   3.543
 4096       1    2.966   3.307    2.370   3.333
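
For readers who want to check expectation (1) against the figures above, here is a minimal Python sketch. The Matmul values for option sets (a) and (b) are copied from the first table. The names `gflops` and `sizes_failing_expectation_1` are hypothetical helpers, not part of the original test program. The report does not say how the Gflop/s values were derived, so the 2*n**3 flop count per product used in `gflops` is an assumption (the usual convention for an n x n matrix product).

```python
# Matmul columns (Gflop/s) copied from the table for option sets (a) and (b).
SIZES = [2, 4, 8, 16, 32, 64, 128, 256, 512, 1024, 2048, 4096]
LIBRARY_A = [0.360, 1.246, 2.098, 3.748, 5.386, 6.364,
             6.362, 6.515, 6.313, 4.796, 4.026, 3.355]
INLINED_B = [0.723, 1.379, 2.186, 2.920, 3.096, 3.220,
             3.256, 3.164, 3.189, 2.935, 2.824, 2.652]


def gflops(n, loops, seconds):
    """Rate in Gflop/s for `loops` products of n x n matrices.

    ASSUMPTION: 2*n**3 floating-point operations per product; the
    report does not state the formula it used.
    """
    return 2.0 * n**3 * loops / seconds / 1e9


def sizes_failing_expectation_1():
    """Sizes where the inlined MATMUL (b) is slower than the library MATMUL (a)."""
    return [n for n, lib, inl in zip(SIZES, LIBRARY_A, INLINED_B) if inl < lib]


if __name__ == "__main__":
    # List the sizes at which inlining loses to the library routine.
    print(sizes_failing_expectation_1())
    # Ratio of inlined to library throughput at size 512.
    i = SIZES.index(512)
    print(round(INLINED_B[i] / LIBRARY_A[i], 3))
```

With the table values as given, the inlined version is ahead only for the three smallest sizes and falls behind from size 16 upward.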