This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.

[Bug libfortran/51119] MATMUL slow for large matrices

From: "jvdelisle at gcc dot gnu.org" <gcc-bugzilla at gcc dot gnu dot org>
To: gcc-bugs at gcc dot gnu dot org
Date: Sun, 08 Nov 2015 23:33:14 +0000
Subject: [Bug libfortran/51119] MATMUL slow for large matrices
Auto-submitted: auto-generated
References: <bug-51119-4 at http dot gcc dot gnu dot org/bugzilla/>

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=51119

Jerry DeLisle <jvdelisle at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jvdelisle at gcc dot gnu.org

--- Comment #16 from Jerry DeLisle <jvdelisle at gcc dot gnu.org> ---
For what it's worth:

$ gfc pr51119.f90 -lblas -fno-external-blas -Ofast -march=native 
$ ./a.out 
 Time, MATMUL:    21.2483196       21.254449646000001     1.5055670945599979    
 Time, dgemm:    33.2441711       33.243087289000002      .96260614189671445    

This is on a laptop, without a tuned BLAS.  If I replace -Ofast with -O2,
I get:

$ ./a.out 
 Time, MATMUL:    43.6199570       43.625358022999997    0.73351833543988521    
 Time, dgemm:    33.2262650       33.226961453000001     0.96307331759072967 

-O3 brings performance back in line with -Ofast. It seems odd to me that -O2
does so poorly.

Regardless, with -O3 or -Ofast the internal MATMUL is doing better than BLAS
on this platform, but 1.5 GFLOPS is pretty lame either way.
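
As a rough cross-check of the figures quoted above, the sketch below multiplies
each elapsed time by the rate printed next to it. It assumes the last column of
the output is GFLOPS, which the output itself does not label; under that
assumption the implied work comes out at about 32 Gflop in all four cases, so
the runs are directly comparable. The names `runs` and `implied_gflop` are made
up for this illustration and are not part of pr51119.f90.

```python
# Figures copied from the two runs quoted above:
# (elapsed seconds, reported rate from the last output column).
runs = {
    ("-Ofast", "MATMUL"): (21.254449646000001, 1.5055670945599979),
    ("-Ofast", "dgemm"): (33.243087289000002, 0.96260614189671445),
    ("-O2", "MATMUL"): (43.625358022999997, 0.73351833543988521),
    ("-O2", "dgemm"): (33.226961453000001, 0.96307331759072967),
}


def implied_gflop(seconds, rate):
    """Total work implied by one run, ASSUMING the rate column is GFLOPS."""
    return seconds * rate


for (flags, routine), (seconds, rate) in runs.items():
    print(f"{flags:7s} {routine:7s} {implied_gflop(seconds, rate):6.2f} Gflop")

# Ratio of MATMUL elapsed times: how much slower the -O2 build is than -Ofast.
speedup = runs[("-O2", "MATMUL")][0] / runs[("-Ofast", "MATMUL")][0]
print(f"MATMUL, -Ofast vs -O2: {speedup:.2f}x")
```

The dgemm times barely move between the two builds, which is what one would
expect if that code lives in the BLAS library and is not recompiled with the
test program; only the MATMUL time changes with the optimization level.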
