This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.



[Bug libfortran/51119] MATMUL slow for large matrices


https://gcc.gnu.org/bugzilla/show_bug.cgi?id=51119

Jerry DeLisle <jvdelisle at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jvdelisle at gcc dot gnu.org

--- Comment #16 from Jerry DeLisle <jvdelisle at gcc dot gnu.org> ---
For what it's worth:

$ gfc pr51119.f90 -lblas -fno-external-blas -Ofast -march=native 
$ ./a.out 
 Time, MATMUL:    21.2483196       21.254449646000001     1.5055670945599979    
 Time, dgemm:    33.2441711       33.243087289000002      .96260614189671445    

This is on a laptop, with no tuned BLAS to take advantage of.  If I replace
-Ofast with -O2, I get:

$ ./a.out 
 Time, MATMUL:    43.6199570       43.625358022999997    0.73351833543988521    
 Time, dgemm:    33.2262650       33.226961453000001     0.96307331759072967 

-O3 brings performance back in line with -Ofast. It seems odd to me that -O2
does so poorly.

Regardless, the internal MATMUL is doing better than BLAS on this platform, but
1.5 GFLOPS is pretty lame either way.
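
For anyone reading the numbers above: the third column looks like a rate in
GFLOPS, since multiplying it by the elapsed time in the second column gives
the same total for every row. Here is a minimal Python sketch of that check,
using only the figures quoted in this message. The column meanings are an
assumption on my part, and the matrix size is not stated here, so the
operation count is inferred rather than known.

```python
# Timings copied from the output quoted in this message.
# Assumption: column 2 is elapsed seconds and column 3 is the rate in GFLOPS.
runs = {
    ("-Ofast", "MATMUL"): (21.254449646000001, 1.5055670945599979),
    ("-Ofast", "dgemm"): (33.243087289000002, 0.96260614189671445),
    ("-O2", "MATMUL"): (43.625358022999997, 0.73351833543988521),
    ("-O2", "dgemm"): (33.226961453000001, 0.96307331759072967),
}


def total_gflop(seconds, gflops):
    """Work implied by a rate and a duration, in units of 1e9 operations."""
    return seconds * gflops


def slowdown(slow_seconds, fast_seconds):
    """How many times longer the slow run took than the fast one."""
    return slow_seconds / fast_seconds


if __name__ == "__main__":
    for (flags, routine), (seconds, gflops) in runs.items():
        work = total_gflop(seconds, gflops)
        print(f"{flags:7s}{routine:7s}{work:8.3f} GFLOP in {seconds:7.3f} s")

    ratio = slowdown(runs[("-O2", "MATMUL")][0], runs[("-Ofast", "MATMUL")][0])
    print(f"MATMUL, -O2 vs -Ofast: {ratio:.2f}x slower")
```

Every row comes out at about 32 GFLOP, so all four runs are doing the same
amount of work, and the -O2 MATMUL run takes roughly twice as long as the
-Ofast one.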
