[libgfortran, configury] BLAS-based implementation of matmul

FX Coudert fxcoudert@gmail.com
Sat Apr 1 15:48:00 GMT 2006

Hi all,

I've been playing this afternoon with a toy patch that adds BLAS-based 
matmul routines to gfortran. What I tested, and what currently works, 
is the following: the BLAS routines (the ?GEMM routines, to be precise) 
are called from within the libgfortran real matmul routines, depending 
on the floating-point type, the array size, and the presence of non-unit 
strides (if there are BLAS implementations that can operate on arrays 
with general strides, I'm not aware of them).

This differs from the approach Janne wanted to pursue, i.e. having the 
front-end generate the calls to BLAS routines directly. I think ease of 
implementation and maintainability both favor the libgfortran solution, 
while the performance impact shouldn't be significant.
Anyway, my patch currently works and gives great speedups with an 
optimized BLAS on i686-linux (the Intel MKL). There are a few questions 
on which I'd like general feedback from the Fortran community, as well 
as people skilled in autoconf and top-level configury:

   - the BLAS library should be detected at compile-time, when the 
front-end is configured, since the specs need to include its path; 
unlike GMP/MPFR, which are linked with the front-end, it will be linked 
into the created executable: are there examples of how to handle this in 
the current gcc code? any particular ideas about how we might achieve this?

   - could that kind of idea raise any "political" objection?

And to the Fortran people specifically:

   - we could have the BLAS selection overridable (is that a real word?) 
at compile time, like -fforce-blas and -fforce-no-blas (or just -fblas 
and -fno-blas)

   - about the current matmul: why do we have special cases for 
(axstride == 1 && bxstride == 1) and for (aystride == 1 && bxstride 
== 1), but not for (axstride == 1 && bystride == 1) or (aystride == 1 
&& bystride == 1)?

