This is the mail archive of the
fortran@gcc.gnu.org
mailing list for the GNU Fortran project.
Re: Is it a bug: data aligment of matrix returned by gemm
- From: Tobias Burnus <burnus at net-b dot de>
- To: Maciej dot Zwierzycki at ifmpan dot poznan dot pl
- Cc: fortran at gcc dot gnu dot org
- Date: Thu, 25 Jun 2009 16:07:18 +0200
- Subject: Re: Is it a bug: data aligment of matrix returned by gemm
- References: <20090625123831.GD19665@ifmpan.poznan.pl>
Maciej Zwierzycki wrote:
> I am not sure if the following behaviour is a bug or simply the matter
> of my poor understanding of F90 standard so I am not submitting this a
> a formal bug report.
>
It is indeed a GNU Fortran bug (no regression); I filled:
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40551
General background:
The issue relates to when is data passed as array descriptor (dope
vector), i.e. including information about the bounds of the array, and
when just as a chunk of bytes. For "a(:)" (assumed shape) arrays, the
data is passed as is (via an array descriptor). On the other hand, all
arguments of "dgemm" are assumed size arguments ("a(*)") and the
interface is even not known; in this case, the arrays are passed
contiguous stream of bytes.
Thus when passing
call dgemm( ...., a, ...)
the following happens: If "a" is contiguous (i.e. it has no strides),
it(s data component) is passed as argument; if it is not contiguous, a
temporary array is created and the data of "a" is copied to the
temporary array. After the call, the data is copied back (copy in, copy
out).
That also works. The problem arises for your:
c(1,:,:)=matmul_func(a,b)
Here, c(1,:,:) has strides and as later in the call "dgemm" the "c"
argument needs to be contigious, at some point a "temp_c(2,2)" has to be
created and [copy in]/copy out of the data needs to happen, which - this
is the bug - does not happen in this case. For details, see below:
> function matmul_func(a,b)
> real(kind=8) :: matmul_func(size(a,1),size(b,2))
> [...]
>
>
The problem is that the result "c(1,:,:)" is passed by reference to
matmul; the dump looks as (-fdump-tree-original) follows:
matmul_func (&parm.31, D.1704, D.1714);
where &parm.31 is the "c" array with strides. The problem is now that
"matmul_func" is internally an array descriptor, but regarded as
contiguous. Thus for
> call dgemm('N','N',size(a,1),size(b,2),size(a,2),1.0d0,a,size(a,1),&
> &b,size(b,1),0.0d0,matmul_func,size(a,1))
>
the strides are ignored and simply the pointer to the data is passed:
dgemm (..., (real(kind=8)[0:] *) parm.26.data, ...);
The solution is that at some point a copying of the data needs to happen
- a natural choice would be for the call of matmul_func; actually, a
copy in is not needed for a function result thus only a copy out should
be implemented.
Thanks for the bug report!
Side note: If you just want to do a matrix multiplication, consider
using the MATMUL intrinsic directly. If you think that for larger
matrices, the BLAS function is better and you have a highly optimized
library, you can use with gfortran the options -fexternal-blas and
-fblas-matmul-limit=<n> (see manpage).
* * *
Tim Prince wrote:
> The compiler can't diagnose your syntax error, where you declare a with
> intent(in), although I suppose it could in principle if it could scan the
> source of dgemm.
Frankly, I do not understand this comment. "a" is unchanged according to
the dgemm man page - and according to the program. Only "c" and the
result variable "matmul_func" is changed.
> It should diagnose a syntax error when you don't set
> rank, and I can't see how you could get results without rank.
Indeed the compiler needs to have "rank" defined - and paramAter should
be paramEter.
MZ: In general, it helps if you provde a complete test case, which can
be compiled; i.e. the "..." of the interface filled and there should be
no rank/paramater problem. Though the omissions where small enough to be
filled in.
Tobias