This is the mail archive of the fortran@gcc.gnu.org mailing list for the GNU Fortran project.
performance question: cloning allocated arrays
- From: "Daniel Franke" <franke dot daniel at gmail dot com>
- To: fortran at gcc dot gnu dot org
- Date: Mon, 26 Feb 2007 19:12:59 +0100
- Subject: performance question: cloning allocated arrays
I would have posted this fairly general question to comp.lang.fortran,
but my news server isn't working ...
Please consider this code:
  TYPE :: summed_amplitude
    COMPLEX(DBL), DIMENSION(:,:), POINTER :: alm
  END TYPE

  SUBROUTINE summed_amplitude_init_copy(this, other)
    TYPE(summed_amplitude), INTENT(out) :: this
    TYPE(summed_amplitude), INTENT(in)  :: other
    ALLOCATE(this%alm(size(other%alm,1), size(other%alm,2)))
    this%alm = other%alm     ! <-----
  END SUBROUTINE
gprof shows that my program spends about 20% of its runtime copying
arrays. Judging from the tree dump (pasted below), gfortran assigns the
array elements one by one in a nested loop. Is there a way to get this
done in a single memcpy (which would hopefully speed things up)?
Thanks.
Daniel
dump:
{
  int4 D.1111;
  int4 D.1110;
  int4 D.1109;
  int4 D.1108;
  int4 D.1107;
  int4 D.1106;
  int4 D.1105;
  complex8[0:] * D.1104;
  int4 D.1103;
  int4 D.1102;
  int4 D.1101;
  int4 D.1100;
  int4 D.1099;
  complex8[0:] * D.1098;
  D.1098 = (complex8[0:] *) other->alm.data;
  D.1099 = other->alm.offset;
  D.1100 = other->alm.dim[0].lbound;
  D.1101 = other->alm.dim[0].ubound;
  D.1102 = other->alm.dim[1].lbound;
  D.1103 = other->alm.dim[1].ubound;
  D.1104 = (complex8[0:] *) this->alm.data;
  D.1105 = this->alm.offset;
  D.1106 = this->alm.dim[0].lbound;
  D.1107 = this->alm.dim[0].ubound;
  D.1108 = this->alm.dim[1].lbound;
  D.1109 = this->alm.dim[1].ubound;
  D.1110 = D.1106 - D.1100;
  D.1111 = D.1108 - D.1102;
  {
    int4 D.1114;
    int4 D.1113;
    int4 S.5;
    D.1113 = other->alm.dim[0].stride;
    D.1114 = this->alm.dim[0].stride;
    S.5 = D.1102;
    while (1)
      {
        if (S.5 > other->alm.dim[1].ubound) goto L.4;
        {
          int4 D.1117;
          int4 D.1116;
          int4 S.6;
          D.1116 = other->alm.dim[1].stride * S.5 + D.1099;
          D.1117 = (S.5 + D.1111) * this->alm.dim[1].stride + D.1105;
          S.6 = D.1100;
          while (1)
            {
              if (S.6 > other->alm.dim[0].ubound) goto L.3;
              (*D.1104)[(S.6 + D.1110) * D.1114 + D.1117] =
                (*D.1098)[S.6 * D.1113 + D.1116];
              S.6 = S.6 + 1;
            }
          L.3:;
        }
        S.5 = S.5 + 1;
      }
    L.4:;
  }
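
For readers following the dump, here is a minimal C sketch of what the loop nest computes and of the condition under which it could collapse into one memcpy. It is an illustration only, not gfortran's actual implementation: the types desc2_t and dim_t and all function names are hypothetical, the field names are borrowed from the dump, and it assumes that complex8 corresponds to C's double complex and that the two arrays do not overlap.

```c
#include <assert.h>
#include <complex.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the descriptor fields visible in the dump.
   The real gfortran descriptor layout is not reproduced here. */
typedef struct { int stride, lbound, ubound; } dim_t;

typedef struct {
    double complex *data;   /* assumption: complex8 == double complex */
    int offset;
    dim_t dim[2];
} desc2_t;

/* Number of elements along dimension d. */
static int extent(const desc2_t *a, int d)
{
    return a->dim[d].ubound - a->dim[d].lbound + 1;
}

/* True when the elements are packed in column-major order without gaps.
   Conservative: some packed single-column arrays are reported as strided. */
static int is_contiguous(const desc2_t *a)
{
    return a->dim[0].stride == 1 && a->dim[1].stride == extent(a, 0);
}

/* Index into data[] of the element at (lbound, lbound). */
static ptrdiff_t first_index(const desc2_t *a)
{
    return (ptrdiff_t)a->offset
         + (ptrdiff_t)a->dim[0].lbound * a->dim[0].stride
         + (ptrdiff_t)a->dim[1].lbound * a->dim[1].stride;
}

/* Element-by-element copy, following the index arithmetic in the dump. */
static void copy_elementwise(desc2_t *dst, const desc2_t *src)
{
    int d0 = dst->dim[0].lbound - src->dim[0].lbound;   /* D.1110 */
    int d1 = dst->dim[1].lbound - src->dim[1].lbound;   /* D.1111 */

    for (int j = src->dim[1].lbound; j <= src->dim[1].ubound; j++) {
        int so = src->dim[1].stride * j + src->offset;         /* D.1116 */
        int to = (j + d1) * dst->dim[1].stride + dst->offset;  /* D.1117 */
        for (int i = src->dim[0].lbound; i <= src->dim[0].ubound; i++)
            dst->data[(i + d0) * dst->dim[0].stride + to] =
                src->data[i * src->dim[0].stride + so];
    }
}

/* Use one memcpy when both sides are packed, otherwise fall back to loops.
   memcpy requires that the two storage regions do not overlap. */
static void copy_array(desc2_t *dst, const desc2_t *src)
{
    int n0 = extent(src, 0), n1 = extent(src, 1);

    assert(extent(dst, 0) == n0 && extent(dst, 1) == n1);
    if (n0 <= 0 || n1 <= 0)
        return;                      /* nothing to copy */

    if (is_contiguous(dst) && is_contiguous(src))
        memcpy(dst->data + first_index(dst),
               src->data + first_index(src),
               (size_t)n0 * (size_t)n1 * sizeof *src->data);
    else
        copy_elementwise(dst, src);
}
```

The point of the sketch is that the strides are only known at run time for a POINTER component, so a block copy is valid only after a check like is_contiguous() has passed on both sides. Whether such a fast path actually beats the loop nest for a given array size would have to be measured.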