This is the mail archive of the fortran@gcc.gnu.org mailing list for the GNU Fortran project.

performance question: cloning allocated arrays

From: "Daniel Franke" <franke dot daniel at gmail dot com>
To: fortran at gcc dot gnu dot org
Date: Mon, 26 Feb 2007 19:12:59 +0100
Subject: performance question: cloning allocated arrays

I'd post this sort-of general question to c.l.f, but my newsserver
doesn't work ...

Please consider this code:

TYPE :: summed_amplitude
 COMPLEX(DBL), DIMENSION(:,:), POINTER :: alm
END TYPE

SUBROUTINE summed_amplitude_init_copy(this, other)
 TYPE(summed_amplitude), INTENT(out) :: this
 TYPE(summed_amplitude), INTENT(in)  :: other
 ALLOCATE(this%alm(size(other%alm,1), size(other%alm,2)))
 this%alm = other%alm      !  <-----
END SUBROUTINE

gprof shows that my program spends about 20% of its runtime copying
arrays. Judging by the dump, gfortran assigns the array elements one
by one (pasted below). Is there a way to get this done with a single
memcpy (which would hopefully speed things up)?

Thanks.
   Daniel


dump:
 {
   int4 D.1111;
   int4 D.1110;
   int4 D.1109;
   int4 D.1108;
   int4 D.1107;
   int4 D.1106;
   int4 D.1105;
   complex8[0:] * D.1104;
   int4 D.1103;
   int4 D.1102;
   int4 D.1101;
   int4 D.1100;
   int4 D.1099;
   complex8[0:] * D.1098;

   D.1098 = (complex8[0:] *) other->alm.data;
   D.1099 = other->alm.offset;
   D.1100 = other->alm.dim[0].lbound;
   D.1101 = other->alm.dim[0].ubound;
   D.1102 = other->alm.dim[1].lbound;
   D.1103 = other->alm.dim[1].ubound;
   D.1104 = (complex8[0:] *) this->alm.data;
   D.1105 = this->alm.offset;
   D.1106 = this->alm.dim[0].lbound;
   D.1107 = this->alm.dim[0].ubound;
   D.1108 = this->alm.dim[1].lbound;
   D.1109 = this->alm.dim[1].ubound;
   D.1110 = D.1106 - D.1100;
   D.1111 = D.1108 - D.1102;
   {
     int4 D.1114;
     int4 D.1113;
     int4 S.5;

     D.1113 = other->alm.dim[0].stride;
     D.1114 = this->alm.dim[0].stride;
     S.5 = D.1102;
     while (1)
       {
         if (S.5 > other->alm.dim[1].ubound) goto L.4;
         {
           int4 D.1117;
           int4 D.1116;
           int4 S.6;

           D.1116 = other->alm.dim[1].stride * S.5 + D.1099;
           D.1117 = (S.5 + D.1111) * this->alm.dim[1].stride + D.1105;
           S.6 = D.1100;
           while (1)
             {
               if (S.6 > other->alm.dim[0].ubound) goto L.3;
               (*D.1104)[(S.6 + D.1110) * D.1114 + D.1117] = (*D.1098)[S.6 * D.1113 + D.1116];
               S.6 = S.6 + 1;
             }
           L.3:;
         }
         S.5 = S.5 + 1;
       }
     L.4:;
   }
 }
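
For readers following the dump, here is a rough C sketch of the two copy strategies in question: the element-by-element nested loop, and a single memcpy that is only valid when both arrays are packed in memory without gaps. The desc2d struct and all function names are hypothetical and much simpler than gfortran's real array descriptor; float complex is chosen only for illustration. It is meant to show why a POINTER array, which may be strided, cannot be blindly copied with memcpy, not to describe what gfortran actually generates.

```c
#include <assert.h>
#include <complex.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified 2-D array descriptor (strides in elements).
   This is NOT gfortran's actual descriptor layout. */
typedef struct {
    float complex *data;
    ptrdiff_t stride[2];   /* element stride per dimension */
    ptrdiff_t extent[2];   /* number of elements per dimension */
} desc2d;

/* Element-by-element copy, the same shape as the nested loops in the dump. */
void copy_elementwise(desc2d *dst, const desc2d *src)
{
    assert(dst->extent[0] == src->extent[0]);
    assert(dst->extent[1] == src->extent[1]);
    for (ptrdiff_t j = 0; j < src->extent[1]; j++)
        for (ptrdiff_t i = 0; i < src->extent[0]; i++)
            dst->data[i * dst->stride[0] + j * dst->stride[1]] =
                src->data[i * src->stride[0] + j * src->stride[1]];
}

/* True if the elements are packed in column-major order without gaps. */
int is_contiguous(const desc2d *d)
{
    return d->stride[0] == 1 && d->stride[1] == d->extent[0];
}

/* One memcpy when both sides are packed; otherwise fall back to the loop. */
void copy_array(desc2d *dst, const desc2d *src)
{
    assert(dst->extent[0] == src->extent[0]);
    assert(dst->extent[1] == src->extent[1]);
    if (is_contiguous(dst) && is_contiguous(src)) {
        size_t n = (size_t)(src->extent[0] * src->extent[1]);
        memcpy(dst->data, src->data, n * sizeof *src->data);
    } else {
        copy_elementwise(dst, src);
    }
}

/* Small self-check: returns 1 if both paths produce identical copies
   of a packed 3x4 array. */
int demo_copy_matches(void)
{
    float complex a[12], b[12], c[12];
    for (int k = 0; k < 12; k++) {
        a[k] = k + 2.0f * k * I;
        b[k] = 0;
        c[k] = 0;
    }
    desc2d src = { a, {1, 3}, {3, 4} };
    desc2d d1  = { b, {1, 3}, {3, 4} };
    desc2d d2  = { c, {1, 3}, {3, 4} };
    copy_elementwise(&d1, &src);   /* loop path */
    copy_array(&d2, &src);         /* memcpy path, since both are packed */
    return memcmp(b, a, sizeof a) == 0 && memcmp(c, a, sizeof a) == 0;
}
```

In this sketch the fast path depends on a runtime contiguity check. For an array pointer the compiler generally cannot prove contiguity at compile time, since the pointer could be associated with a strided section.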

Follow-Ups:
- Re: performance question: cloning allocated arrays
  - From: Steve Kargl