This is the mail archive of the fortran@gcc.gnu.org mailing list for the GNU Fortran project.



performance question: cloning allocated arrays


I'd post this sort-of general question to c.l.f, but my newsserver
doesn't work ...

Please consider this code:

TYPE :: summed_amplitude
 COMPLEX(DBL), DIMENSION(:,:), POINTER :: alm
END TYPE

SUBROUTINE summed_amplitude_init_copy(this, other)
 TYPE(summed_amplitude), INTENT(out) :: this
 TYPE(summed_amplitude), INTENT(in)  :: other
 ALLOCATE(this%alm(size(other%alm,1), size(other%alm,2)))
 this%alm = other%alm      !  <-----
END SUBROUTINE

gprof shows that my program spends about 20% of its runtime copying
arrays. Judging from the dump, gfortran seems to assign the array
elements one by one (pasted below). Is there a way to get this done in
a single memcpy (which hopefully would speed things up)?

Thanks.
   Daniel


dump:
{
   int4 D.1111;
   int4 D.1110;
   int4 D.1109;
   int4 D.1108;
   int4 D.1107;
   int4 D.1106;
   int4 D.1105;
   complex8[0:] * D.1104;
   int4 D.1103;
   int4 D.1102;
   int4 D.1101;
   int4 D.1100;
   int4 D.1099;
   complex8[0:] * D.1098;

   D.1098 = (complex8[0:] *) other->alm.data;
   D.1099 = other->alm.offset;
   D.1100 = other->alm.dim[0].lbound;
   D.1101 = other->alm.dim[0].ubound;
   D.1102 = other->alm.dim[1].lbound;
   D.1103 = other->alm.dim[1].ubound;
   D.1104 = (complex8[0:] *) this->alm.data;
   D.1105 = this->alm.offset;
   D.1106 = this->alm.dim[0].lbound;
   D.1107 = this->alm.dim[0].ubound;
   D.1108 = this->alm.dim[1].lbound;
   D.1109 = this->alm.dim[1].ubound;
   D.1110 = D.1106 - D.1100;
   D.1111 = D.1108 - D.1102;
   {
     int4 D.1114;
     int4 D.1113;
     int4 S.5;

     D.1113 = other->alm.dim[0].stride;
     D.1114 = this->alm.dim[0].stride;
     S.5 = D.1102;
     while (1)
       {
         if (S.5 > other->alm.dim[1].ubound) goto L.4;
         {
           int4 D.1117;
           int4 D.1116;
           int4 S.6;

           D.1116 = other->alm.dim[1].stride * S.5 + D.1099;
           D.1117 = (S.5 + D.1111) * this->alm.dim[1].stride + D.1105;
           S.6 = D.1100;
           while (1)
             {
               if (S.6 > other->alm.dim[0].ubound) goto L.3;
               (*D.1104)[(S.6 + D.1110) * D.1114 + D.1117] =
                   (*D.1098)[S.6 * D.1113 + D.1116];
               S.6 = S.6 + 1;
             }
           L.3:;
         }
         S.5 = S.5 + 1;
       }
     L.4:;
   }
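
To make the question concrete, here is a minimal C sketch of the copy the
dump performs, with the memcpy shortcut I am hoping for added in front of it.
Everything in it is an assumption for illustration: the descriptor struct
(desc2_t, dim_t) and the helpers (extent, is_contiguous, elem, copy2) are
hypothetical names that only mimic the fields visible in the dump, not
gfortran's real descriptor layout, and complex8 is taken to mean double
complex. The fast path is only valid when both arrays are packed and do not
overlap, which a compiler cannot generally assume for POINTER components.

```c
#include <assert.h>
#include <complex.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical rank-2 descriptor, loosely modelled on the fields seen in
   the dump (data, offset, stride/lbound/ubound per dimension). */
typedef struct {
    ptrdiff_t stride, lbound, ubound;
} dim_t;

typedef struct {
    double complex *data;   /* assumed equivalent of complex8 */
    ptrdiff_t offset;
    dim_t dim[2];
} desc2_t;

/* Number of elements along dimension d (zero for empty ranges). */
static ptrdiff_t extent(const desc2_t *a, int d)
{
    ptrdiff_t n = a->dim[d].ubound - a->dim[d].lbound + 1;
    return n > 0 ? n : 0;
}

/* Conservative test: packed column-major storage without gaps. */
static int is_contiguous(const desc2_t *a)
{
    return a->dim[0].stride == 1 && a->dim[1].stride == extent(a, 0);
}

/* Address of element (i, j), using the same index formula as the dump. */
static double complex *elem(const desc2_t *a, ptrdiff_t i, ptrdiff_t j)
{
    return &a->data[i * a->dim[0].stride + j * a->dim[1].stride + a->offset];
}

/* Copies src into dst (same shape, non-overlapping storage assumed).
   Returns 1 if the memcpy path was taken (or nothing had to be copied),
   0 if the element-by-element loop was used. */
int copy2(desc2_t *dst, const desc2_t *src)
{
    ptrdiff_t n0 = extent(src, 0), n1 = extent(src, 1);
    assert(extent(dst, 0) == n0 && extent(dst, 1) == n1);

    if (n0 == 0 || n1 == 0)
        return 1;

    if (is_contiguous(dst) && is_contiguous(src)) {
        /* Both arrays are packed: one block copy is enough. */
        memcpy(elem(dst, dst->dim[0].lbound, dst->dim[1].lbound),
               elem(src, src->dim[0].lbound, src->dim[1].lbound),
               (size_t)(n0 * n1) * sizeof(double complex));
        return 1;
    }

    /* General case: the same loop nest as in the dump. */
    for (ptrdiff_t j = 0; j < n1; j++)
        for (ptrdiff_t i = 0; i < n0; i++)
            *elem(dst, dst->dim[0].lbound + i, dst->dim[1].lbound + j) =
                *elem(src, src->dim[0].lbound + i, src->dim[1].lbound + j);
    return 0;
}
```

In summed_amplitude_init_copy the target has just been allocated, so it is
packed and cannot overlap the source; whether the source is packed is the
part that would have to be checked at run time.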

