This is the mail archive of the fortran@gcc.gnu.org mailing list for the GNU Fortran project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

really slow ALLOCATE / MOVE_ALLOC


> Anyway, my point is that if some other allocation has used the space behind
> the object you want to resize, then realloc() has to do a new allocation and
> copy the data over. Unless, as I mentioned previously, the allocation is big
> enough that malloc has switched from (s)brk to mmap.

Good realloc implementations will avoid copies for all sizes larger
than a single page, not just very large arrays.

I think people simply don't appreciate how big the difference between
realloc and MOVE_ALLOC really is, and how inefficient dynamic storage
management in GFortran can be.

Here are two simple programs that extend arrays one integer at a time.
 The C version scales linearly, the Fortran version is quadratic.  For
1000000 elements, the C version takes 39 milliseconds, the Fortran
version takes 18 minutes.  The C version scales up easily to 100000000
elements (1.2 seconds).  The C version of realloc is so efficient
that, except for inner loops, you simply just call it whenever it
makes sense.  In gfortran, calling ALLOCATE seems to be so costly that
one really has to think very carefully about where to use it and adopt
error prone and costly pre-allocation strategies to work around these
inefficiencies.

Of course, you can generate a list of the first 10000000 integers much
more easily.  But analogous allocation patterns do exist in many
real-world compute-intensive programs.   GFortran performs poorly on
these benchmarks means that it will perform poorly on those real-world
tasks as well.  There are some workarounds (defining data structures
with exponential doubling), but they are more cumbersome and still
less efficient than just doing the simple and efficient reallocation
(and fixing whatever other overhead there may be).


Tom

PS: Note that I would have liked to write the realloc_fortran.f95
using "a = [a,i]", but that crashes in the version of 4.3 that I have.

$ cat realloc_c.c
#include <stdlib.h>
#include <stdio.h>
#include <malloc.h>

int main(int argc,char **argv) {
    if(argc<2) {printf("bad args\n"); exit(1);}
    int n = atoi(argv[1]);
    int *data = malloc(sizeof data[0]);
    for(int i=1;i<n;i++) {
        data = realloc(data,(i+1)*sizeof data[0]);
        if(!data) abort();
        data[i] = i;
    }
}
$ cat realloc_fortran.f95
program ra
    integer, allocatable :: a(:),temp(:)
    integer n,i
    character*100 buf
    call getarg(1,buf)
    read(buf,*) n
    allocate(a(1))
    do i=2,n
        allocate(temp(i))
        temp(1:i-1) = a
        call move_alloc(temp,a)
        a(i) = i
    end do
    print *,sum(a)
end program
$ gcc --std=c99 -O3 realloc_c.c -o realloc_c
$ gfortran -O3 -pedantic --std=f95 realloc_fortran.f95 -o realloc_fortran
$ time realloc_c 1000000

real	0m0.039s
user	0m0.016s
sys	0m0.024s
$ time realloc_fortran 1000000
  1784293662

real	18m20.474s
user	10m7.414s
sys	8m12.463s
$

If you don't like the above example, here's another one that simulates
reallocations in a long-running dynamic program (and if you still
think that's unrealistic, replace int *data with int *data[100]).
And, yes, this does access uninitialized memory.

$ cat rc.c
#include <stdlib.h>
#include <stdio.h>
#include <malloc.h>
#include <math.h>

int main(int argc,char **argv) {
    int *data = malloc(sizeof data[0]);
    int i,size;
    double total;
    for(i=0;i<100000;i++) {
        size = exp(1.0+15*drand48());
        data = realloc(data,size * sizeof data[0]);
        data[lrand48()%size] = lrand48();
    }
    for(i=0;i<size;i++) total += data[i];
    printf("%g\n",total);
}
$ cat rf.f95
program ra
    integer, allocatable :: a(:),temp(:)
    integer n,k,i
    real rv(3)
    allocate(a(1))
    do i=1,100000
        call random_number(rv)
        n = floor(exp(1.0+15.0*rv(1)))
        allocate(temp(n))
        k = min(n,size(a))
        temp(1:k) = a(1:k)
        call move_alloc(temp,a)
        a(1+floor(rv(2)*n)) = floor(rv(3)*n)
    end do
    print *,sum(a)
end program
$ gfortran -O3 rf.f95
$ time a.out
   176436294

real	0m14.984s
user	0m14.233s
sys	0m0.752s
$ gcc -O3 rc.c -lm
$ time a.out

real	0m0.243s
user	0m0.028s
sys	0m0.216s


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]