This is the mail archive of the fortran@gcc.gnu.org mailing list for the GNU Fortran project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: really slow ALLOCATE / MOVE_ALLOC


Thanks for the response; looks like we're on the same page.

>> Good implementations of malloc use VM hardware and page remapping,
>> rather than reserve extra space.  Look at the mremap system call on
>> Linux.
>
> I understand this but reducing the number of system calls by
> allocating oversize will certainly result in a performance
> improvement.

Yes, but there needs to be some minimal user control.  When I work
with 3.5Gbyte arrays on a 4Gbyte machine, I need to be able to tell
the library not to pre-allocate 1.5x the array size.

>From working with arrays classes in C++, a simple scheme that lets the
user control absolute reserved space and tell the array class what to
do on resizing seems to work well.  Based on that experience, I think
it would be possible to do this transparently in Fortran through a
collection of subroutines that pass hints to the allocator on
implementations that understand it (see below).

Fortran implementations that don't understand reserved storage can
provide implementations of realloc in terms of standard, portable
primitives of realloc, and do nothing for the other calls.  Fortran
implementations that can perform efficient array reallocation through
a C realloc call or through direct VM manipulation internally make the
right tradeoffs between reserve allocation and VM-based resizing when
necessary.

A good library implementation for gfortran could perhaps encourage
incorporation of these features into a future standard.

Note that full automation of these just isn't possible; the compiler
simply doesn''t have the information from a single run of the program
to make the right choices.

Conformal versions of these primitives are rarely needed in my
experience, and I think it would be bad to make good
reservation/reallocation an all-or-nothing proposition.  That's why I
think it's good to have a non-conformal realloc and a separate
conformal_realloc.  Doing a good job at the non-conformal version is
fairly easy, and implementors could just do a minimal job on the
conformal_realloc (it would still be better and more convenient than
the current solutions).

> True but a naive application with realloc would be a good start:-)

Yes, my point exactly :-)

Tom

Here's the interface.  Again, the two simplest implementations for this are:

(1) each subroutine does nothing

(2) most of the subroutines do nothing, but realloc calls C's realloc

The next one would be:

(3) implement a simple storage reservation scheme (with some care,
this can actually be done in C library code with almost no compiler
modifications)

This kind of API would let programmers express their intent and needs
portably and give them much better efficiency than currently possible
on platforms that make a minimal effort to implement (2) or (3).

call realloc(a,[10,2000000])

Reallocate the array a with the new bounds, keeping old content for
overlapping elements.  The reallocation is performed as if the array
has been flattened into a rank 1 array, grown/shrunk to the desired
size, and then reshaped in place to the new size.  Internally, this
may use reserved space, C's realloc, or even explicit VM page
manipulation.  If there is reserved space, reallocations that use less
total space than the reserved space many not require any kind of new
storage allocation.

d = reservation(a)

Returns a hint about the current reservation for a.  The result is
either the actual array dimensions, the exact reservation, a vector at
least as large in each dimension than the actual reservation (if the
system has reserved extra space), or a vector of huge values if the
system uses an efficient reallocation scheme that does not require
reservations.  (The intent is that code can determine whether a
reallocation request can be carried out efficiently by checking
whether each dimension is no larger than the reservation.)

call reserve(a,[10,1000000])

Reserve storage for future growth but don't change the allocation.  If
any of the dimensions is smaller than the current size, it is treated
as if it were given as the current size.  This may be called on an
unallocated array and will then affect the next allocation.  If it is
called on an allocated array, it may cause the array to be reallocated
with identical bounds and the new reservation immediately, or it may
take effect at the next explicit reallocation.

call growth(a,[1.0,1.5])

Whenever the array has grown beyond its reserve and the user attempts
reallocation to some size, allocate extra space as given by the growth
factors.  In this case, "call realloc(a,[n,m])" or "allocate(a(n,m))"
would implicitly be followed by a "call
reserve(a,[floor(1.0*n),floor(1.5*m)]).  The default growth for any
array is 1.0 in every dimension.

call tighten(a)

Shrink the reserve of the array to its actual allocation and
deallocate any storage freed in the process.  This is always a
conformal operation.

call conformal_realloc(a,[50,50],offset=[10,10])

Perform a conformal reallocation of the array, mapping the array
elements onto the new array index-by-index.  The contents of newly
allocated elements is undefined.  A good implementation will attempt
to perform any necessary data movements in place if possible.  Other
possibilities for conformal remapping include laying out the array to
avoid data movement on resize when the array has a multidimensional
reservation, and/or controlling the array-too-page mapping so that the
array can be extended without copying by rearranging pages in memory.

A good system for conformal reallocation is much harder to implement
and much less frequently needed than the non-conformal version;
implementors should get the non-conformal version out the door
quickly, since that has a real performance impact for a lot of code.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]