This is the mail archive of the libstdc++@sourceware.cygnus.com mailing list for the libstdc++ project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]

An suggested extension


I know that we have a lot of work to do to support the standard before we
should be thinking about extensions to the standard, but I've been thinking
about this one lately.  I would like input on the idea.

The problem: vectors (and deques) are often confronted with the problem of
moving (possibly large) amounts of their contained data.  The default
behavior is to copy the data and then destroy the old data.  This is
inefficient if the data type contained can be shallow-copied and then have
its destructor call skipped (!) (Note - not all data types can do this); a
classic example is the vector class itself.

So what if we define a few extended algorithms to wrap the "copy-construct,
then destruct" into a single operation?  The name I use for this is the
"uninitialized move" operation:
template<class T>
void __uninitialized_move(T * const Source, T * const Dest)
{
  construct(Dest, *Source);
  destroy(Source);
}

template <class T>
void __uninitialized_move(T * SourceFirst, T * const SourceLast, T * Dest)
{
  while (SourceFirst != SourceLast)
  {
    __uninitialized_move(SourceFirst, Dest);
    ++SourceFirst;
    ++Dest;
  }
}

Of course, in actual implementation, both of the algorithms above will be
able to use the __type_traits to further improve efficiency; for POD types,
the second __uninitialized_move should simplify to a single memmove().
Also, the simple examples above ignore exception-handling complications.

For libstdc++-supplied classes (such as vector), we would include
specializations for the above algorithms, like the following (this requires
a private constructor for vector, another member function to destroy only
the allocator if necessary, and the __uninitialized_move's to be declared as
friends (because we don't know how the user-defined allocator should be
copied - if we did we could just use memcpy)):
template <class T>
void __uninitialized_move(vector<T> * const Source, vector<T> * const Dest)
{
  // shallow copy
  new (reinterpret_cast<void *>(Dest)) vector<T>(Source->get_allocator(),
Source->_M_start, Source->_M_finish, Source->_M_end_of_storage);
  // do not call destructor for Source; however, we do have to pick up the
destructor for the user-defined allocator
  Source->__destroy_allocator();
}

Another interesting note: this new technique also allows us to add list-like
splice functions on vectors (and deques?).

Final note: this technique allows very fast moving of any types that 1) we
supply, or 2) the programmer specializes the moving of.  It has no effect on
the correctness of the STL implementation; only its speed.

Thoughts? Comments?

	-Steve

Afterword: a bit of information about me, since this is my first post to the
list.

My name is Stephen Cleary.  I'm a 1997 graduate of Michigan Technological
University, now employed at Control Engineering Company of Petoskey, MI.

I first heard of the STL shortly after it was accepted into the ANSI/ISO
draft standard.  I immediately knew that that was what I wanted to work on
(esp. containers and algorithms).  The problem was choosing which
implementation I should work on.  HP is eons old and RogueWave would require
a change of scenery (and their implementation is just OK, anyway).  When I
heard about a week ago that Cygnus was now in charge of gcc (egcs), I
decided to check what kind of STL they were using.  I was quite impressed
with the bazaar-style implementation of egcs (compiler optimization is also
an interest of mine - esp. recursive algorithms), and I think that libstdc++
will be my library of choice to work on.

The extension suggested above is the first that I have thought of.  It was
originally conceived (though not implemented) when I was writing my own
vector class from scratch, having been fairly disgusted at what other STL
libraries offered.  It was needed (though not used) a few weeks later when I
had a CPU-intensive program (containing vectors of vectors of
user-defined-types-containing-vectors, etc.)  that suffered an approx. 20x
speed decrease because it was not implemented (I got around it in my program
by declaring vectors of auto_ptr's of vectors).

There are a few other extensions I would like to see in the completed
library; for example: a member function to return a pointer to the
underlying array in a vector (named __ptr), a singly-linked-list class
(named __sll), and a few others that I think about from time to time.



Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]