This is the mail archive of the libstdc++@gcc.gnu.org mailing list for the libstdc++ project.

Index Nav: Message Nav: [Date Index] [Subject Index] [Author Index] [Thread Index] [Date Prev] [Date Next] [Thread Prev] [Thread Next]

# Re: Amortized Analysis

• To: David Sankel <camio at yahoo dot com>
• Subject: Re: Amortized Analysis
• From: Kevin Atkinson <kevina at users dot sourceforge dot net>
• Date: Sat, 7 Jul 2001 18:00:02 -0400 (EDT)
• Cc: <libstdc++ at gcc dot gnu dot org>

```On Fri, 6 Jul 2001, David Sankel wrote:

> For people who don't know what it is:
>
> Page 356
> Introduction To Algrothms
> Cormen Leiserson Rivest
> David J. Sankel

Here is a more helpful explanation:

The amortized time an operation takes is the average time that operation
takes over an entire sequence of operation.  For example in a serious of
100 operations if all but the first operation has time n (100) but all the
other operations have constant time than the amortized time is roughly (n
+ n)/n which is (100 + 100)/100 which is 2.

Since what probably takes the most time when expanding a string is the
copying of the data for this analysis I will only count the number of
characters copied.  The same numbers will come out if the number of bytes
allocated or deleted was counted. I will also assume characters are
appended a character at a time.  It is also the cause the the roughly the
same numbers will come out if characters were always added in constant
size blocks or for that matter blocks or random sizes.

In the case of always allocating an additional character the average
time number of characters copied to add n character to an empty string is
roughly (1 + 2 + 3 + ... + n )/n which roughly (n(n+1))/(2 n) which is
(n+1)/2 which is about n/2.  Thus the amoritized time is not constant but
linear to the number of appends.

Tn the case of always allocating x size blocks when ever more space is
needed the number of characters copies is roughly
(x + 2 x + ... + n)/n which is x (1 + 2 + ... + n/x)/n which is
x (n/x)(n/x+1)/(2 n) which is about n/(2 x) which is better but still
linear.

On the other hand when always allocated x times the length the string the
number of copies is roughly:

(1 + x + x^2 + x^3 + ... + n)/n

which is:

sum(x^i, i = 0 ... log[x](n))/n

where log[x](n) means log base x of n.

this becomes:

x^log[x](n) - 1     n^log[x](x) - 1      n - 1
---------------  = ---------------- = -----------
n (x-1)               n (x-1)         n (x - 1)

which is about 1/(x-1).  In the case of doubling this number is 1.  In the
case of expanding by a factor of 1.5 this number is 2 but you don't waste
as much space.

Personally I think all STL containers should have a shrink method so that
the container can shrink to the exact size needed once you are done adding
to it.  I know you can always create a new container and make a copy but
this has the potential to be not as efficient as just freeing the unused
memory.  Because in some cases this memory can be freed without any
copying or reallocation being done.  For example in the case of very large
strings which are allocated with mmap.

---
Kevin Atkinson
kevina at users sourceforge net
http://www.ibiblio.org/kevina/

```

Index Nav: Message Nav: [Date Index] [Subject Index] [Author Index] [Thread Index] [Date Prev] [Date Next] [Thread Prev] [Thread Next]