This is the mail archive of the libstdc++@gcc.gnu.org mailing list for the libstdc++ project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [C++] Should the complexity of std::list::size() be O(n) or O(1)?


On Nov 24, 2005, at 3:57 PM, Gabriel Dos Reis wrote:

Certainly.  My point was that I'm not convinced penalizing proper list
operations is the right resolution.

I'm not convinced that labeling list::size() as improper is correct characterization. I have no trouble thinking of applications which need to know the size of the list. Admittedly I also have no trouble thinking of applications which don't need to know the size of the list.


What is improper is having an O(N) size. It should either be O(1), or not exist. The fact that it exists and is O(N) is what is improper imho. I continue to come across code that uses list::size in the mistaken belief that it is O(1).

http://gpstk.sourceforge.net/doxygen/FileFilter_8hpp-source.html

00103 std::vector<lItrType> data(dataVec.size()); // dataVec is std::list
00104 lItrType itr = dataVec.begin();
00105 typename std::vector<lItrType>::size_type i = 0;
00106 while (itr != dataVec.end())
00107 {
00108 data[i] = itr;
00109 i++;
00110 itr++;
00111 }

This is not a simple case of people not being sufficiently educated. This is a very poor interface design on the part of std::list.


  3.  Leave it as is.
  4.  Be clear that size() is linear for list<>.

These are unacceptable because they leave in place the bad interface.


Fwiw, I've been working on an example which uses both list::size, and list::splice-some-from-other. My motivation is to get real data on the various effects, instead of just tossing words back and forth. The results are obvious for examples that would use only size, or only splice, so I find it more interesting to explore a case that would use both.

The spec for the example I'm working with is:

Consider an aircraft seating program which has a function called free_upgrade. This function tries to upgrade passengers from economy to first class, but is subject to the size limits of first class seating, and also subject to weight & balance rules to keep the aircraft stable. It is also smart enough to not split up a party that's traveling together. Here is how it might look:

/*
Upgrade as many economy passengers as you can to first class without overbooking
first class, or violating weight/balance limits that state there must be more
economy passengers than first class passengers.


Passengers travel in groups. Never partially upgrade a group. Passengers
traveling together want to sit together. Grouped Passengers are adjacent in
the lists.


Passenger address must remain stable as other data structures will reference
individual Passengers.
*/


void free_upgrade(std::list<Passenger>& first_class, std::list<Passenger>& economy)
{
// splices Passengers from economy to first_class subject to the above constraints.
// splices a group at a time.
}


The parameters to the problem might look something like:

const std::size_t first_class_initial_size = 25;
const std::size_t first_class_limit = 50;
const std::size_t economy_class_initial_size = 100;
const std::size_t max_group_size = 5;

I'm giving each Passenger a first and last name. And Passengers that are adjacent in the lists, and have the same last name are constrained to travel together (either all or none of that subgroup will be upgraded). Passenger groups occur with uniform randomness in the range [1, max_group_size].

I've coded this up 3 ways so far:

1. Assuming an O(1) size, and being right.
2. Assuming an O(N) size, and being right - sizes are computed and cached externally.
3. Assuming an O(1) size, but being wrong (size is actually O(N)).


The results vary greatly of course with the values of the initial parameters. And obviously as the parameters are increased in value the differences in the timing results become more exaggerated.

I've yet to find a case where solution 1 is not fastest. Solution 2 is usually a close second. The best run it has had is only 1% slower (with the above parameters). The worst is 37% slower with:

const std::size_t first_class_initial_size = 2500;
const std::size_t first_class_limit = 5000;
const std::size_t economy_class_initial_size = 10000;
const std::size_t max_group_size = 100;

(I really don't care if the above numbers aren't realistic for an aircraft - that is beside the point. I'm exploring performance characteristics of list)

Case 3 is pretty much always a disaster. With the first set of parameters it is 39% slower. With the latter set of parameters it is 73 times slower.

Imho, we need to do whatever it takes to make sure our customers don't accidently fall into the case 3 trap.

-Howard

PS: Source code for this experiment is available, but I wanted to give others a chance for independent verification.

PSS: Iteration, size and splice-some-from-other are the only list functions being timed. Specifically left out of this analysis are list assignment, resize and operator== (that is perhaps another test).


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]