Associative-Container Performance Tests

Settings

This section describes performance tests and their results. In the following, g++, msvc++, and local (the build used for generating this documentation) stand for three different builds:

g++

  • CPU speed - cpu MHz : 2660.644
  • Memory - MemTotal: 484412 kB
  • Platform - Linux-2.6.12-9-386-i686-with-debian-testing-unstable
  • Compiler - g++ (GCC) 4.0.2 20050808 (prerelease) (Ubuntu 4.0.1-4ubuntu9)

msvc++

  • CPU speed - cpu MHz : 2660.554
  • Memory - MemTotal: 484412 kB
  • Platform - Windows XP Pro
  • Compiler - Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 13.10.3077 for 80x86

local

  • CPU speed - cpu MHz : 2250.000
  • Memory - MemTotal: 2076248 kB
  • Platform - Linux-2.6.16-1.2133_FC5-i686-with-redhat-5-Bordeaux
  • Compiler - g++ (GCC) 4.1.1 20060525 (Red Hat 4.1.1-1)

Tests

Hash-Based Containers

  1. Hash-Based Text find Find Timing Test
  2. Hash-Based Random-Integer find Find Timing Test
  3. Hash-Based Random-Integer Subscript Find Timing Test
  4. Hash-Based Random-Integer Subscript Insert Timing Test
  5. Hash-Based Skewed-Distribution Random-Integer find Find Timing Test
  6. Hash-Based Erase Memory Use Test

Tree-Like-Based Containers

  1. Tree-Based and Trie-Based Text Insert Timing Test
  2. Tree-Based and Trie-Based Text find Find Timing Test
  3. Tree-Based Locality-of-Reference Text find Find Timing Test
  4. Tree-Based Random-Integer find Find Timing Test
  5. Tree Split and Join Timing Test
  6. Tree Order-Statistics Timing Test

Multimaps

  1. "Multimap" Text Find Timing Test with Small Average Secondary-Key to Primary-Key Ratio
  2. "Multimap" Text Find Timing Test with Large Average Secondary-Key to Primary-Key Ratio
  3. "Multimap" Text Insert Timing Test with Small Average Secondary-Key to Primary-Key Ratio
  4. "Multimap" Text Insert Timing Test with Large Average Secondary-Key to Primary-Key Ratio
  5. "Multimap" Text Insert Memory-Use Test with Small Average Secondary-Key to Primary-Key Ratio
  6. "Multimap" Text Insert Memory-Use Test with Large Average Secondary-Key to Primary-Key Ratio

Observations

Underlying Data-Structure Families

In general, hash-based containers (see Design::Associative Containers::Hash-Based Containers) have better timing performance than containers based on other underlying data structures. The main reason to choose a tree-based (see Design::Associative Containers::Tree-Based Containers) or trie-based container (see Design::Associative Containers::Trie-Based Containers) is if a byproduct of the tree-like structure is required: either order preservation, or the ability to utilize node invariants (see Design::Associative Containers::Tree-Based Containers::Node Invariants and Design::Associative Containers::Trie-Based Containers::Node Invariants). If memory use is the major factor, an ordered-vector tree (see Design::Associative Containers::Tree-Based Containers) gives optimal results (albeit with high modification costs), and a list-based container (see Design::Associative Containers::List-Based Containers) gives reasonable results.

Hash-Based Container Types

Hash-based containers are typically either collision-chaining or probing (see Design::Associative Containers::Hash-Based Containers). Collision-chaining containers are more flexible internally, and so offer better timing performance. Probing containers, if used for simple value-types, manage memory more efficiently (they perform far fewer allocation-related calls). In general, therefore, a collision-chaining table should be used. A probing container, conversely, might be used efficiently for operations such as eliminating duplicates in a sequence, or counting the number of occurrences within a sequence. Probing containers might also be more useful in multithreaded applications where each thread manipulates a hash-based container: in the STL, allocators have class-wise semantics (see [meyers96more] - Item 10); a probing container might incur less contention in this case.

Hash-Based Containers' Policies

In hash-based containers, the range-hashing scheme (see Design::Associative Containers::Hash-Based Containers::Hash Policies) seems to affect performance more than other considerations. In most settings, a mask-based scheme works well (or can be made to work well). If the key distribution can be estimated a priori, a simple hash function can produce nearly uniform hash-value distribution. In many other cases (e.g., text hashing, floating-point hashing), the hash function is powerful enough to generate hash values with good uniformity properties [knuth98sorting]; a modulo-based scheme, taking into account all bits of the hash value, appears to overlap the hash function in its effort.

The range-hashing scheme determines many of the other policies (see Design::Hash-Based Containers::Policy Interaction). A mask-based scheme works well with an exponential-size policy (see Design::Associative Containers::Hash-Based Containers::Resize Policies); for probing-based containers, it goes well with a linear-probe function (see Design::Associative Containers::Hash-Based Containers::Hash Policies).

An orthogonal consideration is the trigger policy (see Design::Associative Containers::Hash-Based Containers::Resize Policies). This presents difficult tradeoffs. E.g., different load factors in a load-check trigger policy yield a space/amortized-cost tradeoff.

Tree-Like-Based Container Types

In general, there are several families of tree-based underlying data structures: balanced node-based trees (e.g., red-black or AVL trees), high-probability balanced node-based trees (e.g., random treaps or skip-lists), competitive node-based trees (e.g., splay trees), vector-based "trees", and tries. (Additionally, there are disk-residing or network-residing trees, such as B-Trees and their numerous variants. An interface for this would have to deal with the execution model and ACID guarantees; this is out of the scope of this library.) Following are some observations on their application to different settings.

Of the balanced node-based trees, this library includes a red-black tree (see Design::Associative Containers::Tree-Based Containers), as does STL (in practice). This type of tree is the "workhorse" of tree-based containers: it offers both reasonable modification and reasonable lookup time. Unfortunately, this data structure stores a huge amount of metadata. Each node must contain, besides a value, three pointers and a boolean. This type might be avoided if space is at a premium [austern00noset].

High-probability balanced node-based trees suffer the drawbacks of deterministic balanced trees. Although they are fascinating data structures, preliminary tests with them showed performance worse than that of red-black trees. The library therefore does not contain any such trees.

Competitive node-based trees have two drawbacks. They are usually somewhat unbalanced, and they perform a large number of comparisons. Balanced trees perform one comparison for each node they encounter on a search path; a splay tree performs two comparisons. If the keys are complex objects, e.g., std::string, this can increase the running time. Conversely, such trees do well when there is much locality of reference. It is difficult to determine in which cases to prefer such trees over balanced trees. This library includes a splay tree (see Design::Associative Containers::Tree-Based Containers).

Ordered-vector trees (see Design::Associative Containers::Tree-Based Containers) use very little space [austern00noset]. They do not have any other advantages (at least in this implementation).

Large-fan-out PATRICIA tries (see Design::Associative Containers::Trie-Based Containers) have excellent lookup performance, but they do so through maintaining, for each node, a miniature "hash-table". Their space efficiency is low, and their modification performance is bad. These tries might be used for semi-static settings, where order preservation is important. Alternatively, red-black trees cross-referenced with hash tables can be used. [okasaki98mereable] discusses small-fan-out PATRICIA tries for integers, but the cited results seem to indicate that the amortized cost of maintaining such trees is higher than that of balanced trees. Moderate-fan-out trees might be useful for sequences where each element has a limited number of choices, e.g., DNA strings (see Examples::Associative Containers::Trie-Based Containers).

Mapping-Semantics Considerations

Different mapping semantics were discussed in Motivation::Associative Containers::Alternative to Multiple Equivalent Keys and Tutorial::Associative Containers::Associative Containers Other than Maps. We will focus here on the case where each key can be decomposed into a primary key and a secondary key. (In the case where some keys are completely identical, it is trivial that one should use an associative container mapping values to size types.) In this case there are (at least) five possibilities:

  1. Use an associative container that allows equivalent-key values (such as std::multimap)
  2. Use a unique-key value associative container that maps each primary key to some complex associative container of secondary keys, say a tree-based or hash-based container (see Design::Associative Containers::Tree-Based Containers and Design::Associative Containers::Hash-Based Containers)
  3. Use a unique-key value associative container that maps each primary key to some simple associative container of secondary keys, say a list-based container (see Design::Associative Containers::List-Based Containers)
  4. Use a unique-key value associative container that maps each primary key to some non-associative container (e.g., std::vector)
  5. Use a unique-key value associative container that takes into account both primary and secondary keys.

We do not think there is a simple answer for this (excluding option 1, which we think should be avoided in all cases).

If the expected ratio of secondary keys to primary keys is small, then 3 and 4 seem reasonable. Both types of secondary containers are relatively lightweight (in terms of memory use and construction time), and so creating an entire container object for each primary key is not too expensive. Option 4 might be preferable to option 3 if changing the secondary key of some primary key is frequent - one cannot modify an associative container's key, so the only possibility is erasing the secondary key and inserting another one instead; a non-associative container, conversely, can support in-place modification. The actual cost of erasing a secondary key and inserting another one depends also on the allocator used for the secondary associative containers (the tests above used the standard allocator; in practice one might choose to use, e.g., [boost_pool]). Option 2 is definitely overkill in this case. Option 1 loses out either immediately (when there is one secondary key per primary key) or almost immediately after that. Option 5 has the same drawbacks as option 2, but it has the additional drawback that finding all values whose primary key is equivalent to some key might be linear in the total number of values stored (for example, if using a hash-based container).

If the expected ratio of secondary keys to primary keys is large, then the answer is more complicated. It depends on the distribution of secondary keys to primary keys, the distribution of accesses according to primary keys, and which types of operations are most frequent.

To be more precise, assume there are m primary keys, primary key i is mapped to n_i secondary keys, and each primary key is mapped, on average, to n secondary keys (i.e., E(n_i) = n).

Suppose one wants to find a specific pair of primary and secondary keys. Using 1 with a tree-based container (std::multimap), the expected cost is E(Θ(log(m) + n_i)) = Θ(log(m) + n); using 1 with a hash-based container (std::tr1::unordered_multimap), the expected cost is Θ(n). Using 2 with a primary hash-based container and secondary hash-based containers, the expected cost is O(1); using 2 with a primary tree-based container and secondary tree-based containers, the expected cost is (using Jensen's inequality [motwani95random]) E(O(log(m) + log(n_i))) = O(log(m)) + E(O(log(n_i))) = O(log(m)) + O(log(n)), assuming that primary keys are accessed equiprobably. 3 and 4 are similar to 1, but with lower constants. Using 5 with a hash-based container, the expected cost is O(1); using 5 with a tree-based container, the cost is E(Θ(log(mn))) = Θ(log(m) + log(n)).
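Collected in one place, the expected costs of finding a specific (primary, secondary) pair are (same symbols as in the text: m primary keys, primary key i mapped to n_i secondary keys, E(n_i) = n):

```latex
% Expected cost of finding a specific (primary, secondary) pair.
\begin{align*}
\text{(1) tree-based:}                       \quad & E\left[\Theta(\log m + n_i)\right] = \Theta(\log m + n)\\
\text{(1) hash-based:}                       \quad & \Theta(n)\\
\text{(2) hash-based primary and secondary:} \quad & O(1)\\
\text{(2) tree-based primary and secondary:} \quad &
    E\left[O(\log m + \log n_i)\right]
  = O(\log m) + E\left[O(\log n_i)\right]
  = O(\log m) + O(\log n)\\
\text{(5) hash-based:}                       \quad & O(1)\\
\text{(5) tree-based:}                       \quad & E\left[\Theta(\log(mn))\right] = \Theta(\log m + \log n)
\end{align*}
```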

Suppose one needs the values whose primary key matches some given key. Using 1 with a hash-based container, the expected cost is Θ(n), but the values will not be ordered by secondary keys (which may or may not be required); using 1 with a tree-based container, the expected cost is Θ(log(m) + n), but with high constants; again the values will not be ordered by secondary keys. 2, 3, and 4 are similar to 1, but typically with lower constants (and, additionally, if one uses a tree-based container for secondary keys, they will be ordered). Using 5 with a hash-based container, the cost is Θ(mn).

Suppose one wants to assign to a primary key all secondary keys assigned to a different primary key. Using 1 with a hash-based container, the expected cost is Θ(n), but with very high constants; using 1 with a tree-based container, the cost is Θ(n log(mn)). Using 2, 3, and 4, the expected cost is Θ(n), but typically with far lower constants than 1. 5 is similar to 1.