Unordered Associative

Hash Code

Hash Code Caching Policy

The unordered containers in libstdc++ may cache the hash code for each element alongside the element itself. In some cases not recalculating the hash code every time it's needed can improve performance, but the additional memory overhead can also reduce performance, so whether an unordered associative container caches the hash code or not depends on a number of factors. The caching policy for GCC 4.8 is described below.

The C++ standard requires that erase and swap operations must not throw exceptions. Those operations might need an element's hash code, but cannot use the hash function if it could throw. This means the hash codes will be cached unless the hash function has a non-throwing exception specification such as noexcept or throw().

Secondly, libstdc++ also needs the hash code in the implementation of local_iterator and const_local_iterator in order to know when the iterator has reached the end of the bucket. This means that the local iterator types will embed a copy of the hash function when possible. Because the local iterator types must be DefaultConstructible and CopyAssignable, if the hash function type does not model those concepts then it cannot be embedded and so the hash code must be cached. Note that a hash function might not be safe to use when default-constructed (e.g if it a function pointer) so a hash function that is contained in a local iterator won't be used until the iterator is valid, so the hash function has been copied from a correctly-initialized object.

If the hash function is non-throwing, DefaultConstructible and CopyAssignable then libstdc++ doesn't need to cache the hash code for correctness, but might still do so for performance if computing a hash code is an expensive operation, as it may be for arbitrarily long strings. As an extension libstdc++ provides a trait type to describe whether a hash function is fast. By default hash functions are assumed to be fast unless the trait is specialized for the hash function and the trait's value is false, in which case the hash code will always be cached. The trait can be specialized for user-defined hash functions like so:

      #include <unordered_set>

      struct hasher
      {
        std::size_t operator()(int val) const noexcept
        {
          // Some very slow computation of a hash code from an int !
          ...
        }
      }

      namespace std
      {
        template<>
          struct __is_fast_hash<hasher> : std::false_type
          { };
      }