[Bug libstdc++/50257] New: unordered_map slow initialization due to huge __prime_list

justin at fathomdb dot com gcc-bugzilla@gcc.gnu.org
Thu Sep 1 00:42:00 GMT 2011


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50257

             Bug #: 50257
           Summary: unordered_map slow initialization due to huge
                    __prime_list
    Classification: Unclassified
           Product: gcc
           Version: 4.6.1
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: libstdc++
        AssignedTo: unassigned@gcc.gnu.org
        ReportedBy: justin@fathomdb.com


The constructor for std::unordered_map calls _Prime_rehash_policy::_M_next_bkt
to determine the initial size of the hashtable.  That method computes the size
by looking for the next largest number in a large list of primes (in
src/hashtable-aux.cc), which it does using lower_bound.  However, this list is
very big - I counted 310 elements of 8 bytes each on a 64-bit machine, or
~2.4KB - and it is accessed in a way (lower_bound, i.e. binary search) that
probably causes lots of cache misses.  This lower_bound call shows up as a
performance hotspot in my profiles.
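For reference, here is a simplified sketch of the lookup _M_next_bkt performs
(this is not the actual libstdc++ source; the table is abbreviated and the
names are mine):

```cpp
#include <algorithm>
#include <iterator>

// Simplified sketch of _Prime_rehash_policy::_M_next_bkt: binary-search a
// sorted prime table for the first prime >= n.  The real __prime_list has
// ~310 entries; this abbreviated sketch assumes n does not exceed the last
// entry.
constexpr unsigned long prime_list[] = {
  2ul, 3ul, 5ul, 7ul, 11ul, 13ul, 17ul, 19ul, 23ul, 29ul, 31ul,
  37ul, 41ul, 43ul, 47ul, 53ul, 59ul, 61ul, 67ul, 71ul, 73ul, 79ul
};

unsigned long next_bkt(unsigned long n) {
  // lower_bound probes O(log N) scattered entries of the table,
  // which is where the cache misses come from.
  return *std::lower_bound(std::begin(prime_list),
                           std::end(prime_list), n);
}
```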

I think that:

1) the initial granularity is probably overkill.  The initial entries are:
 2ul, 3ul, 5ul, 7ul, 11ul, 13ul, 17ul, 19ul, 23ul, 29ul, 31ul,
    37ul, 41ul, 43ul, 47ul, 53ul, 59ul, 61ul, 67ul, 71ul, 73ul, 79ul,
    83ul, 89ul, 97ul, 103ul, 109ul, 113ul, 127ul, 137ul, 139ul, 149ul,
    157ul, 167ul, 179ul, 193ul, 199ul, 211ul, 227ul, 241ul, 257ul,...
Overall performance would probably be better/comparable with e.g. 11ul, 31ul,
127ul, 241ul.

2) we should special-case the first values, or (at the very least) the default
value used when no size is given in the unordered_map constructor (I think
that's 10).
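A hypothetical illustration of that special-casing (the cutoff of 11 and the
table below are just examples, not proposed code):

```cpp
#include <algorithm>
#include <iterator>

// Hypothetical fast path: short-circuit small requests - including the
// constructor's default bucket-count hint of 10 - before falling back to
// the lower_bound search over the (abbreviated) prime table.
constexpr unsigned long small_primes[] = {
  13ul, 17ul, 19ul, 23ul, 29ul, 31ul, 37ul, 41ul, 43ul, 47ul, 53ul
};

unsigned long next_bkt(unsigned long n) {
  if (n <= 11)  // covers the default hint without touching the table
    return 11;
  return *std::lower_bound(std::begin(small_primes),
                           std::end(small_primes), n);
}
```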

I also found this prime list, which looks a lot more reasonable in terms of the
density: http://gcc.gnu.org/ml/gcc-patches/2009-05/msg01293.html

One of the problems is that this list of primes serves three purposes:
1) Initial allocation when no size is specified (for which it is slow)
2) Initial allocation when the user is specifying the size (when we probably
want a size as close as possible to the target number)
3) Resizing (when again we want a size as close as possible to the size we've
computed)

I think that this is probably a good list for purposes #2 and #3, but it is
pretty painful for #1.  Unfortunately, #1 is probably the most common use
case, I would think.  An alternative fix would be to have two different
_Prime_rehash_policy strategies: one for people that know the size of their
hashtable, and one for people that just want sensible defaults.  I don't think
_Prime_rehash_policy is a good match for the second group, and I suspect the
second group is the majority.
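As a sketch of what that second, table-free strategy could look like (purely
illustrative; power-of-two bucket counts are only one possible choice, and
they assume a hash function that mixes its low bits well, a trade-off the
prime-based policy avoids):

```cpp
#include <cstddef>

// Hypothetical "sensible defaults" policy: O(1), no prime table, no
// cache-missing binary search.  Not actual libstdc++ code.
struct default_rehash_policy {
  std::size_t next_bkt(std::size_t n) const {
    std::size_t b = 1;
    while (b < n)
      b <<= 1;          // round up to the next power of two
    return b;
  }
};
```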

Is this the right place to raise this, or would the mailing list be better?
