Hashtable Small size optimization

François Dumont frs.dumont@gmail.com
Mon Oct 15 20:46:00 GMT 2018


I started considering PR libstdc++/68303.

First thing was to write a dedicated performance test case, it is the 
unordered_small_size.cc I'd like to add with this patch.

The first runs show a major difference between tr1 and std 
implementations, tr1 being much better:

std::tr1::unordered_set<int> without hash code cached: 1st insert       
9r    9u    1s  14725920mem    0pf
std::tr1::unordered_set<int> with hash code cached: 1st insert       
8r    9u    0s  14719680mem    0pf
std::unordered_set<int> without hash code cached: 1st insert      17r   
17u    0s  16640080mem    0pf
std::unordered_set<int> with hash code cached: 1st insert      14r   
14u    0s  16638656mem    0pf

I had a look in gdb to find out why and the answer was quite obvious. 
For 20 insertions tr1 implementation bucket count goes through [11, 23] 
whereas for std it is [2, 5, 11, 23], so 2 more expensive rehash.

As unordered containers are dedicated to rather important number of 
elements I propose to review the rehash policy with this patch so that 
std also starts at 11 on the 1st insertion. After the patch figures are:

std::tr1::unordered_set<int> without hash code cached: 1st insert       
9r    9u    0s  14725920mem    0pf
std::tr1::unordered_set<int> with hash code cached: 1st insert       
8r    8u    0s  14719680mem    0pf
std::unordered_set<int> without hash code cached: 1st insert      15r   
15u    0s  16640128mem    0pf
std::unordered_set<int> with hash code cached: 1st insert      12r   
12u    0s  16638688mem    0pf

Moreover I noticed that performance tests are built with -O2, is that 
intentional ? The std implementation uses more abstractions than tr1, 
looks like building with -O3 optimizes away most of those abstractions 
making tr1 and std implementation much closer:

std::tr1::unordered_set<int> without hash code cached: 1st insert       
2r    1u    1s  14725952mem    0pf
std::tr1::unordered_set<int> with hash code cached: 1st insert       
2r    1u    0s  14719536mem    0pf
std::unordered_set<int> without hash code cached: 1st insert       2r    
2u    0s  16640064mem    0pf
std::unordered_set<int> with hash code cached: 1st insert       2r    
2u    0s  16638608mem    0pf

Note that this patch also rework the alternative rehash policy based on 
powers of 2 so that it also starts with a larger number of bucket (16) 
and respects LWG2156.

Last I had to wider the memory column so that alignment is preserved 
even when memory diff is negative.

Tested under Linux x86_64.

Ok to commit ?

François


-------------- next part --------------
A non-text attachment was scrubbed...
Name: hashtable_rehash.patch
Type: text/x-patch
Size: 12092 bytes
Desc: not available
URL: <http://gcc.gnu.org/pipermail/libstdc++/attachments/20181015/1eea191f/attachment.bin>


More information about the Libstdc++ mailing list