Hashtable Small size optimization

Mon Oct 15 20:46:00 GMT 2018

I started considering PR libstdc++/68303.

The first thing was to write a dedicated performance test case: 
unordered_small_size.cc, which I'd like to add with this patch.

The first runs show a major difference between tr1 and std 
implementations, tr1 being much better:

std::tr1::unordered_set<int> without hash code cached: 1st insert     9r    9u    1s  14725920mem    0pf
std::tr1::unordered_set<int> with hash code cached: 1st insert        8r    9u    0s  14719680mem    0pf
std::unordered_set<int> without hash code cached: 1st insert         17r   17u    0s  16640080mem    0pf
std::unordered_set<int> with hash code cached: 1st insert            14r   14u    0s  16638656mem    0pf

I had a look in gdb to find out why, and the answer was quite obvious. 
For 20 insertions the tr1 implementation's bucket count goes through 
[11, 23], whereas for std it goes through [2, 5, 11, 23], so std performs 
2 more expensive rehashes.

As unordered containers are intended for a rather large number of 
elements, I propose to revise the rehash policy with this patch so that 
std also starts at 11 buckets on the 1st insertion. After the patch the 
figures are:

std::tr1::unordered_set<int> without hash code cached: 1st insert     9r    9u    0s  14725920mem    0pf
std::tr1::unordered_set<int> with hash code cached: 1st insert        8r    8u    0s  14719680mem    0pf
std::unordered_set<int> without hash code cached: 1st insert         15r   15u    0s  16640128mem    0pf
std::unordered_set<int> with hash code cached: 1st insert            12r   12u    0s  16638688mem    0pf

Moreover, I noticed that the performance tests are built with -O2; is 
that intentional? The std implementation uses more abstractions than tr1, 
and it looks like building with -O3 optimizes away most of those 
abstractions, making the tr1 and std implementations much closer:

std::tr1::unordered_set<int> without hash code cached: 1st insert     2r    1u    1s  14725952mem    0pf
std::tr1::unordered_set<int> with hash code cached: 1st insert        2r    1u    0s  14719536mem    0pf
std::unordered_set<int> without hash code cached: 1st insert          2r    2u    0s  16640064mem    0pf
std::unordered_set<int> with hash code cached: 1st insert             2r    2u    0s  16638608mem    0pf

Note that this patch also reworks the alternative rehash policy based on 
powers of 2 so that it also starts with a larger number of buckets (16) 
and respects LWG2156.

Lastly, I had to widen the memory column so that alignment is preserved 
even when the memory diff is negative.

Tested under Linux x86_64.

OK to commit?

François

-------------- next part --------------
A non-text attachment was scrubbed...
Name: hashtable_rehash.patch
Type: text/x-patch
Size: 12092 bytes
Desc: not available
URL: <http://gcc.gnu.org/pipermail/libstdc++/attachments/20181015/1eea191f/attachment.bin>