Hashtable Small size optimization
François Dumont
frs.dumont@gmail.com
Mon Oct 15 20:46:00 GMT 2018
I started considering PR libstdc++/68303.
First thing was to write a dedicated performance test case, it is the
unordered_small_size.cc I'd like to add with this patch.
The first runs show a major difference between tr1 and std
implementations, tr1 being much better:
std::tr1::unordered_set<int> without hash code cached: 1st insert    Â
9r   9u   1s 14725920mem   0pf
std::tr1::unordered_set<int> with hash code cached: 1st insert    Â
8r   9u   0s 14719680mem   0pf
std::unordered_set<int> without hash code cached: 1st insert    17r Â
17u   0s 16640080mem   0pf
std::unordered_set<int> with hash code cached: 1st insert    14r Â
14u   0s 16638656mem   0pf
I had a look in gdb to find out why and the answer was quite obvious.
For 20 insertions tr1 implementation bucket count goes through [11, 23]
whereas for std it is [2, 5, 11, 23], so 2 more expensive rehash.
As unordered containers are dedicated to rather important number of
elements I propose to review the rehash policy with this patch so that
std also starts at 11 on the 1st insertion. After the patch figures are:
std::tr1::unordered_set<int> without hash code cached: 1st insert    Â
9r   9u   0s 14725920mem   0pf
std::tr1::unordered_set<int> with hash code cached: 1st insert    Â
8r   8u   0s 14719680mem   0pf
std::unordered_set<int> without hash code cached: 1st insert    15r Â
15u   0s 16640128mem   0pf
std::unordered_set<int> with hash code cached: 1st insert    12r Â
12u   0s 16638688mem   0pf
Moreover I noticed that performance tests are built with -O2, is that
intentional ? The std implementation uses more abstractions than tr1,
looks like building with -O3 optimizes away most of those abstractions
making tr1 and std implementation much closer:
std::tr1::unordered_set<int> without hash code cached: 1st insert    Â
2r   1u   1s 14725952mem   0pf
std::tr1::unordered_set<int> with hash code cached: 1st insert    Â
2r   1u   0s 14719536mem   0pf
std::unordered_set<int> without hash code cached: 1st insert     2r  Â
2u   0s 16640064mem   0pf
std::unordered_set<int> with hash code cached: 1st insert     2r  Â
2u   0s 16638608mem   0pf
Note that this patch also rework the alternative rehash policy based on
powers of 2 so that it also starts with a larger number of bucket (16)
and respects LWG2156.
Last I had to wider the memory column so that alignment is preserved
even when memory diff is negative.
Tested under Linux x86_64.
Ok to commit ?
François
-------------- next part --------------
A non-text attachment was scrubbed...
Name: hashtable_rehash.patch
Type: text/x-patch
Size: 12092 bytes
Desc: not available
URL: <http://gcc.gnu.org/pipermail/libstdc++/attachments/20181015/1eea191f/attachment.bin>
More information about the Libstdc++
mailing list