New hashtable power 2 rehash policy

Sat May 14 16:14:00 GMT 2016

Le 28/04/2016 12:22, Jonathan Wakely a Ã©crit :
> On 23/04/16 10:27 +0200, FranÃ§ois Dumont wrote:
>> Hi
>>
>>    Here is a patch to introduce a new power of 2 based rehash policy. 
>> It enhances performance as it avoids modulo float operations. I have 
>> updated performance benches and here is the result:
>>
> Thanks, now that we're back in stage 1 we can make this change.
>
>
>>
>> 2016-04-22  FranÃ§ois Dumont <fdumont@gcc.gnu.org>
>>
>>    * include/bits/hashtable_policy.h
>>    (_Prime_rehash_policy::__has_load_factor): New. Mark rehash policy
>>    having load factor management.
>>    (_Mask_range_hashing): New.
>>    (_NextPower2<size_t>): New.
>>    (_Power2_rehash_policy): New.
>>    (_Inserts<>): Remove last template parameter _Unique_keys. Use the 
>> same
>>    implementation when keys are unique no matter if iterators are 
>> constant
>>    or not.
>
> I found this change description confusing, because it's really using
> the same implementation whether keys are unique or not, but this says
> "Use the same implementation whether iterators are constant or not".
>
> Shouldn't it be "Use the same implementation when iterators are
> constant, no matter if keys are unique or not" ?
>
> Maybe this would be clearer:
>
>    (_Inserts<>): Remove last template parameter, _Unique_keys, so
>    that partial specializations only depend on whether iterators are
>    constant or not.

Yes, sorry, I prepared this patch a long time ago and didn't reconsider 
it in enough details.

>
>> +  struct _Mask_range_hashing
>> +  {
>> +    typedef std::size_t first_argument_type;
>> +    typedef std::size_t second_argument_type;
>> +    typedef std::size_t result_type;
>> +
>> +    result_type
>> +    operator()(first_argument_type __num,
>> +           second_argument_type __den) const noexcept
>> +    { return __num & (__den - 1); }
>> +  };
>> +
>> +
>> +  /// Helper type to compute next power of 2.
>> +  template<typename _SizeT,
>> +       std::size_t _N = sizeof(_SizeT) << 2, bool _Increment = true>
>> +    struct _NextPower2
>> +    {
>> +      static _SizeT
>> +      _Get(_SizeT __n)
>> +      {
>> +    _SizeT __next = _NextPower2<_SizeT, (_N >> 1), false>::_Get(__n);
>> +    __next |= __next >> _N;
>> +    if (_Increment)
>> +      ++__next;
>> +
>> +    return __next;
>> +      }
>> +    };
>> +
>> +  template<typename _SizeT>
>> +    struct _NextPower2<_SizeT, 1, false>
>> +    {
>> +      static _SizeT
>> +      _Get(_SizeT __n)
>> +      {
>> +    --__n;
>> +    return __n |= __n >> 1;
>> +      }
>> +    };
>
> What's the reason to keep this recursive template instead of using a
> simple function like the clp2() we discussed? A simple function (which
> could be _GLIBCXX14_CONSTEXPR) compiles faster, and produces similar
> object code for the default -std=gnu++14 mode. And it doesn't require
> six calls to _NextPower2::_Get to calculate the result.

I though we could have a wide variety of std::size_t definition on 
different platforms. If we just need to consider 32/64 bits then it is 
fine. Note that I thought that gcc would optimize away the calls to 
_NextPower2::_Get.

>
> If you're worried about the final shift being unnecessary on 32-bit
> you can use the preprocessor, something like:
>
>  _GLIBCXX14_CONSTEXPR
>  std::size_t
>  __clp2(std::size_t n)
>  {
> #if __SIZEOF_SIZE_T__ >= 8
>    std::uint_fast64_t x = n;
> #else
>    std::uint_fast32_t x = n;
> #endif
>    // Algorithm from Hacker's Delight, Figure 3-3.
>    x = x - 1;
>    x = x | (x >> 1);
>    x = x | (x >> 2);
>    x = x | (x >> 4);
>    x = x | (x >> 8);
>    x = x | (x >>16);
> #if __SIZEOF_SIZE_T__ >= 8
>    x = x | (x >>32);
> #endif
>    return x + 1;
>  }
>
> I don't think we need to worry about 128-bit integers, because that
> would be a ridiculous number of buckets anyway. We can just limit
> __max_bkt so we don't have to deal with more than 2^63 buckets.

Adopted.

>> +  /// operations.
>> +  struct _Power2_rehash_policy
>> +  {
>> +    using __has_load_factor = std::true_type;
>> +
>> +    _Power2_rehash_policy(float __z = 1.0) noexcept
>> +    : _M_max_load_factor(__z), _M_next_resize(0) { }
>> +
>> +    float
>> +    max_load_factor() const noexcept
>> +    { return _M_max_load_factor; }
>> +
>> +    // Return a bucket size no smaller than n (as long as n is not 
>> above the
>> +    // highest power of 2).
>> +    std::size_t
>> +    _M_next_bkt(std::size_t __n) const
>> +    {
>> +      constexpr auto __max_bkt
>> +    = std::size_t(1) << ((sizeof(std::size_t) << 3) - 1);
>
> Should this be sizeof(size_t) * __CHAR_BITS__ instead of << 3 ?
> Otherwise targets with char wider than 8 bits would get a smaller
> maximum number of buckets, even if they have 64-bit size_t.

sizeof() returns number of char elements in the given type ? I thought 
it was the number of bytes and that a byte was always 8 bits. But I know 
that you know your stuff so fixed in the new patch.

>
> So, including the suggestion above to limit it to 2^63, it would be:
>
>     constexpr size_t __max_width = std::min(sizeof(size_t), 8);
>     constexpr auto __max_bkt
>      = std::size_t(1) << (__max_width * __CHAR_BITS__ - 1);

Ok,I just had to introduce a _GLIBCXX14_USE_CONSTEXPR to be able to 
define this expression as constexpr.
>
>> +
>> +      std::size_t __res = _NextPower2<std::size_t>::_Get(__n);
>> +
>> +      if (__res == 0)
>> +    __res = __max_bkt;
>> +
>> +      if (__res == __max_bkt)
>> +    // Set next resize to the max value so that we never try to 
>> rehash again
>> +    // as we already reach the biggest possible bucket number.
>> +    // Note that it might result in max_load_factor not being 
>> respected.
>> +    _M_next_resize = std::size_t(0) - 1;
>
> This can be simplified to size_t(-1), there's no need for arithmetic.

This was my mental representation of it, size_t(-1) adopted.

>
>
>> @@ -775,8 +907,7 @@
>>        typename _ExtractKey, typename _Equal,
>>        typename _H1, typename _H2, typename _Hash,
>>        typename _RehashPolicy, typename _Traits,
>> -       bool _Constant_iterators = _Traits::__constant_iterators::value,
>> -       bool _Unique_keys = _Traits::__unique_keys::value>
>> +       bool _Constant_iterators = _Traits::__constant_iterators::value>
>>     struct _Insert;
>>
>>   /// Specialization.
>
> (I'm sure I've said it before, but having 10+ template parameters here
> makes me sad).

Me too but there are quite some work to reduce it. It would sure be a 
very interesting design change to work on.

I took the time to adapt a number of test case to also cover this new 
hash policy.

New patch attached, tested under linux x86_64.

FranÃ§ois

-------------- next part --------------
A non-text attachment was scrubbed...
Name: hashtable.patch
Type: text/x-patch
Size: 26798 bytes
Desc: not available
URL: <http://gcc.gnu.org/pipermail/libstdc++/attachments/20160514/67849b89/attachment.bin>