[PATCH][Hashtable 5/6] Remove H1/H2 template parameters

François Dumont frs.dumont@gmail.com
Thu Aug 6 06:35:25 GMT 2020


On 17/07/20 1:35 pm, Jonathan Wakely wrote:
> On 19/12/19 20:22 +0100, François Dumont wrote:
>> Because of this change printers.py has to be updated too.
>>
>> François
>>
>>
>> On 11/17/19 10:15 PM, François Dumont wrote:
>>> H1 used to be a reference to the user Hash, now _Hashtable and 
>>> unordered types agree on the same Hash type which is more intuitive.
>>>
>>> I also chose to not support anymore a stateful ranged hash functor. 
>>> We only use _Mod_range_hashing and _Mask_range_hashing.
>>>
>>> Thanks to this simplification _M_bucket_index can also be simplified.
>>>
>>>     * include/bits/hashtable_policy.h (_Hashtable<>): Remove _H1 
>>> and _H2
>>>     template parameters.
>>>     (_Hastable_base<>): Likewise.
>>>     (_Default_ranged_hash): Remove.
>>>     (_Prime_rehash_policy::__ranged_hash): New.
>>>     (_Power2_rehash_policy::__ranged_hash): New.
>>>     (_Map_base<>): Remove _H1 and _H2 template parameters.
>>>     (_Insert_base<>): Likewise.
>>>     (_Insert<>): Likewise.
>>>     (_Rehash_base<>): Likewise.
>>>     (_Local_iterator_base<>): Remove _H1 and _H2 template 
>>> parameters and add
>>>     _RangedHash.
>>>     (_Hash_code_base<>): Likewise.
>>>     (_Hash_code_base<_Key, _Value, _ExtractKey, _H1, _H2, _Hash,
>>>     __hash_not_cached_t>): Remove.
>>>     (_Hash_code_base<>::_M_bucket_index(const _Key&, __hash_code, 
>>> size_t)):
>>>     Replace by...
>>>     (_Hash_code_base<>::_M_bucket_index(__hash_code, size_t)): 
>>> ...this.
>>>     (_Local_iterator<>): Remove _H1 and _H2 template parameters.
>>>     (_Local_const_iterator<>): Likewise.
>>>     (_Equality<>): Likewise.
>>>     * include/bits/hashtable.h (_Hashtable<>): Remove _H1 and _H2 
>>> template
>>>     parameters.
>>>     * include/bits/node_handle.h: Adapt.
>>>     * include/bits/unordered_map.h: Adapt.
>>>     * include/bits/unordered_set.h: Adapt.
>>>     * testsuite/23_containers/unordered_set/hash_policy/26132.cc: 
>>> Adapt.
>>>     * testsuite/23_containers/unordered_set/hash_policy/71181.cc: 
>>> Adapt.
>>>     * 
>>> testsuite/23_containers/unordered_set/hash_policy/load_factor.cc:
>>>     Adapt.
>>>     * 
>>> testsuite/23_containers/unordered_set/hash_policy/rehash.cc: Adapt.
>>>     * 
>>> testsuite/23_containers/unordered_set/insert/hash_policy.cc: Adapt.
>>>     * 
>>> testsuite/23_containers/unordered_set/max_load_factor/robustness.cc:
>>>     Adapt.
>>>     * testsuite/performance/23_containers/insert/54075.cc: Adapt.
>>>     * testsuite/performance/23_containers/insert_erase/41975.cc: 
>>> Adapt.
>>>     * testsuite/performance/23_containers/insert_erase/
>>>     unordered_small_size.cc: Adapt.
>>>
>>> Tested under Linux x86_64.
>>>
>>> François
>>>
>>>
>>
>
>> diff --git a/libstdc++-v3/include/bits/hashtable.h 
>> b/libstdc++-v3/include/bits/hashtable.h
>> index 41be821782d..9dadc62e328 100644
>> --- a/libstdc++-v3/include/bits/hashtable.h
>> +++ b/libstdc++-v3/include/bits/hashtable.h
>> @@ -69,31 +69,18 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>>    *  and returns a bool-like value that is true if the two objects
>>    *  are considered equal.
>>    *
>> -   *  @tparam _H1  The hash function. A unary function object with
>> +   *  @tparam _Hash  The hash function. A unary function object with
>
> Renaming this to _Hash is great, that makes the code much easier to
> understand for a casual reader (like me every time I have to remember
> how our hash tables work).
>
> It makes much more sense for the _Hash parameter of unordered_set and
> the _Hash parameter of _Hashtable to be the same thing!
>
>>    *  argument type _Key and result type size_t. Return values should
>>    *  be distributed over the entire range [0, 
>> numeric_limits<size_t>:::max()].
>>    *
>> -   *  @tparam _H2  The range-hashing function (in the terminology of
>> -   *  Tavori and Dreizin).  A binary function object whose argument
>> -   *  types and result type are all size_t.  Given arguments r and N,
>> -   *  the return value is in the range [0, N).
>> -   *
>> -   *  @tparam _Hash  The ranged hash function (Tavori and Dreizin). A
>> -   *  binary function whose argument types are _Key and size_t and
>> -   *  whose result type is size_t.  Given arguments k and N, the
>> -   *  return value is in the range [0, N).  Default: hash(k, N) =
>> -   *  h2(h1(k), N).  If _Hash is anything other than the default, _H1
>> -   *  and _H2 are ignored.
>
> But I would prefer to keep these parameters, and just rename them to
> _Unused1 and _Unused2 or something like that. Maybe _U1 and _U2.
>
> You can still get rid of them as constructor parameters and get rid of
> the base classes using those types. Just keep them in the template
> argument lists, so that we keep generating the same mangled names for
> specializations of _Hashtable.
>
> That way you don't need to change printers.py (which is good, because
> that change to printers.py is not OK anyway - it would make it
> impossible to debug code built with older versions of GCC, because the
> printers would only support the new layout - we need to support
> debugging programs that were not all built by the same version of
> GCC).
>
>> @@ -168,19 +155,19 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>>    */
>>   template<typename _Key, typename _Value, typename _Alloc,
>>        typename _ExtractKey, typename _Equal,
>> -       typename _H1, typename _H2, typename _Hash,
>> -       typename _RehashPolicy, typename _Traits>
>> +       typename _Hash, typename _RehashPolicy, typename _Traits>
>
> So to be clear, what I'm suggesting is:
>
>   template<typename _Key, typename _Value, typename _Alloc,
>        typename _ExtractKey, typename _Equal,
>        typename _Hash, typename _u1, typename _U2,
>        typename _RehashPolicy, typename _Traits>
>     class _Hashtable
>     : public __detail::_Hashtable_base<_Key, _Value, _ExtractKey, _Equal,
>                        _Hash, _U1, _U2, _Traits>,
>
> So the parameters are still there at each level of the hierarchy.
>
>> @@ -456,15 +434,14 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>>       void
>>       _M_reset() noexcept;
>>
>> -      _Hashtable(const _H1& __h1, const _H2& __h2, const _Hash& __h,
>> -         const _Equal& __eq, const _ExtractKey& __exk,
>> +      _Hashtable(const _Hash& __h, const _Equal& __eq, const 
>> _ExtractKey& __exk,
>>          const allocator_type& __a)
>> -      : __hashtable_base(__exk, __h1, __h2, __h, __eq),
>> +      : __hashtable_base(__exk, __h, __eq),
>>     __hashtable_alloc(__node_alloc_type(__a))
>>       { }
>
> This change is still fine. There's no point passing around objects of
> the types that will now be unused.
>
>> diff --git a/libstdc++-v3/include/bits/hashtable_policy.h 
>> b/libstdc++-v3/include/bits/hashtable_policy.h
>> index 5cc943b3d22..11ea47b322e 100644
>> --- a/libstdc++-v3/include/bits/hashtable_policy.h
>> +++ b/libstdc++-v3/include/bits/hashtable_policy.h
>> @@ -41,8 +41,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>>
>>   template<typename _Key, typename _Value, typename _Alloc,
>>        typename _ExtractKey, typename _Equal,
>> -       typename _H1, typename _H2, typename _Hash,
>> -       typename _RehashPolicy, typename _Traits>
>> +       typename _Hash, typename _RehashPolicy,
>> +       typename _Traits>
>>     class _Hashtable;
>>
>> namespace __detail
>> @@ -54,7 +54,8 @@ namespace __detail
>>    */
>>   template<typename _Key, typename _Value,
>>        typename _ExtractKey, typename _Equal,
>> -       typename _H1, typename _H2, typename _Hash, typename _Traits>
>> +       typename _Hash, typename _RangedHash,
>> +       typename _Traits>
>>     struct _Hashtable_base;
>>
>>   // Helper function: return distance(first, last) for forward
>> @@ -442,18 +443,12 @@ namespace __detail
>>     { return __num % __den; }
>>   };
>>
>> -  /// Default ranged hash function H.  In principle it should be a
>> -  /// function object composed from objects of type H1 and H2 such that
>> -  /// h(k, N) = h2(h1(k), N), but that would mean making extra 
>> copies of
>> -  /// h1 and h2.  So instead we'll just use a tag to tell class 
>> template
>> -  /// hashtable to do that composition.
>> -  struct _Default_ranged_hash { };
>
> We'd still need to keep this empty type, so that it's still used by
> the aliases like __umap_hashtable and still appears in mangled names.
>
> Does my suggestion make sense?
>
> I really like the general idea of getting rid of some of the
> complexity and not supporting infinite customization. But we can do
> that without changing mangled names of the _Hashtable specialiations.
>
>
I didn't thought we need to keep abi compatibility for extensions.

So yes, I'll do as you advised and re-submit the patch then.

François



More information about the Libstdc++ mailing list