hashtable optimization
François Dumont
frs.dumont@gmail.com
Sun Feb 22 18:30:00 GMT 2015
On 20/02/2015 14:22, Jonathan Wakely wrote:
> On 18/02/15 10:35 +0100, François Dumont wrote:
>> Hello
>>
>> I am still studying hashtable performances and especially how to
>> reduce overhead compared to tr1 implementation. Most of the overhead
>> is coming from the additional modulo operations required with the new
>> data model. Having a closer look at PR 57885 bench I realized that we
>> can quite easily avoid an important modulo operation in
>> _M_insert_bucket_begin thanks to an additional std::size_t in the
>> container.
>>
>> The patch is quite straightforward, it optimizes insertion of a
>> node in an empty bucket which is the most important kind of insertion
>> as long as hash functor is doing a good job as per default we should
>> have at most 1 element per buckets:
>>
>> Without patch:
>> Container:std::unordered_map<int,int> Key:int
>> Insertion: 1106 671 634 634 635 min=634 max=1106
>>
>> Container:std::tr1::unordered_map<int,int> Key:int
>> Insertion: 885 491 487 490 511 min=487 max=885
>>
>> With patch:
>> Container:std::unordered_map<int,int> Key:int
>> Insertion: 1099 581 583 584 592 min=581 max=1099
>>
>> Container:std::tr1::unordered_map<int,int> Key:int
>> Insertion: 889 491 519 492 493 min=491 max=889
>>
>> I prefer to propose it now because it impacts ABI.
>>
>> 2015-02-19 François Dumont <fdumont@gcc.gnu.org>
>>
>> * include/bits/hashtable.h (_Hashtable<>::_M_bbegin_bkt): New, bucket
>> pointing to _M_before_begin.
>> (_Hashtable<>): Maintain and use later.
>>
>> Tested under Linux x86_64.
>
> I'm nervous about this kind of change at this stage. The patch looks
> right, but are you sure you've covered *every* case where the cached
> value needs to be updated?
>
> Since this only affects performance, not correctness, it might be hard
> to convince the release managers it is necessary during stage 4, and
> I'm not entirely convinced myself either.
>
Ok, I understand, lets keep it for later then. I am also still
looking for the best implementation, maybe me or someone else will come
with the perfect one having tr1 performances for insert/lookup and std
performances for iteration/erase.
François
More information about the Libstdc++
mailing list