hashtable optimization

François Dumont frs.dumont@gmail.com
Sun Feb 22 18:30:00 GMT 2015


On 20/02/2015 14:22, Jonathan Wakely wrote:
> On 18/02/15 10:35 +0100, François Dumont wrote:
>> Hello
>>
>>    I am still studying hashtable performances and especially how to 
>> reduce overhead compared to tr1 implementation. Most of the overhead 
>> is coming from the additional modulo operations required with the new 
>> data model. Having a closer look at PR 57885 bench I realized that we 
>> can quite easily avoid an important modulo operation in 
>> _M_insert_bucket_begin thanks to an additional std::size_t in the 
>> container.
>>
>>    The patch is quite straightforward, it optimizes insertion of a 
>> node in an empty bucket which is the most important kind of insertion 
>> as long as hash functor is doing a good job as per default we should 
>> have at most 1 element per buckets:
>>
>> Without patch:
>> Container:std::unordered_map<int,int>  Key:int
>> Insertion: 1106 671 634 634 635  min=634 max=1106
>>
>> Container:std::tr1::unordered_map<int,int>  Key:int
>> Insertion: 885 491 487 490 511  min=487 max=885
>>
>> With patch:
>> Container:std::unordered_map<int,int>  Key:int
>> Insertion: 1099 581 583 584 592  min=581 max=1099
>>
>> Container:std::tr1::unordered_map<int,int>  Key:int
>> Insertion: 889 491 519 492 493  min=491 max=889
>>
>>    I prefer to propose it now because it impacts ABI.
>>
>> 2015-02-19  François Dumont  <fdumont@gcc.gnu.org>
>>
>>    * include/bits/hashtable.h (_Hashtable<>::_M_bbegin_bkt): New, bucket
>>    pointing to _M_before_begin.
>>    (_Hashtable<>): Maintain and use later.
>>
>> Tested under Linux x86_64.
>
> I'm nervous about this kind of change at this stage. The patch looks
> right, but are you sure you've covered *every* case where the cached
> value needs to be updated?
>
> Since this only affects performance, not correctness, it might be hard
> to convince the release managers it is necessary during stage 4, and
> I'm not entirely convinced myself either.
>
     Ok, I understand, lets keep it for later then. I am also still 
looking for the best implementation, maybe me or someone else will come 
with the perfect one having tr1 performances for insert/lookup and std 
performances for iteration/erase.

François



More information about the Libstdc++ mailing list