[Bug libstdc++/54075] [4.7.1] unordered_map insert still slower than 4.6.2
Thu Nov 29 21:34:00 GMT 2012
On 11/23/2012 11:24 PM, Paolo Carlini wrote:
> On 11/23/2012 10:59 PM, François Dumont wrote:
>> But I think I won't spend any more time on this approach and will
>> instead prepare a small patch to change the default behavior regarding
>> the hash code. I agree with all your remarks, Jonathan, so I will use
>> them for the patch.
> First thanks a lot for all the time you are spending on this. That
> said, I want to make sure the following is recorded: as far as I can
> see in the current 54075.cc performance testcase (I also ran it
> myself) there isn't much difference between the cached / non-cached
> cases, right? That worries me a lot, because the whole discussion
> about caching, if I remember correctly, started when, based on some
> preliminary performance numbers:
Well, this test does show a difference.
> we were under the impression that essentially we had the reason of the
> performance problem we were seeing for std vs tr1 nailed. In other
> terms: we thought (or I should say: *I* thought?) that the insert
> performance of the current std code vs the tr1 code was the same for
> the same caching strategy.
This is what I think too...
> Whatever we do for caching we have still to sort out the real reason
> of the performance regression vs tr1 for multiple insertions, or am I
> utterly confused?
...except in this use case. We do indeed have a regression in collision
detection, and it seems logical to me given how the hashtable data
model has been modified.
For the moment I have no solution to this problem; the one I tried
recently required too much additional memory, leading to worse
performance. But the new data model also offers a number of
enhancements: it fixes PR 41975 and also improves management of the
unordered_multi* containers. This is why I think that for the moment we
should go ahead with the current implementation and simply change the
default cache policy. So here is a patch for that.
2012-11-30 François Dumont <firstname.lastname@example.org>
* include/bits/functional_hash.h (__is_fast_hash): New.
* include/bits/hashtable.h (__cache_default): Cache only if the
hash functor is not noexcept qualified, if it is slow, or if it
does not expose a default constructor.
* include/bits/hashtable_policy.h (_Local_iterator_base): Use
_Hashtable_ebo_helper to embed necessary functors into the
local_iterator. Pass information about functors involved in hash
code by copy.
* include/bits/basic_string.h: Use __is_fast_hash to signal that
hashing strings is slow.
(std::__debug::unordered_set<>::erase): Detect local iterators to
invalidate using the contained node rather than generating a dummy one.
* testsuite/performance/23_containers/insert_erase/41975.cc: Test
std::tr1 and std versions of unordered_set regardless of any
macro. Add test on default behavior.
* testsuite/performance/23_containers/insert/54075.cc: Likewise.
The conditions to _not_ use the cache are now:
- the hash functor is noexcept qualified
- the hash functor is not slow
- the hash functor has a default constructor
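As a sketch of what such a default-cache trait could look like (the names
__is_fast_hash, __is_noexcept_hash and __cache_default mirror the patch, but
the bodies below are my own illustrative reconstruction, not the committed
libstdc++ code):

```cpp
#include <functional>
#include <string>
#include <type_traits>
#include <utility>

// Assume every hasher is fast unless a specialization says otherwise.
template<typename _Hash>
  struct __is_fast_hash : std::true_type { };

// Hashing a string visits every character, so mark it slow.
template<>
  struct __is_fast_hash<std::hash<std::string>> : std::false_type { };

// True when invoking the hasher on a _Tp cannot throw.
template<typename _Tp, typename _Hash>
  struct __is_noexcept_hash
  : std::integral_constant<bool,
      noexcept(std::declval<const _Hash&>()(std::declval<const _Tp&>()))> { };

// Cache the hash code unless the hasher is fast, noexcept qualified
// and default constructible -- the three conditions listed above.
template<typename _Tp, typename _Hash>
  struct __cache_default
  : std::integral_constant<bool,
      !(__is_fast_hash<_Hash>::value
        && __is_noexcept_hash<_Tp, _Hash>::value
        && std::is_default_constructible<_Hash>::value)> { };
```

With these definitions, `__cache_default<std::string, std::hash<std::string>>`
is true (strings are slow to hash, so the code is cached), while an int set
with the implementation's noexcept integral hasher would not cache.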
The last condition is necessary because the Standard mandates that
iterators be default constructible.
I introduced __is_fast_hash next to the root definition of std::hash. I
hope it is ok to use std::true_type in this file. Is this the right way
to introduce this kind of Standard extension?
I have also adapted the code to propagate the state of the functors
involved in hashing to the local iterators. Writing a hash functor with
state convinced me that this is better.
I have also modified the performance tests to check that by default we
adopt the best behavior regarding the hash code caching policy.
Tested under Linux x86_64, normal and debug modes.
Ok to commit?
I haven't documented it yet. Do you want me to do so? I know my English
is not perfect but I can make a proposal ;-)