unordered_set::erase returns an iterator to the next element in the hash table. When the hash table is empty, this means checking every single bucket. For a hash table that used to have a lot of elements (say one million) which were all removed and now has only a few elements (say two), erasing an element is very slow. I'm not using the iterator returned by erase. Is there a way to avoid this situation? I'm not very keen on checking the load_factor and resizing the number buckets. Thanks, Shaun
Interesting question, but I'm afraid this is very hard to improve: note how in the specs themselves - Table 98 in N2960 - erase(iterator) has complexity: "Average case O(1), worst case O(a.size())."
We're seeing O(a.bucket_count()), which is quite different from O(a.size()), particularly when the hash table is empty. I understand why it's hard to improve, but I think the matter needs to be brought up with the standards committee.
You are right that bucket_count != size. Now, can you imagine a way to fix this? Otherwise, I agree, we should raise the issue. Do you want to do that, maybe on the library reflector?
If the pointer of the last element in a bucket pointed to the first element of the next non-empty bucket, this particular issue would be avoided, although I'm sure it would create its own issues. I'm not familiar with the library reflector; if you have the time to follow up on this issue there, I would appreciate it.
This is certainly related, and has been, I would say, dismissed in Bellevue (I wasn't there, unfortunately; I can try to find the minutes): http://www.open-std.org/jtc1/sc22/wg21/docs/lwg-closed.html#579 N2023 seems rather well written to me...
The same bug has recently been raised in Boost's implementation of unordered containers: https://svn.boost.org/trac/boost/ticket/3693
An implementation is probably expected to shrink bucket_count when size shrinks, so the complexity should still be O(size). That would be good for memory use anyway, so why isn't that done?
I wouldn't depend on the number of buckets shrinking. Shrinking (and growing) a hash table is an expensive operation that requires rehashing all the elements in the hash table. Even if the implementation does provide the ability to shrink the hash table, a number of applications would disable it. Google sparsehash provides min_load_factor for this purpose.
(In reply to comment #7)
> An implementation is probably expected to shrink bucket_count when size
> shrinks, so the complexity should still be O(size). That would be good
> for memory use anyway, so why's that not done?

Shrinking invalidates iterators, doesn't it? erase() is specified not to invalidate iterators.
Stefan is right. The issue, in full generality, isn't trivial at all; there is now a new discussion on the library reflector. I'm under the impression that for C++0x we are not going to standardize the minimum load factor suggested by Matt. It seems much more likely that erase will simply be changed to return void; there is growing consensus about that.
Relevant DR reopened: http://home.roadrunner.com/~hinnant/issue_review/lwg-active.html#579 Next, we should provide detailed wording...
Looks like there is strong consensus in the LWG for the proposed resolution, that is, returning void, and LWG 579 is now [Tentatively Ready]. We could even implement it in time for 4.5.0, but, if I'm not mistaken, the iterator-range case is not trivial; let's reconsider in a few days...
Subject: Bug 41975
Author: paolo
Date: Thu Feb 11 18:11:01 2010
New Revision: 156705

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=156705
Log:
2010-02-11  Paolo Carlini  <paolo.carlini@oracle.com>

        PR libstdc++/41975, DR 579
        * include/bits/hashtable.h (_Hashtable<>::_M_erase_node): Remove.
        (erase(const_iterator), erase(const_iterator, const_iterator)):
        Change return type to void.
        * include/debug/unordered_map: Adjust.
        * include/debug/unordered_set: Likewise.
        * testsuite/util/exception/safety.h: Likewise.
        * testsuite/23_containers/unordered_map/erase/1.cc: Likewise.
        * testsuite/23_containers/unordered_map/erase/24061-map.cc: Likewise.
        * testsuite/23_containers/unordered_set/erase/1.cc: Likewise.
        * testsuite/23_containers/unordered_set/erase/24061-map.cc: Likewise.
        * testsuite/23_containers/unordered_multimap/erase/1.cc: Likewise.
        * testsuite/23_containers/unordered_multimap/erase/24061-map.cc: Likewise.
        * testsuite/23_containers/unordered_multiset/erase/1.cc: Likewise.
        * testsuite/23_containers/unordered_multiset/erase/24061-map.cc: Likewise.

Modified:
        trunk/libstdc++-v3/ChangeLog
        trunk/libstdc++-v3/include/bits/hashtable.h
        trunk/libstdc++-v3/include/debug/unordered_map
        trunk/libstdc++-v3/include/debug/unordered_set
        trunk/libstdc++-v3/testsuite/23_containers/unordered_map/erase/1.cc
        trunk/libstdc++-v3/testsuite/23_containers/unordered_map/erase/24061-map.cc
        trunk/libstdc++-v3/testsuite/23_containers/unordered_multimap/erase/1.cc
        trunk/libstdc++-v3/testsuite/23_containers/unordered_multimap/erase/24061-multimap.cc
        trunk/libstdc++-v3/testsuite/23_containers/unordered_multiset/erase/1.cc
        trunk/libstdc++-v3/testsuite/23_containers/unordered_multiset/erase/24061-multiset.cc
        trunk/libstdc++-v3/testsuite/23_containers/unordered_set/erase/1.cc
        trunk/libstdc++-v3/testsuite/23_containers/unordered_set/erase/24061-set.cc
        trunk/libstdc++-v3/testsuite/util/exception/safety.h
Done.
Subject: Bug 41975
Author: paolo
Date: Tue Mar 9 01:56:42 2010
New Revision: 157300

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=157300
Log:
2010-03-08  Paolo Carlini  <paolo.carlini@oracle.com>

        Revert:
        2010-02-11  Paolo Carlini  <paolo.carlini@oracle.com>

        PR libstdc++/41975, DR 579
        * include/bits/hashtable.h (_Hashtable<>::_M_erase_node): Remove.
        (erase(const_iterator), erase(const_iterator, const_iterator)):
        Change return type to void.
        * include/debug/unordered_map: Adjust.
        * include/debug/unordered_set: Likewise.
        * testsuite/util/exception/safety.h: Likewise.
        * testsuite/23_containers/unordered_map/erase/1.cc: Likewise.
        * testsuite/23_containers/unordered_map/erase/24061-map.cc: Likewise.
        * testsuite/23_containers/unordered_set/erase/1.cc: Likewise.
        * testsuite/23_containers/unordered_set/erase/24061-map.cc: Likewise.
        * testsuite/23_containers/unordered_multimap/erase/1.cc: Likewise.
        * testsuite/23_containers/unordered_multimap/erase/24061-map.cc: Likewise.
        * testsuite/23_containers/unordered_multiset/erase/1.cc: Likewise.
        * testsuite/23_containers/unordered_multiset/erase/24061-map.cc: Likewise.

Modified:
        trunk/libstdc++-v3/ChangeLog
        trunk/libstdc++-v3/include/bits/hashtable.h
        trunk/libstdc++-v3/include/debug/unordered_map
        trunk/libstdc++-v3/include/debug/unordered_set
        trunk/libstdc++-v3/testsuite/23_containers/unordered_map/erase/1.cc
        trunk/libstdc++-v3/testsuite/23_containers/unordered_map/erase/24061-map.cc
        trunk/libstdc++-v3/testsuite/23_containers/unordered_multimap/erase/1.cc
        trunk/libstdc++-v3/testsuite/23_containers/unordered_multimap/erase/24061-multimap.cc
        trunk/libstdc++-v3/testsuite/23_containers/unordered_multiset/erase/1.cc
        trunk/libstdc++-v3/testsuite/23_containers/unordered_multiset/erase/24061-multiset.cc
        trunk/libstdc++-v3/testsuite/23_containers/unordered_set/erase/1.cc
        trunk/libstdc++-v3/testsuite/23_containers/unordered_set/erase/24061-set.cc
        trunk/libstdc++-v3/testsuite/util/exception/safety.h
I reverted my changes and re-opened the PR: DR 579 is being resolved as NAD, because there is evidence (e.g., the Dinkumware implementation) that returning an iterator doesn't necessarily impact performance.
(In reply to comment #16)
> there is evidence (eg, the Dinkumware implementation) that returning an
> iterator doesn't necessarily impact performance.

The GCC implementation does have poor performance. Why not leave the (performance-improving) patch in place until someone actually comes up with an implementation for GCC which does perform well? As it stands, users are left in a strange situation, being told that the performance (in GCC) is poor, but that it has to be this way because of Dinkumware (which, it's claimed, is fast). Doesn't this only serve to drive users away from GCC's implementation?

To be clear, I am very much in favor of any solution which provides approximately optimal performance. Boost, for example, chose to implement a new method called "erase_return_void" for users who care about the performance of this operation--a workaround, but with emphasis on "work."
Because it would be non-conforming. In case my previous message was not clear enough: in C++1x, erase will return *iterator*.
How does the Dinkumware implementation avoid the performance hit of erase returning an iterator?
Nobody knows the gory details, but Plauger said that people can look into the headers of the last beta of Visual Studio... Knowledgeable people are under the impression they are using a different, doubly linked, implementation for the hash table, which has its own problems. Still, at this point it's pretty clear that in C++1x erase will return an iterator, and we have to live with that.
(In reply to comment #18)
> Because would be non-conforming. In case my previous message was not clear
> enough, in C++1x erase will return *iterator*.

The Boost approach is still an option for GCC: let the standards mandate a suboptimal interface if they must, but provide an alternative (the one Boost calls "erase_return_void"). From where I sit at least, it doesn't seem infeasible for GCC to go a little bit beyond the standard in this case (again, as Boost have already elected to do).
As a last resort, we can imagine doing that. It would be the first case, however, to the best of my knowledge, of providing a different interface controlled by a macro this way. Much better would be, as has also been mentioned today, providing two different implementations, both returning iterator and differing only in the internal details.
At the end of a further discussion today, apparently there is a strong consensus to add additional signatures returning void, "Boost-style".
GCC 4.5.0 is being released. Deferring to 4.5.1.
GCC 4.5.1 is being released, adjusting target milestone.
US 113, ES 2, US 118 / Issue 579 have been closed as NAD, thus let's figure out how best to obtain O(1) in our implementation...
Unless somebody posts here over the next two or three days *concrete* ideas of a different sort, I'm going to simply work on a doubly linked list solution, along the lines of the "iterator" section here: http://www.drdobbs.com/184403822 Nothing new, therefore. All operations on iterators will become faster, not just computing the iterator returned by erase; on the other hand, two pointers instead of one will be used for each element.
> US 113, ES 2, US 118 / Issue 579 have been closed as NAD, thus
> let's figure out how best obtain O(1) in our implementation...

Do you have a rationale for the closing of these NB comments? N3133 shows 579 unchanged. I was told that someone reported the existence of an O(1) singly linked list implementation in Rapperswil, but I don't have additional details. I'd wait to get the full picture before going to a doubly linked list: the committee had full information on the issue, so if they closed 579 as NAD they are supposed to be able to provide a rationale.
I'm not aware of any singly linked list implementation, to be honest. I know that Dinkumware already uses a doubly linked one and, if I'm not wrong, Howard just moved to it too. I'll send you privately the rationale I have from the minutes. I'm also asking Matt again whether he has anything else to suggest, but frankly I'm rather fed up with this issue; I mean to implement something that *works*, is *conforming*, and then test it in the field.
More correctly (in the meanwhile I went through an exchange at the beginning of this year): Howard stores the hash, which boils down to a memory requirement similar to that of the traditional doubly linked scheme per Dinkum in the limit of high load factor; for small load factors it is better, because it can use only one pointer instead of two for each bucket in the bucket list. All in all, if the requirements of a throwing hash + erase complexity are combined, I don't think anything with memory use similar to that of the singly linked schemes is possible. Joaquin, I think Matt would be in favor of a motion asking for a re-opening of the issue in Batavia; what do you think?
Paolo, I've read the minutes and it seems no strong consensus was reached. I think it would be useful if the issue could be reopened, at least for informative purposes, so that the committee specifies whether it's in the spirit of the standard to allow low-memory (singly linked, one word per node) implementations, and if so what the likely implementations are (I understand this is Pablo's third approach). If this is the case, the FCD has to be changed so as to allow erase() to throw. Otherwise implementors will know we have to resort to doubly linked solutions.
GCC 4.5.2 is being released, adjusting target milestone.
Author: fdumont
Date: Wed Nov 23 20:30:18 2011
New Revision: 181677

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=181677
Log:
2011-11-23  François Dumont  <fdumont@gcc.gnu.org>

        PR libstdc++/41975
        * include/bits/hashtable.h (_Hashtable<>): Major data model
        modification to limit performance impact of empty buckets in
        erase(iterator) implementation.
        * include/bits/hashtable_policy.h (_Hashtable_iterator,
        _Hashtable_const_iterator): Remove, not used anymore.
        * include/bits/hashtable_policy.h (_Prime_rehash_policy): Remove
        _M_grow_factor, just use natural evolution of prime numbers. Add
        _M_prev_size to know when the number of buckets can be reduced.
        * include/bits/unordered_set.h (__unordered_set<>,
        __unordered_multiset<>), unordered_map.h (__unordered_map<>,
        __unordered_multimap<>): Change default value of cache hash code
        template parameter; false for integral types with noexcept hash
        functor, true otherwise.
        * include/debug/unordered_map, unordered_set: Adapt transformation
        from iterator/const_iterator to respectively
        local_iterator/const_local_iterator.
        * testsuite/performance/23_containers/copy_construct/unordered_set.cc:
        New.
        * testsuite/23_containers/unordered_set/instantiation_neg.cc: New.
        * testsuite/23_containers/unordered_set/hash_policy/rehash.cc: New.
        * testsuite/23_containers/unordered_multiset/cons/copy.cc: New.
        * testsuite/23_containers/unordered_multiset/erase/1.cc,
        24061-multiset.cc: Add checks on the number of bucket elements.
        * testsuite/23_containers/unordered_multiset/insert/multiset_range.cc,
        multiset_single.cc, multiset_single_move.cc: Likewise.

Added:
        trunk/libstdc++-v3/testsuite/23_containers/unordered_multiset/cons/copy.cc
        trunk/libstdc++-v3/testsuite/23_containers/unordered_set/hash_policy/rehash.cc
        trunk/libstdc++-v3/testsuite/23_containers/unordered_set/instantiation_neg.cc
        trunk/libstdc++-v3/testsuite/performance/23_containers/copy_construct/unordered_set.cc

Modified:
        trunk/libstdc++-v3/ChangeLog
        trunk/libstdc++-v3/include/bits/hashtable.h
        trunk/libstdc++-v3/include/bits/hashtable_policy.h
        trunk/libstdc++-v3/include/bits/unordered_map.h
        trunk/libstdc++-v3/include/bits/unordered_set.h
        trunk/libstdc++-v3/include/debug/unordered_map
        trunk/libstdc++-v3/include/debug/unordered_set
        trunk/libstdc++-v3/testsuite/23_containers/unordered_multiset/erase/1.cc
        trunk/libstdc++-v3/testsuite/23_containers/unordered_multiset/erase/24061-multiset.cc
        trunk/libstdc++-v3/testsuite/23_containers/unordered_multiset/insert/multiset_range.cc
        trunk/libstdc++-v3/testsuite/23_containers/unordered_multiset/insert/multiset_single.cc
        trunk/libstdc++-v3/testsuite/23_containers/unordered_multiset/insert/multiset_single_move.cc
Fixed for 4.7.0.
It appears that http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=181677 has caused a performance regression in gcc-4.7 (or I am missing something). The following example runs 3 times faster with gcc-4.6 than with gcc-4.7:

    #include <unordered_map>
    using namespace std;

    int main()
    {
        unordered_map<int, int> m;
        //m.rehash(10000000);
        for (int i = 0; i < 10000000; i++)
            m[i] = i;
    }

Looking at a profile generated by google-perftools, most of the time is spent in _M_rehash with gcc-4.7: each rehash grows the table considerably less than in gcc-4.6 (perhaps due to the removal of _M_grow_factor?), so _M_rehash gets called far more often. Is this expected behavior?

Reserve/rehash also appears to be broken. If the 'rehash' line is uncommented in the example above, a subsequent insert hits the check against _M_prev_resize in _M_need_rehash and rehashes the bucket count back to a small value.
See http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54075