[Bug libstdc++/41622] New: [c++0x] std::hash<std::string>::operator() copies its argument
jbytheway at gmail dot com
gcc-bugzilla@gcc.gnu.org
Wed Oct 7 15:40:00 GMT 2009
The following program:
====
#include <string>
#include <unordered_set>
int main()
{
std::unordered_set<std::string> set;
std::string key("foo");
for (int i=0; i<1000; ++i) {
set.count(key);
}
return 0;
}
====
(if compiled without optimizations) makes 1000 calls to the std::string copy
constructor (according to profiling by valgrind --tool=callgrind). This is
because the std::hash<std::string>::operator() takes its argument by value
rather than by const reference (see tr1_impl/functional_hash.h lines 164--166).
This slowed one of my programs by about 10-20%, the faster time being achieved
by using boost::hash<std::string> instead of std::hash<std::string>. The boost
implementation takes its argument by const reference (See
http://gitorious.org/boost/svn/blobs/750d0927ebb5adc7f4150c854305e5e28924d747/boost/functional/hash/hash.hpp
lines 351--369 and line 389). I believe libstdc++ should follow this lead on
performance grounds. String keys for unordered associative containers are a
very common use case, and it seems silly not to make them as fast as possible.
Looking at a draft standard (N2914, [unord.hash]) it does seem to suggest that
std::hash<T>::operator() must take its argument by value, but my standard-fu is
insufficient to be sure if it's really insisting.
As evidence that the change is OK:
- boost::hash claims in its documentation
(http://www.boost.org/doc/libs/1_40_0/doc/html/hash.html) to be compliant with
TR1, linking to N1836, in which [tr.unord.hash] has essentially the same
content as the draft standard. I would expect Boost to get this right,
therefore I conclude that it is acceptable to use a const reference argument.
- If you were specializing std::hash for a movable-but-not-copyable type, then
it would be impossible to take the argument by value. The standard certainly
seems to imply that Key types need not be copyable.
- The only obvious constraint on user-supplied hash functions is "A hash
function is a function object that takes a single argument of type Key and
returns a value of type std::size_t." ([unord.req], p4); it doesn't insist on
taking the argument by value.
--
Summary: [c++0x] std::hash<std::string>::operator() copies its
argument
Product: gcc
Version: 4.4.1
Status: UNCONFIRMED
Severity: enhancement
Priority: P3
Component: libstdc++
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: jbytheway at gmail dot com
GCC build triplet: x86_64-pc-linux-gnu
GCC host triplet: x86_64-pc-linux-gnu
GCC target triplet: x86_64-pc-linux-gnu
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41622
More information about the Gcc-bugs
mailing list