number of calls to hash function in unordered_set

Marc Glisse marc.glisse@inria.fr
Mon Jun 11 18:25:00 GMT 2018


On Mon, 11 Jun 2018, Frank Tetzel wrote:

> Hi,
>
> can anybody give me a reasonable explanation why the hash function for
> a simple unordered_set is called twice as much as I would expect.
>
> Example:
>
>
> #include <iostream>
> #include <unordered_set>
>
> struct MyHash{
> 	static int numberOfCalls;
>
> 	size_t operator()(int i) const noexcept{
> 		++numberOfCalls;
> 		//return std::hash<int>{}(i);
> 		return i;
> 	}
> };
> int MyHash::numberOfCalls = 0;
>
> int main(){
> 	std::unordered_set<int,MyHash> set;
>
> 	std::cout << MyHash::numberOfCalls << '\n';
> 	set.reserve(1024);
> 	std::cout << MyHash::numberOfCalls << '\n';
> 	for(int i=0; i<1024; ++i){
> 		set.emplace(i);
> 	}
> 	std::cout << MyHash::numberOfCalls << '\n';
>
> 	return 0;
> }
>
>
>
>
> output:
> 0
> 0
> 2047
>
> It seems to always be (inserted_elements * 2) - 1.
> When adding 2 elements, it hashes 3 times. When adding 1 element, it
> only hashes once. What is happening here? Why is the hash calculated
> twice? What is so special about the last or first element?

https://gcc.gnu.org/onlinedocs/libstdc++/manual/unordered_associative.html#containers.unordered.cache 
answers part of the question (libc++ on the other hand seems to have 
chosen always to cache the hash).

The other part is related to how the elements are linked in the hash 
table, note that the hashes computed are 0 1 0 2 1 3 2 4 3 5 4 6 5 7 6... 
Reading the code seems best to understand what is going on.

-- 
Marc Glisse



More information about the Gcc-help mailing list