[PATCH] libstdc++/91223 Improve unordered containers == operator

Jonathan Wakely jwakely@redhat.com
Sat Jan 11 01:19:00 GMT 2020

On 10/01/20 22:01 +0000, Jonathan Wakely wrote:
>On 10/01/20 18:54 +0100, François Dumont wrote:
>>    Here is my attempt to improve == operator.
>>    There is a small optimization for the std::unordered_mutiXXX 
>>containers but the main enhancement rely on some partial template 
>>specialization of the _Equality type. I limit it to usage of 
>>unordered containers with std::equal_to to be sure that the 
>>container _Equal functor is like the key type ==.
>I think we can assume that for any _Equal, not just std::equal_to:
>However ...
>>    Do I need to also consider user partial template specialization 
>>of std::equal_to ? It is a well know bad practice so I hope the 
>>Standard says that such a partial specialization leads to undefined 
>It's certainly not undefined to specialize equal_to, and I'm not sure
>how to make your optimisations valid in that case.
>struct X
>  int i;
>  int rounded() const { return i - (i % 10); }
>  bool operator==(X x) const { return i == x.i; }
>template<> struct std::equal_to<X>
>  bool operator()(X l, X r) const
>  { return l.rounded() == r.rounded(); }
>template<> std::hash<X>
>  bool operator()(X x) const { return hash<int>()(x.rounded()); }
>std::unordered_multiset<X> u{ X{10}, X{11}, X{12} };
>std::unordered_multiset<X> v{ X{15}, X{16}, X{17} };
>bool b1 = u == v;
>bool b2 = std::is_permutation(u.begin(), u.end(), v.begin());
>assert(b1 == b2);
>I think the last new specialization in your patch would be used for
>this case, and because __x_count == v.count(*u.begin()) it will say
>they're equal. But the is_permutation call says they're not.
>So I think the assertion would fail, but the standard says it should
>pass. Am I mistaken?

I believe the optimization would still be valid if you do not use
__other.count(*__itx) to check for equivalent keys in the other
container. Your patch does:

+	  if (__x_count != __other.count(*__itx))
+	    return false;

This uses the _Equal predicate to count the equivalent elements in
__other. Instead you need to use operator== to count the **equal**

I think there's a similar problem in the _Equality specialization for
unordered_map (i.e. key-value pairs, unique keys):

+	  const auto __ity = __other.find(__itx->first);
+	  if (__ity == __other.end() || !bool(__ity->second == __itx->second))
+	    return false;

The call to __other.find(__itx->first) will return an element with
equivalent key, but that's not guaranteed to be equal. I think you
could fix this either by still using == to compare the keys after
__other.find(*__itx) returns an element (which doesn't fix the PR91263
bug) or by replacing find with a similar operation that looks up the
hash code and then uses == to test for equality (instead of using
_Equal pred to test for equivalent keys).

Basically, you can't use functions like find and count that rely on
equivalence of keys, you need to use handwritten lookups using ==.

And if you do that, then it doesn't matter whether _Equal is a
specialization of std::equal_to or not, and it doesn't matter whether
the user has defined their own specialization of std::equal_to. You
can do the optimizations for any _Equal, because you won't actually be
using it to test for equality.

Does that make sense?

More information about the Libstdc++ mailing list