This is the mail archive of the libstdc++@gcc.gnu.org mailing list for the libstdc++ project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [patch] libstdc++/61143 make unordered containers usable after move


On 15/05/2014 22:52, Jonathan Wakely wrote:
On 15/05/14 22:20 +0200, François Dumont wrote:
Hi

Here is a proposal to fix PR 61143. As explained in the PR I finally prefer to leave the container in a valid state that is to say with a non zero number of bucket, that is to say 1, after a move. This bucket is stored directly in the container so not allocated to leave the move operations noexcept qualified.

Thanks for fixing this, I like the direction very much. I have a few
comments below ...

With this evolution we could even make the default constructor noexcept but I don't think it has any interest.

I tend to agree with Paolo that a noexcept default constructor might
be useful, but let's fix the bug first and consider that later.

    ok, we will have to review the hint values but it should be easy.


Index: include/bits/hashtable.h
===================================================================
--- include/bits/hashtable.h    (revision 210479)
+++ include/bits/hashtable.h    (working copy)
@@ -316,14 +316,37 @@
      size_type            _M_element_count;
      _RehashPolicy        _M_rehash_policy;

+ // A single bucket used when only need 1 bucket. After move the hashtable + // is left with only 1 bucket which is not allocated so that we can have a
+      // noexcept move constructor.
+ // Note that we can't leave hashtable with 0 bucket without adding
+      // numerous checks in the code to avoid 0 modulus.
+      __bucket_type        _M_single_bucket;

Does this get initialized in the constructors?
Would it make sense to give it an initializer?

     __bucket_type        _M_single_bucket = nullptr;

This bucket is replacing those normally allocated and when they are allocated they are 0 initialised. So, you were right, there were one place where this initialization was missing which is fixed in this new patch. So I don't think this additional initialization is necessary.


@@ -980,12 +999,16 @@
    _M_move_assign(_Hashtable&& __ht, std::true_type)
    {
      this->_M_deallocate_nodes(_M_begin());
-      if (__builtin_expect(_M_bucket_count != 0, true))
-    _M_deallocate_buckets();
-
+      _M_deallocate_buckets();
      __hashtable_base::operator=(std::move(__ht));
      _M_rehash_policy = __ht._M_rehash_policy;
-      _M_buckets = __ht._M_buckets;
+ if (__builtin_expect(__ht._M_buckets != &__ht._M_single_bucket, true))
+    _M_buckets = __ht._M_buckets;

What is the value of this->_M_single_bucket now?

Should it be set to nullptr, if only to help debugging?

We are not resetting buckets to null when rehashing so unless I add more checks I won't be able to reset it each time.


+ if (__builtin_expect(__ht._M_buckets == &__ht._M_single_bucket, false))

This check appears in a few places, I wonder if it is worth creating a
private member function to hide the details:

 bool _M_moved_from() const noexcept
 {
   return __builtin_expect(_M_buckets == &_M_single_bucket, false);
 }

Then just test if (__ht._M_moved_from())

Usually I would think the __builtin_expect should not be inside the
member function, so the caller decides what the expected result is,
but I think in all cases the result is expected to be false. That
matches how move semantics are designed: the object that gets moved
from is expected to be going out of scope, and so will be reused in a
minority of cases.

@@ -1139,7 +1170,14 @@
    {
      if (__ht._M_node_allocator() == this->_M_node_allocator())
    {
-      _M_buckets = __ht._M_buckets;
+ if (__builtin_expect(__ht._M_buckets == &__ht._M_single_bucket, false))

This could be if (__ht._M_moved_from())

I hesitated in doing so and finally do so. I only prefer _M_use_single_bucket as we might not limit its usage to moved instances.


@@ -1193,11 +1231,21 @@
      std::swap(_M_bucket_count, __x._M_bucket_count);
      std::swap(_M_before_begin._M_nxt, __x._M_before_begin._M_nxt);
      std::swap(_M_element_count, __x._M_element_count);
+      std::swap(_M_single_bucket, __x._M_single_bucket);

+ // Fix buckets if any is pointing to its single bucket that can't be
+      // swapped.
+      if (_M_buckets == &__x._M_single_bucket)
+    _M_buckets = &_M_single_bucket;
+
+      if (__x._M_buckets == &_M_single_bucket)
+    __x._M_buckets = &__x._M_single_bucket;
+

Does this do more work than necessary, swapping the _M_buckets
members, then updating them again?

How about removing the std::swap(_M_buckets, __x._M_buckets) above and
doing (untested):

 if (this->_M_moved_from())
   {
     if (__x._M_moved_from())
       _M_buckets = &_M_single_bucket;
     else
       _M_buckets = __x._M_buckets;
     __x._M_buckets = &__x._M_single_bucket;
   }
 else if (__x._M_moved_from())
   {
     __x._M_buckets = _M_buckets;
     _M_buckets = &_M_single_bucket;
   }
 else
   std::swap(_M_buckets, __x._M_buckets);

Is that worth it?  I'm not sure.

Yes, with the newly introduced _M_use_single_bucket it worth it I think and is what I have done.

Here is the new patch limited to what I really want to commit this time.

2014-05-20  François Dumont  <fdumont@gcc.gnu.org>

    PR libstdc++/61143
    * include/bits/hashtable.h: Fix move semantic to leave hashtable in a
    usable state.
    * testsuite/23_containers/unordered_set/61143.cc: New.
    * testsuite/23_containers/unordered_set/modifiers/swap.cc: New.

Tested under Linux x86_64, ok to commit ?

François

Index: include/bits/hashtable.h
===================================================================
--- include/bits/hashtable.h	(revision 210479)
+++ include/bits/hashtable.h	(working copy)
@@ -316,14 +316,49 @@
       size_type			_M_element_count;
       _RehashPolicy		_M_rehash_policy;
 
+      // A single bucket used when only need for 1 bucket. Especially
+      // interesting in move semantic to leave hashtable with only 1 buckets
+      // which is not allocated so that we can have those operations noexcept
+      // qualified.
+      // Note that we can't leave hashtable with 0 bucket without adding
+      // numerous checks in the code to avoid 0 modulus.
+      __bucket_type		_M_single_bucket;
+
+      bool
+      _M_use_single_bucket(__bucket_type* __bkts) const
+      { return __builtin_expect(_M_buckets == &_M_single_bucket, false); }
+
+      bool
+      _M_use_single_bucket() const
+      { return _M_use_single_bucket(_M_buckets); }
+
       __hashtable_alloc&
       _M_base_alloc() { return *this; }
 
-      using __hashtable_alloc::_M_deallocate_buckets;
+      __bucket_type*
+      _M_allocate_buckets(size_type __n)
+      {
+	if (__builtin_expect(__n == 1, false))
+	  {
+	    _M_single_bucket = nullptr;
+	    return &_M_single_bucket;
+	  }
 
+	return __hashtable_alloc::_M_allocate_buckets(__n);
+      }
+
       void
+      _M_deallocate_buckets(__bucket_type* __bkts, size_type __n)
+      {
+	if (_M_use_single_bucket(__bkts))
+	  return;
+
+	__hashtable_alloc::_M_deallocate_buckets(__bkts, __n);
+      }
+
+      void
       _M_deallocate_buckets()
-      { this->_M_deallocate_buckets(_M_buckets, _M_bucket_count); }
+      { _M_deallocate_buckets(_M_buckets, _M_bucket_count); }
 
       // Gets bucket begin, deals with the fact that non-empty buckets contain
       // their before begin node.
@@ -703,11 +738,7 @@
 
       size_type
       erase(const key_type& __k)
-      {
-	if (__builtin_expect(_M_bucket_count == 0, false))
-	  return 0;
-	return _M_erase(__unique_keys(), __k);
-      }
+      { return _M_erase(__unique_keys(), __k); }
 
       iterator
       erase(const_iterator, const_iterator);
@@ -768,7 +799,7 @@
       _M_rehash_policy()
     {
       _M_bucket_count = _M_rehash_policy._M_next_bkt(__bucket_hint);
-      _M_buckets = this->_M_allocate_buckets(_M_bucket_count);
+      _M_buckets = _M_allocate_buckets(_M_bucket_count);
     }
 
   template<typename _Key, typename _Value,
@@ -796,7 +827,7 @@
 	    std::max(_M_rehash_policy._M_bkt_for_elements(__nb_elems),
 		     __bucket_hint));
 
-	_M_buckets = this->_M_allocate_buckets(_M_bucket_count);
+	_M_buckets = _M_allocate_buckets(_M_bucket_count);
 	__try
 	  {
 	    for (; __f != __l; ++__f)
@@ -833,9 +864,9 @@
 	      {
 		// Replacement allocator cannot free existing storage.
 		this->_M_deallocate_nodes(_M_begin());
-		if (__builtin_expect(_M_bucket_count != 0, true))
-		  _M_deallocate_buckets();
-		_M_reset();
+		_M_before_begin._M_nxt = nullptr;
+		_M_deallocate_buckets();
+		_M_buckets = nullptr;
 		std::__alloc_on_copy(__this_alloc, __that_alloc);
 		__hashtable_base::operator=(__ht);
 		_M_bucket_count = __ht._M_bucket_count;
@@ -867,7 +898,7 @@
 	if (_M_bucket_count != __ht._M_bucket_count)
 	  {
 	    __former_buckets = _M_buckets;
-	    _M_buckets = this->_M_allocate_buckets(__ht._M_bucket_count);
+	    _M_buckets = _M_allocate_buckets(__ht._M_bucket_count);
 	    _M_bucket_count = __ht._M_bucket_count;
 	  }
 	else
@@ -885,8 +916,7 @@
 		      [&__roan](const __node_type* __n)
 		      { return __roan(__n->_M_v()); });
 	    if (__former_buckets)
-	      this->_M_deallocate_buckets(__former_buckets,
-					  __former_bucket_count);
+	      _M_deallocate_buckets(__former_buckets, __former_bucket_count);
 	  }
 	__catch(...)
 	  {
@@ -917,7 +947,7 @@
       {
 	__bucket_type* __buckets = nullptr;
 	if (!_M_buckets)
-	  _M_buckets = __buckets = this->_M_allocate_buckets(_M_bucket_count);
+	  _M_buckets = __buckets = _M_allocate_buckets(_M_bucket_count);
 
 	__try
 	  {
@@ -964,8 +994,9 @@
     _M_reset() noexcept
     {
       _M_rehash_policy._M_reset();
-      _M_bucket_count = 0;
-      _M_buckets = nullptr;
+      _M_bucket_count = 1;
+      _M_single_bucket = nullptr;
+      _M_buckets = &_M_single_bucket;
       _M_before_begin._M_nxt = nullptr;
       _M_element_count = 0;
     }
@@ -980,12 +1011,16 @@
     _M_move_assign(_Hashtable&& __ht, std::true_type)
     {
       this->_M_deallocate_nodes(_M_begin());
-      if (__builtin_expect(_M_bucket_count != 0, true))
-	_M_deallocate_buckets();
-
+      _M_deallocate_buckets();
       __hashtable_base::operator=(std::move(__ht));
       _M_rehash_policy = __ht._M_rehash_policy;
-      _M_buckets = __ht._M_buckets;
+      if (!__ht._M_use_single_bucket())
+	_M_buckets = __ht._M_buckets;
+      else
+	{
+	  _M_buckets = &_M_single_bucket;
+	  _M_single_bucket = __ht._M_single_bucket;
+	}
       _M_bucket_count = __ht._M_bucket_count;
       _M_before_begin._M_nxt = __ht._M_before_begin._M_nxt;
       _M_element_count = __ht._M_element_count;
@@ -1019,7 +1054,7 @@
 	  if (_M_bucket_count != __ht._M_bucket_count)
 	    {
 	      __former_buckets = _M_buckets;
-	      _M_buckets = this->_M_allocate_buckets(__ht._M_bucket_count);
+	      _M_buckets = _M_allocate_buckets(__ht._M_bucket_count);
 	      _M_bucket_count = __ht._M_bucket_count;
 	    }
 	  else
@@ -1093,10 +1128,18 @@
       _M_element_count(__ht._M_element_count),
       _M_rehash_policy(__ht._M_rehash_policy)
     {
+      // Update, if necessary, buckets if __ht is using its single bucket.
+      if (__ht._M_use_single_bucket())
+	{
+	  _M_buckets = &_M_single_bucket;
+	  _M_single_bucket = __ht._M_single_bucket;
+	}
+
       // Update, if necessary, bucket pointing to before begin that hasn't
       // moved.
       if (_M_begin())
 	_M_buckets[_M_bucket_index(_M_begin())] = &_M_before_begin;
+
       __ht._M_reset();
     }
 
@@ -1139,7 +1182,14 @@
     {
       if (__ht._M_node_allocator() == this->_M_node_allocator())
 	{
-	  _M_buckets = __ht._M_buckets;
+	  if (__ht._M_use_single_bucket())
+	    {
+	      _M_buckets = &_M_single_bucket;
+	      _M_single_bucket = __ht._M_single_bucket;
+	    }
+	  else
+	    _M_buckets = __ht._M_buckets;
+
 	  _M_before_begin._M_nxt = __ht._M_before_begin._M_nxt;
 	  // Update, if necessary, bucket pointing to before begin that hasn't
 	  // moved.
@@ -1189,15 +1239,32 @@
 
       std::__alloc_on_swap(this->_M_node_allocator(), __x._M_node_allocator());
       std::swap(_M_rehash_policy, __x._M_rehash_policy);
-      std::swap(_M_buckets, __x._M_buckets);
+
+      // Deal properly with potentially moved instances.
+      if (!_M_use_single_bucket() && !__x._M_use_single_bucket())
+	std::swap(_M_buckets, __x._M_buckets);
+      else
+	if (!_M_use_single_bucket())
+	  {
+	    __x._M_buckets = _M_buckets;
+	    _M_buckets = &_M_single_bucket;
+	  }
+	else if (!__x._M_use_single_bucket())
+	  {
+	    _M_buckets = __x._M_buckets;
+	    __x._M_buckets = &__x._M_single_bucket;
+	  }
+
       std::swap(_M_bucket_count, __x._M_bucket_count);
       std::swap(_M_before_begin._M_nxt, __x._M_before_begin._M_nxt);
       std::swap(_M_element_count, __x._M_element_count);
+      std::swap(_M_single_bucket, __x._M_single_bucket);
 
       // Fix buckets containing the _M_before_begin pointers that can't be
       // swapped.
       if (_M_begin())
 	_M_buckets[_M_bucket_index(_M_begin())] = &_M_before_begin;
+
       if (__x._M_begin())
 	__x._M_buckets[__x._M_bucket_index(__x._M_begin())]
 	  = &__x._M_before_begin;
@@ -1230,9 +1297,6 @@
 	       _H1, _H2, _Hash, _RehashPolicy, _Traits>::
     find(const key_type& __k)
     {
-      if (__builtin_expect(_M_bucket_count == 0, false))
-	return end();
-
       __hash_code __code = this->_M_hash_code(__k);
       std::size_t __n = _M_bucket_index(__k, __code);
       __node_type* __p = _M_find_node(__n, __k, __code);
@@ -1250,9 +1314,6 @@
 	       _H1, _H2, _Hash, _RehashPolicy, _Traits>::
     find(const key_type& __k) const
     {
-      if (__builtin_expect(_M_bucket_count == 0, false))
-	return end();
-      
       __hash_code __code = this->_M_hash_code(__k);
       std::size_t __n = _M_bucket_index(__k, __code);
       __node_type* __p = _M_find_node(__n, __k, __code);
@@ -1270,9 +1331,6 @@
 	       _H1, _H2, _Hash, _RehashPolicy, _Traits>::
     count(const key_type& __k) const
     {
-      if (__builtin_expect(_M_bucket_count == 0, false))
-	return 0;
-
       __hash_code __code = this->_M_hash_code(__k);
       std::size_t __n = _M_bucket_index(__k, __code);
       __node_type* __p = _M_bucket_begin(__n);
@@ -1287,7 +1345,7 @@
 	  else if (__result)
 	    // All equivalent values are next to each other, if we
 	    // found a non-equivalent value after an equivalent one it
-	    // means that we won't find any more equivalent values.
+	    // means that we won't find any new equivalent value.
 	    break;
 	  if (!__p->_M_nxt || _M_bucket_index(__p->_M_next()) != __n)
 	    break;
@@ -1311,9 +1369,6 @@
 	       _H1, _H2, _Hash, _RehashPolicy, _Traits>::
     equal_range(const key_type& __k)
     {
-      if (__builtin_expect(_M_bucket_count == 0, false))
-	return std::make_pair(end(), end());
-
       __hash_code __code = this->_M_hash_code(__k);
       std::size_t __n = _M_bucket_index(__k, __code);
       __node_type* __p = _M_find_node(__n, __k, __code);
@@ -1347,9 +1402,6 @@
 	       _H1, _H2, _Hash, _RehashPolicy, _Traits>::
     equal_range(const key_type& __k) const
     {
-      if (__builtin_expect(_M_bucket_count == 0, false))
-	return std::make_pair(end(), end());
-	
       __hash_code __code = this->_M_hash_code(__k);
       std::size_t __n = _M_bucket_index(__k, __code);
       __node_type* __p = _M_find_node(__n, __k, __code);
@@ -1944,7 +1996,7 @@
 	       _H1, _H2, _Hash, _RehashPolicy, _Traits>::
     _M_rehash_aux(size_type __n, std::true_type)
     {
-      __bucket_type* __new_buckets = this->_M_allocate_buckets(__n);
+      __bucket_type* __new_buckets = _M_allocate_buckets(__n);
       __node_type* __p = _M_begin();
       _M_before_begin._M_nxt = nullptr;
       std::size_t __bbegin_bkt = 0;
@@ -1969,8 +2021,7 @@
 	  __p = __next;
 	}
 
-      if (__builtin_expect(_M_bucket_count != 0, true))
-	_M_deallocate_buckets();
+      _M_deallocate_buckets();
       _M_bucket_count = __n;
       _M_buckets = __new_buckets;
     }
@@ -1986,7 +2037,7 @@
 	       _H1, _H2, _Hash, _RehashPolicy, _Traits>::
     _M_rehash_aux(size_type __n, std::false_type)
     {
-      __bucket_type* __new_buckets = this->_M_allocate_buckets(__n);
+      __bucket_type* __new_buckets = _M_allocate_buckets(__n);
 
       __node_type* __p = _M_begin();
       _M_before_begin._M_nxt = nullptr;
@@ -2060,8 +2111,7 @@
 	    __new_buckets[__next_bkt] = __prev_p;
 	}
 
-      if (__builtin_expect(_M_bucket_count != 0, true))
-	_M_deallocate_buckets();
+      _M_deallocate_buckets();
       _M_bucket_count = __n;
       _M_buckets = __new_buckets;
     }
Index: testsuite/23_containers/unordered_set/61143.cc
===================================================================
--- testsuite/23_containers/unordered_set/61143.cc	(revision 0)
+++ testsuite/23_containers/unordered_set/61143.cc	(working copy)
@@ -0,0 +1,38 @@
+// { dg-options "-std=gnu++11" }
+
+// Copyright (C) 2014 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// You should have received a copy of the GNU General Public License along
+// with this library; see the file COPYING3.  If not see
+// <http://www.gnu.org/licenses/>.
+
+// libstdc++/61143
+
+#include <unordered_set>
+
+void test01()
+{
+  std::unordered_set<int> us1, us2;
+  us1.insert(1);
+
+  us2 = std::move(us1);
+
+  us1.insert(1);
+}
+
+int main()
+{
+  test01();
+  return 0;
+}
Index: testsuite/23_containers/unordered_set/modifiers/swap.cc
===================================================================
--- testsuite/23_containers/unordered_set/modifiers/swap.cc	(revision 0)
+++ testsuite/23_containers/unordered_set/modifiers/swap.cc	(working copy)
@@ -0,0 +1,65 @@
+// Copyright (C) 2014 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// You should have received a copy of the GNU General Public License along
+// with this library; see the file COPYING3.  If not see
+// <http://www.gnu.org/licenses/>.
+
+// { dg-options "-std=gnu++11" }
+
+#include <unordered_set>
+#include <testsuite_hooks.h>
+
+void test01()
+{
+  bool test __attribute__((unused)) = true;
+  std::unordered_set<int> us1 { 0, 1 };
+  {
+    std::unordered_set<int> us2(std::move(us1));
+
+    us1.swap(us2);
+
+    VERIFY( us1.find(0) != us1.end() );
+
+    us1.insert(2);
+
+    VERIFY( us1.size() == 3 );
+
+    us2.swap(us1);
+
+    VERIFY( us2.size() == 3 );
+    VERIFY( us2.find(2) != us2.end() );
+
+    us1 = { 3, 4, 5 };
+
+    VERIFY( us1.size() == 3 );
+    VERIFY( us1.bucket_count() >= 3 );
+
+    std::unordered_set<int> us3(std::move(us1));
+    us3 = std::move(us2);
+
+    us1.swap(us2);
+
+    VERIFY( us1.empty() );
+    VERIFY( us2.empty() );
+  }
+
+  us1 = { 0, 1 };
+  VERIFY( us1.size() == 2 );
+}
+
+int main()
+{
+  test01();
+  return 0;
+}

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]