This is the mail archive of the libstdc++@gcc.gnu.org mailing list for the libstdc++ project.

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]
Other format:	[Raw text]

Re: [v3] fix libstdc++/52476

From: François Dumont <frs dot dumont at gmail dot com>
To: libstdc++ at gcc dot gnu dot org
Date: Thu, 15 Mar 2012 21:41:06 +0100
Subject: Re: [v3] fix libstdc++/52476
References: <4F5FBDE6.4010407@gmail.com> <4F609EB0.9060502@oracle.com>

Compiling your feedbacks here is what I came to:

2012-03-15 François Dumont <fdumont@gcc.gnu.org>

        PR libstdc++/52476
        * include/bits/hashtable.h (_Hashtable<>::_M_rehash_aux): Add.
        (_Hashtable<>::_M_rehash): Use the latter.
        * testsuite/23_containers/unordered_multimap/insert/52476.cc: New.
        * testsuite/23_containers/unordered_multiset/insert/52476.cc: New.

Regarding the testcase, the code in the ticket is showing the problem but is not a test. The test might seem a little bit complicated but I try to make it independent to how elements are inserted into the container which is not defined by the Standard. Even if we change implementation and store 0-3,0-2,0-1,0-0 rather than 0-0,0-1,0-2,0-3 the test will still work and only check the Standard point which is that the order of those elements should be preserve on rehash.

Tested under Linux x86_64.

Ok for mainline ?

François

On 03/14/2012 02:35 PM, Paolo Carlini wrote:

Hi,
Here a patch proposition to fix PR 52446.

I have introduce 2 version of the _M_rehash method. One used when keys are unique which is very close to the existing one. The second that take care of keeping equivalent keys relative order on rehash. The second one might seem complicated but I wanted to avoid to recompute a bucket index each time we find an equivalent element. If the hash code is cached it is ok but if it is not I prefer to limit the number of time it is recalculated.

2012-03-13 François Dumont <fdumont@gcc.gnu.org>
    PR libstdc++/52446
    * include/bits/hashtable.h (_Hashtable<>::_M_rehash): Split into 2
    methods, the first, copy of the existing one, is used when keys
    are unique, the second purpose is to keep equivalent keys relative
    orders.
The "purpose of the second", maybe? Can we use a new name for the new functions, instead of overloading? A descriptive name or suffix would add clarity to the code.

* testsuite/23_containers/unordered_multimap/insert/52446.cc: New.

Tested under linux x86_64.

If ok tell me if I must also apply it to 4.7 branch.
We are going to fix this in mainline and, then, if everything goes well, in 4.7.1.

Minor nits:
François

52446.patch
Index: include/bits/hashtable.h
===================================================================
--- include/bits/hashtable.h    (revision 185352)
+++ include/bits/hashtable.h    (working copy)
@@ -596,6 +596,11 @@
        // reserve, if present, comes from _Rehash_base.
      private:
+      // Rehash method when keys are unique
+      void _M_rehash(size_type __n, std::true_type);
A blank line here.
+ // Rehash method when there might be equivalent keys. + void _M_rehash(size_type __n, std::false_type); + // Unconditionally change size of bucket array to n, restore hash policy // state to __state on exception. void _M_rehash(size_type __n, const _RehashPolicyState& __state); @@ -1592,41 +1597,141 @@ { __try { - _Bucket* __new_buckets = _M_allocate_buckets(__n); - _Node* __p = _M_begin(); - _M_before_begin._M_nxt = nullptr; - std::size_t __cur_bbegin_bkt; - while (__p) + _M_rehash(__n, integral_constant<bool, __uk>()); + } + __catch(...) + { + // A failure here means that buckets allocation failed. We only + // have to restore hash policy previous state. + _M_rehash_policy._M_reset(__state); + __throw_exception_again; + } + } + + template<typename _Key, typename _Value, + typename _Allocator, typename _ExtractKey, typename _Equal, + typename _H1, typename _H2, typename _Hash, typename _RehashPolicy, + bool __chc, bool __cit, bool __uk> + void + _Hashtable<_Key, _Value, _Allocator, _ExtractKey, _Equal, + _H1, _H2, _Hash, _RehashPolicy, __chc, __cit, __uk>:: + _M_rehash(size_type __n, std::true_type) + { + // This version of rehash do not have to care about equivalent elements + // relative order. + _Bucket* __new_buckets = _M_allocate_buckets(__n); + _Node* __p = _M_begin(); + _M_before_begin._M_nxt = nullptr; + std::size_t __cur_bbegin_bkt; + while (__p) + { + _Node* __next = __p->_M_next(); + std::size_t __new_index = _HCBase::_M_bucket_index(__p, __n); + if (!__new_buckets[__new_index]) { - _Node* __next = __p->_M_next(); - std::size_t __new_index = _HCBase::_M_bucket_index(__p, __n); - if (!__new_buckets[__new_index]) + __p->_M_nxt = _M_before_begin._M_nxt; + _M_before_begin._M_nxt = __p; + __new_buckets[__new_index] =&_M_before_begin; + if (__p->_M_nxt) + __new_buckets[__cur_bbegin_bkt] = __p; + __cur_bbegin_bkt = __new_index; + } + else + { + __p->_M_nxt = __new_buckets[__new_index]->_M_nxt; + __new_buckets[__new_index]->_M_nxt = __p; + } + __p = __next; + } + _M_deallocate_buckets(_M_buckets, _M_bucket_count); + _M_bucket_count = __n; + _M_buckets = __new_buckets; + } + + template<typename _Key, typename _Value, + typename _Allocator, typename _ExtractKey, typename _Equal, + typename _H1, typename _H2, typename _Hash, typename _RehashPolicy, + bool __chc, bool __cit, bool __uk> + void + _Hashtable<_Key, _Value, _Allocator, _ExtractKey, _Equal, + _H1, _H2, _Hash, _RehashPolicy, __chc, __cit, __uk>:: + _M_rehash(size_type __n, std::false_type) + { + // This version of rehash must guaranty that equivalent elements relative + // order is preserve. + _Bucket* __new_buckets = _M_allocate_buckets(__n); + _Node* __p = _M_begin(); + _M_before_begin._M_nxt = nullptr; + std::size_t __cur_bbegin_bkt; + std::size_t __prev_index; + _Node* __prev_p = nullptr; + bool __check_next_bucket = false; + while (__p) + { + bool __check_now = true; + _Node* __next = __p->_M_next(); + std::size_t __new_index = _HCBase::_M_bucket_index(__p, __n); + if (!__new_buckets[__new_index]) + { + __p->_M_nxt = _M_before_begin._M_nxt; + _M_before_begin._M_nxt = __p; + __new_buckets[__new_index] =&_M_before_begin; + if (__p->_M_nxt) + __new_buckets[__cur_bbegin_bkt] = __p; + __cur_bbegin_bkt = __new_index; + } + else + { + if (__prev_p&& __prev_index == __new_index) { - __p->_M_nxt = _M_before_begin._M_nxt; - _M_before_begin._M_nxt = __p; - __new_buckets[__new_index] =&_M_before_begin; - if (__p->_M_nxt) - __new_buckets[__cur_bbegin_bkt] = __p; - __cur_bbegin_bkt = __new_index; + // Previous insert was already in this bucket, we insert after + // the previously inserted one to preserve equivalent elements + // relative order. + __p->_M_nxt = __prev_p->_M_nxt; + __prev_p->_M_nxt = __p;
And here. As a general style, I would say always when the comment itself is preceded by code != open curly bracket.
+          // Request a check as soon as we move out of the sequence of
+          // equivalent nodes.
+          __check_next_bucket = true;
+          __check_now = false;
In terms of testcases, I would also add something more straightforward, close to what we have in the PR, and duplicated for map and set. But I'm not going to insist ;)
Thanks!
Paolo.

Index: include/bits/hashtable.h
===================================================================
--- include/bits/hashtable.h	(revision 185352)
+++ include/bits/hashtable.h	(working copy)
@@ -596,6 +596,12 @@
       // reserve, if present, comes from _Rehash_base.
 
     private:
+      // Helper rehash method used when keys are unique.
+      void _M_rehash_aux(size_type __n, std::true_type);
+
+      // Helper rehash method used when keys can be non-unique.
+      void _M_rehash_aux(size_type __n, std::false_type);
+
       // Unconditionally change size of bucket array to n, restore hash policy
       // state to __state on exception.
       void _M_rehash(size_type __n, const _RehashPolicyState& __state);
@@ -1592,41 +1598,145 @@
     {
       __try
 	{
-	  _Bucket* __new_buckets = _M_allocate_buckets(__n);
-	  _Node* __p = _M_begin();
-	  _M_before_begin._M_nxt = nullptr;
-	  std::size_t __cur_bbegin_bkt;
-	  while (__p)
+	  _M_rehash_aux(__n, integral_constant<bool, __uk>());
+	}
+      __catch(...)
+	{
+	  // A failure here means that buckets allocation failed.  We only
+	  // have to restore hash policy previous state.
+	  _M_rehash_policy._M_reset(__state);
+	  __throw_exception_again;
+	}
+    }
+
+  // Rehash when there is no equivalent elements.
+  template<typename _Key, typename _Value,
+	   typename _Allocator, typename _ExtractKey, typename _Equal,
+	   typename _H1, typename _H2, typename _Hash, typename _RehashPolicy,
+	   bool __chc, bool __cit, bool __uk>
+    void
+    _Hashtable<_Key, _Value, _Allocator, _ExtractKey, _Equal,
+	       _H1, _H2, _Hash, _RehashPolicy, __chc, __cit, __uk>::
+    _M_rehash_aux(size_type __n, std::true_type)
+    {
+      _Bucket* __new_buckets = _M_allocate_buckets(__n);
+      _Node* __p = _M_begin();
+      _M_before_begin._M_nxt = nullptr;
+      std::size_t __bbegin_bkt;
+      while (__p)
+	{
+	  _Node* __next = __p->_M_next();
+	  std::size_t __bkt = _HCBase::_M_bucket_index(__p, __n);
+	  if (!__new_buckets[__bkt])
 	    {
-	      _Node* __next = __p->_M_next();
-	      std::size_t __new_index = _HCBase::_M_bucket_index(__p, __n);
-	      if (!__new_buckets[__new_index])
+	      __p->_M_nxt = _M_before_begin._M_nxt;
+	      _M_before_begin._M_nxt = __p;
+	      __new_buckets[__bkt] = &_M_before_begin;
+	      if (__p->_M_nxt)
+		__new_buckets[__bbegin_bkt] = __p;
+	      __bbegin_bkt = __bkt;
+	    }
+	  else
+	    {
+	      __p->_M_nxt = __new_buckets[__bkt]->_M_nxt;
+	      __new_buckets[__bkt]->_M_nxt = __p;
+	    }
+	  __p = __next;
+	}
+      _M_deallocate_buckets(_M_buckets, _M_bucket_count);
+      _M_bucket_count = __n;
+      _M_buckets = __new_buckets;
+    }
+
+  // Rehash when there can be equivalent elements, preserve their relative
+  // order.
+  template<typename _Key, typename _Value,
+	   typename _Allocator, typename _ExtractKey, typename _Equal,
+	   typename _H1, typename _H2, typename _Hash, typename _RehashPolicy,
+	   bool __chc, bool __cit, bool __uk>
+    void
+    _Hashtable<_Key, _Value, _Allocator, _ExtractKey, _Equal,
+	       _H1, _H2, _Hash, _RehashPolicy, __chc, __cit, __uk>::
+    _M_rehash_aux(size_type __n, std::false_type)
+    {
+      _Bucket* __new_buckets = _M_allocate_buckets(__n);
+
+      _Node* __p = _M_begin();
+      _M_before_begin._M_nxt = nullptr;
+      std::size_t __bbegin_bkt;
+      std::size_t __prev_bkt;
+      _Node* __prev_p = nullptr;
+      bool __check_bucket = false;
+
+      while (__p)
+	{
+	  bool __check_now = true;
+	  _Node* __next = __p->_M_next();
+	  std::size_t __bkt = _HCBase::_M_bucket_index(__p, __n);
+
+	  if (!__new_buckets[__bkt])
+	    {
+	      __p->_M_nxt = _M_before_begin._M_nxt;
+	      _M_before_begin._M_nxt = __p;
+	      __new_buckets[__bkt] = &_M_before_begin;
+	      if (__p->_M_nxt)
+		__new_buckets[__bbegin_bkt] = __p;
+	      __bbegin_bkt = __bkt;
+	    }
+	  else
+	    {
+	      if (__prev_p && __prev_bkt == __bkt)
 		{
-		  __p->_M_nxt = _M_before_begin._M_nxt;
-		  _M_before_begin._M_nxt = __p;
-		  __new_buckets[__new_index] = &_M_before_begin;
-		  if (__p->_M_nxt)
-		    __new_buckets[__cur_bbegin_bkt] = __p;
-		  __cur_bbegin_bkt = __new_index;
+		  // Previous insert was already in this bucket, we insert after
+		  // the previously inserted one to preserve equivalent elements
+		  // relative order.
+		  __p->_M_nxt = __prev_p->_M_nxt;
+		  __prev_p->_M_nxt = __p;
+
+		  // Inserting after a node in a bucket require to check that we
+		  // haven't change the bucket last node, in this case next
+		  // bucket containing its before begin node must be updated. We
+		  // schedule a check as soon as we move out of the sequence of
+		  // equivalent nodes to limit the number of checks.
+		  __check_bucket = true;
+		  __check_now = false;
 		}
 	      else
 		{
-		  __p->_M_nxt = __new_buckets[__new_index]->_M_nxt;
-		  __new_buckets[__new_index]->_M_nxt = __p;
+		  __p->_M_nxt = __new_buckets[__bkt]->_M_nxt;
+		  __new_buckets[__bkt]->_M_nxt = __p;
 		}
-	      __p = __next;
 	    }
-	  _M_deallocate_buckets(_M_buckets, _M_bucket_count);
-	  _M_bucket_count = __n;
-	  _M_buckets = __new_buckets;
+	  
+	  if (__check_now && __check_bucket)
+	    {
+	      // Check if we shall update the next bucket because of insertions
+	      // into __prev_bkt bucket.
+	      if (__prev_p->_M_nxt)
+		{
+		  std::size_t __next_bkt
+		    = _HCBase::_M_bucket_index(__prev_p->_M_next(), __n);
+		  if (__next_bkt != __prev_bkt)
+		    __new_buckets[__next_bkt] = __prev_p;
+		}
+	      __check_bucket = false;
+	    }
+	  __prev_p = __p;
+	  __prev_bkt = __bkt;
+	  __p = __next;
 	}
-      __catch(...)
+
+      if (__check_bucket && __prev_p->_M_nxt)
 	{
-	  // A failure here means that buckets allocation failed.  We only
-	  // have to restore hash policy previous state.
-	  _M_rehash_policy._M_reset(__state);
-	  __throw_exception_again;
+	  std::size_t __next_bkt
+	    = _HCBase::_M_bucket_index(__prev_p->_M_next(), __n);
+	  if (__next_bkt != __prev_bkt)
+	    __new_buckets[__next_bkt] = __prev_p;
 	}
+
+      _M_deallocate_buckets(_M_buckets, _M_bucket_count);
+      _M_bucket_count = __n;
+      _M_buckets = __new_buckets;
     }
 
 _GLIBCXX_END_NAMESPACE_VERSION
Index: testsuite/23_containers/unordered_multimap/insert/52476.cc
===================================================================
--- testsuite/23_containers/unordered_multimap/insert/52476.cc	(revision 0)
+++ testsuite/23_containers/unordered_multimap/insert/52476.cc	(revision 0)
@@ -0,0 +1,59 @@
+// { dg-options "-std=gnu++0x" }
+//
+// Copyright (C) 2012 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+//
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+//
+// You should have received a copy of the GNU General Public License along
+// with this library; see the file COPYING3.  If not see
+// <http://www.gnu.org/licenses/>.
+
+#include <unordered_map>
+#include <vector>
+#include <algorithm>
+#include <testsuite_hooks.h>
+
+void test01()
+{
+  using namespace std;
+  bool test __attribute__((unused)) = true;
+  
+  unordered_multimap<int, int> mmap;
+  vector<int> values;
+
+  size_t nb_bkts = mmap.bucket_count();
+  int i = 0;
+  for (;; ++i)
+    {
+      mmap.insert(make_pair(0, i));
+      if (mmap.bucket_count() != nb_bkts)
+	// Container got rehash
+	break;
+      values.clear();
+      transform(mmap.begin(), mmap.end(), back_inserter(values),
+		[](const pair<int, int>& p) { return p.second; });
+    }
+
+  vector<int> rehash_values;
+  transform(mmap.begin(), mmap.end(), back_inserter(rehash_values),
+	    [](const pair<int, int>& p) { return p.second; });
+  // Remove the value that result in a rehash
+  rehash_values.erase(remove(rehash_values.begin(), rehash_values.end(), i));
+
+  VERIFY( rehash_values == values );
+}
+  
+int main()
+{
+  test01();
+  return 0;
+}
Index: testsuite/23_containers/unordered_multiset/insert/52476.cc
===================================================================
--- testsuite/23_containers/unordered_multiset/insert/52476.cc	(revision 0)
+++ testsuite/23_containers/unordered_multiset/insert/52476.cc	(revision 0)
@@ -0,0 +1,77 @@
+// { dg-options "-std=gnu++0x" }
+//
+// Copyright (C) 2012 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+//
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+//
+// You should have received a copy of the GNU General Public License along
+// with this library; see the file COPYING3.  If not see
+// <http://www.gnu.org/licenses/>.
+
+#include <unordered_set>
+#include <vector>
+#include <algorithm>
+#include <testsuite_hooks.h>
+
+namespace
+{
+  struct pair_hash
+  {
+    std::size_t
+    operator()(const std::pair<int, int>& p) const noexcept
+    { return std::hash<int>()(p.first); }
+  };
+
+  struct pair_equal_to
+  {
+    bool
+    operator()(const std::pair<int, int>& x,
+	       const std::pair<int, int>& y) const noexcept
+    { return x.first == y.first; }
+  };
+}
+
+void test01()
+{
+  using namespace std;
+  bool test __attribute__((unused)) = true;
+  
+  unordered_multiset<pair<int, int>, pair_hash, pair_equal_to> mset;
+  vector<int> values;
+
+  size_t nb_bkts = mset.bucket_count();
+  int i = 0;
+  for (;; ++i)
+    {
+      mset.insert(make_pair(0, i));
+      if (mset.bucket_count() != nb_bkts)
+	// Container got rehash
+	break;
+      values.clear();
+      transform(mset.begin(), mset.end(), back_inserter(values),
+		[](const pair<int, int>& p) { return p.second; });
+    }
+
+  vector<int> rehash_values;
+  transform(mset.begin(), mset.end(), back_inserter(rehash_values),
+	    [](const pair<int, int>& p) { return p.second; });
+  // Remove the value that result in a rehash
+  rehash_values.erase(remove(rehash_values.begin(), rehash_values.end(), i));
+
+  VERIFY( rehash_values == values );
+}
+  
+int main()
+{
+  test01();
+  return 0;
+}

Follow-Ups:
- Re: [v3] fix libstdc++/52476
  - From: Paolo Carlini

References:
- [v3] fix libstdc++/52446
  - From: François Dumont
- Re: [v3] fix libstdc++/52446
  - From: Paolo Carlini

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]