This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: PR 71181 Avoid rehash after reserve


On 25/05/2016 16:01, Jonathan Wakely wrote:
On 22/05/16 17:16 +0200, François Dumont wrote:
Hi

To fix 71181 problem I propose to change how we deal with reserve called with pivot values that is to say prime numbers. Now _M_next_bkt always return a value higher than the input value. This way when reverse(97) is called we end up with 199 buckets and so enough space to store 97 values without rehashing.

I have integrated in this patch several other enhancements on the same subject. Improvement of _M_next_resize management when reaching highest bucket number. Remove sentinel value in __prime_list, just need to limit range when calling lower_bound.

I don't think the change to __prime_list is safe. If you compile some
code with GCC 5 and then used a libstdc++.so with this change the old
code would still be looking for the sentinel in the array, and would
not find it.

I think it would be safe to leave the old __prime_list unchanged (and
then not need to change anything in tr1/hashtable_policy.h?) and add a
new array with a different name. Existing code compiled with older
versions of GCC would still find __prime_list, but the new code would
use a different array.



What about this version ? tr1 mode still limit search range as it should to make sure it doesn't need to check lower_bound result. And sentinel is only kept for backward compatibility and commented to make that clear. Maybe there is a clearer way to express that sentinel can be removed on a future version breaking abi ?

François
diff --git a/libstdc++-v3/include/tr1/hashtable_policy.h b/libstdc++-v3/include/tr1/hashtable_policy.h
index 4ee6d45..2f70289 100644
--- a/libstdc++-v3/include/tr1/hashtable_policy.h
+++ b/libstdc++-v3/include/tr1/hashtable_policy.h
@@ -403,7 +403,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
     _M_need_rehash(std::size_t __n_bkt, std::size_t __n_elt,
 		   std::size_t __n_ins) const;
 
-    enum { _S_n_primes = sizeof(unsigned long) != 8 ? 256 : 256 + 48 };
+    // Number of primes minus 1 to avoid check on lower_bound result.
+    enum { _S_n_primes = (sizeof(unsigned long) != 8 ? 256 : 256 + 48) - 1 };
 
     float                _M_max_load_factor;
     float                _M_growth_factor;
diff --git a/libstdc++-v3/src/c++11/hashtable_c++0x.cc b/libstdc++-v3/src/c++11/hashtable_c++0x.cc
index a5e6520..dc0dab5 100644
--- a/libstdc++-v3/src/c++11/hashtable_c++0x.cc
+++ b/libstdc++-v3/src/c++11/hashtable_c++0x.cc
@@ -46,10 +46,10 @@ namespace __detail
   {
     // Optimize lookups involving the first elements of __prime_list.
     // (useful to speed-up, eg, constructors)
-    static const unsigned char __fast_bkt[12]
-      = { 2, 2, 2, 3, 5, 5, 7, 7, 11, 11, 11, 11 };
+    static const unsigned char __fast_bkt[13]
+      = { 2, 2, 3, 5, 5, 7, 7, 11, 11, 11, 11, 13, 13 };
 
-    if (__n <= 11)
+    if (__n <= 12)
       {
 	_M_next_resize =
 	  __builtin_ceil(__fast_bkt[__n] * (long double)_M_max_load_factor);
@@ -58,10 +58,22 @@ namespace __detail
 
     constexpr auto __n_primes
       = sizeof(__prime_list) / sizeof(unsigned long) - 1;
+    constexpr auto __prime_list_end = __prime_list + __n_primes;
     const unsigned long* __next_bkt =
-      std::lower_bound(__prime_list + 5, __prime_list + __n_primes, __n);
-    _M_next_resize =
-      __builtin_ceil(*__next_bkt * (long double)_M_max_load_factor);
+      std::lower_bound(__prime_list + 6, __prime_list_end, __n);
+
+    if (*__next_bkt == __n && __next_bkt != __prime_list_end)
+      ++__next_bkt;
+
+    if (__next_bkt == __prime_list_end)
+      // Set next resize to the max value so that we never try to rehash again
+      // as we already reach the biggest possible bucket number.
+      // Note that it might result in max_load_factor not being respected.
+      _M_next_resize = std::size_t(-1);
+    else
+      _M_next_resize =
+	__builtin_ceil(*__next_bkt * (long double)_M_max_load_factor);
+
     return *__next_bkt;
   }
 
diff --git a/libstdc++-v3/src/shared/hashtable-aux.cc b/libstdc++-v3/src/shared/hashtable-aux.cc
index 081bb12..9cbca98 100644
--- a/libstdc++-v3/src/shared/hashtable-aux.cc
+++ b/libstdc++-v3/src/shared/hashtable-aux.cc
@@ -25,6 +25,7 @@
 namespace __detail
 {
 _GLIBCXX_BEGIN_NAMESPACE_VERSION
+  // The sentinel value is only kept for backward compatibility.
   extern const unsigned long __prime_list[] = // 256 + 1 or 256 + 48 + 1
   {
     2ul, 3ul, 5ul, 7ul, 11ul, 13ul, 17ul, 19ul, 23ul, 29ul, 31ul,

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]