[PATCH] libstdc++: Avoid zero-probability events in discrete_distribution [PR61369]

Lewis Hyatt lhyatt@gmail.com
Tue Dec 15 21:49:53 GMT 2020


On Thu, Nov 19, 2020 at 07:16:52PM +0000, Jonathan Wakely wrote:
> On 19/11/20 12:57 -0500, Lewis Hyatt via Libstdc++ wrote:
> > Hello-
> > 
> > PR61369 (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61369) points out
> > that std::discrete_distribution can return an event even if it has 0
> > probability, and proposes a simple fix. It seems that this fix was never
> > applied, because there was an expectation of redoing this code anyway to
> > use a more efficient algorithm (PR57925). Given that this new algorithm
> > has not been implemented so far, would it make sense to apply the simple
> > fix to address this issue? The attached patch does this.
> > 
> > One question about the patch, a slight annoyance is that only
> > std::lower_bound() is currently available in random.tcc, as this file
> > includes only bits/stl_algobase.h and not bits/stl_algo.h (via including
> > <vector>). Is there a preference between simply including stl_algo.h, or
> > moving upper_bound to stl_algobase.h, where lower_bound is? I noticed
> > that in C++20 mode, <vector> includes stl_algo.h already, so I figured
> > it would be fine to just include it in random.tcc unconditionally.
> 
> But the increase in header sizes in C++20 is a regression:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92546
> 
> Anyway, I'll review this patch tomorrow, thanks for sending it.
> 
> > bootstrap + testing were done on x86-64 GNU/Linux, all tests the same
> > before + after plus 2 new passes from the new test. Thanks for taking a
> > look!
> > 
> > -Lewis
>

Sorry for the noise on this... I realized the patch I sent before missed
one line. I guess there is an internal __generate() function that
presumably needs the same change as operator()(). This version modifies
that one as well and adds a test case. Would you mind please reviewing this
one instead? Thanks!

-Lewis
-------------- next part --------------
From: Lewis Hyatt <lhyatt@gmail.com>
Date: Tue, 1 Dec 2020 11:17:23 -0500
Subject: [PATCH] libstdc++: Avoid zero-probability events in discrete_distribution [PR61369]

Fixes PR61369, as recommended by the PR's submitter, by replacing
lower_bound() with upper_bound(). Currently, if there is an initial subset of
events with probability 0, the first of them will be returned with non-zero
probability (if the underlying RNG returns exactly 0). Switching to
upper_bound() ensures that this will not happen.

libstdc++-v3/ChangeLog:

	PR libstdc++/61369
	* include/bits/random.tcc: Include bits/stl_algo.h.
	(discrete_distribution::operator()): Use upper_bound rather than
	lower_bound.
	(discrete_distribution::__generate_impl): Likewise.
	* testsuite/26_numerics/random/pr60037-neg.cc: Adapt to new line
	numbering in random.tcc.
	* testsuite/26_numerics/random/discrete_distribution/pr61369.cc: New
	test.

diff --git a/libstdc++-v3/include/bits/random.tcc b/libstdc++-v3/include/bits/random.tcc
index 3205442f2f6..a7b33b0739e 100644
--- a/libstdc++-v3/include/bits/random.tcc
+++ b/libstdc++-v3/include/bits/random.tcc
@@ -31,6 +31,7 @@
 #define _RANDOM_TCC 1
 
 #include <numeric> // std::accumulate and std::partial_sum
+#include <bits/stl_algo.h> // std::upper_bound
 
 namespace std _GLIBCXX_VISIBILITY(default)
 {
@@ -2706,7 +2707,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 	  __aurng(__urng);
 
 	const double __p = __aurng();
-	auto __pos = std::lower_bound(__param._M_cp.begin(),
+	auto __pos = std::upper_bound(__param._M_cp.begin(),
 				      __param._M_cp.end(), __p);
 
 	return __pos - __param._M_cp.begin();
@@ -2736,7 +2737,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 	while (__f != __t)
 	  {
 	    const double __p = __aurng();
-	    auto __pos = std::lower_bound(__param._M_cp.begin(),
+	    auto __pos = std::upper_bound(__param._M_cp.begin(),
 					  __param._M_cp.end(), __p);
 
 	    *__f++ = __pos - __param._M_cp.begin();
diff --git a/libstdc++-v3/testsuite/26_numerics/random/discrete_distribution/pr61369.cc b/libstdc++-v3/testsuite/26_numerics/random/discrete_distribution/pr61369.cc
new file mode 100644
index 00000000000..03fd38888da
--- /dev/null
+++ b/libstdc++-v3/testsuite/26_numerics/random/discrete_distribution/pr61369.cc
@@ -0,0 +1,60 @@
+// Copyright (C) 2020 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// You should have received a copy of the GNU General Public License along
+// with this library; see the file COPYING3.  If not see
+// <http://www.gnu.org/licenses/>.
+
+// { dg-do run { target c++11 } }
+// { dg-require-cstdint "" }
+
+#include <random>
+#include <limits>
+#include <testsuite_hooks.h>
+#include <testsuite_common_types.h>
+
+class not_so_random
+{
+public:
+  using result_type = std::uint64_t;
+
+  static constexpr result_type
+  min()
+  { return 0u; }
+
+  static constexpr result_type
+  max()
+  { return std::numeric_limits<result_type>::max(); }
+
+  result_type
+  operator()() const
+  { return 0u; }
+};
+
+void
+test01()
+{
+  std::discrete_distribution<> u{0.0, 0.5, 0.5};
+  not_so_random rng;
+  VERIFY( u(rng) > 0 );
+
+  double a[10];
+  u.__generate(a, std::end(a), rng);
+  for(auto x : a)
+    VERIFY( x > 0 );
+}
+
+int main()
+{
+  test01();
+}
diff --git a/libstdc++-v3/testsuite/26_numerics/random/pr60037-neg.cc b/libstdc++-v3/testsuite/26_numerics/random/pr60037-neg.cc
index ba252ef34fe..4d00d1846c4 100644
--- a/libstdc++-v3/testsuite/26_numerics/random/pr60037-neg.cc
+++ b/libstdc++-v3/testsuite/26_numerics/random/pr60037-neg.cc
@@ -12,4 +12,4 @@ auto x = std::generate_canonical<std::size_t,
 
 // { dg-error "static assertion failed: template argument must be a floating point type" "" { target *-*-* } 167 }
 
-// { dg-error "static assertion failed: template argument must be a floating point type" "" { target *-*-* } 3346 }
+// { dg-error "static assertion failed: template argument must be a floating point type" "" { target *-*-* } 3347 }


More information about the Libstdc++ mailing list