This is the mail archive of the
libstdc++@gcc.gnu.org
mailing list for the libstdc++ project.
Re: [patch, libstdc++] std::shuffle: Generate two swap positions at a time if possible
- From: Eelis <eelis at eelis dot net>
- To: Jonathan Wakely <jwakely at redhat dot com>
- Cc: libstdc++ <libstdc++ at gcc dot gnu dot org>, gcc-patches <gcc-patches at gcc dot gnu dot org>
- Date: Fri, 2 Sep 2016 20:53:34 +0200
- Subject: Re: [patch, libstdc++] std::shuffle: Generate two swap positions at a time if possible
- Authentication-results: sourceware.org; auth=none
- Newsgroups: gmane.comp.gcc.patches, gmane.comp.gcc.libstdc++.devel
- References: <ng536o$4iu$1@ger.gmane.org> <ng53cl$4iu$2@ger.gmane.org> <CAH6eHdTnASg7A5USA_sTAHXPfUFmkrZvF27tZUJhj5ywwDVDag@mail.gmail.com> <5728B8CC.7030405@eelis.net> <20160831124502.GH3342@redhat.com> <57C9C304.90607@eelis.net>
On 2016-09-02 20:20, Eelis van der Weegen wrote:
On 2016-08-31 14:45, Jonathan Wakely wrote:
Is this significantly faster than just using
uniform_int_distribution<_IntType>{0, __bound - 1}(__g) so we don't
need to duplicate the logic? (And people maintaining the code won't
reconvince themselves it's correct every time they look at it :-)
[..]
Could we hoist this test out of the loop somehow?
If we change the loop condition to be __i+1 < __last we don't need to
test it on every iteration, and then after the loop we can just do
the final swap if (__urange % 2).
Reusing std::uniform_int_distribution seems just as fast, so I've removed __generate_random_index_below.
I've hoisted the (__i + 1 == __last) check out of the loop (in a slightly different way), and it seems to shave off a couple more cycles, yay!
Updated patch attached.
Please ignore that patch, I used __g()&1 but that's invalid (the new "UniformRandomBitGenerator" name is misleading).
Updated patch (which uses a proper distribution even for the [0,1] case) attached.
Index: libstdc++-v3/include/bits/stl_algo.h
===================================================================
--- libstdc++-v3/include/bits/stl_algo.h (revision 239895)
+++ libstdc++-v3/include/bits/stl_algo.h (working copy)
@@ -3772,6 +3772,47 @@
typedef typename std::make_unsigned<_DistanceType>::type __ud_type;
typedef typename std::uniform_int_distribution<__ud_type> __distr_type;
typedef typename __distr_type::param_type __p_type;
+
+ typedef typename std::remove_reference<_UniformRandomNumberGenerator>::type _Gen;
+ typedef typename std::common_type<typename _Gen::result_type, __ud_type>::type __uc_type;
+
+ const __uc_type __urngrange = __g.max() - __g.min();
+ const __uc_type __urange = __uc_type(__last - __first);
+
+ if (__urngrange / __urange >= __urange)
+ // I.e. (__urngrange >= __urange * __urange) but without wrap issues.
+ {
+ _RandomAccessIterator __i = __first + 1;
+
+ // Since we know the range isn't empty, an even number of elements
+ // means an uneven number of elements /to swap/, in which case we
+ // do the first one up front:
+
+ if ((__urange % 2) == 0)
+ {
+ __distr_type __d{0, 1};
+ std::iter_swap(__i++, __first + __d(__g));
+ }
+
+ // Now we know that __last - __i is even, so we do the rest in pairs,
+ // using a single distribution invocation to produce swap positions
+ // for two successive elements at a time:
+
+ while (__i != __last)
+ {
+ const __uc_type __swap_range = __uc_type(__i - __first) + 1;
+ const __uc_type __comp_range = __swap_range * (__swap_range + 1);
+
+ std::uniform_int_distribution<__uc_type> __d{0, __comp_range - 1};
+ const __uc_type __pospos = __d(__g);
+
+ std::iter_swap(__i++, __first + (__pospos % __swap_range));
+ std::iter_swap(__i++, __first + (__pospos / __swap_range));
+ }
+
+ return;
+ }
+
__distr_type __d;
for (_RandomAccessIterator __i = __first + 1; __i != __last; ++__i)