[Bug libstdc++/61107] stl_algo.h: std::__inplace_stable_partition() doesn't process the whole data range

Jonathan Wakely jwakely@redhat.com
Wed Nov 12 18:09:00 GMT 2014


On 12/11/14 14:56 +0000, Christopher Jefferson wrote:
>I did suggest this change, so I feel I should defend it!
>
>Our testing of many algorithms is woefully slim, that is how (for
>example) the segfaulting bug in std::nth_element got through into a
>release -- the tests for that algorithm were terrible, and basically
>didn't test the functionality on enough possible inputs.
>
>I consider a series of random inputs to be a good practical way of
>getting decent code coverage and performing a basic sanity test,
>without the need for an excessive amount of coding. While these tests
>aren't showing anything yet:
>
>a) We didn't know that until after they were written and executed, and
>b) They might help catch problems in future, particularly if other
>algorithms change their underlying implementation.
>
>I recently added a set of similar tests for a number of algorithms.
>
>If you have an alternative suggestion for better testing I'd be happy
>to hear it, but I think the algorithms need something beyond just one
>or two hardwired inputs.

OK, that makes sense. Testing sorting algorithms is *hard* because the
edge cases are not obvious, so it's difficult to craft input that is
known to stress the possible failure modes.

But if we're going to use random numbers then we should definitely not
use std::random_device directly (or we'll get truly random failures
that are very hard to reproduce), and I'd like to see more iterations
with the PRNG than just for (int i = 0; i < 10; ++i). Because the PRNG
is deterministic, ten iterations *is* just ten hardwired inputs (albeit
with up to 1000 elements per iteration).
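
To make that concrete, here's a minimal sketch of the kind of
deterministic randomized test I have in mind, using
std::stable_partition as a stand-in for the algorithm under test (the
seed, round count and distributions are all illustrative choices, not
taken from the actual testsuite):

  #include <algorithm>
  #include <cassert>
  #include <random>
  #include <vector>

  int main()
  {
    std::mt19937 rng(42);  // fixed seed: every run tests the same inputs
    std::uniform_int_distribution<int> value(0, 100);
    std::uniform_int_distribution<int> size(0, 1000);

    for (int round = 0; round < 1000; ++round)
    {
      std::vector<int> v(size(rng));
      for (int& x : v)
        x = value(rng);

      auto is_even = [](int x) { return x % 2 == 0; };
      auto mid = std::stable_partition(v.begin(), v.end(), is_even);

      // Everything before the partition point satisfies the
      // predicate, everything after it does not.
      assert(std::all_of(v.begin(), mid, is_even));
      assert(std::none_of(mid, v.end(), is_even));
    }
  }

Deterministic, but with far more than ten distinct inputs per run.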

(Again, my reproducer for the std::set memory leaks used 10 rounds
with a PRNG because I was lazy and just wanted to demonstrate there
was still a bug, not because I thought that was a good test.)

We could use std::random_device to seed the PRNG iff we print the seed
to the logs, so the test can easily be run again with the same seed.
That would avoid testing the exact same data on every run, while still
being reproducible. I think it's best to avoid std::random_device
though.
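
If we did go that route, the pattern would be something like this
sketch (the exact logging call doesn't matter, only that the seed ends
up in the test log):

  #include <cstdio>
  #include <random>

  int main()
  {
    const unsigned seed = std::random_device{}();
    std::printf("seed: %u\n", seed);  // recorded in the log for replay
    std::mt19937 rng(seed);
    // ... run the randomized checks with rng; a failure can be
    // reproduced by hard-coding the printed seed here ...
  }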

If we test more than 10 rounds, the number should be configurable so
it can be overridden on simulators using a -D option in the dg-options.
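
Something like this hypothetical TEST_ROUNDS macro would do it, with a
dg-options line such as -DTEST_ROUNDS=10 for slow simulators:

  #ifndef TEST_ROUNDS
  # define TEST_ROUNDS 1000  // default for normal test runs
  #endif

  #include <random>

  int main()
  {
    std::mt19937 rng(42);
    for (int round = 0; round < TEST_ROUNDS; ++round)
    {
      // ... one randomized check per round, as sketched above ...
      (void) rng();
    }
  }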

So Chris, if you think the new test adds value then I'm OK with it
being added.

I reserve the right to remain sceptical about just throwing random
numbers at the sorting algos though. I'd prefer to see something more
methodical: taking several permutations of a fixed input and verifying
that it is always sorted correctly; verifying that sorting it twice
does nothing (e.g. do we have tests that stable_sort on an
already-sorted range never tries to swap elements, or that elements
which do get swapped are not equivalent?); sorting a range composed of
sorted subranges (1,2,3,1,3,5,2,4,6,1,3,6); and that sort of thing.
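
For instance, here's a sketch of the permutation and sort-twice checks
(stable_sort is just one candidate, and the inputs are illustrative):

  #include <algorithm>
  #include <cassert>
  #include <vector>

  int main()
  {
    // Exhaustively sort every permutation of a small multiset.
    // perm must start sorted for next_permutation to visit them all.
    std::vector<int> perm{1, 1, 2, 3, 3, 5};
    const std::vector<int> expected = perm;

    do {
      std::vector<int> v = perm;
      std::stable_sort(v.begin(), v.end());
      assert(v == expected);  // every permutation sorts correctly
    } while (std::next_permutation(perm.begin(), perm.end()));

    // Sorting twice does nothing: re-sorting an already-sorted
    // range must leave it unchanged.
    std::vector<int> v{1, 2, 3, 1, 3, 5, 2, 4, 6, 1, 3, 6};
    std::stable_sort(v.begin(), v.end());
    const std::vector<int> once = v;
    std::stable_sort(v.begin(), v.end());
    assert(v == once);
  }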

But sorting algorithms are not my strong point, and if I'm not going
to write those tests myself I have no right to complain! :-)


