sorting routine for forward_list

Tue Jul 9 08:51:00 GMT 2013

Jonathan Wakely <jwakely.gcc@gmail.com> writes:

> Your algorithm requires the list size, which isn't stored for a
> std::forward_list.

Yes.  The reason it is needed in advance is to have as close to
equally-sized sublists as possible.  If one instead uses only power-of-2
sized sublists, one can let oneself be surprised by the end of the list
whenever it may come up.  At worst this will lead to about n comparisons
more than when knowing the list size in advance, so for simple
comparisons it would be a tossup in the worst case.

I quite recently discovered list_sort.c in the Linux kernel
<URL:https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/lib/list_sort.c>
which is a quite concise implementation of that approach.  Even while
using the simpler power-of-2 subdivision, the control logic is not done
arithmetically (no counters are being maintained anyway).  Instead, null
pointers in the stack of sorted sublists are used for keeping track of
the set of completed merges, and null pointers in the lists itself are
used for keeping track of the length of sublists.

> Including the time to count the list in the timings brings it down to
> 30-35% faster in some quick tests I did, but that's still significant.

I have not looked at the STL code.  Personally, I think that the speedup
is more than one would reasonably expect from flattening a simple
recursive algorithm.  That makes me suspect that the STL implementation
is not a "best of its kind" and could likely be made at least 20% faster
without a major change of design, assuming it actually works on lists
and does not create and sort arrays of pointers.  But that's just a wild
guess.

> I haven't looked at the actual sort routine (sorting algorithms are
> not my speciality,) but using a comparison object that counts
> invocations shows the number of comparisons is of the same order of
> magnitude as the current std::forward_list::sort() implementation,
> which is good (the standard says there will be approximately N log(N)
> comparisons.)  Do you know its worst-case performance?

Well, anything but O(N log (N)) would be really surprising: it's the
theoretical complexity and pretty much the gold standard.  Quicksort is
an exception which manages to be competitive in the mean when sorting
arrays with items of low comparison cost, even while having an O(n^2)
worst case.

So pretty much any serious sorting algorithm contender will distinguish
itself not significantly with number of comparisons but rather the
associated overhead for accessing and placing the data.

In any case, the worst case of comparisons (n > 0) for this sorting
routine is

n*ceil(log2(n))-2**ceil(log2(n)) + 1

which is less than n*log2(n) for n > 1 (the derivation is pretty
straightforward).  So worst case behavior is as good as it gets.  The
worst case in comparisons is also the worst case with regard to write
operations since then every link needs to get rewritten (with
splice_after) in the merge passes, so one gets the same number of writes
as of comparisons.  But triggering worst case behavior is hard to do by
accident: presorted data in either direction results in best case
behavior (about half the number of comparisons, and no link rewrites),
and random data would likely already show only half the link rewrites
(each comparison then results in a link rewrite with 50% probability).

The Linux kernel version will likely be competitive regarding the
control logic (at least if it is streamlined a bit more), will show
probably an average comparison number hit of few percent for choosing
power-of-2 subdivisions (the worst case losing about n/2 comparisons or
so), and will probably lose performance due to writing every link in
every pass: the merge loop could take a page from mine.

I have not actually benchmarked it: that's just a hunch.

It might make sense for list internal use to build a template based on a
reference/pointer idiom rather than a complete list idiom, for
efficiency and versatility's sake.

>> I have signed assignment papers for various GNU software already, so
>> going through with that procedure for GCC or libstdc++ would not be a
>> problem, given an actual interest.
>
> Great, because that's sometimes the hardest part :-)

-- 
David Kastrup