This is the mail archive of the
`libstdc++@gcc.gnu.org`
mailing list for the libstdc++ project.

Index Nav: | [Date Index] [Subject Index] [Author Index] [Thread Index] | |
---|---|---|

Message Nav: | [Date Prev] [Date Next] | [Thread Prev] [Thread Next] |

Other format: | [Raw text] |

*From*: David Kastrup <dak at gnu dot org>*To*: libstdc++ at gcc dot gnu dot org*Date*: Thu, 09 Oct 2014 12:40:07 +0200*Subject*: Re: Possible improvement to std::list::sort*Authentication-results*: sourceware.org; auth=none*References*: <CAOG+gn38rWvw2WU7mzvNcbHnJm-f48-L7Tj+O9YjCWWmS71xEg at mail dot gmail dot com>

Dhruv Matani <dhruvbird@gmail.com> writes: > To improve std::list's sort, (i.e. reduce the # of comparisons), > there's a trivial fix that can be done. i.e. Use blocks of 4 elements > to merge instead of 1. i.e. Take 4 elements at a time and sort them > instead of 1 at a time. If we take 1 element, then in the worst case, > sorting 4 elements uses 7 comparisons, and in the best case uses 4 > comparisons. If we use the min. # of comparisons (5) (can go down to 3 > in certain cases), No, it can't. If we have 4 elements, any scheme that can go down to 3 comparisons in certain cases will not manage to get along with at most 5 comparisons in every other case. You can't have both. Two comparisons can either yield two sorted pairs, or in the best case a sorted triple and an untested element. In the latter case I _have_ to compare with the middle of the triple first in order to keep the maximum number of comparisons down to 5. In the former case, the possibility of getting out with a single comparison would only be there if I compared the smaller of one pair with the larger of the other pair. And if the outcome is the more likely one (the smaller of one pair being smaller than the larger of the other pair), and assuming that I had ab and cd, then I just ruled out abcd by finding c smaller than b and still have in the race acbd acdb cabd cadb cdab and I can tell at most 4 combinations apart with 2 more comparisons. > then we're still doing better than the worst case earlier. Of course, > if you special-case for 8, we end up using at most 16 comparisons, > which is a lot better than the # of comparisons you'd use to sort 8 > elements using merge sort. Uh what? The only known sorting _scheme_ (rather than open-coded sorting chains) that manages to get along with 16 comparisons (which is the information theoretic minimum as well) is merge insertion. However, the data flow of merge insertion is neither suitable for lists nor for arrays. So it's a rather theoretical sorting method. In comparison, sorting 8 elements with mergesort has a worst case number of comparisons of 17. I would not call 16 "a lot better" than 17. > These tiny hacks could improve the running time of std::list::sort by > a decent fraction. With respect to comparisons, you cannot really improve mergesort significantly, and its data flow is perfectly suited for linked lists. So the only realistic area of improvement is to minimize administrational overhead. Substituting merge insertion for mergesort for 8 elements would let the administrational overhead _explode_, more than offsetting the wins from being able to get along with 16 rather than 17 comparisons. If we _do_ care about degenerate presorted cases, it is also worth noting that merge insertion is rather hard to orient to the degenerate cases. The minimum number is 14 comparisons for merge insertion, and 12 comparisons for merge sort. So _if_ we expect the degenerate case to be even slightly more relevant than others, merge sort books in better than merge insertion in that respect. The bottom line is that a low-overhead implementation of mergesort is basically unbeatable for linked lists. Not in small scale, not in large scale. The current implementation is lacking in the low-overhead department. -- David Kastrup

**References**:**Re: Possible improvement to std::list::sort***From:*Dhruv Matani

Index Nav: | [Date Index] [Subject Index] [Author Index] [Thread Index] | |
---|---|---|

Message Nav: | [Date Prev] [Date Next] | [Thread Prev] [Thread Next] |