Remove algo logic duplication Round 3

Christopher Jefferson chris@bubblescope.net
Mon Sep 30 21:45:00 GMT 2013


On 30 September 2013 17:11, Paolo Carlini <paolo.carlini@oracle.com> wrote:
> Hi,
>
>
> On 09/29/2013 10:19 PM, Christopher Jefferson wrote:
>>
>> (we also perform 2 comparisons where one should be enough, but that's a
>> different issue)
>> That is worth looking at (although increasingly, I want to investigate
>> doing a fresh sort altogether. In my own code I use an implementation
>> of timsort, which shows much better performance where data is partly
>> sorted. However, that's a different patch/discussion!)
>
> It's definitely a different patch / discussion. What I'm trying to
> understand is how old the issue is, whether is a relatively recent
> regression or it's an old issue. Because the behavior seems first blush
> really weird, but if it dates back to the HP / SGI times, it can probably
> wait a bit more. Any ideas?

This comes from way back in the SGI days by the look of things to me.

The problem arises in __insertion_sort. The logic looks something like:

 if (*__i < *__first)
{ Then *__i has to go all the way to the start of the array. Do that! }
else
{ We don't know where *__i will go, but we know that if we start
moving it back 1 square at a time, we will stop either at or before
__first }

That second part is __unguarded_linear_insert.

In the case where actually *__i has to go all the way back to just
after first, we do one comparison too many (we could stop one place
before). However, the whole point of unguarded linear insert is that
we are trading off one comparison to remove the need for a boundary
condition check. We could add a check in the while of
__unguarded_linear_insert of that form (__i > __first), but in many
cases that would end up being more expensive than the comparisons,
particularly because it is only in very rare cases that we expect to
hit this case.

So the short version is: We do the extra value comparison on purpose,
and taking it out will make the algorithm quite a lot more in iterator
comparisions.

We could add a special case std::sort for very small lists, for
example up to size 4 or 5, and use a 'sort network'. Not sure how
often these cases come up, and if it is worth the code bloat.

Chris




> Thanks,
> Paolo.



More information about the Libstdc++ mailing list