[libstdc++] pdqsort - a faster std::sort

Orson Peters orsonpeters@gmail.com
Wed Apr 8 12:44:00 GMT 2015


I apologize if this message is a duplicate, I'm re-sending it because I just got
a notification the post failed because I accidentally sent the email with HTML
contents. I'm just not very accustomed to using mailing lists.

> This does raise one (possible) problem, which is if your code might be
> slower in C++03 due to copies -- to be exact, I don't care if it's
> slower than it is in C++11, the question is if, with large types,
> pdqsort is slower than std::sort when both are compiled C++03 (I don't
> think it will be, but perhaps another thing to check. Should be easy
> however!)

Oh oh, forgive me. I was confused so my previous statements weren't 100%
accurate.

In the case of swap(*a, *b), iter_swap can be used with no problems whatsoever
in pdqsort. I still think that insertion sort should use moves, not swaps,
despite creating copies in C++03. The current libstdc++ implementation seems to
agree with me (__insertion_sort), so it won't cause a performance regression
compared to the status quo there. I'll change the code in my repository to use
iter_swap.

However, moving the pivot to a local variable for speed does invoke a copy in
C++03. A total of 3 moves happen per partition operation, so if I'm not mistaken
this performance hit will be dwarfed by the amount of moves happening in the
insertion sort step. It does not make sense to remove this optimization while
leaving insertion sort as it is.

But that wasn't the thing that confused me during my study of libstdc++, and
what my previous performance remarks were about.

No, my confusion was regarding __gnu_cxx::__ops::__iter_comp_iter and co that
are used in libstdc++s implementation. I do not understand these comparison
function adaptors purposes, it only limits the power of the comparison function.

Since I move the pivot to a local variable for speed, I am not compatible with
these comparison adaptors. In pdqsorts code I use comp(*iter, value),
comp(value, *iter) and comp(*iter, *iter).

2015-04-08 13:59 GMT+02:00 Christopher Jefferson <chris@bubblescope.net>:
> On 8 April 2015 at 11:12, Orson Peters <orsonpeters@gmail.com> wrote:
>>>  - I assume the sort works on move-only types in C++11? (I can see you
>>> make use of move, but do you require it?).
>>
>> If C++11 is enabled, the implementation only moves or swaps. All moves
>> or swaps are done using ADL-lookup unqualified swap or
>> PDQSORT_PREFER_MOVE which becomes std::move on C++11.
>
> Great
>
>>> There was a reason (and I can't remember what it was now!) that we
>>> used 'iter_swap' rather than 'swap(*a,*b)' on iterators. I think it
>>> might be required to allow sorting std::vector<bool>? Either way, it
>>> might be worth trying it.
>>
>> I noticed this when I was studying libstdc++s implementation, and I
>> couldn't figure out why. The implementation of pdqsort does not
>> currently do this, and for a good reason: it's faster to move the
>> pivot into a local variable for speed.
>>
>> The compiler can't prove that *begin (where the pivot is kept during
>> the partition operation) does not get changed during the partition
>> operation, and thus has to load the memory for every comparison. This
>> will always hit cache after the first few iterations, so it doesn't
>> matter that much, but it's not nothing.
>
> Actually, moving into a local variable of type
> iterator_traits<T>::value_type is OK. See for example
> __insertion_sort, it uses _GLIBCXX_MOVE, like your
> PDQSORT_PREFER_MOVE.
>
> The problem is if you write 'swap(*a, *b)', when
> iterator_traits<T>::reference_type is a proxy reference. For example,
> with vector<bool>, we end up instantiating swap<std::_Bit_iterator>,
> not swap<bool> as you might expect, so the temporary created in swap
> isn't a bool, it's a _Bit_iterator. Now in the case of vector<bool>,
> we've made sure this works, but some other non-true iterators might
> mess up in this sitution. The C++ standard is generally unclear on
> proxy iterators, but it would be nice not to break them any more than
> we have to. You should find (hopefully!) that replacing swap(*a,*b)
> with std::iter_swap(a,b) makes no difference to performance.
>
> It is true, separately, that we probably swap rather than move more
> than we should for C++11 code. This is because when the library is
> compiled in C++03 mode, then std::swap on vectors is O(1), but copy is
> O(n).
>
> This does raise one (possible) problem, which is if your code might be
> slower in C++03 due to copies -- to be exact, I don't care if it's
> slower than it is in C++11, the question is if, with large types,
> pdqsort is slower than std::sort when both are compiled C++03 (I don't
> think it will be, but perhaps another thing to check. Should be easy
> however!)
>
> Chris



More information about the Libstdc++ mailing list