[libstdc++] pdqsort - a faster std::sort

Orson Peters orsonpeters@gmail.com
Thu May 19 17:06:00 GMT 2016


Hi Chris, Mark, other readers,

For a while I've been working on a paper for pdqsort, explaining its
design and giving some proofs for runtime bounds. It's not ready for
publication yet, as it's missing proofreading and benchmarks, but for
the purposes of discussion on this mailing list it's pretty much done.

A draft of this paper can be found at
https://drive.google.com/file/d/0B1-vl-dPgKm_T0Fxeno1a0lGT0E/view.

Chris: since last email I've made a minor change that fixes a tiny
(miniscule) performance bug (an off-by-one error for a heuristic), and
fixed a spelling error in a comment. These can be found as the last
two commits here:
https://github.com/orlp/pdqsort/commits/libstdc%2B%2B-merge.


Greetings,

Orson Peters

On Sun, Apr 24, 2016 at 9:36 PM, Marc Glisse <marc.glisse@inria.fr> wrote:
> Hello Chris,
>
> just making sure this doesn't fall off the radar for the stage1 that is just
> starting. No pressure, there's plenty of time :-)
>
> On Tue, 27 Oct 2015, Orson Peters wrote:
>
>> Hi Chris,
>>
>> I've been very busy with real life things, so I haven't had the time
>> to add further benchmarks for
>> pdqsort. I also apologize for this late reply, as I've filtered
>> libstdc++ in my email to a separate
>> folder, and haven't checked in a while.
>>
>> Either way, I'm still entirely happy with the branch. There have been
>> a few changes since I last
>> emailed, but those changes have been strictly spelling errors in the
>> comments. The perks of being a
>> non-native speaker :)
>>
>>
>> Greetings,
>>
>> Orson Peters
>>
>>
>> On Tue, Oct 13, 2015 at 9:28 PM, Christopher Jefferson
>> <chris@bubblescope.net> wrote:
>>>
>>> On 13 October 2015 at 17:13, Marc Glisse <marc.glisse@inria.fr> wrote:
>>>>
>>>> Hi Chris,
>>>>
>>>> did you ever get a chance to look at it? I was reminded of it because I
>>>> have
>>>> an application here where one call to sort is about 1/3 of the total
>>>> running
>>>> time. Using sort takes 3.46s, while using stable_sort takes 3.08s (yes,
>>>> it
>>>> is faster...) and pdqsort is only 2.85s.
>>>
>>>
>>>
>>> Hi Marc,
>>>
>>> I didn't go back and look again, but I will.
>>>
>>> Orson: Are you happy with the branch as it currently exists?
>>>
>>> Chris
>>>
>>>>
>>>> (I didn't look at the implementation myself)
>>>>
>>>>
>>>> On Tue, 21 Apr 2015, Christopher Jefferson wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> Sorry, I've been very busy and not had time to look at this. I plan to
>>>>> do so this week.
>>>>>
>>>>> One final request (sorry!) Have you tried benchmarking this against
>>>>> std::sort with -std=c++03, and with a more expensive type (I recommend
>>>>> making some vectors of length... 100,000? Make some easy to compare
>>>>> (make first element different) and some expensive to compare (make
>>>>> only last elements different).
>>>>>
>>>>> Chris
>>>>>
>>>>>
>>>>> On 12 April 2015 at 07:11, Orson Peters <orsonpeters@gmail.com> wrote:
>>>>>>
>>>>>>
>>>>>> All right, I've put in the work and made the pdqsort source libstdc++
>>>>>> compatible. I've created a seperate branch on Github where I published
>>>>>> all my
>>>>>> changes - you can look at the commit history to see exactly how I got
>>>>>> from my
>>>>>> original code to the libstdc++ compatible code.
>>>>>>
>>>>>> https://github.com/orlp/pdqsort/blob/libstdc%2B%2B-merge/pdqsort.h
>>>>>>
>>>>>> While testing I found one minor bug in pdqsort. In the insertion sort
>>>>>> routine I
>>>>>> decremented (but not dereferenced) an iterator before 'begin', which
>>>>>> is
>>>>>> not
>>>>>> allowed. The fix was trivial (move decrement after range check instead
>>>>>> of
>>>>>> before).
>>>>>>
>>>>>> After fixing the above bug pdqsort passes all libstdc++ tests. You can
>>>>>> confirm
>>>>>> this yourself by inserting the above linked code into bits/stl_algo.h,
>>>>>> renaming
>>>>>> sort to stdsort (or anything else) and pdqsort to sort.
>>>>>>
>>>>>> A couple of notes that maybe should be addressed before merging the
>>>>>> code:
>>>>>>
>>>>>>  - I have attempted to follow the style guide where possible, but it's
>>>>>> very
>>>>>>    likely I've made a mistake here and there due to unfamiliarity.
>>>>>>
>>>>>>  - My insertion sort routines (__pdq_insertion_sort and
>>>>>>    __pdq_unguarded_insertion_sort) are different than those found in
>>>>>>    libstdc++. To my knowledge __pdq_insertion_sort is strictly better
>>>>>> than
>>>>>>    __insertion_sort, because I only move an element if it's out of
>>>>>> place
>>>>>> -
>>>>>>    __insertion_sort always moves an element twice, even if already
>>>>>> placed
>>>>>>    correctly. __pdq_unguarded_insertion_sort is a trade-off vs
>>>>>>    __unguarded_insertion_sort. The pdq version compares iterators on
>>>>>> every
>>>>>>    iteration, the libstdc++ version does an initial check against the
>>>>>> first
>>>>>>    element in the array. For pointers I'm almost certain my
>>>>>> implementation is
>>>>>>    faster (the comparison is virtually free), but it's not a clear cut
>>>>>> in
>>>>>> the
>>>>>>    general case.
>>>>>>
>>>>>>    I'm convinced my insertion sort can replace libstdc++s without
>>>>>> issue,
>>>>>> but I'm
>>>>>>    not sure about the unguarded version.
>>>>>>
>>>>>>  - I have marked all functions inline for now - I don't know which
>>>>>> ones
>>>>>> we'd
>>>>>>    really want to mark inline and which ones we don't.
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Sat, Apr 11, 2015 at 10:50 AM, Marc Glisse <marc.glisse@inria.fr>
>>>>>> wrote:
>>>>>>>
>>>>>>>
>>>>>>> On Sat, 11 Apr 2015, François Dumont wrote:
>>>>>>>
>>>>>>>>    I think comparator is taken by copy simply because the Standard
>>>>>>>> signature is taking it by copy. We expect gcc to do a good job to
>>>>>>>> optimize
>>>>>>>> away the intermediate copy.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> While this is true, I don't think it is a reason to force extra
>>>>>>> copies
>>>>>>> internally when it is just as easy to avoid them (as long as someone
>>>>>>> is
>>>>>>> willing to write and test the patch). For iterators in particular,
>>>>>>> passing
>>>>>>> by value is useful to force the decay from array to pointer, but
>>>>>>> there
>>>>>>> is no
>>>>>>> need to decay an already decayed type.
>>>>>>>
>>>>>>> https://gcc.gnu.org/ml/libstdc++/2013-09/msg00220.html
>>>>>>>
>>>>>>> (I didn't feel strongly enough to write the patch myself, and still
>>>>>>> don't)
>>>>>>>
>>>>>>> --
>>>>>>> Marc Glisse
>>>>>
>>>>>
>>>>>
>>>>
>>>> --
>>>> Marc Glisse
>>
>>
>
> --
> Marc Glisse



More information about the Libstdc++ mailing list