Testing replacement std::sort
chris jefferson
caj@cs.york.ac.uk
Wed May 25 09:58:00 GMT 2005
Paolo Carlini wrote:
>chris jefferson wrote:
>
>
>
>>I would be very thankful to anyone who could run the attached program
>>and return to me along with the flags used and it's output.
>>
>>
>
>I see that people are already answering your call. Nice. However, I
>believe everyone would be even more interested and willing to
>constructively collaborate if you could *explain* a bit what you are
>trying to achieve and which is the rationale, at least... ;)
>
>
>
Sorry, I should perhaps have explained :)
This is part of my continuing aim to add move symantics/rvalue
references to <algorithm>. For almost every function doing this has been
a case of simply looking carefully for where we copy things and seeing
where these copies can be turned into moves, and where they are
unnessasary. In almost all cases the basic algorithm ends up being the
same at the end.
The sole exception to this is introsort_loop, and it's calls to
unguarded_partition.
This algorithm basically works as follows:
Given a range [first,last), choose the median element out of *first,
*(first+(last-first)/2), *last, and then partition the vector and return
a position cut such that:
forall i in [first,cut) !(i>median)
forall i in [cut,last) !(i<median)
Now, if the median value is either *first or *last, all is fine, as we
can leave it where it is and continue on. The problem is if the median
value is in the middle. We have 3 options:
1) Make a copy of it (trisplitmincopy)
2) Swap it with the first or last element (trisplitswap)
3) Make sure we don't move the median (trisplitunmovingpivot)
4) Make sure we don't move any value which is exactly the median value
(whereas at the moment we always swap them to the other side)
4) Would be nice and you'd think would make things faster. However the
unguarded_partition relies on the search stopping on values equal to the
median to avoid it simply falling off the end, and adding in checks for
that slows things down terribly.
2) is fastest if you give a random vector. Unfortunatly if you ask for a
already sorted vector to be sorted, then normally nothing will be moved,
but this swapping around messes things up terribly.
3) Appears to be the most reasonable, it only produces a very small
slowdown, which is easily swallowed if the copying you saved was any
significant size.
I'm currently implementing a more careful version of 3. As a current
rough guide, I get the following:
std::sort an array length 100,000 of:
vector<int> : 40 ticks
vector<int>* (using a comparitor that derefences the *): 36 ticks
vector<int> with moveable sort: 30 ticks
Note that this is only for vector<int>s of length 1. Of course the
second and third stay quite similar as the size of the vector increases
More information about the Libstdc++
mailing list