This is the mail archive of the
libstdc++@gcc.gnu.org
mailing list for the libstdc++ project.
Re: Optimising std::find on x86 and PPC
Matt Austern wrote:
On Dec 14, 2004, at 6:32 AM, Chris Jefferson wrote:
Hello,
I recently tried changing the std::find random_access overload to
change the main loop from:
difference_type __trip_count = (__last - __first) >> 2;
for (; __trip_count > 0; --__trip_count) {
  if (*__first == __val) return __first;
  ++__first;  // (if/++ pair repeated 4 times)
}
to:
Iterator __newlast = __last - (__last - __first) % 4;
for (; __first < __newlast;) {
  if (*__first == __val) return __first;
  ++__first;  // (if/++ pair repeated 4 times)
}
This knocked about 30% off the time taken on x86 (Note that in a
final version I'd change the %4 into some kind of &ing and/or shifting).
Unfortunately, a quick test on Mac OS X by Andrew Pinski (thank you!)
found that this slightly decreased performance in terms of both
space and time on PPC, as this new version will no longer use the
specialised "count" operator.
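
For reference, the two loop shapes described above can be sketched as complete functions. This is only an illustration, not the libstdc++ source: the names find_trip_count and find_new_last are made up, the trailing loop that handles the last 0 to 3 elements is an assumption (the message does not show how the remainder is handled), and the second variant uses "& 3" in place of "% 4", which gives the same result for a non-negative distance.

```cpp
#include <cassert>
#include <iterator>
#include <vector>

// Hypothetical sketch of variant 1: count the unrolled iterations down to zero.
template <typename RandomIt, typename T>
RandomIt find_trip_count(RandomIt first, RandomIt last, const T& val) {
    typename std::iterator_traits<RandomIt>::difference_type trip_count =
        (last - first) >> 2;
    for (; trip_count > 0; --trip_count) {
        if (*first == val) return first;
        ++first;
        if (*first == val) return first;
        ++first;
        if (*first == val) return first;
        ++first;
        if (*first == val) return first;
        ++first;
    }
    // Assumed tail handling: at most 3 elements remain.
    for (; first != last; ++first)
        if (*first == val) return first;
    return last;
}

// Hypothetical sketch of variant 2: compare the iterator against a
// precomputed end of the unrolled region instead of keeping a counter.
template <typename RandomIt, typename T>
RandomIt find_new_last(RandomIt first, RandomIt last, const T& val) {
    // (last - first) & 3 equals (last - first) % 4 when the distance is >= 0.
    RandomIt new_last = last - ((last - first) & 3);
    while (first < new_last) {
        if (*first == val) return first;
        ++first;
        if (*first == val) return first;
        ++first;
        if (*first == val) return first;
        ++first;
        if (*first == val) return first;
        ++first;
    }
    // Assumed tail handling: at most 3 elements remain.
    for (; first != last; ++first)
        if (*first == val) return first;
    return last;
}
```

Both functions return the same iterator for any input; the only difference is whether the unrolled loop is controlled by a decrementing counter or by an iterator comparison.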
What kind of iterators did you use for your timing test? My guess
would be int* or char*, but you should probably say.
I used int*. I also tried it with vector<int>::iterator, which is not an
actual int*, but the optimiser manages to "pick" an int* out of it and
just use that in the loop.
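
A rough sketch of how the two iterator types could be timed side by side is below. It is not the test actually used; time_find is a hypothetical helper, and the offset_sum parameter exists only so that the results of std::find are used and the calls are less likely to be optimised away. A serious benchmark would need more care than this.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Hypothetical helper: call std::find `reps` times over [first, last) and
// return the elapsed time in nanoseconds. The offset of each match is added
// to offset_sum so the result of every call is observable.
template <typename RandomIt>
long long time_find(RandomIt first, RandomIt last, int val, int reps,
                    std::ptrdiff_t& offset_sum) {
    auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < reps; ++i)
        offset_sum += std::find(first, last, val) - first;
    auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start)
        .count();
}
```

It can be called once with v.data() and v.data() + v.size() for the int* case and once with v.begin() and v.end() for the vector<int>::iterator case, then the two timings compared.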
Chris