This is the mail archive of the
libstdc++@gcc.gnu.org
mailing list for the libstdc++ project.
Performance of std::max_element and std::min_element
- From: David GONZALEZ MALINE <David dot Gonzalez dot Maline at cern dot ch>
- To: <libstdc++ at gcc dot gnu dot org>
- Date: Mon, 28 Apr 2008 14:03:52 +0200
- Subject: Performance of std::max_element and std::min_element
Dear Developers,
I have written a month ago about the performance of the
std::next_permutation. As I said before, we are making some performance
tests on some of the standard library algorithms against the ones we use
in ROOT (see Thu, 13 Mar 2008).
This time I want to show the results obtained for the std::max_element
and std::min_elements. We have tried the algorithm in three different
machines and the results on performance are as follows (on microseconds
per call):
std
algorithms ROOT algorithms
MacOSX
(with Intel 64 bits & gcc 4.0)
317.703 84.94
Linux 64 bits
(scientific Linux 4 & gcc 3.4.6)
398.7 129.113
Linux 32 bits
(scientific Linux 4 & gcc 3.4.6)
412.82 623.95 // Only case
where there is a worst time.
As the table shows, only in the version of Linux with 32 bits the
performance is improved. Both algorithms are very similar, with the
difference that in the std it seems like it dereferences one of the
iterators in every epoch of the loop:
if (__first == __last)
return __first;
_ForwardIterator __result = __first;
while (++__first != __last)
if (*__result < *__first) /// Always dereference the
*__result iterator!
__result = __first;
return __result;
In the ROOT algorithm, the same value is stored in a register variable:
Long64_t *TMath::LocMin*(Long64_t n, *const* Short_t *a)
{
/// Return index of array with the minimum element.
/ /// If more than one element is minimum returns first found.
/
*if* (n <= 0 || !a) *return* -1;
Short_t xmin = a[0]; // This is the variable used for the comparison.
Long64_t loc = 0;
*for* (Long64_t i = 1; i < n; i++) {
*if* (xmin > a[i]) {
xmin = a[i];
loc = i;
}
}
*return* loc;
}
It seems like in most of the systems, the compiler optimizes the second
version better.
Hope it helps.
David