I'd like some advice about this speedup to iterators for red-black
trees. The idea comes from developers of IBM's proprietary C++
product who implemented it in the C++ library used on AIX. Their
users access the functionality with -Dsomething when compiling, but
it could be done by adding a g++ option that defines the macro, like:

  -ffast-set-map-iterators
      This option augments the base class for map<>, set<>, multimap<> and
      multiset<> template classes to allow faster iterator operations.

      Warning: -ffast-set-map-iterators causes GCC to generate code that
      is not binary compatible with code generated without the option.
      All code that operates on the same set or map objects must use the
      same variant of this option.

The functional changes are all within include/bits/stl_tree.h and have
no effect on the compilation of src/tree.cc, which doesn't need to
know the size of the class. The change does uglify the header with
code protected by #ifdef __GLIBCXX_USE_FAST_SET_MAP_ITERATOR__.
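
To illustrate why a macro-guarded change like this breaks binary
compatibility, here is a minimal sketch. It is not the patch: the struct
names, their members and the extra pointer are all hypothetical stand-ins,
and only the macro name comes from the description above. The point is just
that the two variants of the class have different layouts, so objects built
with one variant cannot safely be handed to code built with the other.

```cpp
#include <cstddef>

// Hypothetical stand-in for the class the option would change; this is
// NOT the real libstdc++ type and the members are illustrative only.
struct tree_base_plain {
    int              color;
    tree_base_plain* parent;
    tree_base_plain* left;
    tree_base_plain* right;
};

// Same class with one extra pointer. The extra member is an assumption
// made for this sketch; the members the real change adds are not shown
// in this message.
struct tree_base_augmented {
    int                  color;
    tree_base_augmented* parent;
    tree_base_augmented* left;
    tree_base_augmented* right;
    tree_base_augmented* extra;   // hypothetical
};

// A header would pick one layout depending on the macro, so every
// translation unit sharing these objects must agree on its setting.
#ifdef __GLIBCXX_USE_FAST_SET_MAP_ITERATOR__
typedef tree_base_augmented tree_base;
#else
typedef tree_base_plain tree_base;
#endif

// The layouts differ, which is what makes the variants incompatible.
static_assert(sizeof(tree_base_augmented) > sizeof(tree_base_plain),
              "augmented layout is larger than the plain one");

std::size_t plain_size()     { return sizeof(tree_base_plain); }
std::size_t augmented_size() { return sizeof(tree_base_augmented); }
std::size_t selected_size()  { return sizeof(tree_base); }
```

Compiling this with and without -D__GLIBCXX_USE_FAST_SET_MAP_ITERATOR__
changes what selected_size() returns, which is the kind of mismatch the
warning in the proposed option text is about.
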
Why do this? Because it speeds up 447.dealII from SPEC CPU2006. On
powerpc64-linux, the speedup over our usual peak options is 32% for
-m32, 26% for -m64. The two other CPU2006 benchmarks that use set
and/or map are omnetpp and xalancbmk, which speed up between 1% and
1.4%. Those speedups are based on the middle result from 3 runs,
using mainline from a couple of months ago. I have not yet run
benchmarks on any other targets.