This is the mail archive of the
mailing list for the GCC project.
Re: [RFC] optional fast iterators for set/map in libstdc++
- From: Gawain BOLTON <gp dot bolton at computer dot org>
- To: janis187 at us dot ibm dot com
- Cc: libstdc++ at gcc dot gnu dot org, gcc-patches at gcc dot gnu dot org
- Date: Fri, 08 May 2009 22:19:53 +0200
- Subject: Re: [RFC] optional fast iterators for set/map in libstdc++
- References: <1241744347.6152.26.camel@janis-laptop>
Janis Johnson wrote:
I'd like some advice about this speedup to iterators for red-black
trees. The idea comes from developers of IBM's proprietary C++
product who implemented it in the C++ library used on AIX. Their
users access the functionality with -Dsomething when compiling, but
it could be done by adding a g++ option that defines the macro, like:
This option augments the base class for map<>, set<>, multimap<> and
multiset<> template classes to allow faster iterator operations.
Warning: -ffast-set-map-iterators causes GCC to generate code that
is not binary compatible with code generated without the option.
All code that operates on the same set or map objects must use the
same variant of this option.
The functional changes are all within include/bits/stl_tree.h and have
no effect on the compilation of src/tree.cc, which doesn't need to
know the size of the class. It uglifies the header with code
protected by #ifdef __GLIBCXX_USE_FAST_SET_MAP_ITERATOR__.
Why do this? Because it speeds up 447.dealII from SPEC CPU2006. On
powerpc64-linux, the speedup over our usual peak options is 32% for
-m32, 26% for -m64. The two other CPU2006 benchmarks that use set
and/or map are omnetpp and xalancbmk, which speed up between 1% and
1.4%. Those speedups are based on the the middle result from 3 runs,
using mainline from a couple of months ago. I have not yet run
benchmarks and any other targets.
I've had a look at your patch and found it quite an interesting
suggestion. However, I do not think it is appropriate for this standard
If you look at the standard map, multimap, set and multiset container
classes you'll see that:
* These containers are not designed to be optimized for iteration
operations. They /*are*/ meant to be efficient at inserting,
removing and finding elements.
o The thing is, the patch to speed up iterating will slow down
two of the operations which are meant to be optimized:
inserting and removing elements.
* Storage efficiency is already one of the weaknesses of these
containers as each node requires 3 pointers for the parent, left
and right nodes as well as storage for the node colour.
o The patch is a basic space for time trade off and will
increase the size of each node by adding two more pointers
for iterating (next/prev).
o On 32-bit platforms this takes up 4 * 4 octets = 16 octets
of storage overhead per node. The patch would increase this
by 50% to 24 octets of overhead per node.
It seems that to be really useful, different container classes should be
created. In this way, users could more easily use them based on their
performance and memory constraints. A compile flag or switch just
cannot give the flexibility required.