This is the mail archive of the
mailing list for the libstdc++ project.
[libstdc++ PATCH/RFC] Aggressive space optimization for std::vector
- From: Doug Gregor <dgregor at apple dot com>
- To: libstdc++ at gcc dot gnu dot org
- Cc: gcc-patches at gcc dot gnu dot org
- Date: Mon, 28 Jul 2003 12:52:31 -0700
- Subject: [libstdc++ PATCH/RFC] Aggressive space optimization for std::vector
The attached patch implements an aggressive space optimization for
vector, where we share a bit of the vector code for a POD type with all
other POD types with the same size/alignment. So, for instance,
vector<void*> and vector<int*> (and on many platforms, vector<int>)
would share what code they can, reducing some of the "template bloat"
we have. The basic implementation idea is simple: most of the work
gets factored out into a base class _Vector_impl<_StoredType,
_Allocator> that is wrapped by vector<_Tp, _Allocator>: a small
metafunction _Store_as converts the actual value_type _Tp into the
storage type _StoredType: for non-PODs, this is the identity function,
but for POD types the result will be _Basic_pod<sizeof(_Tp),
__alignof__(_Tp)>, i.e., another POD type that has the same size and
alignment as _Tp.
Does it really help? The attached "results.txt" gives the results of an
experiment that evaluates the sizes of object files with varying
numbers of instantiations of the same routine that manipulates vectors.
The routine in question is synthetic: it merely exercises the
constructors and member functions of a vector of type 'T'. I ran the
experiment for all combinations of these factors:
1) 1 to 10 unique POD types for 'T' (they were all pointers, so that
all sizes/alignments were the same)
2) Compiler options, varying from very debuggable to very optimized
3) Versions of the code: (a) before the patch, (b) after the patch but
with the optimization turned off, and (c) after the patch with the
optimization turned on.
The results look promising, I think. We lose in the case where we can't
share code (23.8% growth in the routine size at -O0 -g3, mainly because
of the extra debug information), but this loss becomes negligible with
better optimization (3.3% at -g -O2, -Os; disappears at -O3). If there
is any sharing, however, we win: at -O0 -g3, each additional
instantiation of the routine where sharing can occur requires 27.7%
less code per instantiation of the test routine, 53% less code per
instantiation at -O3. In the test case, if we share a single
instantiation we win in all but the -O0 -g3 case (that requires us to
share 2 instantiations).
There is more that can be done on the micro-optimization level, such as
eliminating some of the extra overhead in _Vector_impl (e.g., its
iterators can just be pointers, because they will never be exposed to
the user). However, I'd like to know if this type of optimization is
desired in libstdc++. It uglifies the source a bit, but it can be a
boon to users.
Tested on powerpc-apple-darwin and i686-pc-linux-gnu; no regressions.
ChangeLog follows. Okay to commit?
2003-07-28 Doug Gregor <firstname.lastname@example.org>
* include/bits/cpp_type_traits.h: (__is_pod): Pointers are POD
(__type_traits): Describe the properties of _Basic_pod.
* include/bits/stl_vector.h: Factor out work that can safely
operation on a POD type isomorphic to the vale type into
_Vector_impl. Make std::vector a wrapper around _Vector_impl that
presents the properly-typed interface to the user.
* include/bits/vector.tcc: As above. Remove _GLIBCXX_DEPRECATED
region that defined an undeclared member function.
Description: Binary data
Description: Text document