This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

[libstdc++ PATCH/RFC] Aggressive space optimization for std::vector


The attached patch implements an aggressive space optimization for vector, where we share a bit of the vector code for a POD type with all other POD types with the same size/alignment. So, for instance, vector<void*> and vector<int*> (and on many platforms, vector<int>) would share what code they can, reducing some of the "template bloat" we have. The basic implementation idea is simple: most of the work gets factored out into a base class _Vector_impl<_StoredType, _Allocator> that is wrapped by vector<_Tp, _Allocator>: a small metafunction _Store_as converts the actual value_type _Tp into the storage type _StoredType: for non-PODs, this is the identity function, but for POD types the result will be _Basic_pod<sizeof(_Tp), __alignof__(_Tp)>, i.e., another POD type that has the same size and alignment as _Tp.

Does it really help? The attached "results.txt" gives the results of an experiment that evaluates the sizes of object files with varying numbers of instantiations of the same routine that manipulates vectors. The routine in question is synthetic: it merely exercises the constructors and member functions of a vector of type 'T'. I ran the experiment for all combinations of these factors:
1) 1 to 10 unique POD types for 'T' (they were all pointers, so that all sizes/alignments were the same)
2) Compiler options, varying from very debuggable to very optimized
3) Versions of the code: (a) before the patch, (b) after the patch but with the optimization turned off, and (c) after the patch with the optimization turned on.


The results look promising, I think. We lose in the case where we can't share code (23.8% growth in the routine size at -O0 -g3, mainly because of the extra debug information), but this loss becomes negligible with better optimization (3.3% at -g -O2, -Os; disappears at -O3). If there is any sharing, however, we win: at -O0 -g3, each additional instantiation of the routine where sharing can occur requires 27.7% less code per instantiation of the test routine, 53% less code per instantiation at -O3. In the test case, if we share a single instantiation we win in all but the -O0 -g3 case (that requires us to share 2 instantiations).

There is more that can be done on the micro-optimization level, such as eliminating some of the extra overhead in _Vector_impl (e.g., its iterators can just be pointers, because they will never be exposed to the user). However, I'd like to know if this type of optimization is desired in libstdc++. It uglifies the source a bit, but it can be a boon to users.

Tested on powerpc-apple-darwin and i686-pc-linux-gnu; no regressions. ChangeLog follows. Okay to commit?

Doug

2003-07-28 Doug Gregor <dgregor@apple.com>

	* include/bits/cpp_type_traits.h: (__is_pod): Pointers are POD
	types.
	(_Type_with_alignment): New.
	(_Basic_pod): New.
	(_Store_as): New.
	(__type_traits): Describe the properties of _Basic_pod.
	* include/bits/stl_vector.h: Factor out work that can safely
	operation on a POD type isomorphic to the vale type into
	_Vector_impl. Make std::vector a wrapper around _Vector_impl that
	presents the properly-typed interface to the user.
	* include/bits/vector.tcc: As above. Remove _GLIBCXX_DEPRECATED
	region that defined an undeclared member function.

Attachment: vector_space_opt.patch
Description: Binary data

Attachment: results.txt
Description: Text document


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]