This is the mail archive of the
mailing list for the GCC project.
more on the C++ abstraction penalty
- To: gcc at gcc dot gnu dot org
- Subject: more on the C++ abstraction penalty
- From: Joe Buck <jbuck at racerx dot synopsys dot com>
- Date: Tue, 7 Nov 2000 09:43:51 -0800 (PST)
gcc 2.95.2 does very well on the Stepanov abstraction penalty benchmark.
This is misleading. The reason is that every "abstract object" in the
Stepanov test is a reference to a builtin type or a one-element struct,
and these are handled very nicely by a special optimization we've been
But real C++ programming tends to pass objects that have more than one
element through inline functions. gcc tends to generate lots of
unnecessary loads and stores for such code. One library that I like
that really suffers from this is the
Generic Graph Component Library (see http://www.lsc.nd.edu/research/ggcl/).
Compilers such as KAI's do a much better job here.
ADDRESSOF is currently broken in the snapshots and should be fixed, but
now that we have a tree-based inliner, we should be able to do much
better, by doing a fairly simple pass on the tree after a function is
inlined. What I have in mind is the following (KAI does this kind of thing):
transformation #1: propagation of references. For each local reference,
we replace all uses of it by uses of the object the reference points to.
The local reference is then removed or marked unused (if we have to keep
it to generate debug information).
(Other types of value propagation could be included here as well, e.g.
pointers that always point to known values, but to keep it simple it
should probably be done only for objects that are initialized and then
never changed, as it's tricky to do full value propagation on trees).
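To make transformation #1 concrete, here is a rough source-level sketch of the before and after. It is only an illustration: the real pass would work on the compiler's trees after inlining, not on source text, and the names Point, sum_before, sum_after and check_reference_propagation are made up for this example.

```cpp
#include <cassert>

// Hypothetical two-element struct, used only for illustration.
struct Point { int x; int y; };

// Before: roughly what a function body might look like once an inline
// function taking `const Point&` has been inlined into it. `r` is a local
// reference that is initialized once and never changed.
int sum_before(int a, int b) {
    Point p = {a, b};
    const Point& r = p;
    return r.x + r.y;
}

// After transformation #1: every use of `r` has been replaced by a use of
// `p`, the object it refers to, and `r` itself has been removed.
int sum_after(int a, int b) {
    Point p = {a, b};
    return p.x + p.y;
}

// The rewrite must not change behaviour; both versions agree.
void check_reference_propagation() {
    assert(sum_before(3, 4) == 7);
    assert(sum_after(3, 4) == 7);
}
```

Once the reference is gone, nothing in sum_after takes the address of p any more.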
transformation #2: struct splitting. For each local struct (or class
object) whose address is not taken, replace it with one variable per element.
For temporary objects this is no problem ... for named objects there is
the problem of generating the debug information for these split structs.
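A similar source-level sketch for transformation #2, again hand-written and with made-up names (Vec2, dot_before, dot_after, check_struct_splitting); the actual pass would do this on trees and would still have to deal with the debug information issue mentioned above.

```cpp
#include <cassert>

// Hypothetical two-element struct, used only for illustration.
struct Vec2 { double x; double y; };

// Before: two local structs whose addresses are never taken.
double dot_before(double ax, double ay, double bx, double by) {
    Vec2 a = {ax, ay};
    Vec2 b = {bx, by};
    return a.x * b.x + a.y * b.y;
}

// After transformation #2: each struct has been replaced by one scalar
// variable per element, so each element can live in a register on its own.
double dot_after(double ax, double ay, double bx, double by) {
    double a_x = ax, a_y = ay;
    double b_x = bx, b_y = by;
    return a_x * b_x + a_y * b_y;
}

// Both versions compute the same dot product.
void check_struct_splitting() {
    assert(dot_before(1.0, 2.0, 3.0, 4.0) == 11.0);
    assert(dot_after(1.0, 2.0, 3.0, 4.0) == 11.0);
}
```

With the structs split, there is no aggregate left that would need to be kept in memory.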
#2 can create more opportunities for #1, so we can iterate until there
is no more gain.
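One way the two transformations might feed each other, sketched at source level: a struct with a reference member, once split, leaves behind a plain local reference that #1 can then propagate. This is my guess at a typical case rather than a description of any existing implementation, and Cursor and the scaled_* functions are hypothetical names.

```cpp
#include <cassert>

// Hypothetical wrapper holding a reference and a value.
struct Cursor { const int& value; int scale; };

// Original code: a local struct whose address is not taken.
int scaled_before(int v, int s) {
    int local = v;
    Cursor c = {local, s};
    return c.value * c.scale;
}

// After #2 (struct splitting): one variable per element. The reference
// member has become an ordinary local reference.
int scaled_split(int v, int s) {
    int local = v;
    const int& c_value = local;
    int c_scale = s;
    return c_value * c_scale;
}

// After #1 (reference propagation) on the result of #2: uses of `c_value`
// are replaced by `local` and the reference disappears.
int scaled_final(int v, int s) {
    int local = v;
    int c_scale = s;
    return local * c_scale;
}

// All three stages must agree.
void check_iteration() {
    assert(scaled_before(6, 7) == 42);
    assert(scaled_split(6, 7) == 42);
    assert(scaled_final(6, 7) == 42);
}
```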
We should be able to get huge performance gains for a lot of C++ code with
this kind of mechanism.