This is the mail archive of the
gcc@gcc.gnu.org
mailing list for the GCC project.
Q: About Optimization to Elide Copy Constructors
- To: egcs at egcs dot cygnus dot com
- Subject: Q: About Optimization to Elide Copy Constructors
- From: Josh Stern <jstern at citilink dot com>
- Date: Thu, 15 Apr 1999 18:21:47 -0500 (CDT)
- Posted-Date: Thu, 15 Apr 1999 18:21:47 -0500 (CDT)
In a recent newsgroup thread ("KAI C++..." on comp.lang.c++ and
comp.os.linux.development.apps) it was claimed that it would
be very difficult (or impossible) for egcs to implement the
C++ optimization that allows the elimination of 'extra' copy
constructor invocations on return from functions that return,
by value, C++ objects with non-trivial constructors.
The person making the strongest form of this claim seemed to
be ill-informed on a number of issues; however, Per Bothner
of Cygnus also commented as follows:
"Multi-pass" means that the compiler uses multiple passes.
That usually means that the different passes do different things.
One of the problems with compilers in general and optimizing
compilers in particular is that often you want some information
before you can conveniently get it. Multiple passes is no
panacea. The way Gcc is currently structured, the return
value optimization is best done during the first pass
(ignoring the preprocessor pass). But we can't do it until
we know what value is actually returned, i.e. after the first pass.
My reason for posting this is to ask about the last sentence in
this paragraph. I'm sure that Per and other EGCS contributors
know a thousand times more about compilers in general, and EGCS
in particular, than I do. However, my naive thought would be
that it is not a practical necessity to know the value returned
in order to implement this optimization at the caller and the
callee - so long as they agree on a convention for implementing
it. In particular, the class declaration for the object returned
must be visible at both points, so it can be determined whether
this object has any non-trivial constructors or copy constructor.
If so, then the convention can be adopted that the caller will
always, when this condition is met, supply a region of local
storage in the stack frame of the caller for the returned object
and pass a pointer to this region as a 'hidden' argument of the
function. The
implementation of the function would always assume such a
parameter (when the conditions are met), and do the equivalent
of placement new construction of the object to be returned
on this storage region wherever appropriate. It would be
up to the caller to further decide whether this newly
formed object is identical to a newly constructed object
in the original scope of the function call, or whether it
is a temporary argument to the constructor of a different
object or an assignment operator, which must then be
destroyed after this operation is complete.
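
The convention described above can be sketched by hand in C++. This is
only an illustration of the idea, not how egcs or any particular
compiler implements it: the class Tracked, the function
make_tracked_lowered, and the two caller functions are hypothetical
names invented for the example. The "lowered" function stands in for
what a compiler might emit for a source function
`Tracked make_tracked()`, and each caller function stands in for the
code a compiler might emit at a call site.

```cpp
#include <cassert>
#include <new>
#include <string>

// Hypothetical class with a non-trivial copy constructor.
// The counters record how often copying and assignment happen.
struct Tracked {
    static int copies;
    static int assignments;
    std::string name;
    explicit Tracked(const std::string& n) : name(n) {}
    Tracked(const Tracked& other) : name(other.name) { ++copies; }
    Tracked& operator=(const Tracked& other) {
        name = other.name;
        ++assignments;
        return *this;
    }
};
int Tracked::copies = 0;
int Tracked::assignments = 0;

// Source form:  Tracked make_tracked() { return Tracked("made in place"); }
// Lowered form: the caller passes a pointer to raw storage as a hidden
// argument, and the callee constructs the result there (placement new).
Tracked* make_tracked_lowered(void* result_slot) {
    return new (result_slot) Tracked("made in place");
}

// Call site 1:  Tracked t = make_tracked();
// The slot IS the new object t, so no copy constructor runs.
std::string init_from_call() {
    alignas(Tracked) unsigned char slot[sizeof(Tracked)];
    Tracked* t = make_tracked_lowered(slot);
    std::string seen = t->name;
    t->~Tracked();  // t goes out of scope here
    return seen;
}

// Call site 2:  existing = make_tracked();
// The slot is a temporary: it feeds the assignment operator and is
// destroyed once the assignment is complete.
std::string assign_from_call() {
    Tracked existing("old");
    alignas(Tracked) unsigned char slot[sizeof(Tracked)];
    Tracked* tmp = make_tracked_lowered(slot);
    existing = *tmp;
    tmp->~Tracked();
    return existing.name;
}
```

In both call sites the callee is identical and never needs to know
which case it is serving; only the caller decides what the slot means.
Neither case invokes the copy constructor, and the second case invokes
the assignment operator exactly once.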
I believe that this strategy would implement the optimization
in question with the following drawbacks: 1) binary
incompatibility with the previous ABI, 2) a different "working
definition" of when an object file is stale wrt a header file,
and 3) a small
loss of efficiency for cases where the object is very small,
the copy constructor was trivial, and it turns out that
only trivial versions of the constructor would have been
called anyway. Does this analysis make sense? If so,
then perhaps it would be something to target for the next
point of ABI breakage (note: I have no idea whether this
would actually be hard or easy to implement in egcs; it's
just a thought).
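
Drawback 3 can be made concrete with a small sketch. It uses the
standard <type_traits> header from later C++ revisions (not available
when this was written), and it only approximates the rule "has any
non-trivial constructor" by checking the default and copy constructors.
The types Point, Counter and Named and both helper functions are
hypothetical names chosen for the example.

```cpp
#include <string>
#include <type_traits>

struct Point   { int x; int y; };                 // all constructors trivial
struct Counter { int n; Counter() : n(0) {} };    // non-trivial default ctor, trivial copy
struct Named   { std::string name; };             // non-trivial copy constructor

// Approximation of the rule proposed above: use the hidden result slot
// whenever the default or the copy constructor is non-trivial.
template <typename T>
constexpr bool proposed_rule_uses_slot() {
    return !std::is_trivially_default_constructible<T>::value
        || !std::is_trivially_copy_constructible<T>::value;
}

// The only case where eliding the copy saves a user-visible call.
template <typename T>
constexpr bool copy_is_nontrivial() {
    return !std::is_trivially_copy_constructible<T>::value;
}
```

Under this approximation, Counter is the case drawback 3 describes: the
proposed rule would route it through the hidden slot even though its
copy constructor is trivial, so a small object that could have been
returned directly pays for the extra indirection. Point is unaffected,
and Named is the case the optimization is meant to help.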
- Josh
jstern@citilink.com