This is the mail archive of the
gcc@gcc.gnu.org
mailing list for the GCC project.
Type name mangling
- To: egcs at cygnus dot com
- Subject: Type name mangling
- From: Nathan Myers <ncm at cantrip dot org>
- Date: Thu, 22 Oct 1998 04:42:39 -0700
The following is a draft of a short paper in connection with a
"Extension namespace" technical report proposal before the
ANSI/ISO C++ committee. I'm posting the draft here to solicit
comments, and as a "heads-up" for those not on the committee
library reflector. Comments welcome.
Nathan Myers
ncm@cantrip.org
-----------
Extern "C" Types for C Link-compatibility DRAFT 1998-10-22
-----------------------------------------
by Nathan Myers ncm@cantrip.org
This is not a proposal for a language change. Rather, it is an
exploration of the effect of an implementation detail on integration
of C++ code with existing C libraries and headers. This is related
to the "Extension Namespaces" technical report proposal, N1166,
distributed in Santa Cruz and in the post-mailing. (See
http://www.cantrip.org/N1166.html)
The Problem
Users of C++ persist in using C libraries in their programs,
and including C headers into their C++ compilation units.
Struct types defined in C headers get mangled into C++ function
linkage-names. As a result, it matters which namespace those
struct types are defined in, and code which "forward-declares"
the types does not link if it is in a different namespace. This
makes a gradual transition from C to properly namespaced C++
difficult or impossible, and interferes with placing C types
in namespaces at all.
The C library subset of the standard C++ library offers good
examples. It defines a type, "struct tm". C programmers are used
to forward-declaring struct tags to use in declaring functions --
as in
    struct tm;
    void maraud(tm*);
-- instead of including <time.h>, and will be very reluctant to give
up the practice for the benefit of C++ users. The resulting linkage
name for maraud() might be "maraud__FP2tm", where, were a conforming
<time.h> included first, it would instead be "maraud__FPQ23std2tm".
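
The effect can be reproduced in plain C++ without any C header. What
follows is only a sketch: the names "record" and "lib" are hypothetical
stand-ins for "tm" and "std", and the two functions are given bodies
(returning 1 and 2) purely so the overloads can be told apart.

```cpp
#include <type_traits>

// A forward declaration at global scope, as a C programmer would write it.
struct record;

// The "real" definition, placed in a namespace (hypothetical stand-in for std).
namespace lib {
    struct record {};
}

// These are two different functions: the parameter types are distinct,
// so each gets its own linkage name.
int maraud(record*)      { return 1; }  // matches the forward-declared tag
int maraud(lib::record*) { return 2; }  // matches the namespaced definition

// The two tags name unrelated types even though they are spelled alike.
static_assert(!std::is_same<record, lib::record>::value,
              "global and namespaced tags are distinct types");
```

A caller holding a lib::record* selects the second overload, so a
library compiled against the global forward declaration would leave
that call unresolved at link time.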
Some implementers solve this problem by recognizing specific names
in the std namespace, such as "tm" and "lconv", and mangling them
into function names as if they were global types. Not only is
this a terrible maintenance risk, it interferes with simultaneous
conformance to various standards (e.g. C89 and C9x), and does
nothing for users who have the same problem with their own C
names.
A Possible Solution
The standard does not specify name mangling, only name visibility.
Conventionally, only function names are affected by the 'extern "C"'
syntax, but consider:
    struct Y;
    namespace X {
      void maraud(::Y*);    //(a) "maraud__1XP1Y"
      extern "C" {
        struct Y {};
        void maraud(Y*);    //(b) "maraud"
      }
    }
    void maraud(Y*);        //(c) "maraud__FP1Y"
    void maraud(X::Y*);     //(d) "maraud__FPQ21X1Y", or "maraud__FP1Y"?
If 'extern "C"' were to affect how the type X::Y mangles into function
names, we could extend to users (and ourselves) the tools to treat C
types precisely as we would like, at the (conforming) source code
level. The four declarations of maraud() above declare four different
functions. Of these, (b) would actually be "unmangled" and would match
a C function. The interesting cases are (c) and (d), which could be
given identical link-names and so, despite being different functions,
refer to the same object-code definition.
What is required to implement this? The 'extern "C"' syntax would
have to be remembered as part of the struct tag name, and (only)
when generating function name manglings would be expressed by
stripping off any namespace prefix. No linker changes would be
needed.
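
The rule can be sketched as a toy mangler. This is an illustration
under stated assumptions, not any compiler's actual code: TypeName,
component(), mangle_type() and mangle_function() are hypothetical
names, only global functions taking a single pointer-to-struct
parameter are handled, and the encoding simply follows the example
linkage names quoted in this paper.

```cpp
#include <string>
#include <vector>

// Hypothetical record of what the mangler remembers about a struct tag.
struct TypeName {
    std::vector<std::string> namespaces;  // enclosing namespaces, outermost first
    std::string name;                     // the struct tag itself
    bool extern_c;                        // was it defined in an extern "C" block?
};

// Length-prefixed name component, e.g. "tm" becomes "2tm".
std::string component(const std::string& s) {
    return std::to_string(s.size()) + s;
}

// Mangle a struct type. Under the rule sketched here, an extern "C"
// type is emitted as if it were global: its namespace prefix is dropped.
std::string mangle_type(const TypeName& t) {
    if (t.extern_c || t.namespaces.empty())
        return component(t.name);
    // Qualified form: "Q", the number of components, then each component.
    std::string out = "Q" + std::to_string(t.namespaces.size() + 1);
    for (const auto& ns : t.namespaces)
        out += component(ns);
    return out + component(t.name);
}

// Mangle a global function that takes one pointer-to-struct parameter.
std::string mangle_function(const std::string& fn, const TypeName& param) {
    return fn + "__FP" + mangle_type(param);
}
```

With these definitions, mangling maraud for a global Y and for an
extern "C" X::Y yields the same string, while an X::Y without the
extern "C" mark keeps its qualified form. Nothing outside the
mangler changes.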
What if a user tried to pass an X::Y pointer to (c), above, if
(d) were not declared? The standard requires a diagnostic for
this case, but an implementation might reasonably guess (c)
anyhow, and continue.
Backward Compatibility
The above mangling scheme might not preserve link-compatibility with
existing user code that mangles namespaced extern "C" types into
non-C function names. What can be done? A conforming extension for
the same purpose might instead apply the semantics described above to
types defined in an 'extern "C-global"' block. If compilers which
implement the semantics described above for 'extern "C"' also provided
the same semantics for 'extern "C-global"', some source-code portability
would be possible.
Conclusion
The problems of link-compatibility with C libraries do not appear to
require compiler knowledge of the specific set of C names involved.
Instead, a minor (and conforming) change to existing practice might
leave room for a more general and portable solution. Such a solution
will be needed in order to absorb and encapsulate libraries defined
by other standards bodies.