This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



Type name mangling



The following is a draft of a short paper in connection with a
"Extension namespace" technical report proposal before the 
ANSI/ISO C++ committee.  I'm posting the draft here to solicit 
comments, and as a "heads-up" for those not on the committee
library reflector.  Comments welcome.

Nathan Myers
ncm@cantrip.org

-----------

Extern "C" Types for C Link-compatibility    DRAFT 1998-10-22
-----------------------------------------
by Nathan Myers  ncm@cantrip.org

This is not a proposal for a language change.  Rather, it is an 
exploration of the effect of an implementation detail on integration 
of C++ code with existing C libraries and headers.  This is related 
to the "Extension Namespaces" technical report proposal, N1166, 
distributed in Santa Cruz and in the post-mailing.  (See 
http://www.cantrip.org/N1166.html)

The Problem

Users of C++ persist in using C libraries in their programs,
and including C headers into their C++ compilation units.

Struct types defined in C headers get mangled into C++ function
linkage-names.  As a result, it matters which namespace those 
struct types are defined in, and code which "forward-declares"
the types does not link if it is in a different namespace.  This 
makes a gradual transition from C to properly namespaced C++ 
difficult or impossible, and interferes with placing C types 
in namespaces at all.

The C library subset of the standard C++ library offers good 
examples.  It defines a type, "struct tm".  C programmers are used 
to forward-declaring struct tags to use in declaring functions -- 
as in

  struct tm;
  void maraud(struct tm*);

-- instead of including <time.h>, and will be very reluctant to give
up the practice for the benefit of C++ users.  The resulting linkage 
name for maraud() might be "maraud__FP2tm", where, were a conforming 
<time.h> included first, it would instead be "maraud__FPQ23std2tm".
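
The effect can be checked at the source level without looking at any 
object file.  The following is only a sketch: the namespace "lib" and 
the struct "rec" are hypothetical stand-ins for std and tm, chosen so 
that the example does not depend on how any particular library 
declares tm.

```cpp
#include <type_traits>

struct rec;                      // forward declaration, as a C header would write it

namespace lib {                  // hypothetical namespace standing in for std
    struct rec { int field; };   // the real definition, placed in a namespace
}

void maraud(rec*);               // parameter type is ::rec
void maraud(lib::rec*);          // a separate overload: parameter type is lib::rec

// The two tags name unrelated types, so the two declarations are distinct
// functions, and a conventional mangling gives them distinct link names.
constexpr bool same_type = std::is_same<rec, lib::rec>::value;
constexpr bool same_signature =
    std::is_same<void(rec*), void(lib::rec*)>::value;

static_assert(!same_type, "::rec and lib::rec are distinct types");
static_assert(!same_signature, "the two overloads have distinct signatures");
```

A caller that saw only the forward declaration therefore refers to a 
different function than the one a library built against the namespaced 
definition provides.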

Some implementers solve this problem by recognizing specific names 
in the std namespace, such as "tm" and "lconv", and mangling them 
into function names as if they were global types.   Not only is 
this a terrible maintenance risk, it interferes with simultaneous 
conformance to various standards (e.g. C89 and C9x), and does 
nothing for users who have the same problem with their own C 
names.

A Possible Solution

The standard does not specify name mangling, only name visibility.
Conventionally, only function names are affected by the 'extern "C"'
syntax, but consider:

  struct Y;
  namespace X {
    void maraud(::Y*); //(a)  "maraud__1XP1Y"
    extern "C" {
      struct Y {};
      void maraud(Y*); //(b)  "maraud"
    }
  }
  void maraud(Y*);     //(c)  "maraud__FP1Y"
  void maraud(X::Y*);  //(d)  "maraud__FPQ21X1Y", or "maraud__FP1Y"?

If 'extern "C"' were to affect how the type X::Y mangles into function
names, we could extend to users (and ourselves) the tools to treat C
types precisely as we would like, at the (conforming) source code 
level.  The four declarations of maraud() above declare four different 
functions.  Of these, (b) would actually be "unmangled" and would match 
a C function.  The interesting cases are (c) and (d), which could be given
identical link-names and so, despite being different functions, refer
to the same object-code definition.

What is required to implement this?  The 'extern "C"' syntax would
have to be remembered as part of the struct tag name, and (only) 
when generating function name manglings would be expressed by 
stripping off any namespace prefix.  No linker changes would be
needed.
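
As a rough illustration of the rule, here is a toy mangler.  It is a 
sketch under stated assumptions, not any compiler's actual code: the 
names StructType, mangle_id, mangle_type and mangle_fn are invented 
for this example, the encoding simply imitates the link names quoted 
above, and only global functions taking one pointer-to-struct 
parameter are handled.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical model of a struct type as a mangler might record it.
struct StructType {
    std::vector<std::string> scopes;  // enclosing namespaces, outermost first
    std::string name;                 // the struct tag
    bool extern_c;                    // was the tag defined in an extern "C" block?
};

// Length-prefixed identifier, e.g. "tm" becomes "2tm".
std::string mangle_id(const std::string& id) {
    assert(!id.empty());
    return std::to_string(id.size()) + id;
}

// Encode a struct type.  An extern "C" tag drops its namespace prefix,
// so it is encoded exactly like a global type of the same name.
std::string mangle_type(const StructType& t) {
    if (t.extern_c || t.scopes.empty())
        return mangle_id(t.name);
    std::string out = "Q" + std::to_string(t.scopes.size() + 1);
    for (const std::string& s : t.scopes)
        out += mangle_id(s);
    return out + mangle_id(t.name);
}

// Encode a global function taking a single pointer-to-struct parameter.
std::string mangle_fn(const std::string& fn, const StructType& param) {
    return fn + "__FP" + mangle_type(param);
}
```

Under these assumptions, mangle_fn("maraud", {{"X"}, "Y", false}) gives 
"maraud__FPQ21X1Y", while the same call with the extern "C" flag set 
gives "maraud__FP1Y", the name a global Y would produce.  The flag is 
consulted only while encoding; name lookup and type identity are 
untouched.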

What if a user tried to pass an X::Y pointer to (c), above, if
(d) were not declared?  The standard requires a diagnostic for
this case, but an implementation might reasonably guess (c) 
anyhow, and continue.
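
The reason a diagnostic is required can be seen in a small sketch, 
which reuses the names X, Y and maraud from the example above and 
declares only (c).  The constant "converts" is introduced here purely 
for illustration.

```cpp
#include <type_traits>

struct Y;                          // global tag, left incomplete
namespace X {
    extern "C" { struct Y {}; }    // X::Y, defined inside an extern "C" block
}

void maraud(Y*);                   // (c) only; (d) is deliberately not declared

// X::Y* has no implicit conversion to ::Y*, so a call maraud(p) with
// p of type X::Y* finds no viable function and is ill-formed.
constexpr bool converts = std::is_convertible<X::Y*, ::Y*>::value;

static_assert(!converts, "X::Y* does not convert implicitly to ::Y*");
```

An implementation that chose to "guess (c)" would in effect be 
supplying that pointer conversion itself, after issuing the 
diagnostic.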

Backward Compatibility

The above mangling scheme might not preserve link-compatibility with 
existing user code that mangles namespaced extern "C" types into 
non-C function names.  What can be done?  A conforming extension for 
the same purpose might instead apply the semantics described above to 
types defined in an 'extern "C-global"' block.  If compilers which 
implement the semantics described above for 'extern "C"' also provided 
the same semantics for 'extern "C-global"', some source-code portability 
would be possible.

Conclusion

The problems of link-compatibility with C libraries do not appear to 
require compiler knowledge of the specific set of C names involved.  
Instead, a minor (and conforming) change to existing practice might 
leave room for a more general and portable solution.  Some such 
solution will be needed if C++ is to absorb and encapsulate libraries 
defined by other standards bodies.


