This is the mail archive of the
gcc@gcc.gnu.org
mailing list for the GCC project.
Re: [C++] GCC tree linkage types
- From: Chris Lattner <sabre at nondot dot org>
- To: Ian Lance Taylor <ian at wasabisystems dot com>
- Cc: Matt Austern <austern at apple dot com>, <gcc at gcc dot gnu dot org>, Gabriel Dos Reis <gdr at integrable-solutions dot net>, Richard Henderson <rth at redhat dot com>
- Date: Fri, 7 Nov 2003 10:01:11 -0600 (CST)
- Subject: Re: [C++] GCC tree linkage types
On 7 Nov 2003, Ian Lance Taylor wrote:
> Chris Lattner <sabre@nondot.org> writes:
>
> > difference in semantics. You are confusing what linkers _happen to
> > currently implement_ with the need of the source languages. I'm more
> > interested in what it takes to implement language requirements
> > efficiently.
>
> I honestly don't think I am confusing anything. We are talking about
> several different kinds of symbols with different link-time semantics.
Ok, I'm sorry, I should have clarified...
> Common symbols and weak symbols have different semantics; I just
> described common symbol semantics in my last message, and they are
> clearly not the same as weak symbol semantics.
From your description, it sounds like common symbols and weak symbols have
the same behavior, except that common symbols expand as necessary. Do
common symbols merge their initializers or something else that I'm
missing? If not, how are common symbols different from weak symbols where
the linker prefers to keep the largest of the symbols when it links?
> It's not a matter of what linkers happen to currently implement; these
> different types of symbols have well-defined semantics, and all linkers
> can and do implement them the same way (modulo a long-standing and
> relatively unimportant disagreement about the handling of weak symbols
> in ELF). You may be saying that the different link-time semantics don't
> matter; in the real world, I don't think that is correct.
Personally, I'm not working under the constraints of using ELF or any
other existing linker technology. We already have our own linker, and we
already support the linkage types as described here (I know the names are
horribly confusing, for which I sincerely apologize!):
http://llvm.cs.uiuc.edu/docs/LangRef.html#modulestructure
These linkage types work great for us; I just want to have the compiler
generate as many of our 'linkonce' symbols as possible, in preference to
'weak'.
> From an idealized language perspective, there is really only one type
> of defined symbol--an ordinary strong definition. For efficient
> implementation of C++, we add provisional symbols (which you are
> calling linkonce), and linkonce symbols (which you are calling weak).
Ok, so the root of what I'm getting at is that some provisional/linkonce
symbols can be deleted if they are not used (within a single translation
unit, WITHOUT whole-program information), and some cannot. This is a
property of the source language, not the objects in question. I'd like
there to be a clear distinction between the two, because it allows for
significant optimization on my end.
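To make the distinction concrete, here is a minimal C++ sketch (it is not
from the original thread, and all function names are hypothetical). It
assumes only standard C++ rules: an inline function or an implicitly
instantiated template must be defined in every translation unit that uses
it, so an unused copy can be dropped locally, while an explicit
instantiation may be relied upon by other translation units and so has to
be emitted even if it is unused here.

```cpp
#include <cassert>

// All names below are hypothetical, chosen only for this sketch.

// Implicitly instantiated template: each translation unit that uses
// twice<int> instantiates its own copy, so a copy that ends up unused in
// this unit can be dropped without looking at any other unit.
template <typename T>
T twice(T x) { return x + x; }

// Inline function: the language requires a definition in every translation
// unit that uses it, so an unused copy here is likewise safe to delete.
inline int add_one(int x) { return x + 1; }

// Explicit instantiation definition: another translation unit may declare
// 'extern template long twice<long>(long);' and rely on this copy, so it
// must be emitted even though nothing in this file calls it.
template long twice<long>(long);

// Hypothetical caller that makes twice<int> and add_one "used" in this unit.
int use_both(int x) { return twice(x) + add_one(x); }
```

In both cases the linker may see several mergeable copies of a symbol; the
difference is only whether the compiler is allowed to discard an unused
copy on its own.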
> For correct implementation of FORTRAN, we must also add common
> symbols; common symbols are also used in traditional C environments.
> (Again, common symbols are not the same as weak symbols; if gcc
> emitted common symbols the same way that it emits weak symbols, it
> would produce incorrect results for FORTRAN.)
Again, I must be missing the distinction. Please help! :)
> Once we leave the idealized language perspective, we must consider the
> needs of library maintainers. In gcc these have led to language
> extensions in the form of function and variable attributes. Some
> attributes which have link-time semantics are section, constructor,
> are implemented using special section semantics; weak and alias are
> implemented (at least in ELF) using two forms of the weak symbol type
> (which is not quite the same as what you are calling weak). You can't
> just lump all these things together, or you will get the wrong result.
I understand that from the GCC perspective, things get more complex. In
LLVM, we implement a subset of these extensions (including the constructor
and destructor ones), but currently don't support them all (such as the
section attribute). Eventually we will, but none of these attributes have
anything to do with linkage types, even though they impact the linker
(the current implementation might IMPLEMENT these with special linkage
types, but that is not required).
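As a small illustration of that independence, here is a sketch using the
constructor attribute (a GCC extension that Clang also accepts; the names
are hypothetical and this is not code from GCC or LLVM). The same attribute
is attached once to a function with internal linkage and once to a function
with external linkage, and both are expected to run before main.

```cpp
#include <cassert>

// Hypothetical counter; it is constant-initialized to zero, so it is ready
// before any constructor function runs.
static int startup_calls = 0;

// Internal linkage ('static'): invisible to other translation units, yet
// still run at program startup because of the attribute.
__attribute__((constructor)) static void init_internal() { ++startup_calls; }

// External linkage: same attribute, same startup behaviour.
__attribute__((constructor)) void init_external() { ++startup_calls; }

// Reports how many constructor functions have run so far.
int startup_call_count() { return startup_calls; }
```

The attribute changes when the function is called, not how its symbol is
resolved against other translation units.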
One of the things that I dislike about the GCC/GNU ld approach is that
distinct ideas, such as linkage types and executable sections, are often
conflated, though they are completely orthogonal. GCC/ld happens to
implement a variety of linkage optimizations using special named sections
(such as the gnu.linkonce family), but this is just an implementation
technique, not a necessary approach. I'm trying to filter out the minimal
set of information needed to represent the source program, while letting
a suitably capable optimizer do good things to the program.
Thanks for your boundless patience. :)
-Chris
--
http://llvm.cs.uiuc.edu/
http://www.nondot.org/~sabre/Projects/