This is the mail archive of the
gcc@gcc.gnu.org
mailing list for the GCC project.
Re: [C++] GCC tree linkage types
- From: Ian Lance Taylor <ian at wasabisystems dot com>
- To: Chris Lattner <sabre at nondot dot org>
- Cc: Matt Austern <austern at apple dot com>, <gcc at gcc dot gnu dot org>, Gabriel Dos Reis <gdr at integrable-solutions dot net>, Richard Henderson <rth at redhat dot com>
- Date: 07 Nov 2003 12:34:57 -0500
- Subject: Re: [C++] GCC tree linkage types
- References: <Pine.LNX.4.44.0311070949140.18499-100000@nondot.org>
Chris Lattner <sabre@nondot.org> writes:
> > Common symbols and weak symbols have different semantics; I just
> > described common symbol semantics in my last message, and they are
> > clearly not the same as weak symbol semantics.
>
> From your description, it sounds like common symbols and weak symbols have
> the same behavior, except that common symbols expand as necessary. Do
> common symbols merge their initializers or something else that I'm
> missing? If not, how are common symbols different from weak symbols where
> the linker prefers to keep the largest of the symbols when it links?
I should say that when I wrote the above, I was thinking of the usual
definition of weak symbols, which is based on the ELF standard. I
think that you are using the term weak in a different way, meaning
something more like what I would call linkonce (which I think is not
what you would call linkonce). So I'll try to answer the question of
whether common symbols are different from linkonce symbols.
One simple difference between common symbols and linkonce symbols is
that common symbols have only a size, while linkonce symbols have an
address. But that is an implementation detail--linkonce symbols can
in principle have sizes, and if the initializer value is zero the
address may not be important.
So the question then is: is there a difference between common symbols
and linkonce symbols initialized to zero, provided we always choose
the largest linkonce symbol when doing a link?
First I'll introduce another wrinkle of common symbols, which is that
it is OK for the same symbol name to appear as both a common symbol
and as a defined symbol. In such a case the common symbol is treated
as an undefined reference to the defined symbol. This is used in
FORTRAN to provide values for a common block.
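
The two common-symbol rules mentioned so far (the symbol grows to the largest requested size, and a real definition takes over from any commons of the same name) can be sketched as a small model. This is only an illustration, not how any particular linker is written; the `Sym` record, its `kind` labels, and `resolve_common` are hypothetical names invented for the sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sym:
    name: str
    kind: str    # "common" or "defined" (labels invented for this sketch)
    size: int
    origin: str  # object file that contributed the symbol


def resolve_common(symbols):
    """Pick the surviving symbol among same-named commons and definitions."""
    defined = [s for s in symbols if s.kind == "defined"]
    if len(defined) > 1:
        raise ValueError("multiple definitions of " + defined[0].name)
    if defined:
        # Every common now behaves like an undefined reference to this symbol.
        return defined[0]
    # Only commons: the symbol expands to the largest size requested.
    return max(symbols, key=lambda s: s.size)


a = Sym("buf", "common", 4, "a.o")
b = Sym("buf", "common", 16, "b.o")
d = Sym("buf", "defined", 8, "c.o")
print(resolve_common([a, b]).origin)     # the 16-byte common from b.o
print(resolve_common([a, b, d]).origin)  # the definition from c.o
```
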
So now the question is: is there a difference between common
symbols and linkonce symbols initialized to zero, provided we always
choose the largest linkonce symbol when doing a link, and provided we
permit a normal definition to override and replace any linkonce
symbol?
I think the answer to the question may be that there is no
difference. There would certainly be some trickiness in
implementation, particularly when linking against a dynamic library.
But in principle that should be solvable.
But the uses of common symbols and linkonce symbols are quite
different. In practice, a linkonce symbol always has a non-zero
initializer. By definition, a common symbol always has a zero
initializer. The linker normally simply uses the first definition of
a linkonce symbol, and discards subsequent ones; is there a benefit to
choosing the largest one?
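
To make the two selection policies concrete, here is a minimal sketch that compares "keep the first copy" with "keep the largest copy". The `(origin, size)` tuples and both function names are assumptions made for illustration only.

```python
def pick_first(candidates):
    """Usual linkonce behaviour as described above: keep the first copy seen."""
    return candidates[0]


def pick_largest(candidates):
    """Hypothetical extension: keep the largest copy (first one wins on ties)."""
    return max(candidates, key=lambda c: c[1])


# Each candidate is (object file, size in bytes).
same_size = [("a.o", 32), ("b.o", 32)]
mixed_size = [("a.o", 8), ("b.o", 32)]

print(pick_first(same_size), pick_largest(same_size))
print(pick_first(mixed_size), pick_largest(mixed_size))
```

When every copy has the same size the two policies agree, so the choice only matters if the copies actually differ.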
Could common symbols be implemented as linkonce symbols? It would be
less efficient in current practice, because common symbols take up no
space in an object file whereas linkonce symbols do take up space.
But other than that, I think common symbols could be implemented as
linkonce symbols, if we extended the semantics of linkonce symbols as
described above to always choose the largest one, and to permit a
strong symbol to override the linkonce symbol.
But common symbols are still clearly different from weak symbols,
using the ELF definition of weak symbols.
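
For contrast, ELF weak binding can be modelled roughly as follows. This sketch assumes the usual reading of the ELF specification: a global definition silently overrides weak ones, two global definitions are an error, and an unresolved weak reference gets the value zero. Keeping the first of several weak definitions is an assumption about typical linker behaviour. The function name and tuple layout are hypothetical. Note that sizes play no part in the decision, unlike with common symbols.

```python
def resolve_weak(definitions, reference_is_weak=False):
    """definitions: list of (binding, origin), binding is "global" or "weak"."""
    strong = [d for d in definitions if d[0] == "global"]
    weak = [d for d in definitions if d[0] == "weak"]
    if len(strong) > 1:
        raise ValueError("multiple definition")
    if strong:
        return strong[0]   # a global definition overrides any weak ones
    if weak:
        return weak[0]     # assumption: the first weak definition is kept
    if reference_is_weak:
        return None        # stands for "resolves to address zero"
    raise LookupError("undefined reference")


print(resolve_weak([("weak", "a.o"), ("global", "b.o")]))
print(resolve_weak([], reference_is_weak=True))
```
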
> Personally, I'm not working under the constraints of using ELF or any
> other existing linker technology. We already have our own linker, and we
> already support the linkage types as described here (I know the names are
> horribly confusing things, for which I sincerely apologize!):
> http://llvm.cs.uiuc.edu/docs/LangRef.html#modulestructure
I don't see much consideration of dynamic linking there, though. For
example, what about symbol versioning? Perhaps it doesn't matter for
you.
> These linkage types work great for us, I just want to have the compiler
> generate as many of our 'linkonce' symbols as possible, in preference to
> 'weak'.
I'm not clear on the semantic difference between your `linkonce'
symbols and your `weak' symbols. To me the difference seems to be
whether the linker does garbage collection or not.
Or, wait, I see, the difference applies at the compiler level, not the
linker level. Your `linkonce' symbols may be discarded from the
object file if they are not referenced from within the object file.
That is a distinction which makes no difference to the linker, of
course. It would never see an instance of your `linkonce' symbol
which was not referenced.
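
The compiler-level distinction described here can be sketched as a filter applied when a module is written out. This is a rough model of the reading given above, not of the actual LLVM implementation; `emit_object` and the linkage labels are hypothetical.

```python
def emit_object(symbols, referenced):
    """Return the symbols that reach the object file.

    symbols: dict mapping symbol name to linkage ("linkonce" or "weak").
    referenced: set of names used from within the same module.
    """
    return {
        name: linkage
        for name, linkage in symbols.items()
        # An unreferenced linkonce symbol may be dropped; weak is always kept.
        if linkage != "linkonce" or name in referenced
    }


module = {"f": "linkonce", "g": "linkonce", "h": "weak"}
print(emit_object(module, referenced={"f"}))
```

In this model the linker never sees `g`, which is why the distinction is invisible at link time.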
> One of the things that I dislike about the GCC/GNU ld approach is that
> distinct ideas, such as linkage types and executable sections, are often
> confused, though they are completely orthogonal ideas. GCC/ld happens to
> implement a variety of linkage optimizations using special named sections
> (such as the gnu.linkonce family), but this is just an implementation
> technique, not a necessary approach. I'm trying to filter out the minimal
> set of information needed to represent the source program, while letting
> a suitably capable optimizer do good things to the program.
Yes, using a specially named section for linkonce symbols is just a
trick used for ELF, because ELF didn't have any way to define the
symbol semantics prior to the recent introduction of SHT_GROUP. You are
of course correct that linkage type and section placement are
completely different ideas. (Note that there is a subtle difference
which arises when using specially named sections for linkonce
sections, which is that the linkonce character is driven from the
section name, not the symbol name. It is possible to generate object
files such that this makes no difference, but it is also possible to
generate object files such that it does make a difference.)
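
The parenthetical point can be illustrated with a small model in which the linker keeps only the first section of each `.gnu.linkonce.*` name and never looks at the symbols inside. The data layout and `link_by_section` are assumptions made for this sketch.

```python
def link_by_section(objects):
    """objects: list of (object file, [(section name, [symbol names])]).

    Returns the (object file, symbol) pairs that survive the link.
    """
    seen = set()
    kept = []
    for obj, sections in objects:
        for section, symbols in sections:
            if section.startswith(".gnu.linkonce."):
                if section in seen:
                    continue  # whole section discarded, whatever it contains
                seen.add(section)
            kept.extend((obj, sym) for sym in symbols)
    return kept


# Same section name, different contents: "bar" is lost with b.o's section.
lost = link_by_section([
    ("a.o", [(".gnu.linkonce.t.foo", ["foo"])]),
    ("b.o", [(".gnu.linkonce.t.foo", ["foo", "bar"])]),
])

# Same symbol, different section names: both copies of "foo" survive.
duplicated = link_by_section([
    ("a.o", [(".gnu.linkonce.t.foo", ["foo"])]),
    ("b.o", [(".gnu.linkonce.t.foo2", ["foo"])]),
])
print(lost, duplicated)
```

When each linkonce section holds exactly one symbol and is named after it, neither case arises and the section name and symbol name give the same result.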
Ian