This is the mail archive of the
gcc@gcc.gnu.org
mailing list for the GCC project.
Possible problem with COMDAT symbol resolution
- From: Richard Smith <richard at ex-parrot dot com>
- To: GCC Mailing List <gcc at gcc dot gnu dot org>
- Date: Wed, 27 Oct 2004 19:05:16 +0100 (BST)
- Subject: Possible problem with COMDAT symbol resolution
I have been seeing some odd behaviour with the way exception
handling interacts with dynamically loaded libraries (i.e.
DSOs opened using dlopen). I have duplicated it on a number
of platforms using, variously, gcc 3.4.0, gcc 3.3.3, gcc
3.1, binutils 2.15, glibc 2.3.2 and glibc 2.3.4 (a Gentoo
pre-release). It happens on both the AMD-64 and x86
architectures.
I have two shared libraries -- libbase.so and libmiddle.so
with libmiddle.so linking against libbase.so. My test
program dlopens first libbase.so and then libmiddle.so
before calling a function in libmiddle.so (extracted with
dlsym). This calls down to libbase.so which throws an
exception which fails to get caught by libmiddle.so. (See
attached code.)
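A minimal sketch of a driver along these lines, for
illustration only. The function name run_test, the entry_fn
type and the no-argument signature of the entry point are
assumptions, not taken from the attached test case; only
dlopen, dlsym, dlerror and dlclose are real APIs.

```cpp
#include <dlfcn.h>
#include <cstdio>

// Hypothetical signature for the function exported by libmiddle.so.
typedef void (*entry_fn)();

// Opens base_path, then middle_path, then calls entry_name from the
// second library. Returns 0 on success, or the number of the step
// that failed (1 = first dlopen, 2 = second dlopen, 3 = dlsym).
int run_test(const char* base_path, const char* middle_path,
             const char* entry_name) {
    void* base = dlopen(base_path, RTLD_NOW);
    if (!base) {
        std::fprintf(stderr, "dlopen: %s\n", dlerror());
        return 1;
    }
    void* middle = dlopen(middle_path, RTLD_NOW);
    if (!middle) {
        std::fprintf(stderr, "dlopen: %s\n", dlerror());
        dlclose(base);
        return 2;
    }
    void* sym = dlsym(middle, entry_name);
    if (!sym) {
        std::fprintf(stderr, "dlsym: %s\n", dlerror());
        dlclose(middle);
        dlclose(base);
        return 3;
    }
    // Converting void* to a function pointer is the usual dlsym idiom.
    entry_fn entry = reinterpret_cast<entry_fn>(sym);
    entry();
    dlclose(middle);
    dlclose(base);
    return 0;
}
```

It would be called as, say, run_test("./libbase.so",
"./libmiddle.so", "middle_entry"), where "middle_entry" is
a made-up symbol name.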
The reason that this fails to get caught is that the
exception's type string symbol "_ZTS9exception" gets
resolved differently in the two DSOs. This means that
type_info::operator== fails to recognise the type_infos in
the two DSOs as the same type.
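To illustrate why two copies of the type string matter,
here is a small sketch. It assumes the comparison is done
on the address of the type string rather than on its
contents, which is what the behaviour described above
suggests. The two arrays are hypothetical stand-ins for the
per-DSO copies of "_ZTS9exception"; the helper names are
invented for the example.

```cpp
#include <cstring>

// Stand-ins for the two copies of the type string, one per DSO.
// They are distinct objects, so their addresses differ even though
// their contents are identical.
static const char name_in_base[] = "9exception";
static const char name_in_middle[] = "9exception";

// Address comparison: only succeeds if both sides were bound to the
// same copy of the string (assumed to model the failing comparison).
bool same_type_by_address(const char* a, const char* b) {
    return a == b;
}

// Content comparison: succeeds whenever the strings are equal,
// regardless of which copy each side was bound to.
bool same_type_by_name(const char* a, const char* b) {
    return std::strcmp(a, b) == 0;
}
```

With these definitions, same_type_by_address(name_in_base,
name_in_middle) is false while same_type_by_name on the same
arguments is true.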
In this particular example, this causes a real problem
because, when the runtime tries to catch the exception,
__class_type_info::__do_upcast fails to realise that the
type of the catch block matches the type thrown, and the
exception fails to be caught in the correct place.
The different bindings for the symbol "_ZTS9exception" seem
to come about from the way symbol resolution happens. Both
libbase.so and libmiddle.so have a copy of this symbol.
Clearly when libbase.so is loaded, its symbol must be
resolved to an address in that library as it is the only copy
visible. However, when libmiddle.so is loaded, its symbol
is resolved to its own version of the type string. This is
presumably justified by the section in the ELF specification
which says:
| When resolving symbolic references, the dynamic linker
| examines the symbol tables with a breadth-first search.
| That is, it first looks at the symbol table of the
| executable program itself, then at the symbol tables of
| the DT_NEEDED entries (in order), and then at second level
| DT_NEEDED entries, and so on.
I understand *why* it says this -- sometimes it is necessary
to override symbols in lower-level libraries. (E.g. the way
in which libpthread.so overrides certain weak symbols such
as lseek and open from libc.so.)
But is this behaviour correct for symbols with COMDAT
linkage? It seems that these symbols should be bound via a
depth-first search of the symbol tables as the naive
expectation is that the lowest-level version of the symbol
would be used.
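As a toy model of the breadth-first lookup quoted from the
ELF specification, the sketch below searches a dependency
graph starting from the object being loaded. It is a
deliberate simplification: it ignores the executable and
the global scope, and the Object type and function names
are invented for illustration rather than taken from any
real dynamic linker.

```cpp
#include <map>
#include <queue>
#include <set>
#include <string>
#include <vector>

// Toy model of a loaded object: the symbols it defines and its
// DT_NEEDED list, in order.
struct Object {
    std::set<std::string> defines;
    std::vector<std::string> needed;
};

using ObjectMap = std::map<std::string, Object>;

// Breadth-first lookup starting at `root`. Returns the name of the
// first object found that defines `symbol`, or "" if none does.
std::string resolve_breadth_first(const ObjectMap& objects,
                                  const std::string& root,
                                  const std::string& symbol) {
    std::queue<std::string> pending;
    std::set<std::string> seen;
    pending.push(root);
    seen.insert(root);
    while (!pending.empty()) {
        const std::string name = pending.front();
        pending.pop();
        const auto it = objects.find(name);
        if (it == objects.end()) continue;  // unknown object: skip it
        if (it->second.defines.count(symbol)) return name;
        for (const std::string& dep : it->second.needed) {
            if (seen.insert(dep).second) pending.push(dep);
        }
    }
    return "";
}

// The layout from this message: both libraries carry the type
// string, and libmiddle.so depends on libbase.so.
ObjectMap example_objects() {
    ObjectMap objects;
    objects["libbase.so"] = Object{{"_ZTS9exception"}, {}};
    objects["libmiddle.so"] = Object{{"_ZTS9exception"}, {"libbase.so"}};
    return objects;
}
```

In this model a lookup rooted at "libbase.so" finds the copy
in libbase.so, while a lookup rooted at "libmiddle.so" stops
at libmiddle.so's own copy before libbase.so is examined.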
I'm not sure whether this really is a bug or whether I'm
expecting too much from the compiler / linker. Has anyone
encountered this before, or have any suggested resolutions?
(I'm aware that this particular case can be got around by
passing RTLD_GLOBAL when the libraries are dlopened, however
this does not solve the problem in general.)
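For reference, the RTLD_GLOBAL workaround amounts to a
change of flags in the dlopen call. The helper name
open_global is made up for this sketch; RTLD_NOW and
RTLD_GLOBAL are the standard dlopen flags.

```cpp
#include <dlfcn.h>

// Opens a library so that its symbols are made available for
// resolving references in libraries loaded afterwards.
// Returns NULL on failure, as dlopen does.
void* open_global(const char* path) {
    return dlopen(path, RTLD_NOW | RTLD_GLOBAL);
}
```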
Cheers,
Richard Smith
Attachment:
weaksyms.tar.gz
Description: Test case