Created attachment 32518
Example showing failure to initialize a dynamic library after multiple calls to dlopen().

If a dynamic library is loaded multiple times via dlopen(), subsequent loads do not correctly initialize static variables under the following conditions:
1) A class/struct with a constructor
2) with an inlined function
3) containing a static variable.

Please run the attached example.

# tar xf gcc_static_issue.tgz
# cd gcc_static_issue
# make
# ./test_static

Expected behavior (as on RHEL5, g++ 4.1.2):
Type 'q' to exit or enter to reload/run the DLL
count:1
count:1
count:1
count:1
count:1
q
#

Actual behavior (as on RHEL6 and RHEL7beta, g++ 4.4.7 and 4.8.2, respectively):
Type 'q' to exit or enter to reload/run the DLL
count:1
count:2
count:3
count:4
count:5
q
#
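The attachment is not reproduced in this thread. As a rough sketch of what a DSO meeting the three conditions might look like: the names make_static_stay and smp are inferred from the mangled guard-variable symbol quoted elsewhere in this thread, routine from the LD_DEBUG output, and the Counter type and its counting logic are purely hypothetical.

```cpp
// static.cc: hypothetical reconstruction, not the attached file.
// Assumed build line: g++ -shared -fPIC static.cc -o static.so
#include <cstdio>

struct Counter {
    int value;
    Counter() : value(0) {}            // (1) class/struct with a constructor
};

inline Counter& make_static_stay() {   // (2) inline function
    static Counter smp;                // (3) function-local static variable
    return smp;
}

// Entry point the host program would look up with dlsym().
extern "C" int routine() {
    int n = ++make_static_stay().value;
    std::printf("count:%d\n", n);
    return n;
}
```

Within a single load the counter is expected to keep growing; the report is about whether a dlclose()/dlopen() cycle resets it to 1.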
Works up to GCC 4.4, fails since GCC 4.5. It's not clear what makes the difference here.

Btw, with LD_DEBUG=all I see

     10389:     opening file=./static.so [0]; direct_opencount=2
     10389:
     10389:     symbol=routine; lookup in file=./static.so [0]
     10389:     binding file ./static.so [0] to ./static.so [0]: normal symbol `routine'
count:2

     10389:     opening file=./static.so [0]; direct_opencount=3
     10389:
     10389:     symbol=routine; lookup in file=./static.so [0]
     10389:     binding file ./static.so [0] to ./static.so [0]: normal symbol `routine'
count:3

so the dlclose call does nothing. While in the working case:

     10438:
     10438:     file=./static.so [0]; dynamically loaded by ./test_static [0]
     10438:     file=./static.so [0]; generating link map
     10438:       dynamic: 0x00007ffff6ffede0 base: 0x00007ffff6dfd000 size: 0x00000000002020a8
     10438:         entry: 0x00007ffff6dfdb10 phdr: 0x00007ffff6dfd040 phnum: 7
....
     10438:     calling init: ./static.so
     10438:
     10438:     opening file=./static.so [0]; direct_opencount=1
     10438:
     10438:     symbol=routine; lookup in file=./static.so [0]
     10438:     binding file ./static.so [0] to ./static.so [0]: normal symbol `routine'
count:1
     10438:
     10438:     calling fini: ./static.so [0]
     10438:
     10438:
     10438:     file=./static.so [0]; destroying link map
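A minimal sketch of a host-side load/run/close cycle like the one traced above, plus a check of whether the object is still mapped after dlclose(). The helper names and the int() signature assumed for the looked-up symbol are illustrative only; RTLD_NOLOAD is a glibc extension, and older glibc versions need -ldl at link time.

```cpp
#include <dlfcn.h>
#include <cstdio>

// Loads `path`, calls the exported function `symbol` (assumed here to have
// the signature int()), then closes the handle. Returns -1 on any failure.
int load_run_close(const char* path, const char* symbol) {
    void* handle = dlopen(path, RTLD_NOW | RTLD_LOCAL);
    if (!handle) {
        std::fprintf(stderr, "dlopen: %s\n", dlerror());
        return -1;
    }
    typedef int (*routine_fn)();
    routine_fn fn = reinterpret_cast<routine_fn>(dlsym(handle, symbol));
    int result = fn ? fn() : -1;
    dlclose(handle);
    return result;
}

// Returns true if `path` is still mapped in the process. RTLD_NOLOAD asks
// the loader for a handle only if the object is already loaded.
bool still_loaded(const char* path) {
    void* handle = dlopen(path, RTLD_NOW | RTLD_NOLOAD);
    if (!handle) return false;
    dlclose(handle);  // drop the extra reference taken by the probe
    return true;
}
```

Calling still_loaded("./static.so") right after load_run_close("./static.so", "routine") would be one way to tell the two traces apart without LD_DEBUG.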
We hit

  void
  _dl_close (void *_map)
  {
    struct link_map *map = _map;

    /* First see whether we can remove the object at all. */
    if (__builtin_expect (map->l_flags_1 & DF_1_NODELETE, 0))
      {
        assert (map->l_init_called);
        /* Nope. Do nothing. */
        return;

the DF_1_NODELETE flag is set already after the first dlopen call which sets it via do_lookup_x for the STB_GNU_UNIQUE symbol _ZGVZ16make_static_stayvE3smp

    if (map->l_type == lt_loaded)
      /* Make sure we don't unload this object by
         setting the appropriate flag. */
      ((struct link_map *) map)->l_flags_1 |= DF_1_NODELETE;

so this either points to a "bad" design on the guard code for initializing 'smp' or to a weakness in the dynamic loader which doesn't handle unloading of objects which define any(?) STB_GNU_UNIQUE symbol.

Note the above is guarded with

    if ((type_class & ELF_RTYPE_CLASS_COPY) != 0)
      enter (entries, size, new_hash, strtab + sym->st_name, ref,
             undef_map);
    else
      {
        enter (entries, size, new_hash, strtab + sym->st_name, sym,
               map);

        if (map->l_type == lt_loaded)
          /* Make sure we don't unload this object by
             setting the appropriate flag. */
          ((struct link_map *) map)->l_flags_1 |= DF_1_NODELETE;
      }

thus if this were referenced via a copy relocation it would work.

Jason?
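To make the interaction easier to follow, here is a toy model of the behavior described above. It is not glibc code: the struct, its fields and the toy_* functions are invented for illustration and only echo the names in the quoted snippets.

```cpp
#include <string>

// Toy stand-ins for glibc concepts (hypothetical, heavily simplified).
enum ObjectType { kExecutable, kLoaded };  // kLoaded ~ lt_loaded

struct ToyLinkMap {
    std::string name;
    ObjectType type;
    bool nodelete;   // ~ DF_1_NODELETE in l_flags_1
    bool mapped;     // is the object currently in memory?
    int opencount;   // ~ direct_opencount in the LD_DEBUG trace
};

// Models the lookup entering a STB_GNU_UNIQUE definition into the global table.
void bind_unique_symbol(ToyLinkMap& definer, bool via_copy_reloc) {
    if (via_copy_reloc) return;          // entry is recorded for the referencing object
    if (definer.type == kLoaded)
        definer.nodelete = true;         // pin the defining object
}

void toy_dlopen(ToyLinkMap& m, bool defines_unique_symbol) {
    if (!m.mapped) { m.mapped = true; m.opencount = 0; }
    ++m.opencount;
    if (defines_unique_symbol) bind_unique_symbol(m, false);
}

void toy_dlclose(ToyLinkMap& m) {
    if (m.nodelete) return;              // "Nope. Do nothing."
    if (--m.opencount == 0) m.mapped = false;
}
```

In this model an object that defines a unique symbol stays mapped and its open count climbs 1, 2, 3 across open/close cycles, matching the direct_opencount values in the failing trace, while an object without such a symbol is unmapped on the first close.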
Right, it was a deliberate choice in ld.so to suppress dlclose of DSOs that use STB_GNU_UNIQUE, which causes problems with some code that relies on reinitialization with dlclose/dlopen. As Ian says in http://gcc.gnu.org/ml/gcc-help/2011-05/msg00450.html this seems excessive; you only need to avoid unloading files that are satisfying symbol references in another DSO. But I guess checking for that was deemed too slow. If you're using the gold linker, you can link with --no-gnu-unique to avoid the use of STB_GNU_UNIQUE. I suppose I should add a compiler flag to turn it off, too...
Thus this is a bug in the dynamic loader as well. Please file a bug against glibc on sourceware.org/bugzilla.
And actually it might be considered a non-bug in GCC but a consequence of implementing a requirement. Jason posted a patch that implements a workaround for the dynamic linker issue.

Closing as moved - please open a bug report against glibc.
I created glibc bug #16805 (https://sourceware.org/bugzilla/show_bug.cgi?id=16805).
Author: jason
Date: Mon Apr 7 13:27:39 2014
New Revision: 209186

URL: http://gcc.gnu.org/viewcvs?rev=209186&root=gcc&view=rev
Log:
	PR c++/60731
	* common.opt (-fno-gnu-unique): Add.
	* config/elfos.h (USE_GNU_UNIQUE_OBJECT): Check it.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/common.opt
    trunk/gcc/config/elfos.h
    trunk/gcc/doc/invoke.texi
Author: jason
Date: Mon Apr 7 13:27:45 2014
New Revision: 209187

URL: http://gcc.gnu.org/viewcvs?rev=209187&root=gcc&view=rev
Log:
	PR c++/60731
	* lib/gcc-dg.exp (dg-build-dso): New.
	(gcc-dg-test-1): Handle dg-do-what "dso".
	* lib/target-supports.exp (add_options_for_dlopen): New.
	(check_effective_target_dlopen): Use it.
	* g++.dg/dso/dlclose1.C: New.
	* g++.dg/dso/dlclose1-dso.cc: New.

Added:
    trunk/gcc/testsuite/g++.dg/dso/
    trunk/gcc/testsuite/g++.dg/dso/dlclose1-dso.cc
    trunk/gcc/testsuite/g++.dg/dso/dlclose1.C
Modified:
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/testsuite/lib/gcc-dg.exp
    trunk/gcc/testsuite/lib/target-supports.exp
I started looking into what the purpose of STB_GNU_UNIQUE is, so I have several questions.

First, does the suggestion below really work?
http://gcc.gnu.org/ml/gcc-help/2011-05/msg00450.html

Say you have foo.so with a unique symbol foo and a function

bar *getfoo() { return (void *) &foo; }

which gets loaded and unloaded like

void *h = dlopen("foo.so",RTLD_NOW);
bar *p1 = dlsym(h,"getfoo")();
dlclose(h);
foo->baz();
h = dlopen("foo.so",RTLD_NOW);
bar *p2 = dlsym(h,"getfoo")();
dlclose(h);

Are p1 and p2 supposed to point to the same object?
Also, is foo->baz(); legal or not? If it is, we cannot call destructors.

One possible fix is to add a zombie state in which we call destructors but do not free the memory, so that the object can be reinitialized at the same address, but I need to know that calling the destructor is always the intended behaviour.
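The snippet in this comment is pseudocode (the void * returned by dlsym cannot be called directly). A compilable sketch of the same experiment might look like the following; the helper names are hypothetical, and the getter is assumed to be exported with C linkage and the signature void *(). Addresses are compared as integers so nothing is dereferenced after dlclose().

```cpp
#include <dlfcn.h>
#include <cstdint>

// Opens `path`, calls the getter exported as `getter_name` (assumed to be an
// extern "C" function of type void*()), records the returned address as an
// integer, and closes the library again. Returns 0 on any failure.
std::uintptr_t address_from_fresh_load(const char* path, const char* getter_name) {
    void* handle = dlopen(path, RTLD_NOW);
    if (!handle) return 0;
    typedef void* (*getter_fn)();
    getter_fn getter = reinterpret_cast<getter_fn>(dlsym(handle, getter_name));
    std::uintptr_t address =
        getter ? reinterpret_cast<std::uintptr_t>(getter()) : 0;
    dlclose(handle);
    return address;
}

// Answers the "are p1 and p2 the same?" question for one particular run.
bool same_address_across_reloads(const char* path, const char* getter_name) {
    std::uintptr_t p1 = address_from_fresh_load(path, getter_name);
    std::uintptr_t p2 = address_from_fresh_load(path, getter_name);
    return p1 != 0 && p1 == p2;
}
```

A true result only describes one run with one loader; it says nothing about whether a program may rely on the addresses matching.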
(In reply to Ondrej Bilka from comment #9)
> First does suggestion below really work?
> http://gcc.gnu.org/ml/gcc-help/2011-05/msg00450.html

I don't see why it wouldn't.

> void *h = dlopen("foo.so",RTLD_NOW);
> bar *p1 = dlsym(h,"getfoo")();
> dlclose(h);
> foo->baz();
> h = dlopen("foo.so",RTLD_NOW);
> bar *p2 = dlsym(h,"getfoo")();
> dlclose(h);
>
> Are p1 and p2 supposed to point to same object?
> Also is foo->baz(); legal or not? If its so we cannot call destructors.

I don't think a program can reasonably rely on either of these.

> There could be fix to add zombie state where we call destructors but not
> free memory so we can reinitialize object at same address but I need to know
> that calling destructor is always intended behaviour.

That sounds fine to me.
Can this please be reopened? It was determined in the glibc bugzilla that this is a gcc problem because of the incorrect setting of unique flag.
(In reply to Dave Johansen from comment #11)
> Can this please be reopened? It was determined in the glibc bugzilla that
> this is a gcc problem because of the incorrect setting of unique flag.

The setting is not incorrect, nor is it an optimization; it is necessary to fix the behavior of RTLD_LOCAL with multiple loaded objects depending on the same library, since the glibc developers rejected the other approach that I suggested (https://www.sourceware.org/ml/libc-alpha/2002-05/msg00222.html).

If you don't need this handling, in 4.9 you can use -fno-gnu-unique to disable it. I'll go ahead and backport that switch to 4.8 as well.
Author: jason
Date: Fri Jun 13 16:39:37 2014
New Revision: 211648

URL: https://gcc.gnu.org/viewcvs?rev=211648&root=gcc&view=rev
Log:
	PR c++/60731
	* common.opt (-fno-gnu-unique): Add.
	* config/elfos.h (USE_GNU_UNIQUE_OBJECT): Check it.

Modified:
    branches/gcc-4_8-branch/gcc/ChangeLog
    branches/gcc-4_8-branch/gcc/common.opt
    branches/gcc-4_8-branch/gcc/config/elfos.h
    branches/gcc-4_8-branch/gcc/doc/invoke.texi
Could you please point me to how I can reproduce the issue with "RTLD_LOCAL with multiple loaded objects depending on the same library"? I would like to see if I can reproduce that issue with clang++ and icpc.

Thanks,
Dave