Problem with static const objects and LTO
Jakub Jelinek
jakub@redhat.com
Thu Sep 17 19:03:42 GMT 2020
On Thu, Sep 17, 2020 at 12:18:40PM -0600, Jeff Law via Gcc-patches wrote:
> >> In an LTO world the TU isn't indivisible anymore. LTO will happily
> >> discard things which don't appear to be used. So parts of the TU may
> >> be in the main program, other parts may be in DSOs used by the main
> >> program. This can mean that objects are unexpectedly being passed
> >> across DSO boundaries and the details of those objects have, in effect,
> >> become part of the ABI. If I was to change an internal data structure,
> >> build the static library and main program we could get bad behavior
> >> because an instance of that data structure could be passed to a DSO
> >> because the main executable is "incomplete" and we end up calling copies
> >> of routines from the (not rebuilt) DSOs.
> > I think the situation is simpler - LTO can duplicate data objects for the
> > purpose of optimization within some constraints and there might be a simple
> > error in it deciding that duplicating the static const object into two LTRANS
> > units is OK. So - do you have a testcase?
>
> man-db trips over this. The executable links against a static version
> of gnulib as well as linking dynamically to DSOs which themselves were
> linked against a static gnulib. We're getting bits of regcomp.c in the
> main executable and other bits are in a DSO. We have two copies of
> utf8_sb_map (one in the main executable, another in a DSO) -- and the
> code within regcomp assumes there is only one copy of utf8_sb_map.
>
> Prior to LTO, when the main executable linked against the static gnulib
> and had to pull in anything from regcomp.c, it ended up pulling in *all*
> of regcomp.c into the main executable and those definitions overrode
> anything in the DSO and it "just worked", though it is a nightmare from
> a composability standpoint.
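
To make the failure mode above concrete, here is a minimal sketch of how a
second copy of a static const table can break code that assumes a single
copy. It is not the actual gnulib code: the table contents and every name in
it (map_copy_in_exe, map_copy_in_dso, dso_pick_map, exe_would_free) are
hypothetical, and it assumes the dependency takes the form of an address
comparison. The two arrays stand in for the copy in the main executable and
the copy in the DSO.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the two copies of one static const table:
   one copy ended up in the main executable, the other in a DSO. */
static const unsigned char map_copy_in_exe[4] = { 0xff, 0xff, 0xff, 0xff };
static const unsigned char map_copy_in_dso[4] = { 0xff, 0xff, 0xff, 0xff };

/* Routine as resolved from the DSO: it hands out a pointer to the
   DSO's own copy of the table. */
const unsigned char *dso_pick_map(void)
{
    return map_copy_in_dso;
}

/* Routine in the executable: it assumes any pointer that is not the
   address of *its* copy was heap-allocated and therefore must be freed. */
bool exe_would_free(const unsigned char *map)
{
    return map != map_copy_in_exe;
}
```

With a single copy the comparison protects the static table. With two
copies, a pointer obtained from the DSO side fails the comparison in the
executable, so the executable would try to free static data even though the
contents of both tables are identical.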
That looks to me like a problem on the linker or linker plugin side.
If a TU has two global entry points, then when linking a *.a library
containing that TU, either the linker chooses to link it in, in which case
we should ensure both symbols are exported, or it is not linked and nothing
is added. Both symbols are part of the library ABI (unless, of course, a
version script is used, the symbols are hidden, etc.).
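
As an illustration only, a TU of the kind described could look like the
sketch below. All names (tu_get_table, tu_is_own_table, tu_internal_helper)
are hypothetical, and the visibility attribute is the GCC extension. The two
global entry points share one TU-private table, so they only work correctly
if both come from the same copy of the TU.

```c
/* Hypothetical archive member with two global entry points. */
static const int table[3] = { 1, 2, 3 };

/* Entry point A: hands out the address of the TU-private table. */
const int *tu_get_table(void)
{
    return table;
}

/* Entry point B: relies on being given the address of that same table. */
int tu_is_own_table(const int *p)
{
    return p == table;
}

/* Hidden symbol (GCC extension): not part of the library ABI, so it
   does not need to be exported. */
__attribute__((visibility("hidden")))
int tu_internal_helper(void)
{
    return table[0] + table[1] + table[2];
}
```

If entry point A came from one copy of the TU and entry point B from
another, the address check in B would fail, which is why the member should
be linked in as a whole or not at all.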
Can you check whether it always behaves that way for shared libraries?

On the executable side, I guess it is a slightly different case. Unless
e.g. -Wl,-E / -Wl,--export-dynamic is used (in which case, again, everything
not hidden should be treated as exported, as in shared libraries), we might
consider symbols that are not really used as not needed. But I'd say that
even here we should follow what the linker would normally do: if it would
normally put a symbol into .dynsym (e.g. because some shared library
references that symbol and the binary is linked against it), we should
still ensure the symbol is exported.

Jakub