[PATCH] lto-plugin: add support for feature detection

Rui Ueyama rui314@gmail.com
Sun May 15 06:57:25 GMT 2022


On Fri, May 6, 2022 at 10:47 PM Alexander Monakov <amonakov@ispras.ru> wrote:
>
>
>
> On Thu, 5 May 2022, Martin Liška wrote:
>
> > On 5/5/22 12:52, Alexander Monakov wrote:
> > > Feels a bit weird to ask, but before entertaining such an API extension,
> > > can we step back and understand the v3 variant of get_symbols? It is not
> > > documented, and from what little I saw I did not get the "motivation" for
> > > its existence (what it is doing that couldn't be done with the v2 api).
> >
> > Please see here:
> > https://github.com/rui314/mold/issues/181#issuecomment-1037927757
>
> Thanks. I've also re-read [1] and [2] which provided some relevant ideas.
>
> [1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86490
> [2] https://sourceware.org/bugzilla/show_bug.cgi?id=23411
>
>
> OK, so the crux of the issue is that sometimes the linker needs to feed the
> compiler plugin with LTO .o files extracted from static archives. This is
> not really obvious, because normally .a archives have an index that enumerates
> symbols defined/used by its .o files, and even during LTO the linker can simply
> consult the index to find out which members to extract.  In theory, at least.
>
> The theory breaks in the following cases:
>
>  - ld.bfd and common symbols (I wonder if weak/comdat code is also affected?):
>  archive index does not indicate which definitions are common, so ld.bfd
>  extracts the member and feeds it to the plugin to find out;
>
>  - ld.gold and emulated archives via --start-lib a.o b.o ... --end-lib: here
>  there's no index to consult and ld.gold feeds each .o to the plugin.
>
> In those cases it may happen that the linker extracts an .o file that would
> not be extracted during non-LTO link, and if that happens, the linker needs to
> inform the plugin. This is not the same as marking each symbol from spuriously
> extracted .o file as PREEMPTED when the .o file has constructors (the plugin
> will assume the constructors are kept while the linker needs to discard them).
>
> So get_symbols_v3 allows the linker to discard an LTO .o file to solve this.
>
> In absence of get_symbols_v3 mold tries to ensure correctness by restarting
> itself while appending a list of .o files to be discarded to its command line.
>
> I wonder if mold can invoke plugin cleanup callback to solve this without
> restarting.

We can call the plugin cleanup callback from mold, but there are a few problems:

First, it is not clear what state the plugin cleanup callback resets the
plugin to. It may reset it to the initial state, in which case we need to
restart everything from the `onload` callback, or it may leave the
functions registered by the previous `onload` call in place. Since the
exact semantics are not documented, the LLVM gold plugin may behave
differently from the GCC plugin.

Second, if we reset a plugin's internal state, we need to register all
input files again by calling the `claim_file_hook` callback, which in turn
calls the `add_symbols` callback. But we don't need any symbol information
at this point, because mold already knows what is in the LTO object files:
it has already called `claim_file_hook` on the same set of files. So the
results of the `add_symbols` invocations would be ignored, which is a
waste of resources.

So, I prefer get_symbols_v3 over calling the plugin cleanup callback.
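
To make the cost difference concrete, here is a rough toy model of the two
approaches. It is only a sketch: `ToyPlugin`, `link_with_cleanup` and
`link_with_v3` are hypothetical names, not the real linker plugin API, and
it assumes that cleanup forgets every claimed file, which is exactly the
part that is undocumented.

```python
# Toy model only: the class and function names are hypothetical,
# not the real linker plugin API.

class ToyPlugin:
    def __init__(self):
        self.claimed = []           # files the plugin currently knows about
        self.add_symbols_calls = 0  # how often the add_symbols step ran

    def claim_file(self, name):
        # Models claim_file_hook, which in turn triggers add_symbols.
        self.claimed.append(name)
        self.add_symbols_calls += 1

    def cleanup(self):
        # Assumption: cleanup forgets every claimed file.
        self.claimed = []

    def files_to_compile(self, is_included):
        # Models the plugin asking the linker about each claimed file.
        return [f for f in self.claimed if is_included(f)]


def link_with_cleanup(files, discarded):
    plugin = ToyPlugin()
    for f in files:
        plugin.claim_file(f)
    plugin.cleanup()
    for f in files:                 # second registration pass
        if f not in discarded:
            plugin.claim_file(f)
    return plugin.files_to_compile(lambda f: True), plugin.add_symbols_calls


def link_with_v3(files, discarded):
    plugin = ToyPlugin()
    for f in files:
        plugin.claim_file(f)
    # The linker reports discarded files as "not included" when asked.
    kept = plugin.files_to_compile(lambda f: f not in discarded)
    return kept, plugin.add_symbols_calls


files = ["a.o", "b.o", "c.o"]
discarded = {"b.o"}
print(link_with_cleanup(files, discarded))
print(link_with_v3(files, discarded))
```

In this model both approaches keep the same files, but the cleanup route
runs the `add_symbols` step five times instead of three, and the results of
the second pass are never used.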

> (also, hm, it seems to confirm my idea that LTO .o files should have had the
> correct symbol table so normal linker algorithms would work)

I agree. If GCC LTO object files contained a correct ELF symbol table, we
could also eliminate the need for the special LTO-aware ar command. It
seems to be a very common error to use an ar command that doesn't
understand LTO object files, which results in mysterious "undefined
symbol" errors even though the object files in the archive provide those
very symbols.
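
As a rough illustration of that failure mode, the sketch below models an
archive index as a plain dictionary. Everything in it is an assumption made
for illustration (`build_index`, `resolve`, and the `lto`/`defines` fields
are hypothetical, not how ar or any linker is implemented): an archiver
that does not understand LTO objects is assumed to record no symbols for
them.

```python
# Toy model only: names and data layout are hypothetical.

def build_index(members, understands_lto):
    """Map each defined symbol to the archive member that provides it."""
    index = {}
    for member, obj in members.items():
        if obj["lto"] and not understands_lto:
            # Assumption: a plain archiver finds no usable symbols here.
            continue
        for sym in obj["defines"]:
            index.setdefault(sym, member)
    return index


def resolve(undefined, index):
    """Return the member to extract for each symbol, or None if unknown."""
    return {sym: index.get(sym) for sym in undefined}


members = {"foo.o": {"lto": True, "defines": ["foo"]}}

plain = resolve(["foo"], build_index(members, understands_lto=False))
aware = resolve(["foo"], build_index(members, understands_lto=True))
print(plain)  # "foo" maps to None: reported as undefined
print(aware)  # "foo" maps to "foo.o": the member gets extracted
```

The member defining `foo` is in the archive in both cases; only the index
differs, which is why the resulting error looks so mysterious.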

