Created attachment 59754
Preprocessed source code for minimal test case.

When GCC is directed to generate position-independent code (via the -fPIC or -fpic option), it fails to inline function invocations that are clearly suitable for inlining. I have constructed a minimal test case demonstrating this behavior. The attached file mycode.i is the preprocessed source. Please see the attached source before reading further.

To demonstrate, suppose we put the code in mycode.c and build it into a shared library (simply compiling it into an object file would suffice, but "objdump -d" shows the targets of branch instructions more clearly for shared libraries than for object files). First, build it *not* as position-independent code (i.e. without -fPIC or -fpic), with the following command:

    gcc -shared -o libmycode.so mycode.c -Wl,-h,libmycode.so.0 -O3

If we then run "objdump -d libmycode.so", we see that the invocation of function identity in function also_identity has been inlined away.

Next, build the code again, but as position-independent code, with either of the commands below:

    gcc -shared -o libmycode.so mycode.c -Wl,-h,libmycode.so.0 -O3 -fPIC
    gcc -shared -o libmycode.so mycode.c -Wl,-h,libmycode.so.0 -O3 -fpic

If we again run "objdump -d libmycode.so", we see that, this time, gcc has *not* inlined the invocation of function identity in function also_identity.

(Of course, for my toy example, the failure to inline has a negligible effect on performance, but a missed optimization is still a missed optimization, and one can imagine that the detriment to performance might be more significant in a more "real world" scenario.)

I am using version 14.2.1+r134+gab884fffe3fc-1 of the Arch Linux gcc package. I have also pasted below the output of "gcc -v" on my machine:

    Using built-in specs.
    COLLECT_GCC=gcc
    COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-pc-linux-gnu/14.2.1/lto-wrapper
    Target: x86_64-pc-linux-gnu
    Configured with: /build/gcc/src/gcc/configure --enable-languages=ada,c,c++,d,fortran,go,lto,m2,objc,obj-c++,rust --enable-bootstrap --prefix=/usr --libdir=/usr/lib --libexecdir=/usr/lib --mandir=/usr/share/man --infodir=/usr/share/info --with-bugurl=https://gitlab.archlinux.org/archlinux/packaging/packages/gcc/-/issues --with-build-config=bootstrap-lto --with-linker-hash-style=gnu --with-system-zlib --enable-__cxa_atexit --enable-cet=auto --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-default-ssp --enable-gnu-indirect-function --enable-gnu-unique-object --enable-libstdcxx-backtrace --enable-link-serialization=1 --enable-linker-build-id --enable-lto --enable-multilib --enable-plugin --enable-shared --enable-threads=posix --disable-libssp --disable-libstdcxx-pch --disable-werror
    Thread model: posix
    Supported LTO compression algorithms: zlib zstd
    gcc version 14.2.1 20240910 (GCC)
https://gcc.gnu.org/onlinedocs/gcc-14.2.0/gcc/Optimize-Options.html#index-fsemantic-interposition

This is by design. You need -fno-semantic-interposition.
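
For readers without the attachment, here is a rough sketch of what such a test case might look like. The attached mycode.i is not reproduced in this thread, so the function bodies below are an assumption; only the names identity and also_identity come from the report. The comments describe the interposition rule as the linked documentation presents it.

```c
/* mycode.c: hypothetical reconstruction of the test case.
 * The real attachment is not shown in this thread. */

/* identity has default (external) visibility. In a shared library,
 * ELF symbol interposition allows another module (for example one
 * loaded via LD_PRELOAD) to supply its own definition of this symbol. */
int identity(int x)
{
    return x;
}

int also_identity(int x)
{
    /* When compiling with -fPIC or -fpic, GCC by default assumes that
     * identity may be interposed at run time, so it keeps this as a
     * real call instead of inlining the body above.
     *
     * Adding -fno-semantic-interposition tells GCC it may assume the
     * definition in this translation unit is the one that runs, which
     * makes the call eligible for inlining again:
     *
     *   gcc -shared -o libmycode.so mycode.c -Wl,-h,libmycode.so.0 \
     *       -O3 -fPIC -fno-semantic-interposition
     */
    return identity(x);
}
```

Running "objdump -d libmycode.so" after that build should show whether the call in also_identity is gone on a given GCC version. Note that the flag changes what the compiler may assume, not what the dynamic linker does: identity is still exported and can still be interposed for callers outside the library.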
Or declare the function with hidden visibility.
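
A minimal sketch of the hidden-visibility approach, assuming a test case shaped like the one described in the report (the function bodies here are a guess, since the attachment is not shown in this thread):

```c
/* mycode.c: hypothetical variant using hidden visibility. */

/* A hidden symbol is not exported from the shared library, so no
 * other module can interpose it. GCC is then free to inline calls
 * to it even when compiling with -fPIC or -fpic. */
__attribute__((visibility("hidden")))
int identity(int x)
{
    return x;
}

/* Still exported with default visibility; the call below can now be
 * inlined because identity always binds within this library. */
int also_identity(int x)
{
    return identity(x);
}
```

The trade-off is that identity can no longer be called from outside libmycode.so. If the function is only used within a single source file, declaring it static has a similar effect.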
(In reply to Andrew Pinski from comment #1)
> https://gcc.gnu.org/onlinedocs/gcc-14.2.0/gcc/Optimize-Options.html#index-fsemantic-interposition
>
> This is by design. You need -fno-semantic-interposition.

(In reply to Andreas Schwab from comment #2)
> Or declare the function with hidden visibility.

Thanks for the tips. Sorry to have wasted y'all's time.
It's a common footgun in ELF. :)