When building openSUSE:Factory nvme-cli with -flto=2 it causes a variation in the resulting nvme binary that is not there without -flto /usr/sbin/nvme differs in assembler output @@ -433,7 +433,7 @@ Disassembly of section .text: -_GLOBAL__I_65535_0_cc6lpCUH.o.10070: +_GLOBAL__I_65535_0_cckZcJmg.o.10070: sub $something,%rsp lea offset(%rip),%rdi # <plugin.lto_priv.10> callq <register_extension> strace shows that ["cc", "-D_GNU_SOURCE", "-D__CHECK_ENDIAN__", "-O2", "-Wall", "-D_FORTIFY_SOURCE=2", "-fstack-protector-strong", "-funwind-tables", "-fasynchronous-unwind-tables", "-fstack-clash-protection", "-Werror=return-type", "-flto=2", "-I.", "-std=gnu99", "-I.", "-DLIBUUID", "-DNVME_VERSION=\"1.8.1+git135.9bab71e\"", "nvme.c", "-o", "nvme", "argconfig.o", "suffix.o", "parser.o", "nvme-print.o", "nvme-ioctl.o", "nvme-lightnvm.o", "fabrics.o", "json.o", "nvme-models.o", "plugin.o", "nvme-status.o", "plugins/intel/intel-nvme.o", "plugins/lnvm/lnvm-nvme.o", "plugins/memblaze/memblaze-nvme.o", "plugins/wdc/wdc-nvme.o", "plugins/wdc/wdc-utils.o", "plugins/huawei/huawei-nvme.o", "plugins/netapp/netapp-nvme.o", "plugins/toshiba/toshiba-nvme.o", "plugins/micron/micron-nvme.o", "plugins/seagate/seagate-nvme.o", "plugins/virtium/virtium-nvme.o", "plugins/shannon/shannon-nvme.o", "-luuid"] calls ["/usr/lib64/gcc/x86_64-suse-linux/9/collect2", "-plugin", "/usr/lib64/gcc/x86_64-suse-linux/9/liblto_plugin.so", "-plugin-opt=/usr/lib64/gcc/x86_64-suse-linux/9/lto-wrapper", "-plugin-opt=-fresolution=/tmp/ccQNkgLE.res", "-plugin-opt=-pass-through=-lgcc", "-plugin-opt=-pass-through=-lgcc_s", "-plugin-opt=-pass-through=-lc", "-plugin-opt=-pass-through=-lgcc", "-plugin-opt=-pass-through=-lgcc_s", "-flto=2", "--build-id", "--eh-frame-hdr", "-m", "elf_x86_64", "-dynamic-linker", "/lib64/ld-linux-x86-64.so.2", "-pie", "-o", "nvme", "/usr/lib64/gcc/x86_64-suse-linux/9/../../../../lib64/Scrt1.o", "/usr/lib64/gcc/x86_64-suse-linux/9/../../../../lib64/crti.o", "/usr/lib64/gcc/x86_64-suse-linux/9/crtbeginS.o", "-L/usr/lib64/gcc/x86_64-suse-linux/9", "-L/usr/lib64/gcc/x86_64-suse-linux/9/../../../../lib64", "-L/lib/../lib64", "-L/usr/lib/../lib64", "-L/usr/lib64/gcc/x86_64-suse-linux/9/../../../../x86_64-suse-linux/lib", "-L/usr/lib64/gcc/x86_64-suse-linux/9/../../..", "/tmp/cchuJoLU.o", "argconfig.o", "suffix.o", "parser.o", "nvme-print.o", "nvme-ioctl.o", "nvme-lightnvm.o", "fabrics.o", "json.o", "nvme-models.o", "plugin.o", "nvme-status.o", "plugins/intel/intel-nvme.o", "plugins/lnvm/lnvm-nvme.o", "plugins/memblaze/memblaze-nvme.o", "plugins/wdc/wdc-nvme.o", "plugins/wdc/wdc-utils.o", "plugins/huawei/huawei-nvme.o", "plugins/netapp/netapp-nvme.o", "plugins/toshiba/toshiba-nvme.o", "plugins/micron/micron-nvme.o", "plugins/seagate/seagate-nvme.o", "plugins/virtium/virtium-nvme.o", "plugins/shannon/shannon-nvme.o", "-luuid", "-lgcc", "--push-state", "--as-needed", "-lgcc_s", "--pop-state", "-lc", "-lgcc", "--push-state", "--as-needed", "-lgcc_s", "--pop-state", "/usr/lib64/gcc/x86_64-suse-linux/9/crtendS.o", "/usr/lib64/gcc/x86_64-suse-linux/9/../../../../lib64/crtn.o"] Without -flto the collect2 call also gets the random tmp file passed, but the name is not embedded in the result.
We also tried building with -pipe, but the observed behavior didn't change.
So it is a global constructor which is picking up the name of the object file.
It seems to be triggered by nvme-cli/cmd_handler.h #define PLUGIN(name, cmds) \ static struct plugin plugin = { \ name \ .commands = commands \ }; \ \ static void init(void) __attribute__((constructor)); \ static void init(void) \ { \ register_extension(&plugin); \ }
A simple two-file testcase like static void init(void) __attribute__((constructor)); static void init() { static volatile int i = 0; } int main() { return 0; } ---- static void init2(void) __attribute__((constructor)); static void init2() { static volatile int i = 0; } reproduces this. Having two ctors (in different CUs) is important.
Confirmed: $ marxin@marxinbox:/tmp> gcc -flto pr91307-*.c -o a.out && objdump -S a.out | grep GLOBAL 0000000000401109 <_GLOBAL__I_65535_0_ccIH3dv1.o.4348>: marxin@marxinbox:/tmp> gcc -flto pr91307-*.c -o a.out && objdump -S a.out | grep GLOBAL 0000000000401109 <_GLOBAL__I_65535_0_ccrEZiHf.o.4348>:
The name to function is given here: get_file_function_name Breakpoint 1, get_file_function_name (type=0x7fffffffd670 "I_65535_0") at /home/marxin/Programming/gcc/gcc/tree.c:9809 9809 if (first_global_object_name) (gdb) bt #0 get_file_function_name (type=0x7fffffffd670 "I_65535_0") at /home/marxin/Programming/gcc/gcc/tree.c:9809 #1 0x0000000000ab7c9b in cgraph_build_static_cdtor_1 (which=<optimized out>, body=0x7ffff78958e0, priority=65535, final=<optimized out>, optimization=0x7ffff7895880, target=0x7ffff789c558) at /home/marxin/Programming/gcc/gcc/ipa.c:845 #2 0x0000000000ab830b in build_cdtor (ctor_p=true, cdtors=...) at /home/marxin/Programming/gcc/gcc/tree.h:3258 #3 0x0000000000ab877d in build_cdtor_fns (dtors=0x7fffffffd7d0, ctors=0x7fffffffd720) at /home/marxin/Programming/gcc/gcc/ipa.c:1064 #4 ipa_cdtor_merge () at /home/marxin/Programming/gcc/gcc/ipa.c:1093 #5 0x0000000000bb9f1a in execute_one_pass (pass=0x2263e60) at /home/marxin/Programming/gcc/gcc/passes.c:2474 #6 0x0000000000bbb927 in execute_ipa_pass_list (pass=0x2263e60) at /home/marxin/Programming/gcc/gcc/passes.c:2914 #7 0x00000000007e1e51 in do_whole_program_analysis () at /home/marxin/Programming/gcc/gcc/context.h:48 #8 lto_main () at /home/marxin/Programming/gcc/gcc/lto/lto.c:628 #9 0x0000000000c8c900 in compile_file () at /home/marxin/Programming/gcc/gcc/toplev.c:456 #10 0x00000000007ba21b in do_compile () at /home/marxin/Programming/gcc/gcc/toplev.c:2190 #11 toplev::main (this=this@entry=0x7fffffffda6e, argc=<optimized out>, argc@entry=19, argv=<optimized out>, argv@entry=0x7fffffffdb68) at /home/marxin/Programming/gcc/gcc/toplev.c:2325 #12 0x00000000007bdd2f in main (argc=19, argv=0x7fffffffdb68) at /home/marxin/Programming/gcc/gcc/main.c:39 So when one uses --save-temps: gcc -flto pr91307-*.c -o a.out --save-temps && objdump -S a.out | grep GLOBAL generating: _GLOBAL__I_65535_0_pr91307_1.o We can probably give a better name in WPA.
Can anything as simple as this resolve it: diff --git a/gcc/tree.c b/gcc/tree.c index 8cf75f22220..56c0fd450f1 100644 --- a/gcc/tree.c +++ b/gcc/tree.c @@ -9817,12 +9817,17 @@ get_file_function_name (const char *type) || (strncmp (type, "sub_", 4) == 0 && (type[4] == 'I' || type[4] == 'D'))) { - const char *file = main_input_filename; - if (! file) - file = LOCATION_FILE (input_location); - /* Just use the file's basename, because the full pathname - might be quite long. */ - p = q = ASTRDUP (lbasename (file)); + if (flag_wpa) + p = q = ASTRDUP ("wpa"); + else + { + const char *file = main_input_filename; + if (! file) + file = LOCATION_FILE (input_location); + /* Just use the file's basename, because the full pathname + might be quite long. */ + p = q = ASTRDUP (lbasename (file)); + } } else { ?
I wonder why we can't simply use .init_array and thus get away with a local symbol on targets that support this. Uh, so the symbol is already local but we're keeping it in our stripping process it seems. And the name is fancy "for debugging purposes". Not sure if gdb has special breakpoints for ctors/dtors or if the user is expected to know the mangling. Btw, for !targetm.have_ctors_dtors we're appending "some randomness" anyways so I wonder if we shouldn't simply pass down appropriate deterministic randomness by producing it from the symbols of the individual ctors/dtors we call here: /* Find the next batch of constructors/destructors with the same initialization priority. */ for (;i < j; i++) { tree call; fn = cdtors[i]; call = build_call_expr (fn, 0); if (ctor_p) DECL_STATIC_CONSTRUCTOR (fn) = 0; else DECL_STATIC_DESTRUCTOR (fn) = 0; /* We do not want to optimize away pure/const calls here. When optimizing, these should be already removed, when not optimizing, we want user to be able to breakpoint in them. */ TREE_SIDE_EFFECTS (call) = 1; append_to_statement_list (call, &body); } like Index: gcc/ipa.c =================================================================== --- gcc/ipa.c (revision 274527) +++ gcc/ipa.c (working copy) @@ -827,7 +827,7 @@ ipa_discover_variable_flags (void) static void cgraph_build_static_cdtor_1 (char which, tree body, int priority, bool final, tree optimization, - tree target) + tree target, hashval_t hstate) { static int counter = 0; char which_buf[16]; @@ -842,7 +842,7 @@ cgraph_build_static_cdtor_1 (char which, /* Proudce sane name but one not recognizable by collect2, just for the case we fail to inline the function. */ sprintf (which_buf, "sub_%c_%.5d_%d", which, priority, counter++); - name = get_file_function_name (which_buf); + name = get_file_function_name (which_buf, hstate); decl = build_decl (input_location, FUNCTION_DECL, name, build_function_type_list (void_type_node, NULL_TREE)); @@ -973,6 +973,7 @@ build_cdtor (bool ctor_p, const vec<tree } /* Find the next batch of constructors/destructors with the same initialization priority. */ + inchash::hash hstate; for (;i < j; i++) { tree call; @@ -987,13 +988,15 @@ build_cdtor (bool ctor_p, const vec<tree optimizing, we want user to be able to breakpoint in them. */ TREE_SIDE_EFFECTS (call) = 1; append_to_statement_list (call, &body); + hstate.merge_hash (IDENTIFIER_HASH_VALUE (DECL_NAME (fn))); } gcc_assert (body != NULL_TREE); /* Generate a function to call all the function of like priority. */ cgraph_build_static_cdtor_1 (ctor_p ? 'I' : 'D', body, priority, true, DECL_FUNCTION_SPECIFIC_OPTIMIZATION (cdtors[0]), - DECL_FUNCTION_SPECIFIC_TARGET (cdtors[0])); + DECL_FUNCTION_SPECIFIC_TARGET (cdtors[0]), + hstate.end); } } plus appropriate changes to get_file_function_name. It also seems to me we may not need to produce a unique symbol here so Martins original idea of using "wpa" for LTO might work fine (but I'd change the signature of get_file_function_name to pass in the actual "file name" then, keeping the original use of main_input_filename via an overload). Note with !targetm.have_ctors_dtors we are going to produce either symbol clashes or random symbols (with and without -flto, "conveniently" IPA CDTOR merging is only run for -flto or !targetm.have_ctors_dtors for whathever reasons).
I am not 100% sure if hashing calle names works safely, since they will all be something like "static construction" or so, so I guess one can construct two different translation units that will end up with same hash. But perhaps we could teach LTO FE to produce those names purely on symbols exported from the LTO unit since those should be unique and use it everywhere (including the lto_priv numbers)? Of course unless we go the route of getting API to the plugin interface to obtain full symbol name.
After a full rebuild of openSUSE Tumbleweed, the GLOBAL__I_65535_ string appears in diffs of 12 packages: blog libpt2 lodepng nethogs nodejs12 nvme-cli python-python-crfsuite qpid-cpp Rivet udp2raw-tunnel udpspeeder yate-bts Of those, qpid-cpp Rivet were already unreproducible without LTO.
Author: rguenth Date: Tue Aug 20 13:14:59 2019 New Revision: 274748 URL: https://gcc.gnu.org/viewcvs?rev=274748&root=gcc&view=rev Log: 2019-08-20 Richard Biener <rguenther@suse.de> PR lto/91307 * ipa.c (cgraph_build_static_cdtor_1): Use names not recognizable by collect2 when targetm.have_ctors_dtors which avoids dragging in temporary filenames from LTO input objects. Modified: trunk/gcc/ChangeLog trunk/gcc/ipa.c
Fixed for trunk.