Bug 91307 - -flto causes binary to vary
Summary: -flto causes binary to vary
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: lto (show other bugs)
Version: 9.1.1
: P3 normal
Target Milestone: ---
Assignee: Richard Biener
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2019-07-31 12:12 UTC by Bernhard M. Wiedemann
Modified: 2019-08-20 13:26 UTC (History)
3 users (show)

See Also:
Host:
Target:
Build:
Known to work: 10.0
Known to fail:
Last reconfirmed: 2019-07-31 00:00:00


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Bernhard M. Wiedemann 2019-07-31 12:12:57 UTC
When building openSUSE:Factory nvme-cli
with -flto=2
it causes a variation in the resulting nvme binary
that is not there without -flto

/usr/sbin/nvme differs in assembler output
@@ -433,7 +433,7 @@

 Disassembly of section .text:

-_GLOBAL__I_65535_0_cc6lpCUH.o.10070:
+_GLOBAL__I_65535_0_cckZcJmg.o.10070:
        sub    $something,%rsp
        lea    offset(%rip),%rdi        #   <plugin.lto_priv.10>
        callq  <register_extension>


strace shows that

["cc", "-D_GNU_SOURCE", "-D__CHECK_ENDIAN__", "-O2", "-Wall", "-D_FORTIFY_SOURCE=2", "-fstack-protector-strong", "-funwind-tables", "-fasynchronous-unwind-tables", "-fstack-clash-protection", "-Werror=return-type", "-flto=2", "-I.", "-std=gnu99", "-I.", "-DLIBUUID", "-DNVME_VERSION=\"1.8.1+git135.9bab71e\"", "nvme.c", "-o", "nvme", "argconfig.o", "suffix.o", "parser.o", "nvme-print.o", "nvme-ioctl.o", "nvme-lightnvm.o", "fabrics.o", "json.o", "nvme-models.o", "plugin.o", "nvme-status.o", "plugins/intel/intel-nvme.o", "plugins/lnvm/lnvm-nvme.o", "plugins/memblaze/memblaze-nvme.o", "plugins/wdc/wdc-nvme.o", "plugins/wdc/wdc-utils.o", "plugins/huawei/huawei-nvme.o", "plugins/netapp/netapp-nvme.o", "plugins/toshiba/toshiba-nvme.o", "plugins/micron/micron-nvme.o", "plugins/seagate/seagate-nvme.o", "plugins/virtium/virtium-nvme.o", "plugins/shannon/shannon-nvme.o", "-luuid"]

calls

["/usr/lib64/gcc/x86_64-suse-linux/9/collect2", "-plugin", "/usr/lib64/gcc/x86_64-suse-linux/9/liblto_plugin.so", "-plugin-opt=/usr/lib64/gcc/x86_64-suse-linux/9/lto-wrapper", "-plugin-opt=-fresolution=/tmp/ccQNkgLE.res", "-plugin-opt=-pass-through=-lgcc", "-plugin-opt=-pass-through=-lgcc_s", "-plugin-opt=-pass-through=-lc", "-plugin-opt=-pass-through=-lgcc", "-plugin-opt=-pass-through=-lgcc_s", "-flto=2", "--build-id", "--eh-frame-hdr", "-m", "elf_x86_64", "-dynamic-linker", "/lib64/ld-linux-x86-64.so.2", "-pie", "-o", "nvme", "/usr/lib64/gcc/x86_64-suse-linux/9/../../../../lib64/Scrt1.o", "/usr/lib64/gcc/x86_64-suse-linux/9/../../../../lib64/crti.o", "/usr/lib64/gcc/x86_64-suse-linux/9/crtbeginS.o", "-L/usr/lib64/gcc/x86_64-suse-linux/9", "-L/usr/lib64/gcc/x86_64-suse-linux/9/../../../../lib64", "-L/lib/../lib64", "-L/usr/lib/../lib64", "-L/usr/lib64/gcc/x86_64-suse-linux/9/../../../../x86_64-suse-linux/lib", "-L/usr/lib64/gcc/x86_64-suse-linux/9/../../..", "/tmp/cchuJoLU.o", "argconfig.o", "suffix.o", "parser.o", "nvme-print.o", "nvme-ioctl.o", "nvme-lightnvm.o", "fabrics.o", "json.o", "nvme-models.o", "plugin.o", "nvme-status.o", "plugins/intel/intel-nvme.o", "plugins/lnvm/lnvm-nvme.o", "plugins/memblaze/memblaze-nvme.o", "plugins/wdc/wdc-nvme.o", "plugins/wdc/wdc-utils.o", "plugins/huawei/huawei-nvme.o", "plugins/netapp/netapp-nvme.o", "plugins/toshiba/toshiba-nvme.o", "plugins/micron/micron-nvme.o", "plugins/seagate/seagate-nvme.o", "plugins/virtium/virtium-nvme.o", "plugins/shannon/shannon-nvme.o", "-luuid", "-lgcc", "--push-state", "--as-needed", "-lgcc_s", "--pop-state", "-lc", "-lgcc", "--push-state", "--as-needed", "-lgcc_s", "--pop-state", "/usr/lib64/gcc/x86_64-suse-linux/9/crtendS.o", "/usr/lib64/gcc/x86_64-suse-linux/9/../../../../lib64/crtn.o"]


Without -flto the collect2 call also gets the random tmp file passed, but the name is not embedded in the result.
Comment 1 Simon Schricker 2019-07-31 12:36:06 UTC
We also tried building with -pipe, but the observed behavior didn't change.
Comment 2 Andrew Pinski 2019-07-31 13:27:40 UTC
So it is a global constructor which is picking up the name of the object file.
Comment 3 Bernhard M. Wiedemann 2019-07-31 13:57:59 UTC
It seems to be triggered by nvme-cli/cmd_handler.h

#define PLUGIN(name, cmds)                              \
static struct plugin plugin = {                         \
        name                                            \
        .commands = commands                            \
};                                                      \
                                                        \
static void init(void) __attribute__((constructor));    \
static void init(void)                                  \
{                                                       \
        register_extension(&plugin);                    \
}
Comment 4 Richard Biener 2019-07-31 14:12:04 UTC
A simple two-file testcase like

static void init(void) __attribute__((constructor));
static void init()
{
  static volatile int i = 0;
}

int main() { return 0; }

----

static void init2(void) __attribute__((constructor));
static void init2()
{
    static volatile int i = 0;
}

reproduces this.  Having two ctors (in different CUs) is important.
Comment 5 Martin Liška 2019-07-31 14:15:56 UTC
Confirmed:

$ marxin@marxinbox:/tmp> gcc -flto pr91307-*.c -o a.out && objdump -S a.out | grep GLOBAL
0000000000401109 <_GLOBAL__I_65535_0_ccIH3dv1.o.4348>:
marxin@marxinbox:/tmp> gcc -flto pr91307-*.c -o a.out && objdump -S a.out | grep GLOBAL
0000000000401109 <_GLOBAL__I_65535_0_ccrEZiHf.o.4348>:
Comment 6 Martin Liška 2019-07-31 14:39:36 UTC
The name to function is given here:
get_file_function_name

Breakpoint 1, get_file_function_name (type=0x7fffffffd670 "I_65535_0") at /home/marxin/Programming/gcc/gcc/tree.c:9809
9809	  if (first_global_object_name)
(gdb) bt
#0  get_file_function_name (type=0x7fffffffd670 "I_65535_0") at /home/marxin/Programming/gcc/gcc/tree.c:9809
#1  0x0000000000ab7c9b in cgraph_build_static_cdtor_1 (which=<optimized out>, body=0x7ffff78958e0, priority=65535, final=<optimized out>, optimization=0x7ffff7895880, target=0x7ffff789c558) at /home/marxin/Programming/gcc/gcc/ipa.c:845
#2  0x0000000000ab830b in build_cdtor (ctor_p=true, cdtors=...) at /home/marxin/Programming/gcc/gcc/tree.h:3258
#3  0x0000000000ab877d in build_cdtor_fns (dtors=0x7fffffffd7d0, ctors=0x7fffffffd720) at /home/marxin/Programming/gcc/gcc/ipa.c:1064
#4  ipa_cdtor_merge () at /home/marxin/Programming/gcc/gcc/ipa.c:1093
#5  0x0000000000bb9f1a in execute_one_pass (pass=0x2263e60) at /home/marxin/Programming/gcc/gcc/passes.c:2474
#6  0x0000000000bbb927 in execute_ipa_pass_list (pass=0x2263e60) at /home/marxin/Programming/gcc/gcc/passes.c:2914
#7  0x00000000007e1e51 in do_whole_program_analysis () at /home/marxin/Programming/gcc/gcc/context.h:48
#8  lto_main () at /home/marxin/Programming/gcc/gcc/lto/lto.c:628
#9  0x0000000000c8c900 in compile_file () at /home/marxin/Programming/gcc/gcc/toplev.c:456
#10 0x00000000007ba21b in do_compile () at /home/marxin/Programming/gcc/gcc/toplev.c:2190
#11 toplev::main (this=this@entry=0x7fffffffda6e, argc=<optimized out>, argc@entry=19, argv=<optimized out>, argv@entry=0x7fffffffdb68) at /home/marxin/Programming/gcc/gcc/toplev.c:2325
#12 0x00000000007bdd2f in main (argc=19, argv=0x7fffffffdb68) at /home/marxin/Programming/gcc/gcc/main.c:39

So when one uses --save-temps:

gcc -flto pr91307-*.c -o a.out --save-temps && objdump -S a.out | grep GLOBAL
generating: _GLOBAL__I_65535_0_pr91307_1.o

We can probably give a better name in WPA.
Comment 7 Martin Liška 2019-07-31 14:49:03 UTC
Can anything as simple as this resolve it:

diff --git a/gcc/tree.c b/gcc/tree.c
index 8cf75f22220..56c0fd450f1 100644
--- a/gcc/tree.c
+++ b/gcc/tree.c
@@ -9817,12 +9817,17 @@ get_file_function_name (const char *type)
 	   || (strncmp (type, "sub_", 4) == 0
 	       && (type[4] == 'I' || type[4] == 'D')))
     {
-      const char *file = main_input_filename;
-      if (! file)
-	file = LOCATION_FILE (input_location);
-      /* Just use the file's basename, because the full pathname
-	 might be quite long.  */
-      p = q = ASTRDUP (lbasename (file));
+      if (flag_wpa)
+	p = q = ASTRDUP ("wpa");
+      else
+	{
+	  const char *file = main_input_filename;
+	  if (! file)
+	    file = LOCATION_FILE (input_location);
+	  /* Just use the file's basename, because the full pathname
+	     might be quite long.  */
+	  p = q = ASTRDUP (lbasename (file));
+	}
     }
   else
     {

?
Comment 8 Richard Biener 2019-08-15 12:27:47 UTC
I wonder why we can't simply use .init_array and thus get away with a
local symbol on targets that support this.  Uh, so the symbol is already
local but we're keeping it in our stripping process it seems.  And
the name is fancy "for debugging purposes".  Not sure if gdb has special
breakpoints for ctors/dtors or if the user is expected to know the
mangling.

Btw, for !targetm.have_ctors_dtors we're appending "some randomness"
anyways so I wonder if we shouldn't simply pass down appropriate
deterministic randomness by producing it from the symbols of the individual
ctors/dtors we call here:

      /* Find the next batch of constructors/destructors with the same
         initialization priority.  */
      for (;i < j; i++)
        {
          tree call;
          fn = cdtors[i];
          call = build_call_expr (fn, 0);
          if (ctor_p)
            DECL_STATIC_CONSTRUCTOR (fn) = 0;
          else
            DECL_STATIC_DESTRUCTOR (fn) = 0;
          /* We do not want to optimize away pure/const calls here.
             When optimizing, these should be already removed, when not
             optimizing, we want user to be able to breakpoint in them.  */
          TREE_SIDE_EFFECTS (call) = 1;
          append_to_statement_list (call, &body);
        }

like

Index: gcc/ipa.c
===================================================================
--- gcc/ipa.c   (revision 274527)
+++ gcc/ipa.c   (working copy)
@@ -827,7 +827,7 @@ ipa_discover_variable_flags (void)
 static void
 cgraph_build_static_cdtor_1 (char which, tree body, int priority, bool final,
                             tree optimization,
-                            tree target)
+                            tree target, hashval_t hstate)
 {
   static int counter = 0;
   char which_buf[16];
@@ -842,7 +842,7 @@ cgraph_build_static_cdtor_1 (char which,
   /* Proudce sane name but one not recognizable by collect2, just for the
      case we fail to inline the function.  */
     sprintf (which_buf, "sub_%c_%.5d_%d", which, priority, counter++);
-  name = get_file_function_name (which_buf);
+  name = get_file_function_name (which_buf, hstate);
 
   decl = build_decl (input_location, FUNCTION_DECL, name,
                     build_function_type_list (void_type_node, NULL_TREE));
@@ -973,6 +973,7 @@ build_cdtor (bool ctor_p, const vec<tree
        }
       /* Find the next batch of constructors/destructors with the same
         initialization priority.  */
+      inchash::hash hstate;
       for (;i < j; i++)
        {
          tree call;
@@ -987,13 +988,15 @@ build_cdtor (bool ctor_p, const vec<tree
             optimizing, we want user to be able to breakpoint in them.  */
          TREE_SIDE_EFFECTS (call) = 1;
          append_to_statement_list (call, &body);
+         hstate.merge_hash (IDENTIFIER_HASH_VALUE (DECL_NAME (fn)));
        }
       gcc_assert (body != NULL_TREE);
       /* Generate a function to call all the function of like
         priority.  */
       cgraph_build_static_cdtor_1 (ctor_p ? 'I' : 'D', body, priority, true,
                                   DECL_FUNCTION_SPECIFIC_OPTIMIZATION (cdtors[0]),
-                                  DECL_FUNCTION_SPECIFIC_TARGET (cdtors[0]));
+                                  DECL_FUNCTION_SPECIFIC_TARGET (cdtors[0]),
+                                  hstate.end);
     }
 }
 
plus appropriate changes to get_file_function_name.

It also seems to me we may not need to produce a unique symbol here
so Martins original idea of using "wpa" for LTO might work fine
(but I'd change the signature of get_file_function_name to pass in
the actual "file name" then, keeping the original use of main_input_filename
via an overload).

Note with !targetm.have_ctors_dtors we are going to produce
either symbol clashes or random symbols (with and without -flto,
"conveniently" IPA CDTOR merging is only run for -flto or !targetm.have_ctors_dtors for whathever reasons).
Comment 9 Jan Hubicka 2019-08-15 15:58:16 UTC
I am not 100% sure if hashing calle names works safely, since they will all be something like "static construction" or so, so I guess one can construct two different translation units that will end up with same hash. But perhaps we could teach LTO FE to produce those names purely on symbols exported from the LTO unit since those should be unique and use it everywhere (including the lto_priv numbers)?
Of course unless we go the route of getting API to the plugin interface to obtain full symbol name.
Comment 10 Bernhard M. Wiedemann 2019-08-20 08:58:39 UTC
After a full rebuild of openSUSE Tumbleweed, the GLOBAL__I_65535_ string
appears in diffs of 12 packages:
blog
libpt2
lodepng
nethogs
nodejs12
nvme-cli
python-python-crfsuite
qpid-cpp
Rivet
udp2raw-tunnel
udpspeeder
yate-bts

Of those, qpid-cpp Rivet were already unreproducible without LTO.
Comment 11 Richard Biener 2019-08-20 13:15:31 UTC
Author: rguenth
Date: Tue Aug 20 13:14:59 2019
New Revision: 274748

URL: https://gcc.gnu.org/viewcvs?rev=274748&root=gcc&view=rev
Log:
2019-08-20  Richard Biener  <rguenther@suse.de>

	PR lto/91307
	* ipa.c (cgraph_build_static_cdtor_1): Use names not recognizable
	by collect2 when targetm.have_ctors_dtors which avoids dragging
	in temporary filenames from LTO input objects.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/ipa.c
Comment 12 Richard Biener 2019-08-20 13:26:48 UTC
Fixed for trunk.