Bug 53572 - Some public symbols don't get to serialized LTO
Summary: Some public symbols don't get to serialized LTO
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: lto (show other bugs)
Version: 4.7.1
: P3 normal
Target Milestone: 4.7.2
Assignee: Jan Hubicka
URL:
Keywords: lto
Depends on:
Blocks: 53604
  Show dependency treegraph
 
Reported: 2012-06-04 12:37 UTC by Yuri Gribov
Modified: 2012-09-07 13:06 UTC (History)
5 users (show)

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed: 2012-06-04 00:00:00


Attachments
This archive includes Makefile, header and two cpps which illustrate the problem (618 bytes, application/x-compressed-tar)
2012-06-04 12:37 UTC, Yuri Gribov
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Yuri Gribov 2012-06-04 12:37:11 UTC
Created attachment 27553 [details]
This archive includes Makefile, header and two cpps which illustrate the problem

The attached small project builds successfully by default

  ygribov@ygribov-desktop:~/test/repro-x86$ make CC=<path-to-gcc> clean all
  rm -f prog main.o lib.a lib.o
  gcc  -O2 -DNDEBUG -frtti -c -o main.o main.cpp
  gcc -O2 -DNDEBUG -frtti -c -o lib.o lib.cpp
  ar rcs lib.a lib.o
  gcc  main.o -L. lib.a -o prog -lsupc++

but fails if I compile with LTO enabled:

  ygribov@ygribov-desktop:~/test/repro-x86/jni$ make CC=<path-to-gcc> LTO=-flto clean all
  rm -f prog main.o lib.a lib.o
  gcc -flto -O2 -DNDEBUG -frtti -save-temps -c -o main.o main.cpp
  gcc -O2 -DNDEBUG -frtti -save-temps -c -o lib.o lib.cpp
  ar rcs lib.a lib.o
  gcc -flto main.o -L. lib.a -o prog -lsupc++
  `_ZTI1A' referenced in section `.rodata._ZTV1A[_ZTV1A]' of lib.a(lib.o): defined in discarded section `.text' of main.o (symbol from plugin)
  collect2: error: ld returned 1 exit status
  make: *** [prog] Error 1

_ZTI1A is a weak definition of vtable for class A which is defined both in main.cpp and in lib.cpp (because they both include common.h which defines thus class). Main.cpp is compiled with -flto and _ZTI1A then gets into main.o's LTO symtab. As main.o is the first file in the list which declares to provide definition for this weak symbol linker assumes that it should come from main.o. But for some reason the actual GIMPLE tree for _ZTI1A is not stored in main.o's LTO sections and this causes errors during linking (linker can't find symbol promised by main.o's LTO symtab).

I tried to compile both with gcc 4.6 and gcc trunk and every time it failed in the same way.

I have to include a tgz instead of single source file because problem manifests itself only when I link several files.
Comment 1 Richard Biener 2012-06-04 12:53:16 UTC
Confirmed.  I see the resolution file

1
main.o 6
252 65004255f21c98f6 PREVAILING_DEF main
263 65004255f21c98f6 RESOLVED_EXEC _ZdlPv
270 65004255f21c98f6 RESOLVED_EXEC _Z1fv
202 65004255f21c98f6 PREVAILING_DEF _ZTI1A
225 65004255f21c98f6 PREVAILING_DEF _ZTS1A
213 65004255f21c98f6 RESOLVED_EXEC _ZTVN10__cxxabiv117__class_type_infoE

But the error message

`_ZTI1A' referenced in section `.rodata._ZTV1A[_ZTV1A]' of lib.a(lib.o): defined in discarded section `.gnu.linkonce.t._ZTI1A' of main.o (symbol from plugin)
collect2: error: ld returned 1 exit status

suggests this is a linker bug?

Either LTO decides to drop _ZTI1A from main.o - which it can - then the linker
needs to re-scan the archives and resolve the symbol to the archive provided one:

> nm lib.o 
0000000000000000 t _GLOBAL__sub_I_a
0000000000000000 T _Z1fv
0000000000000000 W _ZN1AD0Ev
0000000000000000 W _ZN1AD1Ev
0000000000000000 W _ZN1AD2Ev
0000000000000000 n _ZN1AD5Ev
0000000000000000 V _ZTI1A
0000000000000000 V _ZTS1A
0000000000000000 V _ZTV1A
                 U _ZTVN10__cxxabiv117__class_type_infoE
                 U _ZdlPv
                 U __cxa_atexit
                 U __dso_handle
0000000000000000 B a

Thus, can you file a binutils bug?
Comment 2 Jan Hubicka 2012-06-04 14:10:35 UTC
mine.
Comment 3 Jan Hubicka 2012-06-04 14:19:02 UTC
I do not have linker plugin enabled setup handy at the moment, but the following patch should fix it:
Index: cgraph.h
===================================================================
--- cgraph.h    (revision 188138)
+++ cgraph.h    (working copy)
@@ -1126,7 +1126,8 @@ varpool_can_remove_if_no_refs (struct va
   if (DECL_EXTERNAL (node->symbol.decl))
     return true;
   return (!node->symbol.force_output && !node->symbol.used_from_other_partition
-         && (DECL_COMDAT (node->symbol.decl)
+         && ((DECL_COMDAT (node->symbol.decl)
+              && symtab_used_from_object_file_p ((symtab_node) node))
              || !node->symbol.externally_visible
              || DECL_HAS_VALUE_EXPR_P (node->symbol.decl)));
 }

I am testing this patch for mainline.
As discussed on IRC, this patch can have negative effect on V1 linker API setups: here all comdats appears to be externally visible and thus they will be forced to output.

But perhaps it is now resonable to expect V2 linker API even for GCC 4.7 based setups for sane LTO with C++?  We already mention in release notes that V1 API is bad idea...

Honza
Comment 4 Richard Biener 2012-06-04 14:38:06 UTC
(In reply to comment #3)
> I do not have linker plugin enabled setup handy at the moment, but the
> following patch should fix it:
> Index: cgraph.h
> ===================================================================
> --- cgraph.h    (revision 188138)
> +++ cgraph.h    (working copy)
> @@ -1126,7 +1126,8 @@ varpool_can_remove_if_no_refs (struct va
>    if (DECL_EXTERNAL (node->symbol.decl))
>      return true;
>    return (!node->symbol.force_output &&
> !node->symbol.used_from_other_partition
> -         && (DECL_COMDAT (node->symbol.decl)
> +         && ((DECL_COMDAT (node->symbol.decl)
> +              && symtab_used_from_object_file_p ((symtab_node) node))

&& !symtab_used_from_object_file_p

I suppose.  I would have expected used_from_other_partition to be true
in this case.  Also the 4.7 branch looks suspiciously different from
this, referencing flag_toplevel_reorder ...

The adjusted as above patch works on trunk though.

But I really would say "object file"s are also another "partition".

>               || !node->symbol.externally_visible
>               || DECL_HAS_VALUE_EXPR_P (node->symbol.decl)));
>  }
> 
> I am testing this patch for mainline.
> As discussed on IRC, this patch can have negative effect on V1 linker API
> setups: here all comdats appears to be externally visible and thus they will be
> forced to output.

If as comdat the linker should eliminate them.

> But perhaps it is now resonable to expect V2 linker API even for GCC 4.7 based
> setups for sane LTO with C++?  We already mention in release notes that V1 API
> is bad idea...

Well, sure - assuming that we have a V2 capable linker is ok I guess.  In the
end we should simply annotate the resolution file with the linker capability,
thus extend the file-format to include a one-line header with a version.

Richard.

> Honza
Comment 5 Jan Hubicka 2012-06-04 19:52:51 UTC
> > !node->symbol.used_from_other_partition
> > -         && (DECL_COMDAT (node->symbol.decl)
> > +         && ((DECL_COMDAT (node->symbol.decl)
> > +              && symtab_used_from_object_file_p ((symtab_node) node))
> 
> && !symtab_used_from_object_file_p

Ah, yes. Sorry.
> 
> I suppose.  I would have expected used_from_other_partition to be true

Well, used_from_other_partition reffers to IR partitions, not "other uses".
I would not mix them unless we need so. There are some differences - like
symbols from other partitions can be local and can use custom calling conventions.
Symbols used form asm can't.

> > But perhaps it is now resonable to expect V2 linker API even for GCC 4.7 based
> > setups for sane LTO with C++?  We already mention in release notes that V1 API
> > is bad idea...
> 
> Well, sure - assuming that we have a V2 capable linker is ok I guess.  In the
> end we should simply annotate the resolution file with the linker capability,
> thus extend the file-format to include a one-line header with a version.

As I tried to explain on IRC, this won't help since V1/V2 API differs already
on compile time when we don't have linker plugin and have no clue what version
of API will be used at linktime.
It is possible to teach lto-plugin to do the hack instead of doing it in 
lto-streamer-out.c if we really want sane support for V1 API.

Concerning the problem in this PR, i guess we probably should give up handling
this sanely in V1 API: we already hide COMDAT vtables and type descriptions
should be harmless/dropped by the linker eventually.

Honza
Comment 6 Yuri Gribov 2012-06-05 10:53:56 UTC
Jan's suggestion indeed fixed the bug, thanks!

Regarding Richard's suggestion: should we ask binutils guys to detect situation when LTO plugin fails to deliver symbols which it promised in resolution file and work around it? I think this would be a major refactor on their side.
Comment 7 Richard Biener 2012-06-12 10:42:15 UTC
*** Bug 53604 has been marked as a duplicate of this bug. ***
Comment 8 Jan Hubicka 2012-06-25 15:07:18 UTC
OK, I will commit the fix as it is to mainline today and see if there are any significant negative effects with V1 API on Mozilla. If so, I will try to update lto-plugin to handle the v1/v2 hack there instead of lto-streamer for 4.7.
Comment 9 Jan Hubicka 2012-06-26 10:15:22 UTC
Author: hubicka
Date: Tue Jun 26 10:15:18 2012
New Revision: 188982

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=188982
Log:

	PR lto/53572
	* cgraph.h (varpool_can_remove_if_no_refs): Fix handling of
	used symbols.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/cgraph.h
Comment 10 Christopher Hite 2012-07-23 12:54:04 UTC
So this fix will be in gcc 4.7.2?

Could you check if it fixes #52900 too?
Comment 11 Matt Hargett 2012-09-04 23:48:40 UTC
Applying this patch to my 4.7 branch checkout fixes my problem, which had the same symptoms. Can someone please commit this to the 4.7 branch?

===================================================================
--- ../google-gcc-4_7/gcc/cgraph.h	(revision 190939)
+++ ../google-gcc-4_7/gcc/cgraph.h	(working copy)
@@ -1004,10 +1004,16 @@
 static inline bool
 varpool_can_remove_if_no_refs (struct varpool_node *node)
 {
-  return (!node->force_output && !node->used_from_other_partition
+  if (DECL_EXTERNAL (node->decl))
+     return true;
+ 
+   return (!node->force_output && !node->used_from_other_partition
 	  && (flag_toplevel_reorder || DECL_COMDAT (node->decl)
 	      || DECL_ARTIFICIAL (node->decl))
-  	  && (DECL_COMDAT (node->decl) || !node->externally_visible));
+  	  && ((DECL_COMDAT (node->decl) 
+            && !varpool_used_from_object_file_p (node))
+          || !node->externally_visible
+          || DECL_HAS_VALUE_EXPR_P (node->decl)));
 }
Comment 12 Richard Biener 2012-09-07 13:05:22 UTC
Author: rguenth
Date: Fri Sep  7 13:05:18 2012
New Revision: 191073

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=191073
Log:
2012-09-07  Richard Guenther  <rguenther@suse.de>

	Backport from mainline
	2012-06-26  Jan Hubicka  <jh@suse.cz>

	PR lto/53572
	* cgraph.h (varpool_can_remove_if_no_refs): Fix handling of
	used symbols.

Modified:
    branches/gcc-4_7-branch/gcc/ChangeLog
    branches/gcc-4_7-branch/gcc/cgraph.h
Comment 13 Richard Biener 2012-09-07 13:06:14 UTC
Fixed.