Bug 87380 - Explicit instantations should use weak symbols on darwin
Summary: Explicit instantations should use weak symbols on darwin
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: c++ (show other bugs)
Version: 9.0
: P3 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords: ABI
Depends on:
Blocks:
 
Reported: 2018-09-21 15:28 UTC by Jonathan Wakely
Modified: 2018-12-24 13:24 UTC (History)
4 users (show)

See Also:
Host:
Target: *-*-darwin*
Build:
Known to work:
Known to fail:
Last reconfirmed: 2018-09-21 00:00:00


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Jonathan Wakely 2018-09-21 15:28:16 UTC
The response from Apple quoted in 82172 comment 26 says that explicit instantiation definitions in a dylib need to be weak symbols, otherwise they won't be merged with other copies of the same symbol in other translation units:

> The problem is that it is not “weak” in libstdc++.6.dylib. It is a regular
> exported symbol. If it were weak, then at runtime dyld would coalesce it with
> the one in the program “foo”.
> 
> macOS does not use “flat namespace”. It uses two level namespace where every
> symbol found in a dylib at build time has the dylib in which it was found
> recorded and at runtime dyld only looks there. The exception to this is weak
> symbols, where dyld looks across all dylibs and picks one, then adjusts all
> uses in all dylibs to use that choosen one.
> 
> The static linker (ld64) knows those rules and when building a dylib that
> exports a non-weak symbol, the linker optimizes all uses within that dylib
> to directly use that symbol. That is what is happening in libstdc++.6.dylib.
> __ZNSs4_Rep20_S_empty_rep_storageE is not weak, so when libstdc++.6.dylib all
> uses of __ZNSs4_Rep20_S_empty_rep_storageE are directly bound to use the copy
> in libstdc++.6.dylib. There is nothing dyld can do at runtime to change that.
> 
> The fix here is that __ZNSs4_Rep20_S_empty_rep_storageE needs to be weak when
> libstdc++.6.dylib is built.
Comment 1 Jonathan Wakely 2018-09-21 15:34:45 UTC
(In reply to Jonathan Wakely from comment #0)
> The response from Apple quoted in 82172 comment 26 says that explicit

That should have said Bug 82172 comment 26.

The problem only arises when a template might get implicitly instantiated, because an explicit instantiations declaration was not seen. In such an object file can contain implicit instantations and the explicit instantiation in the dylib isn't needed. Due to the linker behaviour, the two definitions are not merged and remain separate.

In practice this probably only matters for symbols where the address is taken, so that duplicate copies of the symbol with different addresses can cause problems (which was the cause of Bug 82172).
Comment 2 Jonathan Wakely 2018-09-21 16:12:31 UTC
The following should run and exit successfully:

$ cat lib.h
template<typename T>
struct A {
  static T member;
};
template<typename T>
  T A<T>::member;

bool match(int*);

$ cat lib.cc
#include "lib.h"

template class A<int>;

bool match(int* p)
{
  return p == &A<int>::member;
}

$ cat main.cc
#include "lib.h"

int main()
{
  if (!match(&A<int>::member))
    throw 1;
}

$ g++ -shared -fPIC lib.cc -o liblib.so 
$ g++ main.cc -L. -llib
$ LD_LIBRARY_PATH=. ./a.out

I'm not able to check if it does on darwin, but this is a simplification of the problem in PR 82172.
Comment 3 Jonathan Wakely 2018-09-21 16:16:20 UTC
On GNU/Linux the symbol in the shared library is a global unique symbol:

$ nm --defined-only -g liblib.so  | grep member
000000000020101c u _ZN1AIiE6memberE

It seems that we need to make it weak to ensure that's true on darwin.
Comment 4 Iain Sandoe 2018-09-21 16:24:26 UTC
In Darwin dialect...
gcc -v =>
gcc version 9.0.0 20180908 (experimental) [trunk revision 264172] (GCC) 

$ ./gcc/xg++ -Bgcc -shared  lib.cc -o liblib.dylib
$ ./gcc/xg++ -Bgcc main.cc -L. -llib -o t
$ DYLD_LIBRARY_PATH=. ./t
libc++abi.dylib: terminating with uncaught exception of type int
Abort trap: 6
Comment 5 Iain Sandoe 2018-09-21 16:31:09 UTC
fudging the static member to be weak, and rebuilding the lib - the test completes w/out throwing.
Comment 6 Iain Sandoe 2018-09-21 18:59:31 UTC
hmm...

Linux:
$ more lib.s 
	.file	"lib.cc"
	.text
	.weak	_ZN1AIiE6memberE
	.section	.bss._ZN1AIiE6memberE,"awG",@nobits,_ZN1AIiE6memberE,comdat
	.align 4
	.type	_ZN1AIiE6memberE, @gnu_unique_object
	.size	_ZN1AIiE6memberE, 4

Darwin:
$ more lib.s
        .text
        .globl __ZN1AIiE6memberE
        .zerofill __DATA,__pu_bss2,__ZN1AIiE6memberE,4,2
        .text

so... now to figure out why.
Comment 7 Iain Sandoe 2018-09-23 09:59:40 UTC
it's been broken a looooong time - even apple-gcc-4.2.1 is broken, I don't have 3.3 anywhere without booting an old machine.

clang produces:
       .section        __DATA,__data
        .globl  __ZN1AIiE6memberE       ## @_ZN1AIiE6memberE
        .weak_definition        __ZN1AIiE6memberE
        .p2align        2
__ZN1AIiE6memberE:
        .long   0                       ## 0x0

====

it doesn't *look* like a back-end issue (the weak is missing by the time we get to varpool_node::assemble_decl) - present in the decl for Linux, missing for Darwin. 

Darwin declares MAKE_DECL_ONE_ONLY() => weak and that should fire up SUPPORTS_ONE_ONLY.

so... now to figure out where the divergence is.
Comment 8 Iain Sandoe 2018-09-23 13:18:47 UTC
(In reply to Iain Sandoe from comment #7)
> it's been broken a looooong time - even apple-gcc-4.2.1 is broken

actually, that's not true; it puts the symbol in common to merge this.
Comment 9 Iain Sandoe 2018-09-23 20:10:40 UTC
(In reply to Iain Sandoe from comment #8)
> (In reply to Iain Sandoe from comment #7)
> > it's been broken a looooong time - even apple-gcc-4.2.1 is broken
> 
> actually, that's not true; it puts the symbol in common to merge this.

so the culprit is:
gcc/config/darwin.h:
#define TARGET_WEAK_NOT_IN_ARCHIVE_TOC 1

.. this has been there for a long time.

So, now I need to audit what versions of ld64 really don't care about this.

Setting this to 0 (with recent as/ld64), makes the testcase complete without throwing. Bootstrapping and checking on a few systems.
Comment 10 mrs@gcc.gnu.org 2018-09-24 23:22:19 UTC
I've related 15428 which has some details for why this is the way it is.

Unless the underlying bug has been fixed in the linker, changing this will merely re-introduce the previous bug that was fixed.  If we do that, we need to weigh which feature we want to work and which to break.  The previous answer, was to make static linking work.

See 2004-06-03  Matt Austern  <austern@apple.com> and 2004-03-12  Matt Austern  <austern@apple.com> for the totality of the code.
Comment 11 Iain Sandoe 2018-09-24 23:33:31 UTC
(In reply to mrs@gcc.gnu.org from comment #10)
> I've related 15428 which has some details for why this is the way it is.
> 
> Unless the underlying bug has been fixed in the linker, changing this will
> merely re-introduce the previous bug that was fixed.  If we do that, we need
> to weigh which feature we want to work and which to break.  The previous
> answer, was to make static linking work.
> 
> See 2004-06-03  Matt Austern  <austern@apple.com> and 2004-03-12  Matt
> Austern  <austern@apple.com> for the totality of the code.

Yup .. I've been looking at that.
We already have the linker version, so should be able to make an intelligent choice.

However, as I read the original bug (against powerpc-darwin7), it might actually be ar that's the problem - since it's the presence of weak extern symbols in the archive TOC that's the issue.  If that's the case, I'm sure we can run a config. test to see if ar meets the criteria.

Anyway, trying to work my way back to a cut-off point (I don't have an easy way to test if powerpc-darwin7 will actually bootstrap any recent GCC - but suspect that it will not without an updated as/ld at minimum).
Comment 12 mrs@gcc.gnu.org 2018-09-25 00:22:23 UTC
I changed the test case around, and the linker seems to be able to resolve from a .a now:

$ ar -q liblib.a lib.o
$ nm -m liblib.a
liblib.a(lib.o):
0000000000000020 (__TEXT,__eh_frame) non-external EH_frame1
0000000000000000 (__TEXT,__text) external __Z5matchPi
0000000000000018 (__DATA,__data) weak external __ZN1AIiE6memberE
$ nm main.o 
0000000000000018 s EH_frame1
                 U __ZN1AIiE6memberE
0000000000000000 T _main
$ ../gcc/xgcc -B../gcc main.cc -L. -llib -o t -lstdc++

so, I think this is now safe to flip with later linkers.  Some type of code like:

  if (darwin_target_linker
      && (strverscmp (darwin_target_linker, "409.12") >= 0))

I think will do it.  We just need to figure out when the bug was fixed.  I tested 409.12.  This is the linker included with Xcode 10.0, which is the now current Xcode.  Jack or Iain might have access to older systems.

up:t mrs$ cat lib.cc
#define LIB

#include "lib.h"

template class A<int>;

bool match(int* p)
{
  return p == &A<int>::member;
}
up:t mrs$ cat lib.h
template<typename T>
struct A {
  static T member;
};
template<typename T>
  T A<T>::member;

bool match(int*);

#ifndef LIB
extern template class A<int>;
#endif
up:t mrs$ cat main.cc
#include "lib.h"

extern char _ZN1AIiE6memberE[3];

int main()
{

  void *vp = _ZN1AIiE6memberE;



  //  if (!match(&A<int>::member))
  //    throw 1;
}

is the type of test case I tried.  This is _with_ the change TARGET_WEAK_NOT_IN_ARCHIVE_TOC to make it 0.  This is required to get the symbol weak.
Comment 13 mrs@gcc.gnu.org 2018-09-25 00:30:25 UTC
Ah, yes, likely it would be ar that has changed.  Anyway, the ld version is a cheap indirect proxy for bugs in ar.  We're likely to get new ar versions with new ld versions.  A real config test is fine as well.
Comment 14 mrs@gcc.gnu.org 2018-09-25 00:38:00 UTC
You can test it if you can dump the ar symbol table, with a assembled .s file.  Nothing needs to work except the assembler and ar and the tool to dump the symbol table.  In the olden days, there was no entry for the weak symbol.  I was wondering around trying to recall which tool dumps the ar symbol table, no well; it seems like I used to know how to dump it, but haven't in years.
Comment 15 Iain Sandoe 2018-09-25 08:38:31 UTC
(In reply to mrs@gcc.gnu.org from comment #14)
> You can test it if you can dump the ar symbol table, with a assembled .s
> file.  Nothing needs to work except the assembler and ar and the tool to
> dump the symbol table.  In the olden days, there was no entry for the weak
> symbol.  I was wondering around trying to recall which tool dumps the ar
> symbol table, no well; it seems like I used to know how to dump it, but
> haven't in years.

otool -Sv
 * but that's broken on llvm-based toolchains (still listed but doesn't work)
 * someone with some spare cycles might file a radar or fix it upstream
 * however, that won't help people with machines stuck on 10.11 etc (like my 8core xeon).

xcrun otool-classic -Sv foo.a would also work (for as long as Apple package otool-classic).

==== however

Back to [xc2.5] i686-darwin8, the symbol is present for
 ar cru foo.a bar.o
 ranlib foo.a
 otool -Sv foo.a

Will try on a ppc-darwin8 system at some point.

So, at present, my inclination is to (not a patch, just the intent):

gcc/config/darwin.h:

#undef TARGET_WEAK_NOT_IN_ARCHIVE_TOC
#define TARGET_WEAK_NOT_IN_ARCHIVE_TOC 0

gcc/config/rs6000/darwin7.h:

#undef TARGET_WEAK_NOT_IN_ARCHIVE_TOC
#define TARGET_WEAK_NOT_IN_ARCHIVE_TOC 1

Since the original bug is against powerpc-darwin7 and we don't have anyone on this thread able to test if it was fixed there;

As noted, I have my doubts whether xc1.5 is able to build current GCC (ISTR that odcctools was required at some point - and, if that's so, then someone trying to build for really ancient darwin is probably better off making a newer cctools and ld64).
Comment 16 Iain Sandoe 2018-09-25 20:05:20 UTC
(In reply to Iain Sandoe from comment #15)

> Will try on a ppc-darwin8 system at some point.

external weak definitions appear in archive __SYMTABS for cctools 622 on powerpc-darwin8.11 (maybe some chance to test with darwin7, it would be nice to have a way to repeat the original problem).
Comment 17 Iain Sandoe 2018-09-25 20:33:58 UTC
(In reply to Iain Sandoe from comment #16)
> (In reply to Iain Sandoe from comment #15)
> 
> > Will try on a ppc-darwin8 system at some point.
> 
> external weak definitions appear in archive __SYMTABS for cctools 622 on
> powerpc-darwin8.11 (maybe some chance to test with darwin7, it would be nice
> to have a way to repeat the original problem).

I had a 10.3 boot partition on my old tibook (yeah, 2002 machine still works quite happily).  

So it's confirmed.

powerpc-darwin7 (cctools 525) weak external definitions don't appear in archive TOCs.  So the proposed solution above seems safe to me.
Comment 18 mrs@gcc.gnu.org 2018-09-25 21:44:47 UTC
So, didn't you just say that it works on darwin8 and later and is broken on darwin7?  If so, then darwin8.h needs the #define (since it is the first version where ar has been fixed).  ?
Comment 19 Iain Sandoe 2018-12-06 20:09:57 UTC
Author: iains
Date: Thu Dec  6 20:09:26 2018
New Revision: 266866

URL: https://gcc.gnu.org/viewcvs?rev=266866&root=gcc&view=rev
Log:
Darwin - fix PR c++/87380

This is [intentionally] broken C++ ABI, that was catering for a
tool problem that existed in a very old Darwin toolchain.

The issue is not present after Darwin7 (using default system
tools) so confine the fix to that revision or earlier.

2018-12-06  Iain Sandoe  <iain@sandoe.co.uk>

	PR c++/87380
	* config/darwin.h (TARGET_WEAK_NOT_IN_ARCHIVE_TOC) Remove; use the
	default.
	* config/rs6000/darwin7.h (TARGET_WEAK_NOT_IN_ARCHIVE_TOC): New.


Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/darwin.h
    trunk/gcc/config/rs6000/darwin7.h
Comment 20 Iain Sandoe 2018-12-23 20:56:10 UTC
Author: iains
Date: Sun Dec 23 20:55:39 2018
New Revision: 267386

URL: https://gcc.gnu.org/viewcvs?rev=267386&root=gcc&view=rev
Log:
darwin, fix c++/87380 by backproting r266866.

This was [intentionally] broken C++ ABI, that was catering for a
tool problem that existed in a very old Darwin toolchain.

It's no longer needed.

2018-12-23  Iain Sandoe  <iain@sandoe.co.uk>

	Backport from mainline
	2018-12-06  Iain Sandoe  <iain@sandoe.co.uk>

	PR c++/87380
	* config/darwin.h (TARGET_WEAK_NOT_IN_ARCHIVE_TOC) Remove, use the
	default.
	* config/rs6000/darwin7.h (TARGET_WEAK_NOT_IN_ARCHIVE_TOC): New.


Modified:
    branches/gcc-8-branch/gcc/ChangeLog
    branches/gcc-8-branch/gcc/config/darwin.h
    branches/gcc-8-branch/gcc/config/rs6000/darwin7.h
Comment 21 Iain Sandoe 2018-12-24 13:18:00 UTC
Author: iains
Date: Mon Dec 24 13:17:28 2018
New Revision: 267409

URL: https://gcc.gnu.org/viewcvs?rev=267409&root=gcc&view=rev
Log:
Fix PR c++/87380 (ABI breakage) for Darwin.

2018-12-24  Iain Sandoe  <iain@sandoe.co.uk>

	Backport from mainline
	2018-12-06  Iain Sandoe  <iain@sandoe.co.uk>

	PR c++/87380
	* config/darwin.h (TARGET_WEAK_NOT_IN_ARCHIVE_TOC) Remove, use the
	default.
	* config/rs6000/darwin7.h (TARGET_WEAK_NOT_IN_ARCHIVE_TOC): New.


Modified:
    branches/gcc-7-branch/gcc/ChangeLog
    branches/gcc-7-branch/gcc/config/darwin.h
    branches/gcc-7-branch/gcc/config/rs6000/darwin7.h
Comment 22 Iain Sandoe 2018-12-24 13:24:14 UTC
fixed on open branches, maintainers of earlier branches might wish to apply this locally.