Starting with r129030 tramp3d-v4 segfaults on startup if compiled statically with -fopenmp. This can be reproduced with the preprocessed testcase from http://www.suse.de/~rguenther/tramp3d/tramp3d-v4.ii.gz (x86_64) and compiling with -fopenmp -static (optimization does not change the effect).

Author: jason
Date: Fri Oct 5 05:35:46 2007
New Revision: 129030

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=129030
Log:
2007-09-13  Doug Kwan  <dougkwan@google.com>

    * gcc/gthr-posix.h (__gthread_cond_broadcast, __gthread_cond_wait,
    __gthread_cond_wait_recursive): Add to extend interface for POSIX
    conditional variables. (__GTHREAD_HAS_COND): Macro defined to signify
    support of conditional variables.
    * gcc/gthr-posix95.h (__gthread_cond_broadcast, __gthread_cond_wait,
    __gthread_cond_wait_recursive): Add to extend interface for POSIX
    conditional variables. (__GTHREAD_HAS_COND): Macro defined to signify
    support of conditional variables.
    * gcc/gthr-single.h (__gthread_cond_broadcast, __gthread_cond_wait,
    __gthread_cond_wait_recursive): Add to extend interface for POSIX
    conditional variables.
    * gcc/gthr.h: Update comments to document new interface.
    * libstdc++-v3/include/ext/concurrent.h (class __mutex,
    class __recursive_mutex): Add new method gthread_mutex to access
    inner gthread mutex.
    [__GTHREAD_HAS_COND] (class __concurrence_broadcast_error,
    class __concurrence_wait_error, class __cond): Add.
    * guard.cc (recursive_push, recursive_pop): Delete.
    (init_in_progress_flag, set_init_in_progress_flag): Add to
    replace recursive_push and recursive_pop.
    (throw_recursive_init_exception): Add.
    (acquire, __cxa_guard_acquire, __cxa_guard_abort and
    __cxa_guard_release): [__GTHREAD_HAS_COND] Use a conditional
    for synchronization of static variable initialization.
    The global mutex is only held briefly when guards are
    accessed. [!__GTHREAD_HAS_COND] Fall back to the old code,
    which deadlocks.
    * testsuite/thread/guard.cc: Add new test. It deadlocks with the
    old locking code in libstdc++-v3/libsup++/guard.cc.
gdb doesn't like static code too much, but the following is a backtrace of the crash:

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x88d8a0 (LWP 8358)]
0x0000000000000000 in ?? ()
(gdb) bt
#0  0x0000000000000000 in ?? ()
warning: (Internal error: pc 0x55290b in read in psymtab, but not in symtab.)
#1  0x000000000055290c in __cxa_guard_release (g=warning: (Internal error: pc 0x5528d0 in read in psymtab, but not in symtab.)
warning: (Internal error: pc 0x55290b in read in psymtab, but not in symtab.)
0x874a70)
    at /space/rguenther/tramp3d/obj/x86_64-unknown-linux-gnu/libstdc++-v3/include/x86_64-unknown-linux-gnu/bits/gthr-default.h:749
#2  0x00000000004fc9ac in get_locale_mutex ()
    at ../../../../trunk/libstdc++-v3/src/locale_init.cc:42
#3  0x00000000004fe99a in locale (this=0x872dd8)
    at ../../../../trunk/libstdc++-v3/src/locale_init.cc:215
#4  0x00000000004fa325 in Init (this=<value optimized out>)
    at /space/rguenther/tramp3d/obj/x86_64-unknown-linux-gnu/libstdc++-v3/include/streambuf:462
#5  0x0000000000402a79 in __static_initialization_and_destruction_0 (
    __initialize_p=1, __priority=65535)
    at /space/rguenther/tramp3d/install/lib/gcc/x86_64-unknown-linux-gnu/4.3.0/../../../../include/c++/4.3.0/iostream:77
#6  0x0000000000402fc9 in global constructors keyed to _ZN5Pooma5pinfoE ()
    at tramp3d-v4.cpp:56094
#7  0x00000000005cb046 in __do_global_ctors_aux ()
#8  0x00000000004001a3 in _init ()
#9  0x0000000000000001 in ?? ()
#10 0x0000000000569736 in __libc_csu_init ()
#11 0x00000000005691e2 in __libc_start_main ()
#12 0x00000000004001d9 in _start ()
Which is

static inline int
__gthread_cond_broadcast (__gthread_cond_t *cond)
{
  return __gthrw_(pthread_cond_broadcast) (cond);
}

It looks like pthread_cond_broadcast is not correctly bound, as the disassembly shows:

<__cxa_guard_release+52>: mov    %rax,%rdi
<__cxa_guard_release+55>: callq  0x0

though it _does_ work for pthread_mutex_lock:

<__cxa_guard_release+28>: mov    %rax,%rdi
<__cxa_guard_release+31>: callq  0x566dc0 <pthread_mutex_lock>
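To illustrate what "not correctly bound" means here, the following is a minimal sketch and not the gthr code itself. It assumes an ELF target with GCC or Clang, and hypothetical_cond_broadcast is a made-up name standing in for a weakly referenced pthread function that no object in the link defines. Such a reference resolves to address 0, so an unguarded call through it jumps to 0x0, as in the backtrace.

```c
#include <stddef.h>

/* Hypothetical stand-in for a weakly referenced pthread function:
   declared weak and never defined anywhere, so the linker resolves
   its address to 0 instead of reporting an undefined symbol. */
extern int hypothetical_cond_broadcast(void *cond) __attribute__((weak));

/* Returns 1 if the weak reference was bound to a real definition. */
int hypothetical_broadcast_is_bound(void)
{
    return hypothetical_cond_broadcast != NULL;
}

/* Calls through the weak reference only when it is bound.
   Returns -1 instead of jumping to address 0 when it is not. */
int checked_broadcast(void *cond)
{
    if (hypothetical_cond_broadcast == NULL)
        return -1;
    return hypothetical_cond_broadcast(cond);
}
```

In this sketch checked_broadcast reports the unbound symbol instead of crashing; the code in the report has no such per-function check before the call.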
The complete statically linked __cxa_guard_release looks like:

00000000005528d0 <__cxa_guard_release>:
  5528d0:  53                    push   %rbx
  5528d1:  48 89 fb              mov    %rdi,%rbx
  5528d4:  48 83 ec 10           sub    $0x10,%rsp
  5528d8:  48 83 3d 90 e6 31 00  cmpq   $0x0,0x31e690(%rip)        # 870f70 <rtld_search_dirs+0xe0>
  5528df:  00
  5528e0:  74 4b                 je     55292d <__cxa_guard_release+0x5d>
  5528e2:  c6 44 24 0f 01        movb   $0x1,0xf(%rsp)
  5528e7:  e8 c4 fd ff ff        callq  5526b0 <_ZN12_GLOBAL__N_116get_static_mutexEv>
  5528ec:  48 89 c7              mov    %rax,%rdi
  5528ef:  e8 cc 44 01 00        callq  566dc0 <__pthread_mutex_lock>
  5528f4:  85 c0                 test   %eax,%eax
  5528f6:  75 3e                 jne    552936 <__cxa_guard_release+0x66>
  5528f8:  c6 43 01 00           movb   $0x0,0x1(%rbx)
  5528fc:  c6 03 01              movb   $0x1,(%rbx)
  5528ff:  e8 5c fe ff ff        callq  552760 <_ZN12_GLOBAL__N_115get_static_condEv>
  552904:  48 89 c7              mov    %rax,%rdi
  552907:  e8 f4 d6 aa ff        callq  0 <_nl_current_LC_CTYPE>
  55290c:  85 c0                 test   %eax,%eax
  55290e:  75 54                 jne    552964 <__cxa_guard_release+0x94>
  552910:  80 7c 24 0f 00        cmpb   $0x0,0xf(%rsp)
  552915:  74 10                 je     552927 <__cxa_guard_release+0x57>
  552917:  48 8b 3d aa 23 33 00  mov    0x3323aa(%rip),%rdi        # 884cc8 <_ZN12_GLOBAL__N_1L12static_mutexE>
  55291e:  e8 0d 4f 01 00        callq  567830 <__pthread_mutex_unlock>
  552923:  85 c0                 test   %eax,%eax
  552925:  75 6b                 jne    552992 <__cxa_guard_release+0xc2>
  552927:  48 83 c4 10           add    $0x10,%rsp
  55292b:  5b                    pop    %rbx
  55292c:  c3                    retq
  55292d:  c6 47 01 00           movb   $0x0,0x1(%rdi)
  552931:  c6 07 01              movb   $0x1,(%rdi)
  552934:  eb f1                 jmp    552927 <__cxa_guard_release+0x57>
  552936:  bf 08 00 00 00        mov    $0x8,%edi
  55293b:  e8 40 ec ff ff        callq  551580 <__cxa_allocate_exception>
  552940:  48 89 c7              mov    %rax,%rdi
  552943:  48 8b 05 3e e6 31 00  mov    0x31e63e(%rip),%rax        # 870f88 <rtld_search_dirs+0xf8>
  55294a:  48 8b 15 e7 e5 31 00  mov    0x31e5e7(%rip),%rdx        # 870f38 <rtld_search_dirs+0xa8>
  552951:  48 8b 35 40 e6 31 00  mov    0x31e640(%rip),%rsi        # 870f98 <rtld_search_dirs+0x108>
  552958:  48 83 c0 10           add    $0x10,%rax
  55295c:  48 89 07              mov    %rax,(%rdi)
  55295f:  e8 0c fc ff ff        callq  552570 <__cxa_throw>
  552964:  bf 08 00 00 00        mov    $0x8,%edi
  552969:  e8 12 ec ff ff        callq  551580 <__cxa_allocate_exception>
  55296e:  48 89 c7              mov    %rax,%rdi
  552971:  48 8b 05 08 e6 31 00  mov    0x31e608(%rip),%rax        # 870f80 <rtld_search_dirs+0xf0>
  552978:  48 8b 15 41 e6 31 00  mov    0x31e641(%rip),%rdx        # 870fc0 <rtld_search_dirs+0x130>
  55297f:  48 8b 35 6a e5 31 00  mov    0x31e56a(%rip),%rsi        # 870ef0 <rtld_search_dirs+0x60>
  552986:  48 83 c0 10           add    $0x10,%rax
  55298a:  48 89 07              mov    %rax,(%rdi)
  55298d:  e8 de fb ff ff        callq  552570 <__cxa_throw>
  552992:  bf 08 00 00 00        mov    $0x8,%edi
  552997:  e8 e4 eb ff ff        callq  551580 <__cxa_allocate_exception>
  55299c:  48 89 c7              mov    %rax,%rdi
  55299f:  48 8b 05 72 e5 31 00  mov    0x31e572(%rip),%rax        # 870f18 <rtld_search_dirs+0x88>
  5529a6:  48 8b 15 1b e6 31 00  mov    0x31e61b(%rip),%rdx        # 870fc8 <rtld_search_dirs+0x138>
  5529ad:  48 8b 35 2c e5 31 00  mov    0x31e52c(%rip),%rsi        # 870ee0 <rtld_search_dirs+0x50>
  5529b4:  48 83 c0 10           add    $0x10,%rax
  5529b8:  48 89 07              mov    %rax,(%rdi)
  5529bb:  e8 b0 fb ff ff        callq  552570 <__cxa_throw>
  5529c0:  48 8d 7c 24 0f        lea    0xf(%rsp),%rdi
  5529c5:  48 89 c3              mov    %rax,%rbx
  5529c8:  e8 c3 fd ff ff        callq  552790 <_ZN10__cxxabiv113mutex_wrapperD1Ev>
  5529cd:  48 89 df              mov    %rbx,%rdi
  5529d0:  e8 5b 07 01 00        callq  563130 <_Unwind_Resume>

showing the obvious error. The shared libstdc++v3 has a relocation to pthread_cond_broadcast instead:

  c4aff:  e8 5c fe ff ff        callq  c4960 <_ZN12_GLOBAL__N_115get_static_condEv>
  c4b04:  48 89 c7              mov    %rax,%rdi
  c4b07:  e8 24 0d f9 ff        callq  55830 <pthread_cond_broadcast@plt>
  c4b0c:  85 c0                 test   %eax,%eax
What happens if you force libpthreads to be all linked in?
Subject: Re: New: [4.3 Regression] r129030 breaks -fopenmp -static compile of tramp3d-v4

I'm looking at that.

-Doug

31 Oct 2007 14:52:04 -0000, rguenth at gcc dot gnu dot org <gcc-bugzilla@gcc.gnu.org>:
> --
>            Summary: [4.3 Regression] r129030 breaks -fopenmp -static compile
>                     of tramp3d-v4
>            Product: gcc
>            Version: 4.3.0
>             Status: UNCONFIRMED
>           Keywords: wrong-code
>           Severity: normal
>           Priority: P3
>          Component: c++
>         AssignedTo: unassigned at gcc dot gnu dot org
>         ReportedBy: rguenth at gcc dot gnu dot org
>
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33960
Richard, I think I know what happened. Could you please run nm a.out | grep " pthread_" on your executable and send me the output?

It seems that we need to change glibc, unfortunately. Here is the code at the end of libc/nptl/pthread_create.c:

    /* If pthread_create is present, libgcc_eh.a and libsupc++.a expects some
       other POSIX thread functions to be present as well.  */
    PTHREAD_STATIC_FN_REQUIRE (pthread_mutex_lock)
    PTHREAD_STATIC_FN_REQUIRE (pthread_mutex_unlock)
    PTHREAD_STATIC_FN_REQUIRE (pthread_once)
    PTHREAD_STATIC_FN_REQUIRE (pthread_cancel)
    PTHREAD_STATIC_FN_REQUIRE (pthread_key_create)
    PTHREAD_STATIC_FN_REQUIRE (pthread_setspecific)
    PTHREAD_STATIC_FN_REQUIRE (pthread_getspecific)

When the linker sees pthread_create, it will also bring in pthread_mutex_lock and pthread_mutex_unlock automatically, but not pthread_cond_broadcast and pthread_cond_wait. Those two symbols are declared as weak references, so they will remain NULL.

Apparently the fix is to add a dependency on pthread_cond_broadcast and pthread_cond_wait to glibc. A band-aid is to remove the #define __GTHREAD_HAS_COND in gthr-posix*.h, disabling the new code temporarily until glibc is fixed.
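The failure mode described above can be sketched in isolation. This is an illustration under stated assumptions, not the gthr or glibc code: all hypothetical_* names are invented, the "lock" function is given a weak definition to play the role of a symbol that the static link does pull in, and the "broadcast" function is a weak reference that nothing defines. The point is that a "threads are active" probe on one symbol says nothing about whether a different weak symbol got bound.

```c
#include <stddef.h>

/* Hypothetical stand-in for a function that is present in the link:
   it has a (weak) definition in this file, so its address is non-null. */
__attribute__((weak)) int hypothetical_mutex_lock(void)
{
    return 0;
}

/* Hypothetical stand-in for a function that is only weakly referenced
   and never defined, so its address resolves to 0. */
extern int hypothetical_cond_broadcast(void) __attribute__((weak));

/* "Threads are active" probe: looks at one symbol only. */
int hypothetical_threads_active(void)
{
    return hypothetical_mutex_lock != NULL;
}

/* Returns 1 if threads are inactive, -2 if locking failed, -1 if the
   probe passed but the broadcast symbol is unbound, otherwise the
   result of the broadcast call. */
int hypothetical_guard_release(void)
{
    if (!hypothetical_threads_active())
        return 1;
    if (hypothetical_mutex_lock() != 0)
        return -2;
    if (hypothetical_cond_broadcast == NULL)
        return -1; /* an unguarded call here would jump to address 0 */
    return hypothetical_cond_broadcast();
}
```

With these definitions the probe succeeds while the broadcast symbol stays unbound, which is the combination that the static link in this report appears to produce.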
It seems that this is only a problem for a static link. And it would presumably work fine if we had strong references to the functions we need. So let's just do this at the end of guard.cc:

    #if !defined(__PIC__) && defined(__GLIBC__) && defined(__GTHREAD_HAS_COND)
    asm(".globl pthread_cond_wait");
    asm(".globl pthread_cond_broadcast");
    #endif

Seems like that should work.
Yes, the analysis from comment #6 looks correct. Jakub, can you take care of the required glibc fix? I'll check whether Ian's trick works as well.
The only at least partially workable way of linking statically against NPTL libpthread.a is -Wl,--whole-archive -lpthread -Wl,--no-whole-archive. There is a huge number of issues if you don't have everything in there (e.g. the various cancellation wrappers, which for dynamically linked code can handle cancellation even in libc.so, but not so for the heavily unsupported static linking). Guess we should just change the glibc Makefiles to ld -r all libpthread.a objects together and install that as libpthread.a instead.
The trick from comment #7 doesn't work.
Linking pthread with --whole-archive works. Re comment #6 - here's the output of nm tramp3d-v4 | grep " pthread_":

0000000000569440 T pthread_attr_destroy
0000000000569480 T pthread_attr_getstacksize
0000000000569420 W pthread_attr_init
000000000056a980 W pthread_attr_setaffinity_np
0000000000569460 T pthread_attr_setdetachstate
00000000005694a0 T pthread_attr_setstacksize
000000000056a150 T pthread_cancel
                 w pthread_cond_broadcast
                 w pthread_cond_wait
0000000000568400 W pthread_create
000000000056a7d0 W pthread_getaffinity_np
0000000000569fd0 T pthread_getspecific
0000000000569f70 T pthread_key_create
00000000005694c0 T pthread_mutex_lock
0000000000569f30 T pthread_mutex_unlock
000000000056a1c0 T pthread_once
0000000000569410 T pthread_self
000000000056a8d0 W pthread_setaffinity_np
                 w pthread_setcancelstate
000000000056a050 T pthread_setspecific
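In the nm output, the lowercase w entries with no address are the weak references that stayed unresolved (pthread_cond_broadcast, pthread_cond_wait, pthread_setcancelstate). As a rough aid for scanning such output, here is a small sketch; the function name nm_line_is_unresolved_weak is made up for illustration, and it assumes the default BSD-style nm line layout of address, type letter, and name.

```c
#include <stdio.h>

/* Returns 1 if an nm output line describes an undefined weak symbol
   (type letter 'w' or 'v' with a blank address column), else 0. */
int nm_line_is_unresolved_weak(const char *line)
{
    unsigned long addr;
    char type;
    char name[256];

    /* Defined symbols look like "<hex address> <type> <name>". */
    if (sscanf(line, "%lx %c %255s", &addr, &type, name) == 3)
        return 0;

    /* Undefined symbols look like "<type> <name>" with no address. */
    if (sscanf(line, " %c %255s", &type, name) == 2)
        return type == 'w' || type == 'v';

    return 0;
}
```

Fed the lines above one at a time, it flags only the three entries that have no address.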
This is not a bug on the GCC side.