Creating a new bug for bug 66146 comment 41 increased visibility, and to deal with it separately from the exception-handling bug that is the topic of bug 66146. Quoting from there: The new std::call_once using a futex is not backwards compatible, so I think it needs to be reverted, or hidden behind an ABI-breaking flag. The new std::once_flag::_M_activate() function sets _M_once=1 when an initialization is in progress. With glibc, if a call to the new std::call_once happens before a call to the old version of std::call_once, then glibc's pthread_once will find no fork generation value in _M_once and so will think it should run the init_function itself. Both threads will run their init_function, instead of the second one waiting for the first to finish. With musl, if a call to the old std::call_once happens before a call to the new std::call_once, then the second thread won't set _M_once=3 and so musl's pthread_once won't wake the second thread when the first finishes. The second thread will sleep forever (or until a spurious wake from the futex wait).
More details ... Although libstdc++ ignores any fork generation bits, glibc doesn't and it *requires* a (possibly zero) fork generation to be present. The __pthread_once_slow function uses CAS to set the once_control variable, and if that succeeds and the IN_PROGRESS bit is set, it compares the high bits with the current fork generation. If the IN_PROGRESS bit was set by the new libstdc++ std::call_once there won't be a fork generation. If the process' fork generation is not zero then this check in glibc will be false: /* Check whether the initializer execution was interrupted by a fork. We know that for both values, __PTHREAD_ONCE_INPROGRESS is set and __PTHREAD_ONCE_DONE is not. */ if (val == newval) { /* Same generation, some other thread was faster. Wait and retry. */ futex_wait_simple ((unsigned int *) once_control, (unsigned int) newval, FUTEX_PRIVATE); continue; } If std::call_once in another thread set val=2 but the fork gen is 4, then newval=4|1 and so val == newval is false. That means glibc assumes that an in-progress initialization was interrupted by a fork and so this thread should run it. That means two threads will be running it at once. Demo: #include <unistd.h> #include <sys/wait.h> #include <stdlib.h> #include <iostream> #include <mutex> extern std::once_flag once_flag; extern void old(); #if __GNUC__ >= 11 std::once_flag once_flag; int main() { // increase for generation: switch (fork()) { case -1: abort(); case 0: break; default: wait(nullptr); return 0; } // This is the child process, fork generation is non-zero. std::call_once(once_flag, [] { std::cout << "Active execution started using new code\n"; old(); std::cout << "Active execution finished using new code\n"; }); } #else void old() { std::call_once(once_flag, [] { std::cout << "Active execution started using old code\n"; std::cout << "Active execution finished using old code\n"; }); } #endif Compile this once with GCC 11 and once with GCC 10 and link with GCC 11, and instead of deadlocking (as it should do) you'll get: Active execution started using new code Active execution started using old code Active execution finished using old code Active execution finished using new code And using a libstdc++ build with _GLIBCXX_ASSERTIONS you'll get: /home/jwakely/src/gcc/gcc/libstdc++-v3/src/c++11/mutex.cc:79: void std::once_flag::_M_finish(bool): Assertion 'prev & _Bits::_Active' failed. because the "inner" execution changes the state to _Done when it finishes, and the assertion in the "outer" execution fails.
The new implementation added these new symbols to libstdc++.so: _ZNSt9once_flag11_M_activateEv _ZNSt9once_flag9_M_finishEb I think I'd like to keep them, but have the new implementation disabled by default (except for the ABI-changing gnu-versioned-namespace build). The new implementation is superior (it fixes PR 66146, and performs better in some cases due to inlining an initial check to see if executions should be passive) so I'd like to retain it for users who explicitly request it. By default we need to be compatible with the old version based on pthread_once. At some point (maybe when there are further ABI changes queued) maybe we'll do another painful transition like the cxx11-abi one and switch to the new std::once_flag by default. Maybe the new std::once_flag should be moved to an inline namespace, and then the symbols above defined as aliases for the equivalents using the inline namespace in the mangled name (the aliases are needed if we want to support binaries that are already using them, e.g. packages in the upcoming Fedora 34 which were compiled with gcc-11.0.1 already).
A glibc fix has been posted: https://sourceware.org/pipermail/libc-alpha/2021-March/123167.html
Fixed on the glibc side: https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=f0419e6a10740a672b28e112c409ae24f5e890ab
The master branch has been updated by Jonathan Wakely <redi@gcc.gnu.org>: https://gcc.gnu.org/g:6ee24638ed0ad51e568c799bacf149ba9bd7628b commit r11-7688-g6ee24638ed0ad51e568c799bacf149ba9bd7628b Author: Jonathan Wakely <jwakely@redhat.com> Date: Fri Mar 12 11:47:20 2021 +0000 libstdc++: Revert to old std::call_once implementation [PR 99341] The new std::call_once implementation is not backwards compatible, contrary to my intention. Because std::once_flag::_M_active() doesn't write glibc's "fork generation" into the pthread_once_t object, it's possible for glibc and libstdc++ to run two active executions concurrently. This violates the primary invariant of the feature! This patch reverts std::once_flag and std::call_once to the old implementation that uses pthread_once. This means PR 66146 is a problem again, but glibc has been changed to solve that. A new API similar to pthread_once but supporting failure and resetting the pthread_once_t will be proposed for inclusion in glibc and other C libraries. This change doesn't simply revert r11-4691 because I want to retain the new implementation for non-ghtreads targets (which didn't previously support std::call_once at all, so there's no backwards compatibility concern). This also leaves the new std::call_once::_M_activate() and std::call_once::_M_finish(bool) symbols present in libstdc++.so.6 so that code already compiled against GCC 11 can still use them. Those symbols will be removed in a subsequent commit (which distros can choose to temporarily revert if needed). libstdc++-v3/ChangeLog: PR libstdc++/99341 * include/std/mutex [_GLIBCXX_HAVE_LINUX_FUTEX] (once_flag): Revert to pthread_once_t implementation. [_GLIBCXX_HAVE_LINUX_FUTEX] (call_once): Likewise. * src/c++11/mutex.cc [_GLIBCXX_HAVE_LINUX_FUTEX] (struct __once_flag_compat): New type matching the reverted implementation of once_flag using futexes. (once_flag::_M_activate): Remove, replace with ... (_ZNSt9once_flag11_M_activateEv): ... alias symbol. (once_flag::_M_finish): Remove, replace with ... (_ZNSt9once_flag9_M_finishEb): ... alias symbol. * testsuite/30_threads/call_once/66146.cc: Removed.
The master branch has been updated by Jonathan Wakely <redi@gcc.gnu.org>: https://gcc.gnu.org/g:995a740cb01a0671a2082cb1ae13d0c356d4b568 commit r11-7689-g995a740cb01a0671a2082cb1ae13d0c356d4b568 Author: Jonathan Wakely <jwakely@redhat.com> Date: Fri Mar 12 11:47:20 2021 +0000 libstdc++: Remove symbols for new std::call_once implementation [PR 99341] This removes the new symbols added for the new futex-based std::call_once implementation. These symbols were new on trunk, so not in any released version. However, they are already present in some beta distro releases (Fedora Linux 34) and in Fedora Linux rawhide. This change can be locally reverted by distros that need to keep the symbols present until affected packages have been rebuilt. libstdc++-v3/ChangeLog: PR libstdc++/99341 * config/abi/post/aarch64-linux-gnu/baseline_symbols.txt: Remove std::once_flag symbols. * config/abi/post/ia64-linux-gnu/baseline_symbols.txt: Likewise. * config/abi/post/m68k-linux-gnu/baseline_symbols.txt: Likewise. * config/abi/post/riscv64-linux-gnu/baseline_symbols.txt: Likewise. * config/abi/pre/gnu.ver: Likewise. * src/c++11/mutex.cc [_GLIBCXX_HAVE_LINUX_FUTEX] (struct __once_flag_compat): Remove. (_ZNSt9once_flag11_M_activateEv): Remove. (_ZNSt9once_flag9_M_finishEb): Remove.
Fixed then.