Bug 66146 - call_once not C++11-compliant on ppc64le
Summary: call_once not C++11-compliant on ppc64le
Status: NEW
Alias: None
Product: gcc
Classification: Unclassified
Component: libstdc++ (show other bugs)
Version: 5.1.0
: P3 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords: wrong-code
: 70298 71025 78184 93610 104495 (view as bug list)
Depends on: 84323
Blocks:
  Show dependency treegraph
 
Reported: 2015-05-14 18:31 UTC by Andrey V
Modified: 2022-12-03 09:26 UTC (History)
17 users (show)

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed: 2015-05-15 00:00:00


Attachments
after this update, my clang does not work any more (21.85 KB, image/png)
2021-04-07 06:22 UTC, cqwrteur
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Andrey V 2015-05-14 18:31:33 UTC
std::call_once is not C++11 (or even N2447) compliant on ppc64le.

According to N2447, "If the invocation of func results in an exception being thrown, the exception is propagated to the caller and the effects are as-if this invocation of call_once did not occur."

Given the code

#include <mutex>

int call_count = 0;
void func_() {
printf("Inside func_ call_count %d\n", call_count);
if (++call_count < 2)
throw 0;
}
int main() {
std::once_flag flag_;
printf("Before calling call_once flag_: %d\n", *(int*)&flag_);
try { std::call_once(flag_, func_); } catch(...) { printf("Inside catch all excepton flag_: %d\n", *(int*)&flag_); }
printf("before the 2nd call to call_once flag_: %d\n", *(int*)&flag_);
std::call_once(flag_, func_);
}

the output on amd64 is
Before calling call_once flag_: 0
Inside func_ call_count 0
Inside catch all excepton flag_: 0
before the 2nd call to call_once flag_: 0

but on ppc64el is
Before calling call_once flag_: 0
Inside func_ call_count 0
Inside catch all excepton flag_: 1
before the 2nd call to call_once flag_: 1

amd64 has the correct behavior.

ppc64le gcc tested:
$gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/powerpc64le-linux-gnu/4.9/lto-wrapper
Target: powerpc64le-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu 4.9.1-16ubuntu6' --with-bugurl=file:///usr/share/doc/gcc-4.9/README.Bugs --enable-languages=c,c++,java,go,d,fortran,objc,obj-c++ --prefix=/usr --program-suffix=-4.9 --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --with-gxx-include-dir=/usr/include/c++/4.9 --libdir=/usr/lib --enable-nls --with-sysroot=/ --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-gnu-unique-object --disable-libsanitizer --disable-libquadmath --enable-plugin --with-system-zlib --disable-browser-plugin --enable-java-awt=gtk --enable-gtk-cairo --with-java-home=/usr/lib/jvm/java-1.5.0-gcj-4.9-ppc64el/jre --enable-java-home --with-jvm-root-dir=/usr/lib/jvm/java-1.5.0-gcj-4.9-ppc64el --with-jvm-jar-dir=/usr/lib/jvm-exports/java-1.5.0-gcj-4.9-ppc64el --with-arch-directory=ppc64 --with-ecj-jar=/usr/share/java/eclipse-ecj.jar --enable-objc-gc --enable-secureplt --with-cpu=power7 --with-tune=power8 --disable-multilib --enable-multiarch --disable-werror --with-long-double-128 --enable-checking=release --build=powerpc64le-linux-gnu --host=powerpc64le-linux-gnu --target=powerpc64le-linux-gnu
Thread model: posix
gcc version 4.9.1 (Ubuntu 4.9.1-16ubuntu6) 

$gcc-5.1 -v
Using built-in specs.
COLLECT_GCC=/home/andreyv/nfs/gcc/bin/gcc-5.1
COLLECT_LTO_WRAPPER=/nfs/home/andreyv/gcc/bin/../libexec/gcc/powerpc64le-unknown-linux-gnu/5.1.0/lto-wrapper
Target: powerpc64le-unknown-linux-gnu
Configured with: ../gcc-5.1.0/configure --prefix=/nfs/home/andreyv/gcc/ --program-suffix=-5.1 --enable-languages=c,c++,lto CFLAGS='-O2 -mtune=native' CXXFLAGS='-O2 -mtune=native'
Thread model: posix
gcc version 5.1.0 (GCC) 

Both versions compiled and run on Ubuntu 14.10 on a Power8.
gcc-5.1 invocation: $ ~/nfshome/gcc/bin/g++-5.1 -std=c++11 -lpthread -Wl,-rpath=~/nfshome/gcc/lib64 -Wall -Wextra a.cc
gcc-4.9 invocation: g++ -std=c++11 -lpthread -Wall -Wextra a.cc
Comment 1 Jonathan Wakely 2015-05-15 08:55:28 UTC
This has nothing to do with libstdc++, the std::call_once code is identical on x86 and ppc, and you get the same behaviour with pthreads, so it should be reported to glibc instead.

#include <stdio.h>
#include <pthread.h>

pthread_once_t flag_ = PTHREAD_ONCE_INIT;

int call_count = 0;
extern "C" void func_() {
  printf("Inside func_ call_count %d\n", call_count);
  if (++call_count < 2)
    throw 0;
}
int main() {
  printf("Before calling call_once flag_: %d\n", *(int*)&flag_);
  try {
    pthread_once(&flag_, func_);
  } catch(...) {
    printf("Inside catch all excepton flag_: %d\n", *(int*)&flag_);
  }
  printf("before the 2nd call to call_once flag_: %d\n", *(int*)&flag_);
  pthread_once(&flag_, func_);
}
Comment 2 Andrey V 2015-05-15 15:12:54 UTC
Replacing throw/try/catch with longjmp/setjmp for non-returning function exit, like so:

#include <stdio.h>
#include <pthread.h>
#include <setjmp.h>

pthread_once_t flag_ = PTHREAD_ONCE_INIT;

int call_count = 0;
jmp_buf catch_;

void func_() {
  printf("Inside func_ call_count %d\n", call_count);
  if (++call_count < 2)
	longjmp(catch_, 1);
}
int main() {
  int signo;
  printf("Before calling call_once flag_: %d\n", *(int*)&flag_);
  signo = setjmp(catch_);
  if (!signo)
    pthread_once(&flag_, func_);
  else
    printf("Inside catch all excepton flag_: %d\n", *(int*)&flag_);
  printf("before the 2nd call to call_once flag_: %d\n", *(int*)&flag_);
  pthread_once(&flag_, func_);
}

, the invalid behavior isn't triggered.
It appears that pthread_once doesn't take kindly to exceptions in the once_func.
Comment 3 Martin Sebor 2015-05-15 17:40:53 UTC
On Power, both glibc and AIX pthread_once behave the same way: i.e., they fail to clear the once flag on exception.  The test case below mimics glibc's pthread_once and demonstrates the root cause of the problem: the cancellation handler is not invoked when an exception is thrown unless the pthread_cleanup_push/pop macros are compiled in C++ code. Simply recompiling glibc's pthread_once.c using a C++ compiler (and adding the apprpriate extern "C" decoration) should fix it.

Until it's fixed, as a workaround, it seems that libstdc++ could clear the flag when an exception is thrown before propagating it out of call_once.

$ cat t.c && gcc -O2 -Wall -c -fasynchronous-unwind-tables -g t.c && g++ -DMAIN -O2 -Wall t.o -pthread t.c && ./a.out 
#include <pthread.h>
#include <stdio.h>

extern int n;

#if MAIN

extern "C" void foo () { throw 0; }
extern "C" void bar (void (*)());

int main () {
    try {
        bar (foo);
    }
    catch (...) {
        printf ("caught exception: pthread cleanup handler %sinvoked\n",
                n ? "" : "not ");
    }

    return n == 1 ? 0 : 1;
}

#else

int n;
#if __cplusplus
extern "C" {
#endif

static void cleanup (void *arg) { ++n; }

void bar (void (*pf)(void)) {
    pthread_cleanup_push (cleanup, 0);
    pf ();
    pthread_cleanup_pop (0);
}

#if __cplusplus
}
#endif

#endif
caught exception: pthread cleanup handler not invoked
Comment 4 Jonathan Wakely 2015-05-15 20:31:09 UTC
Thanks, Martin. So maybe something like this:

--- a/libstdc++-v3/include/std/mutex
+++ b/libstdc++-v3/include/std/mutex
@@ -726,7 +738,17 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
       __set_once_functor_lock_ptr(&__functor_lock);
 #endif
 
+#ifdef __powerpc__
+      int __e;
+      __try {
+       __e = __gthread_once(&__once._M_once, &__once_proxy);
+      } __catch(...) {
+         __once._M_once = once_flag{}._M_once;
+         __throw_exception_again;
+      }
+#else
       int __e = __gthread_once(&__once._M_once, &__once_proxy);
+#endif
 
 #ifndef _GLIBCXX_HAVE_TLS
       if (__functor_lock)
Comment 5 Andrew Pinski 2015-05-15 21:14:08 UTC
(In reply to Jonathan Wakely from comment #4)
> Thanks, Martin. So maybe something like this:

This happens on almost all non-x86 machines.  I tested it on aarch64 and we had the same failure as powerpc.
Comment 6 Andrey V 2015-05-19 17:36:23 UTC
Same failure on s390x.
Comment 7 Martin Sebor 2015-05-20 00:11:30 UTC
I opened http://sourceware.org/bugzilla/show_bug.cgi?id=18435 for the glibc bug and attached a lightly tested patch to it.
Comment 8 Jonathan Wakely 2015-05-20 14:12:42 UTC
NetBSD 5 and DragonFly BSD fail the test too. I'm going to make libstdc++ assume pthread_once is not exception-aware unless specifically told otherwise for targets where we know it works, such as x86-linux and (hopefully) glibc > 2.21
Comment 9 Rich Felker 2015-05-20 14:35:56 UTC
This is not portable usage of pthread_once. The longjmp based version is clearly wrong, per the resolution of http://austingroupbugs.net/view.php?id=863 and while POSIX has nothing to say about C++, the lack of any text forbidding the analogous C++ construct does not make a requirement to support it.

FYI, this type of usage is not supported, and impossible to support, in musl's implementation of pthread_once (which does not interact with exceptions/unwinding, only with thread cancellation).

There's a slim chance the C11 call_once function could be used, but I think the way it's specified also requires this kind of deadlock if you leave the init routine via any means but returning, and it's not widely supported yet anyway.

I think the right solution is to use atomics directly to implement std::call_once and only fallback to pthread sync primitives in the contended case. As long as you don't pass arbitrary C++ callback functions provided by the caller to pthread_once, it should be safe to use any pthread functions.
Comment 10 Jakub Jelinek 2015-05-20 14:46:37 UTC
In glibc, clear_once_control should clear the pthread_once_t var on throw, wonder why it doesn't work on ppc*, seems like pthread_once.c is correctly compiled with -fexceptions.
If musl doesn't want to support it, it is its own bad choice.
Comment 11 Jonathan Wakely 2015-05-20 14:46:58 UTC
(In reply to Jonathan Wakely from comment #4)
> Thanks, Martin. So maybe something like this:
> 
> --- a/libstdc++-v3/include/std/mutex
> +++ b/libstdc++-v3/include/std/mutex
> @@ -726,7 +738,17 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>        __set_once_functor_lock_ptr(&__functor_lock);
>  #endif
>  
> +#ifdef __powerpc__
> +      int __e;
> +      __try {
> +       __e = __gthread_once(&__once._M_once, &__once_proxy);
> +      } __catch(...) {
> +         __once._M_once = once_flag{}._M_once;

N.B. This won't work because it races, and may not even compile (if e.g. pthread_once_t has a const member)
Comment 12 Rich Felker 2015-05-20 14:51:53 UTC
Jakub, this is not the place to discuss the pros and cons of musl or other particular implementations; libstdc++ needs to support many which do not have the glibc-specific semantics you want. In particular there are plenty of BSDs the current code is broken on too. If you want to use pthread_once directly to implement std::call_once on glibc, that's your business, but there needs to be a portable implementation that works on other systems.
Comment 13 nsz 2016-03-18 15:28:07 UTC
*** Bug 70298 has been marked as a duplicate of this bug. ***
Comment 14 nsz 2016-03-18 15:37:49 UTC
call_once is implemented on top of pthread_once in libstdc++
but pthead_once is not exception safe, only cancellation safe
and it cannot be both at the same time in glibc with the current
cancellation implementation (although there is ongoing work
to fix that).

in general this is problematic because it relies on an interface
contract that pthread_once does not provide, so libstdc++ should
fix it even if glibc makes pthread_once exception safe.

(i was told that the underlying implementation is visible through
"native handles" so a pthread-independent call_once implementation
is problematic, this sounds to me like a c++ defect, such handles
probably should not be exposed by libstdc++, it will break whenever
there is divergence between posix and c++ requirements.. and there
are plenty of that already.)
Comment 15 Jonathan Wakely 2016-03-18 15:51:21 UTC
The native handles are not required by the standard, whether they exist or not is implementation-defined. If they're not supportable we'll remove them, that's not a problem.
Comment 16 Anthony Williams 2016-03-18 16:14:53 UTC
Would it be worth ignoring pthread_once and using an implementation of call_once based on Mike Burrows' algorithm?

http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2444.html
Comment 17 Jonathan Wakely 2016-03-18 16:36:19 UTC
Yes, I think we need to do that, because even if glibc were to change that wouldn't help on non-gnu targets.
Comment 18 Jonathan Wakely 2016-05-09 18:31:34 UTC
*** Bug 71025 has been marked as a duplicate of this bug. ***
Comment 19 Jonathan Wakely 2016-11-07 22:08:29 UTC
*** Bug 78184 has been marked as a duplicate of this bug. ***
Comment 20 Jonathan Wakely 2018-02-28 13:16:53 UTC
(In reply to Anthony Williams from comment #16)
> Would it be worth ignoring pthread_once and using an implementation of
> call_once based on Mike Burrows' algorithm?
> 
> http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2444.html

That seems to work for pthread_once but not std::call_once, due to this assumption:

  The number of pthread_once_t variables in a programme is bounded and can be
  counted in an instance of type T without overflow. If T is 32-bits, this
  implementation requires that the number of pthread_once_t variables in a
  programme not exceed 2**31-3. (Recall that pthread_once_t variables are
  intended to have static storage class, so it would be remarkable for such
  a huge number to exist.) 

Nothing prevents an unbounded number of std::once_flag objects in C++:

#include <mutex>

int main()
{
  while (true)
  {
    std::once_flag f;
    std::call_once(f, [](){});
  }
}
Comment 21 Jon Cohen 2018-06-05 15:49:07 UTC
What is the current status on this bug?  I'm currently working on exception safety for Abseil's call_once which uses pthread_once under the hood.  It looks like there are still people bringing up issue with this in the glibc thread as well -- is this still actively being worked on?
Comment 22 Jonathan Wakely 2018-06-05 22:39:06 UTC
It still needs to be fixed, and hoping for a libc fix isn't going to work because even if we get a solution in glibc it won't work for other C libraries. So we need to reimplement our std::call_once. I'll make sure we get this done for GCC 9.
Comment 23 Jakub Jelinek 2019-05-03 09:17:24 UTC
GCC 9.1 has been released.
Comment 24 Jon Cohen 2019-05-03 15:15:33 UTC
I don't see anything in the release notes about call_once.  Is this still an open issue?
Comment 25 Jonathan Wakely 2019-05-03 20:38:05 UTC
Yes, that's why the bug is still open.

Jakub's comment accompanied changing the target milestone from 9.1 to 9.2, because it wasn't fixed for 9.1.

Changing it to 10, because it won't be done for 9.2 either.
Comment 26 Paul Adobe 2019-12-12 23:12:59 UTC
It may be interesting to note that we recently ended up tracking a change in glibc to making this a problem on x86_64 architectures. Apparently when they refactored pthread_once in glibc patch 2.17.288 it changed the problem from an arm/powerpc only problem to an all-architecture problem. (cent7.7 had recently updated to .292, which made this quite evident)
Comment 27 Jonathan Wakely 2019-12-12 23:38:42 UTC
Yes, we need to reimplement call_once to not use pthread_once at all, for any targets.
Comment 28 Jonathan Wakely 2020-02-06 08:53:35 UTC
*** Bug 93610 has been marked as a duplicate of this bug. ***
Comment 29 Jakub Jelinek 2020-05-07 11:56:24 UTC
GCC 10.1 has been released.
Comment 30 Peter Dimov 2020-06-27 18:18:22 UTC
I was going to ask the stupid question "why not just use the straightforward double-checked locking here" but the answer is probably "ABI break".
Comment 31 Evgeny 2020-08-27 13:56:40 UTC
(In reply to Jonathan Wakely from comment #27)
> Yes, we need to reimplement call_once to not use pthread_once at all, for
> any targets.
Is there any progress on this issue?
Comment 32 CVS Commits 2020-11-03 18:45:07 UTC
The master branch has been updated by Jonathan Wakely <redi@gcc.gnu.org>:

https://gcc.gnu.org/g:93e79ed391b9c636f087e6eb7e70f14963cd10ad

commit r11-4691-g93e79ed391b9c636f087e6eb7e70f14963cd10ad
Author: Jonathan Wakely <jwakely@redhat.com>
Date:   Tue Nov 3 18:44:32 2020 +0000

    libstdc++: Rewrite std::call_once to use futexes [PR 66146]
    
    The current implementation of std::call_once uses pthread_once, which
    only meets the C++ requirements when compiled with support for
    exceptions. For most glibc targets and all non-glibc targets,
    pthread_once does not work correctly if the init_routine exits via an
    exception. The pthread_once_t object is left in the "active" state, and
    any later attempts to run another init_routine will block forever.
    
    This change makes std::call_once work correctly for Linux targets, by
    replacing the use of pthread_once with a futex, based on the code from
    __cxa_guard_acquire. For both glibc and musl, the Linux implementation
    of pthread_once is already based on futexes, and pthread_once_t is just
    a typedef for int, so this change does not alter the layout of
    std::once_flag. By choosing the values for the int appropriately, the
    new code is even ABI compatible. Code that calls the old implementation
    of std::call_once will use pthread_once to manipulate the int, while new
    code will use the new std::once_flag members to manipulate it, but they
    should interoperate correctly. In both cases, the int is initially zero,
    has the lowest bit set when there is an active execution, and equals 2
    after a successful returning execution. The difference with the new code
    is that exceptional exceptions are correctly detected and the int is
    reset to zero.
    
    The __cxa_guard_acquire code (and musl's pthread_once) use an additional
    state to say there are other threads waiting. This allows the futex wake
    syscall to be skipped if there is no contention. Glibc doesn't use a
    waiter bit, so we have to unconditionally issue the wake in order to be
    compatible with code calling the old std::call_once that uses Glibc's
    pthread_once. If we know that we're using musl (and musl's pthread_once
    doesn't change) it would be possible to set a waiting state and check
    for it in std::once_flag::_M_finish(bool), but this patch doesn't do
    that.
    
    This doesn't fix the bug for non-linux targets. A similar approach could
    be used for targets where we know the definition of pthread_once_t is a
    mutex and an integer. We could make once_flag._M_activate() use
    pthread_mutex_lock on the mutex member within the pthread_once_t, and
    then only set the integer if the execution finishes, and then unlock the
    mutex. That would require careful study of each target's pthread_once
    implementation and that work is left for a later date.
    
    This also fixes PR 55394 because pthread_once is no longer needed, and
    PR 84323 because the fast path is now just an atomic load.
    
    As a consequence of the new implementation that doesn't use
    pthread_once, we can also make std::call_once work for targets with no
    gthreads support. The code for the single-threaded implementation
    follows the same methods as on Linux, but with no need for atomics or
    futexes.
    
    libstdc++-v3/ChangeLog:
    
            PR libstdc++/55394
            PR libstdc++/66146
            PR libstdc++/84323
            * config/abi/pre/gnu.ver (GLIBCXX_3.4.29): Add new symbols.
            * include/std/mutex [!_GLIBCXX_HAS_GTHREADS] (once_flag): Define
            even when gthreads is not supported.
            (once_flag::_M_once) [_GLIBCXX_HAVE_LINUX_FUTEX]: Change type
            from __gthread_once_t to int.
            (once_flag::_M_passive(), once_flag::_M_activate())
            (once_flag::_M_finish(bool), once_flag::_Active_execution):
            Define new members for futex and non-threaded implementation.
            [_GLIBCXX_HAS_GTHREADS] (once_flag::_Prepare_execution): New
            RAII helper type.
            (call_once): Use new members of once_flag.
            * src/c++11/mutex.cc (std::once_flag::_M_activate): Define.
            (std::once_flag::_M_finish): Define.
            * testsuite/30_threads/call_once/39909.cc: Do not require
            gthreads.
            * testsuite/30_threads/call_once/49668.cc: Likewise.
            * testsuite/30_threads/call_once/60497.cc: Likewise.
            * testsuite/30_threads/call_once/call_once1.cc: Likewise.
            * testsuite/30_threads/call_once/dr2442.cc: Likewise.
            * testsuite/30_threads/call_once/once_flag.cc: Add test for
            constexpr constructor.
            * testsuite/30_threads/call_once/66146.cc: New test.
            * testsuite/30_threads/call_once/constexpr.cc: Removed.
            * testsuite/30_threads/once_flag/cons/constexpr.cc: Removed.
Comment 33 Jonathan Wakely 2020-11-03 18:46:54 UTC
Fixed for linux targets, not others though.
Comment 34 Jonathan Wakely 2020-11-03 20:17:23 UTC
Untested sketch of a solution for Solaris and BSDs:
https://gcc.gnu.org/pipermail/gcc-patches/2020-November/557928.html
Comment 35 Thomas Schwinge 2020-12-17 09:08:38 UTC
In an '--enable-werror' configuration (assuming that's relevant), I'm seeing new code from commit r11-4691-g93e79ed391b9c636f087e6eb7e70f14963cd10ad "libstdc++: Rewrite std::call_once to use futexes [PR 66146]" fail to build:

    [...]/source-gcc/libstdc++-v3/src/c++11/mutex.cc: In member function ‘void std::once_flag::_M_finish(bool)’:
    [...]/source-gcc/libstdc++-v3/src/c++11/mutex.cc:77:11: error: unused variable ‘prev’ [-Werror=unused-variable]
       77 |       int prev = __atomic_exchange_n(&_M_once, newval, __ATOMIC_RELEASE);
          |           ^~~~
    cc1plus: all warnings being treated as errors
    Makefile:648: recipe for target 'mutex.lo' failed
    make[5]: *** [mutex.lo] Error 1
    make[5]: Leaving directory '[...]/build-gcc/x86_64-pc-linux-gnu/libstdc++-v3/src/c++11'

Should a '(void) prev;' be added (my current workaround), or 'prev' get some attribute 'used' added, or something else?
Comment 36 Jonathan Wakely 2020-12-17 09:58:47 UTC
Attribute unused, not attribute used.

I'll fix it shortly.
Comment 37 CVS Commits 2020-12-17 14:03:43 UTC
The master branch has been updated by Jonathan Wakely <redi@gcc.gnu.org>:

https://gcc.gnu.org/g:9f9dbc8e09cf48406aa24b6c78735f1a7912cc4e

commit r11-6229-g9f9dbc8e09cf48406aa24b6c78735f1a7912cc4e
Author: Jonathan Wakely <jwakely@redhat.com>
Date:   Thu Dec 17 13:16:02 2020 +0000

    libstdc++: Fix -Wunused warning
    
    As noted in PR 66146 comment 35, there is a new warning in the new
    std::call_once implementation.
    
    libstdc++-v3/ChangeLog:
    
            * src/c++11/mutex.cc (std::once_flag::_M_finish): Add
            maybe_unused attribute to variable used in assertion.
Comment 38 Libor Bukata 2021-02-04 15:16:14 UTC
(In reply to Jonathan Wakely from comment #34)
> Untested sketch of a solution for Solaris and BSDs:
> https://gcc.gnu.org/pipermail/gcc-patches/2020-November/557928.html

Thank you for the patch, I will test it on Solaris
and let you know when I have the results.
Comment 39 Libor Bukata 2021-02-11 08:00:07 UTC
(In reply to Libor Bukata from comment #38)
> (In reply to Jonathan Wakely from comment #34)
> > Untested sketch of a solution for Solaris and BSDs:
> > https://gcc.gnu.org/pipermail/gcc-patches/2020-November/557928.html
> 
> Thank you for the patch, I will test it on Solaris
> and let you know when I have the results.

Patched GCC master with Jonathan's patch, but unfortunately it is not compilable on Solaris 11:
In file included from /gcc_nightly/build/sparcv9/sparcv9-sun-solaris2.11/libstdc++-v3/include/sparcv9-sun-solaris2.11/bits/c++config.h:568,
                 from /gcc_nightly/build/sparcv9/sparcv9-sun-solaris2.11/libstdc++-v3/include/cassert:43,
                 from /gcc_nightly/gcc-11.0.0-master/libstdc++-v3/include/precompiled/stdc++.h:33:
/gcc_nightly/build/sparcv9/sparcv9-sun-solaris2.11/libstdc++-v3/include/mutex:861:10: error: token "{" is not valid in preprocessor expressions
  861 |       || _GLIBCXX_COMPAT_PTHREAD_ONCE_T
      |          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from /gcc_nightly/build/sparcv9/sparcv9-sun-solaris2.11/libstdc++-v3/include/sparcv9-sun-solaris2.11/bits/c++config.h:568,
                 from /gcc_nightly/build/sparcv9/sparcv9-sun-solaris2.11/libstdc++-v3/include/cassert:43,
                 from /gcc_nightly/gcc-11.0.0-master/libstdc++-v3/include/precompiled/stdc++.h:33:
/gcc_nightly/build/sparcv9/sparcv9-sun-solaris2.11/libstdc++-v3/include/mutex:861:10: error: token "{" is not valid in preprocessor expressions
  861 |       || _GLIBCXX_COMPAT_PTHREAD_ONCE_T
      |          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from /gcc_nightly/build/sparcv9/sparcv9-sun-solaris2.11/libstdc++-v3/include/future:38,
                 from /gcc_nightly/gcc-11.0.0-master/libstdc++-v3/include/precompiled/stdc++.h:105:
/gcc_nightly/build/sparcv9/sparcv9-sun-solaris2.11/libstdc++-v3/include/mutex:675:15: error: explicitly defaulted function 'constexpr std::once_flag::once_flag()' cannot be declared 'constexpr' because the implicit declaration is not 'constexpr':
  675 |     constexpr once_flag() noexcept = default;
      |               ^~~~~~~~~
/gcc_nightly/build/sparcv9/sparcv9-sun-solaris2.11/libstdc++-v3/include/mutex:699:12: note: defaulted constructor calls non-'constexpr' 'std::once_flag::_Once::_Once()'
  699 |     struct _Once _GLIBCXX_COMPAT_PTHREAD_ONCE_T;

Note that there was a merge conflict with the current master, however, its resolution was straightforward. If there is an updated patch, I will be happy to test it as well.
Comment 40 Jonathan Wakely 2021-02-11 13:29:57 UTC
(In reply to Libor Bukata from comment #39)
> (In reply to Libor Bukata from comment #38)
> > (In reply to Jonathan Wakely from comment #34)
> > > Untested sketch of a solution for Solaris and BSDs:
> > > https://gcc.gnu.org/pipermail/gcc-patches/2020-November/557928.html
> > 
> > Thank you for the patch, I will test it on Solaris
> > and let you know when I have the results.
> 
> Patched GCC master with Jonathan's patch, but unfortunately it is not
> compilable on Solaris 11:
> In file included from
> /gcc_nightly/build/sparcv9/sparcv9-sun-solaris2.11/libstdc++-v3/include/
> sparcv9-sun-solaris2.11/bits/c++config.h:568,
>                  from
> /gcc_nightly/build/sparcv9/sparcv9-sun-solaris2.11/libstdc++-v3/include/
> cassert:43,
>                  from
> /gcc_nightly/gcc-11.0.0-master/libstdc++-v3/include/precompiled/stdc++.h:33:
> /gcc_nightly/build/sparcv9/sparcv9-sun-solaris2.11/libstdc++-v3/include/
> mutex:861:10: error: token "{" is not valid in preprocessor expressions
>   861 |       || _GLIBCXX_COMPAT_PTHREAD_ONCE_T
>       |          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> In file included from
> /gcc_nightly/build/sparcv9/sparcv9-sun-solaris2.11/libstdc++-v3/include/
> sparcv9-sun-solaris2.11/bits/c++config.h:568,
>                  from
> /gcc_nightly/build/sparcv9/sparcv9-sun-solaris2.11/libstdc++-v3/include/
> cassert:43,
>                  from
> /gcc_nightly/gcc-11.0.0-master/libstdc++-v3/include/precompiled/stdc++.h:33:
> /gcc_nightly/build/sparcv9/sparcv9-sun-solaris2.11/libstdc++-v3/include/
> mutex:861:10: error: token "{" is not valid in preprocessor expressions
>   861 |       || _GLIBCXX_COMPAT_PTHREAD_ONCE_T
>       |          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Oops, that should be:

                || defined _GLIBCXX_COMPAT_PTHREAD_ONCE_T

to test if the macro is defined, rather than testing its value.



> In file included from
> /gcc_nightly/build/sparcv9/sparcv9-sun-solaris2.11/libstdc++-v3/include/
> future:38,
>                  from
> /gcc_nightly/gcc-11.0.0-master/libstdc++-v3/include/precompiled/stdc++.h:105:
> /gcc_nightly/build/sparcv9/sparcv9-sun-solaris2.11/libstdc++-v3/include/
> mutex:675:15: error: explicitly defaulted function 'constexpr
> std::once_flag::once_flag()' cannot be declared 'constexpr' because the
> implicit declaration is not 'constexpr':
>   675 |     constexpr once_flag() noexcept = default;
>       |               ^~~~~~~~~
> /gcc_nightly/build/sparcv9/sparcv9-sun-solaris2.11/libstdc++-v3/include/
> mutex:699:12: note: defaulted constructor calls non-'constexpr'
> 'std::once_flag::_Once::_Once()'
>   699 |     struct _Once _GLIBCXX_COMPAT_PTHREAD_ONCE_T;

This one's a bit trickier. The struct needs a default constructor that initializes all its members, but then it can't be used in the union.

I'll provide a new patch when I get a chance to look into it.
Comment 41 Jonathan Wakely 2021-02-15 13:03:03 UTC
The new std::call_once using a futex is not backwards compatible, so I think it needs to be reverted, or hidden behind an ABI-breaking flag.

The new std::once_flag::_M_activate() function sets _M_once=1 when an initialization is in progress.

With glibc, if a call to the new std::call_once happens before a call to the old version of std::call_once, then glibc's pthread_once will find no fork generation value in _M_once and so will think it should run the init_function itself. Both threads will run their init_function, instead of the second one waiting for the first to finish.

With musl, if a call to the old std::call_once happens before a call to the new std::call_once, then the second thread won't set _M_once=3 and so musl's pthread_once won't wake the second thread when the first finishes. The second thread will sleep forever (or until a spurious wake from the futex wait).
Comment 42 Rich Felker 2021-02-15 14:13:30 UTC
I'm confused why this is an ABI boundary at all. Was the old implementation of std::call_once being inlined into callers? Otherwise all code operating on the same once object should be using a common implementation, either the old one or the new one, from libstdc++.
Comment 43 Jonathan Wakely 2021-02-15 15:00:22 UTC
(In reply to Rich Felker from comment #42)
> I'm confused why this is an ABI boundary at all. Was the old implementation
> of std::call_once being inlined into callers?

Yes, it's a function template:

https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=libstdc%2B%2B-v3/include/std/mutex;h=12b7e548d179c3a2cb0ed65b6e113031f11293f6;hb=ee5c3db6c5b2c3332912fb4c9cfa2864569ebd9a#l710

The call to __gthread_once (which is a weak alias for pthread_once) is on line 729.
Comment 44 Rich Felker 2021-02-15 17:08:05 UTC
Uhg. I don't know what kind of retroactive fix for that is possible, if any, but going forward this kind of thing (assumptions that impose ABI boundaries) should not be inlined by the template. It should just expand to an external call so that the implementation details can be kept as implementation details and changed as needed.
Comment 45 Florian Weimer 2021-02-15 17:10:27 UTC
Statically linking libstdc++ into shared objects is also not too uncommon.  With luck, the libstdc++ symbols are hidden, but operating on globally shared across multiple libstdc++s exposes similar issues even without inlining.
Comment 46 Rich Felker 2021-02-15 17:30:49 UTC
It's a standard and completely reasonable assumption that, if you statically linked libstdc++ into your shared library, the copy there is for *internal use only* and cannot share objects of the standard library's types across boundaries with other libraries or the main application. The problem only comes when the library's implementation (via templates or inline code in headers) imposes the same requirement on normal dynamic linking, where it's a nonstandard and unreasonable one.
Comment 47 CVS Commits 2021-03-16 12:39:48 UTC
The master branch has been updated by Jonathan Wakely <redi@gcc.gnu.org>:

https://gcc.gnu.org/g:6ee24638ed0ad51e568c799bacf149ba9bd7628b

commit r11-7688-g6ee24638ed0ad51e568c799bacf149ba9bd7628b
Author: Jonathan Wakely <jwakely@redhat.com>
Date:   Fri Mar 12 11:47:20 2021 +0000

    libstdc++: Revert to old std::call_once implementation [PR 99341]
    
    The new std::call_once implementation is not backwards compatible,
    contrary to my intention. Because std::once_flag::_M_active() doesn't
    write glibc's "fork generation" into the pthread_once_t object, it's
    possible for glibc and libstdc++ to run two active executions
    concurrently. This violates the primary invariant of the feature!
    
    This patch reverts std::once_flag and std::call_once to the old
    implementation that uses pthread_once. This means PR 66146 is a problem
    again, but glibc has been changed to solve that. A new API similar to
    pthread_once but supporting failure and resetting the pthread_once_t
    will be proposed for inclusion in glibc and other C libraries.
    
    This change doesn't simply revert r11-4691 because I want to retain the
    new implementation for non-ghtreads targets (which didn't previously
    support std::call_once at all, so there's no backwards compatibility
    concern). This also leaves the new std::call_once::_M_activate() and
    std::call_once::_M_finish(bool) symbols present in libstdc++.so.6 so
    that code already compiled against GCC 11 can still use them. Those
    symbols will be removed in a subsequent commit (which distros can choose
    to temporarily revert if needed).
    
    libstdc++-v3/ChangeLog:
    
            PR libstdc++/99341
            * include/std/mutex [_GLIBCXX_HAVE_LINUX_FUTEX] (once_flag):
            Revert to pthread_once_t implementation.
            [_GLIBCXX_HAVE_LINUX_FUTEX] (call_once): Likewise.
            * src/c++11/mutex.cc [_GLIBCXX_HAVE_LINUX_FUTEX]
            (struct __once_flag_compat): New type matching the reverted
            implementation of once_flag using futexes.
            (once_flag::_M_activate): Remove, replace with ...
            (_ZNSt9once_flag11_M_activateEv): ... alias symbol.
            (once_flag::_M_finish): Remove, replace with ...
            (_ZNSt9once_flag9_M_finishEb): ... alias symbol.
            * testsuite/30_threads/call_once/66146.cc: Removed.
Comment 48 cqwrteur 2021-04-07 06:22:30 UTC
Created attachment 50518 [details]
after this update, my clang does not work any more

Jonathan. why?
Any solutions?
Comment 49 Jonathan Wakely 2021-04-07 10:03:36 UTC
Looks like you didn't rebuild something properly. The __once_functor symbol should not have changed at all.
Comment 50 Jakub Jelinek 2021-04-27 11:37:45 UTC
GCC 11.1 has been released, retargeting bugs to GCC 11.2.
Comment 51 Jonathan Wakely 2022-02-10 23:20:53 UTC
*** Bug 104495 has been marked as a duplicate of this bug. ***
Comment 52 Jonathan Wakely 2022-02-10 23:29:12 UTC
*** Bug 104495 has been marked as a duplicate of this bug. ***
Comment 53 LIU Hao 2022-12-02 01:25:14 UTC
Why not implement `call_once` with `__cxa_guard_{acquire,release,abort}`?

It's basically

  ```
  template<typename _Callable, typename... _Args>
  void
  call_once(once_flag& __once, _Callable&& __callable, _Args&&... __args)
    {
      int __r = ::__cxa_guard_acquire(&__once);
      if(__r == 0)
        return;  // passive

      __try {
        _INVOKE(forward<_Callable>(__callable), forward<_Args>(__args)...);
      }
      __catch(...) {
        ::__cxa_guard_abort(&__once);  // exceptional
        __throw_exception_again;
      }
      ::__cxa_guard_release(&__once);  // returning
    }
  ```
Comment 54 Jonathan Wakely 2022-12-03 09:26:45 UTC
Because it would be an ABI break. It's a good option if backwards compatibility is not required, I've suggested it before.