Bug 66146 - call_once not C++11-compliant on ppc64le
Summary: call_once not C++11-compliant on ppc64le
Status: ASSIGNED
Alias: None
Product: gcc
Classification: Unclassified
Component: libstdc++ (show other bugs)
Version: 5.1.0
: P3 normal
Target Milestone: 10.0
Assignee: Jonathan Wakely
URL:
Keywords:
: 70298 71025 78184 (view as bug list)
Depends on:
Blocks:
 
Reported: 2015-05-14 18:31 UTC by Andrey V
Modified: 2019-05-03 20:38 UTC (History)
13 users (show)

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed: 2015-05-15 00:00:00


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Andrey V 2015-05-14 18:31:33 UTC
std::call_once is not C++11 (or even N2447) compliant on ppc64le.

According to N2447, "If the invocation of func results in an exception being thrown, the exception is propagated to the caller and the effects are as-if this invocation of call_once did not occur."

Given the code

#include <mutex>

int call_count = 0;
void func_() {
printf("Inside func_ call_count %d\n", call_count);
if (++call_count < 2)
throw 0;
}
int main() {
std::once_flag flag_;
printf("Before calling call_once flag_: %d\n", *(int*)&flag_);
try { std::call_once(flag_, func_); } catch(...) { printf("Inside catch all excepton flag_: %d\n", *(int*)&flag_); }
printf("before the 2nd call to call_once flag_: %d\n", *(int*)&flag_);
std::call_once(flag_, func_);
}

the output on amd64 is
Before calling call_once flag_: 0
Inside func_ call_count 0
Inside catch all excepton flag_: 0
before the 2nd call to call_once flag_: 0

but on ppc64el is
Before calling call_once flag_: 0
Inside func_ call_count 0
Inside catch all excepton flag_: 1
before the 2nd call to call_once flag_: 1

amd64 has the correct behavior.

ppc64le gcc tested:
$gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/powerpc64le-linux-gnu/4.9/lto-wrapper
Target: powerpc64le-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu 4.9.1-16ubuntu6' --with-bugurl=file:///usr/share/doc/gcc-4.9/README.Bugs --enable-languages=c,c++,java,go,d,fortran,objc,obj-c++ --prefix=/usr --program-suffix=-4.9 --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --with-gxx-include-dir=/usr/include/c++/4.9 --libdir=/usr/lib --enable-nls --with-sysroot=/ --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-gnu-unique-object --disable-libsanitizer --disable-libquadmath --enable-plugin --with-system-zlib --disable-browser-plugin --enable-java-awt=gtk --enable-gtk-cairo --with-java-home=/usr/lib/jvm/java-1.5.0-gcj-4.9-ppc64el/jre --enable-java-home --with-jvm-root-dir=/usr/lib/jvm/java-1.5.0-gcj-4.9-ppc64el --with-jvm-jar-dir=/usr/lib/jvm-exports/java-1.5.0-gcj-4.9-ppc64el --with-arch-directory=ppc64 --with-ecj-jar=/usr/share/java/eclipse-ecj.jar --enable-objc-gc --enable-secureplt --with-cpu=power7 --with-tune=power8 --disable-multilib --enable-multiarch --disable-werror --with-long-double-128 --enable-checking=release --build=powerpc64le-linux-gnu --host=powerpc64le-linux-gnu --target=powerpc64le-linux-gnu
Thread model: posix
gcc version 4.9.1 (Ubuntu 4.9.1-16ubuntu6) 

$gcc-5.1 -v
Using built-in specs.
COLLECT_GCC=/home/andreyv/nfs/gcc/bin/gcc-5.1
COLLECT_LTO_WRAPPER=/nfs/home/andreyv/gcc/bin/../libexec/gcc/powerpc64le-unknown-linux-gnu/5.1.0/lto-wrapper
Target: powerpc64le-unknown-linux-gnu
Configured with: ../gcc-5.1.0/configure --prefix=/nfs/home/andreyv/gcc/ --program-suffix=-5.1 --enable-languages=c,c++,lto CFLAGS='-O2 -mtune=native' CXXFLAGS='-O2 -mtune=native'
Thread model: posix
gcc version 5.1.0 (GCC) 

Both versions compiled and run on Ubuntu 14.10 on a Power8.
gcc-5.1 invocation: $ ~/nfshome/gcc/bin/g++-5.1 -std=c++11 -lpthread -Wl,-rpath=~/nfshome/gcc/lib64 -Wall -Wextra a.cc
gcc-4.9 invocation: g++ -std=c++11 -lpthread -Wall -Wextra a.cc
Comment 1 Jonathan Wakely 2015-05-15 08:55:28 UTC
This has nothing to do with libstdc++, the std::call_once code is identical on x86 and ppc, and you get the same behaviour with pthreads, so it should be reported to glibc instead.

#include <stdio.h>
#include <pthread.h>

pthread_once_t flag_ = PTHREAD_ONCE_INIT;

int call_count = 0;
extern "C" void func_() {
  printf("Inside func_ call_count %d\n", call_count);
  if (++call_count < 2)
    throw 0;
}
int main() {
  printf("Before calling call_once flag_: %d\n", *(int*)&flag_);
  try {
    pthread_once(&flag_, func_);
  } catch(...) {
    printf("Inside catch all excepton flag_: %d\n", *(int*)&flag_);
  }
  printf("before the 2nd call to call_once flag_: %d\n", *(int*)&flag_);
  pthread_once(&flag_, func_);
}
Comment 2 Andrey V 2015-05-15 15:12:54 UTC
Replacing throw/try/catch with longjmp/setjmp for non-returning function exit, like so:

#include <stdio.h>
#include <pthread.h>
#include <setjmp.h>

pthread_once_t flag_ = PTHREAD_ONCE_INIT;

int call_count = 0;
jmp_buf catch_;

void func_() {
  printf("Inside func_ call_count %d\n", call_count);
  if (++call_count < 2)
	longjmp(catch_, 1);
}
int main() {
  int signo;
  printf("Before calling call_once flag_: %d\n", *(int*)&flag_);
  signo = setjmp(catch_);
  if (!signo)
    pthread_once(&flag_, func_);
  else
    printf("Inside catch all excepton flag_: %d\n", *(int*)&flag_);
  printf("before the 2nd call to call_once flag_: %d\n", *(int*)&flag_);
  pthread_once(&flag_, func_);
}

, the invalid behavior isn't triggered.
It appears that pthread_once doesn't take kindly to exceptions in the once_func.
Comment 3 Martin Sebor 2015-05-15 17:40:53 UTC
On Power, both glibc and AIX pthread_once behave the same way: i.e., they fail to clear the once flag on exception.  The test case below mimics glibc's pthread_once and demonstrates the root cause of the problem: the cancellation handler is not invoked when an exception is thrown unless the pthread_cleanup_push/pop macros are compiled in C++ code. Simply recompiling glibc's pthread_once.c using a C++ compiler (and adding the apprpriate extern "C" decoration) should fix it.

Until it's fixed, as a workaround, it seems that libstdc++ could clear the flag when an exception is thrown before propagating it out of call_once.

$ cat t.c && gcc -O2 -Wall -c -fasynchronous-unwind-tables -g t.c && g++ -DMAIN -O2 -Wall t.o -pthread t.c && ./a.out 
#include <pthread.h>
#include <stdio.h>

extern int n;

#if MAIN

extern "C" void foo () { throw 0; }
extern "C" void bar (void (*)());

int main () {
    try {
        bar (foo);
    }
    catch (...) {
        printf ("caught exception: pthread cleanup handler %sinvoked\n",
                n ? "" : "not ");
    }

    return n == 1 ? 0 : 1;
}

#else

int n;
#if __cplusplus
extern "C" {
#endif

static void cleanup (void *arg) { ++n; }

void bar (void (*pf)(void)) {
    pthread_cleanup_push (cleanup, 0);
    pf ();
    pthread_cleanup_pop (0);
}

#if __cplusplus
}
#endif

#endif
caught exception: pthread cleanup handler not invoked
Comment 4 Jonathan Wakely 2015-05-15 20:31:09 UTC
Thanks, Martin. So maybe something like this:

--- a/libstdc++-v3/include/std/mutex
+++ b/libstdc++-v3/include/std/mutex
@@ -726,7 +738,17 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
       __set_once_functor_lock_ptr(&__functor_lock);
 #endif
 
+#ifdef __powerpc__
+      int __e;
+      __try {
+       __e = __gthread_once(&__once._M_once, &__once_proxy);
+      } __catch(...) {
+         __once._M_once = once_flag{}._M_once;
+         __throw_exception_again;
+      }
+#else
       int __e = __gthread_once(&__once._M_once, &__once_proxy);
+#endif
 
 #ifndef _GLIBCXX_HAVE_TLS
       if (__functor_lock)
Comment 5 Andrew Pinski 2015-05-15 21:14:08 UTC
(In reply to Jonathan Wakely from comment #4)
> Thanks, Martin. So maybe something like this:

This happens on almost all non-x86 machines.  I tested it on aarch64 and we had the same failure as powerpc.
Comment 6 Andrey V 2015-05-19 17:36:23 UTC
Same failure on s390x.
Comment 7 Martin Sebor 2015-05-20 00:11:30 UTC
I opened http://sourceware.org/bugzilla/show_bug.cgi?id=18435 for the glibc bug and attached a lightly tested patch to it.
Comment 8 Jonathan Wakely 2015-05-20 14:12:42 UTC
NetBSD 5 and DragonFly BSD fail the test too. I'm going to make libstdc++ assume pthread_once is not exception-aware unless specifically told otherwise for targets where we know it works, such as x86-linux and (hopefully) glibc > 2.21
Comment 9 Rich Felker 2015-05-20 14:35:56 UTC
This is not portable usage of pthread_once. The longjmp based version is clearly wrong, per the resolution of http://austingroupbugs.net/view.php?id=863 and while POSIX has nothing to say about C++, the lack of any text forbidding the analogous C++ construct does not make a requirement to support it.

FYI, this type of usage is not supported, and impossible to support, in musl's implementation of pthread_once (which does not interact with exceptions/unwinding, only with thread cancellation).

There's a slim chance the C11 call_once function could be used, but I think the way it's specified also requires this kind of deadlock if you leave the init routine via any means but returning, and it's not widely supported yet anyway.

I think the right solution is to use atomics directly to implement std::call_once and only fallback to pthread sync primitives in the contended case. As long as you don't pass arbitrary C++ callback functions provided by the caller to pthread_once, it should be safe to use any pthread functions.
Comment 10 Jakub Jelinek 2015-05-20 14:46:37 UTC
In glibc, clear_once_control should clear the pthread_once_t var on throw, wonder why it doesn't work on ppc*, seems like pthread_once.c is correctly compiled with -fexceptions.
If musl doesn't want to support it, it is its own bad choice.
Comment 11 Jonathan Wakely 2015-05-20 14:46:58 UTC
(In reply to Jonathan Wakely from comment #4)
> Thanks, Martin. So maybe something like this:
> 
> --- a/libstdc++-v3/include/std/mutex
> +++ b/libstdc++-v3/include/std/mutex
> @@ -726,7 +738,17 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>        __set_once_functor_lock_ptr(&__functor_lock);
>  #endif
>  
> +#ifdef __powerpc__
> +      int __e;
> +      __try {
> +       __e = __gthread_once(&__once._M_once, &__once_proxy);
> +      } __catch(...) {
> +         __once._M_once = once_flag{}._M_once;

N.B. This won't work because it races, and may not even compile (if e.g. pthread_once_t has a const member)
Comment 12 Rich Felker 2015-05-20 14:51:53 UTC
Jakub, this is not the place to discuss the pros and cons of musl or other particular implementations; libstdc++ needs to support many which do not have the glibc-specific semantics you want. In particular there are plenty of BSDs the current code is broken on too. If you want to use pthread_once directly to implement std::call_once on glibc, that's your business, but there needs to be a portable implementation that works on other systems.
Comment 13 nsz 2016-03-18 15:28:07 UTC
*** Bug 70298 has been marked as a duplicate of this bug. ***
Comment 14 nsz 2016-03-18 15:37:49 UTC
call_once is implemented on top of pthread_once in libstdc++
but pthead_once is not exception safe, only cancellation safe
and it cannot be both at the same time in glibc with the current
cancellation implementation (although there is ongoing work
to fix that).

in general this is problematic because it relies on an interface
contract that pthread_once does not provide, so libstdc++ should
fix it even if glibc makes pthread_once exception safe.

(i was told that the underlying implementation is visible through
"native handles" so a pthread-independent call_once implementation
is problematic, this sounds to me like a c++ defect, such handles
probably should not be exposed by libstdc++, it will break whenever
there is divergence between posix and c++ requirements.. and there
are plenty of that already.)
Comment 15 Jonathan Wakely 2016-03-18 15:51:21 UTC
The native handles are not required by the standard, whether they exist or not is implementation-defined. If they're not supportable we'll remove them, that's not a problem.
Comment 16 Anthony Williams 2016-03-18 16:14:53 UTC
Would it be worth ignoring pthread_once and using an implementation of call_once based on Mike Burrows' algorithm?

http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2444.html
Comment 17 Jonathan Wakely 2016-03-18 16:36:19 UTC
Yes, I think we need to do that, because even if glibc were to change that wouldn't help on non-gnu targets.
Comment 18 Jonathan Wakely 2016-05-09 18:31:34 UTC
*** Bug 71025 has been marked as a duplicate of this bug. ***
Comment 19 Jonathan Wakely 2016-11-07 22:08:29 UTC
*** Bug 78184 has been marked as a duplicate of this bug. ***
Comment 20 Jonathan Wakely 2018-02-28 13:16:53 UTC
(In reply to Anthony Williams from comment #16)
> Would it be worth ignoring pthread_once and using an implementation of
> call_once based on Mike Burrows' algorithm?
> 
> http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2444.html

That seems to work for pthread_once but not std::call_once, due to this assumption:

  The number of pthread_once_t variables in a programme is bounded and can be
  counted in an instance of type T without overflow. If T is 32-bits, this
  implementation requires that the number of pthread_once_t variables in a
  programme not exceed 2**31-3. (Recall that pthread_once_t variables are
  intended to have static storage class, so it would be remarkable for such
  a huge number to exist.) 

Nothing prevents an unbounded number of std::once_flag objects in C++:

#include <mutex>

int main()
{
  while (true)
  {
    std::once_flag f;
    std::call_once(f, [](){});
  }
}
Comment 21 Jon Cohen 2018-06-05 15:49:07 UTC
What is the current status on this bug?  I'm currently working on exception safety for Abseil's call_once which uses pthread_once under the hood.  It looks like there are still people bringing up issue with this in the glibc thread as well -- is this still actively being worked on?
Comment 22 Jonathan Wakely 2018-06-05 22:39:06 UTC
It still needs to be fixed, and hoping for a libc fix isn't going to work because even if we get a solution in glibc it won't work for other C libraries. So we need to reimplement our std::call_once. I'll make sure we get this done for GCC 9.
Comment 23 Jakub Jelinek 2019-05-03 09:17:24 UTC
GCC 9.1 has been released.
Comment 24 Jon Cohen 2019-05-03 15:15:33 UTC
I don't see anything in the release notes about call_once.  Is this still an open issue?
Comment 25 Jonathan Wakely 2019-05-03 20:38:05 UTC
Yes, that's why the bug is still open.

Jakub's comment accompanied changing the target milestone from 9.1 to 9.2, because it wasn't fixed for 9.1.

Changing it to 10, because it won't be done for 9.2 either.