I'm getting weird hangs on Github Actions when using `std::atomic<state_type>::compare_exchange_strong` under GCC 13 on Ubuntu 23.04 (only; GCC 12 and earlier on Ubuntu 22.04 and earlier work). `state_type` is defined as ``` struct state_type { std::uint64_t timestamp; std::uint16_t clock_seq; }; ``` and the code doing the CAS is ``` auto oldst = ps_->load( std::memory_order_relaxed ); for( ;; ) { auto newst = get_new_state( oldst ); if( ps_->compare_exchange_strong( oldst, newst, std::memory_order_relaxed, std::memory_order_relaxed ) ) { state_ = newst; break; } } ``` where `ps` is of type `std::atomic<state_type>*`. At a glance, I see nothing immediately wrong with the generated code (https://godbolt.org/z/8Ee3hrTz8). However, when I change `state_type` to ``` struct state_type { std::uint64_t timestamp; std::uint16_t clock_seq; std::uint16_t padding[ 3 ]; }; ``` the hangs disappear. This leads me to think that the problem is caused by the original struct having padding, which isn't being handled correctly for some reason. As we know, `std::atomic<T>::compare_exchange_strong` is carefully specified to take and return `expected` by reference, such that it can both compare the entire object as if via `memcmp` (including the padding), and return it as if by `memcpy`, again including the padding. Even though the padding bits of the initial value returned by the atomic load are unspecified, at most one iteration of the loop would be required for the padding bits to converge and for the CAS to succeed. However, going by the symptoms alone, this doesn't seem to be the case here. The problem may well be inside libatomic, of course; I have no way to tell. One GHA run showing the issue is https://github.com/boostorg/uuid/actions/runs/8821753835, where only the GCC 13 job times out.
> The problem may well be inside libatomic, of course; I have no way to tell. Narrator: but he did, in fact, have a way to tell. This is a GHA run with GCC 9 to 13 tested on both Ubuntu 23.04 and Ubuntu 23.10, and only GCC 13 hangs. https://github.com/boostorg/uuid/actions/runs/8852038822/job/24309866206 And indeed, the codegen from GCC 12 (https://godbolt.org/z/xc7oT76fn) is radically different from (and much simpler than) the GCC 13 one (https://godbolt.org/z/eP48xv3Mr). I think that the problem has been introduced with this commit: https://github.com/gcc-mirror/gcc/commit/157236dbd621644b3cec50b6cf38811959f3e78c which, ironically enough, was supposed to improve the handling of types with padding bits, but broke them entirely. (I told the committee there was nothing wrong with compare_exchange as specified, but did they listen to me? To ask the question is to answer it.)
The issue is also present for GCC 14 on Ubuntu 24.04: https://github.com/boostorg/uuid/actions/runs/8853249656/job/24313667955
Can you provide a full runable testcase which shows the issue rather than just code snippets? And not just link to godbolt. You can attach it. In this case we don't need the preprocesed source since it is about the library that is having issue. Please make sure it only depends on the libstdc++/libc headers too.
This https://raw.githubusercontent.com/boostorg/uuid/feature/gcc_pr_114865/test/test_gcc_pr114865.cpp exhibits the problem for me on GCC 13/14. I'm only seeing the hang with -std=c++11 -m32 in the CI run because this combination is tested first, but I believe it's independent of standard level and address model.
Created attachment 58053 [details] testcase
(In reply to Peter Dimov from comment #4) > but I believe it's independent of standard level and address model. No it is dependent on the standard level. C++11 fails but C++14, C++17 and C++20 all pass.
The problem is only with auto atomic variables. Looks like the padding there is not being zeroed for -std=c++11 ...
(In reply to Andrew Pinski from comment #6) > No it is dependent on the standard level. C++11 fails but C++14, C++17 and > C++20 all pass. That's interesting because I see basically no difference in the generated code on CE between 11 and 14: --- "C++-x86-64 gcc 13.2-1 (1).asm" 2024-04-27 03:25:11.149385400 +0300 +++ "C++-x86-64 gcc 13.2-1.asm" 2024-04-27 03:24:59.207244400 +0300 @@ -76,10 +76,11 @@ jmp .L8 main: push rbx + mov eax, 8738 mov ebx, 1024 sub rsp, 16 mov QWORD PTR [rsp], 0 - mov QWORD PTR [rsp+8], 8738 + mov WORD PTR [rsp+8], ax .L14: mov rdi, rsp call generate(std::atomic<state_type>*) (https://godbolt.org/z/nexn5W4Ph)
Oh, my mistake. C++14 does mov QWORD, and C++11 does mov WORD.
#if __cplusplus >= 201402L && __has_builtin(__builtin_clear_padding) if _GLIBCXX17_CONSTEXPR (__atomic_impl::__maybe_has_padding<_Tp>()) __builtin_clear_padding(std::__addressof(_M_i)); #endif So yes it is definitely dependent on C++ level ... That is for C++14+ it is working correctly.
So, basically, C++14 and above initialize the padding of ``` std::atomic<state_type> state{{ 0, 0x2222 }}; ``` in `main` to zero, which masks the problem in `generate`. (The problem in `generate` still exists because the assembly is identical - it just doesn't trigger because the padding is zero. If we manually poke something nonzero into the padding, it would (ought to) still break.) Static variables work for the same reason - the padding is guaranteed zero.
(In reply to Peter Dimov from comment #11) > So, basically, C++14 and above initialize the padding of > > ``` > std::atomic<state_type> state{{ 0, 0x2222 }}; > ``` > > in `main` to zero, which masks the problem in `generate`. (The problem in > `generate` still exists because the assembly is identical - it just doesn't > trigger because the padding is zero. If we manually poke something nonzero > into the padding, it would (ought to) still break.) No there is no problem in generate for C++14+ because there is no way to get the padding non-zero for that variable. The problem is only with C++11's std::atomic ... See PR 111077 and such.
(In reply to Andrew Pinski from comment #10) > #if __cplusplus >= 201402L && __has_builtin(__builtin_clear_padding) > if _GLIBCXX17_CONSTEXPR (__atomic_impl::__maybe_has_padding<_Tp>()) > __builtin_clear_padding(std::__addressof(_M_i)); > #endif > > So yes it is definitely dependent on C++ level ... > That is for C++14+ it is working correctly. Oh, that's the constructor of `atomic`. I thought it was the compiler initializing the padding in C++14 and above. I wonder why `__cplusplus >= 201402L` is here.
Because a constexpr function can't have an if statement in C++11. But maybe we should just clear padding unconditionally for C++11.
--- a/libstdc++-v3/include/std/atomic +++ b/libstdc++-v3/include/std/atomic @@ -238,8 +238,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION constexpr atomic(_Tp __i) noexcept : _M_i(__i) { -#if __cplusplus >= 201402L && __has_builtin(__builtin_clear_padding) +#if __has_builtin(__builtin_clear_padding) +#if __cplusplus >= 201402L if _GLIBCXX17_CONSTEXPR (__atomic_impl::__maybe_has_padding<_Tp>()) +#endif __builtin_clear_padding(std::__addressof(_M_i)); #endif }
Alternatively, we could skip all the padding cope in compare_exchange for C++11, since that was a C++20 change and not a DR.
(In reply to Jonathan Wakely from comment #15) > --- a/libstdc++-v3/include/std/atomic > +++ b/libstdc++-v3/include/std/atomic > @@ -238,8 +238,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION > > constexpr atomic(_Tp __i) noexcept : _M_i(__i) > { > -#if __cplusplus >= 201402L && __has_builtin(__builtin_clear_padding) > +#if __has_builtin(__builtin_clear_padding) > +#if __cplusplus >= 201402L > if _GLIBCXX17_CONSTEXPR (__atomic_impl::__maybe_has_padding<_Tp>()) > +#endif > __builtin_clear_padding(std::__addressof(_M_i)); > #endif > } Oh right, that still doesn't work in C++11: /home/jwakely/gcc/15/include/c++/15.0.0/atomic: In constructor 'std::atomic<_Tp>::atomic(_Tp)': /home/jwakely/gcc/15/include/c++/15.0.0/atomic:247:7: error: 'constexpr' constructor does not have empty body I remember hitting this when I implemented the padding stuff.
So maybe: diff --git a/libstdc++-v3/include/bits/atomic_base.h b/libstdc++-v3/include/bits/atomic_base.h index aa7374bb252..662cff39df5 100644 --- a/libstdc++-v3/include/bits/atomic_base.h +++ b/libstdc++-v3/include/bits/atomic_base.h @@ -956,7 +956,9 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION constexpr bool __maybe_has_padding() { -#if ! __has_builtin(__builtin_clear_padding) + // We cannot clear padding in the constructor for C++11, + // so return false here to disable all code for zeroing padding. +#if __cplusplus < 201402L || ! __has_builtin(__builtin_clear_padding) return false; #elif __has_builtin(__has_unique_object_representations) return !__has_unique_object_representations(_Tp) diff --git a/libstdc++-v3/include/std/atomic b/libstdc++-v3/include/std/atomic index 36ff89a146c..336f27832fc 100644 --- a/libstdc++-v3/include/std/atomic +++ b/libstdc++-v3/include/std/atomic @@ -238,6 +238,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION constexpr atomic(_Tp __i) noexcept : _M_i(__i) { + // A constexpr constructor must be empty in C++11, + // so we can only clear padding for C++14 and later. #if __cplusplus >= 201402L && __has_builtin(__builtin_clear_padding) if _GLIBCXX17_CONSTEXPR (__atomic_impl::__maybe_has_padding<_Tp>()) __builtin_clear_padding(std::__addressof(_M_i));
This should work. I still don't understand why JF so insisted on all these padding shenanigans.
GCC 13.3 is being released, retargeting bugs to GCC 13.4.