I'm getting weird hangs on Github Actions when using `std::atomic<state_type>::compare_exchange_strong` under GCC 13 on Ubuntu 23.04 (only; GCC 12 and earlier on Ubuntu 22.04 and earlier work). `state_type` is defined as ``` struct state_type { std::uint64_t timestamp; std::uint16_t clock_seq; }; ``` and the code doing the CAS is ``` auto oldst = ps_->load( std::memory_order_relaxed ); for( ;; ) { auto newst = get_new_state( oldst ); if( ps_->compare_exchange_strong( oldst, newst, std::memory_order_relaxed, std::memory_order_relaxed ) ) { state_ = newst; break; } } ``` where `ps` is of type `std::atomic<state_type>*`. At a glance, I see nothing immediately wrong with the generated code (https://godbolt.org/z/8Ee3hrTz8). However, when I change `state_type` to ``` struct state_type { std::uint64_t timestamp; std::uint16_t clock_seq; std::uint16_t padding[ 3 ]; }; ``` the hangs disappear. This leads me to think that the problem is caused by the original struct having padding, which isn't being handled correctly for some reason. As we know, `std::atomic<T>::compare_exchange_strong` is carefully specified to take and return `expected` by reference, such that it can both compare the entire object as if via `memcmp` (including the padding), and return it as if by `memcpy`, again including the padding. Even though the padding bits of the initial value returned by the atomic load are unspecified, at most one iteration of the loop would be required for the padding bits to converge and for the CAS to succeed. However, going by the symptoms alone, this doesn't seem to be the case here. The problem may well be inside libatomic, of course; I have no way to tell. One GHA run showing the issue is https://github.com/boostorg/uuid/actions/runs/8821753835, where only the GCC 13 job times out.
> The problem may well be inside libatomic, of course; I have no way to tell. Narrator: but he did, in fact, have a way to tell. This is a GHA run with GCC 9 to 13 tested on both Ubuntu 23.04 and Ubuntu 23.10, and only GCC 13 hangs. https://github.com/boostorg/uuid/actions/runs/8852038822/job/24309866206 And indeed, the codegen from GCC 12 (https://godbolt.org/z/xc7oT76fn) is radically different from (and much simpler than) the GCC 13 one (https://godbolt.org/z/eP48xv3Mr). I think that the problem has been introduced with this commit: https://github.com/gcc-mirror/gcc/commit/157236dbd621644b3cec50b6cf38811959f3e78c which, ironically enough, was supposed to improve the handling of types with padding bits, but broke them entirely. (I told the committee there was nothing wrong with compare_exchange as specified, but did they listen to me? To ask the question is to answer it.)
The issue is also present for GCC 14 on Ubuntu 24.04: https://github.com/boostorg/uuid/actions/runs/8853249656/job/24313667955
Can you provide a full runable testcase which shows the issue rather than just code snippets? And not just link to godbolt. You can attach it. In this case we don't need the preprocesed source since it is about the library that is having issue. Please make sure it only depends on the libstdc++/libc headers too.
This https://raw.githubusercontent.com/boostorg/uuid/feature/gcc_pr_114865/test/test_gcc_pr114865.cpp exhibits the problem for me on GCC 13/14. I'm only seeing the hang with -std=c++11 -m32 in the CI run because this combination is tested first, but I believe it's independent of standard level and address model.
Created attachment 58053 [details] testcase
(In reply to Peter Dimov from comment #4) > but I believe it's independent of standard level and address model. No it is dependent on the standard level. C++11 fails but C++14, C++17 and C++20 all pass.
The problem is only with auto atomic variables. Looks like the padding there is not being zeroed for -std=c++11 ...
(In reply to Andrew Pinski from comment #6) > No it is dependent on the standard level. C++11 fails but C++14, C++17 and > C++20 all pass. That's interesting because I see basically no difference in the generated code on CE between 11 and 14: --- "C++-x86-64 gcc 13.2-1 (1).asm" 2024-04-27 03:25:11.149385400 +0300 +++ "C++-x86-64 gcc 13.2-1.asm" 2024-04-27 03:24:59.207244400 +0300 @@ -76,10 +76,11 @@ jmp .L8 main: push rbx + mov eax, 8738 mov ebx, 1024 sub rsp, 16 mov QWORD PTR [rsp], 0 - mov QWORD PTR [rsp+8], 8738 + mov WORD PTR [rsp+8], ax .L14: mov rdi, rsp call generate(std::atomic<state_type>*) (https://godbolt.org/z/nexn5W4Ph)
Oh, my mistake. C++14 does mov QWORD, and C++11 does mov WORD.
#if __cplusplus >= 201402L && __has_builtin(__builtin_clear_padding) if _GLIBCXX17_CONSTEXPR (__atomic_impl::__maybe_has_padding<_Tp>()) __builtin_clear_padding(std::__addressof(_M_i)); #endif So yes it is definitely dependent on C++ level ... That is for C++14+ it is working correctly.
So, basically, C++14 and above initialize the padding of ``` std::atomic<state_type> state{{ 0, 0x2222 }}; ``` in `main` to zero, which masks the problem in `generate`. (The problem in `generate` still exists because the assembly is identical - it just doesn't trigger because the padding is zero. If we manually poke something nonzero into the padding, it would (ought to) still break.) Static variables work for the same reason - the padding is guaranteed zero.
(In reply to Peter Dimov from comment #11) > So, basically, C++14 and above initialize the padding of > > ``` > std::atomic<state_type> state{{ 0, 0x2222 }}; > ``` > > in `main` to zero, which masks the problem in `generate`. (The problem in > `generate` still exists because the assembly is identical - it just doesn't > trigger because the padding is zero. If we manually poke something nonzero > into the padding, it would (ought to) still break.) No there is no problem in generate for C++14+ because there is no way to get the padding non-zero for that variable. The problem is only with C++11's std::atomic ... See PR 111077 and such.
(In reply to Andrew Pinski from comment #10) > #if __cplusplus >= 201402L && __has_builtin(__builtin_clear_padding) > if _GLIBCXX17_CONSTEXPR (__atomic_impl::__maybe_has_padding<_Tp>()) > __builtin_clear_padding(std::__addressof(_M_i)); > #endif > > So yes it is definitely dependent on C++ level ... > That is for C++14+ it is working correctly. Oh, that's the constructor of `atomic`. I thought it was the compiler initializing the padding in C++14 and above. I wonder why `__cplusplus >= 201402L` is here.
Because a constexpr function can't have an if statement in C++11. But maybe we should just clear padding unconditionally for C++11.
--- a/libstdc++-v3/include/std/atomic +++ b/libstdc++-v3/include/std/atomic @@ -238,8 +238,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION constexpr atomic(_Tp __i) noexcept : _M_i(__i) { -#if __cplusplus >= 201402L && __has_builtin(__builtin_clear_padding) +#if __has_builtin(__builtin_clear_padding) +#if __cplusplus >= 201402L if _GLIBCXX17_CONSTEXPR (__atomic_impl::__maybe_has_padding<_Tp>()) +#endif __builtin_clear_padding(std::__addressof(_M_i)); #endif }
Alternatively, we could skip all the padding cope in compare_exchange for C++11, since that was a C++20 change and not a DR.
(In reply to Jonathan Wakely from comment #15) > --- a/libstdc++-v3/include/std/atomic > +++ b/libstdc++-v3/include/std/atomic > @@ -238,8 +238,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION > > constexpr atomic(_Tp __i) noexcept : _M_i(__i) > { > -#if __cplusplus >= 201402L && __has_builtin(__builtin_clear_padding) > +#if __has_builtin(__builtin_clear_padding) > +#if __cplusplus >= 201402L > if _GLIBCXX17_CONSTEXPR (__atomic_impl::__maybe_has_padding<_Tp>()) > +#endif > __builtin_clear_padding(std::__addressof(_M_i)); > #endif > } Oh right, that still doesn't work in C++11: /home/jwakely/gcc/15/include/c++/15.0.0/atomic: In constructor 'std::atomic<_Tp>::atomic(_Tp)': /home/jwakely/gcc/15/include/c++/15.0.0/atomic:247:7: error: 'constexpr' constructor does not have empty body I remember hitting this when I implemented the padding stuff.
So maybe: diff --git a/libstdc++-v3/include/bits/atomic_base.h b/libstdc++-v3/include/bits/atomic_base.h index aa7374bb252..662cff39df5 100644 --- a/libstdc++-v3/include/bits/atomic_base.h +++ b/libstdc++-v3/include/bits/atomic_base.h @@ -956,7 +956,9 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION constexpr bool __maybe_has_padding() { -#if ! __has_builtin(__builtin_clear_padding) + // We cannot clear padding in the constructor for C++11, + // so return false here to disable all code for zeroing padding. +#if __cplusplus < 201402L || ! __has_builtin(__builtin_clear_padding) return false; #elif __has_builtin(__has_unique_object_representations) return !__has_unique_object_representations(_Tp) diff --git a/libstdc++-v3/include/std/atomic b/libstdc++-v3/include/std/atomic index 36ff89a146c..336f27832fc 100644 --- a/libstdc++-v3/include/std/atomic +++ b/libstdc++-v3/include/std/atomic @@ -238,6 +238,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION constexpr atomic(_Tp __i) noexcept : _M_i(__i) { + // A constexpr constructor must be empty in C++11, + // so we can only clear padding for C++14 and later. #if __cplusplus >= 201402L && __has_builtin(__builtin_clear_padding) if _GLIBCXX17_CONSTEXPR (__atomic_impl::__maybe_has_padding<_Tp>()) __builtin_clear_padding(std::__addressof(_M_i));
This should work. I still don't understand why JF so insisted on all these padding shenanigans.
GCC 13.3 is being released, retargeting bugs to GCC 13.4.
(In reply to Peter Dimov from comment #19) > This should work. In simple cases, yes. But if we have a mixed C++11 and C++14 (or later) codebase, it's possible for the std::atomic to be initialized in a C++11 TU, so padding is not cleared, and the CAS to be done in a C++14 TU, so it assumes padding is cleared. And so we have the same problem.
I'm not sure how to fix that problem without pessimizing every std::atomic<T>::compare_exchange_strong to use the looping implementation that std::atomic_ref<T>::compare_exchange_strong uses, or to find some C++11-friendly way to zero padding in the constexpr std::atomic<T>::atomic(T) constructor. I think the latter would require new compiler magic (maybe an attribute on the mem-initializer that says to clear padding, or something like that).
(In reply to Jonathan Wakely from comment #22) > I'm not sure how to fix that problem without pessimizing every > std::atomic<T>::compare_exchange_strong to use the looping implementation To be clear, not *every* compare exchange, only when T has padding bits. Which suggests you would get better perf by defining your type to have no padding bits: struct state_type { std::uint64_t timestamp; std::uint16_t clock_seq; char unused[6]; };
I already have https://github.com/boostorg/uuid/blob/e7f4cebe81835fd1b5558178f3d4c40ae266d8e2/include/boost/uuid/time_generator_v1.hpp#L32-L43 but this comes with its own issues (-Wmissing-field-initializers.) Do you think that it's worth trying to roll back this in the standard? Probably not because it's only a problem on C++11 and the standard doesn't care about C++11 at this point.
Probably not, although an unintentional ABI break against C++11 for something arguably unnecessary might get attention.
I think I should push the partial solution now anyway. It's better than what we do today, as at least it will work correctly for programs that don't initialize an atomic in C++11 objects and then use CAS in C++14/17/20/... objects. Whether to pessimize CAS for all types with padding can be decided later.
GCC 13.4 is being released, retargeting bugs to GCC 13.5.
The master branch has been updated by Patrick Palka <ppalka@gcc.gnu.org>: https://gcc.gnu.org/g:e4c57e146a224d0aaa71ace78f96fca1156add24 commit r16-7199-ge4c57e146a224d0aaa71ace78f96fca1156add24 Author: Patrick Palka <ppalka@redhat.com> Date: Fri Jan 30 15:25:43 2026 -0500 c++: non-empty constexpr constructor bodies in C++11 [PR123845] This patch makes us support C++14 non-empty constexpr constructor bodies in C++11, as an extension. This will make it trivial to safely fix the C++11 library regression PR114865 that requires us to do __builtin_clear_padding after initializing _M_i in std::atomic's single-parameter constructor, and that's not really possible with the C++11 constexpr restrictions. Since we lower member initializers to constructor body statements internally, and so constructor bodies are already effectively non-empty internally even in C++11, supporting non-empty bodies in user code is mostly a matter of relaxing the parse-time error. But constexpr-ex3.C revealed that by accepting the non-empty body of A's constructor, build_data_member_initialization goes on to mistake the 'i = _i' assignment as a member initializer, and we incorrectly accept the constructor in C++11 mode (even though omitting mem-inits is only valid since C++20). Turns out this is caused by that function recognizing MODIFY_EXPR only in C++11 mode, logic that was last changed by r5-5013 (presumably to limit impact of the patch at the time) but I reckon could just be removed outright. This should be safe because the result of build_data_member_initialization is only used by cx_check_missing_mem_inits for validation; evaluation is in terms of the entire lowered constructor body. PR c++/123845 PR libstdc++/114865 gcc/cp/ChangeLog: * constexpr.cc (build_data_member_initialization): Remove C++11-specific recognition of MODIFY_EXPR. (check_constexpr_ctor_body): Relax error diagnostic to a pedwarn and don't clear DECL_DECLARED_CONSTEXPR_P upon error. Return true if complaining. gcc/testsuite/ChangeLog: * g++.dg/cpp0x/constexpr-ex3.C: Adjust C++11 non-empty constexpr constructor dg-error to a dg-warning. Expect a follow-up missing member initializer diagnostic in C++11 mode. * g++.dg/cpp2a/constexpr-try1.C: Expect a follow-up compound-statement in constexpr function diagnostic in C++11 mode. * g++.dg/cpp2a/constexpr-try2.C: Likewise. Adjust C++11 non-empty constexpr constructor dg-error to a dg-warning. * g++.dg/cpp2a/constexpr-try3.C: Adjust C++11 non-empty constexpr constructor dg-error to a dg-warning. * g++.dg/cpp0x/constexpr-ctor23.C: New test. Reviewed-by: Jason Merrill <jason@redhat.com>
The master branch has been updated by Patrick Palka <ppalka@gcc.gnu.org>: https://gcc.gnu.org/g:48f2e8aa6ddad72955781728bdf515eb50411d24 commit r16-7471-g48f2e8aa6ddad72955781728bdf515eb50411d24 Author: Patrick Palka <ppalka@redhat.com> Date: Wed Feb 11 21:35:21 2026 -0500 libstdc++: Clear padding bits in std::atomic ctor in C++11 [PR114865] After the front end change r16-7199 both GCC and Clang allow non-empty constexpr constructor bodies in C++11 as an extension, so we can now unconditionally enable the __builtin_clear_padding logic in std::atomic's constructor. PR libstdc++/114865 libstdc++-v3/ChangeLog: * include/std/atomic (atomic<_Tp>::atomic(_Tp)) [C++11]: Enable __builtin_clear_padding logic. * testsuite/29_atomics/atomic/compare_exchange_padding.cc: Enable this test in earlier modes, including C++11. * testsuite/29_atomics/atomic/cons/zero_padding.cc [C++11]: Enable tests verifying cleared padding bits for a non-static-init std::atomic object. Reviewed-by: Tomasz KamiÅski <tkaminsk@redhat.com> Reviewed-by: Jonathan Wakely <jwakely@redhat.com>