Bug 114865 - [13/14/15/16 Regression] std::atomic<X>::compare_exchange_strong seems to hang under GCC 13 for C++11
Summary: [13/14/15/16 Regression] std::atomic<X>::compare_exchange_strong seems to han...
Status: NEW
Alias: None
Product: gcc
Classification: Unclassified
Component: libstdc++ (show other bugs)
Version: 13.2.0
: P2 normal
Target Milestone: 13.5
Assignee: Not yet assigned to anyone
URL:
Keywords: wrong-code
Depends on: 123845
Blocks:
  Show dependency treegraph
 
Reported: 2024-04-26 15:50 UTC by Peter Dimov
Modified: 2026-02-12 02:36 UTC (History)
2 users (show)

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed: 2024-08-23 00:00:00


Attachments
testcase (538 bytes, text/plain)
2024-04-27 00:16 UTC, Andrew Pinski
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Peter Dimov 2024-04-26 15:50:28 UTC
I'm getting weird hangs on Github Actions when using `std::atomic<state_type>::compare_exchange_strong` under GCC 13 on Ubuntu 23.04 (only; GCC 12 and earlier on Ubuntu 22.04 and earlier work). `state_type` is defined as

```
struct state_type
{
    std::uint64_t timestamp;
    std::uint16_t clock_seq;
};
```

and the code doing the CAS is

```
auto oldst = ps_->load( std::memory_order_relaxed );

for( ;; )
{
    auto newst = get_new_state( oldst );

    if( ps_->compare_exchange_strong( oldst, newst, std::memory_order_relaxed, std::memory_order_relaxed ) )
    {
        state_ = newst;
        break;
    }
}
```

where `ps` is of type `std::atomic<state_type>*`.

At a glance, I see nothing immediately wrong with the generated code (https://godbolt.org/z/8Ee3hrTz8).

However, when I change `state_type` to

```
struct state_type
{
    std::uint64_t timestamp;
    std::uint16_t clock_seq;
    std::uint16_t padding[ 3 ];
};
```
the hangs disappear. This leads me to think that the problem is caused by the original struct having padding, which isn't being handled correctly for some reason.

As we know, `std::atomic<T>::compare_exchange_strong` is carefully specified to take and return `expected` by reference, such that it can both compare the entire object as if via `memcmp` (including the padding), and return it as if by `memcpy`, again including the padding. Even though the padding bits of the initial value returned by the atomic load are unspecified, at most one iteration of the loop would be required for the padding bits to converge and for the CAS to succeed.

However, going by the symptoms alone, this doesn't seem to be the case here.

The problem may well be inside libatomic, of course; I have no way to tell.

One GHA run showing the issue is https://github.com/boostorg/uuid/actions/runs/8821753835, where only the GCC 13 job times out.
Comment 1 Peter Dimov 2024-04-26 17:47:10 UTC
> The problem may well be inside libatomic, of course; I have no way to tell.

Narrator: but he did, in fact, have a way to tell.

This is a GHA run with GCC 9 to 13 tested on both Ubuntu 23.04 and Ubuntu 23.10, and only GCC 13 hangs.

https://github.com/boostorg/uuid/actions/runs/8852038822/job/24309866206

And indeed, the codegen from GCC 12 (https://godbolt.org/z/xc7oT76fn) is radically different from (and much simpler than) the GCC 13 one (https://godbolt.org/z/eP48xv3Mr).

I think that the problem has been introduced with this commit:

https://github.com/gcc-mirror/gcc/commit/157236dbd621644b3cec50b6cf38811959f3e78c

which, ironically enough, was supposed to improve the handling of types with padding bits, but broke them entirely.

(I told the committee there was nothing wrong with compare_exchange as specified, but did they listen to me? To ask the question is to answer it.)
Comment 2 Peter Dimov 2024-04-26 19:40:03 UTC
The issue is also present for GCC 14 on Ubuntu 24.04:

https://github.com/boostorg/uuid/actions/runs/8853249656/job/24313667955
Comment 3 Andrew Pinski 2024-04-26 23:16:43 UTC
Can you provide a full runable testcase which shows the issue rather than just code snippets? 

And not just link to godbolt. You can attach it. In this case we don't need the preprocesed source since it is about the library that is having issue. Please make sure it only depends on the libstdc++/libc headers too.
Comment 4 Peter Dimov 2024-04-27 00:13:16 UTC
This

https://raw.githubusercontent.com/boostorg/uuid/feature/gcc_pr_114865/test/test_gcc_pr114865.cpp

exhibits the problem for me on GCC 13/14. I'm only seeing the hang with -std=c++11 -m32 in the CI run because this combination is tested first, but I believe it's independent of standard level and address model.
Comment 5 Andrew Pinski 2024-04-27 00:16:18 UTC
Created attachment 58053 [details]
testcase
Comment 6 Andrew Pinski 2024-04-27 00:22:26 UTC
(In reply to Peter Dimov from comment #4)
>  but I believe it's independent of standard level and address model.

No it is dependent on the standard level. C++11 fails but C++14, C++17 and C++20 all pass.
Comment 7 Andrew Pinski 2024-04-27 00:25:53 UTC
The problem is only with auto atomic variables. Looks like the padding there is not being zeroed for -std=c++11 ...
Comment 8 Peter Dimov 2024-04-27 00:29:45 UTC
(In reply to Andrew Pinski from comment #6)
> No it is dependent on the standard level. C++11 fails but C++14, C++17 and
> C++20 all pass.

That's interesting because I see basically no difference in the generated code on CE between 11 and 14:

--- "C++-x86-64 gcc 13.2-1 (1).asm"     2024-04-27 03:25:11.149385400 +0300
+++ "C++-x86-64 gcc 13.2-1.asm" 2024-04-27 03:24:59.207244400 +0300
@@ -76,10 +76,11 @@
         jmp     .L8
 main:
         push    rbx
+        mov     eax, 8738
         mov     ebx, 1024
         sub     rsp, 16
         mov     QWORD PTR [rsp], 0
-        mov     QWORD PTR [rsp+8], 8738
+        mov     WORD PTR [rsp+8], ax
 .L14:
         mov     rdi, rsp
         call    generate(std::atomic<state_type>*)

(https://godbolt.org/z/nexn5W4Ph)
Comment 9 Peter Dimov 2024-04-27 00:31:23 UTC
Oh, my mistake. C++14 does mov QWORD, and C++11 does mov WORD.
Comment 10 Andrew Pinski 2024-04-27 00:35:57 UTC
#if __cplusplus >= 201402L && __has_builtin(__builtin_clear_padding)
        if _GLIBCXX17_CONSTEXPR (__atomic_impl::__maybe_has_padding<_Tp>())
          __builtin_clear_padding(std::__addressof(_M_i));
#endif

So yes it is definitely dependent on C++ level ...
That is for C++14+ it is working correctly.
Comment 11 Peter Dimov 2024-04-27 00:38:35 UTC
So, basically, C++14 and above initialize the padding of

```
    std::atomic<state_type> state{{ 0, 0x2222 }};
```

in `main` to zero, which masks the problem in `generate`. (The problem in `generate` still exists because the assembly is identical - it just doesn't trigger because the padding is zero. If we manually poke something nonzero into the padding, it would (ought to) still break.)

Static variables work for the same reason - the padding is guaranteed zero.
Comment 12 Andrew Pinski 2024-04-27 00:42:35 UTC
(In reply to Peter Dimov from comment #11)
> So, basically, C++14 and above initialize the padding of
> 
> ```
>     std::atomic<state_type> state{{ 0, 0x2222 }};
> ```
> 
> in `main` to zero, which masks the problem in `generate`. (The problem in
> `generate` still exists because the assembly is identical - it just doesn't
> trigger because the padding is zero. If we manually poke something nonzero
> into the padding, it would (ought to) still break.)

No there is no problem in generate for C++14+ because there is no way to get the padding non-zero for that variable.

The problem is only with C++11's std::atomic ...
See PR 111077 and such.
Comment 13 Peter Dimov 2024-04-27 00:42:51 UTC
(In reply to Andrew Pinski from comment #10)
> #if __cplusplus >= 201402L && __has_builtin(__builtin_clear_padding)
>         if _GLIBCXX17_CONSTEXPR (__atomic_impl::__maybe_has_padding<_Tp>())
>           __builtin_clear_padding(std::__addressof(_M_i));
> #endif
> 
> So yes it is definitely dependent on C++ level ...
> That is for C++14+ it is working correctly.

Oh, that's the constructor of `atomic`. I thought it was the compiler initializing the padding in C++14 and above.

I wonder why `__cplusplus >= 201402L` is here.
Comment 14 Jonathan Wakely 2024-05-02 11:40:10 UTC
Because a constexpr function can't have an if statement in C++11.

But maybe we should just clear padding unconditionally for C++11.
Comment 15 Jonathan Wakely 2024-05-02 11:41:50 UTC
--- a/libstdc++-v3/include/std/atomic
+++ b/libstdc++-v3/include/std/atomic
@@ -238,8 +238,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
       constexpr atomic(_Tp __i) noexcept : _M_i(__i)
       {
-#if __cplusplus >= 201402L && __has_builtin(__builtin_clear_padding)
+#if __has_builtin(__builtin_clear_padding)
+#if __cplusplus >= 201402L
        if _GLIBCXX17_CONSTEXPR (__atomic_impl::__maybe_has_padding<_Tp>())
+#endif
          __builtin_clear_padding(std::__addressof(_M_i));
 #endif
       }
Comment 16 Jonathan Wakely 2024-05-02 11:44:47 UTC
Alternatively, we could skip all the padding cope in compare_exchange for C++11, since that was a C++20 change and not a DR.
Comment 17 Jonathan Wakely 2024-05-02 11:47:14 UTC
(In reply to Jonathan Wakely from comment #15)
> --- a/libstdc++-v3/include/std/atomic
> +++ b/libstdc++-v3/include/std/atomic
> @@ -238,8 +238,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>  
>        constexpr atomic(_Tp __i) noexcept : _M_i(__i)
>        {
> -#if __cplusplus >= 201402L && __has_builtin(__builtin_clear_padding)
> +#if __has_builtin(__builtin_clear_padding)
> +#if __cplusplus >= 201402L
>         if _GLIBCXX17_CONSTEXPR (__atomic_impl::__maybe_has_padding<_Tp>())
> +#endif
>           __builtin_clear_padding(std::__addressof(_M_i));
>  #endif
>        }

Oh right, that still doesn't work in C++11:

/home/jwakely/gcc/15/include/c++/15.0.0/atomic: In constructor 'std::atomic<_Tp>::atomic(_Tp)':
/home/jwakely/gcc/15/include/c++/15.0.0/atomic:247:7: error: 'constexpr' constructor does not have empty body

I remember hitting this when I implemented the padding stuff.
Comment 18 Jonathan Wakely 2024-05-02 11:55:26 UTC
So maybe:

diff --git a/libstdc++-v3/include/bits/atomic_base.h b/libstdc++-v3/include/bits/atomic_base.h
index aa7374bb252..662cff39df5 100644
--- a/libstdc++-v3/include/bits/atomic_base.h
+++ b/libstdc++-v3/include/bits/atomic_base.h
@@ -956,7 +956,9 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
       constexpr bool
       __maybe_has_padding()
       {
-#if ! __has_builtin(__builtin_clear_padding)
+       // We cannot clear padding in the constructor for C++11,
+       // so return false here to disable all code for zeroing padding.
+#if __cplusplus < 201402L || ! __has_builtin(__builtin_clear_padding)
        return false;
 #elif __has_builtin(__has_unique_object_representations)
        return !__has_unique_object_representations(_Tp)
diff --git a/libstdc++-v3/include/std/atomic b/libstdc++-v3/include/std/atomic
index 36ff89a146c..336f27832fc 100644
--- a/libstdc++-v3/include/std/atomic
+++ b/libstdc++-v3/include/std/atomic
@@ -238,6 +238,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
       constexpr atomic(_Tp __i) noexcept : _M_i(__i)
       {
+       // A constexpr constructor must be empty in C++11,
+       // so we can only clear padding for C++14 and later.
 #if __cplusplus >= 201402L && __has_builtin(__builtin_clear_padding)
        if _GLIBCXX17_CONSTEXPR (__atomic_impl::__maybe_has_padding<_Tp>())
          __builtin_clear_padding(std::__addressof(_M_i));
Comment 19 Peter Dimov 2024-05-02 15:13:49 UTC
This should work.

I still don't understand why JF so insisted on all these padding shenanigans.
Comment 20 Jakub Jelinek 2024-05-21 09:20:20 UTC
GCC 13.3 is being released, retargeting bugs to GCC 13.4.
Comment 21 Jonathan Wakely 2025-02-21 21:37:54 UTC
(In reply to Peter Dimov from comment #19)
> This should work.

In simple cases, yes. But if we have a mixed C++11 and C++14 (or later) codebase, it's possible for the std::atomic to be initialized in a C++11 TU, so padding is not cleared, and the CAS to be done in a C++14 TU, so it assumes padding is cleared. And so we have the same problem.
Comment 22 Jonathan Wakely 2025-02-22 12:36:17 UTC
I'm not sure how to fix that problem without pessimizing every std::atomic<T>::compare_exchange_strong to use the looping implementation that std::atomic_ref<T>::compare_exchange_strong uses, or to find some C++11-friendly way to zero padding in the constexpr std::atomic<T>::atomic(T) constructor. I think the latter would require new compiler magic (maybe an attribute on the mem-initializer that says to clear padding, or something like that).
Comment 23 Jonathan Wakely 2025-02-22 12:48:38 UTC
(In reply to Jonathan Wakely from comment #22)
> I'm not sure how to fix that problem without pessimizing every
> std::atomic<T>::compare_exchange_strong to use the looping implementation

To be clear, not *every* compare exchange, only when T has padding bits.

Which suggests you would get better perf by defining your type to have no padding bits:

struct state_type
{
    std::uint64_t timestamp;
    std::uint16_t clock_seq;
    char unused[6];
};
Comment 24 Peter Dimov 2025-02-22 15:36:29 UTC
I already have

https://github.com/boostorg/uuid/blob/e7f4cebe81835fd1b5558178f3d4c40ae266d8e2/include/boost/uuid/time_generator_v1.hpp#L32-L43

but this comes with its own issues (-Wmissing-field-initializers.)

Do you think that it's worth trying to roll back this in the standard? Probably not because it's only a problem on C++11 and the standard doesn't care about C++11 at this point.
Comment 25 Jonathan Wakely 2025-02-22 21:30:41 UTC
Probably not, although an unintentional ABI break against C++11 for something arguably unnecessary might get attention.
Comment 26 Jonathan Wakely 2025-02-24 14:34:45 UTC
I think I should push the partial solution now anyway. It's better than what we do today, as at least it will work correctly for programs that don't initialize an atomic in C++11 objects and then use CAS in C++14/17/20/... objects.

Whether to pessimize CAS for all types with padding can be decided later.
Comment 27 Jakub Jelinek 2025-06-05 17:11:05 UTC
GCC 13.4 is being released, retargeting bugs to GCC 13.5.
Comment 28 GCC Commits 2026-01-30 20:27:03 UTC
The master branch has been updated by Patrick Palka <ppalka@gcc.gnu.org>:

https://gcc.gnu.org/g:e4c57e146a224d0aaa71ace78f96fca1156add24

commit r16-7199-ge4c57e146a224d0aaa71ace78f96fca1156add24
Author: Patrick Palka <ppalka@redhat.com>
Date:   Fri Jan 30 15:25:43 2026 -0500

    c++: non-empty constexpr constructor bodies in C++11 [PR123845]
    
    This patch makes us support C++14 non-empty constexpr constructor bodies
    in C++11, as an extension.  This will make it trivial to safely fix the
    C++11 library regression PR114865 that requires us to do
    __builtin_clear_padding after initializing _M_i in std::atomic's
    single-parameter constructor, and that's not really possible with the
    C++11 constexpr restrictions.
    
    Since we lower member initializers to constructor body statements
    internally, and so constructor bodies are already effectively non-empty
    internally even in C++11, supporting non-empty bodies in user code is
    mostly a matter of relaxing the parse-time error.
    
    But constexpr-ex3.C revealed that by accepting the non-empty body of A's
    constructor, build_data_member_initialization goes on to mistake the
    'i = _i' assignment as a member initializer, and we incorrectly accept
    the constructor in C++11 mode (even though omitting mem-inits is only
    valid since C++20).  Turns out this is caused by that function
    recognizing MODIFY_EXPR only in C++11 mode, logic that was last changed
    by r5-5013 (presumably to limit impact of the patch at the time) but I
    reckon could just be removed outright.  This should be safe because the
    result of build_data_member_initialization is only used by
    cx_check_missing_mem_inits for validation; evaluation is in terms of
    the entire lowered constructor body.
    
            PR c++/123845
            PR libstdc++/114865
    
    gcc/cp/ChangeLog:
    
            * constexpr.cc (build_data_member_initialization): Remove
            C++11-specific recognition of MODIFY_EXPR.
            (check_constexpr_ctor_body): Relax error diagnostic to a
            pedwarn and don't clear DECL_DECLARED_CONSTEXPR_P upon
            error.  Return true if complaining.
    
    gcc/testsuite/ChangeLog:
    
            * g++.dg/cpp0x/constexpr-ex3.C: Adjust C++11 non-empty
            constexpr constructor dg-error to a dg-warning.  Expect
            a follow-up missing member initializer diagnostic in C++11 mode.
            * g++.dg/cpp2a/constexpr-try1.C: Expect a follow-up
            compound-statement in constexpr function diagnostic in C++11
            mode.
            * g++.dg/cpp2a/constexpr-try2.C: Likewise.  Adjust C++11
            non-empty constexpr constructor dg-error to a dg-warning.
            * g++.dg/cpp2a/constexpr-try3.C:  Adjust C++11 non-empty
            constexpr constructor dg-error to a dg-warning.
            * g++.dg/cpp0x/constexpr-ctor23.C: New test.
    
    Reviewed-by: Jason Merrill <jason@redhat.com>
Comment 29 GCC Commits 2026-02-12 02:36:03 UTC
The master branch has been updated by Patrick Palka <ppalka@gcc.gnu.org>:

https://gcc.gnu.org/g:48f2e8aa6ddad72955781728bdf515eb50411d24

commit r16-7471-g48f2e8aa6ddad72955781728bdf515eb50411d24
Author: Patrick Palka <ppalka@redhat.com>
Date:   Wed Feb 11 21:35:21 2026 -0500

    libstdc++: Clear padding bits in std::atomic ctor in C++11 [PR114865]
    
    After the front end change r16-7199 both GCC and Clang allow non-empty
    constexpr constructor bodies in C++11 as an extension, so we can now
    unconditionally enable the __builtin_clear_padding logic in std::atomic's
    constructor.
    
            PR libstdc++/114865
    
    libstdc++-v3/ChangeLog:
    
            * include/std/atomic (atomic<_Tp>::atomic(_Tp)) [C++11]:
            Enable __builtin_clear_padding logic.
            * testsuite/29_atomics/atomic/compare_exchange_padding.cc: Enable
            this test in earlier modes, including C++11.
            * testsuite/29_atomics/atomic/cons/zero_padding.cc [C++11]:
            Enable tests verifying cleared padding bits for a non-static-init
            std::atomic object.
    
    Reviewed-by: Tomasz KamiÅski <tkaminsk@redhat.com>
    Reviewed-by: Jonathan Wakely <jwakely@redhat.com>