Bug 118074

Summary: [coroutine] Initialize return value before destruction of coroutine
Product: gcc Reporter: Weibo He <newsigma>
Component: c++Assignee: Not yet assigned to anyone <unassigned>
Status: UNCONFIRMED ---    
Severity: normal CC: daniel.kruegler, enrico.seiler+gccbugs, webrown.cpp, zwuis
Priority: P3 Keywords: c++-coroutines, wrong-code
Version: 15.0   
Target Milestone: ---   
See Also: https://github.com/llvm/llvm-project/issues/120200
Host: Target:
Build: Known to work:
Known to fail: Last reconfirmed:

Description Weibo He 2024-12-17 02:19:55 UTC
Hi all,

I want to discuss an unusual usage of coroutine and CRTP. The following code gives the expected result using MSVC, but a unpredictable result using GCC and clang.

https://godbolt.org/z/r97MbdMhT

```
#include <cstdio>
#include <utility>
#include <coroutine>

template<class T>
struct CRCoro { // Curiously Recurring Coroutine
    struct RValueWrapper {
        T* p;

        operator T&&() const noexcept { return std::move(*p); }
    };

    using promise_type = T;

    T& getDerived() noexcept { return *static_cast<T*>(this); }

    auto get_return_object() noexcept { return RValueWrapper(&getDerived()); }
    std::suspend_never initial_suspend() noexcept { return {}; }
    std::suspend_never final_suspend() noexcept { return {}; }
    void return_value(T&& x) noexcept {
        getDerived() = std::move(x);
        //asm volatile("" : : "r,m"(getDerived()) : "memory"); // Forcing assignment has observable side effect
    }
    void unhandled_exception() {}
};

struct A : public CRCoro<A> {
    int a;
};

A func() {
    A aa{};
    aa.a = 5;
    co_return std::move(aa);
}

int main() {
    printf("%d\n", func().a);
    return 0;
}
```
Expected result (MSVC): 5

Build option: -std=c++20 -fcoroutines
GCC 10.5, 11.4, 12.4, 13.3, 14.2: 5
GCC trunk: 0

Build option: -std=c++20 -fcoroutines -O2
GCC 10.5, 11.4, 12.4, 13.3, trunk: 0
GCC 14.2: 5

Forcing assignment has observable side effect is a workaround for me. But the trick does not work for trunk. Is it the expected behavior or a bug?
Comment 1 Andrew Pinski 2024-12-17 02:42:08 UTC
Looks like a bad interaction with NRV for the trunk.

For 14 and before we get:
  operator delete (_4);
  D.12501 = MEM[(struct A &)_4 + 16];


That is even true at -O0:
Program returned: 1
=================================================================
==1==ERROR: AddressSanitizer: heap-use-after-free on address 0x504000000020 at pc 0x000000401466 bp 0x7fffa8da9120 sp 0x7fffa8da9118
READ of size 4 at 0x504000000020 thread T0
    #0 0x401465 in func() (/app/output.s+0x401465) (BuildId: fa1e65f64e07304f23fb285cba050c7c7c09d52a)
    #1 0x4019f1 in main (/app/output.s+0x4019f1) (BuildId: fa1e65f64e07304f23fb285cba050c7c7c09d52a)
    #2 0x761989e29d8f  (/lib/x86_64-linux-gnu/libc.so.6+0x29d8f) (BuildId: 490fef8403240c91833978d494d39e537409b92e)
    #3 0x761989e29e3f in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x29e3f) (BuildId: 490fef8403240c91833978d494d39e537409b92e)
    #4 0x4011b4 in _start (/app/output.s+0x4011b4) (BuildId: fa1e65f64e07304f23fb285cba050c7c7c09d52a)

0x504000000020 is located 16 bytes inside of 48-byte region [0x504000000010,0x504000000040)
freed by thread T0 here:


>The following code gives the expected result using MSVC

Is there a way to check to make sure you are not using the value after deconstruction for MSVC too.

I am suspecting this is just undefined code.
Comment 2 Weibo He 2024-12-17 03:44:56 UTC
(In reply to Andrew Pinski from comment #1)
> Is there a way to check to make sure you are not using the value after
> deconstruction for MSVC too.

Thank you for your comment @Andrew Pinski.

We can take the address of return object and promise object. Duplicate codes are omitted here:

https://godbolt.org/z/fW335zTdG

```
template<class T>
struct CRCoro {
    void return_value(T&& x) noexcept {
        printf("%p\n", this);
        getDerived() = std::move(x);
    }
};

int main() {
    A aa = func();
    printf("%p\n", &aa);
    return 0;
}
```

The output is:
00C985D0
007DFB98
5

The pointers point to different memory and we can conclude that we are not using the value after promise object deconstruction.

> I am suspecting this is just undefined code.

I am not quite sure if it is undefined. And I am wondering if it is reasonable to consider it as an extension if it indeed is.

Please let me know if I might have made a mistake. Thanks again.
Comment 3 Weibo He 2024-12-17 05:22:10 UTC
Intert a printf into final_suspend helps clang and GCC 10 ~ 14 give expected result. While GCC trunk not work.

https://godbolt.org/z/rEhevKa8h

Output(MSVC, clang, GCC 10 ~ 14):
    final_suspend
    A(A&&) 5
    5

Output(GCC trunk):
    A(A&&) 0
    final_suspend
    0

Is it a GCC 15 regression?
Comment 4 Andrew Pinski 2024-12-17 05:36:12 UTC
(In reply to Weibo He from comment #3)
> Intert a printf into final_suspend helps clang and GCC 10 ~ 14 give expected
> result. While GCC trunk not work. 
> Is it a GCC 15 regression?

It does not help -O0 gcc nor -O0 clang, both still give runtime errors with -fsanitize=address .  clang -O2 does not give a runtime error with -fsanitize=address but I can only assume it is able to optimize away the load after the delete .

So I tried MSVC with /fsanitize=address but the dll is not installed on godbolt so I have no way to fully test it.
Comment 5 Weibo He 2024-12-17 08:21:04 UTC
Compiled locally using MSVC with /fsanitize=address

D:\main\build>cl /std:c++20 /fsanitize=address /Zi D:\main\src\main.cpp
Microsoft (R) C/C++ Optimizing Compiler Version 19.42.34435 for x64
Copyright (C) Microsoft Corporation.  All rights reserved.

main.cpp
Microsoft (R) Incremental Linker Version 14.42.34435.0
Copyright (C) Microsoft Corporation.  All rights reserved.

/out:main.exe
/InferAsanLibs
/debug
main.obj

D:\main\build>.\main.exe

5

D:\main\build>cl /std:c++20 /fsanitize=address /Zi /O2 D:\main\src\main.cpp
Microsoft (R) C/C++ Optimizing Compiler Version 19.42.34435 for x64
Copyright (C) Microsoft Corporation.  All rights reserved.

main.cpp
Microsoft (R) Incremental Linker Version 14.42.34435.0
Copyright (C) Microsoft Corporation.  All rights reserved.

/out:main.exe
/InferAsanLibs
/debug
main.obj

D:\main\build>.\main.exe

5
Comment 6 Yanzuo Liu 2024-12-19 08:23:32 UTC
I believe this has undefined behaviour.

When the result object is initialized from `promise.get_return_object()` is unclear currently (See [cwg2563](https://wg21.link/cwg2563)).

- If it is initialized before calling `promise.return_value()`, `promise.a` is uninitialized.

- If it is initialized after destroying `promise`, `promise.get_return_object().p` becomes a dangling pointer.
Comment 7 Weibo He 2024-12-20 09:41:28 UTC
Thanks, @Yanzuo Liu

It seems there are three issues with the bug:

1. [GCC 15 Regression]: Bad interaction with NRV for the trunk

2. [GCC 15 Regression]: Return object is being prematurely converted

Clang community resolved a similar issue: https://github.com/llvm/llvm-project/issues/56532

https://godbolt.org/z/aE7jMxoE5

Expected Output(GCC 10 ~ 14):
    return_void
    final suspend
    A&&

Wrong Output(GCC 15):
    A&&
    return_void
    final suspend

3. [Undefined behaviour]: When should the result object is initialized?

(1) According to CWG2563(https://github.com/GorNishanov/await/blob/master/2023_Issaquah/cwg2563-response.md), for the third common pattern of coroutine:

> get_return_object returns a proxy, coroutine body executes, conversion to return type happens when coroutine return the caller.
> Need coroutine body to execute before initialising the return-value since coroutine body produces the result.

which says conversion occurs when a coroutine returns to the caller. Referring to cppreference, we can conclude that conversion happens after await_suspend() if and only if it returns to the caller/resumer.

> if await_suspend returns void, control is immediately returned to the caller/resumer of the current coroutine (this coroutine remains suspended), otherwise
> if await_suspend returns bool,
>     the value true returns control to the caller/resumer of the current coroutine
>     the value false resumes the current coroutine.
> if await_suspend returns a coroutine handle for some other coroutine, that handle is resumed (by a call to handle.resume()) (note this may chain to eventually cause the current coroutine to resume).
> if await_suspend throws an exception, the exception is caught, the coroutine is resumed, and the exception is immediately re-thrown.

Note that return_value() happens before final_suspend(), I consider the behavior is well defined.

(2) Agree with your second point:

9.5.4 [dcl.fct.def.coroutine] paragraph 11 reads:

> The coroutine state is destroyed when control flows off the end of the coroutine
> or the destroy member function ([coroutine.handle.resumption]) of a coroutine handle ([coroutine.handle]) that refers to the coroutine is invoked.

If I am not misunderstanding, 'flows off the end of the coroutine' occurs before 'returning to the caller'. It is unclear whether return value initialization happens before or after destruction. And the wording 'flows off the end of the coroutine' seems too strict. I suggest the following change:

> The coroutine state is destroyed when returns from the await_resume() of final_suspend
> or the destroy member function ([coroutine.handle.resumption]) of a coroutine handle ([coroutine.handle]) that refers to the coroutine is invoked.

Note that initialization will be done before destruction of coroutine state.

Maybe we should seperate this bug into three.