Bug 109470 - unexpected const & behavior
Summary: unexpected const & behavior
Status: RESOLVED INVALID
Alias: None
Product: gcc
Classification: Unclassified
Component: c++ (show other bugs)
Version: unknown
: P3 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2023-04-11 07:43 UTC by Johannes Kellner
Modified: 2023-04-11 10:37 UTC (History)
2 users (show)

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Johannes Kellner 2023-04-11 07:43:50 UTC
Hello, I am filing this ticket, as I had source code, where this funny Min template created a seg fault when compiling release (optimized code).
I changed the this to an example - to check the behavior and the first to show is this UNEXPECTED/INVALID? behavior:

#include <string>
#include <cassert>

template<typename TYPE>
const TYPE& Min(const TYPE& lhs, const TYPE& rhs)
{
   if (lhs < rhs) return lhs;
   return rhs;
}

int main()
{
   size_t lenght = 16;
   const int MAX = 32;

   const int& dst = Min(MAX, (int)lenght);
   assert(dst <= MAX);
   return dst;
}

Expected behavior: return 16

If compiled with gcc 12.2 and -O0: returns 16
If compiled with gcc 12.2 and -O1: returns 0  UNEXPECTED !!!

I assume, that something with the extension of lifetime, 
for the result of `(int)lenght` fails in the optimized code.

I found this unexpected behavior, because this Min function was used to calculate the size for a memcpy. And if compiled with optimization, dst became invalid (very large), just after the call to Min.

We tested with gcc10 but have the same problem with gcc12.

Please feel free to contact me for details.
Comment 1 Xi Ruoyao 2023-04-11 08:25:32 UTC
The standard says:

A temporary object bound to a reference parameter in a function call persists until the completion of the full-expression containing the call.

So at the "assert" line the lifetime of the temporary object created by the prvalue to glvalue materialization of (int)lenght has already ended.  Any reference to the temporary is dangling and it's undefined behavior to use such a reference.
Comment 2 Xi Ruoyao 2023-04-11 08:36:29 UTC
With "-Wall -O1" this is diagnosed properly, but with a spurious maybe-uninitialized warning:

In file included from /usr/include/c++/12.2.0/cassert:44,
                 from t.c:2:
t.c: In function 'int main()':
t.c:17:11: warning: dangling pointer 'dst' to an unnamed temporary may be used [-Wdangling-pointer=]
   17 |    assert(dst <= MAX);
      |           ^~~
t.c:16:24: note: unnamed temporary defined here
   16 |    const int& dst = Min(MAX, (int)lenght);
      |                     ~~~^~~~~~~~~~~~~~~~~~
t.c:16:24: warning: '<anonymous>' may be used uninitialized [-Wmaybe-uninitialized]

With "-Wall -O2" only the spurious maybe-uninitialized warning is emitted, which is not very helpful.

With "-Wall -O0" no warning at all (diagnosing this issue at least needs some IPA).
Comment 3 Johannes Kellner 2023-04-11 09:48:42 UTC
'A temporary object bound to a reference parameter in a function call persists until the completion of the full-expression containing the call.'

So this does not mean, that the temporary object, (int)lenght, should life until the end of main ??? As main() is the enclosing scope?
Comment 4 Andrew Pinski 2023-04-11 09:52:56 UTC
The function call in this case is Min.

So after the semicolon of the definition of dst, the temp is destroyed.
Comment 5 Jonathan Wakely 2023-04-11 09:56:00 UTC
(In reply to Johannes Kellner from comment #3)
> 'A temporary object bound to a reference parameter in a function call
> persists until the completion of the full-expression containing the call.'
> 
> So this does not mean, that the temporary object, (int)lenght, should life
> until the end of main ??? As main() is the enclosing scope?

No, because it says "completion of the full-expression" not "end of the enclosing scope".

The full-expression ends at the semi-colon.
Comment 6 Johannes Kellner 2023-04-11 10:23:33 UTC
Ok, Ok :)
It's not to me to argue this.

It's just an unexpected behavior (something I was unaware off/ something that does not happen when doing the same code with other compilers clang/msvc).

And in my humble opinion - `full-expression containing the call` could be understood as - until end of enclosing scope.
Or at least if understood this way, I would not have the uninitialized variable :)

I just had a stupid error and found this as the reason. We discussed it and even others in my team, assumed that it might be a bug. As we were not able to reproduce this behavior in other compilers we tested.
Nor did see this as `invalid` code - `ugly` yes, but not `invalid`. And yes, this code is 'not the best possible way'...

Well maybe you take this ticket as a reason to `review` the interpretation, that `full-expression containing the call` does not mean until the `end of enclosing scope`...

For me, I'm fine with current outcome.
Thank you anyway for the quick, very friendly and professional responses!

Best regards
Comment 7 Jakub Jelinek 2023-04-11 10:36:14 UTC
You can use -fsanitize=address to get such bugs diagnosed at runtime:
g++ -fsanitize=address -o /tmp/pr109470{,.C} -g; /tmp/pr109470
=================================================================
==2466554==ERROR: AddressSanitizer: stack-use-after-scope on address 0x7ffe08313d80 at pc 0x000000401304 bp 0x7ffe08313d20 sp 0x7ffe08313d18
READ of size 4 at 0x7ffe08313d80 thread T0
    #0 0x401303 in main /tmp/pr109470.C:17
    #1 0x7f5b06f7958f in __libc_start_call_main (/lib64/libc.so.6+0x2958f)
    #2 0x7f5b06f79648 in __libc_start_main@GLIBC_2.2.5 (/lib64/libc.so.6+0x29648)
    #3 0x4010e4 in _start (/tmp/pr109470+0x4010e4)

Address 0x7ffe08313d80 is located in stack of thread T0 at offset 64 in frame
    #0 0x4011b5 in main /tmp/pr109470.C:12

  This frame has 2 object(s):
    [48, 52) 'MAX' (line 14)
    [64, 68) '<unknown>' <== Memory access at offset 64 is inside this variable
HINT: this may be a false positive if your program uses some custom stack unwind mechanism, swapcontext or vfork
      (longjmp and C++ exceptions *are* supported)
SUMMARY: AddressSanitizer: stack-use-after-scope /tmp/pr109470.C:17 in main
Shadow bytes around the buggy address:
  0x10004105a760: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x10004105a770: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x10004105a780: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x10004105a790: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x10004105a7a0: 00 00 00 00 00 00 00 00 f1 f1 f1 f1 f1 f1 04 f2
=>0x10004105a7b0:[f8]f3 f3 f3 00 00 00 00 00 00 00 00 00 00 00 00
  0x10004105a7c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x10004105a7d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x10004105a7e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x10004105a7f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x10004105a800: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==2466554==ABORTING

C++ has always behaved like this and changing it would significantly penalize all the code in the wild that uses C++ correctly.  There are some exceptions where the lifetime is extended, but is certainly not one of them.
Comment 8 Jonathan Wakely 2023-04-11 10:37:06 UTC
No, "full-expression" is a formal term defined very precisely in the C++ standard. There is no opportunity for GCC to review that without failing to conform to the C++ standard. Changing when temporary objects are destroyed would be a massive breaking change to the C++ language that would break assumptions made by correct code.

Just because you don't get a warning with other compilers, doesn't mean your code is correct. The code accesses an object outside its lifetime, and so has undefined behaviour. That is true with all compilers.

Clang gives a runtime error with -fsanitize=address e.g. https://godbolt.org/z/dhcEhvzze

That's because the program has undefined behaviour. This is not just GCC's interpretation of the C++ standard, it's a fact.