Bug 109299 - [12 Regression] wrong warning on std::wstring with -O2 -std=c++20 -D_FORTIFY_SOURCE=2
Summary: [12 Regression] wrong warning on std::wstring with -O2 -std=c++20 -D_FORTIFY_...
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: libstdc++ (show other bugs)
Version: 12.2.0
: P2 normal
Target Milestone: 12.3
Assignee: Not yet assigned to anyone
URL:
Keywords: diagnostic
Depends on:
Blocks:
 
Reported: 2023-03-27 15:01 UTC by Benjamin Buch
Modified: 2023-05-02 20:11 UTC (History)
4 users (show)

See Also:
Host:
Target:
Build:
Known to work: 11.3.0
Known to fail: 12.2.0, 13.0
Last reconfirmed: 2023-03-28 00:00:00


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Benjamin Buch 2023-03-27 15:01:50 UTC
I run into this by using googletest with CMake in release mode.

```
#include <string>

static std::wstring foo(std::wstring text = {}) {
    text.resize(42);
    return text;
}

int main() {
    foo();
}
```

```
$ g++ -O2 -std=c++20 -D_FORTIFY_SOURCE=2 main.cpp 
In file included from /usr/include/features.h:486,
                 from /usr/include/x86_64-linux-gnu/c++/12/bits/os_defines.h:39,
                 from /usr/include/x86_64-linux-gnu/c++/12/bits/c++config.h:655,
                 from /usr/include/c++/12/string:38,
                 from main.cpp:1:
In function ‘wchar_t* wmemcpy(wchar_t*, const wchar_t*, size_t)’,
    inlined from ‘static constexpr std::char_traits<wchar_t>::char_type* std::char_traits<wchar_t>::copy(char_type*, const char_type*, std::size_t)’ at /usr/include/c++/12/bits/char_traits.h:558:16,
    inlined from ‘constexpr std::__cxx11::basic_string<_CharT, _Traits, _Alloc>::basic_string(std::__cxx11::basic_string<_CharT, _Traits, _Alloc>&&) [with _CharT = wchar_t; _Traits = std::char_traits<wchar_t>; _Alloc = std::allocator<wchar_t>]’ at /usr/include/c++/12/bits/basic_string.h:675:23,
    inlined from ‘std::wstring foo(std::wstring)’ at main.cpp:5:12,
    inlined from ‘int main()’ at main.cpp:9:8:
/usr/include/x86_64-linux-gnu/bits/wchar2.h:42:10: warning: call to ‘__wmemcpy_chk_warn’ declared with attribute warning: wmemcpy called with length bigger than size of destination buffer [-Wattribute-warning]
   42 |   return __glibc_fortify_n (wmemcpy, __n, sizeof (wchar_t),
      |          ^~~~~~~~~~~~~~~~~
```

On stackoverflow.com you can already find a few discussion about it:

- https://stackoverflow.com/questions/75689057

user17732522 reduced it to:

```
#include <wchar.h>

wchar_t* _M_dataplus;
int _M_string_length;

wchar_t _M_local_buf[4];
wchar_t _M_local_buf2[4];

void g();

void f() {
    g();
    _M_string_length = 4;
    if (_M_dataplus == _M_local_buf)
        wmemcpy(_M_local_buf2, _M_local_buf, _M_string_length + 1);
}
```

```
$ g++ -O1 -D_FORTIFY_SOURCE=2 -c main.cpp 
In file included from /usr/include/features.h:486,
                 from /usr/include/x86_64-linux-gnu/bits/libc-header-start.h:33,
                 from /usr/include/wchar.h:27,
                 from main.cpp:1:
In function ‘wchar_t* wmemcpy(wchar_t*, const wchar_t*, size_t)’,
    inlined from ‘void f()’ at main.cpp:15:16:
/usr/include/x86_64-linux-gnu/bits/wchar2.h:42:10: warning: call to ‘__wmemcpy_chk_warn’ declared with attribute warning: wmemcpy called with length bigger than size of destination buffer [-Wattribute-warning]
   42 |   return __glibc_fortify_n (wmemcpy, __n, sizeof (wchar_t),
      |          ^~~~~~~~~~~~~~~~~
```
Comment 1 Xi Ruoyao 2023-03-27 15:38:33 UTC
You are reducing it wrong.  The reduced example should indeed trigger the warning, as it's copying 5 wchar's into a buffer with length 4.
Comment 2 Jonathan Wakely 2023-03-27 20:57:31 UTC
Are you really using GCC 12.0? If not, please fix the Version field.
Comment 3 Benjamin Buch 2023-03-27 21:14:56 UTC
You are right, I use GCC 12.1.0, sorry!

I also ran reproduce it via godbolt.org starting from version 12.1 until current trunk.

https://godbolt.org/z/Y9frhGr86
Comment 4 Richard Biener 2023-03-28 07:11:22 UTC
Confirmed for the unreduced testcase.  Works for GCC 11.
Comment 5 Jonathan Wakely 2023-03-28 09:28:53 UTC
This is not a library bug. The library code is:

      basic_string(basic_string&& __str) noexcept
      : _M_dataplus(_M_local_data(), std::move(__str._M_get_allocator()))
      {
	if (__str._M_is_local())
	  {
	    traits_type::copy(_M_local_buf, __str._M_local_buf,
			      __str.length() + 1);
	  }
	else
	  {
	    _M_data(__str._M_data());
	    _M_capacity(__str._M_allocated_capacity);
	  }


The warning is coming from the call to traits_type::copy which only happens if the string fits in the local buffer. Warning that it overflows the buffer is not helpful when we don't take that branch BECAUSE IT WOULD OVERFLOW THE BUFFER.

These warnings need to be removed from the compiler.
Comment 6 Jonathan Wakely 2023-03-28 09:37:53 UTC
This change to bits/basic_string.h seems to shut the compiler up:

--- include/bits/basic_string.h 2023-02-27 13:42:21.295513537 +0000
@@ -271,7 +271,15 @@
       _GLIBCXX20_CONSTEXPR
       bool
       _M_is_local() const
-      { return _M_data() == _M_local_data(); }
+      {
+        if (_M_data() == _M_local_data())
+          {
+            if (_M_string_length > _S_local_capacity)
+              __builtin_unreachable();
+            return true;
+          }
+        return false;
+      }
 
       // Create & Destroy
       _GLIBCXX20_CONSTEXPR
Comment 7 Jakub Jelinek 2023-03-28 09:47:24 UTC
Do we also get better generated code with your above patch?  If so, we should go with it.
Comment 8 Jonathan Wakely 2023-03-28 10:03:59 UTC
Yes, it looks better:

--- 109299-orig.s       2023-03-28 11:01:09.886119878 +0100
+++ 109299-fixed.s      2023-03-28 11:01:21.361187289 +0100
@@ -235,21 +235,14 @@
        xorl    %esi, %esi
        call    wmemset
        movq    (%rsp), %rax
-       leaq    48(%rsp), %rdi
-       movq    $42, 8(%rsp)
-       movq    %rdi, 32(%rsp)
+       leaq    32(%rsp), %rdi
+       movq    %rbx, (%rsp)
+       movq    $42, 40(%rsp)
        movl    $0, 168(%rax)
-       cmpq    %rbx, %rax
-       je      .L35
        movq    %rax, 32(%rsp)
        movq    16(%rsp), %rax
-       movq    %rax, 48(%rsp)
-       movl    $42, %eax
-.L36:
-       leaq    32(%rsp), %rdi
-       movq    %rax, 40(%rsp)
-       movq    %rbx, (%rsp)
        movq    $0, 8(%rsp)
+       movq    %rax, 48(%rsp)
        movl    $0, 16(%rsp)
        call    _ZNSt7__cxx1112basic_stringIwSt11char_traitsIwESaIwEE10_M_disposeEv
        movq    %rsp, %rdi
@@ -261,17 +254,10 @@
        popq    %rbx
        .cfi_def_cfa_offset 8
        ret
-.L35:
+.L36:
        .cfi_restore_state
-       movl    $4, %ecx
-       movl    $43, %edx
-       movq    %rbx, %rsi
-       call    __wmemcpy_chk
-       movq    8(%rsp), %rax
-       jmp     .L36
-.L38:
        movq    %rax, %rbx
-       jmp     .L37
+       jmp     .L35
        .globl  __gxx_personality_v0
        .section        .gcc_except_table,"a",@progbits
 .LLSDA1811:
@@ -282,7 +268,7 @@
 .LLSDACSB1811:
        .uleb128 .LEHB0-.LFB1811
        .uleb128 .LEHE0-.LEHB0
-       .uleb128 .L38-.LFB1811
+       .uleb128 .L36-.LFB1811
        .uleb128 0
 .LLSDACSE1811:
        .section        .text.startup
@@ -294,7 +280,7 @@
        .type   main.cold, @function
 main.cold:
 .LFSB1811:
-.L37:
+.L35:
        .cfi_def_cfa_offset 80
        .cfi_offset 3, -16
        movq    %rsp, %rdi
Comment 9 Jonathan Wakely 2023-03-28 10:05:15 UTC
And the reason I put the hint in _M_is_local() not in the function giving the warning is that it might help in other functions too.

I haven't tested it yet though.
Comment 10 GCC Commits 2023-03-28 20:14:21 UTC
The master branch has been updated by Jonathan Wakely <redi@gcc.gnu.org>:

https://gcc.gnu.org/g:bf78b43873b0b7e8f9a430df38749b8b61f9c9b8

commit r13-6915-gbf78b43873b0b7e8f9a430df38749b8b61f9c9b8
Author: Jonathan Wakely <jwakely@redhat.com>
Date:   Tue Mar 28 10:50:40 2023 +0100

    libstdc++: Tell GCC what basic_string::_M_is_local() means [PR109299]
    
    This avoids a bogus warning about overflowing a buffer, because GCC
    can't tell that we don't copy into the buffer unless it fits. By adding
    a __builtin_unreachable() hint we inform the compiler about the
    invariant that the buffer is only used when it's big enough.
    
    This can also improve codegen, by eliminating dead code that GCC
    couldn't tell was unreachable.
    
    libstdc++-v3/ChangeLog:
    
            PR libstdc++/109299
            * include/bits/basic_string.h (basic_string::_M_is_local()): Add
            hint for compiler that local strings fit in the local buffer.
Comment 11 GCC Commits 2023-03-28 23:33:55 UTC
The releases/gcc-12 branch has been updated by Jonathan Wakely <redi@gcc.gnu.org>:

https://gcc.gnu.org/g:00ac6fa3f2a54fb9cc17b7b7f51eae3c6bf7a1bd

commit r12-9330-g00ac6fa3f2a54fb9cc17b7b7f51eae3c6bf7a1bd
Author: Jonathan Wakely <jwakely@redhat.com>
Date:   Tue Mar 28 10:50:40 2023 +0100

    libstdc++: Tell GCC what basic_string::_M_is_local() means [PR109299]
    
    This avoids a bogus warning about overflowing a buffer, because GCC
    can't tell that we don't copy into the buffer unless it fits. By adding
    a __builtin_unreachable() hint we inform the compiler about the
    invariant that the buffer is only used when it's big enough.
    
    This can also improve codegen, by eliminating dead code that GCC
    couldn't tell was unreachable.
    
    libstdc++-v3/ChangeLog:
    
            PR libstdc++/109299
            * include/bits/basic_string.h (basic_string::_M_is_local()): Add
            hint for compiler that local strings fit in the local buffer.
    
    (cherry picked from commit bf78b43873b0b7e8f9a430df38749b8b61f9c9b8)
Comment 12 Jonathan Wakely 2023-03-28 23:34:34 UTC
Fixed for 12.3