Bug 94655 - [10 Regression] -Wstringop-overflow on implicit string assignment with vectorized char store
Summary: [10 Regression] -Wstringop-overflow on implicit string assignment with vector...
Status: NEW
Alias: None
Product: gcc
Classification: Unclassified
Component: tree-optimization (show other bugs)
Version: 10.0
: P2 major
Target Milestone: 10.5
Assignee: Not yet assigned to anyone
Keywords: diagnostic, patch
: 94881 95988 96259 96984 (view as bug list)
Depends on:
Blocks: Wstringop-overflow 95988
  Show dependency treegraph
Reported: 2020-04-18 23:35 UTC by Jens-Wolfhard Schicke
Modified: 2022-06-28 10:40 UTC (History)
9 users (show)

See Also:
Known to work: 11.0, 9.3.0
Known to fail: 10.2.0
Last reconfirmed: 2021-01-21 00:00:00


Note You need to log in before you can comment on or make changes to this bug.
Description Jens-Wolfhard Schicke 2020-04-18 23:35:18 UTC

g++-10 -O3 -W -Wall -Wextra -Werror --std=c++2a -c problem.c++

on this code:

#include <functional>
#include <memory>
#include <string>

struct A {
  std::string strA; // remove this line and it works
  char arr[16]; // remove this line and it works
  std::string strB; // remove this line and it works

const A getA(); // remove "const" and it works

void frob(const std::function<void()> &);

struct B {
  int i; // remove this line and it works
  A data;

  B() { }

  void f();

void B::f() {
  frob([this]() {
    data = getA();

  // data = getA(); // comment out the frob(...) and use this line, it works

results in:

In member function ‘A& A::operator=(const A&)’,
    inlined from ‘B::f()::<lambda()>’ at problem.c++:26:17,
    inlined from ‘constexpr _Res std::__invoke_impl(std::__invoke_other, _Fn&&, _Args&& ...) [with _Res = void; _Fn = B::f()::<lambda()>&; _Args = {}]’ at /usr/include/c++/10/bits/invoke.h:60:36,
    inlined from ‘constexpr std::enable_if_t<is_invocable_r_v<_Res, _Callable, _Args ...>, _Res> std::__invoke_r(_Callable&&, _Args&& ...) [with _Res = void; _Callable = B::f()::<lambda()>&; _Args = {}]’ at /usr/include/c++/10/bits/invoke.h:110:28,
    inlined from ‘static _Res std::_Function_handler<_Res(_ArgTypes ...), _Functor>::_M_invoke(const std::_Any_data&, _ArgTypes&& ...) [with _Res = void; _Functor = B::f()::<lambda()>; _ArgTypes = {}]’ at /usr/include/c++/10/bits/std_function.h:291:30:
problem.c++:5:8: error: writing 16 bytes into a region of size 0 [-Werror=stringop-overflow=]
    5 | struct A {
      |        ^
problem.c++: In static member function ‘static _Res std::_Function_handler<_Res(_ArgTypes ...), _Functor>::_M_invoke(const std::_Any_data&, _ArgTypes&& ...) [with _Res = void; _Functor = B::f()::<lambda()>; _ArgTypes = {}]’:
problem.c++:6:15: note: at offset 0 to object ‘A::strA’ with size 32 declared here
    6 |   std::string strA; // remove this line and it works
      |               ^~~~

Using instead -O2, it works.
Using instead g++9, it works.

% g++-10 --version
g++-10 (Debian 10-20200222-1) 10.0.1 20200222 (experimental) [master revision 01af7e0a0c2:487fe13f218:e99b18cf7101f205bfdd9f0f29ed51caaec52779]
Comment 1 Martin Liška 2020-04-20 07:24:20 UTC
Confirmed, started with r10-5451-gef29b12cfbb4979a.
Comment 2 Jakub Jelinek 2020-04-20 08:52:12 UTC
We must have dups for this, the warning relies on something that the IL doesn't provide. fre4 changes:
   _8 = &MEM[(struct A *)_5 + 8B].strA;
   std::__cxx11::basic_string<char>::_M_assign (_8, &D.113592.strA);
   <bb 3> [local count: 1073741824]:
   _9 = &MEM[(struct A *)_5 + 8B].arr;
-  _31 = &MEM[(struct A *)_5 + 8B];
-  _29 = _31 + 32;
-  _24 = &D.113592 + 33;
-  _21 = _29 - _24;
+  _29 = _8 + 32;
+  _21 = _29 - &MEM <const struct A> [(void *)&D.113592 + 33B];
-  vectp.110_64 = &D.113592 + 32;
-  _69 = &MEM[(struct A *)_5 + 8B];
-  vectp.113_68 = _69 + 32;
-  vect__13.111_67 = MEM <const vector(16) char> [(const char *)vectp.110_64];
-  MEM <vector(16) char> [(char *)vectp.113_68] = vect__13.111_67;
-  vectp.109_66 = vectp.110_64 + 16;
-  vectp.112_71 = vectp.113_68 + 16;
-  ivtmp_74 = 1;
+  vect__13.111_67 = MEM <const vector(16) char> [(const char *)&D.113592 + 32B];
+  MEM <vector(16) char> [(char *)_29] = vect__13.111_67;
+  vectp.112_71 = _29 + 16;

and the warning warns because it thinks the store uses the strA + 32 pointer to store the 16 bytes (vectorized), but the user code doesn't do that, only the value numbering found that the value is equal to that.

Perhaps SCCVN could try to modify the &something.field into &MEM_REF[..., off]
if it decides to reuse it in some context that doesn't use the same field base,
but not sure how hard would that be.
Comment 3 Martin Sebor 2020-04-20 15:35:01 UTC
Using the base object in the MEM_REF instead of the member when accessing another member should certainly fix it.

Another option might be to somehow mark up these synthesized stores (e.g., by setting some currently unused bit) to make it clear they were emitted for valid source code.  I'd prefer the former option since that would preserve the validity (in the C/C++ sense) of the source code.
Comment 4 Jakub Jelinek 2020-04-20 15:42:28 UTC
I don't see how it makes a difference between whether it is a store created by the vectorizer or some original user store.
What matters is what ADDR_EXPR picks up SCCVN for the base, and it will pick up the first one which has the same value; if there are multiple fields starting at the same address, it can pick any of them.
Comment 5 Martin Sebor 2020-04-20 16:57:17 UTC
We want stores by user code to be diagnosed based on strict language rules (e.g., accessing a member via a reference to another member).  To do that we either have to teach the middle end to avoid taking shortcuts that make the IL look like invalid user code, or be able to apply more permissive rules for stores synthesized by it.

I assume the former is what you meant by "modify the &something.field into &MEM_REF[..., off]."  My comment was in support of that because, provided the latter part actually meant "&MEM_REF[something, offsetoff (typeof (something), field)], it could be added to and subtracted from to obtain pointers to any other member of "something."  (In contrast, &something.field + N is only valid for N <= sizeof something.field (flexible arrays aside)).
Comment 6 Jeffrey A. Law 2020-04-21 20:10:24 UTC
Fundamentally if we're relying on the type of the MEM_REF, then we're going to run into problems.  It simply does not map back to what the user originally wrote.  That's the key issue Richi and Jakub are raising.

pr94675 is another recently reported example.  In that case as well FRE creates a MEM_REF from address arithmetic, but the type isn't useful and leads to a false positive out of bounds array warning.  Forwprop can do this stuff as well, and I suspect if we were to dig elsewhere we'd find more examples.

While I don't want to lose these warnings, they are proving problematical in the MEM_REF cases and I'm wondering if we need a distinct knob to disable them in the MEM_REF cases given the fundamental incompatibility between giving good diagnostics and the actions of various optimization passes.

In the immediate term I'm going to mark this as a dup of 94675 (and any others I encounter in my BZ wanderings).

*** This bug has been marked as a duplicate of bug 94675 ***
Comment 7 Martin Sebor 2020-04-21 21:27:53 UTC
The root cause in this case or the solution are not the same as in bug 94675.  The MEM_REF type is the type of the access, so it's correct to rely on it for the size of the access.  The problem here is that the first MEM_REF pointer operand doesn't point to the actual destination of the access: it points to one member of the struct but as a result of adding the offset to it the access is actually to the subsequent member.  It's as if the final store in valid code like this:
  struct S { char i; char a[8], b[8]; };
  void g (struct S *p)
    typedef __attribute__ ((vector_size (8))) char V;

    char *pa = p->a + 8;
    char *pb = p->b;

    *(V*)pb = (V) {0};   // pb points to p->b

were transformed into:

  pa_2 = &MEM <char[8]> [(void *)p_1(D) + 9B];
  MEM[(V *)pa_2] = { 0, 0, 0, 0, 0, 0, 0, 0 };   // but here store is into (p->a + 9)
Comment 8 Martin Sebor 2020-04-30 16:14:39 UTC
*** Bug 94881 has been marked as a duplicate of this bug. ***
Comment 9 Jakub Jelinek 2020-05-07 11:56:31 UTC
GCC 10.1 has been released.
Comment 10 Martin Sebor 2020-07-21 14:55:15 UTC
*** Bug 96259 has been marked as a duplicate of this bug. ***
Comment 11 Martin Sebor 2020-07-21 20:08:02 UTC
*** Bug 95988 has been marked as a duplicate of this bug. ***
Comment 12 Richard Biener 2020-07-23 06:51:57 UTC
GCC 10.2 is released, adjusting target milestone.
Comment 13 Martin Sebor 2021-01-14 20:11:28 UTC
*** Bug 96984 has been marked as a duplicate of this bug. ***
Comment 14 Martin Sebor 2021-01-21 22:34:29 UTC
Still present in GCC 11.  My hack for pr96963 also suppresses this warning as well.
Comment 16 CVS Commits 2021-03-04 00:05:41 UTC
The master branch has been updated by Martin Sebor <msebor@gcc.gnu.org>:


commit r11-7497-g8d57bdadd2d9c2e5c95515ca7a583d7b407b55c4
Author: Martin Sebor <msebor@redhat.com>
Date:   Wed Mar 3 16:56:45 2021 -0700

    Correct a workaround for vectorized stores.
    PR middle-end/96963 - -Wstringop-overflow false positive with -ftree-vectorize when assigning consecutive char struct members
    PR middle-end/94655 - -Wstringop-overflow on implicit string assignment with vectorized char store
            PR middle-end/96963
            PR middle-end/94655
            * builtins.c (handle_array_ref): New helper.
            (handle_mem_ref): New helper.
            (compute_objsize_r): Factor out ARRAY_REF and MEM_REF handling
            into new helper functions.  Correct a workaround for vectorized
            PR middle-end/96963
            PR middle-end/94655
            * gcc.dg/Wstringop-overflow-47.c: Xfail tests.
            * gcc.dg/Wstringop-overflow-65.c: New test.
            * gcc.dg/Warray-bounds-69.c: Same.
Comment 17 Richard Biener 2021-04-08 12:02:03 UTC
GCC 10.3 is being released, retargeting bugs to GCC 10.4.
Comment 18 Martin Sebor 2022-03-17 19:54:04 UTC
I'm no longer planning to bakcport the fix.
Comment 19 Jakub Jelinek 2022-06-28 10:40:16 UTC
GCC 10.4 is being released, retargeting bugs to GCC 10.5.