Bug 107561 - [13 Regression] g++.dg/pr71488.C and [g++.dg/warn/Warray-bounds-16.C -m32] regression due to -Wstringop-overflow problem
Summary: [13 Regression] g++.dg/pr71488.C and [g++.dg/warn/Warray-bounds-16.C -m32] r...
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: tree-optimization (show other bugs)
Version: 13.0
: P3 normal
Target Milestone: 13.0
Assignee: Richard Biener
URL:
Keywords: diagnostic, missed-optimization
: 107578 (view as bug list)
Depends on:
Blocks: Wstringop-overflow
  Show dependency treegraph
 
Reported: 2022-11-07 20:06 UTC by Aldy Hernandez
Modified: 2023-03-30 11:17 UTC (History)
8 users (show)

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed: 2022-12-14 00:00:00


Attachments
preprocessed testcase (73.48 KB, text/plain)
2022-11-08 08:31 UTC, Aldy Hernandez
Details
another hack (1.11 KB, patch)
2023-03-29 11:41 UTC, Richard Biener
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Aldy Hernandez 2022-11-07 20:06:42 UTC
g++.dg/pr17488.C fails after commit r13-3761 with:

warning: ‘void* __builtin_memcpy(void*, const void*, long unsigned int)’ writing 8 or more bytes into a region of size 0 overflows the destination [-Wstringop-overflow=]
...
...
note: destination object of size 0 allocated by ‘operator new’

There is no difference in the IL the -Wstringop-overflow pass sees.  For
that matter, there is no difference in the entire IL across through
the *.optimized dump.  However, there are different ranges for two SSA
names.

In the *.waccess1 pass we see:

--- /tmp/a.ii.025t.waccess1.orig.787460 2022-11-07 19:43:41.341855227 +0100
+++ /tmp/a.ii.025t.waccess1.new.787460  2022-11-07 19:43:41.344855238 +0100
@@ -331,8 +331,8 @@
   max_depth:          3
 
 pointer_query cache contents:
-  5.0[1]: _5 = _5 (base0); size: unknown
-  6.0[3]: _6 = _5 (base0); size: unknown
+  5.0[1]: _5 = _5 (base0); size: [16, 9223372036854775807]
+  6.0[3]: _6 = _5 (base0); size: [16, 9223372036854775807]
 
 __attribute__((malloc))
 struct valarray * std::__valarray_get_storage<std::valarray<long long int> > (size_t __n)
@@ -364,8 +364,8 @@
   max_depth:          3
 
 pointer_query cache contents:
-  5.0[1]: _5 = _5 (base0); size: unknown
-  6.0[3]: _6 = _5 (base0); size: unknown
+  5.0[1]: _5 = _5 (base0); size: [8, 9223372036854775807]
+  6.0[3]: _6 = _5 (base0); size: [8, 9223372036854775807]
 
 __attribute__((malloc))
 long long int * std::__valarray_get_storage<long long int> (size_t __n)

This coincides with two unsigned multiplications by 16 and 8, where we correctly restrict the range.  This is from the evrp dump:

-Global Exported: _2 = [irange] long unsigned int [0, +INF] NONZERO 0xfffffffffffffff0
+Global Exported: _2 = [irange] long unsigned int [0, 0][16, +INF] NONZERO 0xfffffffffffffff0

-Global Exported: _2 = [irange] long unsigned int [0, +INF] NONZERO 0xfffffffffffffff8
+Global Exported: _2 = [irange] long unsigned int [0, 0][8, +INF] NONZERO 0xfffffffffffffff8

This is correct as we know the ranges cannot be [1,15] and [1,7] respectively.
Comment 1 Aldy Hernandez 2022-11-08 08:31:20 UTC
Created attachment 53848 [details]
preprocessed testcase
Comment 2 Aldy Hernandez 2022-11-08 08:48:51 UTC
The following test also fails for the same reason:

g++.dg/warn/Warray-bounds-16.C -m32 -O2

It is perhaps a better reduced test for the issue:

inline void* operator new (__SIZE_TYPE__, void * v)
{
  return v;
}

struct S
{
  int* p;
  int m;

  S (int i)
  {
    m = i;
    p = (int*) new unsigned char [sizeof (int) * m];

    for (int i = 0; i < m; i++)
      new (p + i) int (); /* { dg-bogus "bounds" "pr102690" { xfail *-*-* } } */
  }
};

S a (0);

$ ./cc1plus a.c -fdump-tree-all-details -quiet -I/tmp -O2 -std=gnu++98 -m32
In constructor ‘S::S(int)’,
    inlined from ‘void __static_initialization_and_destruction_0()’ at a.c:26:7,
    inlined from ‘(static initializers for a.c)’ at a.c:26:8:
a.c:22:24: warning: ‘void* __builtin_memset(void*, int, unsigned int)’ writing 4 or more bytes into a region of size 0 overflows the destination [-Wstringop-overflow=]
   22 |       new (p + i) int (); /* { dg-bogus "bounds" "pr102690" { xfail *-*-* } } */
      |                        ^
a.c:19:51: note: destination object of size 0 allocated by ‘operator new []’
   19 |     p = (int*) new unsigned char [sizeof (int) * m];
      |                                                   ^
+ set +x

The ranges for some pointers are now different as early as .waccess1, even though the IL is the same:

--- /tmp/a.c.025t.waccess1.orig.805839  2022-11-08 09:46:00.513031310 +0100
+++ /tmp/a.c.025t.waccess1.new.805839   2022-11-08 09:46:00.515031315 +0100
@@ -41,9 +41,9 @@
   max_depth:          2
 
 pointer_query cache contents:
-  3.0[5]: _3 = _17 (base0); size: unknown
+  3.0[5]: _3 = _17 (base0); size: [4, 2147483647]
   11.0[1]: this_11(D) = this_11(D); size: unknown
-  17.0[3]: _17 = _17 (base0); size: unknown
+  17.0[3]: _17 = _17 (base0); size: [4, 2147483647]
 
Similarly by evrp time:

-Global Exported: _15 = [irange] unsigned int [0, +INF] NONZERO 0xfffffffc
+Global Exported: _15 = [irange] unsigned int [0, 0][4, +INF] NONZERO 0xfffffffc

etc etc.

The range is correct, as it is the result of a multiplication by a power of 2:

    _15 = _2 * 4;

_15 can never be [1,3].
Comment 3 Richard Biener 2022-11-08 09:34:10 UTC
This probably boils down to the object size machinery ignoring [0, 0] (aka not picking a convex hull of the range).  The range implementation of all these
passes should be using irange<> (but with wide_ints, not trees) instead of
doing their own thing.
Comment 4 Aldy Hernandez 2022-11-08 13:43:17 UTC
(In reply to Richard Biener from comment #3)
> This probably boils down to the object size machinery ignoring [0, 0] (aka
> not picking a convex hull of the range).  The range implementation of all
> these
> passes should be using irange<> (but with wide_ints, not trees) instead of
> doing their own thing.

Amen. I've been saying that for years.
Comment 5 Andrew Pinski 2022-11-08 18:27:16 UTC
*** Bug 107578 has been marked as a duplicate of this bug. ***
Comment 6 Hans-Peter Nilsson 2023-02-02 18:53:39 UTC
IMHO these tests and AFAICT the underlying issue has seen no attention for months and should be xfailed.  On it...
Comment 7 Hans-Peter Nilsson 2023-02-02 21:03:30 UTC
(In reply to Hans-Peter Nilsson from comment #6)
> IMHO these tests and AFAICT the underlying issue has seen no attention for
> months and should be xfailed.  On it...

https://gcc.gnu.org/pipermail/gcc-patches/2023-February/611206.html
Comment 8 GCC Commits 2023-02-10 00:42:00 UTC
The master branch has been updated by Hans-Peter Nilsson <hp@gcc.gnu.org>:

https://gcc.gnu.org/g:41015797ad14bc9030a87d102e4ab1ad891345f6

commit r13-5766-g41015797ad14bc9030a87d102e4ab1ad891345f6
Author: Hans-Peter Nilsson <hp@axis.com>
Date:   Thu Feb 2 20:35:52 2023 +0100

    testsuite: XFAIL g++.dg/pr71488.C and warn/Warray-bounds-16.C, PR107561
    
    These appear as regressions from a baseline before
    r13-3761-ga239a63f868e29.  See the PR trail.
    
    Note that the warning for g++.dg/pr71488.C is for a *header*
    file, thus we can't match the line number (sanely).
    
    gcc/testsuite:
    
            PR tree-optimization/107561
            * g++.dg/warn/Warray-bounds-16.C: XFAIL bogus "overflows destination"
            warning.
            * g++.dg/pr71488.C: Ditto, but just for ilp32 targets.
Comment 9 Richard Biener 2023-02-27 09:56:41 UTC
For the testcase in comment#2 we end up with

<bb 2> [local count: 1073741824]:
MEM[(struct __as_base  &)&a] ={v} {CLOBBER};
a.m = 0;
_5 = operator new [] (0);
a.p = _5;
_2 = a.m;
if (_2 > 0)
  goto <bb 3>; [89.00%]
else
  goto <bb 5>; [11.00%]

and the code diagnosed is in the unreachable branch we cannot resolve as not
taken because 'operator new [] (0)' is thought to clobber 'a.m'.

We've been circling around improving alias analysis around new and delete
but since users can override them we threw the towel.

For the original testcase in g++.dg/pr71488.C we have

<bb 5> [local count: 214748368]:
*_51._M_size = 0;
_147 = operator new (0);

<bb 6> [local count: 214748368]:
*_51._M_data = _147;
_101 = *_51._M_size;
_99 = _101 * 8;
__builtin_memcpy (_147, _72, _99);

so again we fail to CSE '*_51._M_size' because of 'operator new (0)' but
also the diagnostic code simply assumes that size is at least 1 and
doesn't consider zero here for "reasons".

Note I think there's still a bug in value_range (irange) here.  get_size_range
does

  if (integral)
    {
      value_range vr;

      query->range_of_expr (vr, exp, stmt);

      if (vr.undefined_p ())
        vr.set_varying (TREE_TYPE (exp));
      range_type = vr.kind ();
      min = wi::to_wide (vr.min ());
      max = wi::to_wide (vr.max ());

and we have vr:

(gdb) p vr
$13 = {<irange> = {<vrange> = {
      _vptr.vrange = 0x3693a30 <vtable for int_range<1u>+16>, 
      m_kind = VR_ANTI_RANGE, m_discriminator = VR_IRANGE}, 
    m_num_ranges = 1 '\001', m_max_ranges = 1 '\001', 
    m_nonzero_mask = <tree 0x0>, m_base = 0x7fffffffc8f0}, m_ranges = {
    <integer_cst 0x7ffff68143f0>, <integer_cst 0x7ffff5e82090>}}
(gdb) p vr.dump (stderr)
[irange] unsigned int [0, 0][8, +INF]$17 = void

but vr.min () produces 1 and vr.max () produces 7, just as if it doesn't
interpret VR_ANTI_RANGE transparently here (if that's the intent?!).
At least

// Return the highest bound of a range expressed as a tree.

inline tree
irange::tree_upper_bound () const

suggests that.  Note that vr.num_pairs () produces 2 (because constant_p ())
but vr.m_num_ranges is 1 and tree_upper_bound uses m_num_ranges.

I suppose irange::{min,max,tree_lower_bound,tree_upper_bound} miss "support"
for legacy_mode_p here.

Either using int_range<2> or vr.lower_bound/upper_bound works to avoid this
issue.  Still it's fragile.  I suppose since 'value_range' is legacy
itself using int_range<2> is better here.  So I'm testing the following
to get rid of that legacy, plus get rid of ::min/max which are legacy as well.
Comment 10 Richard Biener 2023-02-27 11:18:53 UTC
(In reply to Richard Biener from comment #9)
> Note I think there's still a bug in value_range (irange) here. 
> get_size_range
> does
> 
>   if (integral)
>     {
>       value_range vr;
> 
>       query->range_of_expr (vr, exp, stmt);
> 
>       if (vr.undefined_p ())
>         vr.set_varying (TREE_TYPE (exp));
>       range_type = vr.kind ();
>       min = wi::to_wide (vr.min ());
>       max = wi::to_wide (vr.max ());
> 
> and we have vr:
> 
> (gdb) p vr
> $13 = {<irange> = {<vrange> = {
>       _vptr.vrange = 0x3693a30 <vtable for int_range<1u>+16>, 
>       m_kind = VR_ANTI_RANGE, m_discriminator = VR_IRANGE}, 
>     m_num_ranges = 1 '\001', m_max_ranges = 1 '\001', 
>     m_nonzero_mask = <tree 0x0>, m_base = 0x7fffffffc8f0}, m_ranges = {
>     <integer_cst 0x7ffff68143f0>, <integer_cst 0x7ffff5e82090>}}
> (gdb) p vr.dump (stderr)
> [irange] unsigned int [0, 0][8, +INF]$17 = void
> 
> but vr.min () produces 1 and vr.max () produces 7, just as if it doesn't
> interpret VR_ANTI_RANGE transparently here (if that's the intent?!).
> At least
> 
> // Return the highest bound of a range expressed as a tree.
> 
> inline tree
> irange::tree_upper_bound () const
> 
> suggests that.  Note that vr.num_pairs () produces 2 (because constant_p ())
> but vr.m_num_ranges is 1 and tree_upper_bound uses m_num_ranges.
> 
> I suppose irange::{min,max,tree_lower_bound,tree_upper_bound} miss "support"
> for legacy_mode_p here.

OTOH gimple-array-bounds.cc does

  const value_range *vr = NULL;
  if (TREE_CODE (low_sub_org) == SSA_NAME)
    {                
      vr = get_value_range (low_sub_org, stmt);
      if (!vr->undefined_p () && !vr->varying_p ())
        {
          low_sub = vr->kind () == VR_RANGE ? vr->max () : vr->min ();
          up_sub = vr->kind () == VR_RANGE ? vr->min () : vr->max ();
        }

so the bug is a documentation bug on min/max/lower/upper bound?!

I'm policeing other uses of value_range right now.
Comment 11 Richard Biener 2023-02-27 12:49:00 UTC
So I've missed the VR_ANTI_RANGE handling in get_size_range where we run into

          wide_int maxsize = wi::to_wide (max_object_size ());
          min = wide_int::from (min, maxsize.get_precision (), UNSIGNED);
          max = wide_int::from (max, maxsize.get_precision (), UNSIGNED);
          if (wi::eq_p (0, min - 1))
            {
              /* EXP is unsigned and not in the range [1, MAX].  That means
                 it's either zero or greater than MAX.  Even though 0 would
                 normally be detected by -Walloc-zero, unless ALLOW_ZERO
                 is set, set the range to [MAX, TYPE_MAX] so that when MAX
                 is greater than the limit the whole range is diagnosed.  */
              wide_int maxsize = wi::to_wide (max_object_size ());
              if (flags & SR_ALLOW_ZERO)
                {
                  if (wi::leu_p (maxsize, max + 1)
                      || !(flags & SR_USE_LARGEST))
                    min = max = wi::zero (expprec);
                  else
                    {
                      min = max + 1;
                      max = wi::to_wide (TYPE_MAX_VALUE (exptype));
                    }
                }
              else
                {
                  min = max + 1;
                  max = wi::to_wide (TYPE_MAX_VALUE (exptype));
                }

and from [0,0] [8, +INF] pick [8, +INF] based on the comments reasoning.

This all wouldn't happen if we'd be able to CSE the zero size ...

We can now try to put additional heuristic ontop of the above heuristic,
namely when the object we write to is of size zero set SR_ALLOW_ZERO.
Or try to "undo" the multiplication trick which would probably make us
end up with VARYING.

I'll note that with the earlier proposed change we regress the following,
that's an observation I make a lot of times - all "weirdness" in the code
is backed by (artificial) testcases verifying it works exactly as coded ...

    FAIL: gcc.dg/Wstringop-overflow-15.c pr82608 (test for warnings, line 37)
    FAIL: gcc.dg/Wstringop-overflow-15.c pr82608 (test for warnings, line 38)
    FAIL: gcc.dg/Wstringop-overflow-15.c pr82608 (test for warnings, line 69)
    FAIL: gcc.dg/Wstringop-overflow-15.c pr82608 (test for warnings, line 70)
    FAIL: gcc.dg/Wstringop-overflow-56.c  (test for warnings, line 64)
    FAIL: gcc.dg/Wstringop-overflow-56.c  (test for warnings, line 75)
    FAIL: gcc.dg/Wstringop-overflow-56.c  (test for warnings, line 86)
    FAIL: gcc.dg/Wstringop-overflow-56.c  (test for warnings, line 97)
    FAIL: gcc.dg/Wstringop-overflow-56.c  (test for warnings, line 108)
    FAIL: gcc.dg/Wstringop-overflow-56.c  (test for warnings, line 148)
    FAIL: gcc.dg/Wstringop-overflow-56.c  (test for warnings, line 159)
    FAIL: gcc.dg/attr-alloc_size-11.c  (test for warnings, line 52)
    FAIL: gcc.dg/attr-alloc_size-11.c  (test for warnings, line 53)
    FAIL: gcc.dg/attr-alloc_size-11.c  (test for warnings, line 54)
    FAIL: gcc.dg/attr-alloc_size-11.c  (test for warnings, line 55)
    FAIL: gcc.dg/attr-alloc_size-11.c  (test for warnings, line 56)
    FAIL: gcc.dg/attr-alloc_size-11.c  (test for warnings, line 57)
    FAIL: gcc.dg/attr-alloc_size-11.c  (test for warnings, line 58)
    FAIL: gcc.dg/attr-alloc_size-11.c  (test for warnings, line 64)
    FAIL: gcc.dg/attr-alloc_size-11.c  (test for warnings, line 65)
    FAIL: gcc.dg/attr-alloc_size-3.c  (test for warnings, line 438)
    FAIL: gcc.dg/attr-alloc_size-4.c  (test for warnings, line 138)
    FAIL: gcc.dg/attr-alloc_size-4.c  (test for warnings, line 143)
    FAIL: gcc.dg/attr-alloc_size-4.c  (test for warnings, line 187)
    FAIL: gcc.dg/pr98721-1.c  (test for warnings, line 11)
    FAIL: gcc.dg/pr98721-1.c  (test for warnings, line 12)

For example gcc.dg/pr98721-1.c has

int
foo (int n)
{
  if (n <= 0)
    {
      char vla[n];                      /* { dg-message "source object 'vla' of size 0" } */
      return __builtin_strlen (vla);    /* { dg-warning "'__builtin_strlen' reading 1 or more bytes from a region of size 0" } */

but of course we do not diagnose

int
foo (int n)
{
  if (n < 0)
    {
      char vla[n];

or when no condition is present or a n > 32 condition is present.

I fear it's not possible to "fix" this testcase without changing the
expectation on a bunch of other testcases.  But the messaging to the
user is quite unhelpful because it doesn't actually inform him about
above reasoning.

That we are not allowed to optimize the code is not of help either.

We can again work around this in libstdc++ by CSEing ->_M_size ourselves.
The following helps:

diff --git a/libstdc++-v3/include/std/valarray b/libstdc++-v3/include/std/valarray
index 7a23c27a0ce..7383071f98d 100644
--- a/libstdc++-v3/include/std/valarray
+++ b/libstdc++-v3/include/std/valarray
@@ -647,8 +647,13 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
     inline
     valarray<_Tp>::valarray(const valarray<_Tp>& __v)
     : _M_size(__v._M_size), _M_data(__valarray_get_storage<_Tp>(__v._M_size))
-    { std::__valarray_copy_construct(__v._M_data, __v._M_data + _M_size,
-                                    _M_data); }
+    {
+      auto __v_M_size = __v._M_size;
+      _M_size = __v_M_size;
+      _M_data = __valarray_get_storage<_Tp>(__v_M_size);
+      std::__valarray_copy_construct(__v._M_data, __v._M_data + __v_M_size,
+                                    _M_data);
+    }
 
 #if __cplusplus >= 201103L
   template<typename _Tp>
Comment 12 Aldy Hernandez 2023-02-27 13:45:17 UTC
(In reply to Richard Biener from comment #10)
> (In reply to Richard Biener from comment #9)
> > Note I think there's still a bug in value_range (irange) here. 
> > get_size_range
> > does
> > 
> >   if (integral)
> >     {
> >       value_range vr;
> > 
> >       query->range_of_expr (vr, exp, stmt);
> > 
> >       if (vr.undefined_p ())
> >         vr.set_varying (TREE_TYPE (exp));
> >       range_type = vr.kind ();
> >       min = wi::to_wide (vr.min ());
> >       max = wi::to_wide (vr.max ());
> > 
> > and we have vr:
> > 
> > (gdb) p vr
> > $13 = {<irange> = {<vrange> = {
> >       _vptr.vrange = 0x3693a30 <vtable for int_range<1u>+16>, 
> >       m_kind = VR_ANTI_RANGE, m_discriminator = VR_IRANGE}, 
> >     m_num_ranges = 1 '\001', m_max_ranges = 1 '\001', 
> >     m_nonzero_mask = <tree 0x0>, m_base = 0x7fffffffc8f0}, m_ranges = {
> >     <integer_cst 0x7ffff68143f0>, <integer_cst 0x7ffff5e82090>}}
> > (gdb) p vr.dump (stderr)
> > [irange] unsigned int [0, 0][8, +INF]$17 = void
> > 
> > but vr.min () produces 1 and vr.max () produces 7, just as if it doesn't
> > interpret VR_ANTI_RANGE transparently here (if that's the intent?!).
> > At least
> > 
> > // Return the highest bound of a range expressed as a tree.
> > 
> > inline tree
> > irange::tree_upper_bound () const
> > 
> > suggests that.  Note that vr.num_pairs () produces 2 (because constant_p ())
> > but vr.m_num_ranges is 1 and tree_upper_bound uses m_num_ranges.
> > 
> > I suppose irange::{min,max,tree_lower_bound,tree_upper_bound} miss "support"
> > for legacy_mode_p here.
> 
> OTOH gimple-array-bounds.cc does
> 
>   const value_range *vr = NULL;
>   if (TREE_CODE (low_sub_org) == SSA_NAME)
>     {                
>       vr = get_value_range (low_sub_org, stmt);
>       if (!vr->undefined_p () && !vr->varying_p ())
>         {
>           low_sub = vr->kind () == VR_RANGE ? vr->max () : vr->min ();
>           up_sub = vr->kind () == VR_RANGE ? vr->min () : vr->max ();
>         }
> 
> so the bug is a documentation bug on min/max/lower/upper bound?!
> 
> I'm policeing other uses of value_range right now.

Haven't looked at this yet, but min/max/lower/upper bound are broken and inconsistent.  They give different results in legacy and the new irange API.  This was not by intent, but it crept in :/.  My fault.  I noticed this when removing the legacy code for next stage1.
Comment 13 Jonathan Wakely 2023-02-27 14:38:25 UTC
(In reply to Richard Biener from comment #11)
> We can again work around this in libstdc++ by CSEing ->_M_size ourselves.
> The following helps:
> 
> diff --git a/libstdc++-v3/include/std/valarray
> b/libstdc++-v3/include/std/valarray
> index 7a23c27a0ce..7383071f98d 100644
> --- a/libstdc++-v3/include/std/valarray
> +++ b/libstdc++-v3/include/std/valarray
> @@ -647,8 +647,13 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>      inline
>      valarray<_Tp>::valarray(const valarray<_Tp>& __v)
>      : _M_size(__v._M_size),
> _M_data(__valarray_get_storage<_Tp>(__v._M_size))
> -    { std::__valarray_copy_construct(__v._M_data, __v._M_data + _M_size,
> -                                    _M_data); }
> +    {
> +      auto __v_M_size = __v._M_size;
> +      _M_size = __v_M_size;
> +      _M_data = __valarray_get_storage<_Tp>(__v_M_size);
> +      std::__valarray_copy_construct(__v._M_data, __v._M_data + __v_M_size,
> +                                    _M_data);
> +    }
>  
>  #if __cplusplus >= 201103L
>    template<typename _Tp>

Ugh, gross.

This makes no sense to me. this->_M_size is already a local copy of __v._M_size that cannot have escaped, because its enclosing object hasn't been constructed yet. Why do we need another "more local" copy of it?

_M_size is a copy of __v._M_size, which is passed to the get_storage function. The compiler thinks that the get_storage call might modify __v, but it can't modify this->_M_size. So then _M_size still has the same value when passed to the copy_construct call.


Since it would be undefined for users to modify this->_M_size or __v._M_size from operator new (because they cannot access an object under construction, and cannot modify an object while it's in the process of being copied), I wish we could say that a specific call to operator new does not modify anything reachable from the enclosing function's arguments, including `this`.

Or maybe we just teach the compiler that operator new will not touch anything defined in namespace std, on pain of death.
Comment 14 Aldy Hernandez 2023-02-27 16:16:24 UTC
(In reply to Richard Biener from comment #11)
> So I've missed the VR_ANTI_RANGE handling in get_size_range where we run into
> 
>           wide_int maxsize = wi::to_wide (max_object_size ());
>           min = wide_int::from (min, maxsize.get_precision (), UNSIGNED);
>           max = wide_int::from (max, maxsize.get_precision (), UNSIGNED);
>           if (wi::eq_p (0, min - 1))
>             {
>               /* EXP is unsigned and not in the range [1, MAX].  That means
>                  it's either zero or greater than MAX.  Even though 0 would
>                  normally be detected by -Walloc-zero, unless ALLOW_ZERO
>                  is set, set the range to [MAX, TYPE_MAX] so that when MAX
>                  is greater than the limit the whole range is diagnosed.  */
>               wide_int maxsize = wi::to_wide (max_object_size ());
>               if (flags & SR_ALLOW_ZERO)
>                 {
>                   if (wi::leu_p (maxsize, max + 1)
>                       || !(flags & SR_USE_LARGEST))
>                     min = max = wi::zero (expprec);
>                   else
>                     {
>                       min = max + 1;
>                       max = wi::to_wide (TYPE_MAX_VALUE (exptype));
>                     }
>                 }
>               else
>                 {
>                   min = max + 1;
>                   max = wi::to_wide (TYPE_MAX_VALUE (exptype));
>                 }
> 
> and from [0,0] [8, +INF] pick [8, +INF] based on the comments reasoning.

Ughh, you're reaching all the problematic cases I ran into while trying to remove legacy.

> 
> This all wouldn't happen if we'd be able to CSE the zero size ...
> 
> We can now try to put additional heuristic ontop of the above heuristic,
> namely when the object we write to is of size zero set SR_ALLOW_ZERO.
> Or try to "undo" the multiplication trick which would probably make us
> end up with VARYING.
> 
> I'll note that with the earlier proposed change we regress the following,
> that's an observation I make a lot of times - all "weirdness" in the code
> is backed by (artificial) testcases verifying it works exactly as coded ...

+1

> 
>     FAIL: gcc.dg/Wstringop-overflow-15.c pr82608 (test for warnings, line 37)
>     FAIL: gcc.dg/Wstringop-overflow-15.c pr82608 (test for warnings, line 38)
>     FAIL: gcc.dg/Wstringop-overflow-15.c pr82608 (test for warnings, line 69)
>     FAIL: gcc.dg/Wstringop-overflow-15.c pr82608 (test for warnings, line 70)
>     FAIL: gcc.dg/Wstringop-overflow-56.c  (test for warnings, line 64)
>     FAIL: gcc.dg/Wstringop-overflow-56.c  (test for warnings, line 75)
>     FAIL: gcc.dg/Wstringop-overflow-56.c  (test for warnings, line 86)
>     FAIL: gcc.dg/Wstringop-overflow-56.c  (test for warnings, line 97)
>     FAIL: gcc.dg/Wstringop-overflow-56.c  (test for warnings, line 108)
>     FAIL: gcc.dg/Wstringop-overflow-56.c  (test for warnings, line 148)
>     FAIL: gcc.dg/Wstringop-overflow-56.c  (test for warnings, line 159)
>     FAIL: gcc.dg/attr-alloc_size-11.c  (test for warnings, line 52)
>     FAIL: gcc.dg/attr-alloc_size-11.c  (test for warnings, line 53)
>     FAIL: gcc.dg/attr-alloc_size-11.c  (test for warnings, line 54)
>     FAIL: gcc.dg/attr-alloc_size-11.c  (test for warnings, line 55)
>     FAIL: gcc.dg/attr-alloc_size-11.c  (test for warnings, line 56)
>     FAIL: gcc.dg/attr-alloc_size-11.c  (test for warnings, line 57)
>     FAIL: gcc.dg/attr-alloc_size-11.c  (test for warnings, line 58)
>     FAIL: gcc.dg/attr-alloc_size-11.c  (test for warnings, line 64)
>     FAIL: gcc.dg/attr-alloc_size-11.c  (test for warnings, line 65)
>     FAIL: gcc.dg/attr-alloc_size-3.c  (test for warnings, line 438)
>     FAIL: gcc.dg/attr-alloc_size-4.c  (test for warnings, line 138)
>     FAIL: gcc.dg/attr-alloc_size-4.c  (test for warnings, line 143)
>     FAIL: gcc.dg/attr-alloc_size-4.c  (test for warnings, line 187)
>     FAIL: gcc.dg/pr98721-1.c  (test for warnings, line 11)
>     FAIL: gcc.dg/pr98721-1.c  (test for warnings, line 12)
> 
> For example gcc.dg/pr98721-1.c has
> 
> int
> foo (int n)
> {
>   if (n <= 0)
>     {
>       char vla[n];                      /* { dg-message "source object 'vla'
> of size 0" } */
>       return __builtin_strlen (vla);    /* { dg-warning "'__builtin_strlen'
> reading 1 or more bytes from a region of size 0" } */
> 
> but of course we do not diagnose
> 
> int
> foo (int n)
> {
>   if (n < 0)
>     {
>       char vla[n];
> 
> or when no condition is present or a n > 32 condition is present.

Yup, ran into that too.

> 
> I fear it's not possible to "fix" this testcase without changing the
> expectation on a bunch of other testcases.  But the messaging to the
> user is quite unhelpful because it doesn't actually inform him about
> above reasoning.

Agreed.
Comment 15 Richard Biener 2023-03-01 14:22:09 UTC
(In reply to Jonathan Wakely from comment #13)
> (In reply to Richard Biener from comment #11)
> > We can again work around this in libstdc++ by CSEing ->_M_size ourselves.
> > The following helps:
> > 
> > diff --git a/libstdc++-v3/include/std/valarray
> > b/libstdc++-v3/include/std/valarray
> > index 7a23c27a0ce..7383071f98d 100644
> > --- a/libstdc++-v3/include/std/valarray
> > +++ b/libstdc++-v3/include/std/valarray
> > @@ -647,8 +647,13 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
> >      inline
> >      valarray<_Tp>::valarray(const valarray<_Tp>& __v)
> >      : _M_size(__v._M_size),
> > _M_data(__valarray_get_storage<_Tp>(__v._M_size))
> > -    { std::__valarray_copy_construct(__v._M_data, __v._M_data + _M_size,
> > -                                    _M_data); }
> > +    {
> > +      auto __v_M_size = __v._M_size;
> > +      _M_size = __v_M_size;
> > +      _M_data = __valarray_get_storage<_Tp>(__v_M_size);
> > +      std::__valarray_copy_construct(__v._M_data, __v._M_data + __v_M_size,
> > +                                    _M_data);
> > +    }
> >  
> >  #if __cplusplus >= 201103L
> >    template<typename _Tp>
> 
> Ugh, gross.
> 
> This makes no sense to me. this->_M_size is already a local copy of
> __v._M_size that cannot have escaped, because its enclosing object hasn't
> been constructed yet. Why do we need another "more local" copy of it?
> 
> _M_size is a copy of __v._M_size, which is passed to the get_storage
> function. The compiler thinks that the get_storage call might modify __v,
> but it can't modify this->_M_size. So then _M_size still has the same value
> when passed to the copy_construct call.
> 
> 
> Since it would be undefined for users to modify this->_M_size or __v._M_size
> from operator new (because they cannot access an object under construction,
> and cannot modify an object while it's in the process of being copied), I
> wish we could say that a specific call to operator new does not modify
> anything reachable from the enclosing function's arguments, including `this`.
> 
> Or maybe we just teach the compiler that operator new will not touch
> anything defined in namespace std, on pain of death.

The compiler doesn't know that the allocation function cannot clobber *this.
The C++ frontend tries to communicate this by making 'this' restrict qualified
and we make use of that info, but for calls we do not know how to use the
info.

Maybe we can special-case directly the actual parameter case and compute
the restrictness info for the call arguments.  The canonical example is

void bar (void);
struct X {
  X (int);
  int i;
  int j;
};

X::X(int k)
{
  i = k;
  bar ();
  j = i != k;
}

where if I understand you correctly, bar () is not allowed to modify *this
(unless I pass it an argument to it, of course), even if *this is for
example

char *storage;

void bar ()
{
  ((X *)storage)->i = 0; // the cast is invalid, no object of type X yet there?
}

int main()
{
  storage = new char[8];
  new (storage) X (1);
}
Comment 16 Jakub Jelinek 2023-03-01 14:34:38 UTC
(In reply to Richard Biener from comment #15)
> The compiler doesn't know that the allocation function cannot clobber *this.
> The C++ frontend tries to communicate this by making 'this' restrict
> qualified
> and we make use of that info, but for calls we do not know how to use the
> info.
> 
> Maybe we can special-case directly the actual parameter case and compute
> the restrictness info for the call arguments.  The canonical example is
> 
> void bar (void);
> struct X {
>   X (int);
>   int i;
>   int j;
> };
> 
> X::X(int k)
> {
>   i = k;
>   bar ();
>   j = i != k;
> }
> 
> where if I understand you correctly, bar () is not allowed to modify *this
> (unless I pass it an argument to it, of course), even if *this is for
> example

Why?  Because it is a constructor and the object isn't fully constructed yet at that point?  For normal methods I certainly don't see anything that would preclude such modifications.
Comment 17 Richard Biener 2023-03-01 14:38:10 UTC
(In reply to Jakub Jelinek from comment #16)
> (In reply to Richard Biener from comment #15)
> > The compiler doesn't know that the allocation function cannot clobber *this.
> > The C++ frontend tries to communicate this by making 'this' restrict
> > qualified
> > and we make use of that info, but for calls we do not know how to use the
> > info.
> > 
> > Maybe we can special-case directly the actual parameter case and compute
> > the restrictness info for the call arguments.  The canonical example is
> > 
> > void bar (void);
> > struct X {
> >   X (int);
> >   int i;
> >   int j;
> > };
> > 
> > X::X(int k)
> > {
> >   i = k;
> >   bar ();
> >   j = i != k;
> > }
> > 
> > where if I understand you correctly, bar () is not allowed to modify *this
> > (unless I pass it an argument to it, of course), even if *this is for
> > example
> 
> Why?  Because it is a constructor and the object isn't fully constructed yet
> at that point?  For normal methods I certainly don't see anything that would
> preclude such modifications.

The canonical C example would be

void bar (void);

void foo (struct X * restrict this, int k)
{
  this->i = k;
  bar ();
  this->j = i != k;
}

where as I understand bar () cannot modify what *this points to since it
cannot build a proper derived access from 'this' (unless I pass it to bar).

The C++ frontend annotates 'this' with restrict.  For my example I get

void X::X (struct X * const this, int k)
{
  # PT = { D.2806 } (nonlocal, restrict)
  struct X * const this_5(D) = this;
  int k_7(D) = k;
  int _1;
  bool _2;
  int _3;

  <bb 2> :
  MEM[(struct X *)this_5(D) clique 1 base 1] ={v} {CLOBBER};
  MEM[(struct X *)this_5(D) clique 1 base 1].i = k_7(D);
  # USE = nonlocal escaped
  # CLB = nonlocal escaped
  bar ();
  _1 = MEM[(struct X *)this_5(D) clique 1 base 1].i;
  _2 = _1 != k_7(D);
  # RANGE [irange] int [0, 1] NONZERO 0x1
  _3 = (int) _2;
  MEM[(struct X *)this_5(D) clique 1 base 1].j = _3;
  return;
}
Comment 18 Jakub Jelinek 2023-03-01 14:55:17 UTC
Does the FE really do that?
I certainly don't see it in the gimple dump:
void X::X (struct X * const this, int k)
{
  *this = {CLOBBER};
  {
    this->i = k;
    # USE = anything
    # CLB = anything
    bar ();
    _1 = this->i;
    _2 = k != _1;
    _3 = (int) _2;
    this->j = _3;
  }
}
While if I compile the C variant after fixing it up:
struct X { int i, j; };
void bar (void);

void foo (struct X * restrict this, int k)
{
  this->i = k;
  bar ();
  this->j = this->i != k;
}
it is there:
void foo (struct X * restrict this, int k)
{
  this->i = k;
  bar ();
  _1 = this->i;
  _2 = k != _1;
  _3 = (int) _2;
  this->j = _3;
}
Comment 19 Jakub Jelinek 2023-03-01 14:57:59 UTC
And on
void bar (void);
struct X {
  X (int);
  int i;
  int j;
  void baz (int);
};

X::X(int k)
{
  i = k;
  bar ();
  j = i != k;
}

void
X::baz(int k)
{
  i = k;
  bar ();
  j = i != k;
}
while I see
  # PT = { D.2822 } (nonlocal, restrict)
  struct X * const this_5(D) = this;
later on in the dumps (not really sure what exactly causes it), in baz it is not there:
  # PT = nonlocal
  struct X * const this_5(D) = this;
Comment 20 Jonathan Wakely 2023-03-01 15:55:37 UTC
(In reply to Jakub Jelinek from comment #16)
> (In reply to Richard Biener from comment #15)
> > where if I understand you correctly, bar () is not allowed to modify *this
> > (unless I pass it an argument to it, of course), even if *this is for
> > example
> 
> Why?  Because it is a constructor and the object isn't fully constructed yet
> at that point?

Yes, exactly. The object's lifetime has not started until the constructor completes, so accessing it is only allowed in very limited ways, described in [basic.life] p6. However, it looks like for a non-trivial constructor the results are just unspecified, not undefined, see [class.cdtor] p2. Still, I don't see how operator new could meaningfully do anything to an object under construction if the object is in an unspecified state. And frankly, if anybody did write an operator new like that, they deserve what they get.

Could we have a flag that says "assume operator new is not stupid"?
Comment 21 Richard Biener 2023-03-02 07:51:45 UTC
(In reply to Jakub Jelinek from comment #19)
> And on
> void bar (void);
> struct X {
>   X (int);
>   int i;
>   int j;
>   void baz (int);
> };
> 
> X::X(int k)
> {
>   i = k;
>   bar ();
>   j = i != k;
> }
> 
> void
> X::baz(int k)
> {
>   i = k;
>   bar ();
>   j = i != k;
> }
> while I see
>   # PT = { D.2822 } (nonlocal, restrict)
>   struct X * const this_5(D) = this;
> later on in the dumps (not really sure what exactly causes it), in baz it is
> not there:
>   # PT = nonlocal
>   struct X * const this_5(D) = this;

We have

static void
intra_create_variable_infos (struct function *fn)
{     
  tree t;
  bitmap handled_struct_type = NULL;
  bool this_parm_in_ctor = DECL_CXX_CONSTRUCTOR_P (fn->decl);
...
      varinfo_t p           
        = create_variable_info_for_1 (t, alias_get_name (t), false, true,
                                      handled_struct_type, this_parm_in_ctor);

and 'this_parm_in_ctor' makes the pointer as if it were restrict qualified.
Not sure why we chose that route as compared to actually making it restrict
(I think we did that at some point but it caused issues?).

As said the real issue is that I didn't implement effects of restrict
qualification on calls.  I'll see to give it a try for next stage1.
Comment 22 Richard Biener 2023-03-02 07:53:28 UTC
(In reply to Jonathan Wakely from comment #20)
> (In reply to Jakub Jelinek from comment #16)
> > (In reply to Richard Biener from comment #15)
> > > where if I understand you correctly, bar () is not allowed to modify *this
> > > (unless I pass it an argument to it, of course), even if *this is for
> > > example
> > 
> > Why?  Because it is a constructor and the object isn't fully constructed yet
> > at that point?
> 
> Yes, exactly. The object's lifetime has not started until the constructor
> completes, so accessing it is only allowed in very limited ways, described
> in [basic.life] p6. However, it looks like for a non-trivial constructor the
> results are just unspecified, not undefined, see [class.cdtor] p2. Still, I
> don't see how operator new could meaningfully do anything to an object under
> construction if the object is in an unspecified state. And frankly, if
> anybody did write an operator new like that, they deserve what they get.
> 
> Could we have a flag that says "assume operator new is not stupid"?

We certainly could add that and with that revert the effect of not making
new/delete "pure" (I'd have to look up the PR we've reverted the change
that generally made it so).
Comment 23 Richard Biener 2023-03-29 11:38:49 UTC
So we can "mitigate" the diagnostic for g++.dg/pr17488.C with a hack, but for
g++.dg/warn/Warray-bounds-16.C we see

<bb 2> [local count: 1073741824]:
a ={v} {CLOBBER};
a.m = 0;
_5 = operator new [] (0);
a.p = _5;
_2 = a.m;
if (_2 > 0)
  goto <bb 3>; [89.00%]
else
  goto <bb 5>; [11.00%]

<bb 3> [local count: 955630225]:
_12 = (sizetype) _2;
_11 = _12 * 4;
__builtin_memset (_5, 0, _11); [tail call]

where we'd have a clear range (_2 > 0) even without the multiplication but
we're only now picking that up.  The bug here is quite the same
missed optimization though, we fail to CSE a.m around the 'operator new [] (0)'
call and so obviously dead code remains.

C++ is simply an awful language to work with here.  A static analyzer would
maybe simply look past possibly clobbering calls deriving the code is
likely dead and refrain from diagnosing it.

Note while for g++.dg/warn/Warray-bounds-16.C we are again working inside a
CTOR the issue extends to any code with intermediate allocations via
new or delete expressions.  Yes, we can add some flag like -fnew-is-not-stupid,
but then we couldn't make it the default.  Maybe(?) we can somehow detect
whether we are dealing with overloaded global new/delete with LTO, like
detecting we're resolving it to the copy in libstdc++?  The resolution info
just tells us RESOLVED_DYN though, maybe we can add something like
RESOLVED_STDLIB_DYN and handle a set of known libraries specially?

I'm putting this back to P3, we do have a load more (late) diagnostic regressions in GCC 13.
Comment 24 Richard Biener 2023-03-29 11:41:12 UTC
Created attachment 54784 [details]
another hack
Comment 25 GCC Commits 2023-03-30 11:16:14 UTC
The master branch has been updated by Richard Biener <rguenth@gcc.gnu.org>:

https://gcc.gnu.org/g:04b0a7b1a6d9e0f3782888f1ebf187c26690038b

commit r13-6943-g04b0a7b1a6d9e0f3782888f1ebf187c26690038b
Author: Richard Biener <rguenther@suse.de>
Date:   Wed Mar 29 13:49:24 2023 +0200

    tree-optimization/107561 - reduce -Wstringop-overflow false positives
    
    The following tells pointer-query to prefer a zero size when we
    are querying for the size range for a write into an object we've
    determined is of zero size.  That avoids diagnostics about really
    varying size arguments that just get a meaningful range for example
    because they are multiplied by an element size.
    
    I've adjusted only one call to get_size_range since that's what
    I have a testcase for.
    
            PR tree-optimization/107561
            * gimple-ssa-warn-access.cc (get_size_range): Add flags
            argument and pass it on.
            (check_access): When querying for the size range pass
            SR_ALLOW_ZERO when the known destination size is zero.
    
            * g++.dg/pr71488.C: Remove XFAILed bogus diagnostic again.
            * g++.dg/warn/Warray-bounds-16.C: Likewise.
Comment 26 Richard Biener 2023-03-30 11:17:47 UTC
Fixed.