Bug 94335 - [11/12/13/14 Regression] False positive -Wstringop-overflow warning with -O2
Summary: [11/12/13/14 Regression] False positive -Wstringop-overflow warning with -O2
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: tree-optimization (show other bugs)
Version: 10.0
: P2 normal
Target Milestone: 11.5
Assignee: Not yet assigned to anyone
URL:
Keywords: diagnostic
Depends on:
Blocks: Wstringop-overflow
  Show dependency treegraph
 
Reported: 2020-03-25 22:35 UTC by Romain Geissler
Modified: 2024-03-10 21:46 UTC (History)
7 users (show)

See Also:
Host:
Target:
Build:
Known to work:
Known to fail: 10.1.0, 11.0
Last reconfirmed: 2021-01-21 00:00:00


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Romain Geissler 2020-03-25 22:35:38 UTC
Hi,

The following code emits a false positive -Wstringop-overflow warning with gcc trunk with -O2 and -O3.

It's reduced manually from a much more complex scenario, where we try to define an iterator that can be stored in shared memory, thus referencing object through relative addresses rather than absolute addresses. In the warning of this scenario, I don't see how we can access offset -424242 as the initial pointed address is never nullptr.

Note: gcc 8/9 do not emit such warning.


#include <iterator>
#include <memory>
#include <cassert>
#include <limits>
#include <cstring>
 
template <class T>
struct relative_address_iterator {
    typedef T value;
    typedef T *pointer;
    typedef const T *const_pointer;
    typedef T &reference;
    typedef const T &const_reference;
 
    typedef T value_type;
    typedef ptrdiff_t difference_type;
    typedef std::forward_iterator_tag iterator_category;
 
 
    off_t d;
    static constexpr off_t kEmptyPointer = std::numeric_limits<off_t>::min();

    /** Constructs the object from a pointer.
     */
    relative_address_iterator(pointer t)
    {
        if (t)
        {
            d = (char *)t - (char *)this;
        }   
        else
        {
            d = kEmptyPointer;
        }   
    }   
    
    /**  Copy constructor.
     */
    relative_address_iterator(const relative_address_iterator &r)
    {
        if (r.d != kEmptyPointer)
        {
            d = (char *)&r - (char *)this + r.d;
        }   
        else
        {
            // Normally this should not happen in this example, but gcc thinks it does,
            // as the error message references the special value -424242.
            // d = kEmptyPointer;
            d = -424242;
        }   
    }   
    
    /** Initializes the pointee with the given value.
     */
    void setVal(const_reference value) { *(pointer)((char *)this + d) = value; }
    
    /** Dereferences the object.
     */
    operator pointer()
    {
        if (d == kEmptyPointer) return nullptr;
        return ((pointer)((char *)this + d));
    }   
    
    /** Preincrement operator.
     */
    relative_address_iterator &operator++()
    {
        d += sizeof(value);
        return (*this);
    }   
};  

void f()
{
    char* aDummyBuffer = static_cast<char*>(malloc(10));
    memset(aDummyBuffer, 10, 0);
    relative_address_iterator<char> it(aDummyBuffer);
    relative_address_iterator<char> itCopy(it);
    
    itCopy.setVal('A');
    
    char* aDummySource = static_cast<char*>(malloc(10));
    memset(aDummySource, 10, 0);
    std::copy(relative_address_iterator<char>(aDummySource),
            relative_address_iterator<char>(aDummySource + 5),
            it);
}           
            
int main()  
{
    f();
}



In member function 'void relative_address_iterator<T>::setVal(relative_address_iterator<T>::const_reference) [with T = char]',
    inlined from 'void f()' at <source>:82:18:
<source>:56:71: warning: writing 1 byte into a region of size 0 [-Wstringop-overflow=]
   56 |     void setVal(const_reference value) { *(pointer)((char *)this + d) = value; }
      |                                          ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~
<source>: In function 'void f()':
<source>:80:37: note: at offset -424242 to object 'itCopy' with size 8 declared here
   80 |     relative_address_iterator<char> itCopy(it);
      |                                     ^~~~~~
Compiler returned: 0


Cheers,
Romain
Comment 1 Martin Sebor 2020-03-26 00:07:49 UTC
This type of warning is new GCC 10; it was added in the commit below.  It works as designed here.  It sees the following IL (the memset calls don't do anything).  The MEM[] = 65; statement is what triggers it.

  <bb 2> [local count: 1073741824]:
  aDummyBuffer_4 = malloc (10);
  it ={v} {CLOBBER};
  if (aDummyBuffer_4 != 0B)
    goto <bb 3>; [70.00%]
  else
    goto <bb 18>; [30.00%]

  <bb 3> [local count: 751619281]:
  _24 = aDummyBuffer_4 - &it;
  it.d = _24;
  itCopy ={v} {CLOBBER};
  if (_24 != -9223372036854775808)
    goto <bb 4>; [94.29%]
  else
    goto <bb 5>; [5.71%]

  <bb 4> [local count: 708669601]:
  _23 = aDummyBuffer_4 - &itCopy;
  itCopy.d = _23;
  *aDummyBuffer_4 = 65;
  aDummySource_97 = malloc (10);
  D.40357 ={v} {CLOBBER};
  _17 = aDummyBuffer_4 - &D.40357;
  D.40357.d = _17;
  goto <bb 6>; [100.00%]

  <bb 5> [local count: 365072224]:
  itCopy.d = -424242;
  MEM[(char *)&itCopy + -424242B] = 65;   <<< warning here
  aDummySource_105 = malloc (10);
  D.40357 ={v} {CLOBBER};
  D.40357.d = -424242;
  ...
  <bb 18> [local count: 322122544]:
  it.d = -9223372036854775808;
  itCopy ={v} {CLOBBER};
  goto <bb 5>; [100.00%]


It doesn't matter (much) whether the initial address is or can be null (the warning persists even with operator new that doesn't return null or when the ctor never does set d to  kEmptyPointer).  The branch of the code that sets d to -424242 isn't eliminated because the pointer subtraction in either ctor could, as far as GCC can tell, result in the same value as kEmptyPointer.

Asserting that the subtraction doesn't result in such a value, for instance like so:
            if (d == kEmptyPointer) __builtin_unreachable ();
and also guaranteeing that the initial address isn't null (e.g., by using operator new) eliminates the warning.

Short of teaching GCC that the magnitude of the difference between any two pointers must be less than PTRDIFF_MAX I don't think there's anything that can be done do improve things (either codegen, or avoid the warning in this case).   I'll leave this report unresolved in case someone feels otherwise.

commit b631bdb3c16e85f35d38e39b3d315c35e4a5747c
Author: Martin Sebor <msebor@redhat.com>
Date:   Thu Jul 25 00:29:17 2019 +0000

    PR tree-optimization/91183 - strlen of a strcpy result with a conditional source not folded
    
    PR tree-optimization/91183 - strlen of a strcpy result with a conditional source not folded
    PR tree-optimization/86688 - missing -Wstringop-overflow using a non-string local array in strnlen with excessive bound
Comment 2 Romain Geissler 2020-03-26 00:50:29 UTC
Thanks for the explanation.

However few observations:
 - Is it really expected that the wording of the warning seems to imply gcc knows for sure that there is an invalid access ? What is the warning meant to find ? Potential issues, or real issues gcc can prove will happen 100% of the time ? Here I see that you pasted:

  if (_24 != -9223372036854775808)
    goto <bb 4>; [94.29%]
  else
    goto <bb 5>; [5.71%]

so gcc itself is able to see that the most likely case is that we go in basic block 4 rather than 5 which emits this warning.

So either the wording of the warning shall be updated to reflect that "maybe" the code wrong, either if possible we should try to make a different when gcc is sure and when it is not (like -Wuninitialized vs -Wmaybe-uninitialized).

 - Second observations, how do you suggest we teach gcc that this is indeed a false positive and we want the warning to be silenced. You mentioned "if (d == kEmptyPointer) __builtin_unreachable;", does this statement result in effectively having a new basic block branching where one side terminates the program and the other runs the actual expected behavior of "setVal". In other words, will the condition in the "if" really be emitted by the codegen and evaluated at runtime, or do I have the guarantee it will not emit new branches and only teach the optimizer that some cases are impossible ? In the case the answer is that it will emit a real codegen branch potentially impacting runtime, do you think it's worth adding a __builtin_assume in gcc like clang has (not in gcc 10 obviously, but in gcc 11 during stage 1) ?

Cheers,
Romain
Comment 3 Richard Biener 2020-03-26 07:01:14 UTC
You should be using (intptr_t)t - (intptr_t)this when computing the relative
pointer, not sure if that makes a difference but pointer difference between
pointers to different objects invokes undefined behavior.
Comment 4 Romain Geissler 2020-03-26 08:42:41 UTC
Thanks Richard.

Indeed, beyond the false positive described in this bug, our whole code that implement "relative pointer" is I think quite hacky and not very compiler friendly (around alignment, aliasing rule, pointer arithmetic). We should review and update it a lot. Actually a similar class exists in Boost.Interprocess under the name "offset_ptr", and they did write this in their header: https://github.com/boostorg/interprocess/blob/develop/include/boost/interprocess/offset_ptr.hpp

//Note: using the address of a local variable to point to another address
//is not standard conforming and this can be optimized-away by the compiler.
//Non-inlining is a method to remain illegal but correct

//Undef BOOST_INTERPROCESS_OFFSET_PTR_INLINE_XXX if your compiler can inline
//this code without breaking the library

any time they need to deal with pointer to offset conversion, or vice-versa. I happens that we also suffer from similar problems and had to put attribute "noinline" "randomly" in places to "fix" (actually workaround our poor understanding of how the compiler works) problems we are seeing with the behavior of this library when compiled with optimizations. We should obviously review greatly in depths what we are doing which seems wrong in many places.

PS Martin: I am ok to leave unresolved, just I think the wording of the error should be somehow updated so that the fact that it's only a possibility for not a certainty, other of your warnings around string management do print the range of value that gcc found out is possible, and that helps in understanding and fixing/silencing the warnings.
Comment 5 Martin Sebor 2020-03-26 19:39:54 UTC
Few middle-end warnings consider control flow -- most simply look at a single statement at a time and consider ranges of argument values (if any).  Those that do consider control flow (e.g., -Wreturn-local-addr) only do so for PHI nodes that, for the most part, are the result of ?: expressions.  We have work underway to improve the way GCC computes and exposes range information that we expect will help improve the accuracy in cases where there's more than a single range.  I've also been thinking about changing some existing warnings to consider control flow to improve both their accuracy and efficacy, but this would only make a difference for PHI nodes.

I haven't thought too much about using branch probabilities to decide whether to issue a maybe kind of a warning.  Right now, a statement in the IL with a constant out-of-bounds argument triggers a definitive warning regardless of how likely the branch it's in is executed.  It might be something to explore, though I would expect it to quickly turn most warnings in non-trivial code into the maybe kind.

Using 'if (condition) __builtin_unreachable ();' shouldn't have an adverse effect on efficiency.  Rather I'd expect it to improve codegen since it tells GCC that and any code that would otherwise be reachable after it can be eliminated.  I would expect its effects to be comparable to __builtin_assume (condition).
Comment 6 kal.conley 2020-05-20 13:10:56 UTC
We are hitting this warning too with:

#include <stdint.h>
#include <vector>

int main() {
    std::vector<uint64_t> inputs(2);
    std::vector<uint8_t> outputs{inputs.begin(), inputs.end()};
    outputs.back() = 1;
    return 0;
}

Regards,
Kal
Comment 7 Martin Sebor 2020-05-20 15:15:07 UTC
Thanks for the small test case!  The warning for reference is:

pr94335-c6.C: In function ‘int main()’:
pr94335-c6.C:8:20: warning: writing 1 byte into a region of size 0 [-Wstringop-overflow=]
    8 |     outputs.back() = 1;
      |     ~~~~~~~~~~~~~~~^~~
In file included from /build/gcc-master/x86_64-pc-linux-gnu/libstdc++-v3/include/x86_64-pc-linux-gnu/bits/c++allocator.h:33,
                 from /build/gcc-master/x86_64-pc-linux-gnu/libstdc++-v3/include/bits/allocator.h:46,
                 from /build/gcc-master/x86_64-pc-linux-gnu/libstdc++-v3/include/vector:64,
                 from pr94335-c6.C:3:
/build/gcc-master/x86_64-pc-linux-gnu/libstdc++-v3/include/ext/new_allocator.h:115:41: note: at offset 0 to an object with size 0 allocated by ‘operator new’ here
  115 |  return static_cast<_Tp*>(::operator new(__n * sizeof(_Tp)));
      |                           ~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~

The warning in this case is actually due to a different problem, this one in the warning infrastructure itself.  The relevant IL the warning works with (-fdump-tree-strlen) is below:

main ()
{
  ...
  unsigned char * _48;
  ...
  unsigned char * _59;
  ...
  <bb 2> [local count: 1073741825]:
  ...
  _59 = operator new (2);

  <bb 3> [local count: 1073007519]:
  outputs.D.19139._M_impl.D.18482._M_start = _59;
  _48 = _59 + 2;
  ...
  MEM[(value_type &)_48 + 18446744073709551615] = 1;   <<< warning here (18446744073709551615 == -2)
  ...
}

To determine the size of what _48 points to the warning code calls the compute_objsize() function.  It returns zero because of a design limitation: it returns the size of the remaining space in the object (i.e., the full size of the pointed-to object minus the offset, which is 2 in this case).  The caller has no way to increase the size it gets back by its negative offset (the large number which is -2).

I have a rewritten the function to avoid this (and other problems with it) and expect to have a fix for GCC 11, and possibly even for GCC 10.2.

Since this is a different problem than the originally reported bug I don't want to use it to track it.  Feel free to open a separate bug for it.
Comment 8 Martin Sebor 2020-06-11 15:28:42 UTC
(In reply to kal.conley from comment #6)

For reference, this was also submitted as pr95353 and is now fixed on trunk (GCC 11).

The test case in comment #0 still triggers a warning.
Comment 9 Richard Biener 2020-07-23 06:52:00 UTC
GCC 10.2 is released, adjusting target milestone.
Comment 10 Martin Sebor 2021-01-21 22:45:06 UTC
No change in GCC 11.
Comment 11 Richard Biener 2021-04-08 12:02:00 UTC
GCC 10.3 is being released, retargeting bugs to GCC 10.4.
Comment 12 Andrew Pinski 2021-09-21 08:34:06 UTC
The warning for the one in comment #0 is gone on the trunk
The warning for the testcase in comment #6 is gone in GCC 11.1.0.
Comment 13 Andrew Pinski 2021-09-21 09:43:30 UTC
(In reply to Andrew Pinski from comment #12)
> The warning for the testcase in comment #6 is gone in GCC 11.1.0.

But that looks like on accident (there is some early inlining differences) and not really fixing the problem and there becomes a missing optimization.  I Have a patch for the missed optimization (it is the same as PR 102216).
Comment 14 Andrew Pinski 2021-11-23 10:24:10 UTC
(In reply to kal.conley from comment #6)
> We are hitting this warning too with:

The missed optimization in this one was fixed with r12-5465-g911b633803dcbb298.
Comment 15 Jakub Jelinek 2022-06-28 10:40:07 UTC
GCC 10.4 is being released, retargeting bugs to GCC 10.5.
Comment 16 Richard Biener 2023-07-07 10:37:07 UTC
GCC 10 branch is being closed.
Comment 17 Jeffrey A. Law 2024-03-10 21:46:55 UTC
Per c#12, c#13, c#14.