[PATCH] enhance -Warray-bounds to detect out-of-bounds offsets (PR 82455)

Richard Biener rguenther@suse.de
Thu Nov 2 11:48:00 GMT 2017


On Mon, 30 Oct 2017, Martin Sebor wrote:

> On 10/30/2017 02:56 PM, Richard Biener wrote:
> > On October 30, 2017 9:13:04 PM GMT+01:00, Martin Sebor <msebor@gmail.com>
> > wrote:
> > > On 10/30/2017 01:53 PM, Richard Biener wrote:
> > > > On October 30, 2017 4:19:25 PM GMT+01:00, Martin Sebor
> > > <msebor@gmail.com> wrote:
> > > > > On 10/30/2017 05:45 AM, Richard Biener wrote:
> > > > > > On Sun, 29 Oct 2017, Martin Sebor wrote:
> > > > > > 
> > > > > > > In my work on -Wrestrict, to issue meaningful warnings, I found
> > > > > > > it important to detect both out of bounds array indices as well
> > > > > > > as offsets in calls to restrict-qualified functions like strcpy.
> > > > > > > GCC already detects some of these cases but my tests for
> > > > > > > the enhanced warning exposed a few gaps.
> > > > > > > 
> > > > > > > The attached patch enhances -Warray-bounds to detect more
> > > instances
> > > > > > > out-of-bounds indices and offsets to member arrays and non-array
> > > > > > > members.  For example, it detects the out-of-bounds offset in the
> > > > > > > call to strcpy below.
> > > > > > > 
> > > > > > > The patch is meant to be applied on top posted here but not yet
> > > > > > > committed:
> > > > > > >      https://gcc.gnu.org/ml/gcc-patches/2017-10/msg01304.html
> > > > > > > 
> > > > > > > Richard, since this also touches tree-vrp.c I look for your
> > > > > comments.
> > > > > > 
> > > > > > You fail to tell what you are changing and why - I have to reverse
> > > > > > engineer this from the patch which a) isn't easy in this case, b)
> > > > > feels
> > > > > > like a waste of time.  Esp. since the patch does many things.
> > > > > > 
> > > > > > My first question is why do you add a warning from forwprop?  It
> > > > > > _feels_ like you're trying to warn about arbitrary out-of-bound
> > > > > > addresses at the point they are folded to MEM_REFs.  And it looks
> > > > > > like you're warning about pointer arithmetic like &p->a + 6.
> > > > > > That doesn't look correct to me.  Pointer arithmetic in GIMPLE
> > > > > > is not restricted to operate within fields that are appearantly
> > > > > > accessed here - the only restriction is with respect to the
> > > > > > whole underlying pointed-to-object.
> > > > > > 
> > > > > > By doing the warning from forwprop you'll run into all such cases
> > > > > > introduced by GCC itself during quite late optimization passes.
> > > > > 
> > > > > I haven't run into any such cases.  What would a more appropriate
> > > > > place to detect out-of-bounds offsets?  I'm having a hard time
> > > > > distinguishing what is appropriate and what isn't.  For instance,
> > > > > if it's okay to detect some out of bounds offsets/indices in vrp
> > > > > why is it wrong to do a better job of it in forwprop?
> > > > > 
> > > > > > 
> > > > > > You're trying to re-do __builtin_object_size even when that wasn't
> > > > > > used.
> > > > > 
> > > > > That's not the quite my intent, although it is close.
> > > > > 
> > > > > > 
> > > > > > So it looks like you're on the wrong track.  Yes,
> > > > > > 
> > > > > >     strcpy (p->a + 6, "y");
> > > > > > 
> > > > > > _may_ be "invalid" C (I'm not even sure about that!) but it
> > > > > > is certainly not invalid GIMPLE.
> > > > > 
> > > > > Adding (or subtracting) an integer to/from a pointer to an array
> > > > > is defined in both C and C++ only if the result points to an element
> > > > > of the array or just past the last element of the array.  Otherwise
> > > > > it's undefined. (A non-array object is considered an array of one
> > > > > for this purpose.)
> > > > 
> > > > On GIMPLE this is indistinguishable from (short *) (p->a) + 3.
> > > 
> > > Sure, they're both the same:
> > > 
> > >    pa_3 = &p_2(D)->a;
> > >    _1 = pa_3 + 6;
> > > 
> > > and that's fine because the implementation of the warning sees and
> > > uses the byte offset from the beginning of a, so I don't understand
> > > the problem you are pointing out.  Can you clarify what you mean?
> > 
> > It does not access the array but the underlying object. On GIMPLE it is just
> > an address calculation without constraints.
> 
> But the computation starts with the subobject and so is only
> valid within the bounds of the subobject.  Or are you saying
> that GCC emits such GIMPLE expressions for valid code?  If so,
> can you give an example of such code?

There were elaborate transforms of ptr + CST to ptr->a.b.c[3] in the
past.  We have ripped out _most_ of them because of bad interaction
with dependence analysis and _b_o_s warnings.

But for example PRE might still end up propagating

 _1 = &ptr->a.b.c;
 _2 = (*_1)[i_3];

as

 _2 = ptr->a.b.c[i_3];

But it's not so much GCC building up GIMPLE expressions that would
cause false positives but "valid" C code and "invalid" C code
represented exactly the same in GCC.  Let's say

struct S {
    short s[4];
    int i;
};

char foo (struct S *p)
{
  *((char *)p->s + 8);
}

for example which I think is perfectly valid but ends up as

foo (struct S * p)
{
  short int[4] * _1;
  char * _2;

  <bb 2> [0.00%] [count: INV]:
  _1 = &p_3(D)->s;
  _2 = _1 + 8;
  return;

in GIMPLE.  You might say well, I should have used (char *)p,
not (char *)p->s, but I believe such code is more common than
you think (and we have the different _FORITIFY_SOURCE levels
for a reason!  You are forcing the most strict interpretation
on all address computations!)

> Or, if that's not it, what exactly is your concern with this
> enhancement?  If it's that it's implemented in forwprop, what
> would be a better place, e.g., earlier in the optimization
> phase?  If it's something something else, I'd appreciate it
> if you could explain what.

For one implementing this in forwprop looks like a move in the
wrong direction.  I'd like to have separate warning passes or
at most amend warnings from optimization passes, not add new ones.

Richard.



More information about the Gcc-patches mailing list