This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH] enhance -Warray-bounds to detect out-of-bounds offsets (PR 82455)
- From: Richard Biener <rguenther at suse dot de>
- To: Martin Sebor <msebor at gmail dot com>
- Cc: Gcc Patch List <gcc-patches at gcc dot gnu dot org>, Jeff Law <law at redhat dot com>
- Date: Thu, 2 Nov 2017 12:48:19 +0100 (CET)
- Subject: Re: [PATCH] enhance -Warray-bounds to detect out-of-bounds offsets (PR 82455)
- Authentication-results: sourceware.org; auth=none
- References: <b63d021f-ceca-e26f-848a-40182de004c9@gmail.com> <alpine.LSU.2.20.1710301238080.8202@zhemvz.fhfr.qr> <79634da6-bf31-b7f0-15f5-0436fc21a51a@gmail.com> <C174CF6C-E18C-4575-AA5E-AB7D52A6C1D8@suse.de> <ba57f7d9-fd4b-0692-462f-cf4fcf9b6f93@gmail.com> <5FF222AD-B155-434B-9C65-721009D1964E@suse.de> <1064925a-ac64-502e-d0dd-85c27e7432f6@gmail.com>
On Mon, 30 Oct 2017, Martin Sebor wrote:
> On 10/30/2017 02:56 PM, Richard Biener wrote:
> > On October 30, 2017 9:13:04 PM GMT+01:00, Martin Sebor <msebor@gmail.com>
> > wrote:
> > > On 10/30/2017 01:53 PM, Richard Biener wrote:
> > > > On October 30, 2017 4:19:25 PM GMT+01:00, Martin Sebor
> > > <msebor@gmail.com> wrote:
> > > > > On 10/30/2017 05:45 AM, Richard Biener wrote:
> > > > > > On Sun, 29 Oct 2017, Martin Sebor wrote:
> > > > > >
> > > > > > > In my work on -Wrestrict, to issue meaningful warnings, I found
> > > > > > > it important to detect both out of bounds array indices as well
> > > > > > > as offsets in calls to restrict-qualified functions like strcpy.
> > > > > > > GCC already detects some of these cases but my tests for
> > > > > > > the enhanced warning exposed a few gaps.
> > > > > > >
> > > > > > > The attached patch enhances -Warray-bounds to detect more
> > > instances
> > > > > > > out-of-bounds indices and offsets to member arrays and non-array
> > > > > > > members. For example, it detects the out-of-bounds offset in the
> > > > > > > call to strcpy below.
> > > > > > >
> > > > > > > The patch is meant to be applied on top posted here but not yet
> > > > > > > committed:
> > > > > > > https://gcc.gnu.org/ml/gcc-patches/2017-10/msg01304.html
> > > > > > >
> > > > > > > Richard, since this also touches tree-vrp.c I look for your
> > > > > comments.
> > > > > >
> > > > > > You fail to tell what you are changing and why - I have to reverse
> > > > > > engineer this from the patch which a) isn't easy in this case, b)
> > > > > feels
> > > > > > like a waste of time. Esp. since the patch does many things.
> > > > > >
> > > > > > My first question is why do you add a warning from forwprop? It
> > > > > > _feels_ like you're trying to warn about arbitrary out-of-bound
> > > > > > addresses at the point they are folded to MEM_REFs. And it looks
> > > > > > like you're warning about pointer arithmetic like &p->a + 6.
> > > > > > That doesn't look correct to me. Pointer arithmetic in GIMPLE
> > > > > > is not restricted to operate within fields that are appearantly
> > > > > > accessed here - the only restriction is with respect to the
> > > > > > whole underlying pointed-to-object.
> > > > > >
> > > > > > By doing the warning from forwprop you'll run into all such cases
> > > > > > introduced by GCC itself during quite late optimization passes.
> > > > >
> > > > > I haven't run into any such cases. What would a more appropriate
> > > > > place to detect out-of-bounds offsets? I'm having a hard time
> > > > > distinguishing what is appropriate and what isn't. For instance,
> > > > > if it's okay to detect some out of bounds offsets/indices in vrp
> > > > > why is it wrong to do a better job of it in forwprop?
> > > > >
> > > > > >
> > > > > > You're trying to re-do __builtin_object_size even when that wasn't
> > > > > > used.
> > > > >
> > > > > That's not the quite my intent, although it is close.
> > > > >
> > > > > >
> > > > > > So it looks like you're on the wrong track. Yes,
> > > > > >
> > > > > > strcpy (p->a + 6, "y");
> > > > > >
> > > > > > _may_ be "invalid" C (I'm not even sure about that!) but it
> > > > > > is certainly not invalid GIMPLE.
> > > > >
> > > > > Adding (or subtracting) an integer to/from a pointer to an array
> > > > > is defined in both C and C++ only if the result points to an element
> > > > > of the array or just past the last element of the array. Otherwise
> > > > > it's undefined. (A non-array object is considered an array of one
> > > > > for this purpose.)
> > > >
> > > > On GIMPLE this is indistinguishable from (short *) (p->a) + 3.
> > >
> > > Sure, they're both the same:
> > >
> > > pa_3 = &p_2(D)->a;
> > > _1 = pa_3 + 6;
> > >
> > > and that's fine because the implementation of the warning sees and
> > > uses the byte offset from the beginning of a, so I don't understand
> > > the problem you are pointing out. Can you clarify what you mean?
> >
> > It does not access the array but the underlying object. On GIMPLE it is just
> > an address calculation without constraints.
>
> But the computation starts with the subobject and so is only
> valid within the bounds of the subobject. Or are you saying
> that GCC emits such GIMPLE expressions for valid code? If so,
> can you give an example of such code?
There were elaborate transforms of ptr + CST to ptr->a.b.c[3] in the
past. We have ripped out _most_ of them because of bad interaction
with dependence analysis and _b_o_s warnings.
But for example PRE might still end up propagating
_1 = &ptr->a.b.c;
_2 = (*_1)[i_3];
as
_2 = ptr->a.b.c[i_3];
But it's not so much GCC building up GIMPLE expressions that would
cause false positives but "valid" C code and "invalid" C code
represented exactly the same in GCC. Let's say
struct S {
short s[4];
int i;
};
char foo (struct S *p)
{
*((char *)p->s + 8);
}
for example which I think is perfectly valid but ends up as
foo (struct S * p)
{
short int[4] * _1;
char * _2;
<bb 2> [0.00%] [count: INV]:
_1 = &p_3(D)->s;
_2 = _1 + 8;
return;
in GIMPLE. You might say well, I should have used (char *)p,
not (char *)p->s, but I believe such code is more common than
you think (and we have the different _FORITIFY_SOURCE levels
for a reason! You are forcing the most strict interpretation
on all address computations!)
> Or, if that's not it, what exactly is your concern with this
> enhancement? If it's that it's implemented in forwprop, what
> would be a better place, e.g., earlier in the optimization
> phase? If it's something something else, I'd appreciate it
> if you could explain what.
For one implementing this in forwprop looks like a move in the
wrong direction. I'd like to have separate warning passes or
at most amend warnings from optimization passes, not add new ones.
Richard.