This is the mail archive of the mailing list for the GCC project.

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH] restore -Warray-bounds for string literals (PR 83776)

On 07/13/2018 05:09 PM, Martin Sebor wrote:
>>> +  /* At level 2 check also intermediate offsets.  */
>>> +  int i = 0;
>>> +  if (extrema[i] < -strbounds[1]
>>> +      || extrema[i = 1] > strbounds[1] + eltsize)
>>> +    {
>>> +      HOST_WIDE_INT tmpidx = extrema[i].to_shwi () / eltsize.to_shwi
>>> ();
>>> +
>>> +      warning_at (location, OPT_Warray_bounds,
>>> +          "intermediate array offset %wi is outside array bounds "
>>> +          "of %qT",
>>> +          tmpidx,  strtype);
>>> +      TREE_NO_WARNING (ref) = 1;
>>> +    }
>>> +}
>> This seems ill-advised.  All that matters is the actual index used in
>> the dereference.  Intermediate values (ie, address computations) along
>> the way are uninteresting -- we may form an address out of the bounds of
>> the array as an intermediate computation.  But the actual memory
>> reference will be within the range.
> The idea is to help detect bugs that cannot be detected otherwise.
> Take the example below:
>   void g (int i)
>   {
>     if (i < 1 || 2 < i)
>       i = 1;
>     const char *p1 = "12" + i;
>     const char *p2 = p1 + i;
>     extern int x, y;
>     x = p2[-4];   // #1: only valid if p2 is out of bounds
>     y = p2[0];    // #2 therefore this must be out of bounds
>   }
> For the dereference at #1 to be valid i must be 2 (i.e.,
> the upper bound of its range) and so p2 is therefore out
> of bounds.
> We may be able to compensate for it and compute the right
> address at #1 but if the out-of-bounds value is used in
> another dereference that makes a different assumption
> (such as #2) one of the two is definitely wrong.  It might
> be possible to do something smarter and determine if the
> pointer really is used this way but I didn't want to
> complicate the things too much so I put the logic under
> level 2.
> I will post an updated patch shortly.
The problem here is the intermediate values may well be out of bounds
addresses that are brought into bounds by an offset in the actual

The loop optimizers can create this situation internally and IIRC Ada
also creates them before handing the code off to gimple.

I'm not aware of a way to distinguish between cases where the compiler
has created the out of bounds address and will compensate in the final
access vs user error.  Worse yet, I know of no way to stop creation of
the out of bounds address, so users would get an warning which they
can't reasonably cope with.

I guess having it under level 2 may be OK.  I'd kind of like to hear
other opinions here.


ps.  And yes, this is braindamaged and is known to break badly on some
segmented architectures which select the segment based on the base
register rather than the full effective address.  THe PA has that as a
design "feature".  The mn103 (and defunct mn102 IIRC) have this as an
implementation bug.

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]