This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH] restore -Warray-bounds for string literals (PR 83776)


On 07/13/2018 05:45 PM, Martin Sebor wrote:
>>
>> +      offset_int ofr[] = {
>> +       wi::to_offset (fold_convert (ptrdiff_type_node, vr->min)),
>> +       wi::to_offset (fold_convert (ptrdiff_type_node, vr->max))
>> +      };
>>
>> huh.  Do you maybe want to use widest_int for ofr[]?  What's
>> wrong with wi::to_offset (vr->min)?  Building another intermediate
>> tree node here looks just bogus.
> 
> I need to convert size_type indices to signed to keep their sign
> if it's negative and include it in warnings.  I've moved the code
> into a conditional where it's used to minimize the cost but other
> than that I don't know how else to convert it.
> 
>>
>> What are you trying to do in this loop anyways?
> 
> The loop It determines the range of the final index/offset for
> a series of POINTER_PLUS_EXPRs.  It handles cases like this:
> 
>   int g (int i, int j, int k)
>   {
>     if (i < 1) i = 1;
>     if (j < 1) j = 1;
>     if (k < 1) k = 1;
> 
>     const char *p0 = "123";
>     const char *p1 = p0 + i;
>     const char *p2 = p1 + j;
>     const char *p3 = p2 + k;
> 
>     // p2[3] and p3[1] are out of bounds
>     return p0[4] + p1[3] + p2[2] + p3[1];
>   }
> 
>> I suppose
>> look at
>>
>>   p_1 = &obj[i_2];  // already bounds-checked, but with ignore_off_by_one
>>   ... = MEM[p_1 + CST];
>>
>> ?  But then
>>
>> +      if (TREE_CODE (varoff) != SSA_NAME)
>> +       break;
>>
>> you should at least handle INTEGER_CSTs here?
> 
> It's already handled (and set in CSTOFF).  There should be no
> more constant offsets after that (I haven't come across any.)
> 
>>
>> +      if (!vr || vr->type == VR_UNDEFINED || !vr->min || !vr->max)
>> +       break;
>>
>> please use positive tests here, VR_RANGE || VR_ANTI_RANGE.  As usual
>> the anti-range handling looks odd.  Please add comments so we can follow
>> what you were thinking when writing range merging code.  Even better if you
>>
>> can stick to use existing code rather than always re-inventing the wheel...
>>
> 
> The anti-range handling is to conservatively add
> [-MAXOBJSIZE -1, MAXOBJSIZE] to OFFRANGE.  I've added comments
> to make it clear.  I'd be more than happy to reuse existing
> code if I knew where to find it (if it exists).  It sure would
> save me lots of work to have a library of utility functions to
> call instead of rolling my own code each time.
Finding stuff is never easy :(  GCC's just gotten so big it's virtually
impossible for anyone to know all the APIs.

The suggestion I'd have would be to (when possible) factor this stuff
into routines you can reuse.  We (as a whole) have this tendency to
open-code all kinds of things rather than factoring the code into
reusable routines.

In addition to increasing the probability that you'll be able to reuse
code later, just reading a new function's header tends to make me (as a
reviewer) internally ask if there's already a routine we should be using
instead.  When it's open-coded it's significantly harder to spot those
cases (at least for me).


> 
>>
>> I think I commented on earlier variants but this doesn't seem to resemble
>> any of them.
> 
> I've reworked the patch (sorry) to also handle arrays.  For GCC
> 9 it seems I might as well do both in one go.
> 
> Attached is an updated patch with these changes.
> 
> Martin
> 
> gcc-83776.diff
> 
> 
> PR tree-optimization/84047 - missing -Warray-bounds on an out-of-bounds index into an array
> PR tree-optimization/83776 - missing -Warray-bounds indexing past the end of a string literal
> 
> gcc/ChangeLog:
> 
> 	PR tree-optimization/84047
> 	PR tree-optimization/83776
> 	* tree-vrp.c (vrp_prop::check_mem_ref): New function.
> 	(check_array_bounds): Call it.
> 	* /gcc/tree-sra.c (get_access_for_expr): Fail for out-of-bounds
> 	array offsets.
> 
> gcc/testsuite/ChangeLog:
> 
> 	PR tree-optimization/83776
> 	PR tree-optimization/84047
> 	* gcc.dg/Warray-bounds-29.c: New test.
> 	* gcc.dg/Warray-bounds-30.c: New test.
> 	* gcc.dg/Warray-bounds-31.c: New test.
> 	* gcc.dg/Warray-bounds-32.c: New test.
> 

> diff --git a/gcc/tree-sra.c b/gcc/tree-sra.c
> index 3e30f6b..8221a06 100644
> --- a/gcc/tree-sra.c
> +++ b/gcc/tree-sra.c
> @@ -3110,6 +3110,19 @@ get_access_for_expr (tree expr)
>        || !DECL_P (base))
>      return NULL;
>  
> +  /* Fail for out-of-bounds array offsets.  */
> +  tree basetype = TREE_TYPE (base);
> +  if (TREE_CODE (basetype) == ARRAY_TYPE)
> +    {
> +      if (offset < 0)
> +	return NULL;
> +
> +      if (tree size = DECL_SIZE (base))
> +	if (tree_fits_uhwi_p (size)
> +	    && tree_to_uhwi (size) < (unsigned HOST_WIDE_INT) offset)
> +	  return NULL;
> +    }
> +
>    if (!bitmap_bit_p (candidate_bitmap, DECL_UID (base)))
>      return NULL;
So I'm a bit curious about this hunk.  Did we end up creating an access
structure that walked off the end of the array?  Presumably  you
suppressing SRA at this point so that you can see the array access later
and give a suitable warning.  Right?


>  
> diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c
> index a7453ab..27021760 100644
> --- a/gcc/tree-vrp.c
> +++ b/gcc/tree-vrp.c
> @@ -4930,21 +4931,280 @@ vrp_prop::check_array_ref (location_t location, tree ref,
>      }
>  }
>  
> +/* Checks one MEM_REF in REF, located at LOCATION, for out-of-bounds
> +   references to string constants.  If VRP can determine that the array
> +   subscript is a constant, check if it is outside valid range.
> +   If the array subscript is a RANGE, warn if it is non-overlapping
> +   with valid range.
> +   IGNORE_OFF_BY_ONE is true if the MEM_REF is inside an ADDR_EXPR
> +   (used to allow one-past-the-end indices for code that takes
> +   the address of the just-past-the-end element of an array).  */
> +
> +void
> +vrp_prop::check_mem_ref (location_t location, tree ref, bool ignore_off_by_one)
> +{
> +  if (TREE_NO_WARNING (ref))
> +    return;
> +
> +  tree arg = TREE_OPERAND (ref, 0);
> +  /* The constant and variable offset of the reference.  */
> +  tree cstoff = TREE_OPERAND (ref, 1);
> +  tree varoff = NULL_TREE;
> +
> +  const offset_int maxobjsize = tree_to_shwi (max_object_size ());
> +
> +  /* The array or string constant bounds in bytes.  Initially set
> +     to [-MAXOBJSIZE - 1, MAXOBJSIZE]  until a tighter bound is
> +     determined.  */
> +  offset_int arrbounds[2];
> +  arrbounds[1] = maxobjsize;
> +  arrbounds[0] = -arrbounds[1] - 1;
I realize that arrbounds[1] == maxobjsize at this point.  But is there
any reason not to use maxobjsize in the assignment to arrbounds[0] so
that its computation matches the comment.  It would also seem advisable
to set the min first, then the max since that's the ordering in the
comment and in the array.


> +
> +  /* The minimum and maximum intermediate offset.  For a reference
> +     to be valid, not only does the final offset/subscript must be
> +     in bounds but all intermediate offsets must be as well. */
> +  offset_int ioff = wi::to_offset (fold_convert (ptrdiff_type_node, cstoff));
> +  offset_int extrema[2] = { 0, wi::abs (ioff) };
You should probably tone back the comment here since the intermediate
offsets do not have to be in bounds.


[ Big snip ]

> +  /* The type of the object being referred to.  It can be an array,
> +     string literal, or a non-array type when the MEM_REF represents
> +     a reference/subscript via a pointer to an object that is not
> +     an element of an array.  References to members of structs and
> +     unions are excluded because MEM_REF doesn't make it possible
> +     to identify the member where the reference orginated.  */
s/orginated/originated/

OK with those changes.

jeff


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]