This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH] restore -Warray-bounds for string literals (PR 83776)


On 07/19/2018 03:51 PM, Jeff Law wrote:
On 07/13/2018 05:45 PM, Martin Sebor wrote:

+      offset_int ofr[] = {
+       wi::to_offset (fold_convert (ptrdiff_type_node, vr->min)),
+       wi::to_offset (fold_convert (ptrdiff_type_node, vr->max))
+      };

huh.  Do you maybe want to use widest_int for ofr[]?  What's
wrong with wi::to_offset (vr->min)?  Building another intermediate
tree node here looks just bogus.

I need to convert size_type indices to signed to keep their sign
if it's negative and include it in warnings.  I've moved the code
into a conditional where it's used to minimize the cost but other
than that I don't know how else to convert it.


What are you trying to do in this loop anyways?

The loop It determines the range of the final index/offset for
a series of POINTER_PLUS_EXPRs.  It handles cases like this:

  int g (int i, int j, int k)
  {
    if (i < 1) i = 1;
    if (j < 1) j = 1;
    if (k < 1) k = 1;

    const char *p0 = "123";
    const char *p1 = p0 + i;
    const char *p2 = p1 + j;
    const char *p3 = p2 + k;

    // p2[3] and p3[1] are out of bounds
    return p0[4] + p1[3] + p2[2] + p3[1];
  }

I suppose
look at

  p_1 = &obj[i_2];  // already bounds-checked, but with ignore_off_by_one
  ... = MEM[p_1 + CST];

?  But then

+      if (TREE_CODE (varoff) != SSA_NAME)
+       break;

you should at least handle INTEGER_CSTs here?

It's already handled (and set in CSTOFF).  There should be no
more constant offsets after that (I haven't come across any.)


+      if (!vr || vr->type == VR_UNDEFINED || !vr->min || !vr->max)
+       break;

please use positive tests here, VR_RANGE || VR_ANTI_RANGE.  As usual
the anti-range handling looks odd.  Please add comments so we can follow
what you were thinking when writing range merging code.  Even better if you

can stick to use existing code rather than always re-inventing the wheel...


The anti-range handling is to conservatively add
[-MAXOBJSIZE -1, MAXOBJSIZE] to OFFRANGE.  I've added comments
to make it clear.  I'd be more than happy to reuse existing
code if I knew where to find it (if it exists).  It sure would
save me lots of work to have a library of utility functions to
call instead of rolling my own code each time.
Finding stuff is never easy :(  GCC's just gotten so big it's virtually
impossible for anyone to know all the APIs.

The suggestion I'd have would be to (when possible) factor this stuff
into routines you can reuse.  We (as a whole) have this tendency to
open-code all kinds of things rather than factoring the code into
reusable routines.

In addition to increasing the probability that you'll be able to reuse
code later, just reading a new function's header tends to make me (as a
reviewer) internally ask if there's already a routine we should be using
instead.  When it's open-coded it's significantly harder to spot those
cases (at least for me).




I think I commented on earlier variants but this doesn't seem to resemble
any of them.

I've reworked the patch (sorry) to also handle arrays.  For GCC
9 it seems I might as well do both in one go.

Attached is an updated patch with these changes.

Martin

gcc-83776.diff


PR tree-optimization/84047 - missing -Warray-bounds on an out-of-bounds index into an array
PR tree-optimization/83776 - missing -Warray-bounds indexing past the end of a string literal

gcc/ChangeLog:

	PR tree-optimization/84047
	PR tree-optimization/83776
	* tree-vrp.c (vrp_prop::check_mem_ref): New function.
	(check_array_bounds): Call it.
	* /gcc/tree-sra.c (get_access_for_expr): Fail for out-of-bounds
	array offsets.

gcc/testsuite/ChangeLog:

	PR tree-optimization/83776
	PR tree-optimization/84047
	* gcc.dg/Warray-bounds-29.c: New test.
	* gcc.dg/Warray-bounds-30.c: New test.
	* gcc.dg/Warray-bounds-31.c: New test.
	* gcc.dg/Warray-bounds-32.c: New test.


diff --git a/gcc/tree-sra.c b/gcc/tree-sra.c
index 3e30f6b..8221a06 100644
--- a/gcc/tree-sra.c
+++ b/gcc/tree-sra.c
@@ -3110,6 +3110,19 @@ get_access_for_expr (tree expr)
       || !DECL_P (base))
     return NULL;

+  /* Fail for out-of-bounds array offsets.  */
+  tree basetype = TREE_TYPE (base);
+  if (TREE_CODE (basetype) == ARRAY_TYPE)
+    {
+      if (offset < 0)
+	return NULL;
+
+      if (tree size = DECL_SIZE (base))
+	if (tree_fits_uhwi_p (size)
+	    && tree_to_uhwi (size) < (unsigned HOST_WIDE_INT) offset)
+	  return NULL;
+    }
+
   if (!bitmap_bit_p (candidate_bitmap, DECL_UID (base)))
     return NULL;
So I'm a bit curious about this hunk.  Did we end up creating an access
structure that walked off the end of the array?  Presumably  you
suppressing SRA at this point so that you can see the array access later
and give a suitable warning.  Right?

Yes, but I didn't make a note of the test case that triggered
it and I'm not able to trigger the code path now, so the change
might not be necessary.  I've removed it from the patch.  If it
turns out that it is needed after all I'll commit it later.

diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c
index a7453ab..27021760 100644
--- a/gcc/tree-vrp.c
+++ b/gcc/tree-vrp.c
@@ -4930,21 +4931,280 @@ vrp_prop::check_array_ref (location_t location, tree ref,
     }
 }

+/* Checks one MEM_REF in REF, located at LOCATION, for out-of-bounds
+   references to string constants.  If VRP can determine that the array
+   subscript is a constant, check if it is outside valid range.
+   If the array subscript is a RANGE, warn if it is non-overlapping
+   with valid range.
+   IGNORE_OFF_BY_ONE is true if the MEM_REF is inside an ADDR_EXPR
+   (used to allow one-past-the-end indices for code that takes
+   the address of the just-past-the-end element of an array).  */
+
+void
+vrp_prop::check_mem_ref (location_t location, tree ref, bool ignore_off_by_one)
+{
+  if (TREE_NO_WARNING (ref))
+    return;
+
+  tree arg = TREE_OPERAND (ref, 0);
+  /* The constant and variable offset of the reference.  */
+  tree cstoff = TREE_OPERAND (ref, 1);
+  tree varoff = NULL_TREE;
+
+  const offset_int maxobjsize = tree_to_shwi (max_object_size ());
+
+  /* The array or string constant bounds in bytes.  Initially set
+     to [-MAXOBJSIZE - 1, MAXOBJSIZE]  until a tighter bound is
+     determined.  */
+  offset_int arrbounds[2];
+  arrbounds[1] = maxobjsize;
+  arrbounds[0] = -arrbounds[1] - 1;
I realize that arrbounds[1] == maxobjsize at this point.  But is there
any reason not to use maxobjsize in the assignment to arrbounds[0] so
that its computation matches the comment.  It would also seem advisable
to set the min first, then the max since that's the ordering in the
comment and in the array.

Done.



+
+  /* The minimum and maximum intermediate offset.  For a reference
+     to be valid, not only does the final offset/subscript must be
+     in bounds but all intermediate offsets must be as well. */
+  offset_int ioff = wi::to_offset (fold_convert (ptrdiff_type_node, cstoff));
+  offset_int extrema[2] = { 0, wi::abs (ioff) };
You should probably tone back the comment here since the intermediate
offsets do not have to be in bounds.

Done.



[ Big snip ]

+  /* The type of the object being referred to.  It can be an array,
+     string literal, or a non-array type when the MEM_REF represents
+     a reference/subscript via a pointer to an object that is not
+     an element of an array.  References to members of structs and
+     unions are excluded because MEM_REF doesn't make it possible
+     to identify the member where the reference orginated.  */
s/orginated/originated/

OK with those changes.

Committed without the SRA change as 262893.

Martin


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]