This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]
Other format:	[Raw text]

Re: [PATCH] enhance -Warray-bounds to handle strings and excessive indices

From: Martin Sebor <msebor at gmail dot com>
To: Richard Biener <richard dot guenther at gmail dot com>
Cc: Gcc Patch List <gcc-patches at gcc dot gnu dot org>
Date: Fri, 20 Oct 2017 09:43:40 -0600
Subject: Re: [PATCH] enhance -Warray-bounds to handle strings and excessive indices
Authentication-results: sourceware.org; auth=none
References: <a26a14b4-1f1e-2df1-8a17-5a8dbbb87db0@gmail.com> <CAFiYyc0Gzrentk=5JuVyzdVo1nzardZDvh4-w1m5xEYcw==THg@mail.gmail.com> <76dcea19-8c46-f275-b13b-e562236c776e@gmail.com> <CAFiYyc3jiy9xx+TUAf44=jFLzVUTjFRT4RO5s-DW8LjYjyxaHQ@mail.gmail.com> <d8bd0164-c8a6-3dd6-7e49-7c83f8caf6d7@gmail.com> <CAFiYyc3ZFxtGNJNZix7q23ySt6zzb6mr1U8DTPDsjHsLpUQOUA@mail.gmail.com>

On 10/20/2017 02:08 AM, Richard Biener wrote:

On Fri, Oct 20, 2017 at 1:00 AM, Martin Sebor <msebor@gmail.com> wrote:

On 10/19/2017 02:34 AM, Richard Biener wrote:


On Thu, Oct 19, 2017 at 1:19 AM, Martin Sebor <msebor@gmail.com> wrote:


On 10/18/2017 04:48 AM, Richard Biener wrote:



On Wed, Oct 18, 2017 at 5:34 AM, Martin Sebor <msebor@gmail.com> wrote:



While testing my latest -Wrestrict changes I noticed a number of
opportunities to improve the -Warray-bounds warning.  Attached
is a patch that implements a solution for the following subset
of these:

PR tree-optimization/82596 - missing -Warray-bounds on an out-of
  bounds index into string literal
PR tree-optimization/82588 - missing -Warray-bounds on an excessively
  large index
PR tree-optimization/82583 - missing -Warray-bounds on out-of-bounds
  inner indices

I meant to use size_type_node (size_t), not sizetype.  But
I just checked that ptrdiff_type_node is initialized in
build_common_tree_nodes and thus always available.



I see.  Using ptrdiff_type_node is preferable for the targets
where ptrdiff_t has a greater precision than size_t (e.g., VMS).
It makes sense now.  I should remember to change all the other
places where I introduced ssizetype to use ptrdiff_type_node.

As an aside, at some point I would like to get away from a type
based limit in all these warnings and instead use one that can
be controlled by an option so that a user can impose a lower limit
on the maximum size of an object and have all size-related warnings
(and perhaps even optimizations) enforce it and benefit from it.



You could add a --param that is initialized from ptrdiff_type_node.



Yes, that's an option to consider.  Thanks.

+      tree arg = TREE_OPERAND (ref, 0);
+      tree_code code = TREE_CODE (arg);
+      if (code == COMPONENT_REF)
+       {
+         HOST_WIDE_INT off;
+         if (tree base = get_addr_base_and_unit_offset (ref, &off))
+           up_bound_p1 = fold_build2 (MINUS_EXPR, ssizetype,
up_bound_p1,
+                                      TYPE_SIZE_UNIT (TREE_TYPE
(base)));
+         else
+           return;

so this gives up on a.b[i].c.d[k] (ok, array_at_struct_end_p will be
false).
simply not subtracting anyhing instead of returning would be
conservatively
correct, no?  Likewise subtracting the offset of the array for all
"previous"
variably indexed components with assuming the lowest value for the
index.
But as above I think compensating for the offset of the array within the
object
is academic ... ;)




I was going to say yes (it gives up) but on second thought I don't
think it does.  Only the major index can be unbounded and the code
does consider the size of the sub-array when checking the major
index.  So, IIUC, I think this works correctly as is (*).  What
doesn't work is VLAs but those are a separate problem.  Let me
know if I misunderstood your question.



get_addr_base_and_unit_offset will return NULL if there's any variable
component in 'ref'.  So as written it seems to be dead code (you
want to pass 'arg'?)



Sorry, I'm not sure I understand what you mean.  What do you think
is dead code?  The call to get_addr_base_and_unit_offset() is also
made for an array of unspecified bound (up_bound is null) and for
an array at the end of a struct.  For those the function returns
non-null, and for the others (arrays of runtime bound) it returns
null.  (I passed arg instead of ref but I see no difference in
my tests.)


If you pass a.b.c[i] it will return NULL, if you pass a.b.c ('arg') it will
return the offset of 'c'.  If you pass a.b[j].c it will still return NULL.
You could use get_ref_base_and_extent which will return the offset
of a.b[0].c in this case and sets max_size != size - but you are only
interested in offset.  The disadvantage of get_ref_base_and_extent
is it returns offset in bits thus if the offset is too large for a HWI
you'll instead get offset == 0 and max_size == -1.

Thus I'm saying this is dead code for variable array accesses
(even for the array you are warning about).  Yes, for constant index
and at-struct-end you'll get sth, but the warning is in VRP because
of variable indexes.

So I suggest to pass 'arg' and use get_ref_base_and_extent
for some extra precision (and possible lossage for very very large
structures).


Computing bit offsets defeats the out-of-bounds flexible array
index detection because the computation overflows (the function
sets the offset to zero).  I'll go ahead with the first version
unless you have a different suggestion.

Thanks
Martin


Thus instead of

+      tree maxbound = TYPE_MAX_VALUE (ptrdiff_type_node);
+
+      up_bound_p1 = int_const_binop (TRUNC_DIV_EXPR, maxbound, eltsize);
+
+      tree arg = TREE_OPERAND (ref, 0);
+      tree_code code = TREE_CODE (arg);
+      if (code == COMPONENT_REF)
+       {
+         HOST_WIDE_INT off;
+         if (tree base = get_addr_base_and_unit_offset (arg, &off))
+           {
+             tree size = TYPE_SIZE_UNIT (TREE_TYPE (base));

(not sure why you're subtracting the size of the base here?!  I
thought you subtract off!)

+             if (TREE_CODE (size) == INTEGER_CST)
+               up_bound_p1 = int_const_binop (MINUS_EXPR, up_bound_p1, size);
+           }

do

    if (handled_component_p (arg))
      {
         HOST_WIDE_INT bitoff, bitsize, bitmax_size;
         bool reverse;
         get_ref_base_and_extent (arg, &bitoff, &bitsize,
&bitmax_size, &reverse);
         up_bound_p1 = int_const_binop (MINUS_EXPR, up_bound_p1,
                              build_int_cst (TREE_TYPE (up_bound_p1),
bitoff / BITS_PER_UNIT));
      }

ok with that change.

Richard.


I was asking you to remove the 'else return' because w/o subtracting
the upper bound is just more conservative.



Sure, that sounds good.

Attached is an updated patch.

Thanks
Martin

Follow-Ups:
- Re: [PATCH] enhance -Warray-bounds to handle strings and excessive indices
  - From: Richard Biener

References:
- [PATCH] enhance -Warray-bounds to handle strings and excessive indices
  - From: Martin Sebor
- Re: [PATCH] enhance -Warray-bounds to handle strings and excessive indices
  - From: Richard Biener
- Re: [PATCH] enhance -Warray-bounds to handle strings and excessive indices
  - From: Martin Sebor
- Re: [PATCH] enhance -Warray-bounds to handle strings and excessive indices
  - From: Richard Biener
- Re: [PATCH] enhance -Warray-bounds to handle strings and excessive indices
  - From: Martin Sebor
- Re: [PATCH] enhance -Warray-bounds to handle strings and excessive indices
  - From: Richard Biener

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]