[PATCH] enhance -Warray-bounds to handle strings and excessive indices
Richard Biener
richard.guenther@gmail.com
Fri Oct 20 08:12:00 GMT 2017
On Fri, Oct 20, 2017 at 1:00 AM, Martin Sebor <msebor@gmail.com> wrote:
> On 10/19/2017 02:34 AM, Richard Biener wrote:
>>
>> On Thu, Oct 19, 2017 at 1:19 AM, Martin Sebor <msebor@gmail.com> wrote:
>>>
>>> On 10/18/2017 04:48 AM, Richard Biener wrote:
>>>>
>>>>
>>>> On Wed, Oct 18, 2017 at 5:34 AM, Martin Sebor <msebor@gmail.com> wrote:
>>>>>
>>>>>
>>>>> While testing my latest -Wrestrict changes I noticed a number of
>>>>> opportunities to improve the -Warray-bounds warning. Attached
>>>>> is a patch that implements a solution for the following subset
>>>>> of these:
>>>>>
>>>>> PR tree-optimization/82596 - missing -Warray-bounds on an out-of
>>>>> bounds index into string literal
>>>>> PR tree-optimization/82588 - missing -Warray-bounds on an excessively
>>>>> large index
>>>>> PR tree-optimization/82583 - missing -Warray-bounds on out-of-bounds
>>>>> inner indices
>
>
>> I meant to use size_type_node (size_t), not sizetype. But
>> I just checked that ptrdiff_type_node is initialized in
>> build_common_tree_nodes and thus always available.
>
>
> I see. Using ptrdiff_type_node is preferable for the targets
> where ptrdiff_t has a greater precision than size_t (e.g., VMS).
> It makes sense now. I should remember to change all the other
> places where I introduced ssizetype to use ptrdiff_type_node.
>
>>
>>> As an aside, at some point I would like to get away from a type
>>> based limit in all these warnings and instead use one that can
>>> be controlled by an option so that a user can impose a lower limit
>>> on the maximum size of an object and have all size-related warnings
>>> (and perhaps even optimizations) enforce it and benefit from it.
>>
>>
>> You could add a --param that is initialized from ptrdiff_type_node.
>
>
> Yes, that's an option to consider. Thanks.
>
>
>>
>>>> + tree arg = TREE_OPERAND (ref, 0);
>>>> + tree_code code = TREE_CODE (arg);
>>>> + if (code == COMPONENT_REF)
>>>> + {
>>>> + HOST_WIDE_INT off;
>>>> + if (tree base = get_addr_base_and_unit_offset (ref, &off))
>>>> + up_bound_p1 = fold_build2 (MINUS_EXPR, ssizetype,
>>>> up_bound_p1,
>>>> + TYPE_SIZE_UNIT (TREE_TYPE
>>>> (base)));
>>>> + else
>>>> + return;
>>>>
>>>> so this gives up on a.b[i].c.d[k] (ok, array_at_struct_end_p will be
>>>> false).
>>>> simply not subtracting anyhing instead of returning would be
>>>> conservatively
>>>> correct, no? Likewise subtracting the offset of the array for all
>>>> "previous"
>>>> variably indexed components with assuming the lowest value for the
>>>> index.
>>>> But as above I think compensating for the offset of the array within the
>>>> object
>>>> is academic ... ;)
>>>
>>>
>>>
>>> I was going to say yes (it gives up) but on second thought I don't
>>> think it does. Only the major index can be unbounded and the code
>>> does consider the size of the sub-array when checking the major
>>> index. So, IIUC, I think this works correctly as is (*). What
>>> doesn't work is VLAs but those are a separate problem. Let me
>>> know if I misunderstood your question.
>>
>>
>> get_addr_base_and_unit_offset will return NULL if there's any variable
>> component in 'ref'. So as written it seems to be dead code (you
>> want to pass 'arg'?)
>
>
> Sorry, I'm not sure I understand what you mean. What do you think
> is dead code? The call to get_addr_base_and_unit_offset() is also
> made for an array of unspecified bound (up_bound is null) and for
> an array at the end of a struct. For those the function returns
> non-null, and for the others (arrays of runtime bound) it returns
> null. (I passed arg instead of ref but I see no difference in
> my tests.)
If you pass a.b.c[i] it will return NULL, if you pass a.b.c ('arg') it will
return the offset of 'c'. If you pass a.b[j].c it will still return NULL.
You could use get_ref_base_and_extent which will return the offset
of a.b[0].c in this case and sets max_size != size - but you are only
interested in offset. The disadvantage of get_ref_base_and_extent
is it returns offset in bits thus if the offset is too large for a HWI
you'll instead get offset == 0 and max_size == -1.
Thus I'm saying this is dead code for variable array accesses
(even for the array you are warning about). Yes, for constant index
and at-struct-end you'll get sth, but the warning is in VRP because
of variable indexes.
So I suggest to pass 'arg' and use get_ref_base_and_extent
for some extra precision (and possible lossage for very very large
structures).
Thus instead of
+ tree maxbound = TYPE_MAX_VALUE (ptrdiff_type_node);
+
+ up_bound_p1 = int_const_binop (TRUNC_DIV_EXPR, maxbound, eltsize);
+
+ tree arg = TREE_OPERAND (ref, 0);
+ tree_code code = TREE_CODE (arg);
+ if (code == COMPONENT_REF)
+ {
+ HOST_WIDE_INT off;
+ if (tree base = get_addr_base_and_unit_offset (arg, &off))
+ {
+ tree size = TYPE_SIZE_UNIT (TREE_TYPE (base));
(not sure why you're subtracting the size of the base here?! I
thought you subtract off!)
+ if (TREE_CODE (size) == INTEGER_CST)
+ up_bound_p1 = int_const_binop (MINUS_EXPR, up_bound_p1, size);
+ }
do
if (handled_component_p (arg))
{
HOST_WIDE_INT bitoff, bitsize, bitmax_size;
bool reverse;
get_ref_base_and_extent (arg, &bitoff, &bitsize,
&bitmax_size, &reverse);
up_bound_p1 = int_const_binop (MINUS_EXPR, up_bound_p1,
build_int_cst (TREE_TYPE (up_bound_p1),
bitoff / BITS_PER_UNIT));
}
ok with that change.
Richard.
>
>>
>> I was asking you to remove the 'else return' because w/o subtracting
>> the upper bound is just more conservative.
>
>
> Sure, that sounds good.
>
> Attached is an updated patch.
>
> Thanks
> Martin
More information about the Gcc-patches
mailing list