This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH] enhance -Warray-bounds to handle strings and excessive indices
On October 20, 2017 5:43:40 PM GMT+02:00, Martin Sebor <msebor@gmail.com> wrote:
>On 10/20/2017 02:08 AM, Richard Biener wrote:
>> On Fri, Oct 20, 2017 at 1:00 AM, Martin Sebor <msebor@gmail.com>
>wrote:
>>> On 10/19/2017 02:34 AM, Richard Biener wrote:
>>>>
>>>> On Thu, Oct 19, 2017 at 1:19 AM, Martin Sebor <msebor@gmail.com>
>wrote:
>>>>>
>>>>> On 10/18/2017 04:48 AM, Richard Biener wrote:
>>>>>>
>>>>>>
>>>>>> On Wed, Oct 18, 2017 at 5:34 AM, Martin Sebor <msebor@gmail.com>
>wrote:
>>>>>>>
>>>>>>>
>>>>>>> While testing my latest -Wrestrict changes I noticed a number of
>>>>>>> opportunities to improve the -Warray-bounds warning. Attached
>>>>>>> is a patch that implements a solution for the following subset
>>>>>>> of these:
>>>>>>>
>>>>>>> PR tree-optimization/82596 - missing -Warray-bounds on an out-of
>>>>>>> bounds index into string literal
>>>>>>> PR tree-optimization/82588 - missing -Warray-bounds on an
>excessively
>>>>>>> large index
>>>>>>> PR tree-optimization/82583 - missing -Warray-bounds on
>out-of-bounds
>>>>>>> inner indices
>>>
>>>
>>>> I meant to use size_type_node (size_t), not sizetype. But
>>>> I just checked that ptrdiff_type_node is initialized in
>>>> build_common_tree_nodes and thus always available.
>>>
>>>
>>> I see. Using ptrdiff_type_node is preferable for the targets
>>> where ptrdiff_t has a greater precision than size_t (e.g., VMS).
>>> It makes sense now. I should remember to change all the other
>>> places where I introduced ssizetype to use ptrdiff_type_node.
>>>
>>>>
>>>>> As an aside, at some point I would like to get away from a type
>>>>> based limit in all these warnings and instead use one that can
>>>>> be controlled by an option so that a user can impose a lower limit
>>>>> on the maximum size of an object and have all size-related
>warnings
>>>>> (and perhaps even optimizations) enforce it and benefit from it.
>>>>
>>>>
>>>> You could add a --param that is initialized from ptrdiff_type_node.
>>>
>>>
>>> Yes, that's an option to consider. Thanks.
>>>
>>>
>>>>
>>>>>> + tree arg = TREE_OPERAND (ref, 0);
>>>>>> + tree_code code = TREE_CODE (arg);
>>>>>> + if (code == COMPONENT_REF)
>>>>>> + {
>>>>>> + HOST_WIDE_INT off;
>>>>>> + if (tree base = get_addr_base_and_unit_offset (ref,
>&off))
>>>>>> + up_bound_p1 = fold_build2 (MINUS_EXPR, ssizetype,
>>>>>> up_bound_p1,
>>>>>> + TYPE_SIZE_UNIT (TREE_TYPE
>>>>>> (base)));
>>>>>> + else
>>>>>> + return;
>>>>>>
>>>>>> so this gives up on a.b[i].c.d[k] (ok, array_at_struct_end_p will
>be
>>>>>> false).
>>>>>> simply not subtracting anyhing instead of returning would be
>>>>>> conservatively
>>>>>> correct, no? Likewise subtracting the offset of the array for
>all
>>>>>> "previous"
>>>>>> variably indexed components with assuming the lowest value for
>the
>>>>>> index.
>>>>>> But as above I think compensating for the offset of the array
>within the
>>>>>> object
>>>>>> is academic ... ;)
>>>>>
>>>>>
>>>>>
>>>>> I was going to say yes (it gives up) but on second thought I don't
>>>>> think it does. Only the major index can be unbounded and the code
>>>>> does consider the size of the sub-array when checking the major
>>>>> index. So, IIUC, I think this works correctly as is (*). What
>>>>> doesn't work is VLAs but those are a separate problem. Let me
>>>>> know if I misunderstood your question.
>>>>
>>>>
>>>> get_addr_base_and_unit_offset will return NULL if there's any
>variable
>>>> component in 'ref'. So as written it seems to be dead code (you
>>>> want to pass 'arg'?)
>>>
>>>
>>> Sorry, I'm not sure I understand what you mean. What do you think
>>> is dead code? The call to get_addr_base_and_unit_offset() is also
>>> made for an array of unspecified bound (up_bound is null) and for
>>> an array at the end of a struct. For those the function returns
>>> non-null, and for the others (arrays of runtime bound) it returns
>>> null. (I passed arg instead of ref but I see no difference in
>>> my tests.)
>>
>> If you pass a.b.c[i] it will return NULL, if you pass a.b.c ('arg')
>it will
>> return the offset of 'c'. If you pass a.b[j].c it will still return
>NULL.
>> You could use get_ref_base_and_extent which will return the offset
>> of a.b[0].c in this case and sets max_size != size - but you are only
>> interested in offset. The disadvantage of get_ref_base_and_extent
>> is it returns offset in bits thus if the offset is too large for a
>HWI
>> you'll instead get offset == 0 and max_size == -1.
>>
>> Thus I'm saying this is dead code for variable array accesses
>> (even for the array you are warning about). Yes, for constant index
>> and at-struct-end you'll get sth, but the warning is in VRP because
>> of variable indexes.
>>
>> So I suggest to pass 'arg' and use get_ref_base_and_extent
>> for some extra precision (and possible lossage for very very large
>> structures).
>
>Computing bit offsets defeats the out-of-bounds flexible array
>index detection because the computation overflows (the function
>sets the offset to zero).
It shouldn't if you pass arg rather than ref.
I'll go ahead with the first version
>unless you have a different suggestion.
>
>Thanks
>Martin
>
>>
>> Thus instead of
>>
>> + tree maxbound = TYPE_MAX_VALUE (ptrdiff_type_node);
>> +
>> + up_bound_p1 = int_const_binop (TRUNC_DIV_EXPR, maxbound,
>eltsize);
>> +
>> + tree arg = TREE_OPERAND (ref, 0);
>> + tree_code code = TREE_CODE (arg);
>> + if (code == COMPONENT_REF)
>> + {
>> + HOST_WIDE_INT off;
>> + if (tree base = get_addr_base_and_unit_offset (arg, &off))
>> + {
>> + tree size = TYPE_SIZE_UNIT (TREE_TYPE (base));
>>
>> (not sure why you're subtracting the size of the base here?! I
>> thought you subtract off!)
>>
>> + if (TREE_CODE (size) == INTEGER_CST)
>> + up_bound_p1 = int_const_binop (MINUS_EXPR,
>up_bound_p1, size);
>> + }
>>
>> do
>>
>> if (handled_component_p (arg))
>> {
>> HOST_WIDE_INT bitoff, bitsize, bitmax_size;
>> bool reverse;
>> get_ref_base_and_extent (arg, &bitoff, &bitsize,
>> &bitmax_size, &reverse);
>> up_bound_p1 = int_const_binop (MINUS_EXPR, up_bound_p1,
>> build_int_cst (TREE_TYPE (up_bound_p1),
>> bitoff / BITS_PER_UNIT));
>> }
>>
>> ok with that change.
>>
>> Richard.
>>
>>>
>>>>
>>>> I was asking you to remove the 'else return' because w/o
>subtracting
>>>> the upper bound is just more conservative.
>>>
>>>
>>> Sure, that sounds good.
>>>
>>> Attached is an updated patch.
>>>
>>> Thanks
>>> Martin