This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH] include size and offset in -Wstringop-overflow


On 11/6/19 2:06 PM, Martin Sebor wrote:
> On 11/6/19 1:39 PM, Jeff Law wrote:
>> On 11/6/19 1:27 PM, Martin Sebor wrote:
>>> On 11/6/19 11:55 AM, Jeff Law wrote:
>>>> On 11/6/19 11:00 AM, Martin Sebor wrote:
>>>>> The -Wstringop-overflow warnings for single-byte and multi-byte
>>>>> stores mention the amount of data being stored and the amount of
>>>>> space remaining in the destination, such as:
>>>>>
>>>>> warning: writing 4 bytes into a region of size 0 [-Wstringop-overflow=]
>>>>>
>>>>>     123 |   *p = 0;
>>>>>         |   ~~~^~~
>>>>> note: destination object declared here
>>>>>      45 |   char b[N];
>>>>>         |        ^
>>>>>
>>>>> A warning like this can take some time to analyze.  First, the size
>>>>> of the destination isn't mentioned and may not be easy to tell from
>>>>> the sources.  In the note above, when N's value is the result of
>>>>> some non-trivial computation, chasing it down may be a small project
>>>>> in and of itself.  Second, it's also not clear why the region size
>>>>> is zero.  It could be because the offset is exactly N, or because
>>>>> it's negative, or because it's in some range greater than N.
>>>>>
>>>>> Mentioning both the size of the destination object and the offset
>>>>> makes the existing messages clearer, are will become essential when
>>>>> GCC starts diagnosing overflow into allocated buffers (as my
>>>>> follow-on patch does).
>>>>>
>>>>> The attached patch enhances -Wstringop-overflow to do this by
>>>>> letting compute_objsize return the offset to its caller, doing
>>>>> something similar in get_stridx, and adding a new function to
>>>>> the strlen pass to issue this enhanced warning (eventually, I'd
>>>>> like the function to replace the -Wstringop-overflow handler in
>>>>> builtins.c).  With the change, the note above might read something
>>>>> like:
>>>>>
>>>>> note: at offset 11 to object ‘b’ with size 8 declared here
>>>>>      45 |   char b[N];
>>>>>         |        ^
>>>>>
>>>>> Tested on x86_64-linux.
>>>>>
>>>>> Martin
>>>>>
>>>>> gcc-store-offset.diff
>>>>>
>>>>> gcc/ChangeLog:
>>>>>
>>>>>      * builtins.c (compute_objsize): Add an argument and set it to
>>>>> offset
>>>>>      into destination.
>>>>>      * builtins.h (compute_objsize): Add an argument.
>>>>>      * tree-object-size.c (addr_object_size): Add an argument and
>>>>> set it
>>>>>      to offset into destination.
>>>>>      (compute_builtin_object_size): Same.
>>>>>      * tree-object-size.h (compute_builtin_object_size): Add an
>>>>> argument.
>>>>>      * tree-ssa-strlen.c (get_addr_stridx): Add an argument and set it
>>>>>      to offset into destination.
>>>>>      (maybe_warn_overflow): New function.
>>>>>      (handle_store): Call maybe_warn_overflow to issue warnings.
>>>>>
>>>>> gcc/testsuite/ChangeLog:
>>>>>
>>>>>      * c-c++-common/Wstringop-overflow-2.c: Adjust text of expected
>>>>> messages.
>>>>>      * g++.dg/warn/Wstringop-overflow-3.C: Same.
>>>>>      * gcc.dg/Wstringop-overflow-17.c: Same.
>>>>>
>>>>
>>>>> Index: gcc/tree-ssa-strlen.c
>>>>> ===================================================================
>>>>> --- gcc/tree-ssa-strlen.c    (revision 277886)
>>>>> +++ gcc/tree-ssa-strlen.c    (working copy)
>>>>> @@ -189,6 +189,52 @@ struct laststmt_struct
>>>>>    static int get_stridx_plus_constant (strinfo *, unsigned
>>>>> HOST_WIDE_INT, tree);
>>>>>    static void handle_builtin_stxncpy (built_in_function,
>>>>> gimple_stmt_iterator *);
>>>>>    +/* Sets MINMAX to either the constant value or the range VAL is in
>>>>> +   and returns true on success.  */
>>>>> +
>>>>> +static bool
>>>>> +get_range (tree val, wide_int minmax[2], const vr_values *rvals =
>>>>> NULL)
>>>>> +{
>>>>> +  if (tree_fits_uhwi_p (val))
>>>>> +    {
>>>>> +      minmax[0] = minmax[1] = wi::to_wide (val);
>>>>> +      return true;
>>>>> +    }
>>>>> +
>>>>> +  if (TREE_CODE (val) != SSA_NAME)
>>>>> +    return false;
>>>>> +
>>>>> +  if (rvals)
>>>>> +    {
>>>>> +      gimple *def = SSA_NAME_DEF_STMT (val);
>>>>> +      if (gimple_assign_single_p (def)
>>>>> +      && gimple_assign_rhs_code (def) == INTEGER_CST)
>>>>> +    {
>>>>> +      /* get_value_range returns [0, N] for constant assignments.  */
>>>>> +      val = gimple_assign_rhs1 (def);
>>>>> +      minmax[0] = minmax[1] = wi::to_wide (val);
>>>>> +      return true;
>>>>> +    }
>>>> Umm, something seems really off with this hunk.  If the SSA_NAME is set
>>>> via a simple constant assignment, then the range ought to be a
>>>> singleton
>>>> ie [CONST,CONST].   Is there are particular test were this is not true?
>>>>
>>>> The only way offhand I could see this happening is if originally the
>>>> RHS
>>>> wasn't a constant, but due to optimizations it either simplified into a
>>>> constant or a constant was propagated into an SSA_NAME appearing on the
>>>> RHS.  This would have to happen between the last range analysis and the
>>>> point where you're making this query.
>>>
>>> Yes, I think that's right.  Here's an example where it happens:
>>>
>>>    void f (void)
>>>    {
>>>      char s[] = "1234";
>>>      unsigned n = strlen (s);
>>>      char vla[n];   // or malloc (n)
>>>      vla[n] = 0;    // n = [4, 4]
>>>      ...
>>>    }
>>>
>>> The strlen call is folded to 4 but that's not propagated to
>>> the access until sometime after the strlen pass is done.
>> Hmm.  Are we calling set_range_info in that case?  That goes behind the
>> back of pass instance of vr_values.  If so, that might argue we want to
>> be setting it in vr_values too.
> 
> No, set_range_info is only called for ranges.  In this case,
> handle_builtin_strlen replaces the strlen() call with 4:
> 
>   s = "1234";
>   _1 = __builtin_strlen (&s);
>   n_2 = (unsigned int) _1;
>   a.1_8 = __builtin_alloca_with_align (_1, 8);
>   (*a.1_8)[n_2] = 0;
Right.  But at the point where we make the substitution for the call on
the RHS the range is a singleton and we could set the range of _1 to [4,
4].  We could also set its SSA_NAME_VALUE to 4.  Hell, we could even
forward propagate the constant to the uses.  Any/all of those would seem
better than the hack in question.


> 
> When the access is made, the __builtin_alloca_with_align call
> is found as the destination and the _1 SSA_NAME is used to
> get its size.  We get back the range [4, 4].
Now I'm confused.  If we're getting [4, 4], then that's exactly what we
want.

Jeff


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]