[PATCH] handle vector and aggregate stores in -Wstringop-overflow [PR 97027]

Richard Biener richard.guenther@gmail.com
Thu Jul 15 07:31:52 GMT 2021


On Wed, Jul 14, 2021 at 8:46 PM Martin Sebor <msebor@gmail.com> wrote:
>
> On 7/14/21 1:01 AM, Richard Biener wrote:
> > On Tue, Jul 13, 2021 at 9:27 PM Martin Sebor via Gcc-patches
> > <gcc-patches@gcc.gnu.org> wrote:
> >>
> >> An existing, previously xfailed test that I recently removed
> >> the xfail from made me realize that -Wstringop-overflow doesn't
> >> properly detect buffer overflow resulting from vectorized stores.
> >> Because of a difference in the IL the test passes on x86_64 but
> >> fails on targets like aarch64.  Other examples can be constructed
> >> that -Wstringop-overflow fails to diagnose even on x86_64.  For
> >> INSTANCE, the overflow in the following function isn't diagnosed
> >> when the loop is vectorized:
> >>
> >>     void* f (void)
> >>     {
> >>       char *p = __builtin_malloc (8);
> >>       for (int i = 0; i != 16; ++i)
> >>         p[i] = 1 << i;
> >>       return p;
> >>     }
> >>
> >> The attached change enhances the warning to detect those as well.
> >> It found a few bugs in vectorizer tests that the patch corrects.
> >> Tested on x86_64-linux and with an aarch64 cross.
> >
> > -      dest = gimple_call_arg (stmt, 0);
> > +      if (gimple_call_builtin_p (stmt, BUILT_IN_NORMAL)
> > +         && gimple_call_num_args (stmt))
> > +       dest = gimple_call_arg (stmt, 0);
> > +      else
> > +       dest = gimple_call_lhs (stmt);
> > +
> > +      if (!dest)
> > +       return;
> >
> > so this uses arg0 for memcpy (dst, src, 4) and also for bcopy (src, dst, 4)?
>
> No.  The code is only called for assignments like *p = f () and for
> a handful of built-ins (memcpy, strcpy, and memset).

I see - that wasn't obvious when looking at the patch.

> bcopy() returns void and so its result cannot be assigned.  I believe
> bcopy() and the other legacy bxxx() functions are also lowered into
> memcpy/memmove etc. so we should see no calls to it in the middle end.
> In any case, I have adjusted the function as described below to avoid
> even this hypothetical issue.
>
> > It looks quite fragile to me.  I think you want to use the LHS only if it is
> > aggregate (and not a pointer or some random other value).  Likewise
> > you should only use arg0 for a whitelist of builtins, not for any random one.
>
> I've added an argument to the function to make the distinction
> between a call result and argument explicit but I haven't been able
> to create a test case to exercise it.  For all the built-ins I've
> tried in an assignment like:
>
>    extern char a[4];
>    *(double*)a = nan ("foo");
>
> the call result ends up assigned to a temporary:
>
>    _1 = __builtin_nan (s_2(D));
>    MEM[(double *)&a] = _1;
>
> I can only get a call and assignment in one for user-defined functions
> that return an aggregate.

Yes, call LHS will be SSA names if the result is of register type.

> >
> > It's bad enough that compute_objsize decides for itself whether it is
> > passed a pointer or an object rather than the API being explicit about this.
> >
> >     if (VAR_P (exp) || TREE_CODE (exp) == CONST_DECL)
> >       {
> > -      exp = ctor_for_folding (exp);
> > -      if (!exp)
> > -       return false;
> > +      /* If EXP can be folded into a constant use the result.  Otherwise
> > +        proceed to use EXP to determine a range of the result.  */
> > +      if (tree fold_exp = ctor_for_folding (exp))
>           ^^^^^^^^^^^^^^^^^
> > +       if (fold_exp != error_mark_node)
> > +         exp = fold_exp;
> >
> > fold_exp can be NULL, meaning a zero-initializer but below you'll run into
>
> fold_exp is assigned to exp if it's neither null (as I underlined
> above) nor error_mark_node so I think it's correct as is.

Oops.  Somehow missed that - the

  if (..)
    if (..)

style for an && is a "bad" C++ requirement as "nice" as the
new if (tree x = ...) syntax is.

> >
> >    const char *prep = NULL;
> >    if (TREE_CODE (exp) == STRING_CST)
> >      {
> >
> > and crash.  Either you handle a NULL fold_expr explicitely or conservatively
> > continue to return false.
> >
> > +  /* The LHS and RHS of the store.  The RHS is null if STMT is a function
> > +     call.  RHSTYPE is the type of the store.  */
> > +  tree lhs, rhs, rhstype;
> > +  if (is_gimple_assign (stmt))
> > +    {
> > +      lhs = gimple_assign_lhs (stmt);
> > +      rhs = gimple_assign_rhs1 (stmt);
> > +      rhstype = TREE_TYPE (rhs);
> > +    }
> > +  else if (is_gimple_call (stmt))
> > +    {
> > +      lhs = gimple_call_lhs (stmt);
> > +      rhs = NULL_TREE;
> > +      rhstype = TREE_TYPE (gimple_call_fntype (stmt));
> > +    }
> >
> > The type of the store in a call is better determined from the LHS.
> > For internal function calls the above will crash.
> >
> > Otherwise looks like reasonable changes.
>
> Please see the attached revision.

LGTM.

Thanks,
Richard.

> Martin


More information about the Gcc-patches mailing list