[PATCH] handle vector and aggregate stores in -Wstringop-overflow [PR 97027]
Richard Biener
richard.guenther@gmail.com
Thu Jul 15 07:31:52 GMT 2021
On Wed, Jul 14, 2021 at 8:46 PM Martin Sebor <msebor@gmail.com> wrote:
>
> On 7/14/21 1:01 AM, Richard Biener wrote:
> > On Tue, Jul 13, 2021 at 9:27 PM Martin Sebor via Gcc-patches
> > <gcc-patches@gcc.gnu.org> wrote:
> >>
> >> An existing, previously xfailed test that I recently removed
> >> the xfail from made me realize that -Wstringop-overflow doesn't
> >> properly detect buffer overflow resulting from vectorized stores.
> >> Because of a difference in the IL the test passes on x86_64 but
> >> fails on targets like aarch64. Other examples can be constructed
> >> that -Wstringop-overflow fails to diagnose even on x86_64. For
> >> INSTANCE, the overflow in the following function isn't diagnosed
> >> when the loop is vectorized:
> >>
> >> void* f (void)
> >> {
> >> char *p = __builtin_malloc (8);
> >> for (int i = 0; i != 16; ++i)
> >> p[i] = 1 << i;
> >> return p;
> >> }
> >>
> >> The attached change enhances the warning to detect those as well.
> >> It found a few bugs in vectorizer tests that the patch corrects.
> >> Tested on x86_64-linux and with an aarch64 cross.
> >
> > - dest = gimple_call_arg (stmt, 0);
> > + if (gimple_call_builtin_p (stmt, BUILT_IN_NORMAL)
> > + && gimple_call_num_args (stmt))
> > + dest = gimple_call_arg (stmt, 0);
> > + else
> > + dest = gimple_call_lhs (stmt);
> > +
> > + if (!dest)
> > + return;
> >
> > so this uses arg0 for memcpy (dst, src, 4) and also for bcopy (src, dst, 4)?
>
> No. The code is only called for assignments like *p = f () and for
> a handful of built-ins (memcpy, strcpy, and memset).
I see - that wasn't obvious when looking at the patch.
> bcopy() returns void and so its result cannot be assigned. I believe
> bcopy() and the other legacy bxxx() functions are also lowered into
> memcpy/memmove etc. so we should see no calls to it in the middle end.
> In any case, I have adjusted the function as described below to avoid
> even this hypothetical issue.
>
> > It looks quite fragile to me. I think you want to use the LHS only if it is
> > aggregate (and not a pointer or some random other value). Likewise
> > you should only use arg0 for a whitelist of builtins, not for any random one.
>
> I've added an argument to the function to make the distinction
> between a call result and argument explicit but I haven't been able
> to create a test case to exercise it. For all the built-ins I've
> tried in an assignment like:
>
> extern char a[4];
> *(double*)a = nan ("foo");
>
> the call result ends up assigned to a temporary:
>
> _1 = __builtin_nan (s_2(D));
> MEM[(double *)&a] = _1;
>
> I can only get a call and assignment in one for user-defined functions
> that return an aggregate.
Yes, call LHS will be SSA names if the result is of register type.
> >
> > It's bad enough that compute_objsize decides for itself whether it is
> > passed a pointer or an object rather than the API being explicit about this.
> >
> > if (VAR_P (exp) || TREE_CODE (exp) == CONST_DECL)
> > {
> > - exp = ctor_for_folding (exp);
> > - if (!exp)
> > - return false;
> > + /* If EXP can be folded into a constant use the result. Otherwise
> > + proceed to use EXP to determine a range of the result. */
> > + if (tree fold_exp = ctor_for_folding (exp))
> ^^^^^^^^^^^^^^^^^
> > + if (fold_exp != error_mark_node)
> > + exp = fold_exp;
> >
> > fold_exp can be NULL, meaning a zero-initializer but below you'll run into
>
> fold_exp is assigned to exp if it's neither null (as I underlined
> above) nor error_mark_node so I think it's correct as is.
Oops. Somehow missed that - the
if (..)
if (..)
style for an && is a "bad" C++ requirement as "nice" as the
new if (tree x = ...) syntax is.
> >
> > const char *prep = NULL;
> > if (TREE_CODE (exp) == STRING_CST)
> > {
> >
> > and crash. Either you handle a NULL fold_expr explicitely or conservatively
> > continue to return false.
> >
> > + /* The LHS and RHS of the store. The RHS is null if STMT is a function
> > + call. RHSTYPE is the type of the store. */
> > + tree lhs, rhs, rhstype;
> > + if (is_gimple_assign (stmt))
> > + {
> > + lhs = gimple_assign_lhs (stmt);
> > + rhs = gimple_assign_rhs1 (stmt);
> > + rhstype = TREE_TYPE (rhs);
> > + }
> > + else if (is_gimple_call (stmt))
> > + {
> > + lhs = gimple_call_lhs (stmt);
> > + rhs = NULL_TREE;
> > + rhstype = TREE_TYPE (gimple_call_fntype (stmt));
> > + }
> >
> > The type of the store in a call is better determined from the LHS.
> > For internal function calls the above will crash.
> >
> > Otherwise looks like reasonable changes.
>
> Please see the attached revision.
LGTM.
Thanks,
Richard.
> Martin
More information about the Gcc-patches
mailing list