This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH] detect nonstring arguments to string functions (PR 82945)
On 11/13/2017 06:21 PM, Martin Sebor wrote:
> Attached is an improved version that rather than hardcoding
> the built-in functions to check uses the constness of each
> built-in's char* argument as a trigger to do the checking.
>
> This avoids the problem of the list being incomplete either
> by omission or due to getting out of sync, and also makes it
> trivial to extend the same approach to user-defined functions
> in the future.
>
> I've excluded strndup and strnlen (but no other "bounded"
> string functions like strncmp) from the checking.
>
> Martin
>
> On 11/12/2017 05:52 PM, Martin Sebor wrote:
>> The recently introduced -Wstringop-truncation warning relies
>> on the new nonstring attribute to allow the historical use case
>> of calling strncpy to completely fill the destination with a copy
>> of a string without adding a terminating nul. Glibc is currently
>> considering making use of the attribute to decorate some of its
>> data members. To help find misuses of such data members in
>> arguments to string functions like strlen or strdup, the attached
>> patch adds checking for this new attribute in these contexts.
>> The checking is intentionally done late so that uses such arrays
>> that can be proven to be safe (and thus folded) are not diagnosed.
>>
>> While testing this simple enhancement I noticed that the handling
>> I added for the nul termination in cases like
>>
>> strncpy (array, s, sizeof array);
>> array[sizeof array - 1] = 0;
>>
>> to avoid the new warning wasn't quite as robust as it could and
>> arguably should be so I improved it a bit to silently accept
>> more forms of the idiom. For instance, this is now correctly
>> handled (and not diagnosed):
>>
>> *stpcpy (array, s, sizeof array - 1) = 0;
>>
>> Martin
>>
>> PS A useful future enhancement would be to detect the one byte
>> overflow in:
>>
>> *stpcpy (array, s, sizeof array) = 0;
>
>
> gcc-82945.diff
>
>
> PR tree-optimization/82945 - add warning for passing non-strings to functions that expect string arguments
>
> gcc/ChangeLog:
>
> PR tree-optimization/82945
> * calls.c (get_attr_nonstring_decl, maybe_warn_nonstring_arg): New
> functions.
> (initialize_argument_information): Call maybe_warn_nonstring_arg.
> * calls.h (get_attr_nonstring_decl): Declare new function.
> * doc/extend.texi (attribute nonstring): Update.
> * gimple-fold.c (gimple_fold_builtin_strncpy): Call
> get_attr_nonstring_decl and handle it.
> * tree-ssa-strlen.c (maybe_diag_stxncpy_trunc): Same. Improve
> detection of nul-termination.
>
> gcc/testsuite/ChangeLog:
>
> PR tree-optimization/82945
> * c-c++-common/Wstringop-truncation-2.c: New test.
> * c-c++-common/Wstringop-truncation.c: Adjust.
> * c-c++-common/attr-nonstring-2.c: Adjust.
> * c-c++-common/attr-nonstring-3.c: New test.
>
> diff --git a/gcc/calls.c b/gcc/calls.c
> index 3730f43..197d907 100644
> --- a/gcc/calls.c
> +++ b/gcc/calls.c
> @@ -1494,6 +1494,108 @@ maybe_warn_alloc_args_overflow (tree fn, tree exp, tree args[2], int idx[2])
> }
> }
>
> +/* If EXPR refers to a character array or pointer declared attribute
> + nonstring return a decl for that array or pointer and set *REF to
> + the referenced enclosing object or pointer. Otherwise returns
> + null. */
Generally we'd write NULL rather than null for a generic pointer and in
this specific case NULL_TREE is even better.
> +
> +tree
> +get_attr_nonstring_decl (tree expr, tree *ref)
> +{
> + tree dcl = expr;
I think Jakub may have pointed this out, but we generally prefer decl.
Truly a nit though.
> + if (TREE_CODE (dcl) == SSA_NAME)
> + {
> + if (SSA_NAME_IS_DEFAULT_DEF (dcl))
> + dcl = SSA_NAME_VAR (dcl);
> + else
> + {
> + gimple *def = SSA_NAME_DEF_STMT (dcl);
> +
> + if (is_gimple_assign (def))
> + {
> + tree_code code = gimple_assign_rhs_code (def);
> + if (code == ADDR_EXPR
> + || code == COMPONENT_REF
> + || code == VAR_DECL)
> + dcl = gimple_assign_rhs1 (def);
Note that COMPONENT_REFs can have arguments that are themselves
COMPONENT_REFS. So you typically want to use a loop to strip all away.
> + /* It's safe to call strndup and strnlen with a non-string argument
> + since the functions provide an explicit bound for this purpose. */
> + if (DECL_FUNCTION_CODE (fndecl) == BUILT_IN_STRNDUP)
> + return;
So I see you handle BUILT_IN_STRNDUP here, but not BUILT_IN_STRNLEN. Is
there a reason for the omission?
Jeff