[WIP PATCH] add object access attributes (PR 83859)

Jeff Law law@redhat.com
Sun Oct 27 17:37:00 GMT 2019


On 9/29/19 1:51 PM, Martin Sebor wrote:
> -Wstringop-overflow detects a subset of past-the-end read and write
> accesses by built-in functions such as memcpy and strcpy.  It relies
> on the functions' effects the knowledge of which is hardwired into
> GCC.  Although it's possible for users to create wrappers for their
> own functions to detect similar problems, it's quite cumbersome and
> so only lightly used outside system libraries like Glibc.  Even Glibc
> only checks for buffer overflow and not for reading past the end.
> 
> PR 83859 asks to expose the same checking that GCC does natively for
> built-in calls via a function attribute that associates a pointer
> argument with the size argument, such as:
> 
>   __attribute__((buffer_size (1, 2))) void
>   f (char* dst, size_t dstsize);
> 
> The attached patch is my initial stab at providing this feature by
> introducing three new attributes:
> 
>   * read_only (ptr-argno, size-argno)
>   * read_only (ptr-argno, size-argno)
>   * read_write (ptr-argno, size-argno)
> 
> As requested, the attributes associate a pointer parameter to
> a function with a size parameter.  In addition, they also specify
> how the function accesses the object the pointer points to: either
> it only reads from it, or it only writes to it, or it does both.
> 
> Besides enabling the same buffer overflow detection as for built-in
> string functions they also let GCC issue -Wuninitialized warnings
> for uninitialized objects passed to read-only functions by reference,
> and -Wunused-but-set warnings for objects passed to write-only
> functions that are otherwise unused (PR 80806).  The -Wununitialized
> part is done. The -Wunused-but-set detection is implemented only in
> the C FE and not yet in C++.
> 
> Besides the diagnostic improvements above the attributes also open
> up optimization opportunities such as DCE.  I'm still working on this
> and so it's not yet part of the initial patch.
> 
> I plan to finish the patch for GCC 10 but I don't expect to have
> the time to start taking advantage of the attributes for optimization
> until GCC 11.
> 
> Besides regression testing on x86_64-linux, I also tested the patch
> by compiling Binutils/GDB, Glibc, and the Linux kernel with it.  It
> found no new problems but caused a handful of -Wunused-but-set-variable
> false positives due to an outstanding bug in the C front-end introduced
> by the patch that I still need to fix.
> 
> Martin
> 
> gcc-80806.diff
> 
> PR c/80806 - gcc does not warn if local array is memset only
> PR middle-end/83859 - attribute to associate buffer and its size
> 
> gcc/ChangeLog:
> 
> 	PR c/80806
> 	PR middle-end/83859
> 	* builtin-attrs.def (ATTR_NO_SIDE_EFFECT): New.
> 	(ATTR_READ_ONLY, ATTR_READ_WRITE, ATTR_WRITE_ONLY): New.
> 	(ATTR_NOTHROW_WRONLY1_LEAF, ATTR_NOTHROW_WRONLY1_2_LEAF): New.
> 	(ATTR_NOTHROW_WRONLY1_3_LEAF, ATTR_NOTHROW_WRONLY2_3_LEAF): New.
> 	(ATTR_RET1_NOTHROW_WRONLY1_LEAF, ATTR_RET1_NOTHROW_WRONLY1_3_LEAF): New.
> 	(ATTR_RET1_NOTHROW_NONNULL_RDONLY2_LEAF): New.
> 	(ATTR_RET1_NOTHROW_NONNULL_RDWR1_RDONLY2_LEAF): New.
> 	(ATTR_RET1_NOTHROW_WRONLY1_3_RDONLY2_3_LEAF): New.
> 	(ATTR_RET1_NOTHROW_WRONLY1_RDONLY2_LEAF): New.
> 	(ATTR_RET1_NOTHROW_WRONLY1_3_RDONLY2_LEAF): New.
> 	(ATTR_MALLOC_NOTHROW_NONNULL_RDONLY1_LEAF): New.
> 	(ATTR_PURE_NOTHROW_NONNULL_RDONLY1_LEAF): New.
> 	(ATTR_PURE_NOTHROW_NONNULL_RDONLY1_RDONLY2_LEAF): New.
> 	(ATTR_RETNONNULL_RDONLY2_3_NOTHROW_LEAF): New.
> 	(ATTR_RETNONNULL_WRONLY1_3_RDONLY2_3_NOTHROW_LEAF): New.
> 	(ATTR_PURE_NOTHROW_NONNULL_RDONLY1_3_LEAF): New.
> 	(ATTR_PURE_NOTHROW_NONNULL_RDONLY1_2_3_LEAF): New.
> 	(ATTR_RETNONNULL_RDONLY2_NOTHROW_LEAF): New.
> 	(ATTR_RETNONNULL_WRONLY1_RDONLY2_NOTHROW_LEAF): New.
> 	(ATTR_RETNONNULL_WRONLY1_3_RDONLY2_NOTHROW_LEAF): New.
> 	* builtins.c (check_access): Make extern.  Consistently set
> 	the no-warning bit after issuing a warning.
> 	* builtins.h (check_access): Declare.
> 	* builtins.def (bcopy, bzero, index, memchr, memcmp, memcpy): Add
> 	read_only and write_only attributes.
> 	(memset, rindex, stpcpy, stpncpy, strcasecmp, strcat): Same.
> 	(strchr, strcmp, strcpy, strcspn, strdup, strndup, strlen): Same.
> 	(strncasecmp, strncat, strncmp, strncpy, strrchr, strspn, strstr): Same.
> 	(free, __memcpy_chk, __memmove_chk, __memset_chk): Same.
> 	(__strcpy_chk, __strncpy_chk): Same.
> 	* calls.c (rdwr_access_hash): New type.
> 	(rdwr_map): Same.
> 	(init_attr_rdwr_indices): New function.
> 	(maybe_warn_rdwr_sizes): Same.
> 	(initialize_argument_information): Call init_attr_rdwr_indices.
> 	Call maybe_warn_rdwr_sizes.
> 	* doc/extend.texi (no_side_effect): Document new attribute.
> 	(read_only, write_only, read_write): Same.
> 	* tree-ssa-uninit.c (maybe_warn_uninit_accesss): New functions.
> 	(warn_uninitialized_vars): Rename argument.  Factor out code into
> 	maybe_warn_uninit_accesss.  Call it.
> 
> gcc/c/ChangeLog:
> 
> 	PR c/80806
> 	PR middle-end/83859
> 	* c-parser.c (c_parser::no_set_read, no_init, in_arg): New members.
> 	(c_parser_expr_list): Add argument.
> 	(c_parser_attribute): Pass a new argument to get_nonnull_operand.
> 	(c_parser_initializer): Set parser->in_init.
> 	(c_parser_binary_expression): Use parser->no_set_read.
> 	(c_parser_unary_expression): Same.
> 	(c_parser_sizeof_expression): Use parser->in_arg.
> 	(c_parser_postfix_expression_after_primary): Adjust.
> 	(is_write_only_p): New function.
> 	(c_parser_expr_list): Add argument.
> 	Avoid setting DECL_READ_P for decls passed to write-only function
> 	arguments.
> 	(c_parser_objc_keywordexpr): Pass a new argument to c_parser_expr_list.
> 	(c_parser_oacc_wait_list): Same.
> 	* c-tree.h (parser_build_binary_op): Add argument.
> 	* c-typeck.c (default_conversion): Same.  Use it.
> 	(parser_build_binary_op): Same.
> 	(build_binary_op): Same.
> 
> gcc/c-family/ChangeLog:
> 
> 	PR c/80806
> 	PR middle-end/83859
> 	* c-attribs.c (handle_no_side_effect_attribute): New function.
> 	(handle_read_only_attribute, handle_write_only_attribute): Same.
> 	(handle_read_write_attribute): Same.
> 	(c_common_attribute_table): Add new attributes.
> 	(get_argument_type): New function.
> 	(handle_rdwr_attributes): Same.
> 	(has_attribute): Add argument to callback signature.  Pass to it
> 	a new value.
> 	(get_nonnull_operand): Rename...
> 	(get_attribute_operand): ...to this.
> 	* c-common.c (get_nonnull_operand): Rename...
> 	(get_attribute_operand): ...to this.
> 	(build_binary_op): Add a new argument.
> 	(default_conversion): Same.
> 	(has_attribute): Adjust argument type.
> 
> gcc/cp/ChangeLog:
> 
> 	PR c/80806
> 	PR middle-end/83859
> 	* typeck.c (cp_default_conversion): Add argument.
> 	(build_binary_op): Same.
> 
> gcc/testsuite/ChangeLog:
> 
> 	PR c/80806
> 	PR middle-end/83859
> 	* c-c++-common/Wsizeof-pointer-memaccess1.c: Adjust.
> 	* c-c++-common/Wsizeof-pointer-memaccess2.c: Adjust.
> 	* gcc.dg/Wstrict-aliasing-bogus-vla-1.c: Adjust.
> 	* gcc/testsuite/gcc.dg/Wunused-but-set-var.c: New test.
> 	* gcc.dg/attr-alloc_size.c: Adjust.
> 	* gcc/testsuite/gcc.dg/attr-read-only-2.c: New test.
> 	* gcc/testsuite/gcc.dg/attr-read-only.c: New test.
> 	* gcc/testsuite/gcc.dg/attr-write-only-2.c: New test.
> 	* gcc/testsuite/gcc.dg/attr-write-only.c: New test.
> 	* gcc.dg/nonnull-3.c: Adjust.
> 	* gcc.dg/pr40340-2.c: Adjust.
> 	* gcc.dg/pr78768.c: Adjust.
> 	* gcc.dg/pr79715.c: Adjust.
> 	* gcc.dg/tree-ssa/builtin-snprintf-7.c: Adjust.
> 	* gcc/testsuite/gcc.dg/uninit-builtin.c: New test.
You mention that this is almost done, but not yet finished.  If I were
working on this I could have split this up into adding the support for
the attributes first without trying to use them anywhere first.  Then a
separate patch that adds the attribute to the builtins, then a patch
that exploited the new attributes to do something useful.

Please try to avoid creating large patches that are so easily broken up.
 It just makes review harder than it needs to be and slows down getting
the patches integrated.

Your ChangeLog references "no_side_effect" in a few places.  But I don't
see any of those changes actually in the patch, except for adding it in
builtin-attrs.def.  I'm guessing those are supposed to all be part of a
follow-up patch?

FWIW, I believe there may be a DSE case that your patches allow us to
handle.  Essentially we had a case where DSE didn't fire because we
didn't have detailed information about how the operands to one of the
mem* or str* calls was used.  I don't expect you to own this missing opt
as part of this submission.


> diff --git a/gcc/attribs.h b/gcc/attribs.h
> index 23a7321e04a..0e2320701b1 100644
> --- a/gcc/attribs.h
> +++ b/gcc/attribs.h
> @@ -218,4 +218,36 @@ lookup_attribute_by_prefix (const char *attr_name, tree list)
>      }
>  }
>  
> +/* Description of a function argument declared attribute read_only,
> +   read_write, or write_only.  Used as an "iterator" over all such
> +   arguments in a function declaration or call.  */
> +
> +struct attr_access
> +{
> +  /* Attribute chain for the given function declaration.  */
> +  tree attrs;
> +
> +  /* The attribute pointer argument.  */
> +  tree ptr;
> +  /* The size of the pointed-to object or NULL when not specified.  */
> +  tree size;
> +
> +  /* The tree node corresponding to the current argument in the chain
> +     of formal function arguments in a call to the given function.
> +     Used by attributes that specify that relevant arguments are of
> +     the given kind, as in
> +       strcmp (const char*, const char*) __attribute__ ((read_only))  */
> +  tree argchain;
> +
> +  /* The zero-based number of each of the formal function arguments.  */
> +  unsigned ptrarg;
> +  unsigned sizarg;
> +  enum Kind { read_only, write_only, read_write };
Nit: Mixed case "Kind" use either KIND or kind.





> diff --git a/gcc/c/c-typeck.c b/gcc/c/c-typeck.c
> index d4e12eb93d1..2e439c43b91 100644
> --- a/gcc/c/c-typeck.c
> +++ b/gcc/c/c-typeck.c
> @@ -2146,14 +2146,30 @@ perform_integral_promotions (tree exp)
>     In addition, manifest constants symbols are replaced by their values.  */
>  
>  tree
> -default_conversion (tree exp)
> +default_conversion (tree exp, bool read_p /* = true */)
Please update the function comment to describe the new argument.  I
think you need to do that for c_parser_expr_list and
parser_build_binary_op and build_binary_op.


> diff --git a/gcc/gimple.h b/gcc/gimple.h
> index cf1f8da5ae2..4d850a088cf 100644
> --- a/gcc/gimple.h
> +++ b/gcc/gimple.h
> @@ -24,6 +24,8 @@ along with GCC; see the file COPYING3.  If not see
>  
>  #include "tree-ssa-alias.h"
>  #include "gimple-expr.h"
> +#include "function.h"
> +#include "basic-block.h"
How important is this (ie, how painful is it to have the .c/.cc files
include function.h/basic-block.h?  We generally frown on having one
header include others like this.



> diff --git a/gcc/tree-ssa-uninit.c b/gcc/tree-ssa-uninit.c
> index fe8f8f0bc28..b9d455159cb 100644
> --- a/gcc/tree-ssa-uninit.c
> +++ b/gcc/tree-ssa-uninit.c
Something else to consider (as a separate follow-up).

IIUC your code adds checking arguments at call sites which is a nice
improvement.  There may be BZs related to this issue.

Another concept we might want to consider based on what I've seen pop up
fairly often in code is the concept of "must write".  In the caller we'd
consider passing a must-write object to a function call as
initialization which would cut down on false positives.  We could verify
the behavior in the callee.

Also, I think Fortran has the concept of "Intent" which does largely the
same thing you're doing.  You might consider reaching out to the Fortran
front-end folks and see if they can encode their Intent information into
your attributes.  I believe there's BZs related to using Intent
information to avoid false positives from the uninit warning pass.

I don't see anything terribly concerning.  Looking forward to the final
iteration here.

jeff



More information about the Gcc-patches mailing list