This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH] detect missing nuls in address of const char (PR 87756)


Jeff, I'd like to go ahead and commit the patch as is.  I believe
the use of the default argument is appropriate and in line with
GCC practice.  Please let me know if you have strong objections.
If I don't hear any I will proceed later this week

Thanks
Martin

On 10/30/2018 10:38 AM, Martin Sebor wrote:
On 10/30/2018 09:54 AM, Jeff Law wrote:
On 10/30/18 9:44 AM, Martin Sebor wrote:
On 10/30/2018 09:27 AM, Jeff Law wrote:
On 10/29/18 5:51 PM, Martin Sebor wrote:
The missing nul detection fails when the argument of the %s or
similar sprintf directive is the address of a non-nul character
constant such as in:

  const char c = 'a';
  int f (void)
  {
    return snprintf (0, 0, "%s", &c);
  }

This is because the string_constant function only succeeds for
arguments that refer to STRRING_CSTs, not to individual characters.

For the same reason, calls to memchr() such as the one below aren't
folded into constants:

  const char d = '\0';
  void* g (void)
  {
    return memchr (&d, 0, 1);
  }

To detect and diagnose the missing nul in the first example and
to fold the second, the attached patch modifies string_constant
to return a synthesized STRING_CST object for such references
(while also indicating whether such an object is properly
nul-terminated).

Tested on x86_64-linux.

Martin

gcc-87756.diff

PR tree-optimization/87756 - missing unterminated argument warning
using address of a constant character

gcc/ChangeLog:

    PR tree-optimization/87756
    * expr.c (string_constant): Synthesize a string literal from
    the address of a constant character.
    * tree.c (build_string_literal): Add an argument.
    * tree.h (build_string_literal): Same.

gcc/testsuite/ChangeLog:

    PR tree-optimization/87756
    * gcc.dg/builtin-memchr-2.c: New test.
    * gcc.dg/builtin-memchr-3.c: Same.
    * gcc.dg/warn-sprintf-no-nul-2.c: Same.

Index: gcc/expr.c
===================================================================
--- gcc/expr.c    (revision 265496)
+++ gcc/expr.c    (working copy)
@@ -11484,18 +11484,40 @@ string_constant (tree arg, tree
*ptr_offset, tree
     offset = off;
     }

-  if (!init || TREE_CODE (init) != STRING_CST)
+  if (!init)
     return NULL_TREE;

+  *ptr_offset = offset;
+
+  tree eltype = TREE_TYPE (init);
+  tree initsize = TYPE_SIZE_UNIT (eltype);
   if (mem_size)
-    *mem_size = TYPE_SIZE_UNIT (TREE_TYPE (init));
+    *mem_size = initsize;
+
   if (decl)
     *decl = array;

-  gcc_checking_assert (tree_to_shwi (TYPE_SIZE_UNIT (TREE_TYPE
(init)))
-               >= TREE_STRING_LENGTH (init));
+  if (TREE_CODE (init) == INTEGER_CST)
+    {
+      /* For a reference to (address of) a single constant character,
+     store the native representation of the character in
CHARBUF.   */
+      unsigned char charbuf[MAX_BITSIZE_MODE_ANY_MODE /
BITS_PER_UNIT];
+      int len = native_encode_expr (init, charbuf, sizeof charbuf,
0);
+      if (len > 0)
+    {
+      /* Construct a string literal with elements of ELTYPE and
+         the representation above.  Then strip
+         the ADDR_EXPR (ARRAY_REF (...)) around the STRING_CST.  */
+      init = build_string_literal (len, (char *)charbuf, eltype);
+      init = TREE_OPERAND (TREE_OPERAND (init, 0), 0);
+    }
+    }

-  *ptr_offset = offset;
+  if (TREE_CODE (init) != STRING_CST)
+    return NULL_TREE;
+
+  gcc_checking_assert (tree_to_shwi (initsize) >= TREE_STRING_LENGTH
(init));
+
   return init;
 }

Index: gcc/tree.c
===================================================================
--- gcc/tree.c    (revision 265496)
+++ gcc/tree.c    (working copy)

Index: gcc/tree.h
===================================================================
--- gcc/tree.h    (revision 265496)
+++ gcc/tree.h    (working copy)
@@ -4194,7 +4194,7 @@ extern tree
build_call_expr_internal_loc_array (lo
 extern tree maybe_build_call_expr_loc (location_t, combined_fn, tree,
                        int, ...);
 extern tree build_alloca_call_expr (tree, unsigned int,
HOST_WIDE_INT);
-extern tree build_string_literal (int, const char *);
+extern tree build_string_literal (int, const char *, tree =
char_type_node);

 /* Construct various nodes representing data types.  */
There's only about a dozen calls to build_string_literal.  Instead of
defaulting the argument, just fix them.    OK with that change.  Make
sure to catch those in config/{rs6000,i386}/ and cp/

Why?  Default arguments (and overloading) exist in C++ to deal
with just this case: to avoid having to provide the common
argument value while letting callers provide a different value
when they need to.  What purpose will it serve to make these
unnecessary changes and to force new callers to provide
the default argument value?  It will only make using
the function more error-prone and its callers harder
to read.I find them much harder to read especially once you get more
than one.
In cases where we have a small number of call sites we should just fix
them.  I think that we're well in that range here.  If there's a large
number of call sites, then overloading via default args makes plenty of
sense.

Sorry, but I still don't see why or agree with the rule.  There
is just one call site out of the 14 that provides a value other
than the default.  There are APIs in GCC with default arguments
with fewer calls than that.

Why do you think we should change the other 14 callers only to have
the function do for them what it does now, or that it's a good rule
to follow in general?  And where is the threshold between a small
and large number of call sites?  What if I'm adding a new API with
just a handful of callers?

Martin


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]