This is the mail archive of the mailing list for the GCC project.

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH] fix a couple of bugs in const string folding (PR 86532)

On 07/19/2018 02:45 PM, Bernd Edlinger wrote:
@@ -11413,8 +11429,10 @@ string_constant (tree arg, tree *ptr_offset)
     const char a[4] = "abc\000\000";
     The excess elements contribute to TREE_STRING_LENGTH()
     but not to strlen().  */
-  unsigned HOST_WIDE_INT length
-    = strnlen (TREE_STRING_POINTER (init), TREE_STRING_LENGTH (init));
+  unsigned HOST_WIDE_INT charsize
+    = tree_to_uhwi (TYPE_SIZE_UNIT (TREE_TYPE (TREE_TYPE (init))));
+  unsigned HOST_WIDE_INT length = TREE_STRING_LENGTH (init);
+  length = string_length (TREE_STRING_POINTER (init), charsize, length);
  if (compare_tree_int (array_size, length + 1) < 0)
    return NULL_TREE;

But TREE_STRING_LENGTH is the length in bytes including NUL-termination.
then length is passed to string_length with expects it in units of charsize.
and returns number of characters.
then compare_tree_int compares array_size which is in units of bytes,
but not the size of the innermost enclosing array.

Whoops.  I forgot TREE_STRING_LENGTH(s) is really sizeof(s),
not the length or size in characters.  Not the most fortunate
name.  The update to this patch I sent corrects this mistake.
The subsequent patch I sent to implement the warning whose
absence you lamented for non-nul-terminated strings tightens
up the bound on the array size.

I really don't see why we need to support wide characters especially
when there is no reasonable test coverage, and no usable wstrlen
builtin first.

I'm not sure what special support are you talking about.
The string_constant function handles all kinds of strings
and the change above doesn't affect that.  The purpose of
the check above is to conservatively fail for constant
arrays with excess initializers such as in the comment:

    const char a[4] = "abc\000\000";

Defining such strings is undefined, as is using out-of-bounds
offsets into such things, so a permissive check isn't inherently
wrong, nor would it be incorrect to remove it and accept and
fold them.  But neither is essential to the fix for this bug.

If you are you referring to the handling of non-const offsets
here, those are already handled in c_strlen() for a subset of
cases (plain ARRAY_REFs).  This change just makes it consistent
for all references, including those to multi-dimensinal arrays
and other aggregates.  (I noticed it while some of my tests
for the fix were failing.)


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]