This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH] Check the STRING_CSTs in varasm.c


Richard Biener wrote:

> Note that I'm a little bit confused here given build_string
> already appends a '\0' after TREE_STRING_LENGTH.  So it is safe
> to call strlen() on all STRING_CSTs.  I think I raised this in
> the review of one of Bernds patches but to be honest all the
> various threads have become a bit convoluted (esp. if joining
> in after two weeks of vacation).
>
> So -- why is what we do currently not enough?
>
> Quoting our single STRING_CST creator:
>
> tree
> build_string (int len, const char *str)
> {
>   tree s;
>   size_t length;
>
>  /* Do not waste bytes provided by padding of struct tree_string.  */
>  length = len + offsetof (struct tree_string, str) + 1;
>
>  record_node_allocation_statistics (STRING_CST, length);
>
>  s = (tree) ggc_internal_alloc (length);
>
>  memset (s, 0, sizeof (struct tree_typed));
>  TREE_SET_CODE (s, STRING_CST);
>  TREE_CONSTANT (s) = 1;
>  TREE_STRING_LENGTH (s) = len;
>  memcpy (s->string.str, str, len);
>  s->string.str[len] = '\0';
>
>  return s;
>}

Thanks for this question.

I have two issues with that.
1. it adds not a wide character nul, only a single byte '\0'.
2. the '\0' here is _guaranteed_ not to be assembled
    by varasm.
   Since it is at TREE_STRING_LENGTH+1.

That is fine for some string constants, like ASM constaraints.
it makes most of the time sense, as long as it is not
used for folding of memxxx functions.

Of course the whole patch series is dependent on the
consensus about how we want to handle string constants
in the middle-end, so no problem with going back to the
drawing board, that's the reason why I did not apply the
already approved bits, and I guess we are not in a hurry.

What I see, is that string constants are most of the time
handled like the C language strings, that is there
are different character wides, and the array_type on the
string constants, includes another NUL (wide) char which
is assembled along with the rest of the bytes, but it is
not written in the language string constant.
The front end appends this before giving the string
constant to build_string.
That means at least most of the time.

So I thought it might be good to have some more
easily checkable things regarding the zero termiation,
or what is allowed for excess precision.

It's possible that this shoots over the target, but I think
this hardening in the varasm is of some value.

This way how this patch works has one advantage:
That it is easy to check for "there must be one wide char zero",
if it is someting that can't be checked, like:
"there may be a nul or not", then it is impossible to be checked.

Well yeah, it's an idea.



Bernd.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]