This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH] [Ada] Make middle-end string literals NUL terminated


On 08/03/18 18:50, Olivier Hainque wrote:
> Hi Bernd,
> 
>> On 31 Jul 2018, at 14:07, Bernd Edlinger <bernd.edlinger@hotmail.de> wrote:
>> 
>> Hi!
>> 
>> 
>> This fixes a couple STRING_CST that are not explicitly NUL terminated.
>> These were caught in a new check in varasm.c I am currently working on.
>> 
>> Having a NUL terminated string does not change the binary output, but it
>> makes things easier for he middle-end.
>> 
>> 
>> Bootstrapped and reg-tested on x86_64-pc-linux-gnu.
>> Is it OK for trunk?
> 
> --- gcc/ada/gcc-interface/utils2.c      2017-12-21 07:57:41.000000000 +0100
> +++ gcc/ada/gcc-interface/utils2.c      2018-07-31 11:44:01.517117923 +0200
> @@ -1844,7 +1844,7 @@ expand_sloc (Node_Id gnat_node, tree *fi
>       }
> 
>     const int len = strlen (str);
> -  *filename = build_string (len, str);
> +  *filename = build_string (len + 1, str);
>     TREE_TYPE (*filename) = build_array_type (char_type_node,
>                                               build_index_type (size_int (len)));
>     *line = build_int_cst (NULL_TREE, line_number);
> 
> This one looks good to me. I'm not officially listed as a maintainer
> so I'll let Eric have the final word. I'm answering because ...
> 
> 
> --- gcc/ada/gcc-interface/trans.c       2018-07-17 10:10:04.000000000 +0200
> +++ gcc/ada/gcc-interface/trans.c       2018-07-31 11:16:27.350728886 +0200
> @@ -6079,7 +6079,7 @@ gnat_to_gnu (Node_Id gnat_node)
>                where GCC will want to treat it as a C string.  */
>             string[i] = 0;
> 
> -         gnu_result = build_string (length, string);
> +         gnu_result = build_string (length + 1, string);
> 
>             /* Strings in GCC don't normally have types, but we want
>                this to not be converted to the array type.  */
> 
> We have a local variant of this one, on which I worked after we realized
> that it was not enough to achieve the intended effect.
> 
> The difference at this spot is simply that we prevent the +1 if the
> original string happens to be explicitly nul terminated already. Like:
> 
>               build_string
>                 ((length > 0 && string[length-1] == 0) ? length : length + 1,
>                  string);
> 
> This is however not enough because the extra zero is only conveyed
> through the string value, not the corresponding type, and
> 
>    varasm.c: mergeable_string_section
> 
> currently uses the type bounds to search for a zero termination.
> 
> We can't really change the type, so came up with an adjustment to
> varasm. The motivation for using the type bounds was
> 
> https://gcc.gnu.org/ml/gcc-patches/2006-06/msg01004.html
> 
> which, IIUC, was tightly linked to string constants used as
> initializers for objects which have a size of their own.
> (Jakub cc'ed)
> 
> For string constants not used as initializers, it seems that
> we might be able to use the string bounds instead, possibly
> wider. The attached patch does that and passed testing a few weeks
> ago.
> 
> So, while we're at it, does that look right, in light of all the
> string length related issues that have been discussed lately ?
> 
> I'm not sure I grasped all the ramifications.
> 
> Thanks in advance!
> 
> 
>          * varasm.c (mergeable_string_section): Accept an extra SCAT
>          argument conveying the section category from which the request
>          originates.  Only restrict the search to the string type bounds
>          if we're queried for an initializer.  Consider the string bounds
>          otherwise.
>          (default_elf_select_section, STR and STR_INIT): Pass the section
>          category to mergeable_string_section.
> 
> 
> 
> 
> 

Oh, how interesting.

My patch only intends to add a dummy byte that is stripped again
in the assembly.  The purpose is to make it trivial to find out if
a string constant is NUL-terminated in  the middle-end by comparing
TYPE_SIZE_UNIT (TREE_TYPE (decl)) with TREE_STRING_LENGTH (decl).
TYPE_SIZE_UNIT >= TREE_STRING_LENGTH means zero terminated
TYPE_SIZE_UNIT < TREE_STRING_LENGTH means not necessarily zero terminated,
Such strings shall never chop off more than one nul character.
Ada strings are generally not zero terminated, right?

So in your search loop you would not have to look after
type_len, because that byte would be guaranteed to be zero.

That is if we agree that we want to restrict the string constants
in that way as I proposed.

In the case str_len > type_len I do not understand if that is right.

Because varasm will only output type_size bytes,
this seems only to select a string section
but the last nul byte will not be assembled.
Something that is not contained in the type_size
should not be assembled.

What exactly do you want to accomplish?


Thanks
Bernd.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]