This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH] Check the STRING_CSTs in varasm.c


On 08/17/2018 07:53 AM, Bernd Edlinger wrote:
> On 08/17/18 15:38, Richard Biener wrote:
>> On Fri, 17 Aug 2018, Bernd Edlinger wrote:
>>
>>> On 08/17/18 14:19, Richard Biener wrote:
>>>> On Fri, 17 Aug 2018, Bernd Edlinger wrote:
>>>>
>>>>> Richard Biener wrote:
>>>>>> +embedded @code{NUL} characters.  However, the
>>>>>> +@code{TREE_STRING_LENGTH} always includes a trailing @code{NUL} that
>>>>>> +is not part of the language string literal but appended by the front end.
>>>>>> +If the string shall not be @code{NUL}-terminated the @code{TREE_TYPE}
>>>>>> +is one character shorter than @code{TREE_STRING_LENGTH}.
>>>>>> +Excess caracters other than one trailing @code{NUL} character are not
>>>>
>>>> characters btw.
>>>>
>>>
>>> thanks, updated.
>>>
>>>> I read the above that the string literal for
>>>>
>>>> char x[2] = "1";
>>>>
>>>> is actually "1\0\0" - there's one NUL that is not part of the language
>>>> string literal.  The second sentence then suggests that both \0
>>>> are removed because 2 is less than 3?
>>>>
>>>
>>> maybe 2 is a bad example, lets consider:
>>> char x[2000000000] = "1";
>>>
>>> That is a string_cst with STRING_LENGTH = 2, content = "2\0\0"
>>> the array_type is used on both x, and the string_cst,
>>> I was assuming that both tree objects refer to the same type object.
>>> which is char[0..2000000000-1]
>>
>> Oh, didn't realize we use char[200000000] for the STRING_CST.  Makes
>> my suggestion to use char[] instead not (very) much worse than the
>> existing practice then.
>>
>>> varasm assembles the bytes that are given by STRING_LENGTH
>>> and appends zeros as appropriate.
>>>
>>>> As said, having this extra semantics of a STRING_CST tied to
>>>> another tree node (its TREE_TYPE) looks ugly.
>>>>
>>>>>> +permitted.
>>>>>>
>>>>>> I find this very confusing and oppose to that change.  Can we get
>>>>>> back to the drawing board please?  If we want an easy way to
>>>>>> see whether a string is "properly" terminated then maybe we can
>>>>>> simply use a flag that gets set by build_string?
>>>>>>
>>>>>
>>>>> What I mean with that is the case like
>>>>> char x[2] = "123456";
>>>>>
>>>>> which is build_string(7, "123456"), but with a type char[2],
>>>>> so varasm throws away "3456\0".
>>>>
>>>> I think varasm throws away chars not because of the type of
>>>> the STRING_CST but because of the available storage in x.
>>>>
>>>
>>> But at other places we look at the type of the string_cst, don't we?
>>> Shouldn't those be the same?
>>
>> I think most (all?) places look at TREE_TYPE (TREE_TYPE (string))
>> only.  I'm not aware of users of the array domain of the array type
>> of a string - but I'm far from knowing GCC inside-out ;)
>>
> 
> Yes, I know, that happens to me as well on the first day after my holidays ;)
I'm not aware of anything using the array domain of the array type of a
string.  But there's still a lot of old code in tree.c and expr.c.  Who
knows what would be found with a thorough audit.

jeff


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]