[PATCH] Handle '\0' in strcmp in RTL expansion (PR tree-optimization/90892).

Martin Liška mliska@suse.cz
Tue Jun 18 10:27:00 GMT 2019


On 6/18/19 12:16 PM, Jakub Jelinek wrote:
> On Tue, Jun 18, 2019 at 11:56:41AM +0200, Martin Liška wrote:
>> On 6/18/19 10:23 AM, Martin Liška wrote:
>>> On 6/18/19 10:11 AM, Jakub Jelinek wrote:
>>>> On Tue, Jun 18, 2019 at 10:07:50AM +0200, Martin Liška wrote:
>>>>> diff --git a/gcc/builtins.c b/gcc/builtins.c
>>>>> index 3463ffb1539..b58e1e58d4d 100644
>>>>> --- a/gcc/builtins.c
>>>>> +++ b/gcc/builtins.c
>>>>> @@ -7142,6 +7142,20 @@ inline_expand_builtin_string_cmp (tree exp, rtx target)
>>>>>    const char *src_str1 = c_getstr (arg1, &len1);
>>>>>    const char *src_str2 = c_getstr (arg2, &len2);
>>>>>  
>>>>> +  if (src_str1 != NULL)
>>>>> +    {
>>>>> +      unsigned HOST_WIDE_INT str_str1_strlen = strnlen (src_str1, len1);
>>>>> +      if (str_str1_strlen + 1 < len1)
>>>>> +	len1 = str_str1_strlen + 1;
>>>>
>>>> You really don't need any of this after strnlen.  strnlen is already
>>>> guaranteed to return a number from 0 to len1 inclusive, so you can really
>>>> just do:
>>>>   if (src_str1 != NULL)
>>>>     len1 = strnlen (src_str1, len1);
>>>>
>>>> 	Jakub
>>>>
>>>
>>> Got it, I'm testing that.
>>>
>>> Martin
>>>
>>
>> Ok, there's an off-by-one error in the previous patch candidate.
>> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
>>
>> Ready to be installed?
>> Thanks,
>> Martin
> 
>> >From fe4ef7d43c506f450de802a7d8a602fab7da4545 Mon Sep 17 00:00:00 2001
>> From: Martin Liska <mliska@suse.cz>
>> Date: Mon, 17 Jun 2019 10:39:15 +0200
>> Subject: [PATCH] Handle '\0' in strcmp in RTL expansion (PR
>>  tree-optimization/90892).
>>
>> gcc/ChangeLog:
>>
>> 2019-06-17  Martin Liska  <mliska@suse.cz>
>>
>> 	PR tree-optimization/90892
>> 	* builtins.c (inline_expand_builtin_string_cmp): Handle '\0'
>> 	in string constants.
> 
> Oops.  The problematic case is then if the STRING_CST c_getstr finds
> is not NUL terminated (dunno if we ever construct that) or if
> string_size is smaller than string_length and there are no NULs in that
> size.

The function always returns a null-terminated string:

 14587  /* Return a pointer P to a NUL-terminated string representing the sequence
 14588     of constant characters referred to by SRC (or a subsequence of such
 14589     characters within it if SRC is a reference to a string plus some
 14590     constant offset).  If STRLEN is non-null, store the number of bytes
 14591     in the string constant including the terminating NUL char.  *STRLEN is
 14592     typically strlen(P) + 1 in the absence of embedded NUL characters.  */
 14593  
 14594  const char *
 14595  c_getstr (tree src, unsigned HOST_WIDE_INT *strlen /* = NULL */)
 14596  {
 14597    tree offset_node;
 14598    tree mem_size;

That said, the unconditional strnlen should be fine.

Martin

> With your patch that would mean setting len1 or len2 one larger than needed.
> 
> Looking at the function, I think we want to do more changes.
> 
> I think any such length changes should be moved after the two punt checks.
> Move also the len3 setting before the new checks (of course conditional on
> is_ncmp).
> Sorry for the earlier advice, you indeed should store the strnlen result
> into a different variable, and through that you can easily differentiate between
> when the string is NUL terminated and when it is not (if strnlen < lenN,
> then NUL terminated).
> If !is_ncmp, I think you should punt for not NUL terminated strings
> (return NULL_RTX).
> If is_ncmp and not NUL terminated, you should punt if len3 is bigger than
> len{1,2}.
> If NUL terminated, you should set lenN to the strnlen returned value + 1.
> 
> Does this sound reasonable?
> 
> 	Jakub
> 



More information about the Gcc-patches mailing list