[PATCH][Tree-optimization/PR89772]fold memchr builtins for character not in constant nul-padded string

Bernd Edlinger bernd.edlinger@hotmail.de
Wed May 8 19:02:00 GMT 2019


On 5/8/19 4:31 PM, Richard Biener wrote:
> On Tue, May 7, 2019 at 4:34 AM JunMa <JunMa@linux.alibaba.com> wrote:
>>
>> On 2019/5/6 7:58 PM, JunMa wrote:
>>> On 2019/5/6 6:02 PM, Richard Biener wrote:
>>>> On Thu, Mar 21, 2019 at 5:57 AM JunMa <JunMa@linux.alibaba.com> wrote:
>>>>> Hi
>>>>> For now, GCC cannot fold code like:
>>>>>
>>>>> const char a[5] = "123";
>>>>> __builtin_memchr (a, '7', sizeof a);
>>>>>
>>>>> It avoids folding beyond the string length, although the size of a
>>>>> is 5.
>>>>> This is a bit conservative: it is safe to fold the memchr/bcmp/memcmp
>>>>> builtins when a constant string is stored in an array with some
>>>>> trailing nuls.
>>>>>
>>>>> This patch folds these cases by exposing additional length of
>>>>> trailing nuls in c_getstr().
>>>>> Bootstrapped/regtested on x86_64-linux, ok for trunk?
>>>> It's probably better if gimple_fold_builtin_memchr uses string_constant
>>>> directly instead?
>>> We can use string_constant in gimple_fold_builtin_memchr; however, it is
>>> a bit complex to use in memchr/memcmp constant folding.
>>>> You are changing memcmp constant folding but only have a testcase
>>>> for memchr.
>>> I'll add one later.
>>>> If we decide to amend c_getstr I would rather make it return the
>>>> total length instead of the number of trailing zeros.
>>> I think returning the total length is safe in other places as well.
>>> I used the extra argument in the patch since I did not want to affect
>>> other parts at all.
>>>
>>
>> Although it is safe to use the total length, I found that
>> inline_expand_builtin_string_cmp(), which uses c_getstr(), may emit
>> redundant RTL for trailing nul chars when the total length is returned.
>>
>> Also, it is trivial to handle a constant string with trailing nul chars.
>>
>> So this updated patch follows Richard's suggestion: using string_constant
>> directly.
>>
>> Bootstrapped/regtested on x86_64-linux, ok for trunk?
> 
> Doesn't this fold to NULL for the case where we are searching for '\0' and
> it doesn't occur in the string constant but only in the zero-padding?
> So you'd need to conditionalize on c being not zero (or handle
> that case specially by actually finding the first zero in the padding)?
> I think there was work to have all string constants zero terminated
> but I'm not sure if we can rely on that here.  Bernd?
> 
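
To make the concern above concrete, here is a small sketch in plain C (not GCC internals) of what a correct fold has to preserve. The array `a` and the helpers `offset_in_a` and `check_memchr_padding` are hypothetical names introduced only for this illustration. A character that appears nowhere in the array may fold to NULL, but a search for zero must yield the first zero byte rather than NULL.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical example array: three characters followed by zero bytes
   up to the array size of 5.  */
static const char a[5] = "123";

/* Hypothetical helper: offset of C within A, or -1 if C is absent.  */
static ptrdiff_t
offset_in_a (int c)
{
  const char *p = memchr (a, c, sizeof a);
  return p ? p - a : -1;
}

/* The results a folded memchr must agree with.  */
static void
check_memchr_padding (void)
{
  /* '7' is neither in the string nor in the padding: NULL is correct.  */
  assert (offset_in_a ('7') == -1);
  /* An ordinary hit inside the string.  */
  assert (offset_in_a ('2') == 1);
  /* Searching for zero must find the first zero byte at a[3],
     so folding this call to NULL would be wrong.  */
  assert (offset_in_a ('\0') == 3);
}
```

Calling `check_memchr_padding ()` from a test driver exercises all three cases; the zero search is the one the fold has to either skip or handle by locating the first zero.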

It turned out that there are a number of languages that don't have zero-terminated
strings by default, which would have complicated the patch too much for too
little benefit.

In the end, it was more important to guarantee that mem_size >= string_length holds.

In C it is just a convention that string constants are usually zero-terminated,
but even in C there are ways for string constants not to be zero-terminated.
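
One such way, sketched here as an illustration: C (unlike C++) accepts a string initializer that exactly fills a char array, and in that case no terminating zero is stored. The names `exact`, `padded` and `check_termination` are hypothetical and exist only for this example.

```c
#include <assert.h>
#include <string.h>

/* Valid C, rejected by C++: the initializer exactly fills the array,
   so there is no room for a terminating zero.  */
static const char exact[3] = "123";

/* For contrast: the remaining two bytes are implicitly zero.  */
static const char padded[5] = "123";

static void
check_termination (void)
{
  assert (sizeof exact == 3);
  /* No zero byte anywhere within the bounds of EXACT.  */
  assert (memchr (exact, '\0', sizeof exact) == NULL);
  /* PADDED has its first zero byte right after the three characters.  */
  assert (memchr (padded, '\0', sizeof padded) == padded + 3);
}
```

Newer GCC versions may warn about the `exact` initializer, but the code is still valid C.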

In theory there can be optimizations that introduce non-zero-terminated strings,
as in tree-ssa-forwprop.c, where a non-zero-terminated string constant is folded
in simplify_builtin_call.

In such a case, c_getstr might in theory return a string without zero-termination,
but I think it would be rather difficult to find a C test case for that.

Well, if I had a test case for that I would probably fix it in c_getstr to consider
the implicit padding as equivalent to an explicit zero-termination.


Bernd.


> Richard.
> 
>> Regards
>> JunMa
>>
>> gcc/ChangeLog
>>
>> 2019-05-07  Jun Ma <JunMa@linux.alibaba.com>
>>
>>      PR Tree-optimization/89772
>>      * gimple-fold.c (gimple_fold_builtin_memchr): Consider trailing nuls in
>>      out-of-bounds access checking.
>>
>> gcc/testsuite/ChangeLog
>>
>> 2019-05-07  Jun Ma <JunMa@linux.alibaba.com>
>>
>>      PR Tree-optimization/89772
>>      * gcc.dg/builtin-memchr-4.c: New test.
>>> Thanks
>>> JunMa
>>>> Richard.
>>>>
>>>>> Regards
>>>>> JunMa
>>>>>
>>>>>
>>>>> gcc/ChangeLog
>>>>>
>>>>> 2019-03-21  Jun Ma <JunMa@linux.alibaba.com>
>>>>>
>>>>>       PR Tree-optimization/89772
>>>>>       * fold-const.c (c_getstr): Add new parameter to get length of
>>>>>       additional trailing nuls after constant string.
>>>>>       * gimple-fold.c (gimple_fold_builtin_memchr): consider trailing
>>>>>       nuls in out-of-bound accesses checking.
>>>>>       * fold-const-call.c (fold_const_call): Likewise.
>>>>>
>>>>>
>>>>> gcc/testsuite/ChangeLog
>>>>>
>>>>> 2019-03-21  Jun Ma <JunMa@linux.alibaba.com>
>>>>>
>>>>>       PR Tree-optimization/89772
>>>>>       * gcc.dg/builtin-memchr-4.c: New test.
>>>
>>

