[PATCH] Make strlen range computations more conservative
Bernd Edlinger
bernd.edlinger@hotmail.de
Fri Nov 16 17:26:00 GMT 2018
Just a reminder:
those are the two parts of this patch, which have been posted already
a while ago when we were still in stage 1:
https://gcc.gnu.org/ml/gcc-patches/2018-09/msg00805.html
https://gcc.gnu.org/ml/gcc-patches/2018-10/msg01237.html
Bernd.
On 10/20/18 11:16 AM, Bernd Edlinger wrote:
> On 10/17/18 11:56 PM, Jeff Law wrote:
>> On 10/12/18 9:34 PM, Bernd Edlinger wrote:
>>> On 10/12/18 16:55, Jeff Law wrote:
>>>> On 9/15/18 2:43 AM, Bernd Edlinger wrote:
>>>>> Hi,
>>>>>
>>>>> this is an update on my strlen range patch (V7). Again re-based and
>>>>> retested to current trunk.
>>>>>
>>>>> I am aware that Martin wants to re-factor the interface of get_range_strlen
>>>>> and have no objections against, but I'd suggest that to be a follow-up patch.
>>>>>
>>>>> I might suggest to rename one of the two get_range_strlen functions at the
>>>>> same time as it is rather confusing to have to count the parameters in order
>>>>> to tell which function is meant.
>>>>>
>>>>> Bootstrapped and reg-tested on x86_64-pc-linux-gnu.
>>>>> Is it OK for trunk?
>>>>>
>>>>>
>>>>> Thanks
>>>>> Bernd.
>>>>>
>>>>>
>>>>> changelog-range-strlen-v7.txt
>>>>>
>>>>> gcc:
>>>>> 2018-08-26 Bernd Edlinger <bernd.edlinger@hotmail.de>
>>>>>
>>>>> * gimple-fold.c (looks_like_a_char_array_without_typecast_p): New
>>>>> helper function for strlen range estimations.
>>>>> (get_range_strlen): Use looks_like_a_char_array_without_typecast_p
>>>>> for warnings, but use GIMPLE semantics otherwise.
>>>>> * tree-ssa-strlen.c (maybe_set_strlen_range): Use GIMPLE semantics.
>>>>> (get_min_string_length): Avoid not NUL terminated string literals.
>>>> The introduction of looks_like_a_char_array_without_typecast_p is
>>>> probably a good thing. Too much code is already implemented inline
>>>> within get_range_strlen.
>>>>
>>>> It looks like you added handling of ARRAY_RANGE_REF. I don't know how
>>>> often they come up in practice, but handling it seems like a reasonable
>>>> extension to what we're doing. Bonus points if it's triggering with any
>>>> kind of consistency.
>>>>
>>>
>>> I did only want to be consistent with get_inner_reference here,
>>> but did not have encountered these, probably only an Ada thing?
>> Trying to be consistent with get_inner_reference is fine :-) GCC
>> supports case ranges as an extension for C/C++. No clue if they're
>> natively supported by Ada or any other langauge.
>>
>>
>>
>>>
>>>> I actually prefer Martin's unification of type/fuzzy into a single
>>>> enumeration to describe the desired behavior. Doing it with two args
>>>> where some values are mutually exclusive is just asking for trouble.
>>>> Though I like that you called out the values that are mutually exclusive.
>>>>
>>>> I definitely want to look at how your patch and Martin's differ on the
>>>> handling of flexible array members -- clearly we must avoid setting a
>>>> range in that case. I'm surprised this didn't trigger a failure in the
>>>> testsuite though. Martin's work in this space did.
>>>>
>>>> The bugfix in get_min_string_length looks like it probably stands on its
>>>> own.
>>>>
>>>> I'm still evaluating the two approaches...
>>>>
>>>
>>> One thing I should mention is, that there is still one place where opportunistic
>>> range info influence conde gen. I mean at least with my patch.
>> ACK. That's soemthing Martin's patch does address. AT least it's
>> supposed to.
>
> Okay, based on my previous patch I can of course do the same.
>
> See attached. This was bootstrapped and reg-tested together with my
> previous patch. The only "regression" was pr79376.c, which is xfailed
> because the test case is expecting the return value to be in the limits given
> by in the opportunistic range info.
>
> While I think the strlen return optimization will be safe with this patch,
> I have however still a philosophical problem with it, because s[n]printf
> is a highly complex piece of software, and we take it away the right
> to return a failure code, when it has to because of an implementation bug.
>
>>>
>>> That is the return value from sprintf is using the range info from the
>>> warning, and uses that to set the range info of the result.
>>> In try_substitute_return_value, which uses the range info that was
>>> from the warnings and feeds that into set_range_info.
>> Right. In Martin's work we have enough range info to distinguish
>> between the range info for warnings and the true range info and only use
>> the latter in the call to set_range_info.
>>
>>
>
> Well I have tried the test cases from Martins patch, and all except one
> work fine for me, and pass with my patch-set as well.
>
> The problematic one is strlenopt-59.c (in his patch, my patch has picked
> the same name, unfortunately).
>
> The difference is how object declarations are handled. While my patch
> does not try to solve that problem at all, his patch does probably look
> at the declaration size to improve the strict limits.
>
> I am not totally against it, but do not feel any need to implement that
> feature in the same patch together with a function interface change, and
> a code-correctness fix.
>
> From the test case it looks like the globals are comdat objects, because there
> is no initialization. You can declare "char a3[3];" and "char a3[100];" in
> different translation units and it will be a3[100] at run-time.
>
> For me the red line here is basically, that the strlen optimization should
> _not_ be more aggressive than the loop-niter optimization, thus the lackmus
> test is, would the test case pass if strlen is implemented as:
>
> #define strlen(c) ({ __SIZE_TYPE__ _n; for(_n=0; (c)[_n]; _n++); _n; })
>
> Well, it does not. But that should probably considered as a goal.
>
>
>
> Bernd.
More information about the Gcc-patches
mailing list