PING [PATCH] warn for strlen of arrays with missing nul (PR 86552, 86711, 86714) )

Jeff Law law@redhat.com
Sat Aug 25 17:33:00 GMT 2018


On 08/25/2018 12:32 AM, Bernd Edlinger wrote:
> On 08/25/18 01:54, Jeff Law wrote:
>> On 08/24/2018 11:26 AM, Bernd Edlinger wrote:
>>> On 08/24/18 18:51, Jeff Law wrote:
>>>>> Well, this is broken for wide character strings.
>>>>> but I hope we can get rid of STRING_CST which are
>>>>> not explicitly null terminated.
>>>
>>> I am afraid that is not going to happen.
>>> Maybe we can get STRING_CST that are never longer
>>> than the TYPE_UNIT_SIZE, but c_strlen and c_getstr
>>> need to take care that the string is zero-terminated.
>>>
>>> string_constant, should not promise the string is zero terminated.
>>> But instead it can promise that:
>>> 1) the STRING_CST is valid up to TREE_STRING_LENGTH
>>> 2) mem_size is >= TREE_STRING_LENGTH
>>> 3) memory between TREE_STRING_LENGTH and mem_size is ZERO.
>>>
>>> It will not guarantee anything about zero termination any more.
>> Interesting because those conditions would be sufficient to deal with a
>> regression I stumbled over after fixing Martin's patch to not assume
>> that all STRING_CSTs are NUL terminated.
>>
>> But I need to think about this a bit more.  Essentially the question
>> we'd need to ask is whether or not these are sufficient in general or
>> just in specific cases.
>>
>> I tend to think they're not sufficient in general. If a string returned
>> by string_constant that didn't have a terminating NUL, but which did
>> pass the tests above were ultimately passed to the runtime's str*
>> routines, then the call may run off the end of the string.  We'd like to
>> be able to warn for that.
>>
>> So ISTM those rules are only valid in contexts where we know the result
>> isn't going to be passed to str* and friends within the C library.
>>
>> I do think they're sufficient to avoid problems with the
>> tree-ssa-forwprop code we've looked at.  So what may make the most sense
>> is to have that routine indicate it's willing to accept unterminated
>> strings, then check the conditions above before optimizing the code.
>>
> 
> There are not too many callers of string_constant.
> Not all need zero termination.
Right.  And in retrospect we probably should have avoided default
parameter overloads and just fixed the callers.  But that can be a
follow-up.

> 
> But I think if the are interested in zero-termination
> they should simply call c_strlen or c_getstr.
Perhaps.


> 
> 
>>>
>>> In the end, the best approach might be to either merge my patch
>>> with Martins, or step-wise, first fixing wrong code, and then
>>> implementing warnings without fixing wrong code.
>> Unsure at this time.  I've been working with both.  I suspect that if we
>> went with yours that we'd then turn around and layer Martin's on top of
>> it because of the desire to signal to callers that we have an
>> unterminated string and have the callers take appropriate action.  Which
>> begs the question of whether or not we just go with Martin's -- ie, is
>> there really any value in using both.  I haven't seen indications there
>> is value in that approach, but I'm still poking at things.
>>
> 
> Well, ya call it "layer one patch over the other"
> I call it "incremental improvements".
It is (of course) a case by case basis.  The way I try to look at these
things is to ask whether or not the first patch under consideration
would have any value/purpose after the second patch was installed.  If
so, then it may make sense to include both.  If not, then we really just
want one patch.

Jeff



More information about the Gcc-patches mailing list