[PATCH] Make strlen range computations more conservative

Jeff Law law@redhat.com
Fri Aug 3 07:38:00 GMT 2018


On 08/01/2018 01:19 AM, Richard Biener wrote:
> On Tue, 31 Jul 2018, Martin Sebor wrote:
> 
>> On 07/31/2018 09:48 AM, Jakub Jelinek wrote:
>>> On Tue, Jul 31, 2018 at 09:17:52AM -0600, Martin Sebor wrote:
>>>> On 07/31/2018 12:38 AM, Jakub Jelinek wrote:
>>>>> On Mon, Jul 30, 2018 at 09:45:49PM -0600, Martin Sebor wrote:
>>>>>> Even without _FORTIFY_SOURCE GCC diagnoses (some) writes past
>>>>>> the end of subobjects by string functions.  With _FORTIFY_SOURCE=2
>>>>>> it calls abort.  This is the default on popular distributions,
>>>>>
>>>>> Note that _FORTIFY_SOURCE=2 is the mode that goes beyond what the
>>>>> standard requires and imposes extra requirements.  So from what this
>>>>> mode accepts or rejects we shouldn't determine what is or isn't
>>>>> considered valid.
>>>>
>>>> I'm not sure what the additional requirements are but the ones
>>>> I am referring to are the enforcing of struct member boundaries.
>>>> This is in line with the standard requirements of not accessing
>>>> [sub]objects via pointers derived from other [sub]objects.
>>>
>>> In the middle-end the distinction between what was originally a reference
>>> to subobjects and what was a reference to objects is quickly lost
>>> (whether through SCCVN or other optimizations).
>>> We've run into this many times with the __builtin_object_size already.
>>> So, if e.g.
>>> struct S { char a[3]; char b[5]; } s = { "abc", "defg" };
>>> ...
>>> strlen ((char *) &s) is well defined but
>>> strlen (s.a) is not in C; in the middle-end you might not be able to
>>> figure out which one is which.
>>
>> Yes, I'm aware of the middle-end transformation to MEM_REF
>> -- it's one of the reasons why detecting invalid accesses
>> by the middle end warnings, including -Warray-bounds,
>> -Wformat-overflow, -Wsprintf-overflow, and even -Wrestrict,
>> is less than perfect.
>>
>> But is strlen(s.a) also meant to be well-defined in the middle
>> end (with the semantics of computing the length of "abcdefg")?
> 
> Yes.
> 
>> And if so, what makes it well defined?
> 
> The fact that strlen takes a char * argument and thus inline-expansion
> of a trivial implementation like
[ ... ]
And ISTM again the key here is the type of the object that actually gets
passed to strlen at the gimple level.  If it's a char *, then the type
does not constrain the return value in any way, shape, or form.
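
A minimal sketch of that point, reusing the struct S example quoted
earlier in the thread.  The names naive_len, whole_object_length and
check_example are hypothetical and introduced only for illustration;
this is not the implementation elided above, just one plausible shape
of a routine that sees nothing but a char *.

```c
#include <assert.h>
#include <stddef.h>

/* The struct from the thread: a[3] has no room for a terminating NUL,
   so the first zero byte in the object is the last byte of b. */
struct S { char a[3]; char b[5]; };

struct S s = { "abc", "defg" };

/* Hypothetical trivial length routine (name is an assumption).
   It only ever sees a char pointer, so nothing in its argument type
   says how large the pointed-to array is. */
size_t naive_len(const char *p)
{
    size_t n = 0;
    while (p[n] != '\0')
        n++;
    return n;
}

/* Walk the whole object through a char pointer, which is the
   well-defined case mentioned in the thread. */
size_t whole_object_length(void)
{
    return naive_len((const char *) &s);
}

void check_example(void)
{
    /* Both members are char arrays, so the struct has no padding. */
    assert(sizeof(struct S) == 8);
    /* 'a' 'b' 'c' 'd' 'e' 'f' 'g' precede the first NUL. */
    assert(whole_object_length() == 7);
}
```

The result is larger than sizeof s.a, which is why a range derived from
the size of the member array would be too narrow whenever the argument
that reaches strlen is a plain char *.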

Jeff


