[PATCH] handle non-constant offsets in -Wstringop-overflow (PR 77608)

Martin Sebor msebor@gmail.com
Tue Nov 21 19:53:00 GMT 2017


On 11/21/2017 09:55 AM, Jeff Law wrote:
> On 11/19/2017 04:28 PM, Martin Sebor wrote:
>> On 11/18/2017 12:53 AM, Jeff Law wrote:
>>> On 11/17/2017 12:36 PM, Martin Sebor wrote:
>>>> The attached patch enhances -Wstringop-overflow to detect more
>>>> instances of buffer overflow at compile time by handling non-
>>>> constant offsets into the destination object that are known to
>>>> be in some range.  The solution could be improved by handling
>>>> even more cases (e.g., anti-ranges or offsets relative to
>>>> pointers beyond the beginning of an object) but it's a start.
>>>>
>>>> In addition to bootsrapping/regtesting GCC, also tested with
>>>> Binutils/GDB, Glibc, and the Linux kernel on x86_64 with no
>>>> regressions.
>>>>
>>>> Martin
>>>>
>>>> The top of GDB fails to compile at the moment so the validation
>>>> there was incomplete.
>>>>
>>>> gcc-77608.diff
>>>>
>>>>
>>>> PR middle-end/77608 - missing protection on trivially detectable
>>>> runtime buffer overflow
>>>>
>>>> gcc/ChangeLog:
>>>>
>>>>     PR middle-end/77608
>>>>     * builtins.c (compute_objsize): Handle non-constant offsets.
>>>>
>>>> gcc/testsuite/ChangeLog:
>>>>
>>>>     PR middle-end/77608
>>>>     * gcc.dg/Wstringop-overflow.c: New test.
>>> The recursive call into compute_objsize passing in the ostype avoids
>>> having to think about the whole object vs nearest containing object
>>> issues.  Right?
>>>
>>> What's left to worry about is maximum or minimum remaining bytes in the
>>> object.  At least that's my understanding of how ostype works here.
>>>
>>> So we get the amount remaining, ignoring the variable offset, from the
>>> recursive call (SIZE).  The space left after we account for the variable
>>> offset is [SIZE - MAX, SIZE - MIN].  So ISTM for type 0/1 you have to
>>> return SIZE-MIN (which you do) and for type 2/3 you have to return
>>> SIZE-MAX which I think you get wrong (and you have to account for the
>>> possibility that MAX or MIN is greater than SIZE and thus there's
>>> nothing left).
>>
>> Subtracting the upper bound of the offset from the size instead
>> of the lower bound when the caller is asking for the minimum
>> object size would make the result essentially meaningless in
>> common cases where the offset is smaller than size_t, as in:
>>
>>   char a[7];
>>
>>   void f (const char *s, unsigned i)
>>   {
>>     __builtin_strcpy (a + i, s);
>>   }
>>
>> Here, i's range is [0, UINT_MAX].
>>
>> IMO, it's only useful to use the lower bound here, otherwise
>> the result would only rarely be non-zero.
> But when we're asking for the minimum left, aren't we essentially asking
> for "how much space am I absolutely sure I can write"?  And if that is
> the question, then the only conservatively correct answer is to subtract
> the high bound.

I suppose you could look at it that way but IME with this work
(now, and also last year when I submitted a patch actually
changing the built-in), using the upper bound is just not that
useful because it's too often way too big.  There's no way to
distinguish an out-of-range upper bound that's the result of
an inadequate attempt to constrain a value from an out-of-range
upper bound that is sufficiently constrained but in a way GCC
doesn't see.

There are no clients of this API that would be affected by
the decision one way or the other (unless the user specifies
a -Wstringop-overflow= argument greater than the default 2)
so I don't think what we do now matters much, if at all.
Perhaps in the future with some of the range improvements
that you, Aldy and Andrew have been working on.

That said, if it helps us move forward with this enhancement
I'll use the upper bound -- let me know.  In the future, when
it is actually used, we'll adjust it as necessary.

>> This is also what other warnings that deal with ranges do.  For
>> -Warray-bounds only considers the lower bound (unless it's negative)
>> when deciding whether or not to warn for
>>
>>   int g (unsigned i)
>>   {
>>     return a[i];
>>   }>
>> It would be too noisy to be of practical use otherwise (even at
>> level 2).
> Which argues that:
>
> 1. Our ranges suck
> 2. We're currently avoiding dealing with that by giving answers that are
> not conservatively correct.
>
> Right?

I don't feel quite as strongly.  Modulo the pesky bugs we get
every so often for some of the corner cases or known limitations
they seem actually reasonably accurate in most cases, and the
warnings, for the most part, strike a reasonable balance between
false positives and negatives.  But I certainly agree that there
is room to improve and I look forward to taking some of the range
enhancement out for a spin.

Martin



More information about the Gcc-patches mailing list