[PATCH] warn for integer overflow in allocation calls (PR 96838)

Martin Sebor msebor@gmail.com
Tue Nov 24 18:39:51 GMT 2020


On 11/24/20 10:44 AM, Andrew MacLeod wrote:
> On 11/24/20 12:42 PM, Andrew MacLeod wrote:
>> On 11/23/20 4:38 PM, Martin Sebor wrote:
>>> On 11/21/20 6:26 AM, Andrew MacLeod wrote:
>>>> On 11/21/20 12:07 AM, Jeff Law wrote:
>>>>>
>>>>> On 11/9/20 9:00 AM, Martin Sebor wrote:
>>>>>> Ping:
>>>>>> https://gcc.gnu.org/pipermail/gcc-patches/2020-September/554000.html
>>>>>>
>>>>>> Jeff, I don't expect to have the cycles to reimplement this patch
>>>>>> using the Ranger APIs before stage 1 closes.  I'm open to giving
>>>>>> it a try in stage 3 if it's still in scope for GCC 11. Otherwise,
>>>>>> is this patch okay to commit?
>>>>> So all we're going to get from the ranger is ranges of operands, 
>>>>> right?
>>>>> Meaning that we still need to either roll our own evaluator
>>>>> (eval_size_vflow) or overload range_for_stmt with our own, which 
>>>>> likely
>>>>> looks like eval_size_vflow anyway, right?
>>>>>
>>>>> My hope was to avoid the roll our own evaluator, but that doesn't look
>>>>> like it's in the cards in the reasonably near future.
>>>>
>>>> Is there a PR open showing what exactly you are looking for?
>>>> I'm using open PRs to track enhancement requests, and they will all 
>>>> feed back into the development roadmap  I am working on.
>>>
>>> Not that I know of.  The background is upthread, in particular in
>>> Aldy's response here:
>>> https://gcc.gnu.org/pipermail/gcc-patches/2020-September/554242.html
>>>
>>> I like the suggestion and if/when I have the time I'd like to give
>>> it a try.  Until then, I think the patch is useful on its own so
>>> I'll go with it for now.
>>>
>>> Longer term, I do hope we can revisit the idea of computing either
>>> mathematically correct ranges alongside those required by the language
>>> semantics, or tracking signed overflow or unsigned wraparound. E.g.,
>>> in:
>>>
>>>   void* f (int n)
>>>   {
>>>     if (n < INT_MAX / 3)
>>>       n = INT_MAX / 3;
>>>
>>>     n *= sizeof (int);
>>>     // n is [(INT_MAX / 3) * 4, INF] mathematically
>>>     // undefined due to overflow in C
>>>     // but [INT_MIN, INT_MAX] according to VRP
>>
>> but sizeof returns a size_t.. which is an unsigned. thus the multiply 
>> is promoted to an unsigned multiply  which means there is lots of 
>> wrapping and I don't see how you can conclude those ranges? [INT_MIN, 
>> INT_MAX] are all possible outcomes based on the code that is generated.
>>
>> If I change that to
>>   n *= (int) sizeof (int) to keep it as signed arithmetic, I see:
>>
>>
>> Folding statement: n_4 = n_1 * 4;
>> EVRP:hybrid: RVRP found singleton 2147483647
>> Queued stmt for removal.  Folds to: 2147483647
>> evrp visiting stmt _7 = malloc (n_4);
>>
>> extract_range_from_stmt visiting:
>> _7 = foo (n_4);
>> Folding statement: _7 = foo (n_4);
>> EVRP:hybrid: RVRP found singleton 2147483647
>> Folded into: _7 = malloc (2147483647);
>>
>> So I'm not sure what exactly you want to do?  We are calculating what 
>> the program can produce?
>>
>> Why do we care about alternative calculations?
>>
> Or rather, why do we want to do this?

When computing the sizes of things, programmers commonly forget
to consider unsigned wrapping (or signed overflow).  We simply
assume it can't happen and that (for instance) N * sizeof (X)
is necessarily big enough for N elements of type X.  (Grepping
any code base for the pattern '\* sizeof' and looking for code
that tests that the result doesn't wrap is revealing.)

When overflow or wrapping does happen (typically because of poor
precondition checking) it often leads to bugs when we end up
allocating less space than we need and use.  A simple example
to help illustrate what I mean:

   void* g (int *a, int n)
   {
     a = realloc (a, n * sizeof (int) + 32);
     for (int i = n; i != n + 32; ++i)
       a[i] = f ();
   }

In ILP32, if (n > INT_MAX / 4 - 32) holds, n * sizeof(int) will
wrap around zero.  The realloc call will end up allocating less
space than expected, and the loop will write past the end of
the allocated block.

(The bug above can only be detected if we know n's range.
I left that part out.)

Historically, bugs caused by integer overflow and wrapping have
been among the most serious security weaknesses.  Detecting these
mistakes will help prevent some of these.

The problem is that according to C/C++, nothing in the function
above is undefined except for the buffer overflow in the loop,
and the buffer overflow only happens because of the well-defined
integer wrapping.  To detect the wrapping, we either need to do
the computation in as-if infinite math and compare the final result
to the result we get under C's truncating rules, or we need to set
and propagate the "wraparound" bit throughout the computation.

Martin


More information about the Gcc-patches mailing list