Bug 107890 - UB on integer overflow impacts code flow
Summary: UB on integer overflow impacts code flow
Status: RESOLVED INVALID
Alias: None
Product: gcc
Classification: Unclassified
Component: c (show other bugs)
Version: 12.2.0
: P3 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords: diagnostic
Depends on:
Blocks:
 
Reported: 2022-11-27 21:31 UTC by ux
Modified: 2022-12-01 07:44 UTC (History)
3 users (show)

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description ux 2022-11-27 21:31:20 UTC
Following is a code that is sensible to a signed integer overflow. I was under the impression that this kind of undefined behavior essentially meant that the value of that integer could become unreliable. But apparently this is not limited to the value of said integer, it can also dramatically impact the code flow.

Here is the pathological code:

    #include <stdint.h>
    #include <stdio.h>
    #include <stdlib.h>

    uint8_t tab[0x1ff + 1];

    uint8_t f(int32_t x)
    {
        if (x < 0)
            return 0;
        int32_t i = x * 0x1ff / 0xffff;
        if (i >= 0 && i < sizeof(tab)) {
            printf("tab[%d] looks safe because %d is between [0;%d[\n", i, i, (int)sizeof(tab));
            return tab[i];
        }

        return 0;
    }

    int main(int ac, char **av)
    {
        return f(atoi(av[1]));
    }

Triggering an overflow actually enters the printf/dereference scope, violating the protective condition and thus causing a crash:

    % cc -Wall -O2 overflow.c -o overflow && ./overflow 50000000
    tab[62183] looks safe because 62183 is between [0;512[
    zsh: segmentation fault (core dumped)  ./overflow 50000000

I feel extremely uncomfortable about an integer overflow actually impacting something else than the integer itself. Is it expected or is this a bug?
Comment 1 Andrew Pinski 2022-11-27 21:36:35 UTC
>I was under the impression that this kind of undefined behavior essentially meant that the value of that integer could become unreliable.

Your impression is incorrect. Once undefined behavior happens, anything can happen. 

This is why things like -fsanitize=undefined is there now.
Comment 2 Jonathan Wakely 2022-11-28 01:02:45 UTC
You should read https://blog.regehr.org/archives/213
Comment 3 Martin Uecker 2022-11-28 07:51:10 UTC
Of course, instead of using the standard as an excuse, we could also try to make the compiler less of a footgun. 

Even if this is standard conforming, it is still a severe usability issue with safety implications and I do not think we should simply close such bugs.
Comment 4 Andrew Pinski 2022-11-28 08:26:49 UTC
There is always a trade off here.
We made the decision that signed integer overflow is undefined because we want to do optimizations. Gcc does provide an option which makes them behave well defined at runtime, -fwrapv . This is similar to strict aliasing with respect to optimizations in the sense it is hard to decide if the optimizations is being done will break what people assumptions are. This is part of the reason why there is a specifications (standard) so people can write to it.
There are other undefined behavior that gcc has started to take advantage of (e.g. in c++ if there is no return with a value in a function with that returns non-void). The question is where do you draw the line with respect to undefined behaviors, the answer is complex sometimes, especially if you are optimizing.

In this example the range of x is known to be positive so multiplying by another positive # gives a positive result and then dividing by a positive value still is positive. The check for negative result is optimized away.
Comment 5 Eric Gallager 2022-12-01 07:44:02 UTC
What I wonder is, why doesn't -Wstrict-overflow=5 catch this? That's supposed to be its strictest level, right?

$ /usr/local/bin/gcc -Wall -Wextra -pedantic -Wstrict-overflow=5 -O2 107890.c
107890.c: In function 'f':
107890.c:13:25: warning: comparison of integer expressions of different signedness: 'int32_t' {aka 'int'} and 'long unsigned int' [-Wsign-compare]
   13 |         if (i >= 0 && i < sizeof(tab)) {
      |                         ^
107890.c: In function 'main':
107890.c:21:18: warning: unused parameter 'ac' [-Wunused-parameter]
   21 |     int main(int ac, char **av)
      |              ~~~~^~
$