Bug 49820 - Explicit check for integer negative after abs optimized away
Summary: Explicit check for integer negative after abs optimized away
Status: RESOLVED INVALID
Alias: None
Product: gcc
Classification: Unclassified
Component: c
Version: 4.5.2
Importance: P3 major
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2011-07-23 07:10 UTC by Agner Fog
Modified: 2011-07-27 14:27 UTC (History)
2 users

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed:


Attachments
Example generating bug (275 bytes, text/plain)
2011-07-23 07:10 UTC, Agner Fog
Details

Description Agner Fog 2011-07-23 07:10:45 UTC
Created attachment 24812
Example generating bug

The integer abs function can overflow if the argument is 0x80000000 (INT_MIN), and an intended check for that overflow is ignored. With -O2 optimization, the gcc compiler optimizes away a check for a negative value after abs:
int b;
b = abs(b);
if (b < 0)  // check for overflow optimized away

The error occurs when compiling the attached file with -O2, in 32- or 64-bit mode, as C or C++. C/C++ code does not normally need to check for overflow, but the compiler should honor an explicit check when one is written.
Comment 1 Andreas Schwab 2011-07-23 07:20:20 UTC
Integer overflow is undefined. You have to check before the fact, or compile with -fwrapv.
Comment 2 Andrew Pinski 2011-07-23 17:23:44 UTC
(In reply to comment #1)
> Integer overflow is undefined. You have to check before the fact, or compile
> with -fwrapv.

That should say signed integer overflow is undefined.
Comment 3 Andreas Schwab 2011-07-23 17:30:21 UTC
Unsigned integers never overflow.
Comment 4 Jeffrey Walton 2011-07-23 20:47:55 UTC
(In reply to comment #1)
> Integer overflow is undefined. You have to check before the fact, or compile
> with -fwrapv.

I'm probably spinning wheels here, but checking 'a priori' is a bit ridiculous. Though the result is undefined, it seems to me the operation prior to the overflow (or the cause of the overflow) is still valid. Section 3.4.3 does not claim an operation which results in overflow is itself an 'undefined' operation. You don't get the 'undefined behavior' (ie, undefined result) until you perform the operation.

CPUs which use two's complement representations are usually well equipped to deal with overflow and carry. I would claim a result which overflows is actually predictable and repeatable. Fortunately, CPU manufacturers have not taken the 'ignore the problem' approach taken by ISO/IEC.
Comment 5 Eric Botcazou 2011-07-23 20:58:52 UTC
> CPUs which use two's complement representations are usually well equipped to deal
> with overflow and carry. I would claim a result which overflows is actually
> predictable and repeatable. Fortunately, CPU manufacturers have not taken the
> 'ignore the problem' approach taken by ISO/IEC.

See http://www.airs.com/blog/archives/120 for more elaborated views.
Comment 6 Jeffrey Walton 2011-07-23 22:28:07 UTC
(In reply to comment #5)
> > CPUs which use two's complement representations are usually well equipped to deal
> > with overflow and carry. I would claim a result which overflows is actually
> > predictable and repeatable. Fortunately, CPU manufacturers have not taken the
> > 'ignore the problem' approach taken by ISO/IEC.
> 
> See http://www.airs.com/blog/archives/120 for more elaborated views.

Thanks Eric - I really enjoy reading Ian and Jonathan on GCC-Users (I wish they would write a modern GCC book). I believe Ian might have overlooked one usage case (but speaking to the masses, he is spot-on).

In many cases, folks are oblivious to overflow and will truck on as if nothing happened. In the "what's overflow" case, I think Ian's arguments are correct in assuming all statements are well formed and all results are well defined. The majority of folks don't code for overflow so GCC should not care either.

However, there is a minority of folks who do care. I'm willing to perform an operation that will lead to overflow, but I'm not willing to use the [possibly undefined] result. In between the operation and the result, I need help from GCC to efficiently determine overflow or carry. For the architectures I have worked on, this inevitably comes down to some flag bits. Note that I've never worked on a sign-magnitude machine, and have no idea what the efficient methods are there.

I think there is a disconnect between ISO/IEC and their desire to produce portable code, secure programming, and practical implementations. Confer:

  ME: "I want to check the flags register on x86/x64 to determine overflow after an ADD or SUB operation."
  ISO/IEC: "What's overflow? Our abstract machines do not overflow. And a FLAGS register is not portable, so we're not making any provisions for it."

Interestingly, GCC seems to add its own twist: it wants to produce optimized code.

In the end, it would be a lot of help (to a minority of folks) if GCC moved from its position of "all programs do not have undefined behavior" and provided some intrinsics (where applicable) to help folks with the problem:
 * http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48580
 * http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49467

Jeff
Comment 7 Eric Botcazou 2011-07-24 06:56:13 UTC
> I think there is a disconnect between ISO/IEC and their desire to produce
> portable code, secure programming, and practical implementations. Confer:
> 
>   ME: "I want to check the flags register on x86/x64 to determine overflow
> after an ADD or SUB operation."
>   ISO/IEC: "What's overflow? Our abstract machines do not overflow. And a
> FLAGS register is not portable, so we're not making any provisions for it."

That's right for the C language, not for other ISO/IEC languages, e.g. Ada.

> Interestingly, GCC seems to add its own twist: it wants to produce optimized
> code.

GCC doesn't want anything; rather, it can be argued that you asked it to optimize your code this way.  One of the optimizations enabled at the -O2 level is

`-fstrict-overflow'
     Allow the compiler to assume strict signed overflow rules,
     depending on the language being compiled.  For C (and C++) this
     means that overflow when doing arithmetic with signed numbers is
     undefined, which means that the compiler may assume that it will
     not happen.  This permits various optimizations.  For example, the
     compiler will assume that an expression like `i + 10 > i' will
     always be true for signed `i'.  This assumption is only valid if
     signed overflow is undefined, as the expression is false if `i +
     10' overflows when using twos complement arithmetic.  When this
     option is in effect any attempt to determine whether an operation
     on signed numbers will overflow must be written carefully to not
     actually involve overflow.

     This option also allows the compiler to assume strict pointer
     semantics: given a pointer to an object, if adding an offset to
     that pointer does not produce a pointer to the same object, the
     addition is undefined.  This permits the compiler to conclude that
     `p + u > p' is always true for a pointer `p' and unsigned integer
     `u'.  This assumption is only valid because pointer wraparound is
     undefined, as the expression is false if `p + u' overflows using
     twos complement arithmetic.

     See also the `-fwrapv' option.  Using `-fwrapv' means that integer
     signed overflow is fully defined: it wraps.  When `-fwrapv' is
     used, there is no difference between `-fstrict-overflow' and
     `-fno-strict-overflow' for integers.  With `-fwrapv' certain types
     of overflow are permitted.  For example, if the compiler gets an
     overflow when doing arithmetic on constants, the overflowed value
     can still be used with `-fwrapv', but not otherwise.

     The `-fstrict-overflow' option is enabled at levels `-O2', `-O3',
     `-Os'.

so, by passing -O2, you effectively passed -fstrict-overflow.  If you don't care that much about performance, then use -O1 or add -fno-strict-overflow or -fwrapv.  Finally, the compiler will warn if you pass -Wstrict-overflow.

> In the end, it would be a lot of help (to a minority of folks) if GCC moved
> from its position of "all programs do not have undefined behavior" and
> provided some intrinsics (where applicable) to help folks with the problem:
>  * http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48580
>  * http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49467

There are more than a hundred distinct cases of undefined behavior in the C language as standardized by ISO/IEC.  Coping with them by default would lead to bigger binaries that run slower.  The C language was designed to be efficient, and GCC is sort of true to this vision in using all the liberty granted by the language.  That has been GCC's consistent policy for more than a decade and is very unlikely to change in the near future.

Instead GCC provides either options (like -fwrapv, -fno-strict-overflow or -fno-strict-aliasing) or intrinsics for specific needs.  But, of course, at the expense of portability, so this solution comes with its own drawbacks.
Comment 8 Agner Fog 2011-07-24 08:16:39 UTC
Thanks for your comments.

Why is the behavior different for signed and unsigned?
The expression (a + 5 < a) is reduced to always false when a is signed, but not when a is unsigned. 

-Wall produces the warning "assuming signed overflow does not occur when assuming that (X + c) < X is always false" in the above example, but there is no warning when it assumes that abs(a) < 0 is always false.

I believe that the behavior of a compiler must be predictable. An ordinary programmer would never predict that the compiler can optimize away an explicit check for overflow, no matter how many C++ textbooks he has read. If the compiler can remove a security check without warning then we have a security issue.

To say that the behavior in case of overflow is undefined is not the same as denying that overflow can occur. I think we need a sensible compromise that allows the compiler to e.g. optimize a loop under the assumption that the loop counter doesn't overflow, but doesn't allow it to optimize away an explicit overflow check. I know the compiler can't guess the programmers' intentions, but then we must at least have a warning. Any optimization rule that allows the compiler to optimize away an overflow check without warning is unacceptable in my opinion.
Comment 9 Andrew Pinski 2011-07-24 09:40:57 UTC
(In reply to comment #8)
> Why is the behavior different for signed and unsigned?
> The expression (a + 5 < a) is reduced to always false when a is signed, but not
> when a is unsigned. 

Because for unsigned, wrapping is the behavior for UINT_MAX +1.  As there is no overflow for unsigned integer, a+5 can be less than a.
Comment 10 Agner Fog 2011-07-25 07:43:58 UTC
I still think that a compiler should be predictable and consistent. It is inconsistent that a+5<a = false produces a warning, while abs(a)<0 = false does not. Both expressions could be intended overflow checks.

Besides, some compilers produce a warning when a branch condition is always true or always false. That is sound behavior, because such a condition is likely to indicate a bug. gcc does not produce a warning when optimizing away something like if (2+2 != 4).
Comment 11 Eric Botcazou 2011-07-25 07:45:36 UTC
> -Wall produces the warning "assuming signed overflow does not occur when
> assuming that (X + c) < X is always false" in the above example, but there is
> no warning when it assumes that abs(a) < 0 is always false.

As already mentioned in comment #7, you need to pass -Wstrict-overflow for this case.  There are various levels of -Wstrict-overflow, see the manual.

> I believe that the behavior of a compiler must be predictable. An ordinary
> programmer would never predict that the compiler can optimize away an explicit
> check for overflow, no matter how many C++ textbooks he has read. If the
> compiler can remove a security check without warning then we have a security
> issue.

The behavior of the compiler is predictable, no doubt about that.  And it's documented at length in the manual.  And explained in blogs, etc.  It's only a matter of learning how to write overflow checks in C, that's all.
Comment 12 Agner Fog 2011-07-25 14:21:52 UTC
No, the behavior is not predictable when it sometimes warns about ignoring overflow and sometimes does not. Please add a warning when it optimizes away an overflow check after the abs function.

Unsafe optimizations are sometimes good, sometimes causing hard-to-find bugs. The programmer can't always predict what kind of optimizations the compiler makes. A warning feature is the best way to enable the programmer to check if the compiler does the right thing. The programmer can then turn off specific warnings after verifying that the optimizations are OK.
Comment 13 Agner Fog 2011-07-26 19:31:48 UTC
My example does indeed give a warning when compiled with -Wstrict-overflow=2.
Unfortunately, -Wall implies only -Wstrict-overflow=1, so I got no warning in the first place. I think the warning levels need to be adjusted so that we get the warning with -Wall, because the consequences are no less serious than ignoring an overflow check with if(a+const<a), which gives a warning with -Wstrict-overflow=1.
Comment 14 Eric Botcazou 2011-07-27 11:23:25 UTC
> My example does indeed give a warning when compiled with -Wstrict-overflow=2.
> Unfortunately, -Wall implies only -Wstrict-overflow=1 so I got no warning in
> the first place. I think the warning levels need to be adjusted so that we get
> the warning with -Wall because the consequences are no less serious than
> ignoring an overflow check with if(a+const<a), which gives a warning with
> -Wstrict-overflow=1

If you are doing clever things, adjust the warning level you're using.  They have been chosen after considering several trade-offs, so changing them would entail redoing the full analysis.  Patches welcome.
Comment 15 Agner Fog 2011-07-27 14:27:33 UTC
How do you define "clever things"? Checking that a variable is within the allowed range is certainly a standard thing that every software teacher tells you to do. I think it is reasonable to expect -Wall to do what it says and set a very high warning level. Optimizing away an overflow check is such a dangerous thing to do that it requires a warning.

I think it may be wise to distinguish between optimizing away a whole branch or loop, versus just making calculations more efficient, e.g. simplifying expressions or making induction variables. If a branch can be optimized away, then either it violates the intentions of the programmer or the program has a logical error. A warning would be appropriate in either case.

What I am suggesting is that optimizing away a branch should give a warning at a lower level than simplifying an arithmetic expression. I know this might be somewhat complicated to implement, but it would be useful for catching the situation where an overflow check is optimized away.

Checking for overflow in a "safe" way is so complicated and tedious that it is practically never done (see https://www.securecoding.cert.org/confluence/display/seccode/INT32-C.+Ensure+that+operations+on+signed+integers+do+not+result+in+overflow )

Sorry for being persistent, but I think this issue has serious security implications.