Bug 43089 - Optimizer ignores type in a conversion
Summary: Optimizer ignores type in a conversion
Status: RESOLVED INVALID
Alias: None
Product: gcc
Classification: Unclassified
Component: tree-optimization
Version: 4.4.3
Importance: P3 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2010-02-16 09:52 UTC by Jan Ziak (http://atom-symbol.net)
Modified: 2010-02-17 18:12 UTC
CC List: 1 user

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed:


Description Jan Ziak (http://atom-symbol.net) 2010-02-16 09:52:54 UTC
#include <stdio.h>
#include <assert.h>

struct AB {
        unsigned a:1;
        unsigned b:31;
};

int main(int argc, char **argv) {
        unsigned in;
        struct AB ab;
        unsigned b2;

        sscanf(argv[1], "%x", &in);
        ab = (struct AB){0,in};

        b2 = ab.b + ab.b;
        assert(!(b2 <= 0x7fffffff));

        return 0;
}

Architecture: i386
Command line: ./a.out 7fffffff
Succeeds when compiled with: gcc -O0 ...
Fails when compiled with: gcc -O2 ...
Expected behavior: the program should execute successfully

Possible explanation: In the expression (ab.b + ab.b), the bit-field "b" gets converted into an int. The addition is therefore of type (int+int), with an (int) as result. The (int) result is then converted into an (unsigned int) - but this step is skipped when using -O2, which leads the compiler to the wrong conclusion that (b2 <= 0x7fffffff) is always true.
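For reference, a minimal sketch of the implicit conversions described above (the function and variable names are illustrative only, not part of the report; it assumes a target where int is 32 bits):

#include <stdio.h>

struct AB {
        unsigned a:1;
        unsigned b:31;
};

/* Equivalent of "b2 = ab.b + ab.b" with the implicit steps written out. */
unsigned add_twice(struct AB ab) {
        int lhs = ab.b;         /* the 31-bit bit-field promotes to int */
        int rhs = ab.b;
        int sum = lhs + rhs;    /* signed addition: overflows (undefined
                                   behavior) when ab.b is 0x7fffffff */
        return (unsigned) sum;  /* the int result converts back to unsigned */
}

int main(void) {
        struct AB ab = {0, 0x20000000u};     /* a value that does not overflow */
        printf("%08x\n", add_twice(ab));     /* prints 40000000 */
        return 0;
}

If the optimizer assumes the signed addition never overflows, the int result lies in [0, INT_MAX], so after the conversion it may treat (b2 <= 0x7fffffff) as always true and fold the check - which would be consistent with the -O2 failure reported above.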
Comment 1 Richard Biener 2010-02-16 10:16:35 UTC
0x7fffffff + 1 overflows.  Signed overflow invokes undefined behavior.
Use -fwrapv if you want wrapping signed overflow.
Comment 2 Jan Ziak (http://atom-symbol.net) 2010-02-16 10:59:29 UTC
(In reply to comment #1)
> 0x7fffffff + 1 overflows.  Signed overflow invokes undefined behavior.

Like, so what? Is this your way of saying "I am not going to fix it"? Do you find it convenient to hide your laziness behind the words "undefined behavior"?

If I were to modify the test case like this:

int i = ab.b;
b2 = i + i;

I would ALSO be triggering undefined behavior. But the modified test-case would succeed at any optimization level.

I don't think you understand what I am demanding here: I demand that the compiler have CONSISTENT BEHAVIOR in cases which are not defined by the standard. The modified code clearly does the SAME thing as the code in the test-case, only the intermediate conversion to the integer is now more explicit.
Comment 3 Jan Ziak (http://atom-symbol.net) 2010-02-16 11:23:40 UTC
(In reply to comment #2)
> 
> If I were to modify the test case like this:
> 
> int i = ab.b;
> b2 = i + i;
> 
> I would ALSO be triggering undefined behavior. But the modified test-case would
> succeed at any optimization level.

Whoops. This obviously is one of my bad days: the modified test-case would fail at -O2 as well. Anyway, the test-case was extracted from a much larger piece of code which works OK if I compile it with -O2 but generates a segmentation fault when compiled with -O3, because the optimization is deeper and allows the compiler to evaluate the conditional expression at compile-time.

Let me return to the original issue: the inconsistency between the behavior at -Oi vs -O(i+1). Are you going to fix it, or not?

> I don't think you understand what I am demanding here: I demand that the
> compiler have CONSISTENT BEHAVIOR in cases which are not defined by the standard.
> The modified code clearly does the SAME thing as the code in the test-case, only
> the intermediate conversion to the integer is now more explicit.
Comment 4 Jakub Jelinek 2010-02-16 11:56:30 UTC
There is nothing to fix.  Your program triggers undefined behavior.  It can do anything, which can include something you'd expect, or something completely different and it can depend on compiler options, position of stars, etc.

As Richard said, if you want signed overflow to be well defined, compile with -fwrapv.  Or, avoid doing the addition in this case in a signed type when you want it to wrap.  E.g. b2 = (unsigned) ab.b + ab.b; does the addition in unsigned type where wrapping is well defined (and even no wrapping occurs for 0x7fffffffU + 0x7fffffffU).
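A sketch of the test-case with that change applied (the argc/sscanf check is an extra safety addition, not part of the original report). The addition is then carried out in unsigned arithmetic, where wrapping is well defined, and for the 7fffffff input no wrapping occurs at all:

#include <stdio.h>
#include <assert.h>

struct AB {
        unsigned a:1;
        unsigned b:31;
};

int main(int argc, char **argv) {
        unsigned in;
        struct AB ab;
        unsigned b2;

        if (argc < 2 || sscanf(argv[1], "%x", &in) != 1)
                return 1;
        ab = (struct AB){0, in};

        /* The cast forces the usual arithmetic conversions to do the
           addition in unsigned int, so there is no signed overflow. */
        b2 = (unsigned) ab.b + ab.b;
        assert(!(b2 <= 0x7fffffff));

        return 0;
}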
Comment 5 Jan Ziak (http://atom-symbol.net) 2010-02-16 17:37:11 UTC
(In reply to comment #4)
> There is nothing to fix.  Your program triggers undefined behavior.  It can do
> anything, which can include something you'd expect, or something completely
> different and it can depend on compiler options, position of stars, etc.

I understand what you are saying, but I do not agree with that. In my opinion, an *optimization* option should never result in any change of a program's behavior for this particular kind of undefined behavior. I mean, there are basically two different kinds of undefined behaviors:

1. Where the compiler has to choose a *particular* implementation.

2. Where the compiler does not choose anything or cannot choose anything particular. (For example, what happens when accessing deallocated memory.)

The conversion test-case is of the 1st kind. Not of the 2nd kind. GCC -O0 chooses to generate a particular sequence of instructions to implement the undefined behavior. GCC -O2 does not respect the choice made at -O0 (or vice versa).

So, my question is: If it is possible for the problematic code to be implemented in all contexts by the same operation, and in this case it indeed is possible, why is GCC using two different operations? How do you justify that?
Comment 6 pinskia@gmail.com 2010-02-16 17:51:30 UTC
Subject: Re: Optimizer ignores type in a conversion

Sent from my iPhone

On Feb 16, 2010, at 9:37 AM, "0xe2 dot 0x9a dot 0x9b at gmail dot com"
<gcc-bugzilla@gcc.gnu.org> wrote:

> [...]
> 1. Where the compiler has to choose a *particular* implementation.

Huh, this is the opposite effect of undefined behavior. In fact, for signed integer overflow, gcc sometimes optimizes it as wrapping and other times as clamping. In this case it is clamping. It is sometimes hard to optimize undefined behavior consistently because of inlining and other optimizations that can change the IR before the undefined behavior is optimized.

Comment 7 Jakub Jelinek 2010-02-16 18:26:04 UTC
Where the compiler always chooses some particular implementation is implementation defined behavior, not undefined behavior.  Undefined behavior is always just that, undefined.
Comment 8 Jan Ziak (http://atom-symbol.net) 2010-02-17 14:16:37 UTC
(In reply to comment #7)
> Where the compiler always chooses some particular implementation is
> implementation defined behavior, not undefined behavior.  Undefined behavior is
> always just that, undefined.

In comment #4 you wrote:

"There is nothing to fix.  Your program triggers undefined behavior."

Why didn't you (as well as others here) write "triggers implementation defined behavior"?

Doesn't that invalidate the first of your statements I cited here (namely: "There is nothing to fix")?

ISO/IEC 9899:1999, section 6.3.1.3 states that the result is implementation defined.
Comment 9 Paolo Carlini 2010-02-17 14:39:10 UTC
.
Comment 10 Jakub Jelinek 2010-02-17 14:41:09 UTC
Please stop reopening.  6.3.1.3 is about casts between integer types.
Signed integer overflow is even mentioned as an example of undefined behavior in 3.4.3.
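For what it is worth, the distinction being drawn here can be illustrated with a small example (a sketch; the value printed in the conversion case is what a typical two's-complement implementation gives, not something the standard guarantees):

#include <limits.h>
#include <stdio.h>

int main(void) {
        /* 6.3.1.3: converting a value that does not fit into a signed type
           gives an implementation-defined result (or raises an
           implementation-defined signal) - implementation defined, not
           undefined. */
        int converted = (int) 0xffffffffu;
        printf("%d\n", converted);              /* typically -1 */

        /* 3.4.3 / 6.5: signed arithmetic overflow is undefined behavior;
           the line below stays commented out because the standard promises
           no result at all for it. */
        /* int overflowed = INT_MAX + 1; */

        return 0;
}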
Comment 11 Jan Ziak (http://atom-symbol.net) 2010-02-17 17:52:07 UTC
(In reply to comment #10)
> Please stop reopening.  6.3.1.3 is about casts between integer types.
> Signed integer overflow is even mentioned as an example of undefined behavior
> in 3.4.3.

Well, look, maybe you didn't notice what this is about, so I should spell it out for you more explicitly so that you can understand it more clearly:

1. The test-case at issue here is triggering undefined behavior. I completely agree that it indeed does trigger it. I have no problem admitting it.

2. On the grounds that it is undefined behavior, you are claiming that the things which GCC currently does in case the undefined behavior gets triggered are OK - the validity of which follows from the fact that the compiler's handling of an undefined state in a program is allowed to be arbitrary. Well, guess what, I completely agree with this and I have no problem accepting the validity of this reasoning.

3. Since the compiler's handling of an undefined state in a program is allowed to be arbitrary, the sequence A of actions GCC is currently doing while compiling the test-case is not the only valid one. There also exist other perfectly valid sequences of actions GCC *could* be doing while compiling the test-case - even if those other sequences mean that the generated code has completely different semantics (when the program reaches the undefined state) than the code generated when using A. (And yes, this even includes mixing the not-directly-related section 6.3.1.3 into the sequence of actions. Even that is perfectly valid, since anything is valid.)

4. You are saying that you are not going to change how the compiler deals with the test-case. On the other hand, I am saying that the compiler should handle the test-case in a different way. You cannot dismiss my suggestion, because it is a valid proposal - and I cannot dismiss your will to maintain the status quo, because it is a valid approach.

5. Who is going to win here? Obviously, the one who has more power and control. In this case it is you and the other GCC folks, because you have more control over how GCC gets developed, and because I am not willing to spend the time I have on persuading you that GCC should be patched.

... so, now I am going to reopen this bug to piss you all off just because I *subjectively* think that my proposal is better than the status quo. And you have absolutely no way of persuading me that I am doing something wrong or something against the C99 standard. So, there you have it: the beauty of letting the notion of undefined behavior slip into the formalization of a programming language.
Comment 12 Paolo Carlini 2010-02-17 17:58:07 UTC
.
Comment 13 Andrew Pinski 2010-02-17 18:12:34 UTC
Use -fwrapv if you want signed integer overflow defined the way you want it defined. That is the whole point of that flag. The reason why GCC acts the way it acts by default is to allow more optimizations to happen. As I mentioned before, it is hard to have it act consistently with other optimizations happening around it unless the hardware implements signed integer overflow trapping or clamping. That is the main reason why signed integer overflow is undefined: hardware could implement it differently. Even K&R C had it undefined.
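As a concrete illustration of the flag (assuming the original test-case source is saved in a hypothetically named file bug.c):

gcc -O2 bug.c         && ./a.out 7fffffff   # fails as reported: the signed addition overflows (undefined)
gcc -O2 -fwrapv bug.c && ./a.out 7fffffff   # signed overflow wraps, b2 becomes 0xfffffffe, and the assert should hold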