Bug 46899 - compiler optimization
Summary: compiler optimization
Status: RESOLVED INVALID
Alias: None
Product: gcc
Classification: Unclassified
Component: c
Version: 4.4.5
Importance: P3 minor
Target Milestone: ---
Assignee: Not yet assigned to anyone
 
Reported: 2010-12-12 00:35 UTC by Eskil Steenberg
Modified: 2010-12-13 14:09 UTC

Description Eskil Steenberg 2010-12-12 00:35:47 UTC
This keeps bothering me because I really think there is something wrong with the gcc compiler. It's a bit of a corner case, but still.

Let's consider the following code:

void my_func(short a)
{
	unsigned int b;

	b = (unsigned int)a;

	if(b == 70000)
	{
	/* something */
	}
}

Now my question is: will the "something" code ever run? So let's have a look at the line:

b = (unsigned int)a;

In no implementation that I am aware of will a 16-bit short converted to an unsigned int ever produce the value 70000. However, the spec says:

An example of undefined behavior is the behavior on integer overflow.

So in theory it could give b the value 70000 on a particular (although imaginary) implementation. So according to the spec, the answer is yes.

So if the compiler wants to optimize this code, it has two options: assume b can be anything (and keep the if statement), or determine the behavior of overflow on the particular target hardware, conclude that it could NEVER yield the number 70000, and then optimize away the if statement.

GCC, however, assumes that b will be in the range of zero to max short, regardless of whether this is true on the hardware. That is an assumption neither the spec nor most hardware implementations back up.

This doesn't matter if the number is 70000 (because we don't have that weird hardware), but if it is 4294967294, it will, since most hardware will wrap the overflowing integer.

This causes problems, since users (like me) can read out the value of b, see that it matches whatever I'm comparing it to, and yet it doesn't trigger the branch.

Regards 

E
Comment 1 Andrew Pinski 2010-12-12 01:54:16 UTC
There is no integer overflow in the code provided at all.
Comment 2 Andrew Pinski 2010-12-12 01:55:56 UTC
(In reply to comment #1)
> There is no integer overflow in the code provided at all.
Even if there was, the standard says the behavior is undefined which means anything can happen.
Comment 3 Eskil Steenberg 2010-12-12 09:09:54 UTC
Hi

> (In reply to comment #1)
>> There is no integer overflow in the code provided at all.

Sorry, it underflows. How about this:

void my_func(unsigned short a, unsigned short c)
{
	unsigned int b;

	b = a * c;

....

> Even if there was, the standard says the behavior is undefined which means
> anything can happen.

Yes, but doesn't the C spec define the overflow as undefined, rather
than the entire program? The behavior is defined, just not by the C spec;
it's defined by the hardware implementation. The compile-time assumption
that nothing will ever overflow seems dangerous.

My problem is not that C has undefined behavior, but rather that gcc makes
assumptions about this behavior that _can_ turn out not to be true.

Cheers

E
Comment 4 Andrew Pinski 2010-12-12 10:20:03 UTC
>Sorry it underflows.

No, conversion does not have any overflow/underflow in it.

>void my_func(unsigned short a, unsigned short c)
>{
>    unsigned int b;
>
>    b = a * c;

There is no overflow here, since unsigned integers wrap and don't overflow.

> Yes, but doesn't the C spec define the overflow as undefined, rather
> than the entire program?

No, it is runtime undefined behavior rather than the result being undefined.

> rather that gcc makes assumptions about this behavior that _can_ turn out
> not to be true.

But assumptions?  Since it is undefined behavior, it does not matter, because GCC can make different assumptions whenever it feels like.

Unless you can give a testcase that does not depend on undefined behavior, it is hard to prove GCC is doing something wrong.  -fwrapv can be used to define signed integer overflow as wrapping.  
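
For example, the difference should be visible by compiling the same testcase with and without the flag (the file name here is just illustrative):

gcc -O2 test.c           # signed overflow is undefined; the comparison may be folded away
gcc -O2 -fwrapv test.c   # signed overflow wraps; the comparison is evaluated as written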

See http://gcc.gnu.org/onlinedocs/gcc-4.5.1/gcc/Integers-implementation.html for how the conversion is implementation-defined behavior:
> # The result of, or the signal raised by, converting an integer to a signed
> integer type when the value cannot be represented in an object of that type
> (C90 6.2.1.2, C99 6.3.1.3).
> For conversion to a type of width N, the value is reduced modulo 2^N to be 
> within range of the type; no signal is raised. 

Conversions never cause an overflow; rather, they cause implementation-defined behavior in some cases.
Comment 5 Eskil Steenberg 2010-12-12 12:30:15 UTC
Hi

>>void my_func(unsigned short a, unsigned short c)
>>{
>>    unsigned int b;
>>
>>    b = a * c;
>
> There is no overflow here, since unsigned integers wrap and don't
> overflow.

Yes, there is, since a and c are promoted to signed ints, and that's where
the multiplication takes place, before the result is converted to an
unsigned int.

A max unsigned short times a max unsigned short will overflow a signed
int (given a 32-bit system, at least).
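
A minimal sketch of the hazard (assuming a 16-bit unsigned short and a
32-bit int, which is what I have in mind here); casting one operand first
is one way to keep the multiplication in unsigned arithmetic, where the
result is always defined:

#include <stdio.h>

unsigned int mul_promoted(unsigned short a, unsigned short c)
{
	/* a and c are promoted to signed int before the multiply, so
	   0xffff * 0xffff exceeds INT_MAX: undefined behavior */
	return a * c;
}

unsigned int mul_unsigned(unsigned short a, unsigned short c)
{
	/* casting one operand makes the multiply unsigned, which is
	   defined modulo 2^32 */
	return (unsigned int)a * c;
}

int main(void)
{
	printf("%u\n", mul_unsigned(0xffff, 0xffff)); /* prints 4294836225 */
	return 0;
}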

>> Yes, but doesn't the C spec define the overflow as undefined, rather
>> than the entire program?
>
> No, it is runtime undefined behavior rather than the result being
> undefined.

So how can the compiler make a compile-time assumption about the outcome
of the behavior, since it is undefined at compile time?

>> rather that gcc makes assumptions about this behavior that _can_ turn
>> out not to be true.
>
> But assumptions?  Since it is undefined behavior, it does not matter,
> because GCC can make different assumptions whenever it feels like.

Could you clarify this statement?

> Unless you can give a testcase that does not depend on undefined behavior,
> it is hard to prove GCC is doing something wrong.

The very problem I'm addressing is how gcc deals, at compile time, with
this undefined behavior.

Cheers

E
Comment 6 Andrew Pinski 2010-12-12 21:02:55 UTC
>it is undefined at compile time?

No, it is undefined at runtime.  Again, this is not undefined behavior at compile time but rather at runtime.  What that means is that the behavior cannot be diagnosed (at least by the C standard's definitions) at compile time, and the code still has to compile.  There is no undefined-at-compile-time behavior here, only at runtime.
Comment 7 Eskil Steenberg 2010-12-12 21:46:18 UTC
Hi

> No, it is undefined at runtime.  Again, this is not undefined behavior
> at compile time but rather at runtime.  What that means is that the
> behavior cannot be diagnosed (at least by the C standard's definitions)
> at compile time, and the code still has to compile.  There is no
> undefined-at-compile-time behavior here, only at runtime.

Well, the compiler does make assumptions about what the runtime will do.
Have a look at this function:

void my_func(unsigned short a, unsigned short b)
{
    unsigned int c;
    c = a * b;

    printf("c = %u\n", c);
    if(c < 3000000000) /* 3 billion */
        printf("foo bar\n");
}

using gcc you can get the following output (with inputs 0xffff and 0xffff):

c = 4294836225
foo bar

This output should not be possible and will be very confusing to the user
(it was to me). My (limited) reading of the C spec does not support this,
even though I understand why it happens.

At compile time the compiler decides that c can never be larger than max
signed int, and therefore it thinks it can optimize away the if statement.
(INT_MAX is 2147483647, which is below 3000000000, so under that
assumption the condition is always true.)

Cheers

E

PS: sorry, I don't have a compiler on this machine, so excuse any typos.
Comment 8 Andrew Pinski 2010-12-12 21:52:40 UTC
>This output should not be possible 

No, it is possible because the value is undefined, so both the if being false and the printout happening can happen.
Comment 9 Eskil Steenberg 2010-12-12 22:23:36 UTC
Hi

> No, it is possible because the value is undefined so both the if being
> false and the printout happening can happen.

But undefined still means that the variable c has a value, just not one
that can be determined at compile time.

The value c is not undefined, just the operation that produces the value
stored in c. Therefore anything the variable c touches shouldn't become
undefined too.

If I give someone the number 9 and tell them to take the square root of
it, they should produce a 3 even if I don't define where I got the number
9. If they go ahead and produce the value 4, I'm going to say that they
are wrong, and not buy the argument "Since you don't define where you got
the number, I'm going to assume you really meant 16 and not 9".

I really dislike the idea that something can be undefined, and at the same
time the compiler can make assumptions about what it can be. Pick one.

Cheers

E
Comment 10 Andreas Schwab 2010-12-13 00:21:09 UTC
The execution of an undefined operation produces an undefined value, and any further operation becomes undefined.
Comment 11 Eskil Steenberg 2010-12-13 14:09:46 UTC
Hi

> The execution of an undefined operation produces an undefined value, and
> any further operation becomes undefined.

My argument is that at compile time this isn't known. Just because the C
spec doesn't guarantee any behavior to be consistent between runtime
environments doesn't mean that a compiler designer should assume that the
programmer doesn't know the behavior of the hardware he or she is
programming for.

Searching around the net, I found this:

http://www.groupsrv.com/groups/view.php?c=computer&g=6034&id=62241&p=0

And reading it, it looks like many consider this to be an area of the spec
that is perhaps lacking in clarity.

Whatever side you come down on in this, I think it's very clear that gcc
does a very poor job here in terms of usability, since it doesn't in any
way communicate to the user what it is doing. At least it should put up a
warning.
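
(As an aside, I see gcc also has a -Wstrict-overflow option that is
supposed to warn when the optimizer relies on signed overflow being
undefined, e.g. gcc -O2 -Wstrict-overflow=2 test.c; whether it fires on
this particular testcase I haven't checked.)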

Cheers

E