This is the mail archive of the gcc-help@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: Optimisations and undefined behaviour


On 05/11/15 20:52, David Brown wrote:

> I would like to know where the gcc developers stand on this - how can 
> the rules of the standards be interpreted?  And how /should/ they be 
> interpreted?  Does gcc aim for the most efficient code, or for the 
> principle of least astonishment?
> 
> For example, consider this requirement:
> 
> Write a function "int foo(int x)" that will call "bar(x)" if x is less 
> than 50000.  Then if x is less than 1000, return x*x*x - otherwise 
> return any integer.  (Assume 32-bit int's, no traps or other "funny 
> stuff" on the target.)
> 
> It seems reasonable to write:
> 
> 	int foo(int x) {
> 		if (x < 50000) bar(x);
> 		return x*x*x;
> 	}
> 
> For x greater than 1290, the cubing is going to overflow and generate a 
> mathematically incorrect answer.  But that's okay - we don't care about 
> the result in such cases.
> 
> The question is, can the compiler reason like this:
> 
> If x > 1290, then undefined behaviour will occur.  The user clearly 
> doesn't care what happens when x > 1290.  Therefore, we can assume that 
> x <= 1290.  Therefore, x < 50000.  Therefore, there is no need to check 
> the conditional, and bar(x) can be called directly.

That's right.

> The point here is that if the code hits undefined behaviour, then the 
> behaviour of the entire program is undefined.  Thus the undefined 
> behaviour in "x*x*x" can work backwards.
> 
> Equally clearly, this would be against the expectations of the 
> programmer.

Well, with UB it's not really possible to know the expectations of
the programmer, and I'm not sure how much sense it makes to even try.

> And any attempt to avoid the undefined behaviour (such as 
> changing the last line to "return (x < 1000) ? x*x*x : 0;") would lead 
> to bigger and slower code than the "obvious" compilation of foo() above.
> 
> (In tests, gcc and clang both keep the check on x < 50000. 
> <https://gcc.godbolt.org/> is really handy!)
> 
> How is the programmer to be sure that the compiler does not apply this 
> optimisation logic?

If your code may overflow, use an unsigned type.

> Or consider this code:
> 
> 	char arr[8192];
> 	int foo2(int i) {
> 		if (i < 0) return -1;
> 		i *= 3;
> 		if (i < 0) return -2;
> 		if (i >= 8192) return -3;
> 		return arr[i]++;
> 	}
> 
> The compiler can assume that since integer overflow is undefined 
> behaviour, and it knows that i >= 0 after the first check, then the 
> result of "i *= 3" must be non-negative.

Indeed it can.

> However, if you pass an integer over (2^31 / 3) to the function, it
> is likely that after the assembly code for i *= 3 is run, the value
> in the cpu register will be negative

You're thinking about this in the wrong way.  C is not defined in
terms of registers: it is defined by an abstract machine.  It terms of
the abstract machine, integer overflow is undefined.  Perhaps the
processor overheats.

> and the obvious assembly for the "i >= 8192" test will not trigger.
> The result is access (and modification) to memory well outside the
> bounds of arr, despite the clear safety check in the code.  The
> compiler would be allowed to do this even if it follows the
> Analyzability Annex L in C11 standards - and it would be turning a
> "bounded undefined behaviour" (integer overflow) into "unbounded
> undefined behaviour" (modification of memory outside of bounds).

Indeed so.  Appendix L is interesting, but ensuring that integer
overflow does not result in wild stores will defeat some
optimizations.

> In this second example, the skipped tests come /after/ the undefined
> behaviour, but it could still easily be seen as unexpected behaviour
> from the compiler.

Unexpected by who?  Skilled C programmers know that signed types have
this property, but unsigned types wrap.

Andrew.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]