This is the mail archive of the gcc-help@gcc.gnu.org mailing list for the GCC project.
RE: bad optimization
- From: "Andy Falanga (afalanga)" <afalanga at micron dot com>
- To: David Brown <david at westcontrol dot com>
- Cc: Andrew Haley <aph at redhat dot com>, Brian Budge <brian dot budge at gmail dot com>, GCC-help <gcc-help at gcc dot gnu dot org>
- Date: Thu, 22 Aug 2013 14:38:57 +0000
- Subject: RE: bad optimization
- References: <CANjXV6_=hQQPtk78VkumtHXhK+rE-UkT=-ibBx83BqYsS+sp5w at mail dot gmail dot com> <5213BE12 dot 6050207 at redhat dot com> <60F6FAE47D1BCE4380CC06D18F49789B4FB1CF4D at NTXBOIMBX02 dot micron dot com> <5215C637 dot 5070902 at westcontrol dot com>
> -----Original Message-----
> From: gcc-help-owner@gcc.gnu.org [mailto:gcc-help-owner@gcc.gnu.org] On
> Behalf Of David Brown
> Sent: Thursday, August 22, 2013 2:05 AM
> To: Andy Falanga (afalanga)
> Cc: Andrew Haley; Brian Budge; GCC-help
> Subject: Re: bad optimization
>
>
> In case Andrew's correct, but somewhat technical, explanation is not
> what you are after, here is an alternative. The concept of "undefined
> behaviour" can be difficult when you first meet it.
>
> Signed overflow is when you do something with signed ints that gives a
> result bigger than the int can express. For example, adding 2e9 + 1e9
> with 32-bit ints - the ideal result is 3e9 which takes 33 bits for a
> signed integer. When faced with that, there are three things the C
> standards, and therefore the compiler, could do.
>
> One is that it could "saturate" at the largest valid positive integer.
> This is an expensive operation for many processors, but it's used in
> special cases (mostly DSP work - and preferably using the new C "_Sat"
> types).
>
> It could also assume you were using a two's complement processor (valid
> for virtually all CPUs you'll ever see), and just do the addition -
> ending up with an integer value of around -1e9. This is an easy
> operation, and is the way Java does it (and gcc generally, if you use
> the "-fwrapv" option). But it doesn't make mathematical sense - you
> added two positive numbers, and the result is negative.
>
> Or it could assume that if there is a bug in the source code and you
> are trying to do something nonsensical, then it doesn't matter what
> the compiler does with the results. This is what is meant by
> "undefined behaviour". The great thing about "undefined behaviour"
> from the compiler's viewpoint is that it can generate whatever code is
> easiest and fastest for the "defined" range, with no regard for what
> happens in the "undefined" range.
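For reference, the first two options can be written out by hand in C so that neither relies on signed overflow. This is only a sketch assuming 32-bit int; the names sat_add and wrap_add are made up for illustration and do not come from the thread or from any library.

```c
#include <limits.h>
#include <stdint.h>

/* Saturating add (hypothetical helper): clamp to INT_MAX or INT_MIN
   instead of letting the signed addition overflow. */
int sat_add(int a, int b) {
    if (b > 0 && a > INT_MAX - b) return INT_MAX;  /* would exceed INT_MAX */
    if (b < 0 && a < INT_MIN - b) return INT_MIN;  /* would go below INT_MIN */
    return a + b;                                  /* in range, safe to add */
}

/* Wrapping add (hypothetical helper): unsigned arithmetic is defined
   modulo 2^32, so do the addition there and convert back. */
int32_t wrap_add(int32_t a, int32_t b) {
    uint32_t sum = (uint32_t)a + (uint32_t)b;
    /* Converting an out-of-range value back to int32_t is
       implementation-defined; GCC documents it as wrapping modulo 2^32. */
    return (int32_t)sum;
}
```

With the 2e9 + 1e9 example from above, sat_add should give INT_MAX and wrap_add should give -1294967296, which is the "around -1e9" value mentioned.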
>
> In your case, abs(-2147483648) is undefined (with 32-bit ints), because
> +2147483648 is outside the range of integers. So the compiler can
> implement the abs() function like this:
>
> int abs(int x) {
>     if (x == -2147483648) {
>         eatYourHardDiskForLunch();
>     } else if (x >= 0) {
>         return x;
>     } else {
>         return -x;
>     }
> }
>
> And because this means the result of abs() is always non-negative if it
> is defined, the compiler is then able to do more optimisations on it -
> such as assuming that "abs(i) < 0" will always fail.
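Since a test like "abs(i) < 0" after the call may be optimised away, one approach is to test the input before calling abs(). A minimal sketch follows; checked_abs is a hypothetical name of my own, not something from the C library or from this thread.

```c
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical helper: stores |x| in *out and returns true when the
   result is representable; returns false for INT_MIN, whose absolute
   value does not fit in an int. */
bool checked_abs(int x, int *out) {
    if (x == INT_MIN) {
        return false;   /* abs(INT_MIN) would be undefined behaviour */
    }
    *out = abs(x);      /* defined for every other int value */
    return true;
}
```

The check happens on the input, which is still inside the defined range, so the compiler has no licence to remove it.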
>
>
> You might ask why the compiler can't spot the undefined behaviour at
> compile time, and give an error message. The answer is that it can
> sometimes, especially when you enable warnings (-Wall, -Wextra) and
> optimisation, but often it can't. (I think it should be possible to
> spot /this/ particular undefined behaviour, but my slightly outdated
> gcc fails to warn about it.)
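Where the compiler cannot diagnose the overflow at compile time, it can still be caught at run time. The sketch below assumes a compiler that provides __builtin_add_overflow, a GCC and Clang extension that arrived in GCC 5, so later than the gcc versions discussed here; try_add is a hypothetical wrapper name.

```c
#include <stdbool.h>

/* Hypothetical wrapper: returns true and stores a + b in *sum when the
   result fits in an int; returns false when the addition would overflow. */
bool try_add(int a, int b, int *sum) {
    /* The builtin returns true when the mathematical result does not
       fit in *sum, so negate it to get a "success" flag. */
    return !__builtin_add_overflow(a, b, sum);
}
```

On older compilers, a range check written before the addition does the same job in portable C.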
>
>
> Basically, "undefined behaviour" is the technical term for "garbage in,
> garbage out" - the compiler assumes you don't care about the quality of
> the garbage out when you put garbage in, and uses it to give you better
> code when you put good data in. It is /your/ responsibility to avoid
> putting garbage in in the first place.
>
David,
Thanks for this great explanation. Andrew's answer was satisfactory, but having this much depth is greatly appreciated. I've learned quite a bit from this.
Andy