System: Debian 2.6.22.8-mw017 GCC: gcc (GCC) 4.1.2 20061115 (prerelease) (Debian 4.1.1-21) In the expression below: - VALUE has type uint32_T - Y gets a wrong result. - If VALUE is assigned to an int32_T variable, and the variable is used instead of VALUE, Y gets the correct result. See attached file missing_downcast.c for more details and reproduction steps ----------------- #define VALUE ((int32_T)((int64_T)U1 * (int64_T)3 >> 2) + 2) Y = (int32_T)( (int64_T)(VALUE * VALUE) * (int64_T)954437177 >> 29 );
Created attachment 15671 [details] Wrong result if expression is inlined At the bottom of the file there are: - Instructions on how to compile - Version of OS and GCC on which it failed
t = (int) ((long int) U1 * 3 >> 2) + 2; Y = (int) ((long int) (t * t) * 954437177 >> 29); if combined is folded to Y = (int) (((long int) (int) ((long int) U1 * 3 >> 2) * 954437177 + 1908874354) * (long int) ((int) ((long int) U1 * 3 >> 2) + 2) >> 29); We can see that for int foo(int x) { return (long)(((int)((long)x * 3) + 2) * (int)((long)x * 3)) * 2; } the conversion to long is distributed on the operands and the multiplication by 2 is then re-associated. I bet this is extract_muldiv at its work again.
I will try to have a look.
Reduced testcase, fails with 32bit and 64bit, works with 2.95.4. extern void abort (void); #define VALUE ((int)((long long)U1 * (long long)3) + 2) int main(void) { int U1; long long Y, Y2; int t; U1 = -2147483647-1; Y = ((long long)(VALUE * VALUE) * 3); t = VALUE; Y2 = ((long long)(t * t) * 3); if (Y != Y2) abort (); return 0; }
We enter extract_muldiv with (long long int) (((int) ((unsigned int) U1 * 3) + 2) * ((int) ((unsigned int) U1 * 3) + 2)) and 3 and go down the widening route with long long and then the same operand of MULT_EXPR route applying a long long widened multiplication to (int) ((unsigned int) U1 * 3) + 2 and 3 resulting in (long long int) (int) ((unsigned int) U1 * 3) * 3 + 6 I would say this bug is invalid as it assumes we need to preserve truncations of overflowed computations on signed integers, but with -fwrapv the same error happens. Testcase that works: extern void abort (void); #define VALUE (unsigned int)((int)((long long)U1 * (long long)3) + 2) int main(void) { int U1; long long Y, Y2; unsigned int t; U1 = -2147483647-1; Y = ((long long)(int)(VALUE * VALUE) * 3); t = VALUE; Y2 = ((long long)(int)(t * t) * 3); if (Y != Y2) abort (); return 0; }
Subject: Bug 36300 Author: rguenth Date: Mon May 26 12:38:19 2008 New Revision: 135913 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=135913 Log: 2008-05-26 Richard Guenther <rguenther@suse.de> PR middle-end/36300 * fold-const.c (extract_muldiv_1): Use TYPE_OVERFLOW_WRAPS, not TYPE_UNSIGNED. Use TYPE_PRECISION instead of GET_MODE_SIZE. * gcc.dg/pr36300-1.c: New testcase. * gcc.dg/pr36300-2.c: Likewise. Added: trunk/gcc/testsuite/gcc.dg/pr36300-1.c trunk/gcc/testsuite/gcc.dg/pr36300-2.c Modified: trunk/gcc/ChangeLog trunk/gcc/fold-const.c trunk/gcc/testsuite/ChangeLog
-fwrapv case fixed for 4.4.0.
Richard, I truly appreciate your effort on this bug. Can you please elaborate on your plan? Will this be fixed for -fwrapv only, or also for -fwrapv-free compilations? I understand that downcasting an out-of-range value to a signed integer is not defined in the standard. However, many compilers offer an "expected" modulo-2^n behavior. This behavior helps avoid writing extra cumbersome "safe downcast" code. Having GCC offer it for non -fwrapv compilations is an important interoperability and readability feature. Thank you for looking into it. Alex.
Subject: Re: Incorrect type used for inlined expression On Tue, 27 May 2008, acarmeli at mathworks dot com wrote: > ------- Comment #8 from acarmeli at mathworks dot com 2008-05-27 12:50 ------- > Richard, > > I truly appreciate your effort on this bug. > > Can you please elaborate on your plan? Will this be fixed for -fwrapv only, or > also for -fwrapv-free compilations? > > I understand that downcasting an out-of-range value to a signed integer is not > defined in the standard. However, many compilers offer an "expected" modulo-2^n > behavior. This behavior helps avoid writing extra cumbersome "safe downcast" > code. Having GCC offer it for non -fwrapv compilations is an important > interoperability and readability feature. > > Thank you for looking into it. This is only fixed for -fwrapv (this is the "expected" modulo-2^n behavior switch). It was never broken for the defined variant with the casts to unsigned before the multiplication. Richard.
Created attachment 15691 [details] Steps to reach numerically incorrect/impossible result Compile: gcc bad_numeric.c Builds up expression until it breaks numerically.
I believe there is still a violation of the C standard when expression folding folds this. The attached file demonstrates this. In step 7, we perform 4*3 in 32-bit and expect to get a 12 in a 64-bit. It should be possible to perform this without overflow and represent in 64-bit. It should be possible to divide the result in 3 and get the original number: 4 However, the folded expression does not permit this to happen.
We are probably not honoring the implementation defined signed truncation(s). But my head hurts looking at the testcase vs. the folding code ...
In step5 we have signed overflow in the multiplication of -2147483646 with itself. Thus starting from that, the following results are unreliable.
You are correct. But step 6 did not reveal that, and provided a correct s64 result. This result should have been carried over to step 7 to lead to a correct result. Otherwise, it will not be possible to get the result through a folded expression. It seems to me that adding more to the expression changes the optimization to inner expressions in a way that compromizes the numerical stability. We also tried masking to eliminate unwanted bits. The masks seem to have no effect and we suspect they get optimized out. It was an unfavorable solution anyway, due to the cost of the mask. In effect, there is no reliable way to grow a folded expression and consistently preserve the result of inner expressions. The only solution we have is temporary variables, which might get optimized out and folded as well.
Subject: Re: Incorrect type used for inlined expression On Wed, 28 May 2008, acarmeli at mathworks dot com wrote: > ------- Comment #14 from acarmeli at mathworks dot com 2008-05-28 12:25 ------- > You are correct. But step 6 did not reveal that, and provided a correct s64 > result. This result should have been carried over to step 7 to lead to a > correct result. Well, step 6 has a "correct" result by luck - the undefinedness doesn't expose an optimization opportunity. In step 7 the result from step 6 is not used, but instead its expression is inserted and the intermediate result from step 6 is unused (and so is combined with the added expression, exploiting the undefined behavior for a transformation the folding code thinks is worthwhile). > Otherwise, it will not be possible to get the result through a folded > expression. The code invokes undefined behavior if the expression in step 5 overflows. Undefined behavior can be also strange and "inconsistent" behavior as you see in your testcase. > It seems to me that adding more to the expression changes the optimization to > inner expressions in a way that compromizes the numerical stability. Yep. The folding code is run early and sees the whole C expression. Obviously this exploits more "interesting" opportunities to mangle the expressions. > We also tried masking to eliminate unwanted bits. The masks seem to have no > effect and we suspect they get optimized out. It was an unfavorable solution > anyway, due to the cost of the mask. > > In effect, there is no reliable way to grow a folded expression and > consistently preserve the result of inner expressions. The only solution we > have is temporary variables, which might get optimized out and folded as well. True. The only reliable way is to only write (partial) expressions that do not invoke undefined behavior in any case ;) This can be either done by carefully doing operations that may overflow in unsigned arithmetic or by providing -fwrapv as a compiler option. Richard.
Fixed for 4.3.1. Unassigning.
Subject: Bug 36300 Author: rguenth Date: Wed May 28 13:45:47 2008 New Revision: 136085 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=136085 Log: 2008-05-28 Richard Guenther <rguenther@suse.de> PR middle-end/36300 * fold-const.c (extract_muldiv_1): Use TYPE_OVERFLOW_WRAPS, not TYPE_UNSIGNED. Use TYPE_PRECISION instead of GET_MODE_SIZE. * gcc.dg/pr36300-1.c: New testcase. * gcc.dg/pr36300-2.c: Likewise. Added: branches/gcc-4_3-branch/gcc/testsuite/gcc.dg/pr36300-1.c branches/gcc-4_3-branch/gcc/testsuite/gcc.dg/pr36300-2.c Modified: branches/gcc-4_3-branch/gcc/ChangeLog branches/gcc-4_3-branch/gcc/fold-const.c branches/gcc-4_3-branch/gcc/testsuite/ChangeLog
Closing 4.1 branch.
Closing 4.2 branch, fixed in 4.3.1 and 4.4.