The current gcc trunk and gcc 4.8 produce wrong code for the following testcase on x86_64-linux when compiled at -O3 (in both 32-bit and 64-bit modes). This is a regression from 4.7.x. $ gcc-trunk -v gcc version 4.9.0 20130812 (experimental) [trunk revision 201658] (GCC) $ gcc-trunk -O2 small.c $ a.out 0 $ gcc-4.7 -O3 small.c $ a.out 0 $ gcc-trunk -O3 small.c $ a.out -1 $ gcc-4.8 -O3 small.c $ a.out -1 $ ------------------------------------------------------ int printf (const char *, ...); int a, b, c, d, e, f, g, h = 1, i; int foo (int p) { return p < 0 && a < -2147483647 - 1 - p ? 0 : 1; } int *bar () { int j; i = h ? 0 : 1 % h; for (j = 0; j < 1; j++) for (d = 0; d; d++) for (e = 1; e;) return 0; return 0; } int baz () { for (; b >= 0; b--) for (c = 1; c >= 0; c--) { int *k = &c; for (;;) { for (f = 0; f < 1; f++) { g = foo (*k); bar (); } if (*k) break; return 0; } } return 0; } int main () { baz (); printf ("%d\n", b); return 0; }
-2147483647 - 1 - p Hmm, this overflows for p > 1.
Andrew, because of short-circuiting, when p >= 0, the expression "-2147483647 - 1 - p" isn't actually evaluated. Thanks for looking into this so quickly! Zhendong
This wrong-code started with the PR54937 patch in r192710. Author CC:d.
Created attachment 30674 [details] slightly simplified test case attached is a slightly simplified test case. The problem seems to start in the optimizer pass 097t.lim1 where the following expression is identified as loop-invariant of loop 3. Moving statement _23 = -2147483648 - c.1_22; (cost 1) out of loop 3. unfortunatley it is executed unconditionally now. later the optimizer pass 110t.ivcanon uses the possible overflow in this statement as an argument, why the loop must be executed exactly once. Induction variable (int) 2147483647 + 1 * iteration does not wrap in statement _23 = -2147483648 - prephitmp_8; in loop 2. Statement _23 = -2147483648 - prephitmp_8; is executed at most 0 (bounded by 0) + 1 times in loop 2. and shortly after that an apparently pointless loop-exit is removed. Removed pointless exit: if (prephitmp_8 != 0) however that is based on a worng assumption, and causes worng code.
Summary: tree-ssa-loop-im.c moves code, out of an if statement inside the loop it it can not cause side effects or faults, but it does not care of integer overflows. this seems to be an optimization! BUT tree-ssa-loop-niter.c (infer_loop_bounds_from_signedness) does assume that the code will never execute integer additions or subtractions with the intention to use the result as modulo 2^32, thus ignoring overflow. It seems that -O3 and -fno-strict-overflow will fix the code. however this comment in tree.h points to another problem: IMPORTANT NOTE: Any optimization based on TYPE_OVERFLOW_UNDEFINED must issue a warning based on warn_strict_overflow. In some cases it will be appropriate to issue the warning immediately, and in other cases it will be appropriate to simply set a flag and let the caller decide whether a warning is appropriate or not. this example does not generate any warnings, not with -Wall and not with -Wstrict-overflow...
Created attachment 30681 [details] possible fix This seems to be a possible fix. What do you think of it, Jan?
How can I set the status of this tracker to CONFIRMED ? Should'nt the component be "tree-optimization" instead of "middle-end" ?
That patch looks wrong, and would very likely penalize tons of code, this predicate is used in many places in the compiler and the operations don't trap.
(In reply to Jakub Jelinek from comment #8) > That patch looks wrong, and would very likely penalize tons of code, this > predicate is used in many places in the compiler and the operations don't > trap. yes, thanks, I agree. This means then the "lim" pass (and probably others like "ifcvt" too) will move code out of the inner loop, as long as it does not trap. But this creates undefined results, and that should not be used by the loop optimization to throw away the loop termination code. In this case I'd say the only other simple solution will be to take out the function infer_loop_bounds_from_signedness() completely at tree-ssa-loop-niter.c, right? To illustrate what this function can do here is another example: loop.c: extern int bar (); int foo () { int i, k; for (i=0; i<4; i++) { k=1000000000*i; if (bar ()) break; } return k; } if you compile this function with -O3 the resulting code is very surprising (with zero warnings): foo: .LFB0: .cfi_startproc subl $12, %esp .cfi_def_cfa_offset 16 call bar testl %eax, %eax jne .L3 call bar testl %eax, %eax .p2align 4,,4 jne .L4 .p2align 4,,6 call bar movl $2000000000, %eax .L2: addl $12, %esp .cfi_remember_state .cfi_def_cfa_offset 4 ret .p2align 4,,7 .p2align 3 .L3: .cfi_restore_state xorl %eax, %eax jmp .L2 .p2align 4,,7 .p2align 3 .L4: movl $1000000000, %eax jmp .L2 Due to the fact, that k will overflow at the forth iteration, the loop is terminated at the third iteration! The reasoning is that the only way to prevent the undefined behaviour of k, one of the first tree invocations of bar must terminate the loop, and thus the loop is only unrolled 3 times. But if the loop is a bit more complex it will not be unrolled, and in this case the normal loop termination conditin "i<4" will not be used at all, resulting in an endless loop. To prevent the loop unrolling I can add a printf: loop.c: extern int bar (); int foo () { int i, k; for (i=0; i<4; i++) { k=1000000000*i; __builtin_printf("loop %d\n", i); if (bar ()) break; } return k; } Now this is an endless loop (bar always returns 0 but the compiler does not know)! foo: .LFB0: .cfi_startproc pushl %ebx .cfi_def_cfa_offset 8 .cfi_offset 3, -8 xorl %ebx, %ebx subl $24, %esp .cfi_def_cfa_offset 32 .L2: movl %ebx, 4(%esp) movl $.LC0, (%esp) call printf call bar testl %eax, %eax jne .L6 addl $1, %ebx .p2align 4,,3 jmp .L2 .p2align 4,,7 .p2align 3 .L6: addl $24, %esp .cfi_def_cfa_offset 8 imull $1000000000, %ebx, %eax popl %ebx .cfi_restore 3 .cfi_def_cfa_offset 4 ret .cfi_endproc
Created attachment 30693 [details] possible fix, next try... This variant eliminates the infer_loop_bounds_from_signedness function and some of the "invokes undefined behavior" warnings. Bootstrapped, and regression tested on i686-pc-linux-gnu. And by the way, it fixes PR57904 too. How do you like it now ?
No, that is wrong as well.
(In reply to Jakub Jelinek from comment #11) > No, that is wrong as well. Because it is too destructive? Maybe. I think this is a general problem here. 1. the undefined behavior warning may be triggered by artefacts from the lim pass or in the class_48.f90 case. 2. surprise optimizations may happen without this warning, see my previous comment #9. 3. in the case of integer overflow, "reliable" does only say that the operation is executed in every iteration, but not that the result is acually used for something, as in Zhedong's example. With array bounds I have not the same problem, here as I'd say if the array is accessed beyond the limit, the guarantee is void anyway, and the lim pass would never move an array access out of the if statement, right? But there are examples where the undefined behavior warning is not emitted after a possible array bounds exception. A nice example for this is gmp-4.3.2/tests/mpz/t-scan.c This example has a array bounds error: static const int offset[] = { -2, -1, 0, 1, 2, 3 }; ... for (oindex = 0; oindex <= numberof (offset); oindex++) // +-1 error here { o = offset[oindex]; ... if (got != want) { ... exit (1); // this cancels the aggressive-loop-optimizations warning } ... } The generated code at -O2 is without the loop termination check, surprise surprise... What do you think?
Because the bug is in lim, so hacking around it in other parts of the compiler and removing desirable optimizations just to mitigate the bug is not the right way to fix it. Either lim shouldn't move the expressions if they are conditional in the loop body and might trigger undefined behavior in the place where it has been moved to while it might not trigger undefined behavior originally, or lim should transform them into expressions that won't trigger undefined behavior while moving them if it can't prove this will not happen (in this case that would mean performing the arithmetics in a type that doesn't have undefined behavior on overflow).
Created attachment 30699 [details] patch to prevent undefined execution in lim OK, this time only the lim pass should be the prevented from introducing undefined behavior that was not there originally. This triggered a minor regression in gcc.target/i386/pr53397-1.c; Here lim used to move the expression "2*step" out of the loop, but this may cause undefined behavior on case of overflow, I propose to resolve this by adding -fno-strict-overflow, The test case looks pretty constructed anyway.
This patch was posted at: http://gcc.gnu.org/ml/gcc-patches/2013-08/msg01733.html
*** Bug 58227 has been marked as a duplicate of this bug. ***
*** Bug 58731 has been marked as a duplicate of this bug. ***
(In reply to Richard Biener from comment #17) > *** Bug 58731 has been marked as a duplicate of this bug. *** Richard, should I commit my patch now?
Created attachment 31008 [details] untested patch This is the alternative approach, re-writing stmts to a variant that does not exhibit undefined signed overflow. Division is not covered, neither is ABS_EXPR (or __builtin_abs ()). For both I cannot see an easy way to rewrite them but make their execution conditional (that requires more infrastructure changes in LIM though). The patch should cover usual cases though, a testcase showing the issue remaining with division and ABS_EXPR is appreciated (as SCEV analysis only can do sensible stuff with +- and * it shouldn't be possible to trigger a bug with the unrolling code - but clearly VRP could be tricked into some invalid transform as it also handles division and absolute). POINTER_PLUS_EXPR handling also needs to be added, though that's one that will eventually exhibit code generation regressions. But who knows ... (testcase for POINTER_PLUS_EXPR also welcome).
(In reply to Bernd Edlinger from comment #18) > (In reply to Richard Biener from comment #17) > > *** Bug 58731 has been marked as a duplicate of this bug. *** > > Richard, > > should I commit my patch now? No, I would have approved it if so. I am not happy with disabling loop invariant motion for these cases. Instead let's first fix the sequence of bugs exposed in all testcases sofar, LIM moving sth, SCEV plus niter analysis figuring out a wrong number of iterations, unroll killing the exit test. This restricts the operations we have to handle and ultimately simplifies the patch which makes it more suitable for backporting.
For the testcase in PR58731, int a, b, c, d, e; int main () { for (b = 4; b > -30; b--) for (; c;) for (;;) { e = a > 2147483647 - b; if (d) break; } return 0; } after rewriting 2147483647 - b to unsigned arithmetic, IVOPTs re-writes it to signed arithmetic again, making VRP remove the loop exit test. Bah. --- pr58143-3.c.118t.slp 2013-10-15 12:56:07.040864240 +0200 +++ pr58143-3.c.120t.ivopts 2013-10-15 12:56:07.041864251 +0200 @@ -5,18 +5,14 @@ { int b_lsm.10; int e_lsm.9; - int pretmp_3; - unsigned int ivtmp_6; - unsigned int _12; + int _13; int b.5_14; - unsigned int _15; int pretmp_16; int pretmp_17; int b.0_20; _Bool pretmp_22; int pretmp_23; int pretmp_24; - unsigned int ivtmp_27; <bb 2>: b = 4; @@ -45,11 +41,8 @@ <bb 8>: # b.0_20 = PHI <b.5_14(11), 4(2)> # e_lsm.9_19 = PHI <e_lsm.9_30(11), e_lsm.9_21(2)> - # ivtmp_6 = PHI <ivtmp_27(11), 34(2)> - _15 = (unsigned int) b.0_20; - _12 = 2147483647 - _15; - pretmp_3 = (int) _12; - pretmp_22 = pretmp_3 < pretmp_16; + _13 = 2147483647 - b.0_20; + pretmp_22 = _13 < pretmp_16; pretmp_23 = (int) pretmp_22; <bb 9>: @@ -62,8 +55,7 @@ <bb 10>: # e_lsm.9_30 = PHI <e_lsm.9_25(9)> b.5_14 = b.0_20 + -1; - ivtmp_27 = ivtmp_6 - 1; - if (ivtmp_27 != 0) + if (b.5_14 != -30) goto <bb 11>; else goto <bb 12>; possibly via some bug in SCEV and/or fold. That said, int a, b, c, d, e; int main () { for (b = 4; b > -30; b--) { e = a > (int)((unsigned int) __INT_MAX__ - (unsigned int) b); for (; c;) for (;;) { if (d) break; } } return 0; } fails the same way, even with -fno-tree-loop-im (but still needs -O3 for some reason).
(In reply to Richard Biener from comment #21) hmm... all these test cases simply work with my patch... I tried this example (with my latest patch, but trunk from 22/9/13) and the result is quite different: @@ -135,11 +332,10 @@ intD.6 a.2D.1739; unsigned intD.9 b.1D.1736; intD.6 b.0D.1735; - unsigned intD.9 _6; - intD.6 _7; + intD.6 _8; _BoolD.1701 _9; - unsigned int ivtmp_12; - unsigned int ivtmp_15; + unsigned int _11; + unsigned int _13; intD.6 pretmp_23; intD.6 pretmp_25; intD.6 pretmp_32; @@ -161,11 +357,10 @@ ;; 2 [100.0%] (FALLTHRU,EXECUTABLE) # .MEM_18 = PHI <.MEM_18(11), .MEM_2(D)(2)> # b.0_20 = PHI <b.6_14(11), 4(2)> - # ivtmp_15 = PHI <ivtmp_12(11), 34(2)> - b.1_5 = (unsigned intD.9) b.0_20; - _6 = 2147483647 - b.1_5; - _7 = (intD.6) _6; - _9 = _7 < pretmp_23; + _13 = (unsigned int) b.0_20; + _11 = 2147483647 - _13; + _8 = (intD.6) _11; + _9 = _8 < pretmp_23; e.3_10 = (intD.6) _9; goto <bb 9>; ;; succ: 9 [100.0%] (FALLTHRU,EXECUTABLE) @@ -219,8 +414,7 @@ ;; prev block 9, next block 11, flags: (NEW, REACHABLE) ;; pred: 9 [9.0%] (FALSE_VALUE,EXECUTABLE) b.6_14 = b.0_20 + -1; - ivtmp_12 = ivtmp_15 - 1; - if (ivtmp_12 != 0) + if (b.6_14 != -30) goto <bb 11>; else goto <bb 12>;
On Tue, 15 Oct 2013, bernd.edlinger at hotmail dot de wrote: > http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58143 > > --- Comment #22 from Bernd Edlinger <bernd.edlinger at hotmail dot de> --- > (In reply to Richard Biener from comment #21) > hmm... > > all these test cases simply work with my patch... You are right - I have local IVOPTs changes in my tree that trigger the bug.
Wall, the logic behind the loop niter is really strange and the results are simply insane. When you consider this example: int a, b, c, d, e; int main () { for (b = 4; b > -30; b--) { e = 2147483647 - b; for (; c;) for (;;) { if (d) break; } } return 0; } then the reasoning is that an "undefined" value will be written to e in loop iteration #4. And therefore it does not matter what happens afterwards. But for the result of main() which will ultimately be 0 the undefined value of e does not matter. Why should a undefined value that is not used for anything create such a problem?
On Tue, 15 Oct 2013, bernd.edlinger at hotmail dot de wrote: > http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58143 > > --- Comment #24 from Bernd Edlinger <bernd.edlinger at hotmail dot de> --- > Wall, > > the logic behind the loop niter is really strange > and the results are simply insane. > > When you consider this example: > > int a, b, c, d, e; > > int > main () > { > for (b = 4; b > -30; b--) > { > e = 2147483647 - b; > for (; c;) > for (;;) > { > if (d) > break; > } > } > return 0; > } > > then the reasoning is that an "undefined" value will > be written to e in loop iteration #4. > And therefore it does not matter what happens afterwards. > But for the result of main() which will ultimately be 0 > the undefined value of e does not matter. > > Why should a undefined value that is not used for anything > create such a problem? Because undefined behavior includes random effects (like formatting your hard drive). In this case the undefined behavior is simply assuming we never reach loop iteration #4 and thus the loop exit test is never true. Richard.
(In reply to Bernd Edlinger from comment #24) > Wall, > > the logic behind the loop niter is really strange > and the results are simply insane. > > When you consider this example: > > int a, b, c, d, e; > > int > main () > { > for (b = 4; b > -30; b--) > { > e = 2147483647 - b; > for (; c;) > for (;;) > { > if (d) > break; > } > } > return 0; > } > > then the reasoning is that an "undefined" value will > be written to e in loop iteration #4. > And therefore it does not matter what happens afterwards. > But for the result of main() which will ultimately be 0 > the undefined value of e does not matter. > > Why should a undefined value that is not used for anything > create such a problem? To re-emphasize what Richard wrote: it's undefined BEHAVIOUR not VALUE. The mere fact of causing undefined behaviour voids all guarantees about the future behaviour of that process. That the variable "e" above is unused is immaterial.
(In reply to rguenther@suse.de from comment #25) > > > > Why should a undefined value that is not used for anything > > create such a problem? > > Because undefined behavior includes random effects (like > formatting your hard drive). In this case the undefined > behavior is simply assuming we never reach loop iteration #4 > and thus the loop exit test is never true. > > Richard. I would have understood that in this example: int x[5]; for (b = 4; b > -30; b--) { e = x[b]; for (; c;) for (;;) { if (d) break; } } return 0; bacause x[-1] may trap. But e = 2147483647 - b; is just guaranteed to write an undefined value in e. We have -fno-trapv, or am I wrong?
On Tue, 15 Oct 2013, bernd.edlinger at hotmail dot de wrote: > http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58143 > > --- Comment #27 from Bernd Edlinger <bernd.edlinger at hotmail dot de> --- > (In reply to rguenther@suse.de from comment #25) > > > > > > Why should a undefined value that is not used for anything > > > create such a problem? > > > > Because undefined behavior includes random effects (like > > formatting your hard drive). In this case the undefined > > behavior is simply assuming we never reach loop iteration #4 > > and thus the loop exit test is never true. > > > > Richard. > > I would have understood that in this example: > > int x[5]; > for (b = 4; b > -30; b--) > { > e = x[b]; > for (; c;) > for (;;) > { > if (d) > break; > } > } > return 0; > > bacause x[-1] may trap. > > But e = 2147483647 - b; > is just guaranteed to write an undefined > value in e. We have -fno-trapv, or am I wrong? We have -fwrapv, yes. -fno-trapv is the default and -ftrapv (while not really working correctly) also makes signed overflow "defined" (defined to trap).
GCC 4.8.2 has been released.
Author: rguenth Date: Thu Oct 17 09:59:47 2013 New Revision: 203745 URL: http://gcc.gnu.org/viewcvs?rev=203745&root=gcc&view=rev Log: 2013-10-17 Richard Biener <rguenther@suse.de> PR tree-optimization/58143 * tree-ssa-loop-im.c (arith_code_with_undefined_signed_overflow): New function. (rewrite_to_defined_overflow): Likewise. (move_computations_dom_walker::before_dom): Rewrite stmts with undefined signed overflow that are not always executed into unsigned arithmetic. * gcc.dg/torture/pr58143-1.c: New testcase. * gcc.dg/torture/pr58143-2.c: Likewise. * gcc.dg/torture/pr58143-3.c: Likewise. Added: trunk/gcc/testsuite/gcc.dg/torture/pr58143-1.c trunk/gcc/testsuite/gcc.dg/torture/pr58143-2.c trunk/gcc/testsuite/gcc.dg/torture/pr58143-3.c Modified: trunk/gcc/ChangeLog trunk/gcc/testsuite/ChangeLog trunk/gcc/tree-ssa-loop-im.c
Fixed on trunk sofar.
Author: rguenth Date: Mon Nov 18 15:13:14 2013 New Revision: 204965 URL: http://gcc.gnu.org/viewcvs?rev=204965&root=gcc&view=rev Log: 2013-11-18 Richard Biener <rguenther@suse.de> Backport from mainline 2013-10-21 Richard Biener <rguenther@suse.de> PR tree-optimization/58794 * fold-const.c (operand_equal_p): Compare FIELD_DECL operand of COMPONENT_REFs with OEP_CONSTANT_ADDRESS_OF left in place. * c-c++-common/torture/pr58794-1.c: New testcase. * c-c++-common/torture/pr58794-2.c: Likewise. 2013-10-21 Richard Biener <rguenther@suse.de> PR middle-end/58742 * fold-const.c (fold_binary_loc): Fold ((T) (X /[ex] C)) * C to (T) X for sign-changing conversions (or no conversion). * c-c++-common/fold-divmul-1.c: New testcase. 2013-11-06 Richard Biener <rguenther@suse.de> PR tree-optimization/58653 * tree-predcom.c (ref_at_iteration): Rewrite to generate a MEM_REF. (prepare_initializers_chain): Adjust. * gcc.dg/tree-ssa/predcom-6.c: New testcase. * gcc.dg/tree-ssa/predcom-7.c: Likewise. PR tree-optimization/59047 * tree-predcom.c (ref_at_iteration): Handle bitfield accesses properly. * gcc.dg/torture/pr59047.c: New testcase. 2013-10-15 Richard Biener <rguenther@suse.de> PR tree-optimization/58143 * tree-ssa-loop-im.c (arith_code_with_undefined_signed_overflow): New function. (rewrite_to_defined_overflow): Likewise. (move_computations_dom_walker::before_dom): Rewrite stmts with undefined signed overflow that are not always executed into unsigned arithmetic. * gcc.dg/torture/pr58143-1.c: New testcase. * gcc.dg/torture/pr58143-2.c: Likewise. * gcc.dg/torture/pr58143-3.c: Likewise. Added: branches/gcc-4_8-branch/gcc/testsuite/c-c++-common/fold-divmul-1.c branches/gcc-4_8-branch/gcc/testsuite/c-c++-common/torture/pr58794-1.c branches/gcc-4_8-branch/gcc/testsuite/c-c++-common/torture/pr58794-2.c branches/gcc-4_8-branch/gcc/testsuite/gcc.dg/torture/pr58143-1.c branches/gcc-4_8-branch/gcc/testsuite/gcc.dg/torture/pr58143-2.c branches/gcc-4_8-branch/gcc/testsuite/gcc.dg/torture/pr58143-3.c branches/gcc-4_8-branch/gcc/testsuite/gcc.dg/torture/pr59047.c branches/gcc-4_8-branch/gcc/testsuite/gcc.dg/tree-ssa/predcom-6.c branches/gcc-4_8-branch/gcc/testsuite/gcc.dg/tree-ssa/predcom-7.c Modified: branches/gcc-4_8-branch/gcc/ChangeLog branches/gcc-4_8-branch/gcc/fold-const.c branches/gcc-4_8-branch/gcc/testsuite/ChangeLog branches/gcc-4_8-branch/gcc/tree-predcom.c branches/gcc-4_8-branch/gcc/tree-ssa-loop-im.c
Fixed.