Created attachment 52166
Reduced test case

The loop in the attached reduced test case does not terminate when compiled with -O3 or with -O2 -funswitch-loops using GCC 9.4.0 or GCC 10.3.1, even though it should iterate only 144 times. Curiously, the program works as expected when the assume macro uses an if() instead of the ternary operator, but maybe that's just luck.

I could not reproduce the issue with GCC 8.5.0, GCC 11.2.1, or my GCC 12 checkout from 20220102.
Looks like it works in GCC 11.1.0.
Note ?: is not the difference, as I can reproduce it with:

  #define assume(Expression) do { if((Expression)) (void)0; else __builtin_unreachable(); } while(0)
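For comparison, here is a minimal sketch of the two macro shapes side by side. The ternary form is an assumption, since the attachment's exact macro is not quoted in this report; the helper functions next_ternary, next_if and check_forms_agree are hypothetical and only exist for illustration. Both forms carry the same promise to the compiler, which fits the observation that the choice of ?: versus if() is not what triggers the bug.

```c
#include <assert.h>

/* Assumed ternary form; the attachment's exact macro is not shown here. */
#define ASSUME_TERNARY(e) ((e) ? (void)0 : __builtin_unreachable())

/* The if/else form quoted in this comment. */
#define ASSUME_IF(e) do { if ((e)) (void)0; else __builtin_unreachable(); } while (0)

/* Hypothetical helper: advance a counter that is assumed to be in 0..2. */
int next_ternary(int x)
{
    ASSUME_TERNARY(x >= 0 && x < 3);
    return (x + 1) % 3;
}

/* Same helper, using the if/else form of the macro. */
int next_if(int x)
{
    ASSUME_IF(x >= 0 && x < 3);
    return (x + 1) % 3;
}

/* Only calls the helpers with values that satisfy the assumption;
   anything else would be undefined behavior. */
void check_forms_agree(void)
{
    for (int x = 0; x < 3; x++)
        assert(next_ternary(x) == next_if(x));
}
```

Calling check_forms_agree() from a test driver exercises both variants on the three valid inputs.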
This might be a latent bug in GCC 11 even. The only difference in the IR before unswitch (besides BB reordering) is:

GCC 10.x:
  <bb 5> [local count: 1073741824]:
  # it$m_pos$x_21 = PHI <0(4), _26(13)>
  if (it$4_29 == 3)
    goto <bb 11>; [11.00%]
  else
    goto <bb 6>; [89.00%]

GCC 11+:
  <bb 10> [local count: 1073741824]:
  # it$m_pos$x_19 = PHI <0(9), _24(13)>
  if (it$4_32 != 3)
    goto <bb 3>; [89.00%]
  else
    goto <bb 11>; [11.00%]

Note bb 11 is the return.
Loop unswitching looks the same in GCC 10.3.0 and 11.1.0, so ...
The pass where the code generation of GCC 11 and GCC 10.3 really starts to differ is vrp2.
Actually, some SSA names already carry ranges beforehand, and those look like they matter:

  # RANGE [1, 1]
Those come from dom3.

Here is a testcase which runs in finite time to test this (plus it is not a C testcase also):

#define assume(Expression) do { if((Expression)) (void)0; else __builtin_unreachable(); } while(0)

struct GridZYXIterator {
  short x;
  short y;
  short z;
};

static inline void f(struct GridZYXIterator *t)
{
  t->x++;
  if(t->x == 3)
  {
    t->x = 0;
    t->y++;
    if(t->y == 3)
    {
      t->y = 0;
      t->z++;
    }
  }
}

static int t = 0;
static void g() __attribute__((noipa));
static void g(){t++; if (t > 27) __builtin_abort();};

int main()
{
  struct GridZYXIterator it = {0,0,0};
  while(it.z != 3)
  {
    assume(it.y < 3 && it.z < 3);
    g();
    f(&it);
  }
}

Confirmed.
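For reference, the expected trip count of the finite testcase can be checked with a variant that verifies the condition instead of assuming it. This is only a sketch: advance and count_iterations are hypothetical names, and assert() stands in for the assume macro so that no undefined behavior is involved. A correctly compiled loop runs 3 * 3 * 3 = 27 times, which is why g() aborts once its counter exceeds 27.

```c
#include <assert.h>

struct GridZYXIterator {
    short x;
    short y;
    short z;
};

/* Same stepping logic as f() in the testcase: x wraps into y, y wraps into z. */
static void advance(struct GridZYXIterator *t)
{
    t->x++;
    if (t->x == 3) {
        t->x = 0;
        t->y++;
        if (t->y == 3) {
            t->y = 0;
            t->z++;
        }
    }
}

/* Counts iterations of the testcase loop without __builtin_unreachable. */
int count_iterations(void)
{
    struct GridZYXIterator it = {0, 0, 0};
    int n = 0;

    while (it.z != 3) {
        assert(it.y < 3 && it.z < 3);   /* checked here, assumed in the testcase */
        n++;
        advance(&it);
    }
    return n;
}
```

The assertion inside the loop never fires, so the assumption in the testcase is valid and the miscompile cannot be blamed on a wrong assume().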
(In reply to Andrew Pinski from comment #7)
> Those come from dom3.
>
> Here is a testcase which runs in finite time to test this (plus it is not a
> C testcase also):

This only fails at -O2 -funswitch-loops and not at -O3, and it only fails with GCC 10.x, while the original one fails even with 9.x.
There's nothing wrong with the unswitching, I think. The exit test is optimized out in VRP2, where the preceding DOM pass ended up putting some more SSA ranges in, specifically those derived from the unreachable(). Disabling DOM3 avoids the miscompile. After DOM3 we have

  <bb 2> [local count: 118111600]:
  goto <bb 4>; [100.00%]

  <bb 4> [local count: 228582456]:
  # ivtmp.22_40 = PHI <0(2), ivtmp.22_42(3)>
  # RANGE [-32768, 2]
  it$z_43 = (short int) ivtmp.22_40;
  # RANGE [1, 1]
  _32 = it$z_43 <= 2;

  <bb 5> [local count: 443025880]:
  # ivtmp.14_3 = PHI <0(4), ivtmp.14_19(9)>
  # RANGE [-32768, 2]
  it$y_2 = (short int) ivtmp.14_3;
  # RANGE [1, 1]
  _10 = it$y_2 <= 2;
  # RANGE [1, 1]
  _29 = _10 & _32;
  if (ivtmp.22_40 != 3)
    goto <bb 7>; [89.00%]
  else
    goto <bb 6>; [11.00%]

Note how it$z_43, derived from ivtmp.22_40, has a [-32768, 2] range while the exit test is ivtmp.22_40 != 3. Now, if ranger came along and we asked it how the _43 def with the global range [-32768, 2] constrains _40, it would compute a range for _40 that doesn't hold at this point.

So what we're likely seeing is bad effects of our handling of globalizing ranges for "a" derived from

  if (a) __builtin_unreachable ();

There have been issues with that in the past. Comparing GCC 10 and 11, we have the extra g:c76b3f9e83353a4cd437ca137c1fb835c9b5c21f, which likely fixed this.
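To make the range argument concrete, here is a toy model in C. It is not how GCC's DOM or VRP passes are implemented; struct range, never_equals, widen and observe_z are all hypothetical names used for illustration. It records the values z actually takes at two program points: at the exit test z reaches 3, while after the assume it stays within [0, 2]. If the [-32768, 2] range is treated as holding at the exit test too, a "z != 3" comparison looks always true and the loop exit can be folded away.

```c
#include <assert.h>
#include <stdbool.h>

struct range { int lo, hi; };

/* True if a value known to lie in r can never equal v. */
bool never_equals(struct range r, int v)
{
    return v < r.lo || v > r.hi;
}

/* Grow r so that it contains v. */
static void widen(struct range *r, int v)
{
    if (v < r->lo) r->lo = v;
    if (v > r->hi) r->hi = v;
}

enum point { AT_EXIT_TEST, AFTER_ASSUME };

/* Run the testcase loop and record the values of z seen at one program point. */
struct range observe_z(enum point where)
{
    struct range seen = { 32767, -32768 };   /* empty until widened */
    short x = 0, y = 0, z = 0;

    for (;;) {
        if (where == AT_EXIT_TEST)
            widen(&seen, z);
        if (z == 3)                  /* the loop's exit test */
            break;
        assert(y < 3 && z < 3);      /* checked here; the testcase assumes it */
        if (where == AFTER_ASSUME)
            widen(&seen, z);
        if (++x == 3) {
            x = 0;
            if (++y == 3) {
                y = 0;
                ++z;
            }
        }
    }
    return seen;
}
```

In this model never_equals() applied to the range [-32768, 2] and the value 3 returns true, yet observe_z(AT_EXIT_TEST) reports an upper bound of 3. The range is only valid after the assume, not at the exit test that precedes it.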
Duplicate, I'll pick the fix.

*** This bug has been marked as a duplicate of bug 97953 ***