Bug 103986 - [9/10 Regression] Miscompilation with -O2 -funswitch-loops and __builtin_unreachable
Status: RESOLVED DUPLICATE of bug 97953
Alias: None
Product: gcc
Classification: Unclassified
Component: tree-optimization
Version: 10.3.1
Importance: P3 normal
Target Milestone: 9.5
Assignee: Not yet assigned to anyone
URL:
Keywords: needs-bisection, wrong-code
Depends on:
Blocks:
 
Reported: 2022-01-12 03:38 UTC by Daniel Scharrer
Modified: 2022-01-12 08:20 UTC
CC List: 0 users

See Also:
Host:
Target:
Build:
Known to work: 8.5.0
Known to fail: 10.1.0, 10.2.0, 10.3.0, 9.1.0, 9.4.0
Last reconfirmed: 2022-01-12 00:00:00


Attachments
Reduced test case (476 bytes, text/plain)
2022-01-12 03:38 UTC, Daniel Scharrer

Description Daniel Scharrer 2022-01-12 03:38:15 UTC
Created attachment 52166
Reduced test case

The loop in the attached reduced test case does not terminate when compiled with -O3 or -O2 -funswitch-loops with GCC 9.4.0 or GCC 10.3.1 even though it should only iterate 144 times.

Curiously, the program works as expected when using an if() instead of the ternary operator in the assume macro, but maybe that's just luck.

I could not reproduce the issue with GCC 8.5.0, GCC 11.2.1 or my GCC 12 checkout from 20220102.
Comment 1 Andrew Pinski 2022-01-12 03:54:39 UTC
Looks like it works in GCC 11.1.0.
Comment 2 Andrew Pinski 2022-01-12 03:59:50 UTC
Note ?: is not the difference as I can reproduce it with:

#define assume(Expression) do { if((Expression)) (void)0; else __builtin_unreachable(); } while(0)
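
For comparison, here is a minimal sketch of the two macro shapes being discussed. The attachment is not reproduced in this report, so the ternary form below is an assumption about what the original macro looks like; the names ASSUME_TERNARY, ASSUME_IF, low_bits_ternary and low_bits_if are hypothetical. It relies on the GCC/Clang builtin __builtin_unreachable().

```c
/* Assumed shape of the ternary variant (not copied from the attachment). */
#define ASSUME_TERNARY(Expression) \
    ((Expression) ? (void)0 : __builtin_unreachable())

/* The if() variant, same shape as the macro quoted above. */
#define ASSUME_IF(Expression) \
    do { if ((Expression)) (void)0; else __builtin_unreachable(); } while (0)

/* Hypothetical helpers: only call these with 0 <= v < 4, otherwise
   the behaviour is undefined by construction. */
static int low_bits_ternary(int v) {
    ASSUME_TERNARY(v >= 0 && v < 4);
    return v & 3;
}

static int low_bits_if(int v) {
    ASSUME_IF(v >= 0 && v < 4);
    return v & 3;
}
```

Both forms state the same condition to the compiler, which is consistent with the observation that ?: is not the deciding factor.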
Comment 3 Andrew Pinski 2022-01-12 04:05:44 UTC
This might be a latent bug in GCC 11 even.
The only difference in the IR before unswitch (besides BB reordering) is:

GCC 10.x:
  <bb 5> [local count: 1073741824]:
  # it$m_pos$x_21 = PHI <0(4), _26(13)>
  if (it$4_29 == 3)
    goto <bb 11>; [11.00%]
  else
    goto <bb 6>; [89.00%]

GCC 11+:
  <bb 10> [local count: 1073741824]:
  # it$m_pos$x_19 = PHI <0(9), _24(13)>
  if (it$4_32 != 3)
    goto <bb 3>; [89.00%]
  else
    goto <bb 11>; [11.00%]

Note bb 11 is the return.
Comment 4 Andrew Pinski 2022-01-12 06:29:38 UTC
The unswitch-loops output looks the same in GCC 10.3.0 and 11.1.0, so ...
Comment 5 Andrew Pinski 2022-01-12 06:36:32 UTC
The pass where the code generation difference between GCC 11 and GCC 10.3 really comes into play is vrp2.
Comment 6 Andrew Pinski 2022-01-12 06:38:12 UTC
Actually, there are some ranges on some SSA names beforehand which look like they matter:
  # RANGE [1, 1]
Comment 7 Andrew Pinski 2022-01-12 07:08:04 UTC
Those come from dom3.

Here is a testcase which runs in finite time to test this (plus it is now a C testcase also):
#define assume(Expression) do { if((Expression)) (void)0; else __builtin_unreachable(); } while(0)
struct GridZYXIterator {	
	short x;
	short y;
	short z;
};
static inline
void f(struct GridZYXIterator *t) {
    t->x++;
    if(t->x == 3) {
        t->x = 0;
        t->y++;
        if(t->y == 3) {
            t->y = 0;
            t->z++;
        }
    }
}
static int t = 0;
static void g() __attribute__((noipa));
static void g(){t++; if (t > 27) __builtin_abort();};
int main()
{
	struct GridZYXIterator it = {0,0,0};
	while(it.z != 3) {
        assume(it.y < 3 && it.z < 3);
		g();
		f(&it);
	}	
}
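
For reference, the expected trip count of this testcase can be checked with a plain-C sketch that replaces assume() with a checked assert() and drops g(). The helper names step and count_iterations are hypothetical and not part of the testcase; the sketch only mirrors the iterator logic above.

```c
#include <assert.h>

/* Same layout as the iterator in the testcase above. */
struct GridZYXIterator {
    short x;
    short y;
    short z;
};

/* Mirrors f(): advance x, carrying into y and then into z at 3. */
static void step(struct GridZYXIterator *t) {
    t->x++;
    if (t->x == 3) {
        t->x = 0;
        t->y++;
        if (t->y == 3) {
            t->y = 0;
            t->z++;
        }
    }
}

/* Hypothetical helper: counts loop iterations without any builtin. */
static int count_iterations(void) {
    struct GridZYXIterator it = {0, 0, 0};
    int calls = 0;
    while (it.z != 3) {
        /* Checked stand-in for assume(): holds on every iteration. */
        assert(it.y < 3 && it.z < 3);
        calls++;
        step(&it);
    }
    return calls;
}
```

count_iterations() returns 27 (3 * 3 * 3), so a correctly compiled testcase calls g() exactly 27 times and the t > 27 check never fires. A 28th call means the exit test was dropped.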

Confirmed.
Comment 8 Andrew Pinski 2022-01-12 07:11:14 UTC
(In reply to Andrew Pinski from comment #7)
> Those come from dom3.
> 
> Here is a testcase which runs in finite time to test this (plus it is now a
> C testcase also):

This only fails at -O2 -funswitch-loops and not -O3 and only fails with GCC 10.x while the original one fails even with 9.x.
Comment 9 Richard Biener 2022-01-12 08:17:18 UTC
There's nothing wrong with the unswitching I think.  The exit test is optimized out in VRP2, where the preceding DOM pass ended up putting some more SSA ranges in, specifically those derived from the unreachable().  Disabling DOM3 avoids the miscompile.

After DOM3 we have

  <bb 2> [local count: 118111600]:
  goto <bb 4>; [100.00%]

  <bb 4> [local count: 228582456]:
  # ivtmp.22_40 = PHI <0(2), ivtmp.22_42(3)>
  # RANGE [-32768, 2]
  it$z_43 = (short int) ivtmp.22_40;
  # RANGE [1, 1]
  _32 = it$z_43 <= 2;

  <bb 5> [local count: 443025880]:
  # ivtmp.14_3 = PHI <0(4), ivtmp.14_19(9)>
  # RANGE [-32768, 2]
  it$y_2 = (short int) ivtmp.14_3;
  # RANGE [1, 1]
  _10 = it$y_2 <= 2;
  # RANGE [1, 1]
  _29 = _10 & _32;
  if (ivtmp.22_40 != 3)
    goto <bb 7>; [89.00%]
  else
    goto <bb 6>; [11.00%]

Note how it$z_43, derived from ivtmp.22_40, has a [-32768, 2] range while the
exit test is ivtmp.22_40 != 3.

Now, if ranger came along and we asked it how the global range [-32768, 2]
on the _43 def constrains _40, it would compute a range for _40 that
doesn't hold at this point.  So what we're likely seeing is the bad effects
of how we handle globalizing ranges for a, derived from
if (a) __builtin_unreachable ();
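
A small plain-C sketch of why such a range is only valid after the assume and not at the exit test. It simulates the loop from the comment #7 testcase and records the largest z seen at each of the two program points. The names ZRange and observe_z are hypothetical, and assert() stands in for assume() so nothing here depends on GCC internals.

```c
#include <assert.h>

/* Hypothetical record of the largest z observed at two program points. */
struct ZRange {
    int max_at_exit_test;
    int max_after_assume;
};

static struct ZRange observe_z(void) {
    short x = 0, y = 0, z = 0;
    struct ZRange r = { -1, -1 };
    for (;;) {
        /* Program point of the exit test: z is not yet constrained. */
        if (z > r.max_at_exit_test)
            r.max_at_exit_test = z;
        if (z == 3)
            break;
        /* Program point after assume(y < 3 && z < 3). */
        assert(y < 3 && z < 3);
        if (z > r.max_after_assume)
            r.max_after_assume = z;
        /* Iterator step, same carry logic as f() in the testcase. */
        if (++x == 3) {
            x = 0;
            if (++y == 3) {
                y = 0;
                ++z;
            }
        }
    }
    return r;
}
```

After the assume, z never exceeds 2, but at the exit test it does reach 3 on the final check. Attaching the upper bound 2 to the SSA name globally, and then using it at the exit test, is what makes the comparison against 3 look like it can never be true.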

There have been issues with that in the past.  GCC 10 vs 11 we have the
extra g:c76b3f9e83353a4cd437ca137c1fb835c9b5c21f which likely fixed this.
Comment 10 Richard Biener 2022-01-12 08:20:08 UTC
Duplicate; I'll pick the fix.

*** This bug has been marked as a duplicate of bug 97953 ***