Bug 82282 - PRE cannot blindly fold integer-to-pointer/pointer-to-integer round-trips
Summary: PRE cannot blindly fold integer-to-pointer/pointer-to-integer round-trips
Status: UNCONFIRMED
Alias: None
Product: gcc
Classification: Unclassified
Component: tree-optimization (show other bugs)
Version: 6.4.0
: P3 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords:
Depends on: 65752
Blocks:
  Show dependency treegraph
 
Reported: 2017-09-21 16:31 UTC by Nuno Lopes
Modified: 2019-04-12 05:20 UTC (History)
8 users (show)

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Nuno Lopes 2017-09-21 16:31:07 UTC
The following program gets miscompiled by gcc:

$ cat foo.c
#include <stdio.h>
#include <stdint.h>

int* glb;
int* tmp[10];

void main(int argc, char* argv) {
  int x[1] = { 555 }, y[1] = { 777 };
  uintptr_t u = (uintptr_t) (x + 1);
  uintptr_t v = (uintptr_t) y;
  uintptr_t w;

  int b1 = u != v;
  int b2 = u+1 != v+1;

  int* z;

  if (b1) {
    printf("b1 TRUE.\n");
    v = u;
  }
  glb = (int*) v;

  for (int i=0; i < 10; i++)
    tmp[i] = (int*) v;

  if (b2) {
    printf("b2 TRUE.\n");
    glb = x;
  }

  w = u;
  for (int i = 0; i < 100; ++i) {
    if (w < v) {
      w += 1;
    }
  }

  if (v == w) {
    z = x;
  } else {
    printf("IMPOSSIBLE!\n");
    z = y;
  }
  *z = 555;

  *glb = 1;

  printf("x=%d y=%d\n", x[0], y[0]);
}

$ gcc -O3 -fdump-tree-all foo.c
$ ./a
x=555 y=777


We start with:
  u_14 = (uintptr_t) &MEM[(void *)&x + 4B];
  v_15 = (uintptr_t) &y;

  if (u_14 != v_15)
    goto <bb 3>;
  else
    goto <bb 4>;

  <bb 4>:
  # v_1 = PHI <v_15(2), u_14(3)>
  v.0_19 = (int *) v_1
  glb = v.0_19;


Which is then (correctly) transformed by phiopt2 to:
  if (u_14 != v_15)
    goto <bb 3>;
  else
    goto <bb 4>;

  <bb 4>:
  # v_1 = PHI <u_14(2), u_14(3)>
  v.0_19 = (int *) v_1;
  glb = v.0_19;


Then PRE incorrectly removes the pointer-to-integer/integer-to-point cast round-trip through the PHI node:
  glb = &MEM[(void *)&x + 4B];

This is wrong because we've now lost the information that glb may also alias with y, as seen by the alias analysis report:
glb = { ESCAPED NONLOCAL x }


After a few more rounds of copy propagation and "dom3", "777" is constant propagated to the printf call, since the store to glb cannot possibly alias 'y' anymore.
All the subsequent transformations are correct. The bug is that PRE cannot blindly do a transformation of "int2ptr(ptr2int(x)) -> x".

Test case by Gil Hur.
Comment 1 Richard Biener 2017-09-22 08:39:16 UTC
This is really the same / similar as PR82177.

  if (u_14 != v_15)
    goto <bb 3>;
  else
    goto <bb 4>;

  <bb 4>:
  # v_1 = PHI <u_14(2), u_14(3)>
  v.0_19 = (int *) v_1;
  glb = v.0_19;

so there's a missed trivial optimization not done by phiopt to

  u_14 = (uintptr_t) &MEM[(void *)&x + 4B];
...
  if (u_14 != v_15)
    goto <bb 3>;
  else
    goto <bb 4>;

  <bb 4>:
  v.0_19 = (int *) u_14;

where (int *) u_14 would be folded to just &MEM[(void *)&x + 4B].

It's not partial redundancy elimination but value-numbering does this
kind of cleanup.  forwprop would also do it and there's nothing wrong
with this.

In C you are only allowed to convert a pointer back and forth
with [u]intptr_t, so when you convert it back you get the original
pointer which means it still points to one-after 'x'.  You can't
make it possibly point to 'y' by this trick.  Specifically you
are not allowed to make a uintptr_t from a pointer to object A,
do some magic on that value, convert it to a pointer again and
expect it to point to object B.
Comment 2 Nuno Lopes 2017-09-22 09:19:24 UTC
This is different from PR82177. That bug is in AA, this one is not.

See the C source:
  uintptr_t u = (uintptr_t) (x + 1);
  uintptr_t v = (uintptr_t) y;

  // if b1 true, then b2 must be true as well
  int b1 = u != v;
  int b2 = u+1 != v+1;

  if (b1) {
    v = u;
  }
  // glb = y if b1 false, glb = u if b1 true
  glb = (int*) v;

  if (b2) {
    // glb = x
    glb = x;
  }

  // glb = y if b1/b2 false, glb = x if b1/b2 true

So at this point, glb can alias either x or y. There's not UB. The cast is from a legitimate value.
The problem is that gcc replaces
  if (u != v) {
    v = u;
  }

simply with:
v = u;

Which is a correct integer transformation. The deal is that it breaks the data-flow dependency on 'y'. When the int2ptr cast is removed from "glb = (int*)v", we get "glb = x + 1", which is wrong.  This behavior couldn't ever happen in the C program.
Comment 3 rguenther@suse.de 2017-09-22 09:37:05 UTC
On Fri, 22 Sep 2017, nunoplopes at sapo dot pt wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82282
> 
> --- Comment #2 from Nuno Lopes <nunoplopes at sapo dot pt> ---
> This is different from PR82177. That bug is in AA, this one is not.
> 
> See the C source:
>   uintptr_t u = (uintptr_t) (x + 1);
>   uintptr_t v = (uintptr_t) y;
> 
>   // if b1 true, then b2 must be true as well
>   int b1 = u != v;
>   int b2 = u+1 != v+1;
> 
>   if (b1) {
>     v = u;
>   }
>   // glb = y if b1 false, glb = u if b1 true
>   glb = (int*) v;
> 
>   if (b2) {
>     // glb = x
>     glb = x;
>   }
> 
>   // glb = y if b1/b2 false, glb = x if b1/b2 true
> 
> So at this point, glb can alias either x or y. There's not UB. The cast is from
> a legitimate value.
> The problem is that gcc replaces
>   if (u != v) {
>     v = u;
>   }
> 
> simply with:
> v = u;
> 
> Which is a correct integer transformation. The deal is that it breaks the
> data-flow dependency on 'y'. When the int2ptr cast is removed from "glb =
> (int*)v", we get "glb = x + 1", which is wrong.  This behavior couldn't ever
> happen in the C program.

You say that deriving from the equivalency v == u the equivalency
&x + 1 == &y is wrong, correct?  But you are writing an obfuscated

 u = (uintptr_t) (x + 1);
 glb = (int *)u;

and expect glb now magically point to y as well.

Note that the points-to run after phiopt (it re-runs right from before
PRE) produces:

glb.2_3, points-to non-local, points-to escaped, points-to NULL, points-to 
vars: { D.2337 } (escaped)

because there we already have the

  # v_1 = PHI <u_14(2), u_14(3)>
  v.0_19 = (int *) v_1;
  glb = v.0_19;

and thus it sees v_1 only pointing to what u points to.

So this is the same as the other PR - it is about an alias derived
from the u == v equivalency.
Comment 4 Nuno Lopes 2017-09-22 10:55:06 UTC
There are two major transformations going on:
  if (u != v) {
    v = u;
  }
=>
  v = u

(with v, u integers)

and:

  glb = (int*)(uinptr_t)foo)
=>
  glb = foo


Doing both triggers the end-to-end miscompilation for a C program that exhibits no UB.  That means that one of transformations is wrong.
Disabling integer propagation would be really weird and bad.
So we are left with making "(int*)(uinptr_t)foo -> foo" an invalid transformation in general.  It's correct in some cases, but not always.

AA is behaving correctly because it is working just over pointers (since integer casts were removed) and thus it is allowed to looking strictly at data-flow dependencies.

The fix, we believe, is simple: remove the transformation "(int*)(uinptr_t)foo -> foo", or enable it just in the cases when it's correct.
For example, it's correct in this case:
  int v = p; // p is provably dereferenceable
  int *q = v;
  *q = ..;
    =>
  *p = ...;

but it's not correct in this case (at GIMPLE/SSA level):
  int v = p; // p may *not* be dereferenceable
  int *q = v;
  *q = ..;
    =>
  *p = ...;
Comment 5 rguenther@suse.de 2017-09-22 11:26:15 UTC
On Fri, 22 Sep 2017, nunoplopes at sapo dot pt wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82282
> 
> --- Comment #4 from Nuno Lopes <nunoplopes at sapo dot pt> ---
> There are two major transformations going on:
>   if (u != v) {
>     v = u;
>   }
> =>
>   v = u
> 
> (with v, u integers)
> 
> and:
> 
>   glb = (int*)(uinptr_t)foo)
> =>
>   glb = foo
> 
> 
> Doing both triggers the end-to-end miscompilation for a C program that exhibits
> no UB.  That means that one of transformations is wrong.
> Disabling integer propagation would be really weird and bad.
> So we are left with making "(int*)(uinptr_t)foo -> foo" an invalid
> transformation in general.  It's correct in some cases, but not always.

The C standard says it is correct as in, it yields the same pointer.
Comment 6 Richard Biener 2019-04-11 13:17:14 UTC
I think elsewhere I noted that propagating an equivalency is likely what makes
those cases appear.  In this cases it would be phiopt.  Still not doing that
would have some bad effects on optimization.