Bug 88433 - wrong code for printf after a pointer cast from a pointer to an adjacent object
Status: RESOLVED DUPLICATE of bug 49330
Alias: None
Product: gcc
Classification: Unclassified
Component: rtl-optimization
Version: 9.0
Importance: P3 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords: alias, wrong-code
Depends on: 49330
Blocks:
Reported: 2018-12-10 18:18 UTC by Martin Sebor
Modified: 2024-01-20 22:39 UTC

See Also:
Host:
Target: x86_64-linux
Build:
Known to work:
Known to fail: 6.4.0, 7.3.0, 8.2.0, 9.0
Last reconfirmed: 2018-12-11 00:00:00


Description Martin Sebor 2018-12-10 18:18:14 UTC
This is from Exploring C Semantics and Pointer Provenance:
  https://www.cl.cam.ac.uk/~pes20/cerberus/cerberus-popl2019.pdf

specifically, the following test case:
  https://cerberus.cl.cam.ac.uk/cerberus?defacto/provenance_basic_using_uintptr_t_global_yx.c

GCC emits code with different effects for each of the two functions below even when the objects x and y are adjacent to each other: in f(), the call to printf outputs the modified value of y.  In g(), however, it outputs the value of y before modification.

I consider the code in the test case undefined, but a) my understanding from Richard is that the middle-end intentionally doesn't track pointer provenance through integer conversions (and so doesn't necessarily treat this sort of "object hopping" as undefined), and b) the PNVI model outlined in the paper above and expected to be proposed for C2X makes this code valid (the model allows p to take on the provenance of y as a result of the integer <-> pointer casts).

Looking at the dumps, I think (a) is true for this test case in both f() and g().  The different output from g() appears to be due to the x86_64 back end performing the assignment *p = 11 only after it has stored the value of y in the register passed to printf.  Other back ends I've looked at produce the same output from g() as from f().

int y = 2, x = 1;

void f (void)
{
  long ix = (long)&x;
  long iy = (long)&y;

  ix += 4;

  int *p = (int*)ix;
  int *q = (int*)iy;

  if (p == q) {
    *p = 11;
    __builtin_printf ("%i", y);   // prints 11
  }
}

void g (void)
{
  long ix = (long)&x;
  long iy = (long)&y;

  ix += 4;

  int *p = (int*)ix;
  int *q = (int*)iy;

  if (!__builtin_memcmp (&p, &q, sizeof p)) {
    *p = 11;
    __builtin_printf ("%i", y);   // prints 2
  }
}
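
The behavior described above depends on y being placed immediately after x, which the implementation is free not to do. A minimal sketch for checking that premise in a given build follows. The helper names y_follows_x and require_adjacent are hypothetical and not part of the original test case, and uintptr_t with sizeof x stands in for the long and the literal 4 used above.

```c
#include <assert.h>
#include <stdint.h>

int y = 2, x = 1;

/* Hypothetical helper: returns 1 when y starts at the byte right after x,
   0 otherwise.  Only the integer values of the two addresses are compared;
   nothing is dereferenced.  */
int y_follows_x (void)
{
  uintptr_t ix = (uintptr_t)&x;
  uintptr_t iy = (uintptr_t)&y;

  return ix + sizeof x == iy;
}

/* Hypothetical guard: aborts when the layout assumed by the test case
   does not hold in this build.  */
void require_adjacent (void)
{
  assert (y_follows_x () && "x and y are not adjacent in this build");
}
```

Calling require_adjacent() before f() and g() would make a run fail loudly when the objects are laid out differently, instead of silently skipping both branches. The result says nothing about other compilers, targets, or option sets.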
Comment 1 Martin Sebor 2018-12-11 00:50:09 UTC
Then again, the "problem" would disappear if the middle-end could be made to understand that memcmp(&p, &q, sizeof p) is the same thing as p == q for any integer or pointer types p and q.  So maybe it is a middle-end issue because the back-end doesn't know and can't tell that y is the same as *p.
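
To make the equivalence in this comment concrete, here is a small sketch that performs both comparisons on pointers that are valid to dereference. The function names are hypothetical. It assumes a target such as x86_64-linux where a given pointer value has a single object representation; the C standard does not guarantee that in general.

```c
#include <assert.h>
#include <string.h>

/* Compare two pointers by value, as f() does.  */
int eq_by_value (int *p, int *q)
{
  return p == q;
}

/* Compare the object representations of two pointers, as g() does.  */
int eq_by_repr (int *p, int *q)
{
  return !memcmp (&p, &q, sizeof p);
}

/* Hypothetical self-check: both comparisons agree for a pointer compared
   with itself and for pointers to two distinct objects.  */
void check_agreement (void)
{
  int a = 0, b = 0;

  assert (eq_by_value (&a, &a) == eq_by_repr (&a, &a));
  assert (eq_by_value (&a, &b) == eq_by_repr (&a, &b));
}
```

This only shows that the two forms agree on the comparison result under the stated assumption. It does not exercise the one-past-the-end case from the test case, which is where the generated code diverges.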
Comment 2 Richard Biener 2018-12-11 09:22:36 UTC
I think this is a dup of the bug pointing to RTL alias analysis which cannot
properly distinguish between pointers and integers when following to base
terms.

We expand from

  <bb 2> [local count: 1073741825]:
  ix_5 = (long int) &x;
  ix_6 = ix_5 + 4;
  ix.1_1 = (int *) ix_6;
  p = ix.1_1;
  q = &y;
  _14 = MEM[(char * {ref-all})&p];
  _15 = MEM[(char * {ref-all})&q];
  if (_14 == _15)
    goto <bb 3>; [33.00%]
  else
    goto <bb 4>; [67.00%]

  <bb 3> [local count: 354334802]:
  *ix.1_1 = 11;
  y.4_3 = y;
  __builtin_printf ("%i", y.4_3); [tail call]

where RTL has plenty of opportunity to track ix.1_1 down to &x + 4, which
makes it non-aliasing with y.

Note we are saved for f just because of optimization propagating a
conditional equivalence p_5 == &y.

Dup of PR49330.
Comment 3 Martin Sebor 2020-03-20 20:21:39 UTC
Per comment #2.

*** This bug has been marked as a duplicate of bug 49330 ***