Similar to pr93051.

The optimizer sometimes changes `p == q ? p : q` to `q`. This is wrong when the actual provenance of `p` differs from that of `q`.
There are two forms -- with the actual conditional operator and with the `if` statement.

The ideal example would be constructed with the help of restricted pointers but it's run into a theoretical problem -- see the first testcase in pr92963.
My other examples require two conditionals to eliminate the possibility of UB. Comparison of integers should give stable results, hopefully that would be enough to demonstrate the problem.

Example with the conditional operator and with dead malloc (the wrong optimization seems to be applied before tree-opt):

----------------------------------------------------------------------
#include <stdint.h>
#include <stdlib.h>
#include <stdio.h>

__attribute__((noipa,optnone)) // imagine it in a separate TU
static void *opaque(void *p) { return p; }

int main()
{
    int *q = malloc(sizeof(int));
    opaque(q);
    uintptr_t iq = (uintptr_t)(void *)q;
    free(q);

    int *p = malloc(sizeof(int));
    opaque(p);
    uintptr_t ip = (uintptr_t)(void *)p;

    uintptr_t ir = ip == iq ? ip : iq;
    if (ip == iq) {
        *p = 1;
        *(int *)(void *)ir = 2;
        printf("result: %d\n", *p);
    }
}
----------------------------------------------------------------------
$ gcc -std=c11 -pedantic -Wall -Wextra -Wno-attributes test.c && ./a.out
result: 2
$ gcc -std=c11 -pedantic -Wall -Wextra -Wno-attributes -O3 test.c && ./a.out
result: 1
----------------------------------------------------------------------
gcc x86-64 version: gcc (GCC) 10.0.0 20191223 (experimental)
----------------------------------------------------------------------
Example with a past-the-end pointer (vrp1, similar to bug 93051, comment 0, but this time with PHI):

----------------------------------------------------------------------
#include <stdio.h>

__attribute__((noipa,optnone)) // imagine it in a separate TU
static void *opaque(void *p) { return p; }

static int been_there = 0;

static int *f(int *p, int *q)
{
    if (p == q) {
        been_there = 1;
        return p;
    } else {
        been_there = 0;
        return q;
    }
}

int main()
{
    int x[5];
    int y[1];

    int *p = x;
    int *q = y + 1;
    opaque(q);

    int *p1 = opaque(p); // prevents early optimization of x==y+1
    int *r = f(p1, q);

    if (been_there) {
        *p = 1;
        *r = 2;
        printf("result: %d\n", *p);
    }
}
----------------------------------------------------------------------
$ gcc -std=c11 -pedantic -Wall -Wextra -Wno-attributes test.c && ./a.out
result: 2
$ gcc -std=c11 -pedantic -Wall -Wextra -Wno-attributes -O3 test.c && ./a.out
test.c: In function ‘main’:
test.c:33:9: warning: array subscript 1 is outside array bounds of ‘int[1]’ [-Warray-bounds]
   33 |         *r = 2;
      |         ^~
test.c:22:9: note: while referencing ‘y’
   22 |     int y[1];
      |         ^
result: 1
----------------------------------------------------------------------
gcc x86-64 version: gcc (GCC) 10.0.0 20191223 (experimental)
----------------------------------------------------------------------
Example with a dead malloc (phiopt2):

----------------------------------------------------------------------
#include <stdint.h>
#include <stdlib.h>
#include <stdio.h>

__attribute__((noipa,optnone)) // imagine it in a separate TU
static void *opaque(void *p) { return p; }

static int been_there = 0;

static uintptr_t f(uintptr_t ip, uintptr_t iq)
{
    if (ip == iq) {
        been_there = 1;
        return ip;
    } else {
        been_there = 0;
        return iq;
    }
}

int main()
{
    int *q = malloc(sizeof(int));
    opaque(q);
    uintptr_t iq = (uintptr_t)(void *)q;
    free(q);

    int *p = malloc(sizeof(int));
    opaque(p);
    uintptr_t ip = (uintptr_t)(void *)p;

    uintptr_t ir = f(ip, iq);

    if (been_there) {
        *p = 1;
        *(int *)(void *)ir = 2;
        printf("result: %d\n", *p);
    }
}
----------------------------------------------------------------------
$ gcc -std=c11 -pedantic -Wall -Wextra -Wno-attributes test.c && ./a.out
result: 2
$ gcc -std=c11 -pedantic -Wall -Wextra -Wno-attributes -O3 test.c && ./a.out
result: 1
----------------------------------------------------------------------
gcc x86-64 version: gcc (GCC) 10.0.0 20191223 (experimental)
----------------------------------------------------------------------
There is a C defect report about these cases.
Could you please provide a bit more specific reference? If you mean various discussions about C provenance semantics then they are not about these cases. All examples in pr93051 and in this pr fully respect provenance -- it's the compiler who changes the provenance. In some sense dealing with these bugs is a prerequisite for a meaningful discussion of C provenance semantics: it's hard to reason about possible boundaries of provenance when there are problems with cases where provenance is definitely right.
1. It should be noted that the idea of problems arising from `p == q ? p : q` is from Chung-Kil Hur via bug 65752, comment 15.
2. clang bug -- https://bugs.llvm.org/show_bug.cgi?id=44374.
I think for the integer issue there's an exact dup. Time to add a meta-bug linking all of them?

All of the issues really point to the same very fundamental issue: provenance does not affect the actual value, and since optimizing compilers track value equivalence, provenance gets messed up. Provenance is a dead end.