[Bug tree-optimization/65752] Too strong optimizations int -> pointer casts

Mon Nov 16 12:16:00 GMT 2015

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65752

--- Comment #45 from Jeehoon Kang <jeehoon.kang at sf dot snu.ac.kr> ---
> I think this is not true.  For example with MatLab (might be sth else,
> if I don't remember correctly) you are required to pass pointers to
> arrays in two halves in double(!) values (I believe the only function
> argument type they support).  GCC happily makes points-to analysis work 
> through those.

Thank you for the example.  Still, I think it is somewhat unfortunate for
MatLab (or whichever tool it is) to pass pointers by packing their halves into
doubles, if only because of readability.  I think this goes beyond the intended
usage of the C/C++ language, but I understand that GCC is the time-honored
compiler for a wide range of software and systems.
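
To make the pattern under discussion concrete, here is a minimal sketch of
passing a pointer as two halves stored in doubles.  The names packed_ptr,
pack_ptr and unpack_ptr are hypothetical and not taken from any real MatLab
interface; the sketch assumes pointers fit in 64 bits and that the
pointer-to-integer conversion round-trips, as it does on common flat-address
platforms.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical carrier: each double holds one 32-bit half of the address.
   A 32-bit integer is exactly representable in a double (53-bit mantissa). */
typedef struct {
    double hi;  /* upper 32 bits of the address */
    double lo;  /* lower 32 bits of the address */
} packed_ptr;

/* Split the integer value of a pointer into two halves stored as doubles. */
static packed_ptr pack_ptr(void *p) {
    uint64_t bits = (uint64_t)(uintptr_t)p;
    packed_ptr r;
    r.hi = (double)(uint32_t)(bits >> 32);
    r.lo = (double)(uint32_t)(bits & 0xffffffffu);
    return r;
}

/* Rebuild the pointer from the two halves (an integer-to-pointer cast). */
static void *unpack_ptr(packed_ptr r) {
    uint64_t bits = ((uint64_t)(uint32_t)r.hi << 32)
                  | (uint64_t)(uint32_t)r.lo;
    return (void *)(uintptr_t)bits;
}
```

A caller would write something like int *q = unpack_ptr(pack_ptr(a)); and then
use q as if it were a.  The pointer goes through an integer and two
floating-point values on the way, which is the kind of flow the points-to
analysis has to follow.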

> The other (unfortunate) thing is that in GCC pointer subtraction
> is always performed on integers, thus for the C source code
> 
>  int idx = ptr1 - ptr2;
> 
> we internally have sth like
> 
>  int idx = ((long)ptr1 - (long)ptr2) / 4;
> 
> so you can't really treat pointers as "escaped" here without loss.

Thank you for the information.  I don't know the GCC internals, so I would like
to ask how much it would cost to introduce dedicated syntax for pointer
subtraction.  I hope the cost is not huge, but I really have no idea.
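
For reference, the lowering quoted above can be written out at the source level
as follows.  This is only an illustrative sketch: the function names are
hypothetical, intptr_t stands in for the "long" of the quoted snippet, and the
integer version assumes a flat address space, since the result of a
pointer-to-integer conversion is implementation-defined.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Pointer subtraction as written in the source program. */
static ptrdiff_t diff_as_pointers(const int *p1, const int *p2) {
    return p1 - p2;
}

/* The same difference computed on integers: cast both pointers,
   subtract the addresses, then divide by the element size. */
static ptrdiff_t diff_as_integers(const int *p1, const int *p2) {
    return ((intptr_t)p1 - (intptr_t)p2) / (intptr_t)sizeof(int);
}
```

For two pointers into the same array, both functions give the same element
count on such platforms.  The second form contains two pointer-to-integer
casts, so treating every such cast as an escape point would affect every
pointer subtraction.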

> Note that we've had this (kind of) in the past and it tried to go
> without making pointers escaped at these points but only consider
> casts from integers to pointers as pointing to anything.  But
> that's wrong correctness wise (not then treating the cast to integer
> as escape point).

Treating the cast to integer as an escape point is proven correct by a
machine-checked proof (in Coq) for various standard optimization examples, such
as constant propagation (CP), dead code elimination (DCE), dead allocation
elimination, etc.  For more detail, please see the paper mentioned above.

> I also don't think doing the above would solve the cases of equality
> compares of pointers themselves.  (hopefully undefined)

The formal memory model in the paper I mentioned invokes undefined behavior for
the pointer equality comparison example above.  In the formal model, a pointer
is represented as a pair of a memory block identifier (l) and an offset (o)
(cf. the CompCert memory model).  When memory is malloc-ed or alloca-ed, a new
memory block identifier is assigned.  A pointer equality comparison, say of
(l, o) and (l', o'), invokes undefined behavior when l != l'.
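
The rule can be sketched as a toy model in C.  This is my own illustration
under stated assumptions, not code from the paper or from CompCert: the names
model_ptr, model_alloc, model_add and model_eq are hypothetical, offsets are
counted in bytes, and undefined behavior is represented by an explicit
CMP_UNDEFINED result rather than actually invoked.

```c
#include <assert.h>

/* A pointer in the model: a memory block identifier and a byte offset. */
typedef struct {
    int  block;   /* l: identifier of the allocation */
    long offset;  /* o: offset within that allocation */
} model_ptr;

typedef enum { CMP_FALSE, CMP_TRUE, CMP_UNDEFINED } cmp_result;

static int next_block = 1;

/* Every allocation gets a fresh block identifier and offset 0. */
static model_ptr model_alloc(void) {
    model_ptr p = { next_block++, 0 };
    return p;
}

/* Pointer arithmetic changes only the offset, never the block. */
static model_ptr model_add(model_ptr p, long bytes) {
    p.offset += bytes;
    return p;
}

/* Equality is defined only when both pointers are in the same block. */
static cmp_result model_eq(model_ptr a, model_ptr b) {
    if (a.block != b.block)
        return CMP_UNDEFINED;
    return a.offset == b.offset ? CMP_TRUE : CMP_FALSE;
}
```

In this sketch, comparing pointers obtained from two different model_alloc
calls yields CMP_UNDEFINED no matter what the offsets are, while comparisons
within one block reduce to comparing offsets.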

So for the following example (by Alexander Cherepanov):

    #include <stdint.h>
    #include <stdio.h>

    int main() {
       int y, x = 0;
       int *volatile v = &x;
       int *xp = v;
       int *i = &y + 1;

       if (xp != i) {
         printf("hello\n");
         xp = i;
       }

       printf("%d\n", xp == &x);
    }

Say y and x are allocated at l1 and l2, respectively.  Then xp = (l2, 0), and i
= (l1, 4), assuming sizeof(int) is 4.  Thus comparing xp and i invokes
undefined behavior, since l1 != l2.