This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



Re: C provenance semantics proposal


On Thu, Apr 18, 2019 at 11:31 AM Richard Biener
<richard.guenther@gmail.com> wrote:
>
> On Wed, Apr 17, 2019 at 4:12 PM Uecker, Martin
> <Martin.Uecker@med.uni-goettingen.de> wrote:
> >
> > On Wednesday, 17.04.2019, 15:34 +0200, Richard Biener wrote:
> > > On Wed, Apr 17, 2019 at 2:56 PM Uecker, Martin
> > > <Martin.Uecker@med.uni-goettingen.de> wrote:
> > > >
> > > > On Wednesday, 17.04.2019, 14:41 +0200, Richard Biener wrote:
> > > > > On Wed, Apr 17, 2019 at 1:53 PM Uecker, Martin
> > > > > <Martin.Uecker@med.uni-goettingen.de> wrote:
> > > > > >
> > > > > > >  Since
> > > > > > > your proposal is based on an abstract machine there isn't anything
> > > > > > > like a pointer with multiple provenances (which "anything" is), just
> > > > > > > pointers with no provenance (pointing outside of any object), right?
> > > > > >
> > > > > > This is correct. What the proposal does though is put a limit
> > > > > > on where pointers obtained from integers are allowed to point
> > > > > > to: They cannot point to non-exposed objects. I assume GCC
> > > > > > "anything" provenances also cannot point to all possible
> > > > > > objects.
> > > > >
> > > > > Yes.  We exclude objects that do not have their address taken
> > > > > though (so somewhat similar to your "exposed").
> > > >
> > > > Also if the address never escapes?
> > >
> > > Yes.
> >
> > Then with respect to "expose" it seems GCC implements
> > a superset which means it allows some behavior which
> > is undefined according to the proposal. So all seems
> > well with respect to this part.
> >
> >
> > With respect to tracking provenance through integers
> > some changes might be required.
> >
> > Let's consider this example:
> >
> > int x;
> > int y;
> > uintptr_t pi = (uintptr_t)&x;
> > uintptr_t pj = (uintptr_t)&y;
> >
> > if (pi + 4 == pj) {
> >
> >    int* p = (int*)pj; // can be one-after pointer of 'x'
> >    p[-1] = 1;         // well defined?
> > }
> >
> > If I understand correctly, a pointer obtained from
> > pi + 4 would have an "anything" provenance (which is
> > fine). But the pointer obtained from 'pj' would have the
> > provenance of 'y' so the access to 'x' would not
> > be allowed.
>
> Correct.  This is the most difficult case for us to handle
> exactly, also because of the following variant (is it also
> valid under the proposal?):
>
> int x;
> int y;
> uintptr_t pi = (uintptr_t)&x;
> uintptr_t pj = (uintptr_t)&y;
>
> if (pi + 4 == pj) {
>
>    int* p = (int*)(pi + 4); // can be one-after pointer of 'x'
>    p[-1] = 1;         // well defined?
> }
>
> while well-handled by GCC in the written form (as you
> say, pi + 4 yields "anything" provenance), GCC itself
> may transform it into the first variant by noticing
> the conditional equivalence and substituting pj for
> pi + 4.
>
> > But according to the preferred version of
> > our proposal, the pointer could also be used to
> > access 'x' because it is also exposed.
> >
> > GCC could make pj have an "anything" provenance
> > even though it is not modified. (This would break
> > some optimizations, such as the one for Matlab.)
> >
> > Maybe one could also refine this optimization to check
> > for additional conditions which rule out the case
> > that there is another object the pointer could point
> > to.
>
> The only feasible solution would be to not track
> provenance through non-pointers and make
> conversions of non-pointers to pointers have
> "anything" provenance.
>
> The additional issue that appears here though
> is that we cannot even turn (int *)(uintptr_t)p
> into p anymore since with the conditional
> substitution we can then still arrive at
> effectively (&y)[-1] = 1 which is of course
> undefined behavior.
>
> That is, your proposal makes
>
>  ((int *)(uintptr_t)&y)[-1] = 1
>
> well-defined (if &y - 1 == &x) but keeps
>
>   (&y)[-1] = 1
>
> as undefined which strikes me as a little bit
> inconsistent.  If that's true it's IMHO worth
> a defect report and second consideration.

Similarly, it is undesirable that

int x;
int y;
uintptr_t pj = (uintptr_t)&y;

if (&x + 1 == &y) {

   int* p = (int*)pj; // can be one-after pointer of 'x'
   p[-1] = 1;         // well defined?
}

is undefined but becomes well-defined once I add a no-op

 (uintptr_t)&x;

Can this no-op stmt appear in another function?  Or even in
another translation unit (if x and y are global variables)?
And does such a stmt have to be present (in another
TU) to make the example valid in this case?
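
For reference, the global-variable form of that question could be
written out roughly as below.  This is only a sketch: the function
names expose_x and y_follows_x are made up for illustration, and the
contested store is left commented out, since whether it is defined is
exactly what is being asked.

```c
#include <stdint.h>

int x;
int y;

/* Hypothetical helper: the "no-op" cast.  Under the proposal this is
   what would mark x as exposed; it could sit in any function, or in
   another TU since x is global. */
void expose_x(void)
{
    (void)(uintptr_t)&x;
}

/* Hypothetical helper: returns 1 if the comparison says y sits
   directly after x, 0 otherwise. */
int y_follows_x(void)
{
    uintptr_t pj = (uintptr_t)&y;

    if (&x + 1 == &y) {
        int *p = (int *)pj;   /* can be one-after pointer of 'x' */
        (void)p;
        /* p[-1] = 1;            the store under discussion */
        return 1;
    }
    return 0;
}
```

Whether expose_x is ever called, or merely present somewhere in the
program, is the part the questions above are about.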

To me all this makes requiring exposure through a cast
to a non-pointer (or through accessing the pointer's
representation) not in any way more "useful" for an optimizing
compiler than modeling exposure through address-taking.
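
To make the two notions concrete, here is a small sketch of which
objects each model would treat as reachable from an integer-derived
pointer, as I read the thread.  The variable and function names are
invented for illustration, and the classification in the comments is
an interpretation of the discussion, not wording from the proposal.

```c
#include <stdint.h>

int a;   /* address never taken: excluded under both models */
int b;   /* address taken and escaping, but never converted to an
            integer: reachable under address-taking, not exposed
            under the cast-based rule */
int c;   /* address converted to uintptr_t: reachable under both */

/* Plain use of a; its address is not taken anywhere. */
int read_a(void)
{
    return a;
}

/* The address of b escapes to the caller as a pointer only. */
int *addr_of_b(void)
{
    return &b;
}

/* The address of c escapes as an integer (cast-based exposure). */
uintptr_t int_of_c(void)
{
    return (uintptr_t)(void *)&c;
}
```

The difference between the models is confined to objects like b.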

Richard.

> Richard.
>
> > Best,
> > Martin

