This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH] Fix C++ strict-aliasing issues with memcpy folding


On Tue, 2 Feb 2010, Gabriel Dos Reis wrote:

> On Tue, Feb 2, 2010 at 5:13 AM, Richard Guenther <rguenther@suse.de> wrote:
> > On Tue, 2 Feb 2010, Gabriel Dos Reis wrote:
> >
> >> On Tue, Feb 2, 2010 at 4:25 AM, Richard Guenther <rguenther@suse.de> wrote:
> >> > On Tue, 2 Feb 2010, Gabriel Dos Reis wrote:
> >> >
> >> >> On Sun, Jan 24, 2010 at 3:18 PM, Richard Guenther <rguenther@suse.de> wrote:
> >> >> > On Sun, 24 Jan 2010, Mark Mitchell wrote:
> >> >> >
> >> >> >> Richard Guenther wrote:
> >> >> >>
> >> >> >> > The type of the first argument of memcpy doesn't mean anything, so no.
> >> >> >>
> >> >> >> I don't claim to be up on the current state of the standards with
> >> >> >> respect to these issues. ?The last time I tried to understand all the
> >> >> >> issues, it appeared that the committees had simply punted; they didn't
> >> >> >> seem to want to think about what these things meant.
> >> >> >>
> >> >> >> But, what evidence do we have that:
> >> >> >>
> >> >> >> + ? __builtin_memcpy (&i, p, 4);
> >> >> >> + ? if (*(float *)&i != 1.0)
> >> >> >>
> >> >> >> is in fact legal and intended to change the dynamic type of the storage?
> >> >> >> ?(In terms of the standard, that has to be "memcpy", not
> >> >> >> "__builtin_memcpy", but we can understand those to be the same in this
> >> >> >> context.)
> >> >> >
> >> >> > The evidence is 3.8/4 - 'A program may end the lifetime of any object
> >> >> > by reusing the storage which the object occupies ...'.
> >> >>
> >> >> I read "ending the lifetime" of an object very differently from changing
> >> >> the dynamic type of an object. ?I would be very surprised to see any
> >> >> evidence in the C++ standard to support the construct shown above.
> >> >>
> >> >> > ?Re-using
> >> >> > the storage involves constructing a new object in place.
> >> >>
> >> >> That is correct. ?But, that does not necessarily imply that
> >> >> the above construct is blessed.
> >> >>
> >> >> > ?For POD
> >> >> > types a copy using memcpy should be a standard conforming way
> >> >> > (I'm of course always using middle-end view here and do not see
> >> >> > a reason to not allow this).
> >> >>
> >> >> I believe the standard explicitly blesses using a roundtrip memcpy().
> >> >> I do not see any mention of changing declared object's dynamic type
> >> >> with memcpy().
> >> >>
> >> >> >
> >> >> > I failed to find any explicit reference to the effect of memcpy
> >> >> > on dynamic types in the C++ standard.
> >> >>
> >> >> I believe in that case, we have "undefined behaviour" :-)
> >> >>
> >> >> >?The C standard however
> >> >> > has in 6.5/6 'If a value is copied into an object having no
> >> >> > declared type using memcpy or memmove, or is copied as an
> >> >> > array of character type, then the effective type of the
> >> >> > modified object for that access and for subsequent accesses that
> >> >> > do not modify the values is the effective type of the object
> >> >> > from which the value is copied, if it has one'. ?The C standard
> >> >> > restricts itself to speak about anonymous memory only here
> >> >> > which effectively would make the testcase undefined in C
> >> >> > (at least that's my reading).
> >> >>
> >> >> yeah, I think that means objects allocated from free store.
> >> >> However, C++ uses "object" in a more restricted form than C.
> >> >
> >> > Do you have any opinion on the libstdc++ case which effectively does
> >> >
> >> > ?struct Y;
> >> > ?struct X { union T { long alignme; char mem[32]; } m; } x, y, tem;
> >> > ?new (&x.m.mem) Y;
> >>
> >> At this point x.m.mem holds a Y object (the character array essentially
> >> acts like an object container, due to the ambiguous status of char,
> >> and C++ lacking proper 'byte' types.)
> >>
> >> > ?tem = x;
> >>
> >> C++03, curiously, does not define copy-n-assignment of union
> >> objects. ?C++0x will fix that by prescribing a memcpy
> >>
> >> ? ?http://www.open-std.org/JTC1/SC22/WG21/docs/cwg_defects.html#653
> >>
> >> which copies the union-subobject x.m, therefore morally transferring
> >> the active dynamic type of the 'x.m' sub-object to tem.m
> >>
> >> > ?x = y;
> >>
> >> at this point 'y' was never initialized, so you should get
> >> an undefined behaviour on the read of 'y.m'. ?I don't know
> >> that is important for the discussion though.
> >>
> >> > ?y = tem;
> >>
> >> Assuming copy-n-assignment of unions is fixed as above,
> >> this should make y.m have the same active dynamic type
> >> as that of tem.m.
> >>
> >> >
> >> > ? ?Basically using an lvalue of type X to access the object of type Y it
> >> > constructed in-place in mem?
> >>
> >> I do not see an lvalue of type X used to access the object Y.
> >
> > Isn't 'x' an lvalue of type X?
> 
> yes, 'x' is lvalue of type X.  But, I do not see it used to access
> the Y object.  At most, we would have either 'x.m' or 'x.m.mem',
> none of which is of type X.
> 
> >?As the in-place constructed object of type Y is embedded in 'x',
> > isnt 'x' also accessing that object?
> 
> I do not see that access in the sample code.

See below...

> In general, when we write 'obj.member' to access a sub-object of
> an object 'obj', we don't say we use 'obj' to access the data designated
> by 'obj.member' -- rather, we talk of the complete path 'obj.member'.
> Otherwise, we would have to throw away almost all of C++ type rules.
> 
> >
> > Basically the middle-end sees
> >
> > ?new (&x.m.mem[0]) Y;
> > ?((Y *)&x.m.mem[0])->foo = 1;
> > ?tem = x;

... the copy-assignment is seen as a load from 'x' and a store to 'tem'
by the middle-end.  Supposedly this is what the frontend emits
for copy-assignments of POD types.

> > and the store does not conflict with the load of x as far
> > as TBAA is concerned (there is no Y in the alias-subsets of X).
> > So the scheduler ends up re-ordering the accesses.
> >
> > Thus I'm looking for a way to claim that the copy as above
> > invokes undefined behavior (because of that in-place construction),
> > and that the only valid way to copy 'x' is by using memcpy
> > (independed on that being broken by the middle-end).
> 
> Is there a way to tell the middle end that in the assignment
> 
>     tem = x;
> 
> both members of x.m are indistinguishable in terms of aliasing?
> Or would that be too pessimistic an assumption for the optimizer?

That wouldn't be even enough.  Or I don't understand what you
mean with indistinguishable.

> > So, is the copy tem = x above indeed invoking undefined behavior?
> 
> Technically, that assignment is undefined behaviour in C++03, but
> I think a compiler that goes by that would be terrible from practical
> point of view -- that omission was not intended, as indicated in the DR.
> That will be fixed soon, by requiring the compiler to do a memcpy().

So every copy-assignment of a POD type will have to be a memcpy?

> Now, I do not know whether the g++ front-end will insert that memcpy(),
> or if it would let the middle end do it.  Either way, I think g++ today
> should treat that assignment as making 'tem.m' have the same active
> dynamic type as 'x.m'.  Now, you're the middle-end expert, and I have
> no clue about how difficult that is to teach GCC with the current
> infrastructure.

It would be the frontends responsibility to translate tem = x
with using memcpy as only the frontend knows that the copy has
to conflict with stores through arbitrary types.

Whether using memcpy (or any other middle-end notion with the
same effect) or simply disabling type-based aliasing for C++ will
end up more efficient remains to be seen ...

Richard.

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]