This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH] Fix C++ strict-aliasing issues with memcpy folding


On Tue, Feb 2, 2010 at 5:13 AM, Richard Guenther <rguenther@suse.de> wrote:
> On Tue, 2 Feb 2010, Gabriel Dos Reis wrote:
>
>> On Tue, Feb 2, 2010 at 4:25 AM, Richard Guenther <rguenther@suse.de> wrote:
>> > On Tue, 2 Feb 2010, Gabriel Dos Reis wrote:
>> >
>> >> On Sun, Jan 24, 2010 at 3:18 PM, Richard Guenther <rguenther@suse.de> wrote:
>> >> > On Sun, 24 Jan 2010, Mark Mitchell wrote:
>> >> >
>> >> >> Richard Guenther wrote:
>> >> >>
>> >> >> > The type of the first argument of memcpy doesn't mean anything, so no.
>> >> >>
>> >> >> I don't claim to be up on the current state of the standards with
>> >> >> respect to these issues. ?The last time I tried to understand all the
>> >> >> issues, it appeared that the committees had simply punted; they didn't
>> >> >> seem to want to think about what these things meant.
>> >> >>
>> >> >> But, what evidence do we have that:
>> >> >>
>> >> >> + ? __builtin_memcpy (&i, p, 4);
>> >> >> + ? if (*(float *)&i != 1.0)
>> >> >>
>> >> >> is in fact legal and intended to change the dynamic type of the storage?
>> >> >> ?(In terms of the standard, that has to be "memcpy", not
>> >> >> "__builtin_memcpy", but we can understand those to be the same in this
>> >> >> context.)
>> >> >
>> >> > The evidence is 3.8/4 - 'A program may end the lifetime of any object
>> >> > by reusing the storage which the object occupies ...'.
>> >>
>> >> I read "ending the lifetime" of an object very differently from changing
>> >> the dynamic type of an object. ?I would be very surprised to see any
>> >> evidence in the C++ standard to support the construct shown above.
>> >>
>> >> > ?Re-using
>> >> > the storage involves constructing a new object in place.
>> >>
>> >> That is correct. ?But, that does not necessarily imply that
>> >> the above construct is blessed.
>> >>
>> >> > ?For POD
>> >> > types a copy using memcpy should be a standard conforming way
>> >> > (I'm of course always using middle-end view here and do not see
>> >> > a reason to not allow this).
>> >>
>> >> I believe the standard explicitly blesses using a roundtrip memcpy().
>> >> I do not see any mention of changing declared object's dynamic type
>> >> with memcpy().
>> >>
>> >> >
>> >> > I failed to find any explicit reference to the effect of memcpy
>> >> > on dynamic types in the C++ standard.
>> >>
>> >> I believe in that case, we have "undefined behaviour" :-)
>> >>
>> >> >?The C standard however
>> >> > has in 6.5/6 'If a value is copied into an object having no
>> >> > declared type using memcpy or memmove, or is copied as an
>> >> > array of character type, then the effective type of the
>> >> > modified object for that access and for subsequent accesses that
>> >> > do not modify the values is the effective type of the object
>> >> > from which the value is copied, if it has one'. ?The C standard
>> >> > restricts itself to speak about anonymous memory only here
>> >> > which effectively would make the testcase undefined in C
>> >> > (at least that's my reading).
>> >>
>> >> yeah, I think that means objects allocated from free store.
>> >> However, C++ uses "object" in a more restricted form than C.
>> >
>> > Do you have any opinion on the libstdc++ case which effectively does
>> >
>> > ?struct Y;
>> > ?struct X { union T { long alignme; char mem[32]; } m; } x, y, tem;
>> > ?new (&x.m.mem) Y;
>>
>> At this point x.m.mem holds a Y object (the character array essentially
>> acts like an object container, due to the ambiguous status of char,
>> and C++ lacking proper 'byte' types.)
>>
>> > ?tem = x;
>>
>> C++03, curiously, does not define copy-n-assignment of union
>> objects. ?C++0x will fix that by prescribing a memcpy
>>
>> ? ?http://www.open-std.org/JTC1/SC22/WG21/docs/cwg_defects.html#653
>>
>> which copies the union-subobject x.m, therefore morally transferring
>> the active dynamic type of the 'x.m' sub-object to tem.m
>>
>> > ?x = y;
>>
>> at this point 'y' was never initialized, so you should get
>> an undefined behaviour on the read of 'y.m'. ?I don't know
>> that is important for the discussion though.
>>
>> > ?y = tem;
>>
>> Assuming copy-n-assignment of unions is fixed as above,
>> this should make y.m have the same active dynamic type
>> as that of tem.m.
>>
>> >
>> > ? ?Basically using an lvalue of type X to access the object of type Y it
>> > constructed in-place in mem?
>>
>> I do not see an lvalue of type X used to access the object Y.
>
> Isn't 'x' an lvalue of type X?

yes, 'x' is lvalue of type X.  But, I do not see it used to access
the Y object.  At most, we would have either 'x.m' or 'x.m.mem',
none of which is of type X.

>?As the in-place constructed object of type Y is embedded in 'x',
> isnt 'x' also accessing that object?

I do not see that access in the sample code.

In general, when we write 'obj.member' to access a sub-object of
an object 'obj', we don't say we use 'obj' to access the data designated
by 'obj.member' -- rather, we talk of the complete path 'obj.member'.
Otherwise, we would have to throw away almost all of C++ type rules.

>
> Basically the middle-end sees
>
> ?new (&x.m.mem[0]) Y;
> ?((Y *)&x.m.mem[0])->foo = 1;
> ?tem = x;
>
> and the store does not conflict with the load of x as far
> as TBAA is concerned (there is no Y in the alias-subsets of X).
> So the scheduler ends up re-ordering the accesses.
>
> Thus I'm looking for a way to claim that the copy as above
> invokes undefined behavior (because of that in-place construction),
> and that the only valid way to copy 'x' is by using memcpy
> (independed on that being broken by the middle-end).

Is there a way to tell the middle end that in the assignment

    tem = x;

both members of x.m are indistinguishable in terms of aliasing?
Or would that be too pessimistic an assumption for the optimizer?

>
> So, is the copy tem = x above indeed invoking undefined behavior?

Technically, that assignment is undefined behaviour in C++03, but
I think a compiler that goes by that would be terrible from practical
point of view -- that omission was not intended, as indicated in the DR.
That will be fixed soon, by requiring the compiler to do a memcpy().
Now, I do not know whether the g++ front-end will insert that memcpy(),
or if it would let the middle end do it.  Either way, I think g++ today
should treat that assignment as making 'tem.m' have the same active
dynamic type as 'x.m'.  Now, you're the middle-end expert, and I have
no clue about how difficult that is to teach GCC with the current
infrastructure.

>
>> > This is what started the whole discussion, as the proposed "fix",
>> >
>> > ?memcpy (&tem, &x, sizeof(X));
>> > ?memcpy (&x, &y, sizeof (X));
>> > ?memcpy (&y, &tem, sizeof (x));
>> >
>> > is folded to the plain assignments again.
>>
>> ah, but that transformation to plain assignment is middle-end
>> gimmicks, not C++ source-level constructs. ?I think the middle-end
>> should probably annotate that transformation (probably as
>> a VIEW_CONVERT_EXPR?) since the standard explicitly says that one can
>> move valid objects around with memcpy() (for objects of POD type.)
>> And surely the library should use memcpy() to copy objects with
>> statically ambiguous type.
>>
>> Does the annotation idea makes sense to the middle-end?
>
> Yes. ?I was playing with that during the weekend and have some
> prototype patch that I need to clean up a bit.

That would be fantastic.

-- Gaby


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]