This is the mail archive of the
gcc@gcc.gnu.org
mailing list for the GCC project.
Re: RTL alias analysis
- From: Michael Veksler <mveksler at techunix dot technion dot ac dot il>
- To: Richard Guenther <richard dot guenther at gmail dot com>
- Cc: Alexandre Oliva <aoliva at redhat dot com>, Ian Lance Taylor <ian at airs dot com>, Kai Henningsen <kaih at khms dot westfalen dot de>, gcc at gcc dot gnu dot org, richard at codesourcery dot com
- Date: Thu, 26 Jan 2006 16:04:50 +0200
- Subject: Re: RTL alias analysis
So, is a union a very useful feature in ISO C without
GCC's extension? It seems that the only legal use of a union
is to use the same member type throughout the whole life of the object.
Here is the rationale:
Quoting Richard Guenther <richard.guenther@gmail.com>:
> On 1/25/06, Alexandre Oliva <aoliva@redhat.com> wrote:
> > On Jan 22, 2006, Richard Guenther <richard.guenther@gmail.com> wrote:
> >
[...]
> >
> > > int ii; double dd; void foo (int *ip, double *dp) {
> > > *ip = 15; ii = *ip; *dp = 1.5; dd = *dp; }
> > > void test (void) { union { int i; double d; } u;
> > > foo (&u.i, &u.d); }
> >
> > So it is perfectly valid, but if GCC reorders the read from *ip past
> > the store to *dp, it turns the valid program into one that misbehaves.
>
> *ip = 15; ii = *ip; *dp = 1.5; dd = *dp;
> Here ^^^
> you are accessing memory of type integer as type double. And gcc will
> happily reorder the read from *ip with the store to *dp based on TBAA
> unless it inlines the function and applies the "special" gcc rules about
> unions.
> So this example is invalid, too.
So in theory, if there is a union of two non-char types:
union { T1 v1; T2 v2; } x;
it is illegal to access both x.v1 and x.v2 for the same variable x
anywhere in the whole program. If there exists a whole-program run
in which x.v1 is written (even at the program's start) and later
x.v2 is written and read (even at the program's end), then the compiler
may reorder the writes at will and produce different results.
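To make the pattern concrete, here is a minimal sketch with T1 and T2 instantiated as int and double. The names int_or_double and store_then_switch are made up for illustration. Every access here goes through the union object itself, which is the form GCC's union extension covers; the quoted foo() differs in that it receives separate int * and double * pointers, so the compiler no longer sees the union.

```c
/* Hypothetical names; T1/T2 from the text are instantiated as int/double. */
union int_or_double { int v1; double v2; };

/* Write v1, read it back, then reuse the same storage as v2.
 * All accesses name the union member explicitly. */
int store_then_switch(union int_or_double *x, double *out)
{
    int seen;

    x->v1 = 15;    /* v1 is written early in the object's life */
    seen = x->v1;  /* read back through the same member */
    x->v2 = 1.5;   /* later the same storage is reused as v2 */
    *out = x->v2;  /* read through the member that was last written */
    return seen;
}
```

Called with a local union and a double result slot, this returns 15 and stores 1.5, because each read uses the member that was stored last.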
Even bison/yacc are not safe, since the generated parser uses an array of
YYSTYPE (YYSTYPE is normally a union) and assigns values of different
member types to it. The same entry in the stack may be accessed through
different members depending on the currently active rule. I don't think
the standard committee intended that, or did they?
GCC seems to do better WRT unions than ISO C. Even that is not perfect,
because it may break things like bison in non-obvious ways:
A bison rule may pass a pointer or a reference to a union member to a
helper function (and lose the link to the original union).
If the functions are inlined and bison's loop is unrolled or SMS-ed,
it is possible to reorder memory accesses from two perfectly valid rules
in an invalid way.
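As a rough sketch of the helper-function situation: the code below uses a made-up stand-in for YYSTYPE (semantic_value, set_int, set_double and reuse_slot are all hypothetical names, not bison output). The helpers take a pointer to the whole union rather than to one member, so the member access stays visible as a union access. Whether that is enough under a strict reading of ISO C is exactly the open question here; it is only an assumption that this form keeps the link GCC relies on.

```c
/* Hypothetical stand-in for a bison semantic value; a real YYSTYPE
 * is defined by the grammar. */
union semantic_value { int ival; double dval; };

/* Helpers receive the union itself, not a pointer to a member. */
static void set_int(union semantic_value *v, int n) { v->ival = n; }
static void set_double(union semantic_value *v, double d) { v->dval = d; }

/* Reuse one stack slot for two different "rules": first as an int,
 * then as a double. */
double reuse_slot(union semantic_value *stack, int slot)
{
    int first;

    set_int(&stack[slot], 42);
    first = stack[slot].ival;       /* read while the slot holds an int */
    set_double(&stack[slot], 2.5);  /* same slot now holds a double */
    return stack[slot].dval + first;
}
```

The variant described in the text would instead pass &stack[slot].ival and &stack[slot].dval to helpers taking int * and double *, which is where the link to the union is lost.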
Is it the responsibility of bison to make sure this does not happen? How?
ISO C does not seem to provide such capabilities.
Does this mean that the bison language cannot be made valid (without using
some new GCC extensions)?
--
Michael