This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.


Re: [PATCH] C undefined behavior fix


> <<Sure. Except it must say _something_. It must be defined behaviour. That's
> what "implementation defined" means.
> >>

Robert Dewar writes:
> Defined does NOT mean it works as you want. For instance a C compiler would
> be free to say.
> 
> If you cast an integer to a pointer, the system disk will be deleted. This
> is an undesirable implementation, but not a non-conforming one.

I believe that you're being uselessly pedantic here.  "Implementation
defined", in terms of the way the C standard is written, means that it
does something reasonable and useful that will be documented.  This is
opposed to "undefined" which could logically include things like
generating code that crashes.

Furthermore, we are now documenting the behavior in extend.texi in
the CVS tree (though not in the version shipped with 3.0.3).  This
document now promises that when a pointer is cast to an integer type
that has the same number of bits, the bits are unchanged.

Because of C's traditional use in operating systems, real users need this
implementation-defined feature of C to do their jobs (for example,
addressing a device that is known to be at address 0x80000 in the address
space of an embedded processor).
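As a rough sketch of that use case: the code below builds a pointer to a
device register from an integer constant.  The address 0x80000 is the one
from the example above; the names DEVICE_ADDR, device_reg and
device_reg_bits are made up for illustration, and the 32-bit register
width is an assumption.  The pointer is deliberately never dereferenced,
so the sketch can run on a hosted system where nothing lives at that
address.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical device address, taken from the example in the text. */
#define DEVICE_ADDR ((uintptr_t)0x80000)

/* Build a pointer to the (hypothetical) device register.  The
   integer-to-pointer cast is implementation-defined; volatile keeps
   the compiler from caching or eliding accesses made through it. */
static volatile uint32_t *device_reg(void)
{
    return (volatile uint32_t *)DEVICE_ADDR;
}

/* Convert the pointer back to an integer without dereferencing it.
   The "bits are unchanged" promise applies to an integer type with
   the same number of bits as a pointer, which is assumed here. */
static uintptr_t device_reg_bits(void)
{
    assert(sizeof(uintptr_t) == sizeof(void *));
    return (uintptr_t)device_reg();
}
```

On the embedded target itself one would then write something like
*device_reg() = 1; to poke the register.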

If we had shipped this version of extend.texi with 3.0.x, it would seem
that Linus's argument would be bulletproof: we specified the
implementation-defined behavior and then we do an optimization that is
not legal.  It appears that the language from that document

"When casting from pointer to integer and back again, the resulting
pointer must reference the same object as the original pointer, otherwise
the behavior is undefined.  That is, one may not use integer arithmetic to
avoid the undefined behavior of pointer arithmetic as proscribed in 6.5.6/8."

is intended to be an "out".  But it seems to me that it doesn't work.  We
are defining P2I(p) to preserve bits, and we are defining I2P(i) to
preserve bits.  These two definitions, it seems to me, nail down the
definition of I2P(P2I(p)+offset).  It simply is not mathematically
consistent to define two functions rigorously (three, counting the
addition, which does not overflow) and then claim that their composition
is undefined.  That's nonsense, so Linus is right.
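To make the composition concrete, here is a minimal sketch.  P2I and I2P
are the names used above; the function bodies, offset_via_integer and
composition_demo are illustrative only, and uintptr_t is assumed to be
the "same number of bits" integer type.  The offset here stays inside one
array, which is the uncontroversial case; the disputed case is the same
expression with an offset that leaves the original object.

```c
#include <assert.h>
#include <stdint.h>

/* Pointer-to-integer and integer-to-pointer casts.  Under the
   documented behavior, each one preserves the bits. */
static uintptr_t P2I(const char *p) { return (uintptr_t)p; }
static char *I2P(uintptr_t i) { return (char *)i; }

/* The composition discussed in the text: I2P(P2I(p) + offset). */
static char *offset_via_integer(char *p, uintptr_t offset)
{
    return I2P(P2I(p) + offset);
}

/* Check that the composition agrees with ordinary pointer arithmetic
   inside a single array, then read through the resulting pointer. */
static char composition_demo(void)
{
    char buf[8] = "abcdefg";
    char *q = offset_via_integer(buf, 3);

    assert(q == buf + 3);
    return *q;
}
```

If both casts preserve bits and the addition does not overflow, the value
of q is fully determined by the three definitions, which is the point of
the argument above.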

Now, the C standard does allow us to assume no aliasing as far as
dereferencing goes: the compiler may assume that a pointer is used only
to access data inside the object it points to, so it does not have to
assume that a store through the pointer can kill the definitions of
all globals.

What this means, it seems to me, is the following: we have some pointer
p, which initially points to an array.  We do further operations on it.
We are allowed to assume that p points somewhere in the array, or one
past the end.  Now we cast it to an integer.  This is an
implementation-defined operation, yes, but we have defined it.  We
could still, at this point, make some assumptions about the value of
the expression
for purposes of range propagation analysis.  Now we add an integer
expression to it.  It is now INVALID to assume that this integer
expression corresponds to a pointer within the object!


