This is the mail archive of the gcc at gcc dot gnu dot org
mailing list for the GCC project.
Re: Questions about C as used/implemented in practice
- From: Joseph Myers <joseph at codesourcery dot com>
- To: Peter Sewell <Peter dot Sewell at cl dot cam dot ac dot uk>
- Cc: "gcc at gcc dot gnu dot org" <gcc at gcc dot gnu dot org>
- Date: Sat, 25 Apr 2015 21:42:01 +0000
- Subject: Re: Questions about C as used/implemented in practice
- Authentication-results: sourceware.org; auth=none
- References: <CAHWkzRS5Pfedgre-b93g1yfRT4CeMkEkm+qzok9DFwYGVVZSeg at mail dot gmail dot com>
On Fri, 17 Apr 2015, Peter Sewell wrote:
> [1/15] How predictable are reads from padding bytes?
> If you zero all bytes of a struct and then write some of its members, do
> reads of the padding return zero? (e.g. for a bytewise CAS or hash of
> the struct, or to know that no security-relevant data has leaked into
The padding may not be zero (both in practice, and as specified by C11
6.2.6.1#6). A plausible sequence of optimizations is to apply SRA (scalar
replacement of aggregates), replacing the memset with a sequence of member
assignments and discarding the assignments to padding in the process. To
avoid leaks, or to allow hashing etc., padding should be declared as
explicitly named members.
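As a rough illustration of explicitly named padding, here is a minimal sketch. The type `struct record` and the helpers `make_record` and `hash_bytes` are hypothetical names, not from the original message, and the layout assumes the usual 1-byte `uint8_t` and 4-byte-aligned `uint32_t`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical record type: the bytes that would otherwise be unnamed
   padding after `tag` are declared as an ordinary member. */
struct record {
    uint8_t  tag;
    uint8_t  pad[3];   /* explicitly named padding */
    uint32_t value;
};

/* Assumption: with these member types no unnamed padding remains. */
static_assert(sizeof(struct record) == 8, "unexpected unnamed padding");

/* 32-bit FNV-1a hash over an object representation. */
static uint32_t hash_bytes(const void *p, size_t n)
{
    const unsigned char *b = p;
    uint32_t h = 2166136261u;
    for (size_t i = 0; i < n; i++) {
        h ^= b[i];
        h *= 16777619u;
    }
    return h;
}

/* Every byte of the result is a named member with a known value. */
static struct record make_record(uint8_t tag, uint32_t value)
{
    struct record r = { .tag = tag, .pad = {0}, .value = value };
    return r;
}
```

Because `pad` is a member like any other, its contents are written by the program rather than left unspecified, so a bytewise hash or comparison of two such records sees only member bytes.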
> [2/15] Uninitialised values
> Is reading an uninitialised variable or struct member (with a current
> mainstream compiler):
> (This might either be due to a bug or be intentional, e.g. when copying
> a partially initialised struct, or to output, hash, or set some bits of
> a value that may have been partially initialised.)
Going to give arbitrary, unstable values (that is, the variable assigned
from the uninitialised variable itself acts as uninitialised and having no
consistent value). (Quite possibly subsequent transformations will have
the effect of undefined behavior.)
Inconsistency of observed values is an inevitable consequence of
transformations PHI (undefined, X) -> X (useful in practice for programs
that don't actually use uninitialised variables, but where the compiler
can't see that).
> [3/15] Can one use pointer arithmetic between separately allocated C
> If you calculate an offset between two separately allocated C memory
> objects (e.g. malloc'd regions or global or local variables) by pointer
> subtraction, can you make a usable pointer to the second by adding the
> offset to the address of the first?
This is not safe in practice even if the alignment is sufficient (and if
the alignment of the type is less than its size, obviously such a
subtraction can't possibly work even with a naive compiler).
> [4/15] Is pointer equality sensitive to their original allocation sites?
> For two pointers derived from the addresses of two separate allocations,
> will equality testing (with ==) of them just compare their runtime
> values, or might it take their original allocations into account and
> assume that they do not alias, even if they happen to have the same
> runtime value? (for current mainstream compilers)
It is not safe to assume that equality has a stable result in such cases
(either in practice, or in my view of the standard as discussed in bug
> [5/15] Can pointer values be copied indirectly?
> Can you make a usable copy of a pointer by copying its representation
> bytes with code that indirectly computes the identity function on them,
> e.g. writing the pointer value to a file and then reading it back, and
> using compression or encryption on the way?
Yes, it is valid to copy any object that way (of course, the original
pointer must still be valid at the time it is read back in).
It is not, however, valid or safe to manufacture a pointer value out of
thin air by, for example, generating random bytes and seeing if the
representation happens to compare equal to that of a pointer. See DR#260.
Practical safety may depend on whether the compiler can see through how
the pointer representation was generated.
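A minimal sketch of the valid case, copying a pointer through its representation bytes. The helper `roundtrip_pointer` and the XOR "encryption" are illustrative assumptions standing in for a file or cipher round trip.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies a pointer by passing its representation
   bytes through an XOR encode/decode pair, which together compute the
   identity function on those bytes. */
static int *roundtrip_pointer(int *p, unsigned char key)
{
    unsigned char buf[sizeof p];
    int *q;

    memcpy(buf, &p, sizeof p);            /* pointer -> bytes */
    for (size_t i = 0; i < sizeof buf; i++)
        buf[i] ^= key;                    /* "encrypt" */
    for (size_t i = 0; i < sizeof buf; i++)
        buf[i] ^= key;                    /* "decrypt" */
    memcpy(&q, buf, sizeof q);            /* bytes -> pointer */

    /* Sanity check: the representation came back unchanged. */
    assert(memcmp(&q, &p, sizeof p) == 0);
    return q;
}
```

The bytes here originate from a real pointer to a live object; that is what distinguishes this from the manufactured-out-of-thin-air case.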
> [6/15] Pointer comparison at different types
> Can one do == comparison between pointers to objects of different types
> (e.g. pointers to int, float, and different struct types)?
Such a comparison violates the constraints on equality operators (C11
6.5.9#2). If you use conversions to compatible types or pointers to void,
it can only be expected to be safe if you restrict yourself to cases where
6.3.2.3 defines the value resulting from the conversion (aliasing rules
are based on the limitations on when pointer conversions are defined, not
just on 6.5#7, and comparisons can get optimised in practice based on
> [7/15] Pointer comparison across different allocations
> Can one do < comparison between pointers to separately allocated
This is likely to work in practice (for e.g. implementing functions like
memmove) although not permitted by ISO C.
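A sketch of the memmove-style use mentioned above. The function name `copy_overlapping` is hypothetical. The `d < s` comparison is only defined by ISO C when both pointers point into the same object; for separately allocated objects it relies on the in-practice behaviour described here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical memmove-like copy that picks the copy direction with a
   relational comparison of the two pointers. */
static void *copy_overlapping(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    assert(n == 0 || (dst != NULL && src != NULL));

    if (d < s) {
        /* Destination starts below source: copy forwards. */
        for (size_t i = 0; i < n; i++)
            d[i] = s[i];
    } else {
        /* Destination starts at or above source: copy backwards. */
        for (size_t i = n; i > 0; i--)
            d[i - 1] = s[i - 1];
    }
    return dst;
}
```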
> [8/15] Pointer values after lifetime end
> Can you inspect (e.g. by comparing with ==) the value of a pointer to an
> object after the object itself has been free'd or its scope has ended?
Such a comparison may not give meaningful or consistent results (although
the consequences are likely to be bounded in practice).
> [9/15] Pointer arithmetic
> Can you (transiently) construct an out-of-bounds pointer value (e.g.
> before the beginning of an array, or more than one-past its end) by
> pointer arithmetic, so long as later arithmetic makes it in-bounds
> before it is used to access memory?
This is not safe; compilers may optimise based on pointers being within
bounds. In some cases, it's possible such code might not even link,
depending on the offsets allowed in any relocations that get used in the
> [10/15] Pointer casts
> Given two structure types that have the same initial members, can you
> use a pointer of one type to access the initial members of a value of the
This is not safe in practice (unless a union is visibly used as described
> [11/15] Using unsigned char arrays
> Can an unsigned character array be used (in the same way as a malloc'd
> region) to hold values of other types?
No, this is not safe (if it's visible to the compiler that the memory in
question has unsigned char as its declared type).
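One alternative that avoids accessing the array through a pointer to another type is to copy values in and out with memcpy. This is a sketch of that approach and not something proposed in the original message; `storage`, `store_int` and `load_int` are hypothetical names.

```c
#include <assert.h>
#include <string.h>

/* Memory whose declared type is unsigned char. */
static unsigned char storage[16];

static_assert(sizeof(int) <= sizeof storage, "buffer too small for int");

/* Copy the bytes of an int into the array instead of writing
   through an `int *` that points into it. */
static void store_int(int v)
{
    memcpy(storage, &v, sizeof v);
}

/* Copy the bytes back out into a real int object. */
static int load_int(void)
{
    int v;
    memcpy(&v, storage, sizeof v);
    return v;
}
```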
> [12/15] Null pointers from non-constant expressions
> Can you make a null pointer by casting from an expression that isn't a
> constant but that evaluates to 0?
In practice this is safe with GCC (as a consequence of casting between
pointers and integers working), although not guaranteed by ISO C.
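A small sketch of the case in question. The function name is hypothetical, and the assertion reflects the GCC behaviour described above, not an ISO C guarantee.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical demo: convert an integer variable holding 0 to a pointer.
   `zero` is not a constant expression, so this is not a null pointer
   constant in the ISO C sense. */
static void *null_from_variable(void)
{
    uintptr_t zero = 0;
    void *p = (void *)zero;

    assert(p == NULL);   /* expected with GCC; not guaranteed by ISO C */
    return p;
}
```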
> [13/15] Null pointer representations
> Can null pointers be assumed to be represented with 0?
For all targets supported by GCC, yes.
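A sketch of what this permits on those targets: zeroing a structure bytewise leaves its pointer members null. `struct node` and `zero_node` are hypothetical names, and the result relies on the all-bits-zero null representation stated above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical list node with a pointer member. */
struct node {
    struct node *next;
    int value;
};

/* Zero every byte of the node. On targets where null pointers are
   represented as all-bits-zero, `next` then compares equal to NULL. */
static void zero_node(struct node *n)
{
    assert(n != NULL);
    memset(n, 0, sizeof *n);
}
```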
> [14/15] Overlarge representation reads
> Can one read the byte representation of a struct as aligned words
> without regard for the fact that its extent might not include all of the
> last word?
In practice this is safe with GCC except for possibly generating errors
with sanitizers, valgrind etc. (but should be avoided except in special
cases such as vectorized string operations).
> [15/15] Union type punning
> When is type punning - writing one union member and then reading it as a
> different member, thereby reinterpreting its representation bytes -
> guaranteed to work (without confusing the compiler analysis and
> optimisation passes)?
It should work in all cases, though in practice internal compiler errors
have occasionally been known to occur for some of the less likely cases if
they result in things the back end didn't expect to see, e.g.
reinterpreting a pointer to a string constant as a floating-point number.
This was defined as a GCC extension even before C99 TC3 added a footnote
(non-normative) describing type punning.
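A minimal sketch of union type punning as discussed above. The names `union float_bits` and `float_to_bits` are hypothetical, and the example assumes `float` is a 32-bit IEEE 754 binary32 value.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: float and uint32_t have the same size. */
static_assert(sizeof(float) == sizeof(uint32_t), "float is not 32 bits");

/* Hypothetical union used to reinterpret the bytes of a float. */
union float_bits {
    float    f;
    uint32_t u;
};

/* Write one member, read the other: the representation bytes of `f`
   are reinterpreted as an unsigned 32-bit integer. */
static uint32_t float_to_bits(float f)
{
    union float_bits b;
    b.f = f;
    return b.u;
}
```

Under the IEEE 754 assumption, `float_to_bits(1.0f)` yields `0x3f800000`.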
Joseph S. Myers