Question on strict aliasing in C.
David Brown
david@westcontrol.com
Tue Jul 5 08:10:00 GMT 2011
On 04/07/2011 18:57, Georg-Johann Lay wrote:
> Hi,
>
> this it not a question on GCC, so I apologize for asking a question on
> C strict aliasing rules here. As I know that some people reading this
> list are much more familiar with C standard than I am, allow me to ask
> that question, anyway.
>
> Suppose the following C code that tries to implement the standard
> copysign function, i.e. copy the sign of y into x and return that
> value. double be 64 bits wide:
>
>
> #define MASK ((unsigned short) 0x8000)
>
> double copysign (double x, double y)
> {
> unsigned short * const px = (unsigned short*)(char*)&x + 3;
> unsigned short * const py = (unsigned short*)(char*)&y + 3;
>
> *px = *px& ~MASK | *py& MASK;
> return x;
> }
>
>
> While I say that this code is not correct because it breaks C's strict
> aliasing rules (e.g C89/90, Chapter 6.3; C98/99, Chapter 6.5, Clause
> 7), some other person very well familiar with the standard claims that
> is correct and no problem.
>
> So I want to reassure me if the code is ok or not.
>
> I am aware of gcc's -f[no]-strict-aliasing switch, but that's rather
> technical detail to cope with such presumably incorrect code.
>
> Thanks,
>
> Johann
>
> BTW, gcc does just what I would expect: it returns x unchanged and
> that is not an "optimizer bug".
>
Does it get compiled to an empty function?
Anyway, you are correct that strict aliasing means that this function is
buggy - the compiler can assume that a "double" such as x or y cannot
share the same address as a "short" (or "unsigned short"), thus the
pointer "px" cannot point to x, thus x is not changed by the function,
and can be returned unchanged.
The extra "(char*)" in the cast may be an attempt to work around
aliasing, but it won't work.
In other words, the function is buggy and unclear - never a good
combination. And for the 8-bit AVR (I'm guessing that's the target you
are interested in...), it is inefficient as well.
The code assumes 8-byte doubles, 16-bit shorts and little-endian
ordering. Obviously that's a common combination, but it's not the only
possibility - and IIRC the AVR uses 32-bit doubles at the moment (and
some embedded targets allow 32-bit doubles as an option for efficiency).
It would be /far/ better to stick to char pointers - these may alias
anything, and are more efficient on an 8-bit cpu (without being /too/
bad on most other targets). The alternative would be to use union types
to do the casting - that would make more sense on 32-bit RISC targets
without direct byte operations (such as earlier ARMs).
double copysign (double x, double y)
{
unsigned char * const px = (unsigned char*)(&x + 1) - 1;
const unsigned char * const py = (unsigned char*)(&y + 1) - 1;
*px = (*px & ~0x80) | (*py & 0x80);
return x;
}
This should make px and py point to the last bytes of x and y,
regardless of the size of a double. I've added a couple more brackets,
and removed the MASK macro - that's just wanton name-space pollution.
It's still little-endian only, of course.
More information about the Gcc-help
mailing list