Question on strict aliasing in C.

Tue Jul 5 08:10:00 GMT 2011

On 04/07/2011 18:57, Georg-Johann Lay wrote:
> Hi,
>
> this it not a question on GCC, so I apologize for asking a question on
> C strict aliasing rules here. As I know that some people reading this
> list are much more familiar with C standard than I am, allow me to ask
> that question, anyway.
>
> Suppose the following C code that tries to implement the standard
> copysign function, i.e. copy the sign of y into x and return that
> value. double be 64 bits wide:
>
>
> #define MASK ((unsigned short) 0x8000)
>
> double copysign (double x, double y)
> {
>      unsigned short * const px = (unsigned short*)(char*)&x + 3;
>      unsigned short * const py = (unsigned short*)(char*)&y + 3;
>
>      *px = *px&  ~MASK | *py&  MASK;
>      return x;
> }
>
>
> While I say that this code is not correct because it breaks C's strict
> aliasing rules (e.g C89/90, Chapter 6.3; C98/99, Chapter 6.5, Clause
> 7), some other person very well familiar with the standard claims that
> is correct and no problem.
>
> So I want to reassure me if the code is ok or not.
>
> I am aware of gcc's -f[no]-strict-aliasing switch, but that's rather
> technical detail to cope with such presumably incorrect code.
>
> Thanks,
>
> Johann
>
> BTW, gcc does just what I would expect: it returns x unchanged and
> that is not an "optimizer bug".
>

Does it get compiled to an empty function?

Anyway, you are correct that strict aliasing means that this function is 
buggy - the compiler can assume that a "double" such as x or y cannot 
share the same address as a "short" (or "unsigned short"), thus the 
pointer "px" cannot point to x, thus x is not changed by the function, 
and can be returned unchanged.

The extra "(char*)" in the cast may be an attempt to work around 
aliasing, but it won't work.

In other words, the function is buggy and unclear - never a good 
combination.  And for the 8-bit AVR (I'm guessing that's the target you 
are interested in...), it is inefficient as well.

The code assumes 8-byte doubles, 16-bit shorts and little-endian 
ordering.  Obviously that's a common combination, but it's not the only 
possibility - and IIRC the AVR uses 32-bit doubles at the moment (and 
some embedded targets allow 32-bit doubles as an option for efficiency).

It would be /far/ better to stick to char pointers - these may alias 
anything, and are more efficient on an 8-bit cpu (without being /too/ 
bad on most other targets).  The alternative would be to use union types 
to do the casting - that would make more sense on 32-bit RISC targets 
without direct byte operations (such as earlier ARMs).

double copysign (double x, double y)
{
      unsigned char * const px = (unsigned char*)(&x + 1) - 1;
      const unsigned char * const py = (unsigned char*)(&y + 1) - 1;

      *px = (*px & ~0x80) | (*py & 0x80);
      return x;
}

This should make px and py point to the last bytes of x and y, 
regardless of the size of a double.  I've added a couple more brackets, 
and removed the MASK macro - that's just wanton name-space pollution.

It's still little-endian only, of course.