c/5593: GCC miscompiles bitshifts on unsigned struct members when creating a 64-bit value

Byron Stanoszek gandalf@winds.org
Fri Nov 8 20:46:00 GMT 2002


The following reply was made to PR c/5593; it has been noted by GNATS.

From: Byron Stanoszek <gandalf@winds.org>
To: Christian Ehrhardt <ehrhardt@mathematik.uni-ulm.de>
Cc: bangerth@dealii.org, <gcc-bugs@gcc.gnu.org>, <gcc-prs@gcc.gnu.org>,
   <neil@gcc.gnu.org>, <gcc-gnats@gcc.gnu.org>
Subject: Re: c/5593: GCC miscompiles bitshifts on unsigned struct members
 when creating a 64-bit value
Date: Fri, 8 Nov 2002 23:44:04 -0500 (EST)

 On Fri, 8 Nov 2002, Christian Ehrhardt wrote:
 
 > On Tue, Nov 05, 2002 at 04:15:03PM -0000, bangerth@dealii.org wrote:
 > > Synopsis: GCC miscompiles bitshifts on unsigned struct members when creating a 64-bit value
 > > 
 > >     I can confirm this. However, I'm not sure whether what you
 > >     do is specified at all: it all boils down to this function:
 > >     long long equation4(struct field *data)
 > >     {
 > >       return ((long long)data->num << 32)|
 > >               (data->flags << 16)|
 > >               (data->container << 8)|
 > >               data->quantity;
 > >     }
 > >     and that data->flags is a 16-bit integer. I think, shifting
 > >     it by 16 is invoking undefined behavior, and you should
 > >     not be surprised. But then, I'm not a language lawyer and
 > >     leave this to someone else.
 > 
 > b) A type is promoted to int if the whole range can be represented in
 >    an int no matter what the sign of the original type was (6.3.1.1[#2])
 >    This is the culprit!
 
 This does appear to be the culprit. Modifying the function so that we have
 'unsigned short flags=0x8000' or 'unsigned char flags=0x80' has the same effect
 in 'equation 1': the shifted operand is promoted to a signed int.
 
 Both pieces of code behave the same way in a 64-bit environment (e.g. Alpha),
 so I'm pretty much declaring this to be not a bug at all.  Thanks for pointing
 out the C spec.
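
For anyone who hits the same surprise, here is a minimal sketch of a packing routine that sidesteps the promotion to signed int by casting each member to a 64-bit unsigned type before shifting. The struct layout, the name pack_field and the self-check are assumptions made for illustration; the original struct definition is not quoted in this message.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout: the real struct from the PR is not shown in this thread. */
struct field {
    uint32_t num;
    uint16_t flags;
    uint8_t  container;
    uint8_t  quantity;
};

/* Widen every member to uint64_t before shifting, so no operand is ever
 * promoted to a signed int and later sign-extended to 64 bits. */
static uint64_t pack_field(const struct field *data)
{
    return ((uint64_t)data->num << 32) |
           ((uint64_t)data->flags << 16) |
           ((uint64_t)data->container << 8) |
           (uint64_t)data->quantity;
}

/* Self-check using the values discussed in this thread (num = 0x1234,
 * flags = 0x8000): the upper 32 bits must survive the OR. */
static void check_pack_field(void)
{
    struct field f = { 0x1234, 0x8000, 0, 0 };
    assert(pack_field(&f) == 0x123480000000ULL);
}
```

Calling check_pack_field() from a test program should pass silently. Without the casts, the same inputs would be expected to produce the sign-extended value discussed in the quoted analysis.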
 
  -Byron
 
 > c) According to 6.5.7[#2] it is actually unspecified what happens
 >    if an overflow occurs in a left shift of a signed integer. This is
 >    the undefined behaviour invoked here but I think it is clear what
 >    the _right_ behaviour is.
 > 
 > This means (assuming a 16 Bit short and a 32 Bit int) the standard
 > says that the equation below always holds. Actually all the casts on
 > the rhs aren't necessary.
 > 
 > unsigned short a;
 > a << 16 == (int)(((int)a)<<16);
 > 
 > The actual value of the rhs is still unspecified according to the standard
 > if the value of a is greater than 0x7fff. But again I don't think it is
 > unspecified what gcc does in this case.
 > 
 > Now looking at the bitwise or with a 64 Bit operand:
 > a) 6.5.11[#3] states that the usual arithmetic conversions are performed
 >    on the operands of ``|'' before the operator is applied.
 > b) The usual arithmetic conversions defined in 6.3.1.8 state in [#1]
 >    for this case:
 >         Otherwise, if both operands have signed integer types or both
 >         have unsigned integer types, the operand with the type of lesser
 >         integer conversion rank is converted to the type of the operand
 >         with greater rank [which is long long in the case of int and
 >         long long].
 >    and converting a signed integer to a signed integer of another type
 >    is defined in 6.3.1.3[#1]:
 >         When a value with integer type is converted to another integer
 >         type other than _Bool, if the value can be represented by the new
 >         type, it is unchanged.
 > 
 > De facto this means that the operand of the smaller type is sign extended.
 > 
 > This means that we get these implicit casts assuming 64 Bit long long:
 > 
 > long long ll; unsigned short a;
 > ll | (a << 16) == ll | (long long)(int)((int)a << (int) 16)
 > 
 > Now using your values (ll = 0x123400000000, a = 0x8000) in this
 > expression yields:
 >     ll                         | (long long)(int)((int)a << (int) 16)
 > ==  (long long)0x123400000000  | (long long)(int)((int)0x8000 << 16)    (*)
 > ==  (long long)0x123400000000  | (long long)(int)0x80000000             (**)
 > ==  (long long)0x123400000000  | (long long)(-2147483648)
 > ==  (long long)0x123400000000  | (long long)0xffffffff80000000
 > ==  (long long)0xffffffff80000000
 > 
 > The step from (*) to (**) is the only place where undefined behaviour
 > is invoked and I think we all agree that we'd consider it a bug if this
 > calculation did anything else than what I did above.
 > 
 >     regards   Christian
 > 
 > http://gcc.gnu.org/cgi-bin/gnatsweb.pl?cmd=view%20audit-trail&database=gcc&pr=5593
 > 
 > 
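
The last steps of the quoted calculation can be reproduced without relying on the undefined shift. This is a rough sketch, assuming the fixed-width types from <stdint.h>; the function names are made up for illustration. It starts from the already-negative 32-bit value and contrasts the value-preserving conversion of a signed operand (sign extension) with the same bit pattern held in an unsigned operand (zero extension).

```c
#include <assert.h>
#include <stdint.h>

/* OR a 64-bit value with a signed 32-bit operand. Converting int32_t to
 * int64_t preserves the value, which amounts to sign extension. */
static int64_t or_with_signed(int64_t ll, int32_t rhs)
{
    return ll | (int64_t)rhs;
}

/* Same bit pattern, but held in an unsigned 32-bit operand. Converting
 * uint32_t to int64_t also preserves the value, so the upper bits stay 0. */
static int64_t or_with_unsigned(int64_t ll, uint32_t rhs)
{
    return ll | (int64_t)rhs;
}

/* Self-check with the values from the quoted example. */
static void check_extension(void)
{
    /* 0x123400000000 | (long long)(-2147483648): every upper bit becomes 1. */
    assert(or_with_signed(0x123400000000LL, INT32_MIN) == (int64_t)INT32_MIN);
    /* With an unsigned operand the upper half of ll is kept. */
    assert(or_with_unsigned(0x123400000000LL, 0x80000000u) == 0x123480000000LL);
}
```

The signed variant yields -2147483648 as a 64-bit value, i.e. the bit pattern 0xffffffff80000000 from the last line of the quoted calculation.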
 
 -- 
 Byron Stanoszek                         Ph: (330) 644-3059
 Systems Programmer                      Fax: (330) 644-8110
 Commercial Timesharing Inc.             Email: byron@comtime.com
 


