This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.



overlooked implementation


A friend of mine wanted to use as little space as possible for a
network protocol and ran across bit-fields.  He asked me some
questions, and I found some implementation errors that I consider quite
serious.  Consider the following:

#include <iostream>

using std::cout;
using std::endl;

typedef unsigned char u_char;
typedef unsigned int u_int;

typedef struct bitmask_rep_
{
        u_int v1:4;     // CRITICAL PORTION: 'u_int v1:3;'
        u_int v2:4;
        bool b;
} bitmaskrep_;

typedef union
{
        u_char asChar[ sizeof( bitmask_rep_ ) ];
        bitmask_rep_ asBit;
} bitmask;

int main( int argc, char *argv[] )
{
        bitmask some_bits;

        some_bits.asBit.v1 = 7; // 0111
        some_bits.asBit.v2 = 14;        // 1110

        cout << sizeof( bitmask ) << endl;
//      cout << some_bits.asChar[ 0 ] << endl;  // prints '~'
        cout << some_bits.asChar << endl;       // prints '~'
        // notice: they are put in memory 'v1' followed by
        //      'v2' and packed to the left
        // v1 >< v2 => 0111 >< 1110 => 126 => '~'

        return 0;
}

This may appear fine at first, but there are four problems.
1) the size of 'bitmask_rep_' is 4 bytes
2) 'bitmask_rep_' members are put into memory in order listed
3) the unused memory of 'bitmask_rep_' is inconsistently padded with 0's
4) 'bitmask_rep_' members should be right-aligned, not left-aligned

The first two problems are not so serious, but the last two are extremely
serious (I will explain).
1) If I'm not mistaken, bit-fields are to be byte aligned, not word
aligned...  I understand the reasoning behind each, and so long as it's not
specified in the standard, then more power to ya.
2) Also, I think bit-field members are supposed to be put into memory in
reverse order with respect to the order listed within the
'struct'.  Again, if the standard has no specifications, more power to ya.
3) if 'bitmask_rep_' contains
{
	u_char v1:4;
	u_char v2:4;
	u_char v3;
};
and I try to print 'v1' and 'v2' as a "character string" using the
'asChar' member of the 'union' 'bitmask', I get '~' as expected (see
#4); however, if 'bitmask_rep_' contains
{
	u_char v1:4;
	u_char v2:4;
};
and I try to print 'v1' and 'v2' as a "character string" using the
'asChar' member of the 'union' 'bitmask', I get '~iii'!  The
reason: previously, when 'bitmask_rep_' contained the additional
non-bit-field member, the unused memory was padded with 0's (which
translated to '\0's); however, without the additional non-bit-field
member, the unused memory was not padded with 0's as it should've been!
4) If 'bitmask_rep_' contains
{
	u_char v1:4;
	u_char v2:4;
};
everything works as expected (see #3), but if 'bitmask_rep_' contains
{
	u_char v1:3;
	u_char v2:4;
};
not everything works as expected.  With the former contents, the 'int'
representation of 'v1' and 'v2' (as one byte) is '126' (0111
1110); however, with the latter contents, the 'int' representation of 'v1'
and 'v2' (as one byte) is '252' (1111 1100 or 111 1110 0)!  This tells me
that the bits are left-aligned in memory--they should be right-aligned.

Feel free to send me questions, and please give me a response;
I would like to know your thoughts on this.

With regards,
Mark M. Young
youngmm@hera.wku.edu

