Bug 93209

Summary: What is a bitfield's "underlying type"?
Product: gcc Reporter: John Adriaan <John.Adriaan>
Component: c++Assignee: Not yet assigned to anyone <unassigned>
Status: UNCONFIRMED ---    
Severity: normal CC: webrown.cpp
Priority: P3 Keywords: documentation
Version: 9.2.0   
Target Milestone: ---   
Host: Target:
Build: Known to work:
Known to fail: Last reconfirmed:
Bug Depends on:    
Bug Blocks: 114371    

Description John Adriaan 2020-01-09 09:07:32 UTC
For the following, please assume:
A. `sizeof(bool)` equals `1`;
B. `sizeof(unsigned)` equals `4`;
C. `-fstrict-volatile-bitfields` is ON;
D. The compiler target is _irrelevant_ - both ARM and x86-64 exhibit this.

With the following code:

    struct S {
        bool     b :  1;
        unsigned i :  1;
    //  unsigned   :  7; // Padding > 8 bits
    //  unsigned   : 15; // Padding >16 bits
    }; // S

    static volatile S s;

    int main() {
        s.b = true;
        s.i = 1;
        return sizeof(S);
    } // main()

`main()` always returns 4. It's a (padded) struct, so OK.

Looking at the compiled op-codes:
1. The access to `s.b` is by byte.
2. The access to `s.i` is by byte.

I then un-commented the "Padding > 8 bits" line:
3. The access to `s.b` is by byte.
4. The access to `s.i` is by half-word/WORD.

I then re-commented the above, and un-commented the "Padding >16 bits" line:
5. The access to `s.b` is by byte.
6. The access to `s.i` is by word/DWORD.

Note that in each version, the type of `s.i` has not changed: it's always `unsigned`. The difference is obvious - the compiler chooses the MINIMUM of:
a. The smallest representation of the field's type;
b. The smallest representation of the `struct`.

In other words, the documentation for `-fstrict-volatile-bitfields` that contains the reference to "underlying type" isn't clear whether that's the declared type of the field, or the ultimate (UN-padded) size of the `struct`.

I personally feel that this is a bug: the compiler should heed the declared type of the field, not some derivative of the ultimate `struct` size. This is of course also exacerbated by the current issue with bug 51242.
Comment 1 John Adriaan 2020-01-09 10:18:07 UTC
I've also looked at the `unsigned : 0;` anonymous field. Frankly, it changes nothing. Maybe it should?

    struct S {
        bool     b : 1;
        unsigned i : 1;
        unsigned   : 0;
    }; // S

With the above inserted into the examples given in the original post, there is no change. According to the standard, `: 0` means "align the _next_ field with the next alignment boundary." Maybe this is wrong: perhaps it should read "pad the _current_ `struct` to the next alignment boundary for _this_ field's type."

The above proposal has the advantage that it doesn't change existing code. And in the unlikely scenario of my "dangling `: 0`" proposal, perhaps that's what the developer expected anyway?

So I also tried the following:

    struct S {
        bool     b : 1;
        unsigned i : 1;
        unsigned   : 0;
        unsigned   : 0;
    }; // S

Alas, no change.
Comment 2 John Adriaan 2020-01-09 10:32:37 UTC
Wow: I just realised. The following code for my proposal is problematic:

    struct S {
        bool     b : 1;
        unsigned i : 1;
        unsigned   : 0;
        char     c : 8;
    }; // S

Should `c` be aligned at one byte (its field) or four bytes (the padding field) after the beginning of `S`?

According to the current implementation, it's four bytes - so again, the padding `: 0` operator seems to be the overriding by its own type, not that of the following field.

Also note that removing the `: 0` field pulls `c` back to a (padded) position six bits _after_ `i`
Comment 3 Richard Biener 2020-01-09 11:30:15 UTC
Note the behavior is probably dependent on the ABI.  Mixing different underlying types for the same bit group makes this interesting.

Does it change anything if you make b and i itself volatile (rather than just s)?
Comment 4 John Adriaan 2020-01-09 11:35:02 UTC
@RichardBiener,

I used the different types to prove my point: if they're all `unsigned`, it still happens.

And the ABI merely defines whether `-fstrict-volatile-bitfields` is the default or not (for ARM, it is): I explicitly enabled it in all cases to eliminate this.

Also, I did indeed try changing the example to make the bitfields themselves `volatile` rather than the variable itself (good pick!): no change.
Comment 5 John Adriaan 2020-01-09 12:52:00 UTC
@RichardBiener,

By "it still happens", I meant that accesses to `s.i` still occur as described. Changing `s.b` to be of type `unsigned` changes its accesses to match those of `s.i` - in other words, the type mismatch is not a factor.