Bug 88469 - [7/8 regression] AAPCS/AAPCS64 - Struct with 64-bit bitfield (128-bit on AArch64) may be passed in wrong registers
Summary: [7/8 regression] AAPCS/AAPCS64 - Struct with 64-bit bitfield (128-bit on AArc...
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: target (show other bugs)
Version: 8.2.0
: P2 normal
Target Milestone: 8.4
Assignee: Not yet assigned to anyone
URL:
Keywords: ABI, wrong-code
Depends on:
Blocks:
 
Reported: 2018-12-12 16:59 UTC by Stefan Ring
Modified: 2019-11-14 08:08 UTC (History)
0 users

See Also:
Host:
Target: arm aarch64
Build:
Known to work: 4.6.3
Known to fail: 6.5.0, 7.4.1, 8.2.1
Last reconfirmed: 2018-12-12 00:00:00


Attachments
Preprocessed sample (169.59 KB, application/x-xz)
2018-12-12 23:45 UTC, Stefan Ring
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Stefan Ring 2018-12-12 16:59:16 UTC
The compiler generates unaligned stack accesses for its own code, which causes it to trap on armv5. The disassembly of the offending function looks like this:

(configure arguments: --build=armv5tel-unknown-linux-gnueabi --prefix=$HOME/gcc8 --enable-languages=c,c++ --with-arch=armv5te --with-mode=arm --disable-nls)

00398e64 <_ZN11cgraph_node11create_edgeEPS_P5gcall13profile_count>:
  398e64:       e24dd008        sub     sp, sp, #8
  398e68:       e92d41f0        push    {r4, r5, r6, r7, r8, lr}
  398e6c:       e24dd018        sub     sp, sp, #24
  398e70:       e58d3034        str     r3, [sp, #52]   ; 0x34
  398e74:       e1cd63d4        ldrd    r6, [sp, #52]   ; 0x34
  398e78:       e28d3018        add     r3, sp, #24
  398e7c:       e1a08000        mov     r8, r0
  398e80:       e1a05001        mov     r5, r1
  398e84:       e59fe068        ldr     lr, [pc, #104]  ; 398ef4 <_ZN11cgraph_node11create_edgeEPS_P5gcall13profile_count+0x90>
  398e88:       e1cd61f0        strd    r6, [sp, #16]
  398e8c:       e9130003        ldmdb   r3, {r0, r1}
  398e90:       e3a0c000        mov     ip, #0
  398e94:       e1a03002        mov     r3, r2
  398e98:       e88d0003        stm     sp, {r0, r1}
  398e9c:       e1a02005        mov     r2, r5
  398ea0:       e59e0000        ldr     r0, [lr]
  398ea4:       e1a01008        mov     r1, r8
  398ea8:       e58dc008        str     ip, [sp, #8]
  398eac:       ebffff3c        bl      398ba4 <_ZN12symbol_table11create_edgeEP11cgraph_nodeS1_P5gcall13profile_countb>
  398eb0:       e1a04000        mov     r4, r0
  398eb4:       eb0709d8        bl      55b61c <_Z24initialize_inline_failedP11cgraph_edge>
  398eb8:       e5953044        ldr     r3, [r5, #68]   ; 0x44
  398ebc:       e5843014        str     r3, [r4, #20]
  398ec0:       e3530000        cmp     r3, #0
  398ec4:       15834010        strne   r4, [r3, #16]
  398ec8:       e5983040        ldr     r3, [r8, #64]   ; 0x40
  398ecc:       e1a00004        mov     r0, r4
  398ed0:       e3530000        cmp     r3, #0
  398ed4:       e584301c        str     r3, [r4, #28]
  398ed8:       15834018        strne   r4, [r3, #24]
  398edc:       e5884040        str     r4, [r8, #64]   ; 0x40
  398ee0:       e5854044        str     r4, [r5, #68]   ; 0x44
  398ee4:       e28dd018        add     sp, sp, #24
  398ee8:       e8bd41f0        pop     {r4, r5, r6, r7, r8, lr}
  398eec:       e28dd008        add     sp, sp, #8
  398ef0:       e12fff1e        bx      lr
  398ef4:       012f41a0        teqeq   pc, r0, lsr #3

The ldrd at 398e74 is the problem. To be honest, I don't fully understand understand this code. profile_count seems to be a struct with a 64 bit value as its first element. From my understanding of AAPCS, this should not be stored in r3, because it is not an even register number. But be that as it may, this seems to store the first part of the 64 bit counter into the stack so that it can then be loaded into r6/r7 together with its upper part. This can never be properly aligned.

For comparison, the same function in an armv7 hardfloat build looks like this:

(configure arguments: --build=arm-linux-gnueabihf --prefix=$HOME/gcc8 --enable-languages=c,c++ --with-arch=armv7-a --with-fpu=vfpv3-d16 --with-mode=arm --with-float=hard --disable-nls --enable-multilib)

003c21e0 <_ZN11cgraph_node11create_edgeEPS_P5gcall13profile_count>:
  3c21e0:       e24dd008        sub     sp, sp, #8
  3c21e4:       e309c130        movw    ip, #37168      ; 0x9130
  3c21e8:       e340c133        movt    ip, #307        ; 0x133
  3c21ec:       e92d4370        push    {r4, r5, r6, r8, r9, lr}
  3c21f0:       e24dd018        sub     sp, sp, #24
  3c21f4:       e1a05001        mov     r5, r1
  3c21f8:       e1a06000        mov     r6, r0
  3c21fc:       e58d3034        str     r3, [sp, #52]   ; 0x34
  3c2200:       e1a03002        mov     r3, r2
  3c2204:       e1cd83d4        ldrd    r8, [sp, #52]   ; 0x34
  3c2208:       e1a02001        mov     r2, r1
  3c220c:       e28d1018        add     r1, sp, #24
  3c2210:       e3a0e000        mov     lr, #0
  3c2214:       e1cd81f0        strd    r8, [sp, #16]
  3c2218:       e9110003        ldmdb   r1, {r0, r1}
  3c221c:       e58de008        str     lr, [sp, #8]
  3c2220:       e88d0003        stm     sp, {r0, r1}
  3c2224:       e1a01006        mov     r1, r6
  3c2228:       e59c0000        ldr     r0, [ip]
  3c222c:       ebffff43        bl      3c1f40 <_ZN12symbol_table11create_edgeEP11cgraph_nodeS1_P5gcall13profile_countb>
  3c2230:       e1a04000        mov     r4, r0
  3c2234:       eb071320        bl      586ebc <_Z24initialize_inline_failedP11cgraph_edge>
  3c2238:       e5953044        ldr     r3, [r5, #68]   ; 0x44
  3c223c:       e1a00004        mov     r0, r4
  3c2240:       e3530000        cmp     r3, #0
  3c2244:       e5843014        str     r3, [r4, #20]
  3c2248:       15834010        strne   r4, [r3, #16]
  3c224c:       e5963040        ldr     r3, [r6, #64]   ; 0x40
  3c2250:       e3530000        cmp     r3, #0
  3c2254:       e584301c        str     r3, [r4, #28]
  3c2258:       15834018        strne   r4, [r3, #24]
  3c225c:       e5864040        str     r4, [r6, #64]   ; 0x40
  3c2260:       e5854044        str     r4, [r5, #68]   ; 0x44
  3c2264:       e28dd018        add     sp, sp, #24
  3c2268:       e8bd4370        pop     {r4, r5, r6, r8, r9, lr}
  3c226c:       e28dd008        add     sp, sp, #8
  3c2270:       e12fff1e        bx      lr

The misaligned ldrd is still there, just arranged in a different way. It may be ok for armv7 (although this would surprise me), but it definitely is not for armv5.

This looks very similar to #86555, but it happens in gcc's own code, so it may not be appropriate to blame the C library in this case.
Comment 1 Richard Earnshaw 2018-12-12 17:05:31 UTC
Please upload a pre-processed testcase
Comment 2 Stefan Ring 2018-12-12 23:45:40 UTC
Created attachment 45222 [details]
Preprocessed sample

g++ -c -O2 -x c++ prep

produces the shown code.

$ g++ -v
Using built-in specs.
COLLECT_GCC=/home/sr/gcc8/bin/g++
COLLECT_LTO_WRAPPER=/home/sr/gcc8/libexec/gcc/armv5tel-unknown-linux-gnueabi/8.2.0/lto-wrapper
Target: armv5tel-unknown-linux-gnueabi
Configured with: ../gcc-8.2.0/configure --build=armv5tel-unknown-linux-gnueabi --prefix=/home/sr/gcc8 --enable-languages=c,c++ --with-arch=armv5te --with-mode=arm --disable-nls
Thread model: posix
gcc version 8.2.0 (GCC)
Comment 3 Richard Earnshaw 2018-12-13 16:22:24 UTC
Sigh, this is a PCS layout bug for argument marshalling.  I think it can only happen when a struct inherits 64-bit alignment from a bit-field.  In this case, the alignment on the bit-field is 1, but it's real alignment for the purposes of type-layout is derived from the bit-field's type.

I have a simple fix, but I need to think a bit more about how we might need to warn about this change occurring - made more complicated by the fact that we already have code to warn about a previous ABI bug in this area.
Comment 4 Richard Earnshaw 2018-12-13 17:20:22 UTC
Simpler test-case.

struct x
{
  long long a : 61;
};

void bar (int, struct x);

int foo (int a, int b, int c, struct x d)
{
  bar (a, d);
  return 2;
}

This does not seem to generate ldrd, but does show that 'd' is passed partially in r3 and partially on the stack, when it should be entirely on the stack.

Correct code should be something like:

        push    {r4, lr}
        ldrd    r2, [sp, #8]
        bl      bar
        mov     r0, #2
        pop     {r4, pc}

But we currently generate:

        sub     sp, sp, #8
        push    {r4, lr}
        ldr     r2, [sp, #16]
        mov     r1, r3
        str     r3, [sp, #12]
        bl      bar
        pop     {r4, lr}
        mov     r0, #2
        add     sp, sp, #8
        bx      lr
Comment 5 Richard Earnshaw 2018-12-17 16:21:37 UTC
Tentative patch posted here.

https://gcc.gnu.org/ml/gcc-patches/2018-12/msg01231.html
Comment 6 Richard Earnshaw 2019-01-22 14:03:54 UTC
Author: rearnsha
Date: Tue Jan 22 14:03:22 2019
New Revision: 268151

URL: https://gcc.gnu.org/viewcvs?rev=268151&root=gcc&view=rev
Log:
[arm] PR target/88469 fix incorrect argument passing with 64-bit bitfields

Unfortunately another PCS bug has come to light with the layout of
structs whose alignment is dominated by a 64-bit bitfield element.
Such fields in the type list appear to have alignment 1, but in
reality, for the purposes of alignment of the underlying structure,
the alignment is derived from the underlying bitfield's type.  We've
been getting this wrong since support for over-aligned record types
was added several releases back.  Worse still, the existing code may
generate unaligned memory accesses that may fault on some versions of
the architecture.

I've taken the opportunity to add a few more tests that check the
passing arguments with overalignment in the PCS.  Looking through the
existing tests it looked like they were really only checking
self-consistency and not the precise location of the arguments.

PR target/88469

gcc:
	* config/arm/arm.c (arm_needs_doubleword_align): Return 2 if a record's
	alignment is dominated by a bitfield with 64-bit aligned base type.
	(arm_function_arg): Emit a warning if the alignment has changed since
	earlier GCC releases.
	(arm_function_arg_boundary): Likewise.
	(arm_setup_incoming_varargs): Likewise.

gcc/testsuite:
	* gcc.target/arm/aapcs/bitfield1.c: New test.
	* gcc.target/arm/aapcs/overalign_rec1.c: New test.
	* gcc.target/arm/aapcs/overalign_rec2.c: New test.
	* gcc.target/arm/aapcs/overalign_rec3.c: New test.

Added:
    trunk/gcc/testsuite/gcc.target/arm/aapcs/bitfield1.c
    trunk/gcc/testsuite/gcc.target/arm/aapcs/overalign_rec1.c
    trunk/gcc/testsuite/gcc.target/arm/aapcs/overalign_rec2.c
    trunk/gcc/testsuite/gcc.target/arm/aapcs/overalign_rec3.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/arm/arm.c
    trunk/gcc/testsuite/ChangeLog
Comment 7 Richard Earnshaw 2019-01-22 14:16:11 UTC
Fixed on trunk.  Still need mitigation for gcc-7/8 and to deal with boostrapping gcc-9 with gcc-6/7/8.
Comment 8 Richard Earnshaw 2019-01-22 17:56:34 UTC
Author: rearnsha
Date: Tue Jan 22 17:56:02 2019
New Revision: 268160

URL: https://gcc.gnu.org/viewcvs?rev=268160&root=gcc&view=rev
Log:
[arm] Further fixes for PR88469

A bitfield that is exactly the same size as an integral type and
naturally aligned will have DECL_BIT_FIELD cleared.  So we need to
check DECL_BIT_FIELD_TYPE to be sure whether or not the underlying
type was declared with a bitfield declaration.

I've also added a test for bitfields that are based on overaligned types.

	PR target/88469
gcc:
	* config/arm/arm.c (arm_needs_double_word_align): Check
	DECL_BIT_FIELD_TYPE.

gcc/testsuite:
	* gcc.target/arm/aapcs/bitfield2.c: New test.
	* gcc.target/arm/aapcs/bitfield3.c: New test.


Added:
    trunk/gcc/testsuite/gcc.target/arm/aapcs/bitfield2.c
    trunk/gcc/testsuite/gcc.target/arm/aapcs/bitfield3.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/arm/arm.c
    trunk/gcc/testsuite/ChangeLog
Comment 9 Richard Earnshaw 2019-01-24 16:07:06 UTC
Author: rearnsha
Date: Thu Jan 24 16:06:34 2019
New Revision: 268240

URL: https://gcc.gnu.org/viewcvs?rev=268240&root=gcc&view=rev
Log:
Mitigation for PR target/88469 on arm-based systems bootstrapping with gcc-6/7/8

This patch, for gcc 8/9 is a mitigation patch for PR target/88469
where gcc-6/7/8 miscompile a structure whose alignment is dominated by
a 64-bit bitfield member.  Since the PCS rules for such a type must
ignore any overalignment of the base type we cannot address this by
simply adding a larger alignment to the class.  We can, however, force
the alignment of the bit-field itself and GCC will handle that as
desired.

	PR target/88469
	* profile-count.h (profile_count): On ARM systems using GCC 6/7/8
	force the alignment of m_val.


Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/profile-count.h
Comment 10 Richard Earnshaw 2019-01-24 16:10:38 UTC
Author: rearnsha
Date: Thu Jan 24 16:10:06 2019
New Revision: 268241

URL: https://gcc.gnu.org/viewcvs?rev=268241&root=gcc&view=rev
Log:
Mitigation for PR target/88469 on arm-based systems bootstrapping with gcc-6/7/8

This patch, for gcc 8/9 is a mitigation patch for PR target/88469
where gcc-6/7/8 miscompile a structure whose alignment is dominated by
a 64-bit bitfield member.  Since the PCS rules for such a type must
ignore any overalignment of the base type we cannot address this by
simply adding a larger alignment to the class.  We can, however, force
the alignment of the bit-field itself and GCC will handle that as
desired.

	PR target/88469
	* profile-count.h (profile_count): On ARM systems using GCC 6/7/8
	force the alignment of m_val.


Modified:
    branches/gcc-8-branch/gcc/ChangeLog
    branches/gcc-8-branch/gcc/profile-count.h
Comment 11 Richard Earnshaw 2019-01-25 17:10:08 UTC
Author: rearnsha
Date: Fri Jan 25 17:09:33 2019
New Revision: 268273

URL: https://gcc.gnu.org/viewcvs?rev=268273&root=gcc&view=rev
Log:
This is pretty unlikely in real code, but similar to Arm, the AArch64
ABI has a bug with the handling of 128-bit bit-fields, where if the
bit-field dominates the overall alignment the back-end code may end up
passing the argument correctly.  This is a regression that started in
gcc-6 when the ABI support code was updated to support overaligned
types.  The fix is very similar in concept to the Arm fix.  128-bit
bit-fields are fortunately extremely rare, so I'd be very surprised if
anyone has been bitten by this.

PR target/88469
gcc/
	* config/aarch64/aarch64.c (aarch64_function_arg_alignment): Add new
	argument ABI_BREAK.  Set to true if the calculated alignment has
	changed in gcc-9.  Check bit-fields for their base type alignment.
	(aarch64_layout_arg): Warn if argument passing has changed in gcc-9.
	(aarch64_function_arg_boundary): Likewise.
	(aarch64_gimplify_va_arg_expr): Likewise.

gcc/testsuite/
	* gcc.target/aarch64/aapcs64/test_align-10.c: New test.
	* gcc.target/aarch64/aapcs64/test_align-11.c: New test.
	* gcc.target/aarch64/aapcs64/test_align-12.c: New test.

Added:
    trunk/gcc/testsuite/gcc.target/aarch64/aapcs64/test_align-10.c
    trunk/gcc/testsuite/gcc.target/aarch64/aapcs64/test_align-11.c
    trunk/gcc/testsuite/gcc.target/aarch64/aapcs64/test_align-12.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/aarch64/aarch64.c
    trunk/gcc/testsuite/ChangeLog
Comment 12 Stefan Ring 2019-02-24 08:23:37 UTC
Unfortunately my armv5 device has died in the meantime, so I cannot verify my original use case. The behavior is indeed different on armv7. It does not trap, even for the original misaligned code. And contrary to x86, where the alignment check flag can be changed by user space, this is a privileged operation on arm, so I cannot even selectively enable it.
Comment 13 Richard Earnshaw 2019-02-26 11:00:01 UTC
(In reply to Stefan Ring from comment #12)
> Unfortunately my armv5 device has died in the meantime, so I cannot verify
> my original use case. The behavior is indeed different on armv7. It does not
> trap, even for the original misaligned code. And contrary to x86, where the
> alignment check flag can be changed by user space, this is a privileged
> operation on arm, so I cannot even selectively enable it.

Note that if you have root access on your board you can modify the kernel's behaviour for various unaligned accesses by changing /proc/cpu/alignment (see Documentation/arm/mem_alignment in the kernel sources).  You might want to try setting this to 3 to get the kernel to report (but fix up) any misaligned accesses).
Comment 14 Stefan Ring 2019-02-26 11:53:58 UTC
(In reply to Richard Earnshaw from comment #13)
> Note that if you have root access on your board you can modify the kernel's
> behaviour for various unaligned accesses by changing /proc/cpu/alignment
> (see Documentation/arm/mem_alignment in the kernel sources).  You might want
> to try setting this to 3 to get the kernel to report (but fix up) any
> misaligned accesses).

I know that, but armv7 does not trap at all for misaligned ldrd, at the hardware level. It does trap if it’s not even 32bit-aligned, but that’s a different matter.
Comment 15 Richard Biener 2019-11-14 07:51:44 UTC
The GCC 7 branch is being closed, re-targeting to GCC 8.4.
Comment 16 Stefan Ring 2019-11-14 08:08:34 UTC
This was already fixed by the commits in comments #8-#11.