With GCC 4.8.1 we encountered a regression in linux 3.4 kernel code on ARM devices. The problematic instruction sequence is below: sub sp, fp, #40 << fp-48 is above sp now ldr r3, [fp, #-60] ldr r0, [fp, #-56] str r6, [r2] ldr r2, [fp, #-48] << clobbered by handler str r3, [r2] << trying to load from 0xffffffff ldmfd sp, {r4, r5, r6, r7, r8, r9, r10, fp, sp, pc} A read-only pointer value is passed to the function and stored at [fp-48]. But later it is found that the value has been clobbered and become 0xffffffff. If I manually move the "sub sp, fp, #40" instruction right before the ldmfd instruction, the kernel becomes stable again. I can reproduce the regression on 4.8.1 and 4.8.2, but not on 4.7. The compilation command is arm-eabi-gcc -O2 -marm -fno-omit-frame-pointer -mapcs -march=armv7-a -mabi=aapcs-linux -S test.c and GCC is configured as Target: arm-eabi Configured with: /tmp/AOSP-toolchain/build/../gcc/gcc-4.8/configure --prefix=/tmp/toolchain-build-eabi/prefix --target=arm-eabi --host=x86_64-linux-gnu --build=x86_64-linux-gnu --with-gnu-as --with-gnu-ld --enable-languages=c,c++ --with-gmp=/tmp/toolchain-build-eabi/temp-install --with-mpfr=/tmp/toolchain-build-eabi/temp-install --with-mpc=/tmp/toolchain-build-eabi/temp-install --with-cloog=/tmp/toolchain-build-eabi/temp-install --with-isl=/tmp/toolchain-build-eabi/temp-install --with-ppl=/tmp/toolchain-build-eabi/temp-install --disable-ppl-version-check --disable-cloog-version-check --disable-isl-version-check --enable-cloog-backend=isl --with-host-libstdcxx='-static-libgcc -Wl,-Bstatic,-lstdc++,-Bdynamic -lm' --disable-libssp --enable-threads --disable-nls --disable-libmudflap --disable-libgomp --disable-libstdc__-v3 --disable-sjlj-exceptions --disable-shared --disable-tls --disable-libitm --with-float=soft --with-fpu=vfp --with-arch=armv5te --enable-target-optspace --with-abi=aapcs --enable-initfini-array --disable-nls --prefix=/tmp/toolchain-build-eabi/prefix --with-sysroot=/tmp/toolchain-build-eabi/prefix/sysroot --with-binutils-version=2.23 --with-mpfr-version=3.1.1 --with-mpc-version=1.0.1 --with-gmp-version=5.0.5 --with-gcc-version=4.8 --with-gdb-version=7.6 --with-gxx-include-dir=/tmp/toolchain-build-eabi/prefix/include/c++/4.8 --with-bugurl=http://source.android.com/source/report-bugs.html --disable-bootstrap --disable-libquadmath --enable-plugins --disable-libsanitizer --enable-gold --enable-graphite=yes --with-cloog-version=0.18.0 --with-isl-version=0.11.1 --enable-eh-frame-hdr-for-static --with-arch=armv5te --disable-gold --program-transform-name='s&^&arm-eabi-&' Thread model: single gcc version 4.8 (GCC)
Created attachment 31083 [details] stripped from kernel 3.4 fs/dcache.c
Started with r188742. The code generation difference at that revision is: @@ -115,16 +115,17 @@ bne .L6 .L9: ldr r3, [fp, #-56] + sub sp, fp, #40 ldr r2, [fp, #-60] + mov r0, sl str r6, [r3, #0] ldr r3, [fp, #-52] str r2, [r3, #0] - b .L8 + ldmfd sp, {r4, r5, r6, r7, r8, r9, sl, fp, sp, pc} .L2: + sub sp, fp, #40 mov sl, #0 -.L8: mov r0, sl - sub sp, fp, #40 ldmfd sp, {r4, r5, r6, r7, r8, r9, sl, fp, sp, pc} .L28: b .L23 Hoisting "sub sp, fp, #40" causes several loads, via FP minus an offset, to access locations below SP, and thus be vulnerable to clobbers from asynchronous calls (signal handles in user-space, exception handlers in the kernel as is the case here).
Confirmed. The problem appears to show up with the use of mapcs on the command line. Mine.
(In reply to Ramana Radhakrishnan from comment #3) > Confirmed. The problem appears to show up with the use of mapcs on the > command line. Mine. sched2 moves this ahead - smells familiar. Ramana
Created attachment 31105 [details] lightly tested patch. completely untested but appears to fix the problem - Ben, can you please try this patch and see if fixes your issues ? regards Ramana
Patch appears to be working: c012c7ec: e51b3034 ldr r3, [fp, #-52] ; 0x34 c012c7f0: e51b203c ldr r2, [fp, #-60] ; 0x3c c012c7f4: e51b0038 ldr r0, [fp, #-56] ; 0x38 c012c7f8: e5836000 str r6, [r3] c012c7fc: e51b3030 ldr r3, [fp, #-48] ; 0x30 c012c800: e5832000 str r2, [r3] c012c804: e24bd028 sub sp, fp, #40 ; 0x28 c012c808: e89daff0 ldm sp, {r4, r5, r6, r7, r8, r9, sl, fp, sp, pc} Thanks!
Author: ramana Date: Wed Oct 30 10:54:04 2013 New Revision: 204203 URL: http://gcc.gnu.org/viewcvs?rev=204203&root=gcc&view=rev Log: Fix PR target/58854 2013-10-30 Ramana Radhakrishnan <ramana.radhakrishnan@arm.com> PR target/58854 * config/arm/arm.c (arm_expand_epilogue_apcs_frame): Emit blockage. Modified: trunk/gcc/ChangeLog trunk/gcc/config/arm/arm.c
Author: ramana Date: Mon Nov 11 09:38:14 2013 New Revision: 204665 URL: http://gcc.gnu.org/viewcvs?rev=204665&root=gcc&view=rev Log: Backport fix for PR target/58854 2013-11-11 Ramana Radhakrishnan <ramana.radhakrishnan@arm.com> Backported from mainline 2013-10-30 Ramana Radhakrishnan <ramana.radhakrishnan@arm.com> PR target/58854 * config/arm/arm.c (arm_expand_epilogue_apcs_frame): Emit blockage Modified: branches/gcc-4_8-branch/gcc/ChangeLog branches/gcc-4_8-branch/gcc/config/arm/arm.c
Now fixed.
Comment on attachment 31105 [details] lightly tested patch. >diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c >index 212a4bc..23dfc0e 100644 >--- a/gcc/config/arm/arm.c >+++ b/gcc/config/arm/arm.c >@@ -26547,6 +26547,7 @@ arm_expand_epilogue_apcs_frame (bool really_return) > num_regs = bit_count (saved_regs_mask); > if ((offsets->outgoing_args != (1 + num_regs)) || cfun->calls_alloca) > { >+ emit_insn (gen_blockage ()); > /* Unwind the stack to just below the saved registers. */ > emit_insn (gen_addsi3 (stack_pointer_rtx, > hard_frame_pointer_rtx,
Comment on attachment 31083 [details] stripped from kernel 3.4 fs/dcache.c Created attachment 31083 [details] stripped from kernel 3.4 fs/dcache.c 2013-10-23 17:42 UTC, bccheng@android.com
Comment on attachment 31105 [details] lightly tested patch. >diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c >index 212a4bc..23dfc0e 100644 >--- a/gcc/config/arm/arm.c >+++ b/gcc/config/arm/arm.c >@@ -26547,6 +26547,7 @@ arm_expand_epilogue_apcs_frame (bool really_return) > num_regs = bit_count (saved_regs_mask); > if ((offsets->outgoing_args != (1 + num_regs)) || cfun->calls_alloca) > { >+1 emit_insn (gen_blockage ()); > /* Unwind the stack to just below the saved registers. */ > emit_insn (gen_addsi3 (stack_pointer_rtx, > hard_frame_pointer_rtx,
*** Bug 260998 has been marked as a duplicate of this bug. *** Seen live from the domain http://volichat.com/adult-chat-rooms Marked for reference. Resolved as fixed @bugzilla.