Created attachment 38204 [details] main.c test case source I have reproduced on gcc-4.9.2, gcc-4.9.3, and gcc-5.3.1 in Debian. System type: ODROID-U2 (Exynos4412) ARMv7 COLLECT_GCC=gcc COLLECT_LTO_WRAPPER=/usr/lib/gcc/arm-linux-gnueabihf/5/lto-wrapper Target: arm-linux-gnueabihf Configured with: ../src/configure -v --with-pkgversion='Debian 5.3.1-13' --with-bugurl=file:///usr/share/doc/gcc-5/README.Bugs --enable-languages=c,ada,c++,java,go,d,fortran,objc,obj-c++ --prefix=/usr --program-suffix=-5 --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --libdir=/usr/lib --enable-nls --with-sysroot=/ --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new --enable-gnu-unique-object --disable-libitm --disable-libquadmath --enable-plugin --with-system-zlib --disable-browser-plugin --enable-java-awt=gtk --enable-gtk-cairo --with-java-home=/usr/lib/jvm/java-1.5.0-gcj-5-armhf/jre --enable-java-home --with-jvm-root-dir=/usr/lib/jvm/java-1.5.0-gcj-5-armhf --with-jvm-jar-dir=/usr/lib/jvm-exports/java-1.5.0-gcj-5-armhf --with-arch-directory=arm --with-ecj-jar=/usr/share/java/eclipse-ecj.jar --enable-objc-gc --enable-multiarch --disable-sjlj-exceptions --with-arch=armv7-a --with-fpu=vfpv3-d16 --with-float=hard --with-mode=thumb --enable-checking=release --build=arm-linux-gnueabihf --host=arm-linux-gnueabihf --target=arm-linux-gnueabihf Thread model: posix gcc version 5.3.1 20160323 (Debian 5.3.1-13) With a struct set up with bitfield values like this: struct mystruct { unsigned int uid_set : 1; unsigned int is_loaded : 1; unsigned int nonexistent : 1; }; The following C code is compiled to bad ARMv7 assembly under certain circumstances at -O2: if (!user->is_loaded) set_is_loaded (user, 1); The resultant bad assembly is: 14: 7803 ldrb r3, [r0, #0] 16: 079b lsls r3, r3, #30 18: d400 bmi.n 1c <on_get_all_finished+0x8> 1a: d000 beq.n 1e <on_get_all_finished+0xa> 1c: 4770 bx lr 1e: e7ef b.n 0 <set_is_loaded.part.0> In the bitfield, uid_set is bit 0 and is_loaded is bit 1. The assembly loads the bitfield value and shifts left to have the value of is_loaded at bit 31. So the "bmi" instruction makes perfect sense: if is_loaded is set, jump to some code that is not going to call set_is_loaded. The following "beq" instruction is bad. Here we have is_loaded at bit 31, but we also have uid_set at bit 30. So the value of uid_set is clearly going to influence the code behaviour here: if uid_set is 1, we will not call set_is_loaded. This issue originates from freedesktop's accountsservice where I noticed the incorrect runtime behaviour. I am attaching a minimal test case which you can compile with: gcc -O2 -g -c main.c -o out.o gcc -c other.c -o other.o gcc other.o out.o -o prog Run with: ./prog 0 ./prog 1 The argument controls the value of uid_set. It should have no effect on the runtime behaviour, but you'll notice that myfunc is only called when the arg is 0. Unfortunately I couldn't figure out how to slim down the test case into a single file, as that resulted in different (working) asm being generated. I'm also attaching the preprocessed version of main.c.
Created attachment 38205 [details] other.c test case source
Created attachment 38206 [details] main.i preprocessed output
Confirmed on active branches. I get MYFUNC printed only when running ./prog 0. At -O0 or with GCC 4.8 I get MYFUNC printed when running both "./prog 0" and "prog 1". So this looks like a regression from 4.9 onwards.
The culprit seems to be the peephole in thumb2.md at line 1539. Adding -fno-peephole2 gives the correct behaviour.
I think the peephole to transform the 1-bit zero_extract + compare-with-zero into an lsls (from a tst-immediate) is not valid for positions other than zero (i.e. when the shift would be by 31). But that case should already be handled by the peephole below it at line 1567 in thumb2.md. So I propose we just delete the one at line 1539. Richard, what do you think?
Ah, on second glance the peephole looks correct in itself, but the second branch following the bmi uses an incorrect condition code. So we have: tst r3, #2 bne .L3 beq .L6 being transformed into: ldrb r3, [r0] @ zero_extendqisi2 lsls r3, r3, #30 bmi .L3 beq .L6 The beq needs to be updated to be the opposite of bmi. That is, bpl
(In reply to ktkachov from comment #6) > Ah, on second glance the peephole looks correct in itself, but the second > branch following the bmi uses an incorrect condition code. > So we have: > tst r3, #2 > bne .L3 > beq .L6 > > being transformed into: > ldrb r3, [r0] @ zero_extendqisi2 > lsls r3, r3, #30 > bmi .L3 > beq .L6 > > > The beq needs to be updated to be the opposite of bmi. That is, bpl Sounds like the peephole is missing a reg-dead check on the condition code value.
(In reply to Richard Earnshaw from comment #7) > (In reply to ktkachov from comment #6) > > Ah, on second glance the peephole looks correct in itself, but the second > > branch following the bmi uses an incorrect condition code. > > So we have: > > tst r3, #2 > > bne .L3 > > beq .L6 > > > > being transformed into: > > ldrb r3, [r0] @ zero_extendqisi2 > > lsls r3, r3, #30 > > bmi .L3 > > beq .L6 > > > > > > The beq needs to be updated to be the opposite of bmi. That is, bpl > > Sounds like the peephole is missing a reg-dead check on the condition code > value. Yep, that seems to do the trick. I'll test a patch. Thanks Richard.
Patch posted at: https://gcc.gnu.org/ml/gcc-patches/2016-04/msg00351.html
Author: ktkachov Date: Fri Apr 8 09:39:44 2016 New Revision: 234825 URL: https://gcc.gnu.org/viewcvs?rev=234825&root=gcc&view=rev Log: [ARM] PR target/70566 Check that condition register is dead in tst-imm -> lsls-imm Thumb2 peepholes PR target/70566 * config/arm/thumb2.md (tst + branch-> lsls + branch peephole below *orsi_not_shiftsi_si): Require that condition register is dead after the peephole. (second peephole after the above): Likewise. * gcc.c-torture/execute/pr70566.c: New test. Added: trunk/gcc/testsuite/gcc.c-torture/execute/pr70566.c Modified: trunk/gcc/ChangeLog trunk/gcc/config/arm/thumb2.md trunk/gcc/testsuite/ChangeLog
Fixed on trunk. The patch backports cleanly to 4.9 and 5 branches and bootstrap and test looks good. I'll ask for a backport after a bit of time on trunk
Author: ktkachov Date: Tue Apr 12 10:58:28 2016 New Revision: 234898 URL: https://gcc.gnu.org/viewcvs?rev=234898&root=gcc&view=rev Log: [ARM] PR target/70566 Check that condition register is dead in tst-imm -> lsls-imm Thumb2 peepholes Backport from mainline 2016-04-08 Kyrylo Tkachov <kyrylo.tkachov@arm.com> PR target/70566 * config/arm/thumb2.md (tst + branch-> lsls + branch peephole below *orsi_not_shiftsi_si): Require that condition register is dead after the peephole. (second peephole after the above): Likewise. * gcc.c-torture/execute/pr70566.c: New test. Added: branches/gcc-5-branch/gcc/testsuite/gcc.c-torture/execute/pr70566.c Modified: branches/gcc-5-branch/gcc/ChangeLog branches/gcc-5-branch/gcc/config/arm/thumb2.md branches/gcc-5-branch/gcc/testsuite/ChangeLog
Author: ktkachov Date: Wed Apr 13 08:24:43 2016 New Revision: 234931 URL: https://gcc.gnu.org/viewcvs?rev=234931&root=gcc&view=rev Log: [ARM] PR target/70566 Check that condition register is dead in tst-imm -> lsls-imm Thumb2 peepholes Backport from mainline 2016-04-08 Kyrylo Tkachov <kyrylo.tkachov@arm.com> PR target/70566 * config/arm/thumb2.md (tst + branch-> lsls + branch peephole below *orsi_not_shiftsi_si): Require that condition register is dead after the peephole. (second peephole after the above): Likewise. * gcc.c-torture/execute/pr70566.c: New test. Added: branches/gcc-4_9-branch/gcc/testsuite/gcc.c-torture/execute/pr70566.c Modified: branches/gcc-4_9-branch/gcc/ChangeLog branches/gcc-4_9-branch/gcc/config/arm/thumb2.md branches/gcc-4_9-branch/gcc/testsuite/ChangeLog
Fixed on all active branches. Thanks for the bug report.