Bug 70566 - [4.9/5/6 Regression] Bad ARM code generated for evaluating unsigned int bitfield value
Summary: [4.9/5/6 Regression] Bad ARM code generated for evaluating unsigned int bitfi...
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: target (show other bugs)
Version: 5.3.1
: P2 normal
Target Milestone: 4.9.4
Assignee: ktkachov
URL:
Keywords: wrong-code
Depends on:
Blocks:
 
Reported: 2016-04-06 19:36 UTC by Daniel Drake
Modified: 2016-08-03 08:06 UTC (History)
1 user (show)

See Also:
Host:
Target: arm
Build:
Known to work: 4.8.5, 6.0
Known to fail: 4.9.4, 5.3.1
Last reconfirmed: 2016-04-07 00:00:00


Attachments
main.c test case source (371 bytes, text/plain)
2016-04-06 19:36 UTC, Daniel Drake
Details
other.c test case source (115 bytes, text/plain)
2016-04-06 19:37 UTC, Daniel Drake
Details
main.i preprocessed output (3.55 KB, text/plain)
2016-04-06 19:38 UTC, Daniel Drake
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Daniel Drake 2016-04-06 19:36:32 UTC
Created attachment 38204 [details]
main.c test case source

I have reproduced on gcc-4.9.2, gcc-4.9.3, and gcc-5.3.1 in Debian.

System type: ODROID-U2 (Exynos4412) ARMv7

COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/arm-linux-gnueabihf/5/lto-wrapper
Target: arm-linux-gnueabihf
Configured with: ../src/configure -v --with-pkgversion='Debian 5.3.1-13' --with-bugurl=file:///usr/share/doc/gcc-5/README.Bugs --enable-languages=c,ada,c++,java,go,d,fortran,objc,obj-c++ --prefix=/usr --program-suffix=-5 --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --libdir=/usr/lib --enable-nls --with-sysroot=/ --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new --enable-gnu-unique-object --disable-libitm --disable-libquadmath --enable-plugin --with-system-zlib --disable-browser-plugin --enable-java-awt=gtk --enable-gtk-cairo --with-java-home=/usr/lib/jvm/java-1.5.0-gcj-5-armhf/jre --enable-java-home --with-jvm-root-dir=/usr/lib/jvm/java-1.5.0-gcj-5-armhf --with-jvm-jar-dir=/usr/lib/jvm-exports/java-1.5.0-gcj-5-armhf --with-arch-directory=arm --with-ecj-jar=/usr/share/java/eclipse-ecj.jar --enable-objc-gc --enable-multiarch --disable-sjlj-exceptions --with-arch=armv7-a --with-fpu=vfpv3-d16 --with-float=hard --with-mode=thumb --enable-checking=release --build=arm-linux-gnueabihf --host=arm-linux-gnueabihf --target=arm-linux-gnueabihf
Thread model: posix
gcc version 5.3.1 20160323 (Debian 5.3.1-13) 



With a struct set up with bitfield values like this:

struct mystruct {
        unsigned int uid_set : 1;
        unsigned int is_loaded : 1;
        unsigned int nonexistent : 1;
};

The following C code is compiled to bad ARMv7 assembly under certain circumstances at -O2:

        if (!user->is_loaded) 
                set_is_loaded (user, 1);

The resultant bad assembly is:

  14:	7803      	ldrb	r3, [r0, #0]
  16:	079b      	lsls	r3, r3, #30
  18:	d400      	bmi.n	1c <on_get_all_finished+0x8>
  1a:	d000      	beq.n	1e <on_get_all_finished+0xa>
  1c:	4770      	bx	lr
  1e:	e7ef      	b.n	0 <set_is_loaded.part.0>

In the bitfield, uid_set is bit 0 and is_loaded is bit 1.

The assembly loads the bitfield value and shifts left to have the value of is_loaded at bit 31. So the "bmi" instruction makes perfect sense: if is_loaded is set, jump to some code that is not going to call set_is_loaded.

The following "beq" instruction is bad. Here we have is_loaded at bit 31, but we also have uid_set at bit 30. So the value of uid_set is clearly going to influence the code behaviour here: if uid_set is 1, we will not call set_is_loaded.

This issue originates from freedesktop's accountsservice where I noticed the incorrect runtime behaviour.

I am attaching a minimal test case which you can compile with:
gcc -O2 -g -c main.c -o out.o
gcc -c other.c -o other.o
gcc other.o out.o -o prog

Run with:

./prog 0
./prog 1

The argument controls the value of uid_set. It should have no effect on the runtime behaviour, but you'll notice that myfunc is only called when the arg is 0.

Unfortunately I couldn't figure out how to slim down the test case into a single file, as that resulted in different (working) asm being generated.

I'm also attaching the preprocessed version of main.c.
Comment 1 Daniel Drake 2016-04-06 19:37:02 UTC
Created attachment 38205 [details]
other.c test case source
Comment 2 Daniel Drake 2016-04-06 19:38:01 UTC
Created attachment 38206 [details]
main.i preprocessed output
Comment 3 ktkachov 2016-04-07 08:31:12 UTC
Confirmed on active branches.
I get MYFUNC printed only when running ./prog 0.
At -O0 or with GCC 4.8 I get MYFUNC printed when
running both "./prog 0" and "prog 1".
So this looks like a regression from 4.9 onwards.
Comment 4 ktkachov 2016-04-07 09:01:01 UTC
The culprit seems to be the peephole in thumb2.md at line 1539.
Adding -fno-peephole2 gives the correct behaviour.
Comment 5 ktkachov 2016-04-07 09:10:56 UTC
I think the peephole to transform the 1-bit zero_extract + compare-with-zero into an lsls (from a tst-immediate) is not valid for positions other than zero (i.e. when the shift would be by 31). But that case should already be handled by the peephole below it at line 1567 in thumb2.md.

So I propose we just delete the one at line 1539.
Richard, what do you think?
Comment 6 ktkachov 2016-04-07 09:39:23 UTC
Ah, on second glance the peephole looks correct in itself, but the second branch following the bmi uses an incorrect condition code.
So we have:
	tst	r3, #2
	bne	.L3
	beq	.L6

being transformed into:
	ldrb	r3, [r0]	@ zero_extendqisi2
	lsls	r3, r3, #30
	bmi	.L3
	beq	.L6


The beq needs to be updated to be the opposite of bmi. That is, bpl
Comment 7 Richard Earnshaw 2016-04-07 09:46:35 UTC
(In reply to ktkachov from comment #6)
> Ah, on second glance the peephole looks correct in itself, but the second
> branch following the bmi uses an incorrect condition code.
> So we have:
> 	tst	r3, #2
> 	bne	.L3
> 	beq	.L6
> 
> being transformed into:
> 	ldrb	r3, [r0]	@ zero_extendqisi2
> 	lsls	r3, r3, #30
> 	bmi	.L3
> 	beq	.L6
> 
> 
> The beq needs to be updated to be the opposite of bmi. That is, bpl

Sounds like the peephole is missing a reg-dead check on the condition code value.
Comment 8 ktkachov 2016-04-07 09:50:22 UTC
(In reply to Richard Earnshaw from comment #7)
> (In reply to ktkachov from comment #6)
> > Ah, on second glance the peephole looks correct in itself, but the second
> > branch following the bmi uses an incorrect condition code.
> > So we have:
> > 	tst	r3, #2
> > 	bne	.L3
> > 	beq	.L6
> > 
> > being transformed into:
> > 	ldrb	r3, [r0]	@ zero_extendqisi2
> > 	lsls	r3, r3, #30
> > 	bmi	.L3
> > 	beq	.L6
> > 
> > 
> > The beq needs to be updated to be the opposite of bmi. That is, bpl
> 
> Sounds like the peephole is missing a reg-dead check on the condition code
> value.

Yep, that seems to do the trick. I'll test a patch.
Thanks Richard.
Comment 9 ktkachov 2016-04-07 16:15:53 UTC
Patch posted at:
https://gcc.gnu.org/ml/gcc-patches/2016-04/msg00351.html
Comment 10 ktkachov 2016-04-08 09:40:16 UTC
Author: ktkachov
Date: Fri Apr  8 09:39:44 2016
New Revision: 234825

URL: https://gcc.gnu.org/viewcvs?rev=234825&root=gcc&view=rev
Log:
[ARM] PR target/70566 Check that condition register is dead in tst-imm -> lsls-imm Thumb2 peepholes

	PR target/70566
	* config/arm/thumb2.md (tst + branch-> lsls + branch
	peephole below *orsi_not_shiftsi_si): Require that condition
	register is dead after the peephole.
	(second peephole after the above): Likewise.

	* gcc.c-torture/execute/pr70566.c: New test.


Added:
    trunk/gcc/testsuite/gcc.c-torture/execute/pr70566.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/arm/thumb2.md
    trunk/gcc/testsuite/ChangeLog
Comment 11 ktkachov 2016-04-08 09:42:03 UTC
Fixed on trunk.
The patch backports cleanly to 4.9 and 5 branches and bootstrap and test looks good.
I'll ask for a backport after a bit of time on trunk
Comment 12 ktkachov 2016-04-12 10:59:00 UTC
Author: ktkachov
Date: Tue Apr 12 10:58:28 2016
New Revision: 234898

URL: https://gcc.gnu.org/viewcvs?rev=234898&root=gcc&view=rev
Log:
[ARM] PR target/70566 Check that condition register is dead in tst-imm -> lsls-imm Thumb2 peepholes

	Backport from mainline
	2016-04-08  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>

	PR target/70566
	* config/arm/thumb2.md (tst + branch-> lsls + branch
	peephole below *orsi_not_shiftsi_si): Require that condition
	register is dead after the peephole.
	(second peephole after the above): Likewise.

	* gcc.c-torture/execute/pr70566.c: New test.

Added:
    branches/gcc-5-branch/gcc/testsuite/gcc.c-torture/execute/pr70566.c
Modified:
    branches/gcc-5-branch/gcc/ChangeLog
    branches/gcc-5-branch/gcc/config/arm/thumb2.md
    branches/gcc-5-branch/gcc/testsuite/ChangeLog
Comment 13 ktkachov 2016-04-13 08:25:15 UTC
Author: ktkachov
Date: Wed Apr 13 08:24:43 2016
New Revision: 234931

URL: https://gcc.gnu.org/viewcvs?rev=234931&root=gcc&view=rev
Log:
[ARM] PR target/70566 Check that condition register is dead in tst-imm -> lsls-imm Thumb2 peepholes

	Backport from mainline
	2016-04-08  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>

	PR target/70566
	* config/arm/thumb2.md (tst + branch-> lsls + branch
	peephole below *orsi_not_shiftsi_si): Require that condition
	register is dead after the peephole.
	(second peephole after the above): Likewise.

	* gcc.c-torture/execute/pr70566.c: New test.


Added:
    branches/gcc-4_9-branch/gcc/testsuite/gcc.c-torture/execute/pr70566.c
Modified:
    branches/gcc-4_9-branch/gcc/ChangeLog
    branches/gcc-4_9-branch/gcc/config/arm/thumb2.md
    branches/gcc-4_9-branch/gcc/testsuite/ChangeLog
Comment 14 ktkachov 2016-04-13 08:31:59 UTC
Fixed on all active branches.
Thanks for the bug report.