This is the mail archive of the
gcc-bugs@gcc.gnu.org
mailing list for the GCC project.
[Bug rtl-optimization/58295] New: The combination pass doesn't eliminates some extra zero extensions
- From: "uranus at tinlans dot org" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: Mon, 02 Sep 2013 08:39:18 +0000
- Subject: [Bug rtl-optimization/58295] New: The combination pass doesn't eliminates some extra zero extensions
- Auto-submitted: auto-generated
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58295
Bug ID: 58295
Summary: The combination pass doesn't eliminates some extra
zero extensions
Product: gcc
Version: 4.9.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: rtl-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: uranus at tinlans dot org
$ cat test.c
extern char zeb_test_array[10];
unsigned char ee_isdigit2(unsigned int i)
{
    unsigned char c = zeb_test_array[i];
    unsigned char retval;
    retval = ((c>='0') & (c<='9')) ? 1 : 0;
    return retval;
}
$ arm-eabi-gcc -v
Using built-in specs.
COLLECT_GCC=arm-eabi-gcc
COLLECT_LTO_WRAPPER=/home1/lhtseng/arm/4.9/libexec/gcc/arm-eabi/4.9.0/lto-wrapper
Target: arm-eabi
Configured with: ../../../../work/4.9/src/gcc-4.9.0/configure --target=arm-eabi
--prefix=/home1/lhtseng/arm/4.9 --disable-nls --disable-shared
--enable-languages=c --enable-__cxa_atexit --enable-c99 --enable-long-long
--enable-threads=single --with-newlib --disable-multilib --disable-libssp
--disable-libgomp --disable-decimal-float --disable-libffi --disable-libmudflap
--disable-lto --with-gmp=/home1/lhtseng/work/general
--with-mpfr=/home1/lhtseng/work/general --with-mpc=/home1/lhtseng/work/general
--with-isl=/home1/lhtseng/work/general --with-cloog=/home1/lhtseng/work/general
Thread model: single
gcc version 4.9.0 20130802 (experimental) (GCC)
$ arm-eabi-gcc -O3 -S test.c
$ cat test.s
...
ee_isdigit2:
        @ Function supports interworking.
        @ args = 0, pretend = 0, frame = 0
        @ frame_needed = 0, uses_anonymous_args = 0
        @ link register save eliminated.
        ldr     r3, .L2
        ldrb    r0, [r3, r0]    @ zero_extendqisi2
        sub     r0, r0, #48
        and     r0, r0, #255
        cmp     r0, #9
        movhi   r0, #0
        movls   r0, #1
        bx      lr
...
The instruction 'and r0, r0, #255' is redundant, but the RTL instruction
combination pass fails to eliminate it. The pass was able to handle this case
before this commit:
http://gcc.gnu.org/viewcvs/gcc/trunk/gcc/simplify-rtx.c?r1=191909&r2=191928&pathrev=192303
After this later commit, the code in question sits at lines 643-656:
http://gcc.gnu.org/viewcvs/gcc/trunk/gcc/simplify-rtx.c?r1=192006&r2=192186&pathrev=192303
For example, GCC 4.6.3 can handle it perfectly.
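As a side note on why the 'and' is redundant for this comparison: c is
zero-extended from a byte, so c - 48 computed in 32 bits is either in 0..207 or
wraps around to 0xffffffd0..0xffffffff, and masking with 255 never moves a value
across the '<= 9' boundary. A minimal sketch that checks this exhaustively in
plain C follows. The function names are hypothetical, and the code only models
the arithmetic of the assembly listing above, not GCC's RTL simplification
itself.

```c
#include <assert.h>
#include <stdint.h>

/* Comparison as in the GCC 4.9.0 listing: subtract, mask to 8 bits, compare.
   (Hypothetical helper, named for illustration only.) */
int is_digit_masked(uint8_t c)
{
    uint32_t t = (uint32_t)c - 48u;   /* sub r0, r0, #48  */
    t &= 255u;                        /* and r0, r0, #255 */
    return t <= 9u;                   /* cmp r0, #9       */
}

/* The same comparison with the mask dropped. */
int is_digit_unmasked(uint8_t c)
{
    uint32_t t = (uint32_t)c - 48u;   /* wraps for c < 48 */
    return t <= 9u;
}

/* Count the byte values on which the two variants disagree. */
unsigned count_mismatches(void)
{
    unsigned mismatches = 0;
    for (unsigned v = 0; v <= 255u; ++v)
        if (is_digit_masked((uint8_t)v) != is_digit_unmasked((uint8_t)v))
            ++mismatches;
    return mismatches;
}

/* Aborts if any byte value distinguishes the two variants. */
void check_mask_is_redundant(void)
{
    assert(count_mismatches() == 0);
}
```

Calling check_mask_is_redundant() from any small driver should return silently,
since the two variants agree on all 256 possible values of c.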
In GCC 4.9.0, reverting the two commits, or simply commenting out the lines
mentioned above, makes the combination pass handle this case again:
$ arm-eabi-gcc-modified -O3 -da -S test.c
$ cat test.c.166r.expand
...
(insn 9 8 10 2 (set (reg:SI 120)
        (plus:SI (subreg:SI (reg:QI 118) 0)
            (const_int -48 [0xffffffffffffffd0]))) test.c:6 -1
     (nil))
(insn 10 9 11 2 (set (reg:SI 121)
        (and:SI (reg:SI 120)
            (const_int 255 [0xff]))) test.c:6 -1
     (nil))
(insn 11 10 12 2 (set (reg:CC 100 cc)
        (compare:CC (reg:SI 121)
            (const_int 9 [0x9]))) test.c:6 -1
     (nil))
(insn 12 11 13 2 (set (reg:SI 122)
        (leu:SI (reg:CC 100 cc)
            (const_int 0 [0]))) test.c:6 -1
     (nil))
...
$ cat test.c.197r.combine
...
Trying 9, 10 -> 11:
Failed to match this instruction:
(set (reg:CC 100 cc)
    (compare:CC (plus:SI (reg:SI 119)
            (const_int -48 [0xffffffffffffffd0]))
        (const_int 9 [0x9])))
Successfully matched this instruction:
(set (reg:SI 121)
    (plus:SI (reg:SI 119)
        (const_int -48 [0xffffffffffffffd0])))
Successfully matched this instruction:
(set (reg:CC 100 cc)
    (compare:CC (reg:SI 121)
        (const_int 9 [0x9])))
deferring deletion of insn with uid = 9.
modifying insn i2 10: r121:SI=r119:SI-0x30
      REG_DEAD r119:SI
deferring rescan insn with uid = 10.
modifying insn i3 11: cc:CC=cmp(r121:SI,0x9)
      REG_DEAD r121:SI
deferring rescan insn with uid = 11.
...
Insn 10 is generated by (define_expand "zero_extendqisi2" ...) in ARM's
machine description. Before the commits mentioned above, the combination pass
successfully combined it with insn 9. After those commits, however, the
combination pass never attempts the combination '9, 10 -> 11'.
According to the commit messages for 'simplify-rtx.c', commit r191928 was
intended to improve x86 code generation, but it led to suboptimal code
generation for the ARM target.