This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.
[Bug tree-optimization/58296] New: ivopts is unable to handle some loops altered by the loop header copying pass
- From: "uranus at tinlans dot org" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: Mon, 02 Sep 2013 09:22:15 +0000
- Subject: [Bug tree-optimization/58296] New: ivopts is unable to handle some loops altered by the loop header copying pass
- Auto-submitted: auto-generated
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58296
Bug ID: 58296
Summary: ivopts is unable to handle some loops altered by the
loop header copying pass
Product: gcc
Version: 4.9.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: uranus at tinlans dot org
$ cat test.c
#include <stdio.h>
void bne_loop(unsigned int val,unsigned int N)
{
    int i;
    for (i=0;i<N;++i)
        printf("%d\n",val+i);
}
Please note that the loop condition, 'i < N', compares a signed int variable
with an unsigned int variable. If we change the type of i from 'int' to
'unsigned int', the issue does not occur.
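For reference, a minimal sketch of the variant that avoids the issue follows.
The names bne_loop_unsigned and mixed_less are hypothetical and introduced here
only for illustration, and the format specifier is changed to %u because the
printed value is unsigned. The helper spells out the conversion that C applies
implicitly to 'i < N' when i is int and N is unsigned int, which is presumably
where the '(unsigned int) i' conversion in the dumps comes from.

```c
#include <stdio.h>

/* Variant of the test case with an unsigned loop counter (hypothetical name).
   The comparison 'i < N' no longer mixes signed and unsigned operands. */
void bne_loop_unsigned(unsigned int val, unsigned int N)
{
    unsigned int i;
    for (i = 0; i < N; ++i)
        printf("%u\n", val + i);
}

/* Hypothetical helper: the explicit form of what the compiler does for
   'i < n' with int i and unsigned int n. The signed operand is converted
   to unsigned int before the comparison. */
int mixed_less(int i, unsigned int n)
{
    return (unsigned int) i < n;
}
```

With this conversion, mixed_less(-1, 1u) is 0, because -1 becomes the largest
unsigned int value before the comparison.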
$ arm-eabi-gcc -v
Using built-in specs.
COLLECT_GCC=arm-eabi-gcc
COLLECT_LTO_WRAPPER=/home1/lhtseng/arm/4.9/libexec/gcc/arm-eabi/4.9.0/lto-wrapper
Target: arm-eabi
Configured with: ../../../../work/4.9/src/gcc-4.9.0/configure --target=arm-eabi
--prefix=/home1/lhtseng/arm/4.9 --disable-nls --disable-shared
--enable-languages=c --enable-__cxa_atexit --enable-c99 --enable-long-long
--enable-threads=single --with-newlib --disable-multilib --disable-libssp
--disable-libgomp --disable-decimal-float --disable-libffi --disable-libmudflap
--disable-lto --with-gmp=/home1/lhtseng/work/general
--with-mpfr=/home1/lhtseng/work/general --with-mpc=/home1/lhtseng/work/general
--with-isl=/home1/lhtseng/work/general --with-cloog=/home1/lhtseng/work/general
Thread model: single
gcc version 4.9.0 20130802 (experimental) (GCC)
$ arm-eabi-gcc -O3 -fdump-tree-all -O3 -da -S test.c
$ cat -n test.s
...
27 .L3:
28 add r1, r1, r5
29 add r4, r4, #1
30 ldr r0, .L9
31 bl printf
32 cmp r4, r6
33 mov r1, r4
34 bne .L3
...
The instruction 'mov r1, r4' is redundant. The dump of the RTL expansion pass
shows where it comes from:
$ cat test.c.166r.expand
...
;; i.0_4 = (unsigned int) i_9;
(insn 20 19 0 (set (reg:SI 110 [ i.0 ])
        (reg/v:SI 112 [ i ])) ../test.c:6 -1
     (nil))
...
$ cat test.c.165t.optimized
...
<bb 4>:
# i_13 = PHI <i_9(5), 0(3)>
# i.0_16 = PHI <i.0_4(5), 0(3)>
_7 = i.0_16 + val_6(D);
printf ("%d\n", _7);
i_9 = i_13 + 1;
i.0_4 = (unsigned int) i_9;
if (i_9 != _15)
  goto <bb 5>;
else
  goto <bb 6>;
...
Surprisingly, the statement 'i.0_4 = (unsigned int) i_9;' is not removed by any
tree-level or RTL-level optimization pass. After some investigation, we found
that using '-Os' instead of '-O3', or adding '-fno-tree-ch' to '-O3', generates
the optimized code, and that ivopts then eliminates the conversion properly:
$ arm-eabi-gcc -O3 -fdump-tree-all -O3 -fno-tree-ch -da -S test.c
$ cat test.c.119t.ivopts
<bb 3>:
_7 = ivtmp.9_11;
printf ("%d\n", _7);
ivtmp.9_10 = ivtmp.9_11 + 1;
<bb 4>:
# ivtmp.9_11 = PHI <val_6(D)(2), ivtmp.9_10(3)>
if (ivtmp.9_11 != _12)
  goto <bb 3>;
else
  goto <bb 5>;
$ cat test.s
...
.L3:
mov r1, r4
bl printf
add r4, r4, #1
.L2:
cmp r4, r5
ldr r0, .L6
bne .L3
ldmfd sp!, {r3, r4, r5, lr}
bx lr
...
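Read back into C, the ivopts dump above corresponds roughly to the loop
sketched below. This is only an illustration: the name bne_loop_iv and the
returned call count are hypothetical, and we assume that the bound _12 in the
dump equals val + N, which the excerpt does not show.

```c
#include <stdio.h>

/* Source-level approximation of the loop after ivopts with -fno-tree-ch.
   Returns the number of printf calls, for illustration only. */
unsigned int bne_loop_iv(unsigned int val, unsigned int N)
{
    unsigned int iv = val;       /* ivtmp.9_11 starts at val */
    unsigned int end = val + N;  /* assumed meaning of _12 in the dump */
    unsigned int calls = 0;

    while (iv != end) {          /* test first, as in <bb 4> of the dump */
        printf("%u\n", iv);      /* _7 = ivtmp.9_11 */
        ++iv;                    /* ivtmp.9_10 = ivtmp.9_11 + 1 */
        ++calls;
    }
    return calls;
}
```

A single unsigned induction variable replaces both i and val + i, so no
conversion remains inside the loop. For example, bne_loop_iv(10, 3) prints 10,
11 and 12.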
Therefore, we believe that something is wrong with ivopts: it is unable to
handle the loop altered by the tree-ch pass when the condition of a 'for'
statement compares an int with an unsigned int.
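To illustrate the shape of the loop that ivopts sees at -O3, here is a
source-level sketch of loop header copying, which duplicates the loop test in
front of the loop so that the body becomes a do-while loop. The name
bne_loop_ch and the returned call count are hypothetical, and the actual GIMPLE
differs in detail (for example, the exit test in the dump is 'i_9 != _15').

```c
#include <stdio.h>

/* Approximate source-level view of the test case after loop header copying.
   Returns the number of printf calls, for illustration only. */
unsigned int bne_loop_ch(unsigned int val, unsigned int N)
{
    int i = 0;
    unsigned int calls = 0;

    if ((unsigned int) i < N) {              /* copied header test */
        do {
            printf("%u\n", val + (unsigned int) i);
            ++i;                             /* signed increment of i */
            ++calls;
        } while ((unsigned int) i < N);      /* conversion stays in the loop */
    }
    return calls;
}
```

In this form the signed counter and its unsigned copy are both live across the
back edge, which matches the two PHI nodes for i and i.0 in the optimized dump.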