This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.
[Bug optimization/11841] [3.3 Regression] The code compiled with -funroll-loops crashes
- From: "rguenth at tat dot physik dot uni-tuebingen dot de" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: 27 Apr 2004 14:25:28 -0000
- Subject: [Bug optimization/11841] [3.3 Regression] The code compiled with -funroll-loops crashes
- References: <20030807012210.11841.panov@canopus.iacp.dvo.ru>
- Reply-to: gcc-bugzilla at gcc dot gnu dot org
------- Additional Comments From rguenth at tat dot physik dot uni-tuebingen dot de 2004-04-27 14:25 -------
The first (and probably the second, too) testcase can be modified to:

    int i;
    int main()
    {
        int d[6];
        int j;
        for (i=0; i<4; ++i)
            for (j=0; j<3; ++j)
                d[i+j] = 0;
        return 0;
    }

i.e. remove the weird return, which was only there to keep the variable from
being removed in favor of the biv (basic induction variable). Note that the
loop bounds need to be exactly 4 and 3 in order to trigger the failure;
exchanging the loops also prevents the failure, as does moving the declaration
of i into the body of main.
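
For reference, the behaviour a correct compile has to preserve can be pinned
down in plain C. The sketch below is an illustration only: reference_nest is a
made-up name, and wrapping the nest in a function is not claimed to still
trigger the bug. It shows that the nest touches d[0] through d[5], so the
source stays within the bounds of d[6], and that i ends up as 4.

```c
int i;  /* global, as in the testcase */

/* Hypothetical helper: same loop nest as the testcase, operating on a
   caller-supplied array so the result can be inspected afterwards. */
void reference_nest(int d[6])
{
    int j;
    for (i = 0; i < 4; ++i)
        for (j = 0; j < 3; ++j)
            d[i + j] = 0;   /* i + j ranges over 0..5 */
}
```

Calling reference_nest on an array pre-filled with a sentinel value and then
checking that all six elements are zero and i == 4 gives a baseline to compare
a suspect build against.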
It is also interesting to note that gcc 3.4 passes the test using
-fold-unroll-loops, so the loop unroller itself may not be the one to blame.
From the asm dump (sorry, can't parse RTL) we see:

main:
        pushl %ebp
        xorl %eax, %eax
        movl %esp, %ebp
        subl $40, %esp
        xorl %edx, %edx
        andl $-16, %esp
        movl %eax, i
.L11:
        movl $0, -32(%ebp,%edx,4)
        xorl %eax, %eax
        movl %eax, -40(%ebp,%edx,4)
        leal 1(%edx), %eax
        xorl %edx, %edx
        movl %edx, -40(%ebp,%eax,4)
        leal 2(%ecx), %eax
        ^^^^^^^^^^^^^^^^^^^^^^^^
        xorl %edx, %edx
        movl %edx, -40(%ebp,%eax,4)
        xorl %edx, %edx
        cmpl $3, %eax
        movl %edx, -28(%ebp,%ecx,4)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        movl %eax, %edx
        movl %eax, i
        jle .L11

The marked code corrupts the loop counter of the outer loop (in fact, %ecx is
never initialized) and contains a bogus use of it; removing this insn will
fix the loop.
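
A rough C model of loop .L11 makes the failure modes concrete. It is one
reading of the listing above, not compiler output, and rests on two
assumptions: d[0] sits at -40(%ebp), and %ecx holds some fixed stale value on
entry. The names simulate_L11, in_bounds and the outcome enum are made up for
illustration.

```c
#include <assert.h>

/* Possible fates of the modelled loop. */
enum outcome { TERMINATES, WILD_STORE, NEVER_EXITS };

/* d has 6 elements; assumption: d[k] is addressed as -40(%ebp,k,4). */
int in_bounds(int k) { return k >= 0 && k < 6; }

/* Hypothetical model: replay the store indices of loop .L11 for a given
   stale value of %ecx. Keep ecx small so that ecx + 3 cannot overflow. */
enum outcome simulate_L11(int ecx, int max_iters)
{
    int edx = 0;                  /* xorl %edx, %edx before .L11 */
    assert(max_iters > 0);
    for (int n = 0; n < max_iters; ++n) {
        int stores[5];
        int eax;
        stores[0] = edx + 2;      /* movl $0, -32(%ebp,%edx,4) */
        stores[1] = edx;          /* movl %eax, -40(%ebp,%edx,4) */
        eax = edx + 1;            /* leal 1(%edx), %eax */
        stores[2] = eax;          /* movl %edx, -40(%ebp,%eax,4) */
        eax = ecx + 2;            /* leal 2(%ecx), %eax  (marked) */
        stores[3] = eax;          /* movl %edx, -40(%ebp,%eax,4) */
        stores[4] = ecx + 3;      /* movl %edx, -28(%ebp,%ecx,4)  (marked) */
        for (int s = 0; s < 5; ++s)
            if (!in_bounds(stores[s]))
                return WILD_STORE;
        edx = eax;                /* movl %eax, %edx; movl %eax, i */
        if (eax > 3)              /* cmpl $3, %eax; jle .L11 not taken */
            return TERMINATES;
    }
    return NEVER_EXITS;           /* gave up: counter stuck at ecx + 2 */
}
```

Under this model a stale %ecx of -2..1 leaves the counter stuck at ecx + 2, a
value of 2 happens to exit after one pass, and any other small value stores
outside d, which would be consistent with the reported crash.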
If one looks at the 3.4 version (with -fold-unroll-loops)

main:
        pushl %ebp
        xorl %ecx, %ecx
        movl %esp, %ebp
        subl $40, %esp
        andl $-16, %esp
        subl $16, %esp
        movl %ecx, i
        .p2align 4,,15
.L9:
        movl i, %eax
        xorl %edx, %edx
        xorl %ecx, %ecx
        movl %edx, -40(%ebp,%eax,4)
        leal 1(%eax), %edx
        movl %ecx, -40(%ebp,%edx,4)
        xorl %ecx, %ecx
        cmpl $3, %edx
        movl %ecx, -32(%ebp,%eax,4)
        movl %edx, i
        jle .L9

one sees a correct version, though strangely enough two different biv's are
still used for array access (%eax twice and %edx once). Register usage could
be better, too; compare the -funroll-loops version, which is:

main:
        pushl %ebp
        xorl %ecx, %ecx
        movl %esp, %ebp
        subl $40, %esp
        andl $-16, %esp
        subl $16, %esp
        movl %ecx, i
        .p2align 4,,15
.L9:
        movl i, %eax
        xorl %edx, %edx
        xorl %ecx, %ecx
        movl %edx, -40(%ebp,%eax,4)
        xorl %edx, %edx
        movl %ecx, -36(%ebp,%eax,4)
        movl %edx, -32(%ebp,%eax,4)
        incl %eax
        cmpl $3, %eax
        movl %eax, i
        jle .L9

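
Read back into C, both 3.4 loops above appear to amount to the same thing: the
inner loop is fully unrolled into three stores, and i is reloaded from memory
on every pass. The sketch below is a hand translation under the assumption
that d[0] sits at -40(%ebp); unrolled_nest is a made-up name.

```c
int i;  /* global loop counter, reloaded from memory each pass */

/* Hypothetical C rendering of loop .L9; assumes d[0] is at -40(%ebp). */
void unrolled_nest(int d[6])
{
    i = 0;                  /* movl %ecx, i with %ecx == 0 */
    do {
        int k = i;          /* movl i, %eax */
        d[k] = 0;           /* -40(%ebp,%eax,4) */
        d[k + 1] = 0;       /* -36(%ebp,%eax,4), or -40 indexed by k + 1 */
        d[k + 2] = 0;       /* -32(%ebp,%eax,4) */
        i = k + 1;          /* store the incremented counter back to i */
    } while (i <= 3);       /* cmpl $3, ...; jle .L9 */
}
```

The loop runs for i = 0..3, every store lands in d[0]..d[5], and i is 4 on
exit, matching the source-level loop nest.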
gcc 3.5 and tree-ssa are clever and remove all of main ;)
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=11841