This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.
[Bug tree-optimization/34160] New: Useful loop invariant motion missing
- From: "amonakov at gmail dot com" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: 20 Nov 2007 12:05:40 -0000
- Subject: [Bug tree-optimization/34160] New: Useful loop invariant motion missing
- Reply-to: gcc-bugzilla at gcc dot gnu dot org
int main()
{
  static int i, n;
  static double a[200], b[200];
  ... (more variables and control flow)
  for (i = 0; i < n; i++)
    a[i] = b[i];
  ...
}
Tree-level optimisations do not move the loads of i and n, or the store to i,
out of the loop. As a result, GCC generates five memory accesses per iteration
on ia64 (4.3.0 20071112):
.L9:
.mii
nop 0
sxt4 r14 = r16
nop 0
.mmi
ld4 r15 = [r32]
ld4 r58 = [r33]
nop 0
;;
.mii
shladd r14 = r14, 3, r0
adds r16 = 1, r15
;;
add r15 = r35, r14
.mmi
add r14 = r34, r14
st4 [r32] = r16
cmp4.lt p6, p7 = r16, r58
;;
.mmi
nop 0
ldfd f6 = [r15]
nop 0
;;
.mib
stfd [r14] = f6
nop 0
(p6) br.cond.dptk .L9
On x86_64 the situation is better (4.3.0 20070930), but still not good:
.L13:
movslq %eax,%rdx
movq b.3894(,%rdx,8), %rax
movq %rax, x.3895(,%rdx,8)
leal 1(%rcx), %eax
cmpl %eax, %edi
movl %eax, %ecx
movl %eax, i.3912(%rip)
jg .L13
However, that optimization happened at the RTL level; the final_cleanup dump still reads:
<bb 13>:
# MPT.140_429 = VDEF <MPT.140_645>
x[i.265] = b[i.265];
# VUSE <MPT.140_429>
i.23 = i;
i.265 = i.23 + 1;
# MPT.140_430 = VDEF <MPT.140_429>
i = i.265;
# VUSE <MPT.140_430>
n.274 = n;
if (n.274 > i.265)
goto <bb 13>;
else
goto <bb 14>;
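For illustration, here is a hand-written sketch of the transformation this report asks for. It is not GCC output. The function names copy_original and copy_hoisted and the locals n_local and i_local are hypothetical, and the statics are moved to file scope only so that both variants can share them. The sketch assumes n does not exceed 200 and that nothing else observes i while the loop runs.

```c
/* Stand-ins for the function-local statics in the report
   (moved to file scope for this illustration only). */
static int i, n;
static double a[200], b[200];

/* The loop as written in the report: i and n live in memory, so a
   compiler that does not hoist them reloads both and stores i on
   every iteration. */
void copy_original(void)
{
    for (i = 0; i < n; i++)
        a[i] = b[i];
}

/* Hand-applied version of the requested optimisation: n is loaded
   once before the loop, the counter is kept in a local, and i is
   stored once after the loop. */
void copy_hoisted(void)
{
    int n_local = n;   /* single load of n */
    int i_local;       /* counter kept out of memory */

    for (i_local = 0; i_local < n_local; i_local++)
        a[i_local] = b[i_local];

    i = i_local;       /* single store to i */
}
```

Under those assumptions both functions leave a and i in the same final state; in the second, only the load from b and the store to a remain as memory accesses inside the loop.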
--
Summary: Useful loop invariant motion missing
Product: gcc
Version: 4.3.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: amonakov at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34160