This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.
[Bug target/58112] New: Ineffective addressing mode used in loop.
- From: "neleai at seznam dot cz" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: Fri, 09 Aug 2013 12:57:49 +0000
- Subject: [Bug target/58112] New: Ineffective addressing mode used in loop.
- Auto-submitted: auto-generated
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58112
Bug ID: 58112
Summary: Ineffective addressing mode used in loop.
Product: gcc
Version: 4.9.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: neleai at seznam dot cz
Hi, for the following testcase gcc -O3 generates the following loop:
        movq    %rsi, %r9
        subq    %rdx, %r9
        movq    %r9, %rdi
        movq    %r9, %rsi
        leaq    16(%r9), %r8
        addq    $32, %rdi
        addq    $48, %rsi
        .p2align 4,,10
        .p2align 3
.L14:
        movdqu  (%rdx,%r9), %xmm0
        addq    $64, %rdx
        movdqa  %xmm0, -64(%rdx)
        movdqu  -64(%rdx,%r8), %xmm0
        movdqa  %xmm0, -48(%rdx)
        movdqu  -64(%rdx,%rdi), %xmm0
        movdqa  %xmm0, -32(%rdx)
        movdqu  -64(%rdx,%rsi), %xmm0
        movdqa  %xmm0, -16(%rdx)
        cmpq    %rdx, %rcx
        jne     .L14
        rep; ret
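This saves one addq $64, %rsi instruction per iteration. However, it occupies
four extra registers (%r9, %r8, %rdi and %rsi each hold a source-minus-destination
offset), and the indexed address calculations done in each iteration cost more,
and lead to bigger code, than the single instruction saved.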
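
For illustration only, here is a minimal C sketch of the addressing scheme argued for above: the source and destination pointers are each advanced by 64 per iteration, so every load and store needs only a base register plus a constant displacement, e.g. 16(%rsi) and 16(%rdx). This is not the testcase from this report. The function name copy64_two_pointers and its preconditions (n a multiple of 64, dst 16-byte aligned) are assumptions made for the example, and nothing here guarantees that a given GCC version emits that addressing for it.

```c
#include <assert.h>
#include <emmintrin.h>  /* SSE2 intrinsics: _mm_loadu_si128, _mm_store_si128 */
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not from the bug report): copy n bytes from a
 * possibly unaligned src to a 16-byte-aligned dst, 64 bytes per iteration.
 * Assumes n is a multiple of 64 and that the buffers do not overlap. */
void copy64_two_pointers(unsigned char *dst, const unsigned char *src, size_t n)
{
    assert(n % 64 == 0);
    assert((uintptr_t)dst % 16 == 0);

    unsigned char *end = dst + n;
    while (dst != end) {
        /* Unaligned loads (movdqu) at constant offsets from src. */
        __m128i a = _mm_loadu_si128((const __m128i *)(src + 0));
        __m128i b = _mm_loadu_si128((const __m128i *)(src + 16));
        __m128i c = _mm_loadu_si128((const __m128i *)(src + 32));
        __m128i d = _mm_loadu_si128((const __m128i *)(src + 48));

        /* Aligned stores (movdqa) at constant offsets from dst. */
        _mm_store_si128((__m128i *)(dst + 0), a);
        _mm_store_si128((__m128i *)(dst + 16), b);
        _mm_store_si128((__m128i *)(dst + 32), c);
        _mm_store_si128((__m128i *)(dst + 48), d);

        /* Two pointer increments per iteration instead of one, but no
         * extra registers holding src - dst offsets. */
        src += 64;
        dst += 64;
    }
}
```

Written this way, the loop keeps two live pointers plus the end pointer, at the price of the one additional addq that the generated code above avoids.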