This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.
[Bug rtl-optimization/31704] New: x86_64 poor floating point register allocation across function call
- From: "ian at airs dot com" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: 25 Apr 2007 15:09:08 -0000
- Reply-to: gcc-bugzilla at gcc dot gnu dot org
When I compile this test case with -O2 for x86_64:
extern void g (void);
float
f (float sum, float mult, int *pi)
{
  int i, j;
  for (i = 0; i < 10; ++i)
    {
      g ();
      for (j = 0; j < 1000; ++j)
        sum += *pi++ * mult;
    }
  return sum;
}
I get this result:
f:
.LFB2:
        pushq %rbp
.LCFI0:
        movaps %xmm0, %xmm2
        xorl %ebp, %ebp
        pushq %rbx
.LCFI1:
        movq %rdi, %rbx
        subq $40, %rsp
.LCFI2:
        movss %xmm1, 28(%rsp)
.L2:
        movss %xmm2, (%rsp)
        call g
        cvtsi2ss (%rbx), %xmm0
        leaq 4(%rbx), %rax
        movl $1, %edx
        movss (%rsp), %xmm2
        mulss 28(%rsp), %xmm0
        addss %xmm0, %xmm2
        .p2align 4,,7
.L3:
        cvtsi2ss (%rax), %xmm1
        addl $1, %edx
        addq $4, %rax
        cmpl $1000, %edx
        mulss 28(%rsp), %xmm1
        addss %xmm1, %xmm2
        jne .L3
        addl $1, %ebp
        addq $4000, %rbx
        cmpl $10, %ebp
        jne .L2
        addq $40, %rsp
        movaps %xmm2, %xmm0
        popq %rbx
        popq %rbp
        ret
In the original code, the inner loop is performance critical. Note that the
mulss in the inner loop (.L3) takes mult from memory, at 28(%rsp). It would be
more efficient to keep mult in a register for the duration of the inner loop.
In fact the value arrived in a register (%xmm1), but it was stored to the stack
because it is live across the call to g (the System V x86_64 ABI treats every
%xmm register as call-clobbered). It is then loaded from the stack once for
each inner loop iteration rather than once for each outer loop iteration.
I don't see a simple approach to fixing this. Some sort of live range
splitting might work.
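
As a rough illustration of the shape live range splitting would aim for here,
the split can be written by hand at the source level: copy mult into a fresh
variable after the call, so that the copy does not cross the call to g. This is
only a sketch. The names f_split, run_f_split and g_calls are hypothetical, and
the stub definition of g exists only to make the example self-contained. The
compiler is free to coalesce m with mult again, so I have not verified that
this changes the generated code.

```c
#include <stddef.h>

/* Stub for the report's external g (); counting calls is an assumption
   made here so the example can be built and run on its own.  */
static int g_calls = 0;
void g (void) { ++g_calls; }

/* Hand-split variant of f (hypothetical name).  */
float
f_split (float sum, float mult, int *pi)
{
  int i, j;
  for (i = 0; i < 10; ++i)
    {
      g ();
      /* New, short live range: m starts after the call to g and ends
         with the inner loop, so it never crosses a call.  */
      float m = mult;
      for (j = 0; j < 1000; ++j)
        sum += *pi++ * m;
    }
  return sum;
}

/* Hypothetical driver: 10 * 1000 ones, each scaled by 0.5f.  */
float
run_f_split (void)
{
  static int data[10 * 1000];
  size_t k;
  for (k = 0; k < sizeof data / sizeof data[0]; ++k)
    data[k] = 1;
  g_calls = 0;
  return f_split (0.0f, 0.5f, data);
}
```

The intent is that mult itself may still live on the stack across the call,
while m is reloaded once per outer loop iteration and stays in a register
through the inner loop.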
--
Summary: x86_64 poor floating point register allocation across
function call
Product: gcc
Version: 4.3.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: rtl-optimization
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: ian at airs dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31704