[Bug target/49421] New: [arm] suboptimal choice of working regs
philb at gnu dot org
gcc-bugzilla@gcc.gnu.org
Wed Jun 15 12:02:00 GMT 2011
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49421
Summary: [arm] suboptimal choice of working regs
Product: gcc
Version: 4.6.0
Status: UNCONFIRMED
Severity: enhancement
Priority: P3
Component: target
AssignedTo: unassigned@gcc.gnu.org
ReportedBy: philb@gnu.org
If a leaf function requires one more working register than can be accommodated
in the call-clobbered set, gcc currently tends to push r4 and use that next.
However, in the specific case of a leaf function, it would be better to push lr
and use that as the working register, since then the return can be done with a
single pop. Consider the made-up example:
int f(int *a, int *b, int *c, int *d)
{
    int i;
    for (i = 0; i < 4; i++)
        if (a[i] || b[i] || c[i] || d[i])
            return 1;
    return 0;
}
which compiles (-march=armv6 -mtune=arm1136jf-s -O2) to:
f:
    @ args = 0, pretend = 0, frame = 0
    @ frame_needed = 0, uses_anonymous_args = 0
    @ link register save eliminated.
    mov ip, #0
    str r4, [sp, #-4]!
.L3:
    ldr r4, [r0, ip]
    cmp r4, #0
    bne .L7
    ldr r4, [r1, ip]
    cmp r4, #0
    bne .L7
    ldr r4, [r2, ip]
    cmp r4, #0
    bne .L7
    ldr r4, [r3, ip]
    add ip, ip, #4
    cmp r4, #0
    bne .L7
    cmp ip, #16
    bne .L3
    mov r0, r4
.L2:
    ldmfd sp!, {r4}
    bx lr
.L7:
    mov r0, #1
    b .L2
If lr had been pushed instead of r4, then the return could have simply been
"pop {pc}".
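For illustration, here is a hand-written sketch of what the lr-based variant
might look like. This is not actual GCC output and has not been assembled or
tested; it assumes ARM (not Thumb) mode on armv6 and keeps the same labels and
register roles as the listing above, with lr taking the place of r4. Because
the saved value on the stack is the return address, each exit can pop it
straight into pc, so the separate "bx lr" and the branch to a shared epilogue
both disappear.

```asm
f:
    str lr, [sp, #-4]!      @ save lr; it now serves as the scratch register
    mov ip, #0              @ ip = byte offset into the four arrays
.L3:
    ldr lr, [r0, ip]        @ a[i]
    cmp lr, #0
    bne .L7
    ldr lr, [r1, ip]        @ b[i]
    cmp lr, #0
    bne .L7
    ldr lr, [r2, ip]        @ c[i]
    cmp lr, #0
    bne .L7
    ldr lr, [r3, ip]        @ d[i]
    add ip, ip, #4
    cmp lr, #0
    bne .L7
    cmp ip, #16
    bne .L3
    mov r0, #0              @ no non-zero element found
    ldmfd sp!, {pc}         @ pop the saved return address into pc
.L7:
    mov r0, #1
    ldmfd sp!, {pc}         @ same single-instruction return
```

Compared with the r4 version, this sketch has the same number of instructions
in the loop but executes one instruction fewer on the "return 0" path and two
fewer on the "return 1" path.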
Also, since this is arm11, it is no more expensive to push two words than one.
If the compiler had stacked both r4 and lr, it would have freed up an extra
register for the loop which would probably have allowed the loads to be
scheduled better.