This is the mail archive of the
gcc-bugs@gcc.gnu.org
mailing list for the GCC project.
[Bug target/45937] New: unnecessary push/pop to reserve stack memory
- From: "carrot at google dot com" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: Fri, 8 Oct 2010 03:48:18 +0000
- Subject: [Bug target/45937] New: unnecessary push/pop to reserve stack memory
- Auto-submitted: auto-generated
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45937
Summary: unnecessary push/pop to reserve stack memory
Product: gcc
Version: 4.6.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
AssignedTo: unassigned@gcc.gnu.org
ReportedBy: carrot@google.com
CC: carrot@google.com
Host: i686-linux
Target: arm-eabi
Build: i686-linux
Created attachment 21995
--> http://gcc.gnu.org/bugzilla/attachment.cgi?id=21995
test case
When the attached source code is compiled with options -march=armv7-a -mthumb
-Os, gcc generates:
tt:
push {r4, r5, r6, r7, lr}
sub sp, sp, #20
mov r5, r2
ldr r4, [sp, #40]
cbz r4, .L1
movs r2, #20
muls r2, r1, r2
adds r6, r3, r2
ldr r3, [r3, r2]
cbz r3, .L1
ldr lr, [r6, #8]
ldr r7, [r6, #12]
ldr r3, [r6, #16]
ldr r2, [r6, #4]
ldr r6, .L5
str lr, [sp, #0]
cmp r3, #0
it eq
moveq r3, r6
str r7, [sp, #4]
str r3, [sp, #8]
mov r3, r5
blx r4
.L1:
add sp, sp, #20
pop {r4, r5, r6, r7, pc}
Notice that this function uses only 12 bytes of stack memory to pass
parameters, but it allocates 20 bytes, and the other 8 bytes are never used. So
the function prologue and epilogue can be rewritten as follows, saving 2
instructions:
tt:
push {r1, r2, r3, r4, r5, r6, r7, lr}
...
pop {r1, r2, r3, r4, r5, r6, r7, pc}
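The idea of reserving stack space by pushing extra scratch registers can be sketched in C. This is only an illustration, not GCC code: the helper name widen_push_mask and the bit-mask representation (bit n stands for register rn) are assumptions made for this example. It adds registers numbered just below the lowest saved register, so that the reserved words end up at the lowest addresses, where the outgoing arguments live. It skips r0 on the assumption that r0 may carry a return value and must not be overwritten by the matching pop.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not part of GCC): given the mask of registers a
 * prologue already pushes (bit n = rn), add enough scratch registers below
 * the lowest saved one to reserve extra_bytes of stack.
 * Returns the widened mask, or 0 if the space cannot be reserved this way.
 */
static uint16_t widen_push_mask(uint16_t saved_mask, unsigned extra_bytes)
{
    /* Each pushed register reserves exactly one 4-byte word. */
    if (saved_mask == 0 || extra_bytes % 4 != 0)
        return 0;

    unsigned words = extra_bytes / 4;

    /* Find the lowest-numbered register already in the list. */
    int lowest = 0;
    while (!(saved_mask & (1u << lowest)))
        lowest++;

    /* Add registers below it, stopping at r1 (r0 is assumed to be off limits). */
    uint16_t mask = saved_mask;
    for (int reg = lowest - 1; reg >= 1 && words > 0; reg--) {
        mask |= (uint16_t)(1u << reg);
        words--;
    }

    return words == 0 ? mask : 0;
}
```

For the function above, the saved list {r4, r5, r6, r7, lr} corresponds to the mask 0x40F0, and reserving 12 bytes widens it to {r1, ..., r7, lr}, which matches the proposed prologue.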
The root cause of this problem is that stack memory is allocated and aligned
separately for the outgoing arguments and the callee-saved registers. In function
expand_call() 12 bytes are needed and 16 bytes are allocated to align to 8 bytes.
In function arm_get_frame_offsets() 20 bytes are needed and 24 bytes are
allocated to save registers. So this function needs 40 bytes of stack, which
exceeds the capability of push/pop alone, and extra sub/add instructions are
needed to adjust sp. Actually the function uses only 32 bytes of stack (12 for
the arguments plus 20 for the saved registers) and no data element requires
8-byte alignment, so a simple push/pop should be enough.
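The arithmetic behind the 40-byte versus 32-byte figures can be checked with a small C sketch. The function names below are made up for illustration and do not correspond to anything in GCC; they only model the two ways of rounding described in this report: padding each area on its own, or padding the combined total once.

```c
#include <assert.h>
#include <stddef.h>

/* Round size up to the next multiple of align (align must be a power of two). */
static size_t align_up(size_t size, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (size + align - 1) & ~(align - 1);
}

/* Illustrative model: each area is padded separately, as described above. */
static size_t frame_size_separate(size_t outgoing_args, size_t saved_regs,
                                  size_t align)
{
    return align_up(outgoing_args, align) + align_up(saved_regs, align);
}

/* Illustrative model: only the combined total is padded. */
static size_t frame_size_combined(size_t outgoing_args, size_t saved_regs,
                                  size_t align)
{
    return align_up(outgoing_args + saved_regs, align);
}
```

With 12 bytes of outgoing arguments, 20 bytes of saved registers and 8-byte alignment, the first model gives 16 + 24 = 40 bytes, while the second gives 32 bytes, which is exactly what a push of eight registers provides.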