Bug 11824 - [ARM] Parameter passing via stack could be improved
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: target
Version: 3.4.0
Importance: P3 enhancement
Target Milestone: 4.6.0
Assignee: Not yet assigned to anyone
URL:
Keywords: missed-optimization
Depends on:
Blocks: 16996
Reported: 2003-08-06 08:52 UTC by Gábor Lóki
Modified: 2017-06-20 13:28 UTC

See Also:
Host:
Target: arm-unknown-elf
Build:
Known to work:
Known to fail:
Last reconfirmed: 2005-12-09 04:24:48

Description Gábor Lóki 2003-08-06 08:52:50 UTC
For each function that uses the stack for parameter passing, GCC generally
first stores lr and decrements sp, and before returning it increments
sp and loads lr into pc.
If only a few stack slots are needed, GCC could reserve them in a more
compact way (as far as code size is concerned): save some arbitrary extra
registers together with lr in the same store-multiple, so that sp is
adjusted implicitly and the separate sub instruction is no longer needed.

--- c example ---
// arm-elf-gcc -S -g0 -Os -o param-stack.s param-stack.c
void func (int* a, int* b);
void foo ()
{
   int a=6, b=7;
   func(&a, &b);
}

--- asm code ---
foo:
 mov ip, sp
 stmfd sp!, {fp, ip, lr, pc} <- OLD
 mov r3, #6
 sub fp, ip, #4
 sub sp, sp, #8 <- OLD
 sub r0, fp, #16
 str r3, [fp, #-16]
 sub r1, fp, #20
 add r3, r3, #1
 str r3, [fp, #-20]
 bl func
 ldmea fp, {fp, sp, pc}

--- possible solution ---
foo:
 mov ip, sp
 stmfd sp!, {r1, r2, fp, ip, lr, pc} <- NEW
 mov r3, #6
 sub fp, ip, #4
 sub r0, fp, #16
 str r3, [fp, #-16]
 sub r1, fp, #20
 add r3, r3, #1
 str r3, [fp, #-20]
 bl func
 ldmea fp, {fp, sp, pc}
Comment 1 Dara Hazeghi 2003-08-25 15:56:08 UTC
Confirmed with mainline (20030825).
Comment 2 Steven Bosscher 2005-09-02 11:21:32 UTC
Not reconfirmed for almost a year. Is this still an issue?
Comment 3 Richard Earnshaw 2005-09-02 12:39:17 UTC
Undoubtedly. But I don't see much prospect of this being changed any time soon. It would require too much co-operation between the mid- and back-ends.
Comment 4 Bill Pringlemeir 2013-04-20 13:59:19 UTC
As far as I understand, the instruction stream is smaller, but the wider store-multiple performs two extra memory writes in place of the sub that adjusts the stack. This optimization only matters for '-Os'. In general it will slow the code down, since a data write and a code fetch cost roughly the same, and this trades one instruction fetch for two data writes.
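
To make the trade-off concrete, here is a rough cost model in C. It is only a sketch: it assumes ARM mode (4 bytes per instruction), counts nothing but the prologue's store-multiple and the optional sub, and ignores caches and timing entirely. The struct and function names are hypothetical and do not come from GCC.

```c
#include <assert.h>

/* Hypothetical cost of a prologue: code size and words written to the stack. */
struct prologue_cost {
    unsigned code_bytes;   /* bytes of instructions, 4 per ARM-mode instruction */
    unsigned stack_writes; /* 32-bit words stored by the store-multiple */
};

/* Variant "OLD": stmfd of saved_regs registers, then "sub sp, sp, #frame_bytes". */
static struct prologue_cost cost_push_then_sub(unsigned saved_regs,
                                               unsigned frame_bytes)
{
    struct prologue_cost c;
    assert(frame_bytes % 4 == 0);             /* frame is a whole number of words */
    c.code_bytes = 4 + (frame_bytes ? 4 : 0); /* stmfd, plus sub if a frame exists */
    c.stack_writes = saved_regs;
    return c;
}

/* Variant "NEW": one stmfd widened by frame_bytes / 4 extra (dummy) registers. */
static struct prologue_cost cost_widened_push(unsigned saved_regs,
                                              unsigned frame_bytes)
{
    struct prologue_cost c;
    assert(frame_bytes % 4 == 0);
    c.code_bytes = 4;                         /* a single stmfd, no sub */
    c.stack_writes = saved_regs + frame_bytes / 4;
    return c;
}
```

For foo in the description (four saved registers, an 8-byte frame), cost_push_then_sub(4, 8) gives 8 code bytes and 4 writes, while cost_widened_push(4, 8) gives 4 code bytes and 6 writes: one instruction saved at the price of two extra stores.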
Comment 5 Richard Earnshaw 2017-06-20 13:28:29 UTC
I'm not sure exactly when this was fixed, but certainly it was some time ago.  At least, gcc-4.6 appears to implement this optimization at -Os.

foo:
        @ Function supports interworking.
        @ args = 0, pretend = 0, frame = 8
        @ frame_needed = 0, uses_anonymous_args = 0
        stmfd   sp!, {r0, r1, r2, lr}
        mov     r3, #7
        mov     r2, #6
        mov     r0, sp
        add     r1, sp, #4
        stmia   sp, {r2, r3}
        bl      func
        ldmfd   sp!, {r1, r2, r3, lr}
        bx      lr
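
The frame built by this prologue can be pictured as a small C struct. This is an illustrative sketch only: the struct and field names are hypothetical, and the reading of the r2 slot as padding (presumably to keep sp 8-byte aligned) is an assumption based on the "frame = 8" comment and the 16 bytes actually pushed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical picture of foo's stack after "stmfd sp!, {r0, r1, r2, lr}".
   Offsets are relative to sp; stmfd stores lower-numbered registers at
   lower addresses. */
struct foo_frame {
    uint32_t a;        /* slot pushed from r0, then set to 6 by stmia */
    uint32_t b;        /* slot pushed from r1, then set to 7 by stmia */
    uint32_t padding;  /* slot pushed from r2, never read (assumed padding) */
    uint32_t saved_lr; /* return address */
};

/* Checks that the struct agrees with the offsets used in the listing. */
static void check_foo_frame(void)
{
    assert(offsetof(struct foo_frame, a) == 0);         /* mov r0, sp     */
    assert(offsetof(struct foo_frame, b) == 4);         /* add r1, sp, #4 */
    assert(offsetof(struct foo_frame, saved_lr) == 12);
    assert(sizeof(struct foo_frame) == 16);             /* four words pushed */
}
```

The epilogue's "ldmfd sp!, {r1, r2, r3, lr}" pops the same four words, discarding the first three into scratch registers, so no separate add or sub on sp is needed in either direction.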