Simple ARM code generation

Steve Freeland
Wed Aug 23 15:44:00 GMT 2006

> From: Richard Earnshaw <>
> To: Steve Freeland <>
> Sent: Monday, August 21, 2006 2:22:34 PM
> Subject: Re: Simple ARM code generation 

> On Sun, 20 Aug 2006 15:43:03 PDT, Steve Freeland wrote:
> > 
> > > From: Richard Earnshaw <>
> > > On Mon, 14 Aug 2006 15:15:46 PDT, Steve Freeland wrote:
> > > 
> > > > 00000000 <AEEMod_Load>:
> > > >    0:   e1a0c00d        mov     ip, sp
> > > >    4:   e92dd800        stmdb   sp!, {fp, ip, lr, pc}
> > > >    8:   e24cb004        sub     fp, ip, #4      ; 0x4
> > > >    c:   e24dd00c        sub     sp, sp, #12     ; 0xc
> > > >   10:   e50b0010        str     r0, [fp, #-16]
> > > >   14:   e50b1014        str     r1, [fp, #-20]
> > > >   18:   e50b2018        str     r2, [fp, #-24]
> > > >   1c:   e3a03002        mov     r3, #2  ; 0x2
> > > >   20:   e1a00003        mov     r0, r3
> > > >   24:   e24bd00c        sub     sp, fp, #12     ; 0xc
> > > >   28:   e89da800        ldmia   sp, {fp, sp, pc}
> > > > 

> ldm and stm can normally be matched in pairs, so for example stmdb 
> (D_ecrement B_efore) can be matched with ldmia (I_ncrement A_fter).  The 
> registers in both instructions are always in the same order, with the 
> lowest numbered register at the lowest address in memory and incrementing 
> upwards from there.
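
A rough way to see the pairing and ordering rule is a toy model in Python. This is only a sketch: the register and memory dictionaries and the helper names (reg_number, stmdb, ldmia) are made up for illustration, and the case where the base register is itself in the load list is simplified.

```python
# Toy model of ARM block transfers.  Registers live in a dict keyed by
# name; memory maps word-aligned addresses to values.  All names here
# are illustrative, not part of any real tool.
ALIASES = {"fp": 11, "ip": 12, "sp": 13, "lr": 14, "pc": 15}


def reg_number(name):
    """Map 'r4' or an alias such as 'sp' to its register number."""
    return ALIASES[name] if name in ALIASES else int(name[1:])


def stmdb(regs, mem, base, reglist, writeback=True):
    """Store multiple, decrement before: the block ends just below the base."""
    ordered = sorted(reglist, key=reg_number)   # order written in source is irrelevant
    start = regs[base] - 4 * len(ordered)
    for i, name in enumerate(ordered):
        mem[start + 4 * i] = regs[name]         # lowest register, lowest address
    if writeback:
        regs[base] = start


def ldmia(regs, mem, base, reglist, writeback=True):
    """Load multiple, increment after: the block starts at the base."""
    ordered = sorted(reglist, key=reg_number)
    start = regs[base]
    values = [mem[start + 4 * i] for i in range(len(ordered))]
    for name, value in zip(ordered, values):
        regs[name] = value
    # Simplification: skip writeback if the base register was just loaded.
    if writeback and base not in ordered:
        regs[base] = start + 4 * len(ordered)
```

Storing r4-r6 with stmdb and writeback, then loading the same list with ldmia and writeback, leaves sp and all three registers where they started, whatever order the registers were listed in.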

> To make things a bit easier when talking about stacks, you can also 
> describe the stack layout in the instructions themselves and write your 
> ldm and stm instructions using the stack mnemonics.  The most common 
> layout is a 'full-descending' stack (the stack grows by moving to a lower 
> address, and the bottom, addressed, word contains data: it's full).  So

>     stmfd sp!, {r4, r5, r6}
>     ldmfd sp!, {r4, r5, r6}

> will push r4-r6 onto the stack and then pop them off again.

> Now, going back to your original example, you will see that the compiler 
> pushes 4 words onto the stack at the start of the function, but at the end 
> it only pops 3 words off.  How does this work and not leave the stack 
> corrupted?

> The answer is that GCC is also saving the stack pointer on the stack, so 
> when the pop happens at the end the original value of the stack pointer 
> (which we copied into IP before we started messing with the stack at all) 
> is restored directly.

Aaah, now I understand.  I was trying to work it out with the order of the registers on the
stack in reverse.  So at the end, the original fp gets loaded back into fp, the original sp
gets loaded back into sp (after a trip through ip) and the original lr gets loaded into pc,
which triggers the procedure return.
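
To double-check that reading, here is a rough Python trace of the prologue and epilogue from the AEEMod_Load listing. It is a sketch under assumptions: run_frame and its starting values are hypothetical, the function body between prologue and epilogue is skipped, and the value stored for pc is just a placeholder since it is never loaded back.

```python
def run_frame(sp0, fp0, lr0):
    """Trace the frame setup and teardown; return the final register values."""
    regs = {"fp": fp0, "ip": 0, "sp": sp0, "lr": lr0, "pc": 0}
    mem = {}

    # mov ip, sp
    regs["ip"] = regs["sp"]

    # stmdb sp!, {fp, ip, lr, pc}: four words, lowest register at lowest address
    base = regs["sp"] - 16
    for i, name in enumerate(["fp", "ip", "lr", "pc"]):
        mem[base + 4 * i] = regs[name]
    regs["sp"] = base

    # sub fp, ip, #4   (fp now points at the saved pc slot)
    regs["fp"] = regs["ip"] - 4
    # sub sp, sp, #12  (room for the three stored arguments)
    regs["sp"] -= 12

    # ... function body would run here ...

    # sub sp, fp, #12  (sp now points at the saved fp slot)
    regs["sp"] = regs["fp"] - 12

    # ldmia sp, {fp, sp, pc}: three words, no writeback.  All three values
    # are read from memory first, then written to the registers.
    base = regs["sp"]
    loaded = (mem[base], mem[base + 4], mem[base + 8])
    regs["fp"], regs["sp"], regs["pc"] = loaded
    return regs
```

Starting from sp = 0x1000, the four words land at 0xff0-0xffc; the load at the end reads only the first three, and the second of those is the old sp saved via ip, which is why the fourth word never needs popping.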

> > Also, this:
> > nter claims that r15/pc can't always be manipulated like any other register. 
> >  Is that correct?  If so, is using ldmia into r15/pc always ok?  
> > 

> R15 is the program counter, and it's true that you can't treat it entirely 
> like a normal register, but you can load and store its value; you can copy 
> it to other registers; and you can use it in some simple addressing 
> operations (either to generate an address value in a register, or directly 
> in a pc-relative load instruction).  For example, it's perfectly 
> acceptable to write

>     ldr    r0, [pc, #32]
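
As a small worked example of which address such a load touches: in ARM state, reading pc as an operand gives the address of the current instruction plus 8. The helper below is a hypothetical sketch in Python (the name pc_relative_address is invented for illustration) and covers ARM state only.

```python
def pc_relative_address(instruction_address, offset):
    """Address accessed by `ldr rX, [pc, #offset]` in ARM state."""
    # In ARM state the pc operand reads as the instruction's own
    # address plus 8, so the offset is applied on top of that.
    return instruction_address + 8 + offset
```

So if the ldr above sat at address 0x1c, it would load the word at 0x1c + 8 + 32 = 0x44.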

> > > Let me take a guess.  You are using something like an ARM920 (or an
> > > ARM7TDMI) device, and you are calling this function from Thumb code.
> > > If so, then you need to compile your function with -mthumb-interwork,
> > > then it will generate a return sequence that switches correctly back
> > > to Thumb.
> > 
> > You're correct that it's an ARM7TDMI device.  The code which calls my
> > code is in firmware, so I'm not entirely sure whether it's Thumb or
> > not...  If you're right, then presumably the calling procedure puts the
> > appropriate flag in the LSB of r14/lr and expects the procedure being
> > called to use bx and not ldmia to return.  Is it the case, then, that
> > the switch to or from Thumb mode can't be done by modifying pc with
> > ldmia?  If so, that would certainly explain the crash.  I'll try that
> > as soon as I get the chance.

> The ARM7TDMI is an implementation of version 4T of the ARM architecture, 
> often termed ARMv4T.  This was the first revision of the architecture to 
> support Thumb, and support for compiling your application as a mixture of 
> both ARM and Thumb code (termed interworking) was fairly limited: the only 
> way you could switch states was by using the BX instruction.  The next 
> revision of the architecture added support for state switching to ldr and 
> ldm instructions as well, which makes interworking much more efficient.  
> If you compile your original example with -mthumb-interwork you'll 
> probably see a return sequence something like

>    24:   e24bd00c        sub     sp, fp, #12     ; 0xc
>    28:   e89d6800        ldmia   sp, {fp, sp, lr}
>    2c:   e12fff1e        bx      lr
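
The difference between the two return styles on ARMv4T can be sketched in Python. This is a simplified model, not a statement of exactly what the hardware does: the function names are made up, and the handling of the low address bits on a plain load into pc is an assumption made to keep the example small.

```python
def bx(target):
    """Model of `bx Rm`: bit 0 of the target selects the instruction set.

    Returns (new_pc, thumb_state).
    """
    return target & ~1, bool(target & 1)


def load_pc_armv4t(target):
    """Model of loading pc with ldr/ldm from ARM state on ARMv4T.

    Assumption for this sketch: the core stays in ARM state and the low
    two bits of the loaded value are dropped.
    """
    return target & ~3, False
```

With a Thumb return address such as 0x8005 in lr, bx lands at 0x8004 in Thumb state, while the plain load lands at 0x8004 still in ARM state, so the caller's Thumb instructions would be decoded as ARM.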

> Anyway, I've probably bamboozled you with more than enough information by 
> now, so I'd better stop.  Hope the above helps,

Actually, that was perfect, I feel much more solid on this now.  Thanks a million!

The -mthumb-interwork flag seems to be the magic bullet; my code now works correctly
and doesn't crash the device.  The crux is definitely the use of ldmia vs. bx instructions to
perform procedure returns, but I think there's something to it besides setting the
Thumb/ARM mode flag.  If that were the only issue, only the final return out of my code and
back to firmware would have to use bx, right?  But there are indications that I need *all*
procedure returns to use bx.  I haven't checked this very carefully yet, but I suspect there
may be more to the issue...  I may have to look into it more systematically at some point.

Anyways, you've definitely fixed my immediate problem.  Once again, many thanks!

- Steve
