This is the mail archive of the mailing list for the GCC project.


Re: ARM: Imply frame pointer for arm-linux profiling

On Wed, May 11, 2005 at 03:29:53PM +0100, Richard Earnshaw wrote:
> On Wed, 2005-05-11 at 14:10, Daniel Jacobowitz wrote:
> > What would you recommend?  The problem with the ip-based version is
> > that it means you can't go through a PLT on the way to mcount - for
> > this reason I suspect the netbsd-elf implementation is a little quirky. 
> > I suppose we could require the runtime library implementation to
> > provide an entry point for mcount which will not pass through the PLT,
> > which does appropriate register saving.  Alternatively, force GCC to
> > save/restore r3 if it is live on function entry.
> Hmm, none of the existing solutions are really ABI compatible, since all
> of them use IP in some way (generally to cache the caller's return
> address).  With interworking (on V4T) that's unsafe even in a statically
> linked environment.

Ah right.  I always forget about that.

> It seems to me that we should just accept that IP & LR will get
> clobbered and work from there.  The obvious solution is then
> 	.data
> LP:
> 	.word 0
> 	.text
> foo:
> 	push	{lr}
> 	bl	__gnu_mcount
> 	.4byte	LP - .
> 	// Normal code for foo (including normal prologue)
> We can use the same sequence in both ARM and Thumb code.

FYI, this is even simpler on Linux.  The Linux implementation of mcount
does not require profile counters; we currently generate them, but
that's just an oversight.  This is based on a very old BSD
implementation; I don't know the pedigree of the counter-based

So this would be just the push and branch, plus appropriate dwarf2
frame gunk.

> There are a few things to note here:
> 1) We use .4byte because we can't assume that the entry will be aligned
> in thumb state (it will always be on a half-word boundary, but we can't
> guarantee a full word alignment).
> 2) The address of the count word is always stored PIC, even in non-pic
> code.  That means we can profile both normal and PIC code in the same
> manner. 

This is only adequately PIC on systems where the text and data segments
are loaded at fixed offsets from each other.  I've got at least one
(VxWorks) where this isn't the case.  I think some other platforms have
similar constraints.
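
To make the caveat concrete, here is a minimal sketch in C of the arithmetic behind `.4byte LP - .`. It is only a model with plain integers, not code from any runtime library; the function names and the "bias" parameters (how far each segment was moved from its link-time address) are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Value that would be stored for `.4byte LP - .` at link time. */
uint32_t link_time_offset(uint32_t lp_link, uint32_t slot_link)
{
    return lp_link - slot_link;
}

/* Counter address recomputed at run time from where the slot actually is. */
uint32_t recovered_address(uint32_t slot_link, uint32_t text_bias,
                           uint32_t offset)
{
    return (slot_link + text_bias) + offset;
}

/* Nonzero when the recovered address matches where LP really ended up. */
int offset_is_valid(uint32_t lp_link, uint32_t slot_link,
                    uint32_t text_bias, uint32_t data_bias)
{
    uint32_t offset = link_time_offset(lp_link, slot_link);
    return recovered_address(slot_link, text_bias, offset)
           == lp_link + data_bias;
}

/* Same bias for text and data works; independent biases do not. */
void example(void)
{
    assert(offset_is_valid(0x20000, 0x8000, 0x1000, 0x1000));
    assert(!offset_is_valid(0x20000, 0x8000, 0x1000, 0x3000));
}
```

The stored offset only stays meaningful when both segments move by the same amount, which is the fixed-offset assumption described above.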

I suppose those platforms can use a similar approach to Linux's though;
no counters at all.

I'd probably give the with and without counters versions different
names, for robustness.

> 5) __gnu_mcount has special abi privileges in that it does not take an
> 8-byte aligned stack.
> So on ARMv4T, the __gnu_mcount code will look something like:
> 	push	{r0-r3, lr}
> 	tst	lr, #1
> 	bic	lr, lr, #1	// Clear thumb bit (if set)
> 	ldreq	r0, [lr]	// Caller was ARM state (aligned)
> 	ldrhne	r0, [lr]
> 	ldrhne	ip, [lr, #2]
> 	add	r0, r0, lr
> 	addne	r0, r0, ip, asl #16
> 	ldr	r1, [sp, #20]	// Load caller's address
> 	bl	__gnu_mcount_1	//args: r0 = &count, r1  = caller
> 	pop	{r0-r3, ip, lr}	// Pops caller's address
> 	add	ip, ip, #4
> 	bx	ip
> __gnu_mcount_1 can be written in C with substantially full ABI
> privileges (though it may only touch core registers -- ie no floating
> point).
> The above sequence should work on all cores (even those that are pre-v4)
> because the ldrh instructions will never execute in that case (and the
> only cores that didn't have these instructions wouldn't fault them if
> they didn't execute).

That is a devilish trick.  Of course to work on earlier cores it would
need the appropriate relocation for bx ip -> mov pc, ip; ISTR that the
GNU tools don't implement that yet.
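
For anyone following the address arithmetic in the quoted sequence, here is a rough C model of it. This is a sketch under assumptions, not the proposed implementation: `mem` stands in for the text section, a little-endian target is assumed, and the names `load_half`, `store_offset` and `counter_address` are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Little-endian halfword load from the simulated text section. */
static uint32_t load_half(const uint8_t *mem, uint32_t addr)
{
    return (uint32_t)mem[addr] | ((uint32_t)mem[addr + 1] << 8);
}

/* Models `.4byte target - .` emitted at address `slot`. */
void store_offset(uint8_t *mem, uint32_t slot, uint32_t target)
{
    uint32_t offset = target - slot;    /* wraps for backward references */
    for (int i = 0; i < 4; i++)
        mem[slot + i] = (uint8_t)(offset >> (8 * i));
}

/* `lr` is the return address left by bl; bit 0 is set for a Thumb caller. */
uint32_t counter_address(const uint8_t *mem, uint32_t lr)
{
    uint32_t thumb = lr & 1u;           /* tst lr, #1 */
    uint32_t slot = lr & ~1u;           /* bic lr, lr, #1 */
    uint32_t offset;

    if (!thumb) {
        /* ARM state: the word after the bl is word aligned (ldreq). */
        assert((slot & 3u) == 0);
        offset = load_half(mem, slot) | (load_half(mem, slot + 2) << 16);
    } else {
        /* Thumb state: only halfword alignment is guaranteed, so the
           word is assembled from two halfword loads (ldrhne, ldrhne). */
        uint32_t lo = load_half(mem, slot);
        uint32_t hi = load_half(mem, slot + 2);
        offset = lo + (hi << 16);
    }
    return slot + offset;               /* offset is relative to the slot */
}
```

In this model a Thumb caller whose offset word sits at 0x16 and a backward reference to 0x8 both decode correctly, because the unsigned addition wraps the same way the stored difference did.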

It won't be right away, but I will look into implementing this for
arm-eabi and arm-none-linux-gnueabi.

Daniel Jacobowitz
CodeSourcery, LLC
