This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: [PATCH/RFA] SH TLS support


kaz Kojima wrote:
> I have no good statistics for them, though the percentage of
> GD->IE transitions seems fairly low. But it seems that the call
> for __tls_get_address is very heavy relative to one branch.
> It contains several memory accesses and branches even for the best
> case. For the worst case, it will call some functions like malloc
> and execute some loops, though it would be relatively rare.
> So I think that the loss caused by a branch in TLS code would be
> permissible in the global and local dynamic cases.
> Except for SHmedia, SH has no SMP support. So the thread, at least
> in glibc for SH-3/4, would be the problem of the programming model
> rather than the efficiency. To get the maximal efficiency, we'll
> require at least more relocation numbers, for example, and free
> ELF relocation numbers are now valuable for SH, as you know.

I don't see why we need new relocation numbers.  Can't we just extend
the scope of the existing BFD_RELOC_SH_COUNT and BFD_RELOC_SH_USES
relocations?  The BFD_RELOC_SH_USES reloc will tell you where the
load that belongs to a call is, and the BFD_RELOC_SH_COUNT will
tell you how many times the load is being used.
The BFD_RELOC_SH_USES reloc that points to the load of x@TLSGD
would be on the instruction "add r12,r4", which means we have to
look at the opcode of this insn that carries the BFD_RELOC_SH_USES
relocation before we know what it means.  That should be no big deal,
since we need to read the section contents anyway.
Once you have the load, you can find the constant, just like the current
jsr->bsr relaxation does.  If something does not work there right now, we
should fix it first before adding more relaxation code, so that is
no grounds for using a different relaxation strategy for TLS.
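
To make the opcode check concrete, here is a rough sketch of how the insn
carrying a BFD_RELOC_SH_USES reloc could be classified, and how the constant
is located from the load.  It is only an illustration in Python, not BFD
code: the function names are made up for this example.  The encodings are
the usual 16-bit SH ones (jsr @Rn, bsrf Rn, add Rm,Rn and the PC-relative
mov.l).

```python
# Hypothetical helper names; only the opcode encodings come from the SH ISA.

def classify_uses_insn(insn: int) -> str:
    """Classify the 16-bit insn that carries a .uses relocation."""
    if insn & 0xF0FF == 0x400B:   # jsr @Rn
        return "call"
    if insn & 0xF0FF == 0x0003:   # bsrf Rn
        return "call"
    if insn == 0x34CC:            # add r12,r4 (the TLS GD argument setup)
        return "tls_gd_arg"
    return "unknown"


def literal_address(load_insn: int, load_addr: int) -> int:
    """Return the constant pool address read by `mov.l @(disp,PC),Rn`."""
    if load_insn & 0xF000 != 0xD000:
        raise ValueError("not a PC-relative mov.l")
    disp = load_insn & 0xFF
    # The base is the load's address plus 4, rounded down to a multiple of 4.
    return ((load_addr + 4) & ~3) + disp * 4
```

With the load address taken from the .uses reloc, literal_address() gives
the pool entry whose relocation (x@TLSGD or the PLT reference) tells the
linker which kind of sequence it is looking at.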

In order to be able to recognize the TLS sequence without further
relocations, we should keep the bsr and the add r12,r4 together.
So we get something like:


L4:     mov.l L2,r1
...
L5:     mov.l L1,r4
...
	.uses L4
        bsrf r1
	.uses L5
        add r12,r4
L0:
....
(in constant pool)
L1:     .long   x@TLSGD
L2:     .long   __tls_get_addr@PLT+(L2-L0)

(L4 and L5 and their insns might be reversed)
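
Keeping the call and the add r12,r4 adjacent makes the sequence easy to spot.
As a rough illustration (Python, with made-up names), the scan below looks
for a call insn immediately followed by add r12,r4 in its delay slot.  It is
a simplification: a real linker would only inspect insns that carry .uses
relocs rather than scanning every halfword.

```python
ADD_R12_R4 = 0x34CC  # encoding of add r12,r4


def is_call(insn: int) -> bool:
    """True for jsr @Rn (0x4n0B) or bsrf Rn (0x0n03)."""
    return insn & 0xF0FF in (0x400B, 0x0003)


def find_tls_gd_sites(halfwords):
    """Return indices i where halfwords[i] is a call insn and
    halfwords[i + 1], its delay slot, is add r12,r4."""
    return [
        i
        for i in range(len(halfwords) - 1)
        if is_call(halfwords[i]) and halfwords[i + 1] == ADD_R12_R4
    ]
```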

This can then be relaxed to:

L4:     mov.l   .L1, r1
...
	mov	r1,r0
        mov.l   @(r0,r12), r4
        stc     gbr, r0
        add     r4, r0
....
(in constant pool)
.L1:    .long   x@GOTTPOFF

or:

L4:     mov.l   .L1, r1
...
L5:	add	r12,r1
	mov.l   @r1,r4
...
        stc     gbr, r0
        add     r4, r0
....
(in constant pool)
.L1:    .long   x@GOTTPOFF


or:

	mov.l   .L1, r0
        stc     gbr, r4
        mov.l   @(r0,r12), r0
        add     r4, r0
....
(in constant pool)
.L1:    .long   x@GOTTPOFF
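
All three relaxed sequences should leave the same value in r0: gbr plus the
word stored in the GOT slot for x.  The following Python model walks each
sequence one insn at a time to show this.  It is only a sketch: memory is
modelled as a dict from address to 32-bit word, the value loaded from .L1 is
passed in as gottpoff, and the function names are invented for this example.

```python
MASK = 0xFFFFFFFF  # SH general registers are 32 bits wide


def variant_a(mem, gottpoff, r12, gbr):
    r1 = gottpoff                 # mov.l .L1, r1
    r0 = r1                       # mov   r1, r0
    r4 = mem[(r0 + r12) & MASK]   # mov.l @(r0,r12), r4
    r0 = gbr                      # stc   gbr, r0
    return (r0 + r4) & MASK       # add   r4, r0


def variant_b(mem, gottpoff, r12, gbr):
    r1 = gottpoff                 # mov.l .L1, r1
    r1 = (r1 + r12) & MASK        # add   r12, r1
    r4 = mem[r1]                  # mov.l @r1, r4
    r0 = gbr                      # stc   gbr, r0
    return (r0 + r4) & MASK       # add   r4, r0


def variant_c(mem, gottpoff, r12, gbr):
    r0 = gottpoff                 # mov.l .L1, r0
    r4 = gbr                      # stc   gbr, r4
    r0 = mem[(r0 + r12) & MASK]   # mov.l @(r0,r12), r0
    return (r0 + r4) & MASK       # add   r4, r0
```

The variants differ only in which registers they clobber and in how many
insns have to fit in place of the original call sequence.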
	
-- 
--------------------------
SuperH (UK) Ltd.
2410 Aztec West / Almondsbury / BRISTOL / BS32 4QX
T:+44 1454 465658

