This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH/RFA] SH TLS support
- From: Joern Rennecke <joern dot rennecke at superh dot com>
- To: kaz Kojima <kkojima at rr dot iij4u dot or dot jp>
- Cc: gcc-patches at gcc dot gnu dot org, aoliva at redhat dot com
- Date: Thu, 20 Feb 2003 15:32:23 +0000
- Subject: Re: [PATCH/RFA] SH TLS support
- Organization: SuperH UK Ltd.
- References: <3E53AC73.B4BD3F22@superh.com> <200302200146.h1K1kiH19417@r-rr.iij4u.or.jp>
kaz Kojima wrote:
> I have no good statistics for them, though the percentage of
> GD->IE transitions seems fairly low. But it seems that the call
> for __tls_get_address is very heavy relative to one branch.
> It contains several memory accesses and branches even for the best
> case. For the worst case, it will call some functions like malloc
> and execute some loops, though it would be relatively rare.
> So I think that the loss caused by a branch in TLS code would be
> permissible in the global and local dynamic cases.
> Except for SHmedia, SH has no SMP support. So the thread, at least
> in glibc for SH-3/4, would be the problem of the programming model
> rather than the efficiency. To get the maximal efficiency, we'll
> require at least more relocation numbers, for example, and free
> ELF relocation numbers are now valuable for SH, as you know.
I don't see why we need new relocation numbers. Can't we just extend
the scope of the existing BFD_RELOC_SH_COUNT and BFD_RELOC_SH_USES
relocations? The BFD_RELOC_SH_USES reloc will tell you where the
load that belongs to a call is, and the BFD_RELOC_SH_COUNT will
tell you how many times the load is being used.
The BFD_RELOC_SH_USES reloc that points to the load of x@TLSGD
would be on the instruction "add r12,r4", which means we have to
look at the opcode of the insn that carries the BFD_RELOC_SH_USES
relocation before we know what it means. That should be no big deal,
since we need to read the section contents anyway.
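
As a rough illustration of that opcode check (a sketch only, not code
from binutils): the enum and function names below are made up for this
example, and the masks assume the standard 16-bit SH encodings of
jsr @Rm (0100mmmm00001011), bsrf Rm (0000mmmm00000011) and
add Rm,Rn (0011nnnnmmmm1100).

```c
#include <stdint.h>

/* Hypothetical classification of the insn that carries a .uses reloc. */
enum uses_kind {
    USES_CALL,     /* jsr @Rm or bsrf Rm: the existing call relaxation case */
    USES_TLS_ADD,  /* add r12,r4: the proposed marker for a TLS GD sequence */
    USES_UNKNOWN   /* anything else: leave the sequence alone */
};

static enum uses_kind classify_uses_insn(uint16_t insn)
{
    if ((insn & 0xF0FF) == 0x400B)   /* jsr @Rm */
        return USES_CALL;
    if ((insn & 0xF0FF) == 0x0003)   /* bsrf Rm */
        return USES_CALL;
    if (insn == 0x34CC)              /* add r12,r4 (Rn = 4, Rm = 12) */
        return USES_TLS_ADD;
    return USES_UNKNOWN;
}
```

A relaxation pass would call something like this on the 16-bit word at
the reloc's offset and only then decide which load the reloc refers to.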
Once you have the load, you can find the constant, just like the current
jsr->bsr relaxation does. If something does not work there right now, we
should fix it first before adding more relaxation code, so that should be
no grounds for using a different relaxation strategy for TLS.
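
To make "find the constant" concrete, here is a minimal sketch of the
address arithmetic for a PC-relative load, assuming the usual SH rule
that "mov.l @(disp,PC),Rn" (encoded 1101nnnndddddddd) reads from
((insn address + 4) & ~3) + disp * 4. The function name is hypothetical
and this is not the binutils implementation.

```c
#include <assert.h>
#include <stdint.h>

/* Address of the constant-pool word read by the "mov.l @(disp,PC),Rn"
   insn located at insn_addr.  Hypothetical helper for illustration. */
static uint32_t sh_movl_pcrel_target(uint32_t insn_addr, uint16_t insn)
{
    uint32_t disp;

    /* Only valid for the mov.l @(disp,PC),Rn encoding. */
    assert((insn & 0xF000) == 0xD000);

    disp = insn & 0xFF;  /* 8-bit displacement, scaled by 4 below */
    return ((insn_addr + 4) & ~(uint32_t)3) + disp * 4;
}
```

For example, 0xD402 (mov.l @(2,PC),r4) placed at address 0x1002 reads
the word at 0x100C; the reloc on that word then says whether it is
x@TLSGD or the call target.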
In order to be able to recognize the TLS sequence without further
relocations, we should keep the bsr and the add r12,r4 together.
So we get something like:
L4:     mov.l   L2,r1
        ...
L5:     mov.l   L1,r4
        ...
        .uses   L4
        bsr     @r1
        .uses   L5
        add     r12,r4
L0:
        ....
(in constant pool)
L1:     .long   x@TLSGD
L2:     .long   __tls_get_addr@PLT+(L2-L0)
(L4 and L5 and their insns might be reversed)
This can then be relaxed to:
L4:     mov.l   .L1, r1
        ...
        mov     r1,r0
        mov.l   @(r0,r12), r4
        stc     gbr, r0
        add     r4, r0
        ....
(in constant pool)
.L1:    .long   x@GOTTPOFF
or:
L4:     mov.l   .L1, r1
        ...
L5:     add     r12,r1
        mov.l   @r1,r4
        ...
        stc     gbr, r0
        add     r4, r0
        ....
(in constant pool)
.L1:    .long   x@GOTTPOFF
or:
        mov.l   .L1, r0
        stc     gbr, r4
        mov.l   @(r0,r12), r0
        add     r4, r0
        ....
(in constant pool)
.L1:    .long   x@GOTTPOFF
--
--------------------------
SuperH (UK) Ltd.
2410 Aztec West / Almondsbury / BRISTOL / BS32 4QX
T:+44 1454 465658