This is the mail archive of the
gcc@gcc.gnu.org
mailing list for the GCC project.
Re: PATCH: reorg branch displacement fix
- From: Joern Rennecke <joern dot rennecke at superh dot com>
- To: tm <tm at mail dot kloo dot net>
- Cc: law at redhat dot com, kkojima at gcc dot gnu dot org, tm at kloo dot net, gcc-patches at gcc dot gnu dot org, gcc at gnu dot org
- Date: Wed, 06 Nov 2002 20:18:09 +0000
- Subject: Re: PATCH: reorg branch displacement fix
- Organization: SuperH UK Ltd.
- References: <Pine.LNX.4.21.0211051122480.25259-100000@mail.kloo.net>
tm wrote:
>
> On Fri, 1 Nov 2002, Joern Rennecke wrote:
>
> > > > * reorg.c (relax_delay_slots): Don't thread conditional jump
> > > > through unconditional jump if the conditional jump can't reach
> > > > the branch target on this processor target.
> > > This is wrong. reorg does not and should not be checking branch displacements.
> > > That is a problem for shorten-branches and the backend.
> >
> > The SH does some early branch shortening in machine_dependent_reorg because
> > a long conditional branch has to be split into a short conditional branch
> > around an unconditional branch, or a short conditional branch to an unconditional
> > branch (that might have been inserted elsewhere after an inverted condbranch).
> > Unconditional branches have mandatory delay slots, so we get additional
> > delay slots exposed by doing this splitting before reorg, and the conditional
> > branches have optional delay slots, which obviously can't be used in the same
> > way when the branch is reversed or redirected.
> > Thus, machine_dependent_reorg already makes sure all the conditional branches
> > that need splitting are split. If reorg redirects conditional branches
> > willy-nilly, it destroys the very data it is supposed to operate on.
>
> Maybe I'm misunderstanding something, but the two goals of:
>
> 1) Unconstrained branch rethreading in reorg
That assumes that branch rethreading in reorg is always a win. However, for the
SH, it's usually a pessimization.
> 2) Efficiently utilized branch delay slots
>
> do not seem to be mutually exclusive.
>
> The SH has only a single conditional branch, but if we model this in reorg
> as multiple conditional branches with varying displacement capabilities
> with varying number of annulled/non-annulled delay slots, we can satisfy
> both conditions?
You actually make it more complex this way. It is better to describe the
branches that do actually exist. That would also be the prerequisite to
doing a machine-independent combined local constant pool / early branch
shortening & splitting pass.
I think a branch range can be described with three numbers, or, if a constant
pool entry is needed, seven numbers and strings:
(define_branch_range <name> <length> <min_offset> <max_offset>
<condition>
<scratch_register> <pool_entry_size> <min_entry_offset> <max_entry_offset> <entry_bias>)
min_offset would represent the minimum offset from the start of the branch
instruction, while max_offset would represent the maximum offset from the
byte after the end of the branch instruction.
Small additive constants that are applied to the branch destination can be
expressed by adjusting the min_offset / max_offset bounds accordingly, but
if a pool entry is needed, they need to be mentioned in entry_bias so that
the right pool entry can be created.
<scratch_register> is a constraint string for a scratch register to be scavenged.
If the constraint can't be matched against the available free registers, scavenging
might fail, causing this branch range to be ignored.
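To make the offset convention concrete, here is a minimal Python sketch of the reachability test it implies. It is not GCC code; the function name `branch_reaches` and its parameters are made up for illustration, and it assumes the reading given above: min_offset counts from the start of the branch instruction, max_offset from the byte after its end.

```python
def branch_reaches(branch_addr, length, target, min_offset, max_offset):
    """Return True if a branch of `length` bytes at `branch_addr` reaches `target`.

    Hypothetical helper. Assumption: min_offset is relative to the start of
    the branch instruction, max_offset to the byte after its end.
    """
    backward_ok = target - branch_addr >= min_offset
    forward_ok = target - (branch_addr + length) <= max_offset
    return backward_ok and forward_ok


# Numbers of the "cbranch" range proposed in this mail: length 2, -252 .. 254.
print(branch_reaches(1000, 2, 1000 - 252, -252, 254))      # farthest backward target
print(branch_reaches(1000, 2, 1000 + 2 + 254, -252, 254))  # farthest forward target
print(branch_reaches(1000, 2, 1000 + 2 + 256, -252, 254))  # two bytes too far
```

Measuring the two bounds from different ends of the instruction means the same pair of numbers stays valid for every alternative length of a range.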
length can be a string with comma-separated numbers, so that you can have
multi-alternative ranges, e.g. for the SH, you'd have:
(define_branch_range "cbranch" 2 -252 254
"TARGET_SH1")
;; bra; nop
(define_branch_range "bra_range" 4 -4092 4094
"TARGET_SH1")
;; mov.w 1f,rx; braf rx; nop; 0:;...; 1: .word target-0b
;; mov.l r13,-@r15; mov.w 1f,rx; braf r13; mov.l @r15+,r13; 0:;...; 1: .word target-0b
(define_branch_range "branch_16bit" "6,8" -32762 32767
"TARGET_SH2"
"r,X" 2 0 508 6)
;; mov.l 1f,rx; braf rx; nop; 0:;...; 1: .long target-0b
;; mov.l r13,-@r15; mov.l 1f,rx; braf r13; mov.l @r15+,r13; 0:;...; 1: .long target-0b
(define_branch_range "branch_32bit" "6,8" -4294967294 4294967294
"TARGET_SH2"
"r,X" 4 0 1018 6)
;; mov.l 1f,rx; jmp rx; nop; 0:;...; 1: .long target
;; mov.l r13,-@r15; mov.l 1f,rx; jmp r13; mov.l @r15+,r13; 0:;...; 1: .long target
(define_branch_range "jump_range" "6,8" -4294967294 4294967294
"TARGET_SH1 && ! flag_pic"
"r,X" 4 0 1018 "absolute")
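A machine-independent pass could use such descriptions roughly as sketched below in Python. This is only an illustration, not a proposed implementation: `BranchRange`, `SH_RANGES` and `pick_range` are hypothetical names, the target conditions are left out, and it assumes that in "r,X" the first alternative needs a free scratch register while the second one does without (it saves and restores r13 itself, as in the comments above).

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class BranchRange:
    name: str
    lengths: Tuple[int, ...]                   # one length per alternative
    min_offset: int                            # from start of the branch
    max_offset: int                            # from byte after the branch
    needs_scratch: Tuple[bool, ...] = (False,) # assumption: "r" -> True, "X" -> False


# Numbers copied from the SH examples above; conditions are omitted.
SH_RANGES = (
    BranchRange("bra_range", (4,), -4092, 4094),
    BranchRange("branch_16bit", (6, 8), -32762, 32767, (True, False)),
    BranchRange("branch_32bit", (6, 8), -4294967294, 4294967294, (True, False)),
)


def pick_range(ranges: Sequence[BranchRange], branch_addr: int, target: int,
               scratch_free: bool) -> Optional[Tuple[str, int]]:
    """Return (name, length) of the shortest alternative that reaches target."""
    best = None
    for r in ranges:
        for length, needs in zip(r.lengths, r.needs_scratch):
            if needs and not scratch_free:
                continue  # constraint can't be matched: skip this alternative
            if target - branch_addr < r.min_offset:
                continue
            if target - (branch_addr + length) > r.max_offset:
                continue
            if best is None or length < best[1]:
                best = (r.name, length)
    return best


print(pick_range(SH_RANGES, 0, 20000, scratch_free=True))
print(pick_range(SH_RANGES, 0, 20000, scratch_free=False))
```

Ties are resolved in favour of the range listed first, so a real pass would probably want the ranges ordered by preference.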
The individual conditional branches and jump instructions can then use an
attribute to select one or more applicable branch ranges, e.g.:
(define_attr "branch_range" "bra_range,branch_16bit,branch_32bit,jump_range")
If a pool entry is created, this can be communicated by changing the branch
destination to:
(plus (mem (label_ref <pool_entry>)) (plus (pc) (const_int <entry_bias>)))
where the (plus (pc) (const_int <entry_bias>)) rtl may be shared with
other branches. Or, if a scratch register has been scavenged, a move
to that register is inserted, and the branch destination becomes
(plus (reg) (plus (pc) (const_int <entry_bias>))) - or just (reg) in the
absolute case.
Likewise, switch table dispatch should be described so that the machine-independent
code can get at the ranges without having to try them out:
(define_addr_diff_vec_range <name>
<dispatch_length> <entry_length> <entry_whence> <min_entry_offset> <max_entry_offset>
<condition>
<min_table_offset> <max_table_offset>)
<min_table_offset> and <max_table_offset> describe where the table may be relative
to the dispatch insn. Some targets require 0 / 0 here, while others are more
flexible.
<entry_whence> would describe what the offset in an individual entry is relative to,
and be one of "dispatch", "table_start", "table_end", "table_entry";
and an "addr_diff_vec_range" attribute is added to the dispatch insn(s).
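As a rough illustration of how <entry_whence> would enter the range check, here is a small Python sketch. The helper names `entry_offsets` and `table_fits` are hypothetical, and the meaning attached to each whence value (address of the dispatch insn, first byte of the table, first byte after the table, address of the entry itself) is my assumption about what the four names would stand for.

```python
def entry_offsets(whence, dispatch_addr, table_start, entry_length, targets):
    """Return the value each table entry would hold for the given whence.

    Hypothetical helper; the base chosen for each whence value is an assumption.
    """
    table_end = table_start + entry_length * len(targets)
    offsets = []
    for i, target in enumerate(targets):
        entry_addr = table_start + i * entry_length
        base = {
            "dispatch": dispatch_addr,
            "table_start": table_start,
            "table_end": table_end,
            "table_entry": entry_addr,
        }[whence]
        offsets.append(target - base)
    return offsets


def table_fits(offsets, min_entry_offset, max_entry_offset):
    """True if every entry is representable in the described range."""
    return all(min_entry_offset <= o <= max_entry_offset for o in offsets)


offs = entry_offsets("table_start", 96, 100, 2, [200, 300])
print(offs, table_fits(offs, 0, 255))
```

With this information available up front, the machine-independent code can pick an entry size without emitting the table first and measuring it.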
We also need a mechanism to tell which constants should be put into local constant
pools, and how to address them. I suppose we can use a constraint modifier, e.g.
'^', to say that following letters are valid for register preferencing and reload
(unless stated otherwise by # / *), but should cause the values that are matched
against them (and not an earlier letter) to be put into a local constant pool when
these are created.
The constants that are identified this way are then matched against pool_entry
definitions, e.g.:
(define_pool_entry "sh_hi_const_pool_entry" (match_operand:HI "hi_pool_const" "Q") 0 510)
(define_pool_entry "sh_sf_const_pool_entry" (match_operand:SF "sf_pool_const" ">") 0 1020)
The latter entry wouldn't match (mem:label_ref), but it would match an
alternative that has an "X" constraint in a position where a scratch
register (r0) is allocated.
The constant pool code would then load the address into this scratch
register. Moreover, for a ">" constraint, it can elide scratch register
loads for consecutive constant references with the same scratch register
as long as the scratch register is not clobbered in-between.
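The elision could work along the lines of the following Python sketch. It is a simplified model rather than anything from GCC: the insn encoding and the name `refs_needing_load` are invented here, and it assumes that ">" is used as post-increment, so that after a reference the scratch register points just past the entry that was read.

```python
def refs_needing_load(insns, scratch="r0"):
    """Return indices of pool references that need a fresh address load.

    Hypothetical model. Each insn is either ("pool_ref", addr, size) or
    ("other", set_of_clobbered_registers). Assumption: a pool reference
    post-increments the scratch register by the entry size.
    """
    held = None  # address currently known to be in the scratch register
    needs = []
    for i, insn in enumerate(insns):
        if insn[0] == "pool_ref":
            _, addr, size = insn
            if held != addr:
                needs.append(i)   # scratch doesn't point here: load it
            held = addr + size    # post-increment leaves it at the next entry
        elif scratch in insn[1]:
            held = None           # clobbered in between: contents unknown
    return needs


insns = [
    ("pool_ref", 0, 4),
    ("pool_ref", 4, 4),      # consecutive entry, scratch already points here
    ("other", {"r1"}),       # doesn't touch the scratch register
    ("pool_ref", 8, 4),
    ("other", {"r0"}),       # clobbers the scratch register
    ("pool_ref", 12, 4),
]
print(refs_needing_load(insns))
```

This only pays off if the pool entries are laid out in the order in which they are referenced, which the pool placement code would have to arrange.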
If some instructions need different forms of pool entries than others,
they can select a list of appropriate entries with an attribute.
--
--------------------------
SuperH (UK) Ltd.
2410 Aztec West / Almondsbury / BRISTOL / BS32 4QX
T:+44 1454 465658