This is the mail archive of the mailing list for the GCC project.


Re: target/9594: [3.3/3.4 regression] [sh4-elf] Assembler complains pcrel too far.

"Naveen Sharma, Noida" wrote:
[This is drifting to general design discussion, so I changed the mailing list to gcc]
> On a related note, I notice that gcc is known to generate
> out of bound branches for SH.
> From the discussion thread following the latter, it seems
> there need to be some design changes to solve this sort
> of problem.  In that message you suggested to "marry"
> the local constant pool generation with an early branch
> shortening pass.
> I would like to analyse this more, and implement it, if this can
> avoid problems of this nature.  What do you think?

This is not a simple project, but if you get it right, the benefits
can be considerable.

As we scan the insns to determine current lengths, we can actually assume
that there is no constant pool until we need to place one; i.e.
backward branch lengths can be calculated quite accurately.
For forward branches, I suppose we have to assume that the currently
accumulated constants, plus all constants that belong to insns
between the branch and its target, will come in one or more pools that
are dumped somewhere before the target, until proven otherwise.

When we finally reach the branch target, we can check whether any forward
branches that were made to it can now be done with a shorter instruction;
if there has been no alignment insn in between, it is relatively
straightforward to update the addresses retrospectively.  We can also
accumulate changes until we actually dump the next constant pool, to avoid
re-writing too many addresses over and over again; the local fixups can be
held in an AVL tree or a skip list or some such in the meantime.

When we dump a constant pool, we should probably forget about the forward
branches that we were not sure we could make shorter: they are
rather unlikely to be made shorter when they actually have to cross the
constant pool, they would be messy to fix up, and the alignment of the
constant pool also gets in the way.  In fact, it might also make sense to
forget about adjustable forward branches once we pass an alignment.

One thing to consider here is that with a typical if-then-else construct,
the else part is preceded by an alignment.  So when we see an alignment
that is followed by a label for a forward branch that we are still
considering for adjustment, we should check whether we can do that first,
before the ordinary processing of the alignment.
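
The retrospective update described above might be sketched roughly as
follows.  This is only an illustration, not GCC code: the names
(pending_branch, shrink_pending) and the length/range constants are made
up, and the real pass would keep the fixups in an AVL tree or skip list
rather than a flat array.

```c
#include <assert.h>
#include <stddef.h>

#define SHORT_BRANCH_LEN 2   /* e.g. a 2-byte SH branch with a small offset */
#define LONG_BRANCH_LEN  6   /* conservative worst-case branch sequence     */
#define SHORT_RANGE    256   /* assumed reach of the short form (made up)   */

struct pending_branch
{
  int addr;   /* conservatively computed address of the branch insn */
  int len;    /* currently assumed length of the branch              */
};

/* The target label has been reached at LABEL_ADDR with no alignment insn
   in between.  Try to shrink each pending forward branch; every byte saved
   upstream also moves later branches and the label itself back, which is
   what the running SAVED adjustment accounts for.  Returns the total
   number of bytes saved, to be subtracted from all later addresses.  */
static int
shrink_pending (struct pending_branch *b, size_t n, int label_addr)
{
  int saved = 0;
  for (size_t i = 0; i < n; i++)
    {
      int addr = b[i].addr - saved;
      if (b[i].len == LONG_BRANCH_LEN
	  && label_addr - saved - addr <= SHORT_RANGE)
	{
	  saved += LONG_BRANCH_LEN - SHORT_BRANCH_LEN;
	  b[i].len = SHORT_BRANCH_LEN;
	}
    }
  return saved;
}
```

Note how shrinking the first branch can bring a second, farther branch
into range of the short form as well, so processing in address order pays
off.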

One shortcoming of the current code should also be addressed in such
a reorganization: we currently can't take max_skip into account for
branch shortening.  SH4 currently uses 32-byte alignment without a
max_skip, but it should probably use a max_skip of four or eight.
The alignment calculations in the branch shortening pass assume that
after an alignment insn the specified alignment is present.  So we can
either model the alignments as alignments (possibly with a lower
alignment), or as a fixed number of inserted bytes, but we can't have it
both ways.  We need a different approach to properly implement max_skip.
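
To see why the current model breaks, consider how many bytes an alignment
directive with a maximum skip actually inserts (this mirrors the behaviour
of gas's .balign with a max-skip operand; the function name is just for
illustration):

```c
#include <assert.h>

/* Bytes inserted at address ADDR by an alignment to ALIGN (a power of
   two) with a maximum skip: pad up to the next ALIGN boundary, but only
   if that takes at most MAX_SKIP bytes; otherwise emit no padding.  */
static int
align_pad (int addr, int align, int max_skip)
{
  int pad = (-addr) & (align - 1);
  return pad <= max_skip ? pad : 0;
}
```

At address 30, a 32-byte alignment with max_skip 4 inserts 2 bytes and the
boundary is reached; at address 20 it would need 12 bytes, so nothing is
inserted and the code after the alignment insn is *not* 32-byte aligned.
Hence neither "the specified alignment is present afterwards" nor "a fixed
number of bytes was inserted" holds on its own.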

The one I have in mind can be visualized like a rope, where individual
strands might end, and new ones are spliced in.  For a maximum
intelligently handled alignment of 32 bytes (you can always model larger
alignments as a fixed block of bytes equal in size to the maximum skip),
you'd have 32 strands at the start, each representing a possible start
address modulo 32.  As instructions of fixed size, or branches whose
size varies depending on the offset range, are passed, the strands sort of
rotate, as the address is increased by the insn length.  When we see an
alignment instruction, some strands are spliced to others: for an alignment
of 32 with a max_skip of four, you'd splice the strands with local
addresses 28, 29, 30 and 31 to strand 0 (modulo 32).
When we pass a label or varying length insn (other than an alignment),
the missing strands are replaced with new strands.
Now, in order to guarantee linear-time operations for determining maximum
branch offsets, we want an upper bound on the number of splice points
on every strand that spans from one insn to another.  To achieve that,
whenever we splice strands at an alignment insn, the surviving strand
is the oldest strand, or one of the oldest strands.  This way, the number
of splice points on any strand, as we follow it along, is always lower
than the maximum handled alignment in bytes, i.e. <= 31 in our example.
The maximum offset between two points, e.g. a label and a branch,
can be calculated by tallying the offset on every strand from the
first point to the second, and picking the maximum.
The downside is that the target of a splice can have a non-zero length
for an alignment, i.e. we need to keep track of addresses for every
strand separately.  But that is just a linear factor ;-)

We can further reduce the time and memory required by considering the
atomic instruction size for RISC processors; e.g. for the SH, that's
two bytes, so we only ever need the even-numbered strands 0, 2, 4, ...,
30, and the maximum splice count on any strand is 15.
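
A stripped-down sketch of the strand idea (illustrative only: it tracks a
current address per strand directly, instead of the splice-point chains a
real implementation would need for linear-time queries; all names are made
up):

```c
#include <assert.h>

#define NSTRANDS 32   /* maximum intelligently handled alignment */

/* One address per possible start address modulo NSTRANDS.  For a RISC
   target with 2-byte atomic insn size, only the even strands would be
   needed; we keep all 32 here for simplicity.  */
struct strands
{
  int addr[NSTRANDS];
};

static void
strands_init (struct strands *s)
{
  for (int i = 0; i < NSTRANDS; i++)
    s->addr[i] = i;             /* start address equals the residue */
}

/* A fixed-size insn advances every strand by the same length.  */
static void
strands_add_insn (struct strands *s, int len)
{
  for (int i = 0; i < NSTRANDS; i++)
    s->addr[i] += len;
}

/* An alignment insn pads each strand individually; strands whose local
   address falls within MAX_SKIP of the boundary "splice" onto it.  */
static void
strands_align (struct strands *s, int align, int max_skip)
{
  for (int i = 0; i < NSTRANDS; i++)
    {
      int pad = (-s->addr[i]) & (align - 1);
      if (pad <= max_skip)
	s->addr[i] += pad;
    }
}

/* Maximum offset between two recorded points, e.g. a branch and its
   label: tally the distance on every strand and pick the maximum.  */
static int
strands_max_span (const struct strands *from, const struct strands *to)
{
  int max = 0;
  for (int i = 0; i < NSTRANDS; i++)
    if (to->addr[i] - from->addr[i] > max)
      max = to->addr[i] - from->addr[i];
  return max;
}
```

For example, a single 2-byte insn followed by a 32-byte alignment with
max_skip 4 spans up to 6 bytes on the worst strand, even though the insn
itself is only 2 bytes long; that worst case is exactly what branch
shortening has to assume.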

So, to put this in data structures:
we use a varray to map insn uids to luids;
we have another varray that maps luids to entries for as many strands
as the maximum intelligently handled alignment is in instruction-size
units.  For each strand, we have an address, and a next splice point,
which is represented by a pointer to a data structure that is allocated
for the alignment insn.  The latter includes a luid, the number of the
strand, and the next splice point (to speed up pointer chasing).
As we scan the insns to update the addresses, we keep track of the age
of each strand by storing the luid where it was created.
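
In C, the per-strand entry and the splice-point record allocated for an
alignment insn might look like this (an illustrative layout only; none of
these are existing GCC types):

```c
#include <assert.h>
#include <stddef.h>

/* Allocated for an alignment insn; following the 'next' chain walks the
   splice points further along the same strand.  */
struct splice_point
{
  int luid;                    /* luid of the alignment insn        */
  int strand;                  /* number of the strand spliced onto */
  struct splice_point *next;   /* next splice point on that strand  */
};

/* One entry per strand in the luid-indexed varray.  */
struct strand_entry
{
  int addr;                    /* current address on this strand     */
  int birth_luid;              /* luid where this strand was created */
  struct splice_point *splice; /* first splice point ahead, if any   */
};
```

Storing the next splice point in each record is what keeps the pointer
chasing short when tallying offsets along a strand.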

SuperH (UK) Ltd.
2410 Aztec West / Almondsbury / BRISTOL / BS32 4QX
T:+44 1454 465658
