Hitachi SH lib1funcs.asm patch
Richard Henderson
rth@cygnus.com
Sat Jul 31 22:30:00 GMT 1999
On Wed, Jul 07, 1999 at 05:41:39PM -0700, Toshiyasu Morita wrote:
> This is a small improvement for Hitachi SH1-SH3E. It changes movstr to
> copy longwords by pairs because the Hitachi SH architecture has a load
> latency of two clocks.
I'll leave it to Joern to approve this patch or not.
However, I do have one suggestion, which may or may not pan
out depending on the available instructions. You are currently
using a jump table, but the address can be calculated directly.
I note that there are 4 two-byte instructions between each of
the 2-word copy entry points, with the odd addresses offset.
This suggests the branch offset is
(64 - count*8) + (count & 1 ? 60 : 0)
If we pad the ___movstrSI0 case with two extra nops, we can
replace that last conditional with ((count & 1) << 4). So,
(16 - count + (count & 1) * 16) * 8
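As a quick sanity check of the arithmetic, here is a minimal Python sketch of that last expression. The function names and the range of counts tried are assumptions made for illustration and are not part of the patch. It only shows that writing the odd-count term as a shift by 4 gives the same value as multiplying it by 16.

```python
def branch_offset(count):
    """Evaluate (16 - count + (count & 1) * 16) * 8 as written in the mail."""
    return (16 - count + (count & 1) * 16) * 8


def branch_offset_shift(count):
    """Same expression with the odd-count term written as a shift by 4."""
    return (16 - count + ((count & 1) << 4)) * 8


# Hypothetical range of counts, chosen only for illustration.
for count in range(0, 17):
    assert branch_offset(count) == branch_offset_shift(count)
```

Whether these values line up with the real entry points depends on the layout of the copy routines, which this sketch does not model.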
> ! mova LOCAL(movstr_table),r0
> ! add #16,r6
> ! shll r6
> ! mov.w @(r0,r6),r6
> ! #ifdef __sh1__
> add r6,r0
> jmp @r0
> + #else
> + braf r6
> + #endif
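        mov     r6,r0           ! r0 = count
        and     #1,r0           ! r0 = count & 1
        shll2   r0
        shll2   r0              ! r0 = (count & 1) * 16
        add     #16,r0          ! r0 = 16 + (count & 1) * 16
        sub     r6,r0           ! r0 = 16 - count + (count & 1) * 16
        shll2   r0
        add     r0,r0           ! r0 is now multiplied by 8
        braf    r0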
which _is_ more instructions, and it is all sequential, but it
doesn't have the table or a memory load. So I don't know
if this pays off.
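One way to check that the sequence really computes the expression given earlier is to step through it in a small model. The following Python sketch does that under stated assumptions: r6 is taken to hold the count on entry, registers are modelled as 32-bit values, and the names `computed_offset` and `formula` are made up for the illustration. It says nothing about cycle counts.

```python
MASK32 = 0xFFFFFFFF  # model registers as 32-bit values


def computed_offset(count):
    """Step through the proposed sequence, tracking r0 (assumes r6 = count)."""
    r6 = count & MASK32
    r0 = r6                      # mov   r6,r0
    r0 &= 1                      # and   #1,r0
    r0 = (r0 << 2) & MASK32      # shll2 r0
    r0 = (r0 << 2) & MASK32      # shll2 r0
    r0 = (r0 + 16) & MASK32      # add   #16,r0
    r0 = (r0 - r6) & MASK32      # sub   r6,r0   (r0 = r0 - r6)
    r0 = (r0 << 2) & MASK32      # shll2 r0
    r0 = (r0 + r0) & MASK32      # add   r0,r0
    return r0                    # the value handed to braf


def formula(count):
    """(16 - count + (count & 1) * 16) * 8, reduced to 32 bits."""
    return ((16 - count + (count & 1) * 16) * 8) & MASK32


# Hypothetical range of counts, chosen only for illustration.
for count in range(0, 17):
    assert computed_offset(count) == formula(count)
```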
r~