[sh PATCH] PR/27717, sh backend lies to reload about index registers
Mon Aug 21 16:03:00 GMT 2006
Paolo Bonzini wrote:
> Whether sh really needs that is beyond my understanding. The more I
> read the patch, the more I hope it doesn't. For example, another way
> to achieve the same would be to emit the memory access as an UNSPEC.
> I would hope that we perform enough tree optimizations, that it is not
> possible to optimize further in the RTL path something like
> *(div_table + (divisor >> 58)).
Tree optimizations are mostly irrelevant here. At the tree level,
division is expected to potentially trap. The SHMEDIA backend expands
it into multiple individual machine instructions, which are subjected
to the rtl optimizations of cse, loop invariant code motion (licm) and
scheduling. In particular, if the divisor is invariant, the entire
reciprocal computation can be hoisted/commoned by licm/cse.
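To illustrate what hoisting the reciprocal buys, here is a rough C sketch of the idea at source level. It is not the SHmedia expansion: it assumes unsigned 32-bit operands, uses a plain 2^32/d fixed-point reciprocal with one correction step, and the names recip32, div_by_recip and div_array are made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: fixed-point reciprocal floor(2^32 / d), d >= 1. */
static uint64_t recip32(uint32_t d)
{
    assert(d != 0);
    return (UINT64_C(1) << 32) / d;
}

/* Quotient n / d from a precomputed r = recip32(d).
   The multiply yields either the true quotient or one less,
   so a single compare-and-increment fixes it up. */
static uint32_t div_by_recip(uint32_t n, uint32_t d, uint64_t r)
{
    uint32_t q = (uint32_t)(((uint64_t)n * r) >> 32);
    if (n - q * d >= d)
        q++;
    return q;
}

/* The divisor c is loop invariant, so the reciprocal is computed once
   outside the loop; only the multiply and fix-up remain inside. */
static void div_array(size_t count, uint32_t *a, const uint32_t *b, uint32_t c)
{
    uint64_t r = recip32(c);
    for (size_t i = 0; i < count; i++)
        a[i] = div_by_recip(b[i], c, r);
}
```

In the compiler the same effect comes from licm/cse working on the expanded rtl, which is only possible if the individual instructions are visible to those passes.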
When one of the division strategies inv:minlat, inv:call and inv:fp
is selected, a combiner pattern rearranges divisions that have not been
taken apart by cse/licm for maximum throughput (inv:minlat) or
rematerializes the division operation as a call (inv:call) or floating
point operations (inv:fp).
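As a source-level picture of what doing the division in floating point amounts to, consider the sketch below. It is only an assumption about the general idea, not the code the backend emits for inv:fp; the function name div_via_fp is made up. For 32-bit operands the double quotient is accurate enough that truncation gives the exact integer result.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical illustration: 32-bit signed division carried out in
   double precision.  Both operands convert exactly, and the rounded
   quotient never reaches the next integer, so truncating toward zero
   matches C integer division. */
static int32_t div_via_fp(int32_t a, int32_t b)
{
    assert(b != 0);                        /* division by zero excluded */
    assert(!(a == INT32_MIN && b == -1));  /* overflow case excluded */
    return (int32_t)((double)a / (double)b);
}
```

No reciprocal table is involved on this path, which is why no table loads should remain once the division is rematerialized this way.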
In the (not very likely) case that there are multiple different divisors
which are still very similar (in particular, have the same five most
significant bits) in a way visible to gcc, it is also possible that some
of the address arithmetic and possibly also table lookups can be shared
for these different divisors.
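To make the sharing concrete: if the table is indexed by the leading bits of the normalized divisor, divisors that agree in those bits hit the same entry. The sketch below assumes a 32-bit unsigned divisor and a 32-entry table indexed by the five bits below the leading one; that layout and the name recip_table_index are assumptions for illustration, not the actual SHmedia table format.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical index computation: normalize d so its top bit is set,
   then use the next five bits to select one of 32 table entries. */
static unsigned recip_table_index(uint32_t d)
{
    assert(d != 0);
    while ((d & UINT32_C(0x80000000)) == 0)
        d <<= 1;                          /* normalize: shift out leading zeros */
    return (unsigned)((d >> 26) & 31u);   /* drop the leading one, keep 5 bits */
}
```

With this layout 1000 and 1005 select the same entry, while 1010 selects the next one, so only the former pair could share the lookup.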
That being said, I see no reason why your patch would prevent these
optimizations. Could you please do a quick sanity check?
This compiled at -O2 (add -fverbose-asm to get labels for the branches):
f(int i, int *a, int *b, int c)
a[i] = b[i] / c;
should only have two multiplies, a load, a store and some eight
shifts/additions/subtractions inside the loop.
This should use exactly nine muls instructions, and exactly one ldx.ub
and one ldx.w:
f (__complex__ int c, int d)
When you compile this testcase with -mdiv=inv:fp -O2, no table loads
should be left:
f (int a, int b)