This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [AArch64] Emit division using the Newton series


On 04/01/16 17:45, Wilco Dijkstra wrote:
Evandro Menezes wrote:

However, I don't think that there's the need to handle any special case
for division.  The only case when the approximation differs from
division is when the numerator is infinity and the denominator, zero,
when the approximation returns infinity and the division, NAN.  So I
don't think that it's a special case that deserves being handled.  IOW,
the result of the approximate reciprocal is always needed.
No, the result of the approximate reciprocal is not needed.

Basically a NR approximation produces a correction factor that is very close
to 1.0, and then multiplies that with the previous estimate to get a more
accurate estimate. The final calculation for x * recip(y) is:

result = (reciprocal_correction * reciprocal_estimate) * x

while what I am suggesting is a trivial reassociation:

result = reciprocal_correction * (reciprocal_estimate * x)

The computation of the final reciprocal_correction is on the critical latency
path, while reciprocal_estimate is computed earlier, so we can compute
(reciprocal_estimate * x) without increasing the overall latency. Ie. we saved
a multiply.

In principle this could be done as a separate optimization pass that tries to
reassociate to reduce latency. However I'm not too convinced this would be
easy to implement in GCC's scheduler, so it's best to do it explicitly.

I think that I see what you mean.  I'll hack something tomorrow.

Thanks for your patience,

--
Evandro Menezes


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]