This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: [AArch64] Emit division using the Newton series


Evandro Menezes wrote:

> However, I don't think that there's the need to handle any special case
> for division.  The only case when the approximation differs from
> division is when the numerator is infinity and the denominator, zero,
> when the approximation returns infinity and the division, NAN.  So I
> don't think that it's a special case that deserves being handled.  IOW,
> the result of the approximate reciprocal is always needed.
 
No, the result of the approximate reciprocal is not needed. 

Basically, each Newton-Raphson (NR) step produces a correction factor that is
very close to 1.0 and multiplies it with the previous estimate to get a more
accurate estimate. The final calculation for x * recip(y) is:

result = (reciprocal_correction * reciprocal_estimate) * x

while what I am suggesting is a trivial reassociation:

result = reciprocal_correction * (reciprocal_estimate * x)

The computation of the final reciprocal_correction is on the critical latency
path, while reciprocal_estimate is computed earlier, so we can compute
(reciprocal_estimate * x) in parallel without increasing the overall latency.
I.e. we save a multiply on the critical path.
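
A minimal sketch of the two associations in C follows. It is an illustration
only, not the code from the patch: recip_estimate and recip_correction are
hypothetical helpers standing in for the hardware estimate and step
instructions (FRECPE and FRECPS on AArch64), the estimate is simulated with a
deliberately perturbed 1/y, and the choice of two NR steps is an assumption.

```c
/* Hypothetical stand-in for a hardware reciprocal estimate: returns 1/y
   with a deliberate relative error of about 2^-8. */
static float recip_estimate(float y)
{
    return (1.0f / y) * (1.0f + 1.0f / 256.0f);
}

/* NR correction factor for an estimate e of 1/y: 2 - y*e, close to 1.0. */
static float recip_correction(float y, float e)
{
    return 2.0f - y * e;
}

/* Original form: finish refining the reciprocal, then multiply by x.
   Both final multiplies depend on c, so both sit on the critical path. */
float div_original(float x, float y)
{
    float e = recip_estimate(y);
    e = e * recip_correction(y, e);     /* first refinement */
    float c = recip_correction(y, e);   /* final correction */
    return (c * e) * x;
}

/* Reassociated form: e * x does not depend on c, so it can be computed
   while c is still in flight; only one multiply follows c. */
float div_reassociated(float x, float y)
{
    float e = recip_estimate(y);
    e = e * recip_correction(y, e);     /* first refinement */
    float ex = e * x;                   /* independent of c */
    float c = recip_correction(y, e);   /* final correction */
    return c * ex;
}
```

Both functions perform the same number of multiplies; the difference is only
in how many of them have to wait for the final correction factor.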

In principle, this could be done as a separate optimization pass that tries to
reassociate to reduce latency. However, I'm not convinced this would be easy
to implement in GCC's scheduler, so it's best to do it explicitly.

Wilco

