[AArch64] Emit division using the Newton series

Evandro Menezes e.menezes@samsung.com
Fri Apr 1 21:56:00 GMT 2016


On 04/01/16 16:22, Wilco Dijkstra wrote:
> Evandro Menezes wrote:
>>> The division variant should use the same latency reduction trick I mentioned for sqrt.
>> I don't think that it applies here, since it doesn't have to deal with
>> special cases.
> No it applies as it's exactly the same calculation: x * rsqrt(y) and x * recip(y). In both
> cases you don't need the final result of rsqrt(y) or recip(y), avoiding a multiply.
> Given these sequences are high latency this saving is actually quite important.

Wilco,

In the case of sqrt(), handling the special case when the argument is 
0.0 is necessary in order to guarantee correctness.  Handling this 
special case hurts performance, which is where your suggestion helps.
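
As a rough illustration of that special case, here is a sketch in C.  It 
is not the code from the patch: the function names are hypothetical, and 
the initial estimate y0 is passed in by the caller as a stand-in for 
whatever the hardware estimate would provide (assumed to be infinity for 
a zero argument).

```c
#include <math.h>   /* only for the INFINITY macro */

/* One Newton step for y ~= 1/sqrt(x): y' = y * (1.5 - 0.5 * x * y * y). */
static double rsqrt_step(double x, double y)
{
    return y * (1.5 - 0.5 * x * y * y);
}

/* Hypothetical helper: sqrt(x) computed as x * rsqrt(x), refined from a
   caller-supplied estimate y0 (an assumption standing in for the
   hardware estimate).  */
static double approx_sqrt(double x, double y0)
{
    double y = rsqrt_step(x, y0);
    y = rsqrt_step(x, y);
    y = rsqrt_step(x, y);

    double r = x * y;

    /* With x == 0.0 and an infinite estimate, x * y is 0 * inf = NaN,
       so the zero argument is selected explicitly instead.  */
    return x == 0.0 ? x : r;
}
```

For example, approx_sqrt(4.0, 0.49) converges to a value very close to 
2.0, while approx_sqrt(0.0, INFINITY) returns 0.0 only because of the 
final select.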

However, I don't think that there is any need to handle a special case 
for division.  The only case in which the approximation differs from 
the division is when the numerator is infinity and the denominator is 
zero: the approximation returns infinity, whereas the division returns 
NaN.  I don't think that this case deserves special handling.  IOW, the 
result of the approximate reciprocal is always needed.
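
For reference, the two orderings under discussion can be sketched in C 
as follows.  This is only my reading of the suggestion, not code from 
the patch: the function names are hypothetical, and the initial estimate 
r0 is supplied by the caller as a stand-in for the hardware reciprocal 
estimate.

```c
/* One Newton step for r ~= 1/d: r' = r * (2 - d * r). */
static double recip_step(double d, double r)
{
    return r * (2.0 - d * r);
}

/* Variant A: finish the reciprocal, then multiply by the numerator. */
static double div_full_recip(double n, double d, double r0)
{
    double r = recip_step(d, r0);
    r = recip_step(d, r);
    r = recip_step(d, r);       /* final value of recip(d) */
    return n * r;
}

/* Variant B: fold the numerator into the last step, so the final value
   of recip(d) is never formed.  n * r and 2 - d * r do not depend on
   each other, which shortens the dependent chain by one multiply.  */
static double div_folded(double n, double d, double r0)
{
    double r = recip_step(d, r0);
    r = recip_step(d, r);
    double q = n * r;           /* preliminary quotient */
    double c = 2.0 - d * r;     /* correction factor of the last step */
    return q * c;
}
```

Both variants compute the same expression algebraically and differ only 
in rounding and in the order of the operations; for instance, with 
n = 6.0, d = 3.0 and r0 = 0.33, both come out very close to 2.0.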

Or am I missing something?

Thank you,

-- 
Evandro Menezes


