This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [AArch64] Emit division using the Newton series
- From: Wilco Dijkstra <Wilco dot Dijkstra at arm dot com>
- To: Evandro Menezes <e dot menezes at samsung dot com>, GCC Patches <gcc-patches at gcc dot gnu dot org>
- Cc: James Greenhalgh <James dot Greenhalgh at arm dot com>, Andrew Pinski <pinskia at gmail dot com>, nd <nd at arm dot com>
- Date: Fri, 1 Apr 2016 21:22:35 +0000
- Subject: Re: [AArch64] Emit division using the Newton series
- References: <56EB0EDF dot 3060401 at samsung dot com> <56F2C329 dot 10405 at samsung dot com> <56FDA311 dot 7090309 at samsung dot com> <AM3PR08MB0088DDE6EA428B37CE090953839A0 at AM3PR08MB0088 dot eurprd08 dot prod dot outlook dot com>,<56FED036 dot 2070405 at samsung dot com>
Evandro Menezes wrote:
> > The division variant should use the same latency reduction trick I mentioned for sqrt.
>
> I don't think that it applies here, since it doesn't have to deal with
> special cases.
No, it does apply, since it's exactly the same calculation: x * rsqrt(y) and x * recip(y).
In both cases you don't need the final result of rsqrt(y) or recip(y), which avoids a multiply.
Given that these sequences have high latency, this saving is actually quite important.
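
To make the point concrete, here is a rough C sketch of one way to read the trick for x * recip(y). It is an illustration only, not the code from the patch: recip_estimate() is a hypothetical stand-in for a low-precision hardware estimate such as FRECPE, the function names are made up, and two Newton steps are assumed. In the folded form, x * r is independent of the last correction factor, so the two can issue in parallel and the dependency chain is one multiply shorter.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a hardware reciprocal estimate: computes 1/y
   and keeps only the top 8 mantissa bits, so the result is deliberately
   imprecise. */
static float recip_estimate(float y)
{
    float r = 1.0f / y;
    uint32_t bits;
    memcpy(&bits, &r, sizeof bits);
    bits &= 0xFFFF8000u;        /* clear the low 15 mantissa bits */
    memcpy(&r, &bits, sizeof r);
    return r;
}

/* Straightforward form: fully refine recip(y), then multiply by x.
   The final multiply depends on the result of the last Newton step. */
static float div_naive(float x, float y)
{
    float r = recip_estimate(y);
    r = r * (2.0f - y * r);     /* Newton step 1 */
    r = r * (2.0f - y * r);     /* Newton step 2: final recip(y) */
    return x * r;
}

/* Folded form: the final recip(y) is never formed. x is multiplied by
   the not-yet-final r while the last correction factor is computed. */
static float div_folded(float x, float y)
{
    float r = recip_estimate(y);
    r = r * (2.0f - y * r);     /* Newton step 1 */
    float q = x * r;            /* independent of c below */
    float c = 2.0f - y * r;     /* last correction factor */
    return q * c;               /* approximates x / y */
}
```

Both functions perform the same number of multiplies in this sketch; what changes is that the multiply by x moves off the critical path. The same reordering carries over to x * rsqrt(y) with the corresponding rsqrt step.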
Wilco