This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Re: [AArch64] Emit square root using the Newton series

From: Wilco Dijkstra <Wilco dot Dijkstra at arm dot com>
To: Evandro Menezes <e dot menezes at samsung dot com>
Cc: "gcc-patches at gcc dot gnu dot org" <gcc-patches at gcc dot gnu dot org>, nd <nd at arm dot com>
Date: Thu, 10 Mar 2016 19:10:24 +0000
Subject: Re: [AArch64] Emit square root using the Newton series
References: <AM3PR08MB00886499882773F3C8B9F71D83B30 at AM3PR08MB0088 dot eurprd08 dot prod dot outlook dot com> <011d01d17a26$31b3ade0$951b09a0$ at samsung dot com> <AM3PR08MB0088D558E387C1B736785AA883B40 at AM3PR08MB0088 dot eurprd08 dot prod dot outlook dot com>,<56E1A7AD dot 90408 at samsung dot com>

On 03/10/16 10:52, Wilco Dijkstra wrote:
> Hi Evandro,
>
>> I have however encountered precision issues with DF, namely some benchmarks in the SPECfp CPU2000 suite would fail to validate.
> Accuracy is not an issue, the computation is extremely accurate. The issue is that your patch doesn't support sqrt(0.0) - it returns NaN rather than zero, and that causes the miscompares you're seeing. So support for the zero case should be added.
>
> This would be a better expansion, supporting zero, and with lower latency than the current sequence:

Now that I think of it, frsqrts returns 1.5 for the zero case (one operand zero, the other infinity),
so we only need to fix up the estimated sqrt value before the final multiply. Since an FCSEL/VAND can
be hidden completely behind the latency of frsqrts, both the scalar and vector cases could do this:

    frsqrte  s1, s0
    fmul     s2, s1, s1
    frsqrts  s2, s0, s2
    fcmp     s0, 0.0
    fmul     s1, s1, s2
    fmul     s2, s1, s1
    fmul     s1, s0, s1
    frsqrts  s2, s0, s2
    fcsel    s1, s0, s1, eq
    fmul     s0, s1, s2
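
To make the zero case easier to follow, here is a rough Python model of the sequence above, one statement per instruction. It is only a sketch: `frsqrte` below is a crude stand-in that keeps about 8 bits of 1/sqrt(d) (it does not reproduce the hardware estimate table), everything runs in double precision, and the names `newton_sqrt` and `fixup` are made up for illustration. `frsqrts` models the (3 - a*b)/2 step, including the 1.5 result for zero times infinity. Only non-negative finite inputs are considered.

```python
import math

def frsqrte(d):
    """Crude stand-in for FRSQRTE: low-precision estimate of 1/sqrt(d)."""
    if d == 0.0:
        return math.inf
    exact = 1.0 / math.sqrt(d)
    # Keep roughly 8 bits of the mantissa to mimic a coarse estimate.
    m, e = math.frexp(exact)
    return math.ldexp(round(m * 256) / 256, e)

def frsqrts(a, b):
    """Model of FRSQRTS: (3 - a*b) / 2, with zero times infinity giving 1.5."""
    if (a == 0.0 and math.isinf(b)) or (math.isinf(a) and b == 0.0):
        return 1.5
    return (3.0 - a * b) / 2.0

def newton_sqrt(d, fixup=True):
    """Follow the instruction sequence step by step (hypothetical helper)."""
    x = frsqrte(d)             # frsqrte  s1, s0
    t = x * x                  # fmul     s2, s1, s1
    t = frsqrts(d, t)          # frsqrts  s2, s0, s2
    is_zero = (d == 0.0)       # fcmp     s0, 0.0
    x = x * t                  # fmul     s1, s1, s2
    t = x * x                  # fmul     s2, s1, s1
    r = d * x                  # fmul     s1, s0, s1  (0 * inf = NaN if d == 0)
    t = frsqrts(d, t)          # frsqrts  s2, s0, s2
    if fixup and is_zero:      # fcsel    s1, s0, s1, eq
        r = d
    return r * t               # fmul     s0, s1, s2

if __name__ == "__main__":
    for d in (0.0, 2.0, 9.0):
        print(d, newton_sqrt(d), newton_sqrt(d, fixup=False))
```

With `fixup=False` the model skips the select, and the zero input propagates the NaN from 0 * inf through the final multiply. With the select in place the final multiply is 0 * 1.5.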

Wilco

Follow-Ups:
- Re: [AArch64] Emit square root using the Newton series
  - From: Evandro Menezes

References:
- Re: [AArch64] Emit square root using the Newton series
  - From: Wilco Dijkstra
- Re: [AArch64] Emit square root using the Newton series
  - From: Evandro Menezes
