This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: [AArch64] Emit square root using the Newton series


On 03/10/16 13:10, Wilco Dijkstra wrote:
     frsqrte  s1, s0
     fmul     s2, s1, s1
     frsqrts  s2, s0, s2
     fcmp     s0, 0.0
     fmul     s1, s1, s2
     fmul     s2, s1, s1
     fmul     s1, s0, s1
     frsqrts  s2, s0, s2
     fcsel    s1, s0, s1, eq
     fmul     s0, s1, s2

That's what I had in mind too, but built around the approximation for x^(-1/2), and using masks for the vector cases, like this:

        fcmne   v3.4s, v0.4s, #0.0
        frsqrte v1.4s, v0.4s
        fmul    v2.4s, v1.4s, v1.4s
        frsqrts v2.4s, v0.4s, v2.4s
        fmul    v1.4s, v1.4s, v2.4s
        fmul    v2.4s, v1.4s, v1.4s
        frsqrts v2.4s, v0.4s, v2.4s
        fmul    v1.4s, v1.4s, v2.4s
        and     v1.4s, v3.4s
        fmul    v0.4s, v1.4s, v0.4s
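
To make the zero handling in the two sequences easier to compare, here is a rough C model of a single lane. It is only a sketch under assumptions: rsqrt_estimate, rsqrt_step, sqrt_select and sqrt_mask are hypothetical names, the estimate is the well-known bit-trick approximation standing in for FRSQRTE (so it is cruder than the hardware estimate), and the step is modelled unfused as (3 - a*b)/2 with 0 * inf taken to give 1.5. It only considers finite x >= 0.

```c
#include <math.h>     /* INFINITY only; no libm calls are made */
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for FRSQRTE: a bit-trick estimate of 1/sqrt(x).
   Like the instruction, it yields +inf for x == 0. */
static float rsqrt_estimate(float x)
{
    if (x == 0.0f)
        return INFINITY;
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);
    bits = 0x5f3759dfu - (bits >> 1);
    float r;
    memcpy(&r, &bits, sizeof r);
    return r;
}

/* Model of FRSQRTS: (3 - a*b) / 2, with 0 * inf taken to give 1.5.
   The real instruction is fused; this model rounds the product. */
static float rsqrt_step(float a, float b)
{
    if ((a == 0.0f && b == INFINITY) || (a == INFINITY && b == 0.0f))
        return 1.5f;
    return (3.0f - a * b) / 2.0f;
}

/* Scalar form: x is folded into the last step, and a select replaces
   the NaN produced by 0 * inf with x itself. */
static float sqrt_select(float x)
{
    float e = rsqrt_estimate(x);
    float t = rsqrt_step(x, e * e);
    e = e * t;                    /* refined 1/sqrt(x) */
    t = e * e;
    e = x * e;                    /* NaN when x == 0 */
    t = rsqrt_step(x, t);
    e = (x == 0.0f) ? x : e;      /* models fcmp + fcsel */
    return e * t;
}

/* Mask form: refine 1/sqrt(x) twice, clear the lanes where x == 0,
   then multiply by x. */
static float sqrt_mask(float x)
{
    uint32_t mask = (x != 0.0f) ? 0xFFFFFFFFu : 0u;  /* compare with 0.0 */
    float e = rsqrt_estimate(x);
    e = e * rsqrt_step(x, e * e);
    e = e * rsqrt_step(x, e * e);
    uint32_t bits;
    memcpy(&bits, &e, sizeof bits);
    bits &= mask;                 /* inf becomes +0.0 in zero lanes */
    memcpy(&e, &bits, sizeof e);
    return e * x;
}
```

In this model both forms return 0 for x == 0 rather than NaN. The mask form pays one extra multiply because it finishes refining 1/sqrt(x) before multiplying by x, whereas the scalar form folds x into the second step.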


Thanks,

--
Evandro Menezes

