This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
Re: [PATCH 2/3][AArch64] Emit square root using the Newton series
- From: James Greenhalgh <james dot greenhalgh at arm dot com>
- To: Evandro Menezes <e dot menezes at samsung dot com>
- Cc: GCC Patches <gcc-patches at gcc dot gnu dot org>, Wilco Dijkstra <Wilco dot Dijkstra at arm dot com>, Andrew Pinski <pinskia at gmail dot com>, "philipp dot tomsich at theobroma-systems dot com" <philipp dot tomsich at theobroma-systems dot com>, Benedikt Huber <benedikt dot huber at theobroma-systems dot com>, <nd at arm dot com>
- Date: Mon, 13 Jun 2016 11:11:32 +0100
- Subject: Re: [PATCH 2/3][AArch64] Emit square root using the Newton series
- References: <57212C01 dot 3040508 at samsung dot com> <20160525155240 dot GC9511 at arm dot com> <5748D0D6 dot 80902 at samsung dot com> <20160601090002 dot GB5853 at arm dot com> <5751FB98 dot 60904 at samsung dot com>
On Fri, Jun 03, 2016 at 04:50:16PM -0500, Evandro Menezes wrote:
> >>+ return false;
> >>- emit_insn ((*get_rsqrte_type (mode)) (x0, xsrc));
> >>+  rtx xmsk = gen_reg_rtx (mmsk);
> >>+  if (!recp)
> >>+    /* When calculating the approximate square root, compare the argument with
> >>+       0.0 and create a mask. */
> >>+    emit_insn (gen_rtx_SET (xmsk, gen_rtx_NEG (mmsk, gen_rtx_EQ (mmsk, src,
> >>+                            CONST0_RTX (mode)))));
> >I guess you've done it this way rather than calling gen_aarch64_cmeq<mode>
> >directly to avoid having a switch on mode? I wonder whether it is worth just
> >writing that helper function to make it explicit what instruction we want
> >to match?
>
> I prefer to avoid calling the gen_...() functions for forward
> portability. If a future version of the ISA can do it better than
> the explicit gen_...() function, then this just works. Or at least
> this is the hope. Again, this is just me.
I prefer calling the gen functions, in the hope that those patterns would
be "upgraded" to cover the new ISA versions. But I can see your argument,
so I'm happy to drop this comment.
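
For anyone following the thread, here is a minimal scalar sketch of why the
compare-with-0.0 mask matters. It assumes the approximate square root is
formed as d * (1/sqrt(d)), with the reciprocal square root refined by Newton
steps. It is not the GCC code: rsqrt_estimate, rsqrt_step and approx_sqrt are
hypothetical names, and the bit-trick estimate is only a stand-in for the
hardware estimate instruction. Negative inputs are not handled.

```c
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the hardware estimate: a coarse 1/sqrt(x).
   Like the real instruction, it yields infinity for a zero input.  */
static float rsqrt_estimate(float x)
{
    if (x == 0.0f)
        return INFINITY;
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);
    bits = 0x5f3759dfu - (bits >> 1);   /* well-known coarse estimate */
    float y;
    memcpy(&y, &bits, sizeof y);
    return y;
}

/* One series step factor, (3 - a * b) / 2.  */
static float rsqrt_step(float a, float b)
{
    return (3.0f - a * b) * 0.5f;
}

/* Approximate sqrt(d) as d * (1/sqrt(d)) for d >= 0.  */
float approx_sqrt(float d, int iterations)
{
    /* mask = -(d == 0.0): all ones when d is zero, all zeros otherwise.  */
    uint32_t mask = -(uint32_t)(d == 0.0f);

    float x = rsqrt_estimate(d);

    /* Squash the estimate to 0.0 when d is zero, so the infinity never
       reaches the multiplications below.  */
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);
    bits &= ~mask;
    memcpy(&x, &bits, sizeof x);

    /* x <- x * (3 - d * x * x) / 2  */
    for (int i = 0; i < iterations; i++)
        x = x * rsqrt_step(d * x, x);

    return d * x;
}
```

Without the mask, d = 0.0 gives an infinite estimate and the first step
evaluates 0.0 * infinity, which is NaN. With the mask the estimate becomes
0.0 and the result is 0.0, as expected.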
> @@ -7369,10 +7372,10 @@ aarch64_builtin_reciprocal (tree fndecl)
>
> typedef rtx (*rsqrte_type) (rtx, rtx);
>
> -/* Select reciprocal square root initial estimate
> - insn depending on machine mode. */
> +/* Select reciprocal square root initial estimate insn depending on machine
> + mode. */
>
> -rsqrte_type
> +static rsqrte_type
> get_rsqrte_type (machine_mode mode)
> {
> switch (mode)
> @@ -7382,16 +7385,15 @@ get_rsqrte_type (machine_mode mode)
> case V2DFmode: return gen_aarch64_rsqrte_v2df2;
> case V2SFmode: return gen_aarch64_rsqrte_v2sf2;
> case V4SFmode: return gen_aarch64_rsqrte_v4sf2;
> - default: gcc_unreachable ();
> + default: gcc_unreachable ();
> }
> }
>
> typedef rtx (*rsqrts_type) (rtx, rtx, rtx);
>
> -/* Select reciprocal square root Newton-Raphson step
> - insn depending on machine mode. */
> +/* Select reciprocal square root series step insn depending on machine mode. */
>
> -rsqrts_type
> +static rsqrts_type
> get_rsqrts_type (machine_mode mode)
> {
> switch (mode)
> @@ -7401,50 +7403,88 @@ get_rsqrts_type (machine_mode mode)
> case V2DFmode: return gen_aarch64_rsqrts_v2df3;
> case V2SFmode: return gen_aarch64_rsqrts_v2sf3;
> case V4SFmode: return gen_aarch64_rsqrts_v4sf3;
> - default: gcc_unreachable ();
> + default: gcc_unreachable ();
> }
> }
You'll find these two hunks hit a merge conflict on trunk after Jiong's
recent changes to these pattern names. Just be careful when applying the
patch.
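
For reference, the selector shape used in the two hunks above (a file-local
function that switches on the mode and returns a pointer to the matching
generator) can be sketched outside GCC as follows. Everything here is a
hypothetical stand-in: enum mode, the gen_estimate_* functions and
get_estimate_fn are illustration only, and abort () plays the role of
gcc_unreachable ().

```c
#include <stdlib.h>

/* Hypothetical stand-ins for a few machine modes.  */
enum mode { MODE_SF, MODE_DF, MODE_V4SF, MODE_OTHER };

/* Pointer to a per-mode generator; here each one only names its mode.  */
typedef const char *(*estimate_fn)(void);

static const char *gen_estimate_sf(void)   { return "estimate SF"; }
static const char *gen_estimate_df(void)   { return "estimate DF"; }
static const char *gen_estimate_v4sf(void) { return "estimate V4SF"; }

/* File-local selector: 'static' keeps the helper out of the global
   namespace, since only this file calls it.  */
static estimate_fn get_estimate_fn(enum mode m)
{
    switch (m) {
    case MODE_SF:   return gen_estimate_sf;
    case MODE_DF:   return gen_estimate_df;
    case MODE_V4SF: return gen_estimate_v4sf;
    default:        abort();   /* unsupported mode */
    }
}

/* Caller: pick the generator for the mode, then invoke it.  */
const char *emit_estimate(enum mode m)
{
    return get_estimate_fn(m)();
}
```

A renamed pattern then only touches the case lines in the selector, which is
why such hunks tend to conflict when pattern names change.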
The patch is OK for trunk.
Thanks,
James
> From 5c5c07f38cb06507fe997a890dfc5bae1d3179f6 Mon Sep 17 00:00:00 2001
> From: Evandro Menezes <e.menezes@samsung.com>
> Date: Mon, 4 Apr 2016 11:23:29 -0500
> Subject: [PATCH 2/3] [AArch64] Emit square root using the Newton series
>
> 2016-04-04  Evandro Menezes  <e.menezes@samsung.com>
>             Wilco Dijkstra  <wilco.dijkstra@arm.com>
>
> gcc/
>         * config/aarch64/aarch64-protos.h
>         (aarch64_emit_approx_rsqrt): Replace with new function
>         "aarch64_emit_approx_sqrt".
>         (cpu_approx_modes): New member "sqrt".
>         * config/aarch64/aarch64.c
>         (generic_approx_modes): New member "sqrt".
>         (exynosm1_approx_modes): Likewise.
>         (xgene1_approx_modes): Likewise.
>         (aarch64_emit_approx_rsqrt): Replace with new function
>         "aarch64_emit_approx_sqrt".
>         (aarch64_override_options_after_change_1): Handle new option.
>         * config/aarch64/aarch64-simd.md
>         (rsqrt<mode>2): Use new function instead.
>         (sqrt<mode>2): New expansion and insn definitions.
>         * config/aarch64/aarch64.md: Likewise.
>         * config/aarch64/aarch64.opt
>         (mlow-precision-sqrt): Add new option description.
>         * doc/invoke.texi (mlow-precision-sqrt): Likewise.