This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
Re: [PATCH][ARM] NEON DImode neg
- From: Richard Earnshaw <rearnsha at arm dot com>
- To: Andrew Stubbs <ams at codesourcery dot com>
- Cc: "gcc-patches at gcc dot gnu dot org" <gcc-patches at gcc dot gnu dot org>, "patches at linaro dot org" <patches at linaro dot org>
- Date: Thu, 12 Apr 2012 16:48:02 +0100
- Subject: Re: [PATCH][ARM] NEON DImode neg
- References: <4F4D12C5.9070805@codesourcery.com> <4F704189.4010302@codesourcery.com>
On 26/03/12 11:14, Andrew Stubbs wrote:
> On 28/02/12 17:45, Andrew Stubbs wrote:
>> Hi all,
>>
>> This patch adds a DImode negate pattern for NEON.
>>
>> Unfortunately, the NEON vneg instruction only supports vectors, not
>> singletons, so there's no direct way to do it in DImode, and the
>> compiler ends up moving the value back to core registers, negating it,
>> and returning to NEON afterwards:
>>
>> fmrrd r2, r3, d16 @ int
>> negs r2, r2
>> sbc r3, r3, r3, lsl #1
>> fmdrr d16, r2, r3 @ int
>>
>> The new patch does it entirely in NEON:
>>
>> vmov.i32 d17, #0 @ di
>> vsub.i64 d16, d17, d16
>>
>> (Note that this is the result when combined with my recent patch for
>> NEON DImode immediates. Without that you get a constant pool load.)
>
> This update fixes a bootstrap failure caused by an early-clobber error.
> I've also got a native regression test running now.
>
> OK?
>
> Andrew
>
>
> neon-neg64.patch
>
>
> 2012-03-26 Andrew Stubbs <ams@codesourcery.com>
>
> gcc/
> * config/arm/arm.md (negdi2): Use gen_negdi2_neon.
> * config/arm/neon.md (negdi2_neon): New insn.
> Also add splitters for core and NEON registers.
>
> ---
> gcc/config/arm/arm.md | 8 +++++++-
> gcc/config/arm/neon.md | 37 +++++++++++++++++++++++++++++++++++++
> 2 files changed, 44 insertions(+), 1 deletions(-)
>
> diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
> index 751997f..f1dbbf7 100644
> --- a/gcc/config/arm/arm.md
> +++ b/gcc/config/arm/arm.md
> @@ -4048,7 +4048,13 @@
> (neg:DI (match_operand:DI 1 "s_register_operand" "")))
> (clobber (reg:CC CC_REGNUM))])]
> "TARGET_EITHER"
> - ""
> + {
> + if (TARGET_NEON)
> + {
> + emit_insn (gen_negdi2_neon (operands[0], operands[1]));
> + DONE;
> + }
> + }
> )
>
> ;; The constraints here are to prevent a *partial* overlap (where %Q0 == %R1).
> diff --git a/gcc/config/arm/neon.md b/gcc/config/arm/neon.md
> index 3c88568..bf229a7 100644
> --- a/gcc/config/arm/neon.md
> +++ b/gcc/config/arm/neon.md
> @@ -922,6 +922,43 @@
> (const_string "neon_int_3")))]
> )
>
> +(define_insn "negdi2_neon"
> + [(set (match_operand:DI 0 "s_register_operand" "= w,?r,?&r,?w")
> + (neg:DI (match_operand:DI 1 "s_register_operand" " w, 0, r, w")))
> + (clobber (match_scratch:DI 2 "=&w, X, X,&w"))
> + (clobber (reg:CC CC_REGNUM))]
> + "TARGET_NEON"
> + "#"
> + [(set_attr "length" "8")
> + (set_attr "arch" "nota8,*,*,onlya8")]
> +)
> +
If negation in Neon needs a scratch register, it seems to me to be
somewhat odd that we're disparaging the ARM version.

Also, wouldn't it be sensible to support a variant that was
early-clobber on operand 0, but loaded immediate zero into that value first:

	vmov	Dd, #0
	vsub	Dd, Dd, Dm

That way you'll never need more than two registers, whereas today you
want three.

R.
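
For readers following the arithmetic in this thread, here is a minimal C sketch (not part of the patch) of the instruction sequences discussed above. It models the core-register pair (negs / sbc with "lsl #1"), the NEON zero-then-subtract form, and the two-register variant suggested in the reply, and checks that they agree. The function names are hypothetical, and this is plain C rather than anything GCC generates; it only illustrates why the sequences compute the same 64-bit negation.

```c
#include <assert.h>
#include <stdint.h>

/* Models: negs lo, lo ; sbc hi, hi, hi, lsl #1
   After negs, the ARM carry flag is set only when 0 - lo does not
   borrow, i.e. when lo == 0.  sbc then computes hi - (hi << 1) - !C,
   which is -hi minus the borrow from the low word. */
static uint64_t neg_core(uint64_t x)
{
    uint32_t lo = (uint32_t)x;
    uint32_t hi = (uint32_t)(x >> 32);
    uint32_t carry = (lo == 0);                        /* C flag after negs */
    uint32_t new_lo = 0u - lo;                         /* negs */
    uint32_t new_hi = hi - (hi << 1) - (1u - carry);   /* sbc  */
    return ((uint64_t)new_hi << 32) | new_lo;
}

/* Models: vmov.i32 d17, #0 ; vsub.i64 d16, d17, d16
   Uses a separate scratch register for the zero. */
static uint64_t neg_neon(uint64_t d16)
{
    uint64_t d17 = 0;          /* scratch holds zero */
    return d17 - d16;          /* 64-bit wrapping subtract */
}

/* Models the suggested variant: vmov Dd, #0 ; vsub Dd, Dd, Dm
   The destination doubles as the zero, so no third register is used.
   Dd must not alias Dm, which is what early-clobber would enforce. */
static uint64_t neg_two_reg(uint64_t dm)
{
    uint64_t dd = 0;
    dd = dd - dm;
    return dd;
}

/* Hypothetical helper: asserts that all three models agree for x. */
static void check_negation(uint64_t x)
{
    uint64_t expected = (uint64_t)0 - x;
    assert(neg_core(x) == expected);
    assert(neg_neon(x) == expected);
    assert(neg_two_reg(x) == expected);
}
```

Calling check_negation from a small main() with values such as 0, 1, and 0x100000000 exercises the borrow and no-borrow cases of the core-register sequence.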