This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.

Re: [GCC][PATCH][Aarch64] Replace umov with cheaper fmov in popcount expansion

From: Sam Tebbs <Sam dot Tebbs at arm dot com>
To: Richard Earnshaw <Richard dot Earnshaw at arm dot com>, "gcc-patches at gcc dot gnu dot org" <gcc-patches at gcc dot gnu dot org>
Cc: Marcus Shawcroft <Marcus dot Shawcroft at arm dot com>, James Greenhalgh <James dot Greenhalgh at arm dot com>, nd <nd at arm dot com>
Date: Mon, 29 Oct 2018 10:31:35 +0000
Subject: Re: [GCC][PATCH][Aarch64] Replace umov with cheaper fmov in popcount expansion
References: <744c7940-641d-3133-3103-9858741a86d5@arm.com> <740072f6-f988-3c0b-c434-ef1a3186d7bd@arm.com>

On 10/23/2018 02:50 PM, Richard Earnshaw (lists) wrote:

> On 22/10/2018 10:02, Sam Tebbs wrote:
>> diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
>> index d7473418a8eb62b2757017cd1675493f86e41ef4..77e6f75cc15f06733df7b47906ee00580bea8d29 100644
>> --- a/gcc/config/aarch64/aarch64.md
>> +++ b/gcc/config/aarch64/aarch64.md
>> @@ -4489,7 +4489,7 @@
>>     emit_move_insn (v, gen_lowpart (V8QImode, in));
>>     emit_insn (gen_popcountv8qi2 (v1, v));
>>     emit_insn (gen_reduc_plus_scal_v8qi (r, v1));
>> -  emit_insn (gen_zero_extendqi<mode>2 (out, r));
>> +  emit_move_insn (out, gen_lowpart_SUBREG (GET_MODE (out), r));
> I don't think this is right.  You're effectively creating a paradoxical
> subreg here and relying on an unstated side effect of an earlier
> instruction for correct behaviour.
>
> What you really need is a pattern that generates the zero-extend in
> combination with the reduction operation.  So something like
>
> (set (reg:DI)
>       (zero_extend:DI (unspec:VecMode [(reg:VecMode)] UNSPEC_ADDV)))

Hi Richard,

Thanks for the feedback. What assembly would you expect such a pattern 
to produce?

I'm a bit unclear on what you mean by "the reduction operation", but
I'm assuming you're referring to the fmov in this case.

>
> now you can copy all, or part, of that register directly across to the
> integer side and the RTL remains mathematically accurate.
>
> R.
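
For readers following the thread, here is a minimal C sketch of the arithmetic the expansion performs, under the assumption that gen_popcountv8qi2 corresponds to a per-byte count (CNT) and gen_reduc_plus_scal_v8qi to a horizontal add (ADDV). The function names are hypothetical and this is not code from the patch; it only illustrates that the final widening is an explicit zero-extension of an 8-bit sum, which is the step the review asks the RTL to state rather than imply.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Per-byte population count, modelling one lane of a V8QI popcount. */
static uint8_t byte_popcount(uint8_t b)
{
    uint8_t n = 0;
    while (b) {
        n += b & 1u;
        b >>= 1;
    }
    return n;
}

/* Hypothetical model of the expansion: view the 64-bit input as eight
   bytes, count the bits in each byte, sum the eight counts into an
   8-bit scalar, then widen the result to 64 bits.  */
uint64_t popcount64_model(uint64_t in)
{
    uint8_t lanes[8];
    memcpy(lanes, &in, sizeof lanes);

    uint8_t sum = 0;                      /* at most 64, so it fits in 8 bits */
    for (int i = 0; i < 8; i++)
        sum = (uint8_t)(sum + byte_popcount(lanes[i]));

    /* The cast is an explicit zero-extension: the upper 56 bits are
       defined to be zero instead of being left to a side effect.  */
    return (uint64_t)sum;
}

/* Straightforward bit-by-bit reference used for comparison. */
uint64_t popcount64_reference(uint64_t in)
{
    uint64_t n = 0;
    for (; in; in >>= 1)
        n += in & 1u;
    return n;
}

/* Asserts that the model and the reference agree for one input. */
void check_popcount(uint64_t x)
{
    assert(popcount64_model(x) == popcount64_reference(x));
}
```

Calling check_popcount on a handful of values (zero, all ones, mixed patterns) is enough to confirm the model. The sketch says nothing about which instruction moves the result to the integer side; it only shows why the widened value is well defined once the zero-extension is part of the operation.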

Follow-Ups:
- Re: [GCC][PATCH][Aarch64] Replace umov with cheaper fmov in popcount expansion
  - From: Richard Henderson

References:
- [GCC][PATCH][Aarch64] Replace umov with cheaper fmov in popcount expansion
  - From: Sam Tebbs
- Re: [GCC][PATCH][Aarch64] Replace umov with cheaper fmov in popcount expansion
  - From: Richard Earnshaw (lists)