[GCC][PATCH][Aarch64] Stop redundant zero-extension after UMOV when in DI mode

Thu Jul 26 16:52:00 GMT 2018

On 07/25/2018 07:08 PM, Sudakshina Das wrote:
> Hi Sam
>
> On 25/07/18 14:08, Sam Tebbs wrote:
>> On 07/23/2018 05:01 PM, Sudakshina Das wrote:
>>> Hi Sam
>>>
>>>
>>> On Monday 23 July 2018 11:39 AM, Sam Tebbs wrote:
>>>> Hi all,
>>>>
>>>> This patch extends the aarch64_get_lane_zero_extendsi instruction 
>>>> definition to
>>>> also cover DI mode. This prevents a redundant AND instruction from 
>>>> being
>>>> generated due to the pattern failing to be matched.
>>>>
>>>> Example:
>>>>
>>>> typedef char v16qi __attribute__ ((vector_size (16)));
>>>>
>>>> unsigned long long
>>>> foo (v16qi a)
>>>> {
>>>> Â  return a[0];
>>>> }
>>>>
>>>> Previously generated:
>>>>
>>>> foo:
>>>> Â Â Â Â Â Â Â  umovÂ Â Â  w0, v0.b[0]
>>>> Â Â Â Â Â Â Â  andÂ Â Â Â  x0, x0, 255
>>>> Â Â Â Â Â Â Â  ret
>>>>
>>>> And now generates:
>>>>
>>>> foo:
>>>> Â Â Â Â Â Â Â  umovÂ Â Â  w0, v0.b[0]
>>>> Â Â Â Â Â Â Â  ret
>>>>
>>>> Bootstrapped on aarch64-none-linux-gnu and tested on 
>>>> aarch64-none-elf with no
>>>> regressions.
>>>>
>>>> gcc/
>>>> 2018-07-23Â  Sam Tebbs <sam.tebbs@arm.com>
>>>>
>>>> Â Â Â Â Â Â Â  * config/aarch64/aarch64-simd.md
>>>> Â Â Â  (*aarch64_get_lane_zero_extendsi<mode>):
>>>> Â Â Â Â Â Â Â  Rename to...
>>>> (*aarch64_get_lane_zero_extend<mode><VDQQH:mode>): ... This.
>>>> Â Â Â Â Â Â Â  Use GPI iterator instead of SI mode.
>>>>
>>>> gcc/testsuite
>>>> 2018-07-23Â  Sam Tebbs <sam.tebbs@arm.com>
>>>>
>>>> Â Â Â Â Â Â Â  * gcc.target/aarch64/extract_zero_extend.c: New file
>>>>
>>> You will need an approval from a maintainer, but I would only add 
>>> one request to this:
>>>
>>> diff --git a/gcc/config/aarch64/aarch64-simd.md 
>>> b/gcc/config/aarch64/aarch64-simd.md
>>> index 89e38e6..15fb661 100644
>>> --- a/gcc/config/aarch64/aarch64-simd.md
>>> +++ b/gcc/config/aarch64/aarch64-simd.md
>>> @@ -3032,15 +3032,16 @@
>>> Â Â  [(set_attr "type" "neon_to_gp<q>")]
>>> Â )
>>>
>>> -(define_insn "*aarch64_get_lane_zero_extendsi<mode>"
>>> -Â  [(set (match_operand:SI 0 "register_operand" "=r")
>>> -Â Â Â  (zero_extend:SI
>>> +(define_insn "*aarch64_get_lane_zero_extend<mode><VDQQH:mode>"
>>> +Â  [(set (match_operand:GPI 0 "register_operand" "=r")
>>> +Â Â Â  (zero_extend:GPI
>>>
>>> Since you are adding 4 new patterns with this change, could you add
>>> more cases in your test as well to make sure you have coverage for 
>>> each of them.
>>>
>>> Thanks
>>> Sudi
>>
>> Hi Sudi,
>>
>> Thanks for the feedback. Here is an updated patch that adds more 
>> testcases to cover the patterns generated by the different mode 
>> combinations. The changelog and description from my original email 
>> still apply.
>>
>
> Thanks for making the changes and adding more test cases. I do however
> see that you are only covering 2 out of 4 new
> *aarch64_get_lane_zero_extenddi<> patterns. The
> *aarch64_get_lane_zero_extendsi<> were already existing. I don't mind
> those tests. I would just ask you to add the other two new patterns
> as well. Also since the different versions of the instruction generate
> same instructions (like foo_16qi and foo_8qi both give out the same
> instruction), I would suggest using a -fdump-rtl-final (or any relevant
> rtl dump) with the dg-options and using a scan-rtl-dump to scan the
> pattern name. Something like:
> /* { dg-do compile } */
> /* { dg-options "-O3 -fdump-rtl-final" } */
> ...
> ...
> /* { dg-final { scan-rtl-dump "aarch64_get_lane_zero_extenddiv16qi" 
> "final" } } */
>
> Thanks
> Sudi

Hi Sudi,

Thanks again. Here's an update that adds 4 more tests, so all 8 patterns
generated are now tested for!

Below is the updated changelog

gcc/
2018-07-26Â  Sam TebbsÂ  <sam.tebbs@arm.com>

 Â Â Â Â Â Â Â  * config/aarch64/aarch64-simd.md
 Â Â Â Â Â Â Â  (*aarch64_get_lane_zero_extendsi<mode>):
 Â Â Â Â Â Â Â  Rename to...
(*aarch64_get_lane_zero_extend<GPI:mode><VDQQH:mode>): ... This.
 Â Â Â Â Â Â Â  Use GPI iterator instead of SI mode.

gcc/testsuite
2018-07-26Â  Sam TebbsÂ  <sam.tebbs@arm.com>

 Â Â Â Â Â Â Â  * gcc.target/aarch64/extract_zero_extend.c: New file

>
>>>
>>> Â Â Â Â  Â  (vec_select:<VEL>
>>> Â Â Â Â  Â Â Â  (match_operand:VDQQH 1 "register_operand" "w")
>>> Â Â Â Â  Â Â Â  (parallel [(match_operand:SI 2 "immediate_operand" "i")]))))]
>>> Â Â  "TARGET_SIMD"
>>> Â Â  {
>>> -Â Â Â  operands[2] = aarch64_endian_lane_rtx (<MODE>mode, INTVAL 
>>> (operands[2]));
>>> +Â Â Â  operands[2] = aarch64_endian_lane_rtx (<VDQQH:MODE>mode,
>>> +Â Â Â  Â Â Â  Â Â Â  Â Â Â  Â Â Â  Â Â  INTVAL (operands[2]));
>>> Â Â Â Â  return "umov\\t%w0, %1.<Vetype>[%2]";
>>> Â Â  }
>>> Â Â  [(set_attr "type" "neon_to_gp<q>")]
>>
>

-------------- next part --------------
A non-text attachment was scrubbed...
Name: latest.patch
Type: text/x-patch
Size: 3430 bytes
Desc: not available
URL: <http://gcc.gnu.org/pipermail/gcc-patches/attachments/20180726/252156f4/attachment.bin>