This is the mail archive of the mailing list for the GCC project.

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [GCC][PATCH][Aarch64] Stop redundant zero-extension after UMOV when in DI mode

Hi Sam

On 25/07/18 14:08, Sam Tebbs wrote:
On 07/23/2018 05:01 PM, Sudakshina Das wrote:
Hi Sam

On Monday 23 July 2018 11:39 AM, Sam Tebbs wrote:
Hi all,

This patch extends the aarch64_get_lane_zero_extendsi instruction definition to
also cover DI mode. This prevents a redundant AND instruction from being
generated due to the pattern failing to be matched.


typedef char v16qi __attribute__ ((vector_size (16)));

unsigned long long
foo (v16qi a)
  return a[0];

Previously generated:

        umov    w0, v0.b[0]
        and     x0, x0, 255

And now generates:

        umov    w0, v0.b[0]

Bootstrapped on aarch64-none-linux-gnu and tested on aarch64-none-elf with no

2018-07-23  Sam Tebbs <>

        * config/aarch64/
        Rename to...
(*aarch64_get_lane_zero_extend<mode><VDQQH:mode>): ... This.
        Use GPI iterator instead of SI mode.

2018-07-23  Sam Tebbs <>

        * New file

You will need an approval from a maintainer, but I would only add one request to this:

diff --git a/gcc/config/aarch64/ b/gcc/config/aarch64/
index 89e38e6..15fb661 100644
--- a/gcc/config/aarch64/
+++ b/gcc/config/aarch64/
@@ -3032,15 +3032,16 @@
   [(set_attr "type" "neon_to_gp<q>")]

-(define_insn "*aarch64_get_lane_zero_extendsi<mode>"
-  [(set (match_operand:SI 0 "register_operand" "=r")
-    (zero_extend:SI
+(define_insn "*aarch64_get_lane_zero_extend<mode><VDQQH:mode>"
+  [(set (match_operand:GPI 0 "register_operand" "=r")
+    (zero_extend:GPI

Since you are adding 4 new patterns with this change, could you add
more cases in your test as well to make sure you have coverage for each of them.


Hi Sudi,

Thanks for the feedback. Here is an updated patch that adds more testcases to cover the patterns generated by the different mode combinations. The changelog and description from my original email still apply.

Thanks for making the changes and adding more test cases. I do however
see that you are only covering 2 out of 4 new
*aarch64_get_lane_zero_extenddi<> patterns. The
*aarch64_get_lane_zero_extendsi<> were already existing. I don't mind
those tests. I would just ask you to add the other two new patterns
as well. Also since the different versions of the instruction generate
same instructions (like foo_16qi and foo_8qi both give out the same
instruction), I would suggest using a -fdump-rtl-final (or any relevant
rtl dump) with the dg-options and using a scan-rtl-dump to scan the
pattern name. Something like:
/* { dg-do compile } */
/* { dg-options "-O3 -fdump-rtl-final" } */
/* { dg-final { scan-rtl-dump "aarch64_get_lane_zero_extenddiv16qi" "final" } } */


         (match_operand:VDQQH 1 "register_operand" "w")
         (parallel [(match_operand:SI 2 "immediate_operand" "i")]))))]
-    operands[2] = aarch64_endian_lane_rtx (<MODE>mode, INTVAL (operands[2]));
+    operands[2] = aarch64_endian_lane_rtx (<VDQQH:MODE>mode,
+                       INTVAL (operands[2]));
     return "umov\\t%w0, %1.<Vetype>[%2]";
   [(set_attr "type" "neon_to_gp<q>")]

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]