This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: [PATCH][aarch64] Fix target/pr77729 - missed optimization related to zero extension


On Thu, 2017-09-14 at 11:53 -0600, Jeff Law wrote:
> 
> 
> And I think that's starting to zero in on the problem --
> WORD_REGISTER_OPERATIONS is zero on aarch64 as you don't get extension
> to word_mode for W form registers.
> 
> I wonder if what needs to happen is somehow look to extend that code
> somehow so that combine and friends know that the value is zero extended
> to 32 bits, even if it's not extended to word_mode.
> 
> Jeff

This might be a good long-term direction, but in the meantime it
seems a lot easier to just generate a subreg.  Here is a patch that
does that.  It passes bootstrap with no regressions and fixes the bug
in question (and most likely improves other code as well).
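
As a rough illustration of why the later masking is redundant, here is
a sketch in plain C (not part of the patch).  It assumes that a byte
load into a W register leaves bits 8..31 zero, and that the loaded
character is treated as unsigned; the helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a zero-extending byte load: the byte lands in
   bits 0..7 of a 32-bit value and bits 8..31 are zero.  */
static uint32_t load_byte_zero_extended(const unsigned char *p)
{
    return (uint32_t)*p;
}

/* Comparison with an explicit mask, as emitted when the compiler does
   not know that the upper bits are already zero.  */
static int matches_with_mask(const unsigned char *p, unsigned char lower)
{
    uint32_t w = load_byte_zero_extended(p);
    return ((w | 32u) & 0xffu) == lower;
}

/* The same comparison without the mask.  */
static int matches_without_mask(const unsigned char *p, unsigned char lower)
{
    uint32_t w = load_byte_zero_extended(p);
    return (w | 32u) == lower;
}

/* Returns 1 if the two forms agree for every possible byte value.  */
static int mask_is_redundant(void)
{
    for (unsigned v = 0; v < 256; v++) {
        unsigned char c = (unsigned char)v;
        if (matches_with_mask(&c, 't') != matches_without_mask(&c, 't'))
            return 0;
    }
    return 1;
}
```

Because OR-ing with 32 cannot set any bit above bit 7, the masked and
unmasked forms give the same result in this model.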

The "LOAD_EXTEND_OP (<MODE>mode) == ZERO_EXTEND" part of the if
statement is not strictly necessary, since we know this is true on
aarch64, but I thought it helped make clear what we are doing, and
the compiler should optimize it away anyway.

OK to check in this fix while we consider longer-term options?

Steve Ellcey
sellcey@cavium.com


2017-09-14  Steve Ellcey  <sellcey@cavium.com>

        PR target/77729
        * config/aarch64/aarch64.md (mov<mode>): Generate subreg for
        short loads to reflect that upper bits are zeroed out on load.


2017-09-14  Steve Ellcey  <sellcey@cavium.com>

	* gcc.target/aarch64/pr77729.c: New test.

diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index f8cdb06..bca4cf5 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -864,6 +864,15 @@
 	(match_operand:SHORT 1 "general_operand" ""))]
   ""
   "
+    if (LOAD_EXTEND_OP (<MODE>mode) == ZERO_EXTEND && MEM_P (operands[1])
+	&& can_create_pseudo_p () && optimize > 0)
+      {
+	/* Generate subreg of SImode so we know that the upper bits
+	of the reg are zero and do not need to be masked out later.  */
+	rtx reg = gen_reg_rtx (SImode);
+	emit_insn (gen_zero_extend<mode>si2 (reg, operands[1]));
+	operands[1] = gen_lowpart (<MODE>mode, reg);
+      }
     if (GET_CODE (operands[0]) == MEM && operands[1] != const0_rtx)
       operands[1] = force_reg (<MODE>mode, operands[1]);
   "
diff --git a/gcc/testsuite/gcc.target/aarch64/pr77729.c b/gcc/testsuite/gcc.target/aarch64/pr77729.c
index e69de29..2fcda9a 100644
--- a/gcc/testsuite/gcc.target/aarch64/pr77729.c
+++ b/gcc/testsuite/gcc.target/aarch64/pr77729.c
@@ -0,0 +1,32 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+int TrieCase3_v1(const char *string)
+{
+    if((string[0] | 32) == 't') {
+        if((string[1] | 32) == 'a') {
+            if((string[2] | 32) == 'g') {
+                return 42;
+            }
+        }
+    }
+    return -1;
+}
+
+int TrieCase3_v2(const char *string)
+{
+    switch(string[0] | 32) {
+    case 't':
+        switch(string[1] | 32) {
+        case 'a':
+            switch(string[2] | 32) {
+            case 'g':
+                return 42;
+            }
+        }
+    }
+    return -1;
+}
+
+/* { dg-final { scan-assembler-not "and" } } */
+/* { dg-final { scan-assembler-not "uxtb" } } */
