This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [Patch, RTL] Eliminate redundant vec_select moves.


Richard Sandiford wrote:
Tejas Belagod <tbelagod@arm.com> writes:
The problem is that one reg rtx can span several hard registers.
E.g. (reg:V4SI 32) might represent one 64-bit register (no. 32),
but it might instead represent two 32-bit registers (nos. 32 and 33).
Obviously the latter's not very likely for vectors this small,
but more likely for larger ones (including on NEON IIRC).

So if we had 2 32-bit registers being treated as a V4HI, it would be:

   <--32--><--33-->
   msb          lsb
   0000111122223333
   VVVVVVVV
   00001111
   msb  lsb
   <--32-->

for big endian and:

   <--33--><--32-->
   msb          lsb
   3333222211110000
           VVVVVVVV
           11110000
           msb  lsb
           <--32-->

for little endian.
Ah, ok, that makes things clearer. Thanks for that.

I can't find any helper function that figures out if we're writing partial or full result regs. Would something like

     REGNO (src) == REGNO (dst) &&
     HARD_REGNO_NREGS (src) == HARD_REGNO_NREGS (dst) == 1

be a sane check for partial result regs?

Yeah, that should work.  I think a more general alternative would be:

  simplify_subreg_regno (REGNO (src), GET_MODE (src),
                         offset, GET_MODE (dst)) == (int) REGNO (dst)

where:

  offset = GET_MODE_UNIT_SIZE (GET_MODE (src)) * INTVAL (XVECEXP (sel, 0))

That offset is the byte offset of the first selected element from the
start of a vector in memory, which is also the way that SUBREG_BYTEs
are counted.  For little-endian it gives the offset of the lsb of the
slice, while for big-endian it gives the offset of the msb (which is
also how SUBREG_BYTEs work).

The simplify_subreg_regno should cope with both single-register vectors
and multi-register vectors.

Sorry for the delayed response to this.

Thanks for the tip. Here's an improved patch that implements the simplify_sureg_regno () method of eliminating redundant moves. Regarding the test case, I failed to get the ppc back-end to generate RTL pattern that this patch checks for. I can easily write a test case for aarch64(big and little endian) on these lines

typedef float float32x4_t __attribute__ ((__vector_size__ (16)));

float foo_be (float32x4_t x)
{
  return x[3];
}

float foo_le (float32x4_t x)
{
  return x[0];
}

where I know that the vector indexing will generate a vec_select on the same src and dst regs that could be optimized away and hence test it. But I'm struggling to get a test case that the ppc altivec back-end will generate such a vec_select for. I see that altivec does not define vec_extract, so a simple indexing like this seems to happen via memory. Also, I don't know enough about the ppc PCS or architecture to write a test that will check for this optimization opportunity on same src and dst hard-registers. Any hints?

This patch has been bootstrapped on x64_64 and regressed on aarch64-none-elf and aarch64_be-none-elf.

Thanks for your patience,
Tejas.
diff --git a/gcc/rtlanal.c b/gcc/rtlanal.c
index 0cd0c7e..ca25ce5 100644
--- a/gcc/rtlanal.c
+++ b/gcc/rtlanal.c
@@ -1180,6 +1180,22 @@ set_noop_p (const_rtx set)
       dst = SUBREG_REG (dst);
     }
 
+  /* It is a NOOP if destination overlaps with selected src vector
+     elements.  */
+  if (GET_CODE (src) == VEC_SELECT
+      && REG_P (XEXP (src, 0)) && REG_P (dst)
+      && HARD_REGISTER_P (XEXP (src, 0))
+      && HARD_REGISTER_P (dst))
+    {
+      rtx par = XEXP (src, 1);
+      rtx src0 = XEXP (src, 0);
+      HOST_WIDE_INT offset =
+	GET_MODE_UNIT_SIZE (GET_MODE (src0)) * INTVAL (XVECEXP (par, 0, 0));
+
+      return simplify_subreg_regno (REGNO (src0), GET_MODE (src0),
+				    offset, GET_MODE (dst)) == (int)REGNO (dst);
+    }
+
   return (REG_P (src) && REG_P (dst)
 	  && REGNO (src) == REGNO (dst));
 }

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]