This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
Re: [PATCH AARCH64] fix and enable non-const shuffle for bigendian using TBL instruction
- From: Alan Lawrence <alan dot lawrence at arm dot com>
- To: "gcc-patches at gcc dot gnu dot org" <gcc-patches at gcc dot gnu dot org>
- Date: Wed, 25 Jun 2014 10:21:31 +0100
- Subject: Re: [PATCH AARCH64] fix and enable non-const shuffle for bigendian using TBL instruction
- Authentication-results: sourceware.org; auth=none
- References: <5357DC70 dot 5080907 at arm dot com>
This one seems to have slipped under the radar. I've just rebased and run the
regression tests on aarch64_be-none-elf, with no issues; ping?
(The patch applied straightforwardly; the rebased version is below.)
--Alan
Alan Lawrence wrote:
At present, vec_perm with non-constant indices is not handled on big-endian, so gcc
falls back to generic, slow code. This patch fixes up the TBL expansion to reverse
the indices within each input vector (following Richard Henderson's suggestion of
using an XOR with (nelts - 1) rather than a complicated mask/add/subtract,
http://gcc.gnu.org/ml/gcc-patches/2014-03/msg01285.html), and enables the code
for bigendian.
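The XOR trick works because nelts is always a power of two here: reversing a lane
index, (nelts - 1) - i, subtracts from an all-ones bit pattern, which can never
borrow, so it reduces to a bitwise XOR. A minimal standalone sketch (the helper
name is hypothetical, not part of the patch):

```c
#include <assert.h>

/* For power-of-two nelt, (nelt - 1) - i == i ^ (nelt - 1): every bit of
   nelt - 1 is set, so the subtraction never borrows and each index bit is
   simply flipped.  This is the per-vector reversal the patch applies.
   (Hypothetical illustration, not code from the patch.)  */
static unsigned
reverse_lane (unsigned i, unsigned nelt)
{
  return i ^ (nelt - 1);
}
```

For example, with nelt = 8, lane 3 maps to lane 4 and lane 0 to lane 7, i.e. the
lane order is reversed end to end.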
Regression-tested on aarch64_be-none-elf with no changes. (This is as expected: in
all affected cases, gcc was already producing correct, non-arch-specific code using
scalar operations. However, I have manually verified for various tests in
c-c++-common/torture/vshuf-v* that (a) TBL instructions are now produced, and (b) a
version of the compiler that produces TBLs without the index correction fails those
tests.)
Note tests c-c++-common/torture/vshuf-{v16hi,v4df,v4di,v8si} (i.e. the 32-byte
vectors) were broken prior to this patch and are not affected.
gcc/ChangeLog:
2014-04-23  Alan Lawrence  <alan.lawrence@arm.com>

	* config/aarch64/aarch64-simd.md (vec_perm): Enable for bigendian.
	* config/aarch64/aarch64.c (aarch64_expand_vec_perm): Remove assert
	against bigendian and adjust indices.
diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-sim
index 42bfd3e..08eb6b3 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -4224,7 +4224,7 @@
    (match_operand:VB 1 "register_operand")
    (match_operand:VB 2 "register_operand")
    (match_operand:VB 3 "register_operand")]
-  "TARGET_SIMD && !BYTES_BIG_ENDIAN"
+  "TARGET_SIMD"
 {
   aarch64_expand_vec_perm (operands[0], operands[1],
                            operands[2], operands[3]);
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index b2d005b..0ea277a 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -8730,18 +8730,24 @@ aarch64_expand_vec_perm (rtx target, rtx op0, rtx op1, r
   enum machine_mode vmode = GET_MODE (target);
   unsigned int i, nelt = GET_MODE_NUNITS (vmode);
   bool one_vector_p = rtx_equal_p (op0, op1);
-  rtx rmask[MAX_VECT_LEN], mask;
-
-  gcc_checking_assert (!BYTES_BIG_ENDIAN);
+  rtx mask;
 
   /* The TBL instruction does not use a modulo index, so we must take care
      of that ourselves.  */
-  mask = GEN_INT (one_vector_p ? nelt - 1 : 2 * nelt - 1);
-  for (i = 0; i < nelt; ++i)
-    rmask[i] = mask;
-  mask = gen_rtx_CONST_VECTOR (vmode, gen_rtvec_v (nelt, rmask));
+  mask = aarch64_simd_gen_const_vector_dup (vmode,
+                                            one_vector_p ? nelt - 1 : 2 * nelt - 1);
   sel = expand_simple_binop (vmode, AND, sel, mask, NULL, 0, OPTAB_LIB_WIDEN);
 
+  /* For big-endian, we also need to reverse the index within the vector
+     (but not which vector).  */
+  if (BYTES_BIG_ENDIAN)
+    {
+      /* If one_vector_p, mask is a vector of (nelt - 1)'s already.  */
+      if (!one_vector_p)
+        mask = aarch64_simd_gen_const_vector_dup (vmode, nelt - 1);
+      sel = expand_simple_binop (vmode, XOR, sel, mask,
+                                 NULL, 0, OPTAB_LIB_WIDEN);
+    }
   aarch64_expand_vec_perm_1 (target, op0, op1, sel);
 }
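Taken together, the patched expansion applies two element-wise corrections to the
selector before emitting TBL: an AND to give the index modulo semantics, then, on
big-endian only, an XOR to reverse the lane within the chosen vector. The following
standalone model (hypothetical helper name, assuming power-of-two nelt as in the
patch) mirrors that arithmetic for a single selector element:

```c
#include <assert.h>

/* Model of the per-element selector correction in the patched
   aarch64_expand_vec_perm (a hypothetical standalone sketch, not the
   compiler code itself):
   1. TBL has no modulo semantics, so the index is masked into range:
      nelt - 1 when both operands are the same vector, 2 * nelt - 1
      when selecting across two vectors.
   2. On big-endian, XOR with nelt - 1 reverses the lane number within
      the chosen vector while leaving bit log2(nelt), which selects
      between the two input vectors, untouched.  */
static unsigned
adjust_sel_index (unsigned idx, unsigned nelt,
                  int one_vector_p, int big_endian)
{
  idx &= one_vector_p ? nelt - 1 : 2 * nelt - 1;  /* AND step */
  if (big_endian)
    idx ^= nelt - 1;                              /* XOR step */
  return idx;
}
```

For instance, with nelt = 4 and two input vectors on big-endian, index 5 (lane 1 of
the second vector) becomes 6: still the second vector, but with the lane reversed.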