This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

[PATCH] Fix REE with vector modes (PR rtl-optimization/70542)


Hi!

As mentioned in PR64286 already, unlike integral modes, vector modes
don't have the property that lowpart subreg of sign/zero extended
value contains the original non-extended bits, so if we touch some
definition, we really have to change all uses.  Except that we really
don't have infrastructure/code to make sure we change either none or
all within a single transaction, and there is lots of reasons why it could
fail, more with copy_needed cases as in the testcase below, where we have:
  (set (reg:V8HI xmm6) (reg:V8HI xmm10))
  (set (reg:V8SI xmm3) (sign_extend:V8SI (reg:V8HI xmm6)))
...
// in another bb later on
  (set (reg:V8SI ...) (sign_extend:V8SI (reg:V8HI xmm6)))
(why CSE has not managed to optimize this is strange, something to look
for GCC7).  Without this patch we optimize the first two insns into:
  (set (reg:V8SI xmm3) (sign_extend:V8HI xmm10))
  (set (reg:V8SI xmm6) (reg:V8SI xmm3))
but then fail to tweak the last insn, because the definition has been
already modified and it is another copy_needed case.

So, IMHO without big changes we really can't guarantee all or nothing,
and thus this patch changes the PR64286 fix to be more conservative,
verify the def is used just in a single non-debug insn, which we can then
safely modify.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2016-04-05  Jakub Jelinek  <jakub@redhat.com>

	PR rtl-optimization/70542
	* ree.c (add_removable_extension): For VECTOR_MODE_P punt
	if there are any uses other than insn or debug insns.

	* gcc.dg/torture/pr70542.c: New test.
	* gcc.target/i386/avx2-pr70542.c: New test.

--- gcc/ree.c.jj	2016-01-04 14:55:54.000000000 +0100
+++ gcc/ree.c	2016-04-05 14:38:08.219486904 +0200
@@ -1025,11 +1025,11 @@ add_removable_extension (const_rtx expr,
 	    return;
 	  }
 	/* For vector mode extensions, ensure that all uses of the
-	   XEXP (src, 0) register are the same extension (both code
-	   and to which mode), as unlike integral extensions lowpart
-	   subreg of the sign/zero extended register are not equal
-	   to the original register, so we have to change all uses or
-	   none.  */
+	   XEXP (src, 0) register are in insn or debug insns, as unlike
+	   integral extensions lowpart subreg of the sign/zero extended
+	   register are not equal to the original register, so we have
+	   to change all uses or none and the current code isn't able
+	   to change them all at once in one transaction.  */
 	else if (VECTOR_MODE_P (GET_MODE (XEXP (src, 0))))
 	  {
 	    if (idx == 0)
@@ -1046,15 +1046,7 @@ add_removable_extension (const_rtx expr,
 			break;
 		      }
 		    rtx_insn *use_insn = DF_REF_INSN (ref_link->ref);
-		    const_rtx use_set;
-		    if (use_insn == insn || DEBUG_INSN_P (use_insn))
-		      continue;
-		    if (!(use_set = single_set (use_insn))
-			|| !REG_P (SET_DEST (use_set))
-			|| GET_MODE (SET_DEST (use_set)) != GET_MODE (dest)
-			|| GET_CODE (SET_SRC (use_set)) != code
-			|| !rtx_equal_p (XEXP (SET_SRC (use_set), 0),
-					 XEXP (src, 0)))
+		    if (use_insn != insn && !DEBUG_INSN_P (use_insn))
 		      {
 			idx = -1U;
 			break;
--- gcc/testsuite/gcc.dg/torture/pr70542.c.jj	2016-04-05 13:05:53.925334083 +0200
+++ gcc/testsuite/gcc.dg/torture/pr70542.c	2016-04-05 13:05:35.000000000 +0200
@@ -0,0 +1,31 @@
+/* PR rtl-optimization/70542 */
+/* { dg-do run } */
+
+int a[113], d[113];
+short b[113], c[113], e[113];
+
+int
+main ()
+{
+  int i;
+  long j;
+  for (i = 0; i < 113; ++i)
+    {
+      a[i] = -636544305;
+      b[i] = -31804;
+    }
+  for (j = 1; j <= 112; ++j)
+    {
+      c[j] = b[j] >> ((a[j] & 1587842570) - 1510214139);
+      if (a[j])
+	d[j] = j;
+      e[j] = 7 << ((2312631697 - b[j]) - 2312663500);
+    }
+  asm volatile ("" : : : "memory");
+  if (c[0] || d[0] || e[0])
+    __builtin_abort ();
+  for (i = 1; i <= 112; ++i)
+    if (c[i] != -1 || d[i] != i || e[i] != 14)
+      __builtin_abort ();
+  return 0;
+}
--- gcc/testsuite/gcc.target/i386/avx2-pr70542.c.jj	2016-04-05 13:06:29.154855716 +0200
+++ gcc/testsuite/gcc.target/i386/avx2-pr70542.c	2016-04-05 13:07:01.779412722 +0200
@@ -0,0 +1,16 @@
+/* PR tree-optimization/70542 */
+/* { dg-do run } */
+/* { dg-options "-O3 -mavx2" } */
+/* { dg-require-effective-target avx2 } */
+
+#include "avx2-check.h"
+
+#define main() do_main ()
+
+#include "../../gcc.dg/torture/pr70542.c"
+
+static void
+avx2_test (void)
+{
+  do_main ();
+}

	Jakub


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]