[RFC PATCH, i386]: Autovectorize 8-byte vectors

Richard Biener rguenther@suse.de
Thu Jun 27 07:41:00 GMT 2019


On Thu, 27 Jun 2019, Uros Bizjak wrote:

> On Thu, Jun 27, 2019 at 8:05 AM Jakub Jelinek <jakub@redhat.com> wrote:
> >
> > On Wed, Jun 26, 2019 at 12:19:28PM +0200, Uros Bizjak wrote:
> > > Yes, the patch works OK. I'll regression test it and push it later today.
> >
> > I think it caused
> > +FAIL: gcc.dg/tree-ssa/pr84512.c scan-tree-dump optimized "return 285;"
> > which admittedly already is xfailed on various targets.
> > We now newly vectorize those loops and there is no FRE or similar pass
> > after vectorization to clean it up, in particular optimize the
> > a[8] and a[9] loads given the MEM <vector(2) int> [(int *)&a + 32B]
> > store:
> >   MEM <vector(2) int> [(int *)&a + 32B] = { 64, 81 };
> >   _13 = a[8];
> >   res_6 = _13 + 140;
> >   _18 = a[9];
> >   res_15 = res_6 + _18;
> >   a ={v} {CLOBBER};
> >   return res_15;
> 
> Yes, I have seen pr84512.c, but the failure is benign. It is caused by
> the fact that we now vectorize the loops of the test.
> 
> > Shall we xfail it, or is there a plan to enable FRE after vectorization,
> > or similar pass that would be able to do similar memory optimizations?
> > Note, the RTL passes are able to optimize it in the end in this testcase.
> 
> The testcase failure could be solved by -fno-tree-vectorize, but I
> think that the value should be propagated through vectors, and tree
> optimizers should optimize the vectorized function in the same way as
> scalar function.

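FRE needs a simple fix (oops) to handle this case though:
vn_reference_lookup_3 valueizes the stored RHS but then passes the
original, unvalueized operand to native_encode_expr.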
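
For illustration only, here is a rough C-level sketch of the pattern the new
testcase exercises: a two-element vector is stored over a[8] and a[9], and the
two scalar loads that follow should be foldable to constants. It assumes the
GCC vector_size extension; the names a, foo and v2si simply mirror the
testcase, and the memcpy is a stand-in I chose for the GIMPLE __MEM <v2si>
store, not what the vectorizer emits.

```c
#include <string.h>

/* Two-int vector type, as in the testcase (GCC vector extension). */
typedef int v2si __attribute__((vector_size(sizeof(int) * 2)));

int a[10];

int foo(void)
{
    int i = 9;

    /* Build { 64, 81 }; the second lane is computed, not a literal. */
    v2si v = { 64, i * i };

    /* Stand-in for the vector store at (int *)&a + 32 bytes, i.e. &a[8].
       memcpy is an assumption made here to sidestep alignment concerns. */
    memcpy(&a[8], &v, sizeof v);

    /* Scalar loads that the vector store fully covers. */
    return a[8] + a[i];
}
```

With the store visible, the loads read 64 and 81, so the function returns 145,
which is the value the dg-final scan in the testcase looks for.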

Bootstrap / regtest running on x86_64-unknown-linux-gnu.

And yes, I think ultimately we want a late FRE...

Richard.

2019-06-27  Richard Biener  <rguenther@suse.de>

	* tree-ssa-sccvn.c (vn_reference_lookup_3): Encode valueized RHS.

	* gcc.dg/tree-ssa/ssa-fre-67.c: New testcase.

Index: gcc/tree-ssa-sccvn.c
===================================================================
--- gcc/tree-ssa-sccvn.c	(revision 272732)
+++ gcc/tree-ssa-sccvn.c	(working copy)
@@ -2242,7 +2242,7 @@ vn_reference_lookup_3 (ao_ref *ref, tree
 	  tree rhs = gimple_assign_rhs1 (def_stmt);
 	  if (TREE_CODE (rhs) == SSA_NAME)
 	    rhs = SSA_VAL (rhs);
-	  len = native_encode_expr (gimple_assign_rhs1 (def_stmt),
+	  len = native_encode_expr (rhs,
 				    buffer, sizeof (buffer),
 				    (offseti - offset2) / BITS_PER_UNIT);
 	  if (len > 0 && len * BITS_PER_UNIT >= maxsizei)
Index: gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-67.c
===================================================================
--- gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-67.c	(revision 272732)
+++ gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-67.c	(working copy)
@@ -1,16 +1,32 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -fno-tree-ccp -fdump-tree-fre1-stats" } */
+/* { dg-options "-fgimple -O1 -fdump-tree-fre1" } */
 
-int foo()
+int a[10];
+typedef int v2si __attribute__((vector_size(__SIZEOF_INT__*2)));
+int __GIMPLE (ssa,guessed_local(97603132),startwith("fre1"))
+     foo ()
 {
-  int i = 0;
-  do
-    {
-      i++;
-    }
-  while (i != 1);
-  return i;
+  int i;
+  int _59;
+  int _44;
+  int _13;
+  int _18;
+  v2si _80;
+  v2si _81;
+  int res;
+
+  __BB(2,guessed_local(97603132)):
+  _59 = 64;
+  i_61 = 9;
+  _44 = i_61 * i_61;
+  _80 = _Literal (v2si) {_59, _44};
+  _81 = _80;
+  __MEM <v2si> ((int *)&a + _Literal (int *) 32) = _81;
+  i_48 = 9;
+  _13 = a[8];
+  _18 = a[i_48];
+  res_15 = _13 + _18;
+  return res_15;
 }
 
-/* { dg-final { scan-tree-dump "RPO iteration over 3 blocks visited 3 blocks" "fre1" } } */
-/* { dg-final { scan-tree-dump "return 1;" "fre1" } } */
+/* { dg-final { scan-tree-dump "return 145;" "fre1" } } */


