This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH] Hoist loop invariant statements containing data refs with zero-step during loop-versioning in vectorization.


On Tue, 15 Oct 2013, Cong Hou wrote:

> Thank you for your reminder, Jeff! I just noticed Richard's comment. I
> have modified the patch according to that.
> 
> The new patch is attached.

(posting patches inline is easier for review, now you have to deal
with no quoting markers ;))

Comments inline.

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 8a38316..2637309 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,8 @@
+2013-10-15  Cong Hou  <congh@google.com>
+
+	* tree-vect-loop-manip.c (vect_loop_versioning): Hoist loop invariant
+	statement that contains data refs with zero-step.
+
 2013-10-14  David Malcolm  <dmalcolm@redhat.com>
 
 	* dumpfile.h (gcc::dump_manager): New class, to hold state
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index 075d071..9d0f4a5 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,3 +1,7 @@
+2013-10-15  Cong Hou  <congh@google.com>
+
+	* gcc.dg/vect/pr58508.c: New test.
+
 2013-10-14  Tobias Burnus  <burnus@net-b.de>
 
 	PR fortran/58658
diff --git a/gcc/testsuite/gcc.dg/vect/pr58508.c b/gcc/testsuite/gcc.dg/vect/pr58508.c
new file mode 100644
index 0000000..cb22b50
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/pr58508.c
@@ -0,0 +1,20 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -ftree-vectorize -fdump-tree-vect-details" } */
+
+
+/* The GCC vectorizer generates loop versioning for the following loop
+   since there may exist aliasing between A and B.  The predicate checks
+   if A may alias with B across all iterations.  Then for the loop in
+   the true body, we can assert that *B is a loop invariant so that
+   we can hoist the load of *B before the loop body.  */
+
+void foo (int* a, int* b)
+{
+  int i;
+  for (i = 0; i < 100000; ++i)
+    a[i] = *b + 1;
+}
+
+
+/* { dg-final { scan-tree-dump-times "hoist" 2 "vect" } } */
+/* { dg-final { cleanup-tree-dump "vect" } } */
diff --git a/gcc/tree-vect-loop-manip.c b/gcc/tree-vect-loop-manip.c
index 574446a..f4fdec2 100644
--- a/gcc/tree-vect-loop-manip.c
+++ b/gcc/tree-vect-loop-manip.c
@@ -2477,6 +2477,92 @@ vect_loop_versioning (loop_vec_info loop_vinfo,
       adjust_phi_and_debug_stmts (orig_phi, e, PHI_RESULT (new_phi));
     }
 

Note that applying this kind of transform at this point invalidates
some of the earlier analysis the vectorizer performed (namely the
def-kind which now effectively gets vect_external_def from 
vect_internal_def).  In this case it doesn't seem to cause any
issues (we re-compute the def-kind everytime we need it (how wasteful)).

+  /* Extract load and store statements on pointers with zero-stride
+     accesses.  */
+  if (LOOP_REQUIRES_VERSIONING_FOR_ALIAS (loop_vinfo))
+    {
+      /* In the loop body, we iterate each statement to check if it is a load
+	 or store.  Then we check the DR_STEP of the data reference.  If
+	 DR_STEP is zero, then we will hoist the load statement to the loop
+	 preheader, and move the store statement to the loop exit.  */

We don't move the store yet.  Micha has a patch pending that enables
vectorization of zero-step stores.

+      for (gimple_stmt_iterator si = gsi_start_bb (loop->header);
+	   !gsi_end_p (si);)

While technically ok now (vectorized loops contain a single basic block)
please use LOOP_VINFO_BBS () to get at the vector of basic-blcoks
and iterate over them like other code does.

+	{
+	  gimple stmt = gsi_stmt (si);
+	  stmt_vec_info stmt_info = vinfo_for_stmt (stmt);
+	  struct data_reference *dr = STMT_VINFO_DATA_REF (stmt_info);
+
+	  if (dr && integer_zerop (DR_STEP (dr)))
+	    {
+	      if (DR_IS_READ (dr))
+		{
+		  if (dump_enabled_p ())
+		    {
+		      dump_printf_loc
+			  (MSG_NOTE, vect_location,
+			   "hoist the statement to outside of the loop ");

"hoisting out of the vectorized loop: "

+		      dump_gimple_stmt (MSG_NOTE, TDF_SLIM, stmt, 0);
+		      dump_printf (MSG_NOTE, "\n");
+		    }
+
+		  gsi_remove (&si, false);
+		  gsi_insert_on_edge_immediate (loop_preheader_edge (loop), stmt);

Note that this will result in a bogus VUSE on the stmt at this point which
will be only fixed because of implementation details of loop versioning.
Either get the correct VUSE from the loop header virtual PHI node 
preheader edge (if there is none then the current VUSE is the correct one 
to use) or clear it.

+		}
+	      /* TODO: We also consider vectorizing loops containing zero-step
+		 data refs as writes.  For example:
+
+		 int a[N], *s;
+		 for (i = 0; i < N; i++)
+		   *s += a[i];
+
+		 In this case the write to *s can be also moved after the
+		 loop.  */

Note that you then invalidate even more vectorizer analysis - you
basically introduce a scalar reduction (you have to add a PHI node).
Which means that the transform has to happen elsewhere.

As Jakub now tries with if-conversion this would also be a candidate
for applying the loop versioning before even starting the rest of the
vectorizer analysis code.

That said, I'd remove the TODO at this point because it's clearly not
possible to implement just here ;)

+	      continue;
+	    }
+	  else if (!dr)
+	  {
+	    bool hoist = true;
+	    for (size_t i = 0; i < gimple_num_ops (stmt); i++)

You are checking all kinds of statements, including assignments
of which you are also checking the LHS ... restricting to
assignments you can start walking at i = 1.

+	      {
+		tree op = gimple_op (stmt, i);
+		if (TREE_CODE (op) == INTEGER_CST
+		    || TREE_CODE (op) == REAL_CST)

There are other constants as well - just check

		if (is_gimple_min_invariant (op))

+		  continue;
+		if (TREE_CODE (op) == SSA_NAME)
+		  {
+		    gimple def = SSA_NAME_DEF_STMT (op);
+		    if (def == stmt

with starting at op 1 you can drop this.

+			|| gimple_nop_p (def)
+			|| !flow_bb_inside_loop_p (loop, gimple_bb (def)))
+		      continue;
+		  }

Note that you fail to hoist if-converted code this way because
op1 of

   name_1 = a_2 < b_4 ? x_5 : y_6;

is 'a_2 < b_4', a tree expression and not an SSA name (ugh).  Maybe
we don't care (it's just a missed optimization), but if you are
restricting yourself to hoisting assignments without a data-ref
then you can walk over SSA uses on the stmt (instead of over gimple 
ops) with

		FOR_EACH_SSA_TREE_OPERAND (op, stmt, iter, SSA_USE)

which would automagically take care of that case.

Can you add a testcase which involves invariant if-conversion?
Can you add a testcase with just an invariant store to make sure
we don't wreck it?  Can you add a testcase with an invariant store
and load (the reduction case), too?

+		hoist = false;
+		break;
+	      }

For example you'll hoist all labels this way (no ops), as well as the
loop exit GIMPLE_COND in case it's operands were loop invariant,
breaking the CFG ... so please add && is_gimple_assign () to the
if (!dr) check ;)

+	    if (hoist)
+	      {
+		gsi_remove (&si, false);
+		gsi_insert_on_edge_immediate (loop_preheader_edge (loop), stmt);
+
+		if (dump_enabled_p ())
+		  {
+		    dump_printf_loc
+			(MSG_NOTE, vect_location,
+			 "hoist the statement to outside of the loop ");
+		    dump_gimple_stmt (MSG_NOTE, TDF_SLIM, stmt, 0);
+		    dump_printf (MSG_NOTE, "\n");
+		  }
+		continue;
+	      }
+	  }
+	  gsi_next (&si);
+	}
+    }
+
   /* End loop-exit-fixes after versioning.  */
 
   if (cond_expr_stmt_list)

> 
> thanks,
> Cong
> 
> 
> On Tue, Oct 15, 2013 at 12:33 PM, Jeff Law <law@redhat.com> wrote:
> > On 10/14/13 17:31, Cong Hou wrote:
> >>
> >> Any comment on this patch?
> >
> > Richi replied in the BZ you opened.
> >
> > http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58508
> >
> > Essentially he said emit the load on the edge rather than in the block
> > itself.
> > jeff
> >


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]