[PATCH PR69848/partial]Propagate comparison into VEC_COND_EXPR if target supports

Bin.Cheng amker.cheng@gmail.com
Wed May 18 15:06:00 GMT 2016


On Tue, May 17, 2016 at 12:08 PM, Richard Biener
<richard.guenther@gmail.com> wrote:
> On Mon, May 16, 2016 at 10:09 AM, Bin.Cheng <amker.cheng@gmail.com> wrote:
>> On Fri, May 13, 2016 at 5:53 PM, Richard Biener
>> <richard.guenther@gmail.com> wrote:
>>> On May 13, 2016 6:02:27 PM GMT+02:00, Bin Cheng <Bin.Cheng@arm.com> wrote:
>>>>Hi,
>>>>As PR69848 reported, GCC vectorizer now generates comparison outside of
>>>>VEC_COND_EXPR for COND_REDUCTION case, as below:
>>>>
>>>>  _20 = vect__1.6_8 != { 0, 0, 0, 0 };
>>>>  vect_c_2.8_16 = VEC_COND_EXPR <_20, { 0, 0, 0, 0 }, vect_c_2.7_13>;
>>>>  _21 = VEC_COND_EXPR <_20, ivtmp_17, _19>;
>>>>
>>>>This results in inefficient expanding.  With IR like:
>>>>
>>>>vect_c_2.8_16 = VEC_COND_EXPR <vect__1.6_8 != { 0, 0, 0, 0 }, { 0, 0,
>>>>0, 0 }, vect_c_2.7_13>;
>>>>  _21 = VEC_COND_EXPR <vect__1.6_8 != { 0, 0, 0, 0 }, ivtmp_17, _19>;
>>>>
>>>>We can do:
>>>>1) Expanding time optimization, for example, reverting comparison
>>>>operator by switching VEC_COND_EXPR operands.  This is useful when
>>>>backend only supports some comparison operators.
>>>>2) For backend not supporting vcond_mask patterns, saving one LT_EXPR
>>>>instruction which introduced by expand_vec_cond_expr.
>>>>
>>>>This patch fixes this by propagating comparison into VEC_COND_EXPR even
>>>>if it's used multiple times.  For now, GCC does single_use_only
>>>>propagation.  Ideally, we may duplicate the comparison before each use
>>>>statement just before expanding, so that TER can successfully backtrack
>>>>it from each VEC_COND_EXPR.  Unfortunately I didn't find a good pass to
>>>>do this.  Tree-vect-generic.c looks like a good candidate, but it's so
>>>>early that following CSE could undo the transform.  Another possible
>>>>fix is to generate comparison inside VEC_COND_EXPR directly in function
>>>>vectorizable_reduction.
>>>
>>> I prefer this for now.
>> Hi Richard, you mean this patch, or the possible fix before your comment?
>
> The possible fix before my comment - make the vectorizer generate VEC_COND_EXPRs
> with embedded comparison.
Hi,
Here is updated patch doing that.  It's definitely clearer than the
original version.
Bootstrap and test on x86_64.  Also checked the expanding time
optimization still happens.  Is it OK?

Thanks,
bin
>
> Thanks,
> Richard.
>
-------------- next part --------------
diff --git a/gcc/tree-vect-loop.c b/gcc/tree-vect-loop.c
index d673c67..67053af 100644
--- a/gcc/tree-vect-loop.c
+++ b/gcc/tree-vect-loop.c
@@ -6159,21 +6159,14 @@ vectorizable_reduction (gimple *stmt, gimple_stmt_iterator *gsi,
 	     Finally, we update the phi (NEW_PHI_TREE) to take the value of
 	     the new cond_expr (INDEX_COND_EXPR).  */
 
-	  /* Turn the condition from vec_stmt into an ssa name.  */
-	  gimple_stmt_iterator vec_stmt_gsi = gsi_for_stmt (*vec_stmt);
-	  tree ccompare = gimple_assign_rhs1 (*vec_stmt);
-	  tree ccompare_name = make_ssa_name (TREE_TYPE (ccompare));
-	  gimple *ccompare_stmt = gimple_build_assign (ccompare_name,
-						       ccompare);
-	  gsi_insert_before (&vec_stmt_gsi, ccompare_stmt, GSI_SAME_STMT);
-	  gimple_assign_set_rhs1 (*vec_stmt, ccompare_name);
-	  update_stmt (*vec_stmt);
+	  /* Duplicate the condition from vec_stmt.  */
+	  tree ccompare = unshare_expr (gimple_assign_rhs1 (*vec_stmt));
 
 	  /* Create a conditional, where the condition is taken from vec_stmt
-	     (CCOMPARE_NAME), then is the induction index (INDEX_BEFORE_INCR)
-	     and else is the phi (NEW_PHI_TREE).  */
+	     (CCOMPARE), then is the induction index (INDEX_BEFORE_INCR) and
+	     else is the phi (NEW_PHI_TREE).  */
 	  tree index_cond_expr = build3 (VEC_COND_EXPR, cr_index_vector_type,
-					 ccompare_name, indx_before_incr,
+					 ccompare, indx_before_incr,
 					 new_phi_tree);
 	  cond_name = make_ssa_name (cr_index_vector_type);
 	  gimple *index_condition = gimple_build_assign (cond_name,


More information about the Gcc-patches mailing list