This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH] Simple optimization for MASK_STORE.


On Wed, May 20, 2015 at 4:00 PM, Yuri Rumyantsev <ysrumyan@gmail.com> wrote:
> Hi All,
>
> Here is updated patch to optimize mask stores. The main goal of it is
> to avoid execution of mask store if its mask is zero vector since
> loads that follow it can be blocked.
> The following changes were done:
>   1.  A test on sink legality was added - it simply prohibits to cross
> statements with non-null vdef or vuse.
>   2. New phi node is created in join block for moved MASK_STORE statements.
>   3. Test was changed to check that 2 MASK_STORE statements are not
> moved to the same block.
>   4. New field was added to loop_vec_info structure to mark loops
> having MASK_STORE's.
>
> Any comments will be appreciated.
> Yuri.

I still don't understand why you need the new target hook.  If we have a masked
load/store then the mask is computed by an assignment with a VEC_COND_EXPR
(in your example) and thus a test for a zero mask is expressible as just

 if (vect__ifc__41.17_167 == { 0, 0, 0, 0... })

or am I missing something?  As we dont' want this transform on all targets
(or always) we can control it by either a --param or a new flag which default
targets can adjust.  [Is the hazard still present with Skylake or other
AVX512 implementations for example?]

+/* Helper for optimize_mask_stores: returns true if STMT sinking to end
+   of BB is valid and false otherwise.  */
+
+static bool
+is_valid_sink (gimple stmt)
+{

so STMT is either a masked store or a masked load?  You shouldn't
walk all stmts here but instead rely on the factored use-def def-use
chains of virtual operands.

+void
+optimize_mask_stores (struct loop *loop)
+{
+  basic_block bb = loop->header;
+  gimple_stmt_iterator gsi;
+  gimple stmt;
+  auto_vec<gimple> worklist;
+
+  if (loop->dont_vectorize
+      || loop->num_nodes > 2)
+    return;

looks like no longer needed given the place this is called from now
(or does it guard against outer loop vectorization as well?)
Also put it into tree-vect-loop.c and make it static.

+      /* Loop was not vectorized if mask does not have vector type.  */
+      if (!VECTOR_TYPE_P (TREE_TYPE (mask)))
+       return;

this should be always false

+      store_bb->frequency = bb->frequency / 2;
+      efalse->probability = REG_BR_PROB_BASE / 2;

I think the == 0 case should end up as unlikely.

+      if (dom_info_available_p (CDI_POST_DOMINATORS))
+       set_immediate_dominator (CDI_POST_DOMINATORS, bb, store_bb);

post dominators are not available in the vectorizer.

+         /* Put definition statement of stored value in STORE_BB
+            if possible.  */
+         arg3 = gimple_call_arg (last, 3);
+         if (TREE_CODE (arg3) == SSA_NAME && has_single_use (arg3))
+           {
+             def_stmt = SSA_NAME_DEF_STMT (arg3);
+             /* Move def_stmt to STORE_BB if it is in the same bb and
+                it is legal.  */
+             if (gimple_bb (def_stmt) == bb
+                 && is_valid_sink (def_stmt))

ok, so you move arbitrary statements.  You can always move non-memory
statements and if you keep track of the last store you moved you
can check if gimple_vuse () is the same as on that last store and be
done with that, or if you sink another store (under the same condition)
then just update the last store.

Otherwise this looks better now.

Thus - get rid of the target hook and of is_valid_sink.

Thanks,
Richard.

> 2015-05-20  Yuri Rumyantsev  <ysrumyan@gmail.com>
>
> * config/i386/i386.c: Include files stringpool.h and tree-ssanames.h.
> (ix86_vectorize_is_zero_vector): New function.
> (TARGET_VECTORIZE_IS_ZERO_VECTOR): New target macro
> * doc/tm.texi.in: Add @hook TARGET_VECTORIZE_IS_ZERO_VECTOR.
> * doc/tm.texi: Updated.
> * target.def (is_zero_vector): New DEFHOOK.
> * tree-vect-stmts.c : Include tree-into-ssa.h.
> (vectorizable_mask_load_store): Initialize has_mask_store field.
> (is_valid_sink): New function.
> (optimize_mask_stores): New function.
> * tree-vectorizer.c (vectorize_loops): Invoke optimaze_mask_stores for
> loops having masked stores.
> * tree-vectorizer.h (loop_vec_info): Add new has_mask_store field and
> correspondent macros.
> (optimize_mask_stores): Update prototype.
>
> gcc/testsuite/ChangeLog:
> * gcc.target/i386/avx2-vect-mask-store-move1.c: New test.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]