This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [RFC][Scalar masks 1/x] Introduce GEN_MASK_EXPR.


On 08/17/2015 10:22 AM, Ilya Enkovich wrote:
Hi,

This patch starts a series introducing scalar masks support in the vectorizer.  It was discussed on the recent Cauldron and changes overiew is available here: https://gcc.gnu.org/wiki/cauldron2015?action=AttachFile&do=view&target=Vectorization+for+Intel+AVX-512.pdf.  Here is shortly a list of changes introduced by this series:

  - Add new tree expr to produce scalar masks in a vectorized code
  - Fix-up if-conversion to use bool predicates instead of integer masks
  - Disable some bool patterns to avoid bool to int conversion where masks can be used
  - Support bool operands in vectorization factor computation
  - Support scalar masks in MASK_LOAD, MASK_STORE and VEC_COND_EXPR by adding new optabs
  - Support vectorization for statements which are now not transformed by bool patterns
  - Add target support (hooks, optabs, expands)

This patch introduces GEN_MASK_EXPR code.  Intitially I wanted to use a comparison as an operand for it directly mapping it into AVX-512 comparison instruction.  But a feedback was to simplify new code's semantics and use it for converting vectors into scalar masks.  Therefore if we want to compare two vectors into a scalar masks we use two statements:

   vect.18_87 = vect__5.13_81 > vect__6.16_86;
   mask__ifc__23.17_88 = GEN_MASK <vect.18_87>;

Trying it in practice I found it producing worse code. The problem is that on target first comparison is expanded into two instructions: cmp with mask result + masked move to get a vector. GEN_MASK is then expanded into another comparison with zero vector.  Thus I get two comparisons + move instead of a single comparison and have to optimize this out on a target side (current optimizers can't handle it).  That's actually what I wanted to avoid.  For now I changed GEN_MASK_EXPR to get a vector value as an operand but didn't change expand pattern which has four opernads: two vectors to compare + cmp operator + result.  On expand I try to detect GEN_MASK uses a result of comparison and thus avoid double comparison generation.

Patch series is not actually fully finished yet.  I still have several type conversion tests not being vectorized and it wasn't widely tested.  That's what I'm working on now.

Will be glad to any comments.

Thanks,
Ilya
--
2015-08-17  Ilya Enkovich  <enkovich.gnu@gmail.com>

	* expr.c (expand_expr_real_2): Support GEN_MASK_EXPR.
	* gimple-pretty-print.c (dump_unary_rhs): Likewise.
	* gimple.c (get_gimple_rhs_num_ops): Likewise.
	* optabs.c: Include gimple.h.
	(vector_compare_rtx): Add OPNO arg.
	(get_gen_mask_icode): New.
	(expand_gen_mask_expr_p): New.
	(expand_gen_mask_expr): New.
	(expand_vec_cond_expr): Adjust vector_compare_rtx call.
	* optabs.def (gen_mask_optab): New.
	(gen_masku_optab): New.
	* optabs.h (expand_gen_mask_expr_p): New.
	(expand_gen_mask_expr): New.
	* tree-cfg.c (verify_gimple_assign_unary): Support GEN_MASK_EXPR.
	* tree-inline.c (estimate_operator_cost): Likewise.
	* tree-pretty-print.c (dump_generic_node): Likewise.
	* tree-ssa-operands.c (get_expr_operands): Likewise.
	* tree.def (GEN_MASK_EXPR): New.
A general question, would any of this likely help Yuri's work to optimize MASK_STORES?



diff --git a/gcc/optabs.c b/gcc/optabs.c
index a6ca706..bf466ca 100644
--- a/gcc/optabs.c
+++ b/gcc/optabs.c
@@ -51,6 +51,7 @@ along with GCC; see the file COPYING3.  If not see
  #include "recog.h"
  #include "reload.h"
  #include "target.h"
+#include "gimple.h"
Hmm, part of me doesn't want to see optabs.c depending on gimple.h.

How painful would it be to have this stuff live in expr.c?

+
+/* Generate insns for a GEN_MASK_EXPR, given its TYPE and operand.  */
+
+rtx
+expand_gen_mask_expr (tree type, tree op0, rtx target)
+{
+  struct expand_operand ops[4];
+  enum insn_code icode;
+  rtx comparison;
+  machine_mode mode = TYPE_MODE (type);
+  machine_mode cmp_op_mode;
+  bool unsignedp;
+  tree op0a, op0b;
+  enum tree_code tcode;
+  gimple def_stmt;
+
+  /* Avoid double comparison.  */
+  if (TREE_CODE (op0) == SSA_NAME
+      && (def_stmt = SSA_NAME_DEF_STMT (op0))
+      && is_gimple_assign (def_stmt)
+      && TREE_CODE_CLASS (gimple_assign_rhs_code (def_stmt)) == tcc_comparison)
+    {
+      op0a = gimple_assign_rhs1 (def_stmt);
+      op0b = gimple_assign_rhs2 (def_stmt);
+      tcode = gimple_assign_rhs_code (def_stmt);
+    }
+  else
+    {
+      op0a = op0;
+      op0b = build_zero_cst (TREE_TYPE (op0));
+      tcode = NE_EXPR;
+    }
+
+  unsignedp = TYPE_UNSIGNED (TREE_TYPE (op0a));
+  cmp_op_mode = TYPE_MODE (TREE_TYPE (op0a));
+
+  gcc_assert (GET_MODE_BITSIZE (mode) >= GET_MODE_NUNITS (cmp_op_mode));
+
+  icode = get_gen_mask_icode (cmp_op_mode, unsignedp);
+  if (icode == CODE_FOR_nothing)
+    return 0;
So if the target doesn't have suitable insns, what happens? I suspect the answer is nothing useful. In which case the question becomes what prevents the optimizers from generating a GEN_MASK_EXPR? Maybe that'll become clear as I read the rest of the patches.




diff --git a/gcc/tree-inline.c b/gcc/tree-inline.c
index e1ceea4..052c055 100644
--- a/gcc/tree-inline.c
+++ b/gcc/tree-inline.c
@@ -3850,6 +3850,7 @@ estimate_operator_cost (enum tree_code code, eni_weights *weights,
      case COMPLEX_EXPR:
      case PAREN_EXPR:
      case VIEW_CONVERT_EXPR:
+    case GEN_MASK_EXPR:
        return 0;
That seems wrong to me :-)

Jeff


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]