This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



[RFC] Design and Implementation of Path Splitting for Loops with Conditional IF-THEN-ELSE


I have designed and implemented path splitting for loops with a conditional IF-THEN-ELSE, following the design below.
The implementation has gone through a bootstrap for the Microblaze target along with the DejaGnu regression tests, and
the MiBench/EEMBC benchmarks have been run. No regressions are seen in the DejaGnu tests, and the DejaGnu C++ test
results are better with the path-splitting changes (fewer failures). The MiBench/EEMBC benchmarks show no performance
degradation; a performance improvement of 9.15% is observed in telcom_gsm (MiBench) and of 9.4% in rgbcmy01_lite
(EEMBC).

An earlier proposal on path splitting was posted; the reference is below.

https://gcc.gnu.org/ml/gcc/2015-03/msg00057.html

The above gains are achieved because constant propagation on the conditional becomes much easier with the
path-splitting changes. The code also benefits from other optimizations on the tree representation, along with better
register allocation, because of the finer granularity.

I will do a SPEC run and a bootstrap on the i386 target. Before that, I would like to receive your feedback and
comments on the design and implementation. I will send the actual patch, with the SPEC run and the i386 bootstrap
done, after I receive feedback on the design and implementation described in this mail.

Design changes.

1. The blocks dominated by the block ending with the conditional IF statement, say BB1, are found. Among the blocks
dominated by BB1, the block that is not a successor of BB1 is the join node of the IF-THEN-ELSE inside the loop.

2. The join node is the same as the source of the loop's latch basic block.

3. If the conditions in (1) and (2) hold, the join node (the source of the loop latch) is moved into its
predecessors, and the join node is left as an empty block containing only the phi nodes.

4. While moving the join node's statements into a predecessor, each result of a phi node in the join node is replaced
with the corresponding phi argument, based on which predecessor the copy is placed in.

5. The dominator info is updated after performing steps (1) through (4).

6. The join node, which is the same as the source of the loop latch, is made empty with only its phi nodes.

7. The implementation is done in the jump-threading phase of the machine-independent optimizations on the tree-based
representation. Path splitting is invoked after the jump-threading optimization is performed.
The following files are modified.

    a) tree-vrp.c
    b) tree-ssa-threadedge.c
    c) tree-cfg.c
    d) tree-ssa-threadedge.h
    e) tree-cfg.h
    f) cfghooks.c

The diff is attached along with this mail and pasted below.

Please share your thoughts and feedback on the above optimization and on the design, coding, and implementation done
for it.

diff --git a/gcc/cfghooks.c b/gcc/cfghooks.c
index 9faa339..559ca96 100644
--- a/gcc/cfghooks.c
+++ b/gcc/cfghooks.c
@@ -581,7 +581,7 @@ delete_basic_block (basic_block bb)
 
       /* If we remove the header or the latch of a loop, mark the loop for
 	 removal.  */
-      if (loop->latch == bb
-	  || loop->header == bb)
+      if (loop
+	  && (loop->latch == bb || loop->header == bb))
 	mark_loop_for_removal (loop);
 
diff --git a/gcc/tree-cfg.c b/gcc/tree-cfg.c
index aed5254..b25e409 100644
--- a/gcc/tree-cfg.c
+++ b/gcc/tree-cfg.c
@@ -1838,6 +1838,66 @@ replace_uses_by (tree name, tree val)
     }
 }
 
+/* Merge the statements of threaded block B into block A, moving B's
+   labels; PHI nodes and edges are updated by the callers.  */
+void
+gimple_threaded_merge_blocks (basic_block a, basic_block b)
+{
+  gimple_stmt_iterator last, gsi;
+
+  if (dump_file)
+    fprintf (dump_file, "Merging blocks %d and %d\n", a->index, b->index);
+
+   /* Remove labels from B and set gimple_bb to A for other statements.  */
+  for (gsi = gsi_start_bb (b); !gsi_end_p (gsi);)
+     {
+       gimple stmt = gsi_stmt (gsi);
+       if (glabel *label_stmt = dyn_cast <glabel *> (stmt))
+         {
+           tree label = gimple_label_label (label_stmt);
+           int lp_nr;
+
+           gsi_remove (&gsi, false);
+
+           if (FORCED_LABEL (label))
+             {
+               gimple_stmt_iterator dest_gsi = gsi_start_bb (a);
+               gsi_insert_before (&dest_gsi, stmt, GSI_NEW_STMT);
+             }
+             /* Other user labels keep around in a form of a debug stmt.  */
+             else if (!DECL_ARTIFICIAL (label) && MAY_HAVE_DEBUG_STMTS)
+                 {
+                   gimple dbg = gimple_build_debug_bind (label,
+                                                         integer_zero_node,
+                                                         stmt);
+
+                   gimple_debug_bind_reset_value (dbg);
+
+                   gsi_insert_before (&gsi, dbg, GSI_SAME_STMT);
+
+                 }
+           lp_nr = EH_LANDING_PAD_NR (label);
+           if (lp_nr)
+             {
+               eh_landing_pad lp = get_eh_landing_pad_from_number (lp_nr);
+               lp->post_landing_pad = NULL;
+             }
+         }
+       else
+         {
+            gimple_set_bb (stmt, a);
+            gsi_next (&gsi);
+         }
+    }
+
+  last = gsi_last_bb (a);
+  gsi_insert_seq_after (&last, bb_seq (b), GSI_NEW_STMT);
+  set_bb_seq (b, NULL);
+
+  if (cfgcleanup_altered_bbs)
+    bitmap_set_bit (cfgcleanup_altered_bbs, a->index);
+}
+
 /* Merge block B into block A.  */
 
 static void
@@ -2061,7 +2119,7 @@ remove_bb (basic_block bb)
 
       /* If a loop gets removed, clean up the information associated
 	 with it.  */
-      if (loop->latch == bb
-	  || loop->header == bb)
+      if (loop
+	  && (loop->latch == bb || loop->header == bb))
 	free_numbers_of_iterations_estimates_loop (loop);
     }
@@ -5779,7 +5837,7 @@ gimple_can_duplicate_bb_p (const_basic_block bb ATTRIBUTE_UNUSED)
 /* Create a duplicate of the basic block BB.  NOTE: This does not
    preserve SSA form.  */
 
-static basic_block
+basic_block
 gimple_duplicate_bb (basic_block bb)
 {
   basic_block new_bb;
@@ -5858,7 +5916,7 @@ gimple_duplicate_bb (basic_block bb)
 
 /* Adds phi node arguments for edge E_COPY after basic block duplication.  */
 
-static void
+void
 add_phi_args_after_copy_edge (edge e_copy)
 {
   basic_block bb, bb_copy = e_copy->src, dest;
diff --git a/gcc/tree-cfg.h b/gcc/tree-cfg.h
index cd28a80..e828547 100644
--- a/gcc/tree-cfg.h
+++ b/gcc/tree-cfg.h
@@ -103,5 +103,8 @@ extern void extract_true_false_edges_from_block (basic_block, edge *, edge *);
 extern unsigned int execute_fixup_cfg (void);
 extern unsigned int split_critical_edges (void);
 extern basic_block insert_cond_bb (basic_block, gimple, gimple);
+extern void gimple_threaded_merge_blocks (basic_block a, basic_block b);
+extern void add_phi_args_after_copy_edge (edge e_copy);
+extern basic_block gimple_duplicate_bb (basic_block bb);
 
 #endif /* _TREE_CFG_H  */
diff --git a/gcc/tree-ssa-threadedge.c b/gcc/tree-ssa-threadedge.c
index 4303a18..2c7d36d 100644
--- a/gcc/tree-ssa-threadedge.c
+++ b/gcc/tree-ssa-threadedge.c
@@ -1359,6 +1359,322 @@ thread_through_normal_block (edge e,
   return 0;
 }
 
+static void
+replace_threaded_uses (basic_block a, basic_block b)
+{
+  use_operand_p use, use2;
+  ssa_op_iter iter, iter2;
+  gimple_stmt_iterator gsi, gsi_copy, psi, psi_copy;
+
+  for (psi = gsi_start_bb (a),
+       psi_copy = gsi_start_bb (b);
+       !gsi_end_p (psi);
+       gsi_next (&psi), gsi_next (&psi_copy))
+    {
+      gimple psi_stmt = gsi_stmt (psi);
+      gimple psi_copy_stmt = gsi_stmt (psi_copy);
+      tree lhs = gimple_get_lhs (psi_stmt);
+      tree lhs_copy = gimple_get_lhs (psi_copy_stmt);
+
+      for (gsi = gsi_start_bb (a),
+           gsi_copy = gsi_start_bb (b);
+           !gsi_end_p (gsi);
+           gsi_next (&gsi),
+           gsi_next (&gsi_copy))
+        {
+          gimple stmt = gsi_stmt (gsi);
+          gimple stmt_copy = gsi_stmt (gsi_copy);
+
+          FOR_EACH_SSA_USE_OPERAND (use, stmt, iter, SSA_OP_USE)
+          {
+            tree tuse = USE_FROM_PTR (use);
+
+            if (lhs == tuse)
+              {
+                /* Walk the copied statement with its own iterator so
+                   the outer operand walk is not clobbered.  */
+                FOR_EACH_SSA_USE_OPERAND (use2, stmt_copy, iter2,
+                                          SSA_OP_USE)
+                {
+                  if (USE_FROM_PTR (use2) == lhs)
+                    {
+                      propagate_value (use2, lhs_copy);
+                      update_stmt (stmt_copy);
+                    }
+                }
+              }
+          }
+        }
+    }
+}
+
+static void
+replace_threaded_uses_block (basic_block b, tree def, tree val)
+{
+  use_operand_p use;
+  ssa_op_iter iter;
+  gimple_stmt_iterator gsi;
+
+  for (gsi = gsi_start_bb (b); !gsi_end_p (gsi);)
+     {
+       gimple stmt = gsi_stmt (gsi);
+
+       FOR_EACH_SSA_USE_OPERAND (use, stmt, iter, SSA_OP_USE)
+       {
+         tree tuse = USE_FROM_PTR (use);
+
+         if (val == tuse || val == NULL_TREE)
+           {
+             propagate_value (use, def);
+             update_stmt (stmt);
+           }
+       }
+
+      gsi_next (&gsi);
+    }
+}
+
+static void
+replace_phis (basic_block bb, basic_block aa)
+{
+  gphi_iterator psi_copy, psi;
+  edge e;
+  edge_iterator ei;
+  tree def;
+  gimple_seq phis = phi_nodes (bb);
+
+  for (psi = gsi_start_phis (aa),
+       psi_copy = gsi_start_phis (bb);
+       !gsi_end_p (psi);
+        gsi_next (&psi), gsi_next (&psi_copy))
+     {
+       gphi *phi_copy = psi_copy.phi ();
+       gphi *phi = psi.phi ();
+       tree var = gimple_phi_result (phi);
+
+       FOR_EACH_EDGE (e, ei, aa->preds)
+       {
+         def = PHI_ARG_DEF_FROM_EDGE (phi, e);
+         basic_block bb = e->dest;
+
+         if (e->flags & EDGE_ABNORMAL)
+           {
+             SSA_NAME_OCCURS_IN_ABNORMAL_PHI (def) = 1;
+             SSA_NAME_OCCURS_IN_ABNORMAL_PHI (PHI_RESULT (phi_copy)) = 1;
+           }
+
+         SET_PHI_ARG_DEF (phi_copy, e->dest_idx, def);
+         gimple_phi_arg_set_location (phi_copy, e->dest_idx, UNKNOWN_LOCATION);
+      }
+    }
+}
+
+static void
+replace_uses_phi (basic_block b, int pred_index)
+{
+  gimple_stmt_iterator psi;
+
+  for (psi = gsi_start_phis (b); !gsi_end_p (psi); )
+     {
+       gimple phi = gsi_stmt (psi);
+       tree def = gimple_phi_result (phi);
+       tree use = gimple_phi_arg_def (phi, pred_index - 1);
+
+       /* Virtual PHI results have their uses rewritten directly.  */
+       if (virtual_operand_p (def))
+         {
+           imm_use_iterator iter;
+           use_operand_p use_p;
+           gimple stmt;
+
+           FOR_EACH_IMM_USE_STMT (stmt, iter, def)
+             FOR_EACH_IMM_USE_ON_STMT (use_p, iter)
+               SET_USE (use_p, use);
+
+           if (SSA_NAME_OCCURS_IN_ABNORMAL_PHI (def))
+             SSA_NAME_OCCURS_IN_ABNORMAL_PHI (use) = 1;
+
+         }
+      else
+        replace_uses_by (def, use);
+
+      if (pred_index == 2)
+        replace_threaded_uses_block (b, use, gimple_phi_arg_def (phi, 0));
+
+      gsi_next (&psi);
+    }
+}
+
+static bool
+is_def_stmt_assert_expr (basic_block src1, basic_block bb,
+                        int pred_index)
+{
+  gimple_stmt_iterator psi;
+  tree def, use;
+
+  for (psi = gsi_start_phis (bb); !gsi_end_p (psi); )
+     {
+       gimple phi = gsi_stmt (psi);
+       def = gimple_phi_result (phi);
+       use = gimple_phi_arg_def (phi, pred_index - 1);
+       gsi_next (&psi);
+     }
+
+  for (psi = gsi_start_bb (src1); !gsi_end_p (psi); )
+     {
+       gimple stmt  = gsi_stmt (psi);
+       tree lhs = gimple_get_lhs (stmt);
+
+       if (lhs == use)
+         {
+           if (is_gimple_assign (stmt)
+               && (TREE_CODE (gimple_assign_rhs1 (stmt)) == ASSERT_EXPR
+                   || gimple_assign_rhs_code (stmt) == MIN_EXPR))
+             return true;
+           return false;
+         }
+
+       gsi_next (&psi);
+     }
+
+    return false;
+}
+
+void
+process_path_splitting (basic_block bb)
+{
+  vec<basic_block> bbs;
+  basic_block bb1;
+  unsigned int i;
+  edge_iterator ei;
+  edge e1;
+  bool found = false, found1;
+
+  bbs = get_all_dominated_blocks (CDI_DOMINATORS,
+                                  bb);
+  FOR_EACH_VEC_ELT (bbs, i, bb1)
+  {
+    found1 = false;
+    FOR_EACH_EDGE (e1, ei, bb->succs)
+    {
+      if (bb1 == e1->dest)
+        {
+          found = true;
+          found1 = true;
+        }
+    }
+    if (!found1 && found)
+      {
+        found = false;
+
+        /* Check whether BB1 has a back-edge successor.  */
+        FOR_EACH_EDGE (e1, ei, bb1->succs)
+        {
+          if (e1->flags & (EDGE_DFS_BACK))
+            found = true;
+        }
+
+        if (found && EDGE_COUNT (bb1->preds) == 2)
+          {
+            basic_block temp_bb = bb1;
+            basic_block src1 = NULL, src2 = NULL;
+            unsigned int k = 0, num_stmt = 0, num_phis = 0;
+            bool is_src1 = false, is_src2 = false;
+            gimple_stmt_iterator psi, gsi;
+
+            FOR_EACH_EDGE (e1, ei, bb1->preds)
+            {
+              if ((e1->flags & (EDGE_DFS_BACK)))
+                continue;
+              if (k == 1)
+                {
+                  if (single_succ_p (e1->src)
+                      && (single_succ_edge (e1->src)->flags & EDGE_FALLTHRU))
+                    {
+                      src2 = e1->src;
+                      is_src2 = true;
+                    }
+                }
+                else
+                  {
+                    if (single_succ_p (e1->src)
+                        && (single_succ_edge (e1->src)->flags & EDGE_FALLTHRU))
+                      {
+                        is_src1 = true;
+                        src1 = e1->src;
+                      }
+                  }
+                k++;
+           }
+          for (gsi = gsi_start_bb (bb1); !gsi_end_p (gsi); gsi_next (&gsi))
+            num_stmt++;
+
+          if ((num_stmt > 1) && is_src1 && is_src2)
+            {
+              bool found_non_virtual_result = false;
+
+              for (psi = gsi_start_phis (bb1); !gsi_end_p (psi); )
+                 {
+                   gimple stmt = gsi_stmt (psi);
+
+                   if (!virtual_operand_p (gimple_phi_result (stmt)))
+                     num_phis++;
+                   else
+                     found_non_virtual_result = true;
+
+                   gsi_next (&psi);
+                 }
+
+              if (num_phis > 1 || found_non_virtual_result
+                  || !(is_def_stmt_assert_expr (src1, bb1, 1)
+                       && is_def_stmt_assert_expr (src2, bb1, 2)))
+                {
+                  bbs.release ();
+                  return;
+                }
+              temp_bb = gimple_duplicate_bb (bb1);
+
+              replace_threaded_uses (bb1, temp_bb);
+
+              make_edge (src1, temp_bb, EDGE_FALLTHRU);
+
+              make_edge (src2, temp_bb, EDGE_FALLTHRU);
+
+              FOR_EACH_EDGE (e1, ei, temp_bb->preds)
+                add_phi_args_after_copy_edge (e1);
+
+              replace_phis (temp_bb, bb1);
+
+              replace_uses_phi (bb1, 1);
+
+              gimple_threaded_merge_blocks (src1, bb1);
+
+              replace_uses_phi (temp_bb, 2);
+
+              gimple_threaded_merge_blocks (src2, temp_bb);
+
+              while (EDGE_COUNT (temp_bb->preds) != 0)
+                remove_edge (EDGE_PRED (temp_bb, 0));
+
+              while (EDGE_COUNT (temp_bb->succs) != 0)
+                remove_edge (EDGE_SUCC (temp_bb, 0));
+
+              if (dom_info_available_p (CDI_DOMINATORS))
+                delete_from_dominance_info (CDI_DOMINATORS, temp_bb);
+
+              if (dom_info_available_p (CDI_POST_DOMINATORS))
+                delete_from_dominance_info (CDI_POST_DOMINATORS, temp_bb);
+
+              remove_phi_nodes (temp_bb);
+              expunge_block (temp_bb);
+              bbs.release ();
+              return;
+            }
+          }
+      }
+  }
+}
+
 /* We are exiting E->src, see if E->dest ends with a conditional
    jump which has a known value when reached via E.
 
diff --git a/gcc/tree-ssa-threadedge.h b/gcc/tree-ssa-threadedge.h
index 63eca79..a2f6b09 100644
--- a/gcc/tree-ssa-threadedge.h
+++ b/gcc/tree-ssa-threadedge.h
@@ -32,5 +32,6 @@ extern bool potentially_threadable_block (basic_block);
 extern void propagate_threaded_block_debug_into (basic_block, basic_block);
 extern void thread_across_edge (gcond *, edge, bool,
 				vec<tree> *, tree (*) (gimple, gimple));
+extern void process_path_splitting (basic_block bb);
 
 #endif /* GCC_TREE_SSA_THREADEDGE_H */
diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c
index fbecdf7..c0db184 100644
--- a/gcc/tree-vrp.c
+++ b/gcc/tree-vrp.c
@@ -10141,6 +10141,10 @@ identify_jump_threads (void)
 	      thread_across_edge (dummy, e, true, &equiv_stack,
 				  simplify_stmt_for_jump_threading);
 	    }
+          if (gimple_code (last) == GIMPLE_COND)
+            {
+              process_path_splitting (bb);
+            }
 	}
     }

Thanks & Regards
Ajit

Attachment: patch_split_diff.txt

