This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH, PR43864] Gimple level duplicate block cleanup.


On Sun, Jul 17, 2011 at 8:33 PM, Tom de Vries <vries@codesourcery.com> wrote:

> Bootstrapped and reg-tested on x86_64. ?Ok for trunk (after ARM testing)?

+static int
+same_succ_equal (const void *ve1, const void *ve2)
+{
...
+  if (bitmap_bit_p (e1->bbs, ENTRY_BLOCK)
+      || bitmap_bit_p (e1->bbs, EXIT_BLOCK)
+      || bitmap_bit_p (e2->bbs, ENTRY_BLOCK)
+      || bitmap_bit_p (e2->bbs, EXIT_BLOCK))
+    return 0;

that's odd - what are these checks for?

+  if (dump_file)
+    {
+      fprintf (dump_file, "initial worklist:\n");

with dump_flags & TDF_DETAILS

I'm now at merge_calls and wondering about alias info again.  We are
probably safe for the per-pointer information because we are not
operating flow-sensitive for memory and for merging require value-equivalence
for SSA names.  For calls the same should be true - we are not
flow- or context-sensitive, and even if we were context-sentitive we
require equivalent arguments (for memory arguments we should be safe
because of the non-flow-sensitivity).

So, did you actually run into problems?  If not then I suggest to remove
merge_calls completely (and the related changes that it requires).

+/* Create or update a vop phi in BB2.  Use VUSE1 arguments for all the
+   REDIRECTED_EDGES, or if VUSE1 is NULL_TREE, use BB_VOP_AT_EXIT.  If a new
+   phis is created, use the phi instead of VUSE2 in BB2.  */
+
+static void
+update_vuses (tree vuse1, tree vuse2, basic_block bb2,
+              VEC (edge,heap) *redirected_edges)

...

+  if (vuse2 == NULL_TREE)
+    return;

hm, that's the case when there is no VUSE that is dominated by BB2
(or is in BB2).  Ok, might happen.

+             locus1 = gimple_location (SSA_NAME_DEF_STMT (arg));
+             add_phi_arg (phi, arg, e, locus1);

I don't think virtual operand PHIs should have locations, just use
UNKNOWN_LOCATION here.

+  locus2 = gimple_location (def_stmt2);

Likewise.

+  /* Create a phi, first with default argument vuse2 for all preds.  */
+  lhs = make_ssa_name (SSA_NAME_VAR (vuse2), NULL);
+  VN_INFO_GET (lhs);
+  phi = create_phi_node (lhs, bb2);
+  SSA_NAME_DEF_STMT (lhs) = phi;
+  FOR_EACH_EDGE (e, ei, bb2->preds)
+    add_phi_arg (phi, vuse2, e, locus2);
+
+  /* Now overwrite the arguments associated with the redirected edges with
+     vuse1.  */
+  for (i = 0; i < EDGE_COUNT (redirected_edges); ++i)
+    {
+      e = VEC_index (edge, redirected_edges, i);
+      gcc_assert (PHI_ARG_DEF_FROM_EDGE (phi, e));

No need for this assert.

+      if (vuse1)
+       arg = vuse1;
+      else
+       arg = BB_VOP_AT_EXIT (e->src);
+      SET_PHI_ARG_DEF (phi, e->dest_idx, arg);
+      locus1 = gimple_location (SSA_NAME_DEF_STMT (arg));

See above.

+      gimple_phi_arg_set_location (phi, e->dest_idx, locus1);
+    }


Can you maybe merge this with the update-existing-phi-case?  They
look all too similar.

+  /* Replace uses of vuse2 in bb2 with phi.  */
+  FOR_EACH_IMM_USE_STMT (stmt, iter, vuse2)
+    {
+      if (gimple_code (stmt) == GIMPLE_PHI)

Does FOR_EACH_IMM_USE_ON_STMT really not work for PHIs?
Other code doesn't seem to care.

+       {
+         edge e;
+         if (stmt == phi)
+           continue;
+         e = find_edge (bb2, gimple_bb (stmt));
+         if (e == NULL)
+           continue;
+         use_p = PHI_ARG_DEF_PTR_FROM_EDGE (stmt, e);
+         SET_USE (use_p, lhs);
+         update_stmt (stmt);
+       }
+      else if (gimple_bb (stmt) == bb2)

That check looks odd.  A use can very well appear in a forwarder block.

+       {
+         FOR_EACH_IMM_USE_ON_STMT (use_p, iter)
+           SET_USE (use_p, lhs);
+         update_stmt (stmt);
+       }
+    }

+/* Scans the vdefs and vuses of the insn of BB, and returns the vop at entry in
+   VOP_AT_ENTRY, and the vop at exit in VOP_AT_EXIT.  */
+
+static void
+insn_vops (basic_block bb, tree *vop_at_entry, tree *vop_at_exit)

it's easier to start from the bb end and walk until you see the
first vdef or vuse.  Then you have *vop_at_exit.  From there
just walk the SSA_NAME_DEF_STMTs of the vuse until you
hit one whose definition is not in BB - and you have *vop_at_entry.
That way you can avoid scanning most of the stmts.

The function also has an odd name ;)  It should be something like
vops_at_bb_entry_and_exit.

+static void
+vop_at_entry (basic_block bb1, basic_block bb2, tree *vop_at_entry1,
+             tree *vop_at_entry2)

so you don't need the vop at exit at all?  The function is a bit unclear
to me given it does so much stuff other than just computing the BBs
entry VOPs ...

+static void
+replace_block_by (basic_block bb1, basic_block bb2, bool update_vops)
+{

can you add some comments before the different phases of update?
I _think_ I understand what it does, but ...

+/* Runs tail merge optimization.  */
+
+unsigned int
+tail_merge_optimize (unsigned int todo)
+{
+  int nr_bbs_removed_total = 0;
+  int nr_bbs_removed;
+  bool loop_entered = false;
+  int iteration_nr = 0;
+  bool update_vops = ((todo & TODO_update_ssa_only_virtuals) == 0
+                     || !symbol_marked_for_renaming (gimple_vop (cfun)));

you need to simplify this to

  bool update_vops = !symbol_marked_for_renaming (gimple_vop (cfun));

+      if (nr_bbs_removed == 0)
+       break;
+
+      free_dominance_info (CDI_DOMINATORS);

we might want to limit the number of iterations we perform - especially
as you are re-computing dominators on each iteration.  What's the
maximum number of iterations you see during a GCC bootstrap?

+  if (dump_file)
+    fprintf (dump_file, "htab collision / search: %f\n",
+            htab_collisions (same_succ_htab));

in general without dump_flags & TDF_DETAILS passes should print
at most things when they did a transformation (some even don't do that).
Please double-check all your dump-prints.

+      todo |= (TODO_verify_ssa | TODO_verify_stmts | TODO_verify_flow
+              | TODO_dump_func);

should all be already set.

@ -4945,6 +4944,9 @@ execute_pre (bool do_fre)
   scev_finalize ();
   fini_pre (do_fre);

+  todo |= tail_merge_optimize (todo);
+  free_scc_vn ();

Please only run tail_merge_optimize once.  As we are running through
this code three times at -O2.  I suggest to try it in the !do_fre case
only which we only run once (as PRE).  If that doesn't work out
nicely we need to find other means (like introduce a
pass_fre_and_tail_merge which passes down another flag and replace
one FRE pass with that new combined pass).

Index: gcc/tree-flow.h
===================================================================
--- gcc/tree-flow.h     (revision 175801)
+++ gcc/tree-flow.h     (working copy)
@@ -806,6 +806,9 @@ bool multiplier_allowed_in_address_p (HO
 unsigned multiply_by_cost (HOST_WIDE_INT, enum machine_mode, bool);
 bool may_be_nonaddressable_p (tree expr);

+/* In tree-ssa-tail-merge.c.  */
+extern unsigned int tail_merge_optimize (unsigned int);

Eh, tree-flow.h kitchen-sink ;)  Please put it into tree-pass.h instead.

That said - I'm reasonably happy with the pass now, but it's rather large
(this review took 40min again ...) so I appreciate a second look from
somebody else.

Btw, can you expand a bit on the amount of testcases?

Thanks,
Richard.


> Thanks,
> - Tom
>
> 2011-07-17 ?Tom de Vries ?<tom@codesourcery.com>
>
> ? ? ? ?PR middle-end/43864
> ? ? ? ?* tree-ssa-tail-merge.c: New file.
> ? ? ? ?(struct same_succ): Define.
> ? ? ? ?(same_succ_t, const_same_succ_t): New typedef.
> ? ? ? ?(struct bb_cluster): Define.
> ? ? ? ?(bb_cluster_t, const_bb_cluster_t): New typedef.
> ? ? ? ?(struct aux_bb_info): Define.
> ? ? ? ?(BB_SIZE, BB_SAME_SUCC, BB_CLUSTER, BB_VOP_AT_EXIT): Define.
> ? ? ? ?(gvn_uses_equal): New function.
> ? ? ? ?(same_succ_print, same_succ_print_traverse, same_succ_hash)
> ? ? ? ?(inverse_flags, same_succ_equal, same_succ_alloc, same_succ_delete)
> ? ? ? ?(same_succ_reset): New function.
> ? ? ? ?(same_succ_htab, same_succ_edge_flags)
> ? ? ? ?(deleted_bbs, deleted_bb_preds): New var.
> ? ? ? ?(debug_same_succ): New function.
> ? ? ? ?(worklist): New var.
> ? ? ? ?(print_worklist, add_to_worklist, find_same_succ_bb, find_same_succ)
> ? ? ? ?(init_worklist, delete_worklist, delete_basic_block_same_succ)
> ? ? ? ?(same_succ_flush_bbs, update_worklist): New function.
> ? ? ? ?(print_cluster, debug_cluster, same_predecessors)
> ? ? ? ?(add_bb_to_cluster, new_cluster, delete_cluster): New function.
> ? ? ? ?(all_clusters): New var.
> ? ? ? ?(alloc_cluster_vectors, reset_cluster_vectors, delete_cluster_vectors)
> ? ? ? ?(merge_clusters, set_cluster): New function.
> ? ? ? ?(gimple_equal_p, find_duplicate, same_phi_alternatives_1)
> ? ? ? ?(same_phi_alternatives, bb_has_non_vop_phi, find_clusters_1)
> ? ? ? ?(find_clusters): New function.
> ? ? ? ?(merge_calls, update_vuses, vop_phi, insn_vops, vop_at_entry)
> ? ? ? ?(replace_block_by): New function.
> ? ? ? ?(update_bbs): New var.
> ? ? ? ?(apply_clusters): New function.
> ? ? ? ?(update_debug_stmt, update_debug_stmts): New function.
> ? ? ? ?(tail_merge_optimize): New function.
> ? ? ? ?tree-flow.h (tail_merge_optimize): Declare.
> ? ? ? ?* tree-ssa-pre.c (execute_pre): Use tail_merge_optimize.
> ? ? ? ?* Makefile.in (OBJS-common): Add tree-ssa-tail-merge.o.
> ? ? ? ?(tree-ssa-tail-merge.o): New rule.
> ? ? ? ?* opts.c (default_options_table): Set OPT_ftree_tail_merge by default at
> ? ? ? ?OPT_LEVELS_2_PLUS.
> ? ? ? ?* tree-ssa-sccvn.c (vn_valueize): Move to ...
> ? ? ? ?* tree-ssa-sccvn.h (vn_valueize): Here.
> ? ? ? ?* tree-ssa-alias.h (pt_solution_ior_into_shared): Declare.
> ? ? ? ?* tree-ssa-structalias.c (find_what_var_points_to): Factor out and
> ? ? ? ?use ...
> ? ? ? ?(pt_solution_share): New function.
> ? ? ? ?(pt_solution_unshare, pt_solution_ior_into_shared): New function.
> ? ? ? ?(delete_points_to_sets): Nullify shared_bitmap_table after deletion.
> ? ? ? ?* timevar.def (TV_TREE_TAIL_MERGE): New timevar.
> ? ? ? ?* common.opt (ftree-tail-merge): New switch.
> ? ? ? ?* params.def (PARAM_MAX_TAIL_MERGE_COMPARISONS): New parameter.
> ? ? ? ?* doc/invoke.texi (Optimization Options, -O2): Add -ftree-tail-merge.
> ? ? ? ?(-ftree-tail-merge, max-tail-merge-comparisons): New item.
>


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]