[PATCH]: Fix PR tree-optimization/18673

Daniel Berlin dberlin@dberlin.org
Tue Nov 30 15:00:00 GMT 2004


This patch fixes PR tree-optimization/18673, as well as the general 
problem of GVN-PRE inserting a lot of dead code sometimes.

It turns out there were a few problems in the draft published algorithm 
i had been using originally that were later clarified in the final 
algorithm, and these were causing us to insert dead code (though not 
incorrect code).  This same bug was also the cause of the ridiculously 
long PRE times on 18673.

On the testcase in that bug, PRE would make >30000 insertions instead of 
158, which in turn caused the bitmaps to get very dense during insert, 
which caused the rest of insert to slow down.

All of this was caused by PRE thinking it didn't already make values 
available that it had, which in turn was caused by a few places where we 
were using "replace existing in set with new value", and should have been 
using "replace existing in set with new value or insert new value if it 
doesn't exist in set".

I've added comments to all the insertion conditions that are used to 
propagate inserted values so that it doesn't become screwed up again in 
the future by someone.

In order to verify correctness, i ran through every single elimination PRE 
used to make during a bootstrap of GCC, and made sure it still made the 
exact same eliminations, without inserting unneccessary dead code in the 
process.

This patch also produces ridiculous speedups in some cases.

On PR 18673, it used to take 140 seconds to do PRE when you generated 150 
loops (it inserted 30000 pieces of dead code).  300 loops took about 1500 
seconds.

150 loops now takes 1.43 seconds, and 300 loops now takes 3.03 seconds, 
which is roughly a 10x and 500x speedup.

It inserts no dead code in the process (IE every insertion causes an 
elimination)

I have also tested this on cc1-i-files, which shows a very slight 
speedup (~4 seconds out of 7 minutes.

There are no testcases on which PRE takes a significant amount of time 
other than 18673 which i could find to also test this against.

I also have not included the 18673 test as it still causes O(n^2) behavior 
in ivopts that would cause the 300 loop test to timeout (plus it seems if 
i turn on ivopts, it takes over a gig of memory).

Bootstrapped and regtested on i686-pc-linux-gnu, and powerpc-darwin.

Committed to mainline.

2004-11-29  Daniel Berlin  <dberlin@dberlin.org>

 	Fix PR tree-optimization/18673

 	* tree-ssa-pre.c: Remove useless splay-tree.h include.
 	(bitmap_value_replace_in_set): Fix to add if it does not exist.
 	(find_or_generate_expression): Remove now-wrong condition.
 	(create_expression_by_pieces): Fix condition and comment reason
 	for it.
 	(insert_aux): Fix condition and comment reasons for it.
 	Factor insertion code from here.
 	(insert_into_preds_of_block): To here.  Fix conditions in factored
 	function and comment reasons for them.

Index: tree-ssa-pre.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/tree-ssa-pre.c,v
retrieving revision 2.57
diff -u -p -r2.57 tree-ssa-pre.c
--- tree-ssa-pre.c	25 Nov 2004 22:31:09 -0000	2.57
+++ tree-ssa-pre.c	30 Nov 2004 02:58:42 -0000
@@ -41,7 +41,6 @@ Boston, MA 02111-1307, USA.  */
  #include "alloc-pool.h"
  #include "tree-pass.h"
  #include "flags.h"
-#include "splay-tree.h"
  #include "bitmap.h"
  #include "langhooks.h"

@@ -638,7 +637,9 @@ bitmap_set_contains_value (bitmap_set_t
  {
    if (is_gimple_min_invariant (val))
      return true;
-  return bitmap_bit_p (set->values, VALUE_HANDLE_ID (val));
+  if (bitmap_bit_p (set->values, VALUE_HANDLE_ID (val)))
+    return true;
+  return false;
  }

  /* Replace an instance of value LOOKFOR with expression EXPR in SET.  */
@@ -713,13 +716,17 @@ set_equal (value_set_t a, value_set_t b)
    return true;
  }

-/* Replace an instance of EXPR's VALUE with EXPR in SET.  */
+/* Replace an instance of EXPR's VALUE with EXPR in SET if it exists,
+   and add it otherwise. */

  static void
  bitmap_value_replace_in_set (bitmap_set_t set, tree expr)
  {
    tree val = get_value_handle (expr);
-  bitmap_set_replace_value (set, val, expr);
+  if (bitmap_set_contains_value (set, val))
+    bitmap_set_replace_value (set, val, expr);
+  else
+    bitmap_insert_into_set (set, expr);
  }

  /* Insert EXPR into SET if EXPR's value is not already present in
@@ -1280,13 +1287,6 @@ find_or_generate_expression (basic_block
  {
    tree genop;
    genop = bitmap_find_leader (AVAIL_OUT (block), expr);
-  /* Depending on the order we process DOM branches in, the value
-     may not have propagated to all the dom children yet during
-     this iteration.  In this case, the value will always be in
-     the NEW_SETS for us already, having been propagated from our
-     dominator.  */
-  if (genop == NULL)
-    genop = bitmap_find_leader (NEW_SETS (block), expr);
    /* If it's still NULL, see if it is a complex expression, and if
       so, generate it recursively, otherwise, abort, because it's
       not really .  */
@@ -1374,8 +1374,13 @@ create_expression_by_pieces (basic_block
      }
    v = get_value_handle (expr);
    vn_add (name, v, NULL);
-  bitmap_insert_into_set (NEW_SETS (block), name);
-  bitmap_value_insert_into_set (AVAIL_OUT (block), name);
+
+  /* The value may already exist in either NEW_SETS, or AVAIL_OUT, because
+     we are creating the expression by pieces, and this particular piece of
+     the expression may have been represented.  There is no harm in replacing
+     here.  */
+  bitmap_value_replace_in_set (NEW_SETS (block), name); 
+  bitmap_value_replace_in_set (AVAIL_OUT (block), name);
    if (dump_file && (dump_flags & TDF_DETAILS))
      {
        fprintf (dump_file, "Inserted ");
@@ -1384,6 +1389,90 @@ create_expression_by_pieces (basic_block
      }
    return name;
  }
+
+/* Insert the to-be-made-available values of NODE for each predecessor, stored
+   in AVAIL, into the predecessors of BLOCK, and merge the result with a phi
+   node, given the same value handle as NODE.  The prefix of the phi node is
+   given with TMPNAME*/
+
+static bool
+insert_into_preds_of_block (basic_block block, value_set_node_t node,
+			    tree *avail, const char *tmpname)
+{
+  tree val = get_value_handle (node->expr);
+  edge pred;
+  basic_block bprime;
+  tree eprime;
+  edge_iterator ei;
+  tree type = TREE_TYPE (avail[EDGE_PRED (block, 0)->src->index]);
+  tree temp;
+ 
+  if (dump_file && (dump_flags & TDF_DETAILS))
+    {
+      fprintf (dump_file, "Found partial redundancy for expression ");
+      print_generic_expr (dump_file, node->expr, 0);
+      fprintf (dump_file, "\n");
+    }
+
+  /* Make the necessary insertions.  */
+  FOR_EACH_EDGE (pred, ei, block->preds)
+    {
+      tree stmts = alloc_stmt_list ();
+      tree builtexpr;
+      bprime = pred->src;
+      eprime = avail[bprime->index];
+      if (BINARY_CLASS_P (eprime)
+	  || UNARY_CLASS_P (eprime))
+	{
+	  builtexpr = create_expression_by_pieces (bprime,
+						   eprime,
+						   stmts);
+	  bsi_insert_on_edge (pred, stmts);
+	  avail[bprime->index] = builtexpr;
+	} 
+    }
+  /* Now build a phi for the new variable.  */
+  temp = create_tmp_var (type, tmpname);
+  add_referenced_tmp_var (temp);
+  temp = create_phi_node (temp, block);
+ 
+  FOR_EACH_EDGE (pred, ei, block->preds)
+    add_phi_arg (temp, avail[pred->src->index], pred);
+ 
+  vn_add (PHI_RESULT (temp), val, NULL);
+ 
+  /* The value should *not* exist in PHI_GEN, or else we wouldn't be doing
+     this insertion, since we test for the existence of this value in PHI_GEN
+     before proceeding with the partial redundancy checks in insert_aux.
+ 
+     The value may exist in AVAIL_OUT, in particular, it could be represented
+     by the expression we are trying to eliminate, in which case we want the
+     replacement to occur.  If it's not existing in AVAIL_OUT, we want it
+     inserted there.
+ 
+     Similarly, to the PHI_GEN case, the value should not exist in NEW_SETS of
+     this block, because if it did, it would have existed in our dominator's
+     AVAIL_OUT, and would have been skipped due to the full redundancy check.
+  */
+
+  bitmap_insert_into_set (PHI_GEN (block),
+			  PHI_RESULT (temp));
+  bitmap_value_replace_in_set (AVAIL_OUT (block), 
+			       PHI_RESULT (temp));
+  bitmap_insert_into_set (NEW_SETS (block),
+			  PHI_RESULT (temp));
+ 
+  if (dump_file && (dump_flags & TDF_DETAILS))
+    {
+      fprintf (dump_file, "Created phi ");
+      print_generic_expr (dump_file, temp, 0);
+      fprintf (dump_file, " in block %d\n", block->index);
+    }
+  pre_stats.phis++;
+  return true;
+}
+
+

  /* Perform insertion of partially redundant values.
     For BLOCK, do the following:
@@ -1399,6 +1488,7 @@ create_expression_by_pieces (basic_block
     3. Recursively call ourselves on the dominator children of BLOCK.

  */
+
  static bool
  insert_aux (basic_block block)
  {
@@ -1413,12 +1503,18 @@ insert_aux (basic_block block)
  	{
  	  unsigned i;
  	  bitmap_iterator bi;
-
  	  bitmap_set_t newset = NEW_SETS (dom);
-	  EXECUTE_IF_SET_IN_BITMAP (newset->expressions, 0, i, bi)
+	  if (newset)
  	    {
-	      bitmap_insert_into_set (NEW_SETS (block), ssa_name (i));
-	      bitmap_value_replace_in_set (AVAIL_OUT (block), ssa_name (i));
+	      /* Note that we need to value_replace both NEW_SETS, and
+		 AVAIL_OUT. For both the case of NEW_SETS, the value may be
+		 represented by some non-simple expression here that we want
+		 to replace it with.  */
+	      EXECUTE_IF_SET_IN_BITMAP (newset->expressions, 0, i, bi)
+		{
+		  bitmap_value_replace_in_set (NEW_SETS (block), ssa_name (i));
+		  bitmap_value_replace_in_set (AVAIL_OUT (block), ssa_name (i));
+		}
  	    }
  	  if (EDGE_COUNT (block->preds) > 1)
  	    {
@@ -1438,7 +1534,7 @@ insert_aux (basic_block block)
  		      tree first_s = NULL;
  		      edge pred;
  		      basic_block bprime;
-		      tree eprime;
+		      tree eprime = NULL_TREE;
  		      edge_iterator ei;

  		      val = get_value_handle (node->expr);
@@ -1500,11 +1596,9 @@ insert_aux (basic_block block)
  			      by_some = true;
  			      if (first_s == NULL)
  				first_s = edoubleprime;
-			      else if (first_s != edoubleprime)
+			      else if (!operand_equal_p (first_s, edoubleprime,
+							 0))
  				all_same = false;
-			      gcc_assert (first_s == edoubleprime 
-					  || !operand_equal_p
-					      (first_s, edoubleprime, 0));
  			    }
  			}
  		      /* If we can insert it, it's not the same value
@@ -1513,63 +1607,9 @@ insert_aux (basic_block block)
  			 partially redundant.  */
  		      if (!cant_insert && !all_same && by_some)
  			{
-			  tree type = TREE_TYPE (avail[EDGE_PRED (block, 0)->src->index]);
-			  tree temp;
-			  if (dump_file && (dump_flags & TDF_DETAILS))
-			    {
-			      fprintf (dump_file, "Found partial redundancy for expression ");
-			      print_generic_expr (dump_file, node->expr, 0);
-			      fprintf (dump_file, "\n");
-			    }
-
-			  /* Make the necessary insertions.  */
-			  FOR_EACH_EDGE (pred, ei, block->preds)
-			    {
-			      tree stmts = alloc_stmt_list ();
-			      tree builtexpr;
-			      bprime = pred->src;
-			      eprime = avail[bprime->index];
-			      if (BINARY_CLASS_P (eprime)
-				  || UNARY_CLASS_P (eprime))
-				{
-				  builtexpr = create_expression_by_pieces (bprime,
-									   eprime,
-									   stmts);
-				  bsi_insert_on_edge (pred, stmts);
-				  avail[bprime->index] = builtexpr;
-				} 
-			    }
-			  /* Now build a phi for the new variable.  */
-			  temp = create_tmp_var (type, "prephitmp");
-			  add_referenced_tmp_var (temp);
-			  temp = create_phi_node (temp, block);
-			  vn_add (PHI_RESULT (temp), val, NULL);
-
-#if 0
-			  if (!set_contains_value (AVAIL_OUT (block), val))
-			    insert_into_set (AVAIL_OUT (block), 
-					     PHI_RESULT (temp));
-			  else
-#endif
-			    bitmap_value_replace_in_set (AVAIL_OUT (block), 
-							 PHI_RESULT (temp));
-			  FOR_EACH_EDGE (pred, ei, block->preds)
-			    {
-			      add_phi_arg (temp, avail[pred->src->index],
-					   pred);
-			    }
-			  if (dump_file && (dump_flags & TDF_DETAILS))
-			    {
-			      fprintf (dump_file, "Created phi ");
-			      print_generic_expr (dump_file, temp, 0);
-			      fprintf (dump_file, " in block %d\n", block->index);
-			    }
-			  pre_stats.phis++;
-			  new_stuff = true;
-			  bitmap_insert_into_set (NEW_SETS (block),
-						  PHI_RESULT (temp));
-			  bitmap_insert_into_set (PHI_GEN (block),
-						  PHI_RESULT (temp));
+ 			  if (insert_into_preds_of_block (block, node, avail, 
+ 							  "prephitmp"))
+ 			    new_stuff = true;
  			}

  		      free (avail);



More information about the Gcc-patches mailing list