This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]

Re: c++/1687: Extreme compile time regression from 2.95.2


Following up to myself...

I hacked up a walk_expr_tree.  It improves things by a bit: a 16-input
mux() goes from 1min to 30sec on my machine (give or take).
Unfortunately, the time taken is still O(3^N) in the number of
inputs.  So that won't do.

Using the 'visit each node only once' mechanism of walk_tree
thoroughly squelches the performance problem.  (One can't use
walk_tree_without_duplicates blindly - slightly more cleverness is
required.  It's still simple.)  We get sub-second compile time all the
way up to -O2 and 32 input mux().

Before (17 inputs):
  8.29     91.80     7.62 215233608     0.00     0.00  expand_call_inline

After (17 inputs):
  0.00      0.14     0.00      147     0.00     0.00  expand_call_inline

After (32 inputs):
  0.00      0.15     0.00      269     0.00     0.00  expand_call_inline

HOWEVER: I am not certain that the change is correct.  Suppose that a
function A is a candidate for inlining, and it's called twice from the
same function B.  If the two call_expr nodes are shared, we won't
inline both calls.  There may be other problems as well.

Patch follows.  Commentary from C++ team appreciated.  Will bootstrap
and report.

zw

	* optimize.c: Include hashtab.h.
	(struct inline_data): Add tree_pruner field.
	(expand_call_inline, expand_calls_inline): Call walk_tree with
	id->tree_pruner for fourth argument, putting it into
	no-duplicates mode.
	(optimize_function): Create and destroy a hash table for
	duplicate pruning, store it in id.tree_pruner.

===================================================================
Index: cp/optimize.c
--- cp/optimize.c	2001/01/28 14:04:19	1.50
+++ cp/optimize.c	2001/02/06 20:40:33
@@ -28,6 +28,7 @@ Software Foundation, 59 Temple Place - S
 #include "input.h"
 #include "integrate.h"
 #include "varray.h"
+#include "hashtab.h"
 
 /* To Do:
 
@@ -62,6 +63,10 @@ typedef struct inline_data
   int in_target_cleanup_p;
   /* A stack of the TARGET_EXPRs that we are currently processing.  */
   varray_type target_exprs;
+
+  /* Hash table used to prevent walk_tree from visiting the same node
+     umpteen million times.  */
+  htab_t tree_pruner;
 } inline_data;
 
 /* Prototypes.  */
@@ -655,7 +660,7 @@ expand_call_inline (tp, walk_subtrees, d
 	  if (i == 2)
 	    ++id->in_target_cleanup_p;
 	  walk_tree (&TREE_OPERAND (*tp, i), expand_call_inline, data,
-		     NULL);
+		     id->tree_pruner);
 	  if (i == 2)
 	    --id->in_target_cleanup_p;
 	}
@@ -810,8 +815,12 @@ expand_calls_inline (tp, id)
      inline_data *id;
 {
   /* Search through *TP, replacing all calls to inline functions by
-     appropriate equivalents.  */
-  walk_tree (tp, expand_call_inline, id, NULL);
+     appropriate equivalents.  Use walk_tree in no-duplicates mode
+     to avoid exponential time complexity.  (We can't just use
+     walk_tree_without_duplicates, because of the special TARGET_EXPR
+     handling in expand_calls.  The hash table is set up in
+     optimize_function.  */
+  walk_tree (tp, expand_call_inline, id, id->tree_pruner);
 }
 
 /* Optimize the body of FN.  */
@@ -863,11 +872,14 @@ optimize_function (fn)
 
       /* Replace all calls to inline functions with the bodies of those
 	 functions.  */
+      id.tree_pruner = htab_create (37, htab_hash_pointer,
+				    htab_eq_pointer, NULL);
       expand_calls_inline (&DECL_SAVED_TREE (fn), &id);
 
       /* Clean up.  */
       VARRAY_FREE (id.fns);
       VARRAY_FREE (id.target_exprs);
+      htab_delete (id.tree_pruner);
     }
 
   /* Undo the call to ggc_push_context above.  */

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]