Early inlining with early optimization and passmanagerization of the inliner

Jan Hubicka jh@suse.cz
Mon Jan 15 01:23:00 GMT 2007


Hi,
this patch makes early optimization happen during early inlining.  The
reason is to make the results of early optimization available to early inlining
(so the abstraction is recognized and does not limit further inlining) and to
reduce the memory peak appearing just after the early inliner (for tramp3d we
right now need about 480MB of memory after early inlining, while we need 390MB
of memory after early optimization to hold the program, and I have plans to
significantly reduce the second number).

For tramp3d the results are quite good:

Memory-wise the savings are 856MB->813MB.  The GGC heuristic picks really
unfortunate places to collect on tramp3d; on machines with less memory the
savings are actually bigger, so we now fit in a 520MB rlimit, while with
mainline a 600MB rlimit is not enough.

Early inlining now performs significantly more transformations, resulting in a
faster binary (1.29 seconds/iteration -> 0.82 seconds/iteration, 37% speedup).
The performance is now the same as with the "flatten" attribute used, and the
flatten attribute no longer has any effect on performance with the patch
applied, at least in my testing, suggesting that early inlining with
optimization is now powerful enough to peel away all the abstraction in tramp3d.

The resulting binary is 2.17MB without the patch, 2.26MB with the patch, 2.35MB
without the patch but with the "flatten" attribute, and 2.53MB with the
"flatten" attribute and the patch.

Compilation-time-wise, without the patch we need 97.4 seconds, with the patch
116 seconds.  This is somewhat worse.  I would say it is fair for a 37% speedup,
but there is an easy way to get the compile time back.  Two factors contribute:

 1) We are re-optimizing every function body after it is inlined somewhere
    by the early inliner
 2) We inline significantly more (about 30% more code is passed to the backend)

The second factor is dominant: it is now possible to significantly reduce the
inline-unit-growth parameter, and with an inline unit growth of 10% (instead of
50%) we produce about the same amount of instructions after early optimization.
With this, we need 73s to compile, the resulting binary is 1.8MB and performance
is still 0.82 seconds/iteration.
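
To make the effect of the unit growth limit concrete, here is a minimal sketch
of how such a budget can be computed.  It assumes the limit is a plain
percentage of the unit size before inlining; the function names are
hypothetical and this is not the code from ipa-inline.c.  On the command line
the limit is set with --param inline-unit-growth=N.

```c
/* Hypothetical sketch of a unit-growth budget; not the ipa-inline.c code.  */

/* Return the largest unit size (in estimated insns) that inlining may
   produce, given the size before inlining and the allowed growth in
   percent.  */
long
max_unit_insns (long initial_insns, int growth_percent)
{
  return initial_insns + initial_insns * growth_percent / 100;
}

/* Return nonzero if growing the unit from CURRENT_INSNS by DELTA insns
   still fits in the budget.  */
int
inlining_fits_budget_p (long initial_insns, long current_insns, long delta,
			int growth_percent)
{
  return current_insns + delta
	 <= max_unit_insns (initial_insns, growth_percent);
}
```

With a unit of 1000 insns, a 50% limit allows growth up to 1500 insns while a
10% limit stops at 1100.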

In summary, my best tramp3d-specific score is a 32% compile time speedup, 37%
runtime speedup, 20% code size savings (out of the static binary) and 5% memory
savings.  This patch alone should make a 20% compile time regression and a 4%
code size regression; the speedups and memory savings are the same.  This is
all compared to today's mainline, which has a lot better compile times caused
by early inlining, but also some runtime regressions that I believe to be
independent of inlining, as can be seen from the fact that the "flatten"
attribute line regressed too, while inlining decisions are pretty irrelevant
there.

I plan to retune the heuristic (since the intermediate code now contains
a lot less random noise, the factors for abstraction penalty and allowed
growth can be made more rational), but I don't want to bundle it into
this patch since I want to see the effect independently.  I also have a
few patches to cut down the costs of the pre-inline optimization queue
(factor 1) that affect the scores, so I want to try them first.

Concerning the other benchmarks, I measured GCC modules compilation times with
IMA and they improve by about 12% (but my machine is swapping during this
exercise) and inlining seems improved too.  I had no luck running the IMA SPECs
(I am working on that) but this seems to suggest that the patch works well on
whole program C language optimization (GCC itself is not that different from
tramp3d anymore - we do have half a dozen accessors too).

For single unit SPEC compilation, the binaries end up almost unchanged, so I
don't expect any difference.

The DLV benchmark compilation time change seems to be within noise; the
runtimes are slightly improved overall, but there is also a regression
in the last benchmark:

[0]: dl-mainline
[1]: dl-patch
[2]: dl-patch with unit growth set to 20

                     |     [0]      |     [1]     |     [2]     |
---------------------+--------------+-------------+-------------+
      STRATCOMP1-ALL |  2.50 (0.38) | 1.96 (0.03) | 1.91 (0.04) |
   STRATCOMP-770.2-Q |  0.36 (0.00) | 0.36 (0.00) | 0.38 (0.01) |
               2QBF1 | 10.02 (0.24) | 9.11 (0.02) | 9.12 (0.02) |
          PRIMEIMPL2 |  5.64 (0.01) | 5.32 (0.01) | 5.69 (0.00) |
       3COL-SIMPLEX1 |  3.71 (0.08) | 3.76 (0.07) | 3.78 (0.09) |
        3COL-RANDOM1 |  5.51 (0.03) | 5.37 (0.04) | 5.57 (0.03) |
          HP-RANDOM1 |  5.20 (0.07) | 5.23 (0.12) | 5.41 (0.00) |
       HAMCYCLE-FREE |  0.68 (0.00) | 0.68 (0.00) | 0.70 (0.00) |
             DECOMP2 |  6.97 (0.04) | 6.99 (0.01) | 7.36 (0.02) |
        BW-P5-nopush |  3.42 (0.01) | 3.40 (0.01) | 3.48 (0.01) |
       BW-P5-pushbin |  2.72 (0.02) | 2.70 (0.01) | 2.82 (0.01) |
     BW-P5-nopushbin |  0.92 (0.00) | 0.92 (0.00) | 0.96 (0.00) |
        HANOI-Towers |  1.80 (0.00) | 1.84 (0.01) | 1.96 (0.03) |
              RAMSEY |  4.02 (0.02) | 4.07 (0.01) | 4.19 (0.01) |
             CRISTAL |  5.13 (0.03) | 5.17 (0.04) | 5.31 (0.11) |
           21-QUEENS |  4.74 (0.02) | 4.66 (0.00) | 4.94 (0.00) |
   MSTDir[V=13,A=40] |  7.89 (0.00) | 7.59 (0.03) | 8.04 (0.03) |
   MSTDir[V=15,A=40] |  7.86 (0.01) | 7.55 (0.09) | 8.01 (0.01) |
 MSTUndir[V=13,A=40] |  4.50 (0.01) | 4.38 (0.01) | 4.60 (0.02) |
         TIMETABLING |  4.79 (0.04) | 5.42 (0.16) | 4.84 (0.19) |
---------------------+--------------+-------------+-------------+

The results show that the TIMETABLING regression goes away when inline limits
are cut, but another small regression appears in MSTDir.  However, I have never
yet seen a patch that would bring consistent improvement on DLV.  Reducing unit
growth to 20% reduces compilation time from 2m32s to 2m17s.
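
For reading the table, the relative change between two columns can be computed
as below.  This is just arithmetic on the numbers above; the helper name is
made up for illustration.

```c
/* Illustrative helper: relative change of NEW_TIME against OLD_TIME in
   percent.  Positive values are slowdowns, negative values are speedups.  */
double
relative_change_percent (double old_time, double new_time)
{
  return (new_time - old_time) / old_time * 100.0;
}
```

For example, TIMETABLING goes from 4.79 in [0] to 5.42 in [1], roughly a 13%
slowdown, while STRATCOMP1-ALL goes from 2.50 to 1.96, roughly a 22%
improvement.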

I've bootstrapped/regtested a slightly older version of the patch on
i686-linux, and I am re-running the same testing now.  I plan to hold the patch
till Richard's C++ testers catch the referenced vars/eh table changes I sent
today (so till yesterday afternoon at least) and I would welcome any comments
or benchmark results.

Some details on the implementation are in the ipa-inline.c comment in the patch
(especially the interference with profiling is a bit unfortunate).
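
The ordering requirement described in that comment (callees are processed
before their callers, so the early inliner always sees already optimized
callees) can be sketched on a toy call graph as follows.  The graph
representation and all names here are hypothetical; this is not the cgraph API
nor the do_per_function_toporder implementation.

```c
/* Toy sketch: visit functions so that callees come before callers.
   The graph representation and names are hypothetical.  */
#define MAX_NODES 16

struct toy_cgraph
{
  int n_nodes;
  /* calls[a][b] is nonzero when function A calls function B.  */
  int calls[MAX_NODES][MAX_NODES];
};

static void
visit (const struct toy_cgraph *g, int node, int *visited, int *order, int *n)
{
  int callee;

  if (visited[node])
    return;
  visited[node] = 1;
  for (callee = 0; callee < g->n_nodes; callee++)
    if (g->calls[node][callee])
      visit (g, callee, visited, order, n);
  /* All reachable callees are already in ORDER, so NODE can go next.  */
  order[(*n)++] = node;
}

/* Fill ORDER with node indexes in callee-before-caller order and return
   the number of nodes stored.  Cycles (recursion) are broken arbitrarily.  */
int
toy_postorder (const struct toy_cgraph *g, int *order)
{
  int visited[MAX_NODES] = { 0 };
  int n = 0;
  int node;

  for (node = 0; node < g->n_nodes; node++)
    visit (g, node, visited, order, &n);
  return n;
}
```

For a chain where function 0 calls 1 and 1 calls 2, the resulting order is
2, 1, 0, so the leaf is inlined into and optimized before its callers.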

It is actually a different implementation than the reoptimization idea I
had on the IPA branch and presented at the summit.  It is a lot cheaper
in compilation time, but it is not a 100% replacement.  I will re-evaluate
the reoptimization trick and see if it should be merged at all, as this
early inlining trick seems a lot cheaper.

	* cgraph.h (cgraph_decide_inlining_incrementally): Kill.
	* tree-pass.h: Reorder to make IPA passes appear together.
	(pass_early_inline, pass_inline_parameters, pass_apply_inline): Declare.
	* cgraphunit.c (cgraph_finalize_function): Do not compute inlining
	parameters, do not call early inliner.
	* ipa-inline.c: Update comments.  Include tree-flow.h.
	(cgraph_decide_inlining): Do not compute inlining parameters.
	(cgraph_decide_inlining_incrementally): Return TODOs; assume to
	be called with function context set up.
	(pass_ipa_inline): Remove unreachable functions before pass.
	(cgraph_early_inlining): Simplify assuming to be called from the
	PM as local pass.
	(pass_early_inline): New pass.
	(cgraph_gate_ipa_early_inlining): New gate.
	(pass_ipa_early_inline): Turn into simple wrapper.
	(compute_inline_parameters): New function.
	(gate_inline_passes): New gate.
	(pass_inline_parameters): New pass.
	(apply_inline): Move here from tree-optimize.c.
	(pass_apply_inline): New pass.
	* ipa.c (cgraph_remove_unreachable_nodes): Verify cgraph after
	transforming.
	* tree-inline.c (optimize_inline_calls): Return TODOs rather than
	doing them by hand.
	(tree_function_versioning): Do not allocate dummy struct function.
	* tree-inline.h (optimize_inline_calls): Update prototype.
	* tree-optimize.c (execute_fixup_cfg): Export.
	(pass_fixup_cfg): Remove.
	(tree_rest_of_compilation): Do not apply inlines.
	* tree-flow.h (execute_fixup_cfg): Declare.
	* Makefile.in (gt-passes.c): New.
	* passes.c: Include gt-passes.h.
	(init_optimization_passes): New passes.
	(nnodes, order): New static vars.
	(do_per_function_toporder): New function.
	(execute_one_pass): Dump current pass here.
	(execute_ipa_pass_list): Don't dump current pass here.
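
Several entries above change helpers to return TODO flags rather than doing the
cleanup by hand.  A minimal sketch of that pattern follows; the flag values,
the counters and all sk_* names are made up for illustration (the real TODO_*
constants are defined in tree-pass.h and are handled by the pass manager).

```c
/* Hypothetical flags; the real TODO_* values are defined in tree-pass.h.  */
#define SK_TODO_UPDATE_SSA   (1u << 0)
#define SK_TODO_CLEANUP_CFG  (1u << 1)

/* Counters standing in for the real cleanup work.  */
struct sk_stats
{
  int ssa_updates;
  int cfg_cleanups;
};

/* A pass body reports what needs doing instead of doing it itself.  */
unsigned int
sk_inline_calls (int inlined_anything)
{
  return inlined_anything ? (SK_TODO_UPDATE_SSA | SK_TODO_CLEANUP_CFG) : 0;
}

/* The pass manager performs each requested cleanup once per pass, no
   matter how many helpers asked for it.  */
void
sk_execute_todo (unsigned int todo, struct sk_stats *stats)
{
  if (todo & SK_TODO_UPDATE_SSA)
    stats->ssa_updates++;
  if (todo & SK_TODO_CLEANUP_CFG)
    stats->cfg_cleanups++;
}
```

Because the flags are a bitmask, results from several helpers can simply be
ORed together before being handed back, which is what apply_inline does with
the return values of optimize_inline_calls and execute_fixup_cfg.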

Index: cgraph.h
===================================================================
*** cgraph.h	(revision 120777)
--- cgraph.h	(working copy)
*************** varpool_next_static_initializer (struct 
*** 395,401 ****
          (node) = varpool_next_static_initializer (node))
  
  /* In ipa-inline.c  */
- bool cgraph_decide_inlining_incrementally (struct cgraph_node *, bool);
  void cgraph_clone_inlined_nodes (struct cgraph_edge *, bool, bool);
  void cgraph_mark_inline_edge (struct cgraph_edge *, bool);
  bool cgraph_default_inline_p (struct cgraph_node *, const char **);
--- 395,400 ----
Index: tree-pass.h
===================================================================
*** tree-pass.h	(revision 120777)
--- tree-pass.h	(working copy)
*************** extern struct tree_opt_pass pass_reset_c
*** 310,322 ****
  /* IPA Passes */
  extern struct tree_opt_pass pass_ipa_cp;
  extern struct tree_opt_pass pass_ipa_inline;
! extern struct tree_opt_pass pass_early_ipa_inline;
  extern struct tree_opt_pass pass_ipa_reference;
  extern struct tree_opt_pass pass_ipa_pure_const;
  extern struct tree_opt_pass pass_ipa_type_escape;
  extern struct tree_opt_pass pass_ipa_pta;
  extern struct tree_opt_pass pass_early_local_passes;
- extern struct tree_opt_pass pass_all_early_optimizations;
  extern struct tree_opt_pass pass_ipa_increase_alignment;
  extern struct tree_opt_pass pass_ipa_function_and_variable_visibility;
  
--- 310,321 ----
  /* IPA Passes */
  extern struct tree_opt_pass pass_ipa_cp;
  extern struct tree_opt_pass pass_ipa_inline;
! extern struct tree_opt_pass pass_ipa_early_inline;
  extern struct tree_opt_pass pass_ipa_reference;
  extern struct tree_opt_pass pass_ipa_pure_const;
  extern struct tree_opt_pass pass_ipa_type_escape;
  extern struct tree_opt_pass pass_ipa_pta;
  extern struct tree_opt_pass pass_early_local_passes;
  extern struct tree_opt_pass pass_ipa_increase_alignment;
  extern struct tree_opt_pass pass_ipa_function_and_variable_visibility;
  
*************** extern struct tree_opt_pass pass_set_not
*** 399,404 ****
--- 398,407 ----
  extern struct tree_opt_pass pass_final;
  extern struct tree_opt_pass pass_rtl_seqabstr;
  extern struct tree_opt_pass pass_release_ssa_names;
+ extern struct tree_opt_pass pass_early_inline;
+ extern struct tree_opt_pass pass_inline_parameters;
+ extern struct tree_opt_pass pass_apply_inline;
+ extern struct tree_opt_pass pass_all_early_optimizations;
  
  /* The root of the compilation pass tree, once constructed.  */
  extern struct tree_opt_pass *all_passes, *all_ipa_passes, *all_lowering_passes;
Index: cgraphunit.c
===================================================================
*** cgraphunit.c	(revision 120777)
--- cgraphunit.c	(working copy)
*************** cgraph_finalize_function (tree decl, boo
*** 455,464 ****
    /* If not unit at a time, then we need to create the call graph
       now, so that called functions can be queued and emitted now.  */
    if (!flag_unit_at_a_time)
!     {
!       cgraph_analyze_function (node);
!       cgraph_decide_inlining_incrementally (node, false);
!     }
  
    if (decide_is_function_needed (node, decl))
      cgraph_mark_needed_node (node);
--- 455,461 ----
    /* If not unit at a time, then we need to create the call graph
       now, so that called functions can be queued and emitted now.  */
    if (!flag_unit_at_a_time)
!     cgraph_analyze_function (node);
  
    if (decide_is_function_needed (node, decl))
      cgraph_mark_needed_node (node);
Index: ipa-inline.c
===================================================================
*** ipa-inline.c	(revision 120777)
--- ipa-inline.c	(working copy)
*************** Software Foundation, 51 Franklin Street,
*** 61,67 ****
  
        cgraph_decide_inlining implements heuristics taking whole callgraph
        into account, while cgraph_decide_inlining_incrementally considers
!       only one function at a time and is used in non-unit-at-a-time mode.  */
  
  #include "config.h"
  #include "system.h"
--- 61,124 ----
  
        cgraph_decide_inlining implements heuristics taking whole callgraph
        into account, while cgraph_decide_inlining_incrementally considers
!       only one function at a time and is used in non-unit-at-a-time mode. 
! 
!    The inliner itself is split into several passes:
! 
!    pass_inline_parameters
! 
!      This pass computes local properties of functions that are used by inliner:
!      estimated function body size, whether function is inlinable at all and
!      stack frame consumption.
! 
!      Before executing any of inliner passes, this local pass has to be applied
!      to each function in the callgraph (ie run as subpass of some earlier
!      IPA pass).  The results are made out of date by any optimization applied
!      on the function body.
! 
!    pass_early_inlining
! 
!      Simple local inlining pass inlining callees into current function.  This
!      pass makes no global whole compilation unit analysis and thus when allowed
!      to do inlining expanding code size it might result in unbounded growth of
!      whole unit.
! 
!      This is the main inlining pass in non-unit-at-a-time.
! 
!      With unit-at-a-time the pass is run during conversion into SSA form.
!      Only functions already converted into SSA form are inlined, so the
!      conversion must happen in topological order on the callgraph (that is
!      maintained by pass manager).  The functions after inlining are early
!      optimized so the early inliner sees unoptimized function itself, but
!      all considered callees are already optimized allowing it to unfold
!      abstraction penalty on C++ effectively and cheaply.
! 
!    pass_ipa_early_inlining
! 
!      With profiling, the early inlining is also necessary to reduce
!      instrumentation costs on program with high abstraction penalty (doing
!      many redundant calls).  This can't happen in parallel with early
!      optimization and profile instrumentation, because we would end up
!      re-instrumenting already instrumented function bodies we brought in via
!      inlining.
! 
!      To avoid this, this pass is executed as IPA pass before profiling.  It is
!      simple wrapper to pass_early_inlining and ensures first inlining.
! 
!    pass_ipa_inline
! 
!      This is the main pass implementing simple greedy algorithm to do inlining
!      of small functions that results in overall growth of compilation unit and
!      inlining of functions called once.  The pass computes just so called inline
!      plan (representation of inlining to be done in callgraph) and unlike early
!      inlining it is not performing the inlining itself.
! 
!    pass_apply_inline
! 
!      This pass performs actual inlining according to pass_ipa_inline on given
!      function.  Possibly the function body before inlining is saved when it is
!      needed for further inlining later.
!  */
  
  #include "config.h"
  #include "system.h"
*************** Software Foundation, 51 Franklin Street,
*** 81,86 ****
--- 138,144 ----
  #include "hashtab.h"
  #include "coverage.h"
  #include "ggc.h"
+ #include "tree-flow.h"
  
  /* Statistics we collect about inlining algorithm.  */
  static int ncalls_inlined;
*************** cgraph_decide_inlining (void)
*** 931,943 ****
        {
  	struct cgraph_edge *e;
  
- 	/* At the moment, no IPA passes change function bodies before inlining.
- 	   Save some time by not recomputing function body sizes if early inlining
- 	   already did so.  */
- 	if (!flag_early_inlining)
- 	  node->local.self_insns = node->global.insns
- 	     = estimate_num_insns (node->decl);
- 
  	initial_insns += node->local.self_insns;
  	gcc_assert (node->local.self_insns == node->global.insns);
  	for (e = node->callees; e; e = e->next_callee)
--- 989,994 ----
*************** cgraph_decide_inlining (void)
*** 1088,1104 ****
  /* Decide on the inlining.  We do so in the topological order to avoid
     expenses on updating data structures.  */
  
! bool
  cgraph_decide_inlining_incrementally (struct cgraph_node *node, bool early)
  {
    struct cgraph_edge *e;
    bool inlined = false;
    const char *failed_reason;
  
    /* First of all look for always inline functions.  */
    for (e = node->callees; e; e = e->next_callee)
      if (e->callee->local.disregard_inline_limits
  	&& e->inline_failed
          && !cgraph_recursive_inlining_p (node, e->callee, &e->inline_failed)
  	/* ??? It is possible that renaming variable removed the function body
  	   in duplicate_decls. See gcc.c-torture/compile/20011119-2.c  */
--- 1139,1162 ----
  /* Decide on the inlining.  We do so in the topological order to avoid
     expenses on updating data structures.  */
  
! static unsigned int
  cgraph_decide_inlining_incrementally (struct cgraph_node *node, bool early)
  {
    struct cgraph_edge *e;
    bool inlined = false;
    const char *failed_reason;
+   unsigned int todo = 0;
+ 
+ #ifdef ENABLE_CHECKING
+   verify_cgraph_node (node);
+ #endif
  
    /* First of all look for always inline functions.  */
    for (e = node->callees; e; e = e->next_callee)
      if (e->callee->local.disregard_inline_limits
  	&& e->inline_failed
+ 	&& (gimple_in_ssa_p (DECL_STRUCT_FUNCTION (node->decl))
+ 	    == gimple_in_ssa_p (DECL_STRUCT_FUNCTION (e->callee->decl)))
          && !cgraph_recursive_inlining_p (node, e->callee, &e->inline_failed)
  	/* ??? It is possible that renaming variable removed the function body
  	   in duplicate_decls. See gcc.c-torture/compile/20011119-2.c  */
*************** cgraph_decide_inlining_incrementally (st
*** 1111,1116 ****
--- 1169,1181 ----
  	    fprintf (dump_file, " into %s\n", cgraph_node_name (node));
  	  }
  	cgraph_mark_inline (e);
+ 	/* In order to fully inline always_inline functions at -O0, we need to
+ 	   recurse here, since the inlined functions might not be processed by
+ 	   incremental inlining at all yet.  */
+ 	
+ 	if (!flag_unit_at_a_time)
+           cgraph_decide_inlining_incrementally (e->callee, early);
+ 	
  	inlined = true;
        }
  
*************** cgraph_decide_inlining_incrementally (st
*** 1121,1126 ****
--- 1186,1193 ----
  	  && e->inline_failed
  	  && !e->callee->local.disregard_inline_limits
  	  && !cgraph_recursive_inlining_p (node, e->callee, &e->inline_failed)
+ 	  && (gimple_in_ssa_p (DECL_STRUCT_FUNCTION (node->decl))
+ 	      == gimple_in_ssa_p (DECL_STRUCT_FUNCTION (e->callee->decl)))
  	  && (!early
  	      || (cgraph_estimate_size_after_inlining (1, e->caller, e->callee)
  	          <= e->caller->global.insns))
*************** cgraph_decide_inlining_incrementally (st
*** 1142,1160 ****
  	  else if (!early)
  	    e->inline_failed = failed_reason;
  	}
!   if (early && inlined)
      {
        timevar_push (TV_INTEGRATION);
!       push_cfun (DECL_STRUCT_FUNCTION (node->decl));
!       tree_register_cfg_hooks ();
!       current_function_decl = node->decl;
!       optimize_inline_calls (current_function_decl);
!       node->local.self_insns = node->global.insns;
!       current_function_decl = NULL;
!       pop_cfun ();
        timevar_pop (TV_INTEGRATION);
      }
!   return inlined;
  }
  
  /* When inlining shall be performed.  */
--- 1209,1221 ----
  	  else if (!early)
  	    e->inline_failed = failed_reason;
  	}
!   if (early && inlined && !node->global.inlined_to)
      {
        timevar_push (TV_INTEGRATION);
!       todo = optimize_inline_calls (current_function_decl);
        timevar_pop (TV_INTEGRATION);
      }
!   return todo;
  }
  
  /* When inlining shall be performed.  */
*************** struct tree_opt_pass pass_ipa_inline = 
*** 1176,1182 ****
    0,	                                /* properties_required */
    PROP_cfg,				/* properties_provided */
    0,					/* properties_destroyed */
!   0,					/* todo_flags_start */
    TODO_dump_cgraph | TODO_dump_func
    | TODO_remove_functions,		/* todo_flags_finish */
    0					/* letter */
--- 1237,1243 ----
    0,	                                /* properties_required */
    PROP_cfg,				/* properties_provided */
    0,					/* properties_destroyed */
!   TODO_remove_functions,		/* todo_flags_start */
    TODO_dump_cgraph | TODO_dump_func
    | TODO_remove_functions,		/* todo_flags_finish */
    0					/* letter */
*************** static GTY ((length ("nnodes"))) struct 
*** 1194,1237 ****
  static unsigned int
  cgraph_early_inlining (void)
  {
!   struct cgraph_node *node;
!   int i;
  
    if (sorrycount || errorcount)
      return 0;
! #ifdef ENABLE_CHECKING
!   for (node = cgraph_nodes; node; node = node->next)
!     gcc_assert (!node->aux);
! #endif
! 
!   order = ggc_alloc (sizeof (*order) * cgraph_n_nodes);
!   nnodes = cgraph_postorder (order);
!   for (i = nnodes - 1; i >= 0; i--)
!     {
!       node = order[i];
!       if (node->analyzed && (node->needed || node->reachable))
!         node->local.self_insns = node->global.insns
! 	  = estimate_num_insns (node->decl);
!     }
!   for (i = nnodes - 1; i >= 0; i--)
!     {
!       node = order[i];
!       if (node->analyzed && node->local.inlinable
! 	  && (node->needed || node->reachable)
! 	  && node->callers)
! 	{
! 	  if (cgraph_decide_inlining_incrementally (node, true))
! 	    ggc_collect ();
! 	}
!     }
! #ifdef ENABLE_CHECKING
!   for (node = cgraph_nodes; node; node = node->next)
!     gcc_assert (!node->global.inlined_to);
! #endif
!   ggc_free (order);
!   order = NULL;
!   nnodes = 0;
!   return 0;
  }
  
  /* When inlining shall be performed.  */
--- 1255,1265 ----
  static unsigned int
  cgraph_early_inlining (void)
  {
!   struct cgraph_node *node = cgraph_node (current_function_decl);
  
    if (sorrycount || errorcount)
      return 0;
!   return cgraph_decide_inlining_incrementally (node, flag_unit_at_a_time);
  }
  
  /* When inlining shall be performed.  */
*************** cgraph_gate_early_inlining (void)
*** 1241,1247 ****
    return flag_inline_trees && flag_early_inlining;
  }
  
! struct tree_opt_pass pass_early_ipa_inline = 
  {
    "einline",	 			/* name */
    cgraph_gate_early_inlining,		/* gate */
--- 1269,1275 ----
    return flag_inline_trees && flag_early_inlining;
  }
  
! struct tree_opt_pass pass_early_inline = 
  {
    "einline",	 			/* name */
    cgraph_gate_early_inlining,		/* gate */
*************** struct tree_opt_pass pass_early_ipa_inli
*** 1254,1261 ****
    PROP_cfg,				/* properties_provided */
    0,					/* properties_destroyed */
    0,					/* todo_flags_start */
!   TODO_dump_cgraph | TODO_dump_func
!   | TODO_remove_functions,		/* todo_flags_finish */
    0					/* letter */
  };
  
--- 1282,1418 ----
    PROP_cfg,				/* properties_provided */
    0,					/* properties_destroyed */
    0,					/* todo_flags_start */
!   TODO_dump_func,    			/* todo_flags_finish */
!   0					/* letter */
! };
! 
! /* When inlining shall be performed.  */
! static bool
! cgraph_gate_ipa_early_inlining (void)
! {
!   return (flag_inline_trees && flag_early_inlining
! 	  && (flag_branch_probabilities || flag_test_coverage
! 	      || profile_arc_flag));
! }
! 
! /* IPA pass wrapper for early inlining pass.  We need to run early inlining
!    before tree profiling so we have stand alone IPA pass for doing so.  */
! struct tree_opt_pass pass_ipa_early_inline = 
! {
!   "einline_ipa",			/* name */
!   cgraph_gate_ipa_early_inlining,	/* gate */
!   NULL,					/* execute */
!   NULL,					/* sub */
!   NULL,					/* next */
!   0,					/* static_pass_number */
!   TV_INLINE_HEURISTICS,			/* tv_id */
!   0,	                                /* properties_required */
!   PROP_cfg,				/* properties_provided */
!   0,					/* properties_destroyed */
!   0,					/* todo_flags_start */
!   TODO_dump_cgraph, 		        /* todo_flags_finish */
!   0					/* letter */
! };
! 
! /* Compute parameters of functions used by inliner.  */
! static unsigned int
! compute_inline_parameters (void)
! {
!   struct cgraph_node *node = cgraph_node (current_function_decl);
! 
!   gcc_assert (!node->global.inlined_to);
!   node->local.estimated_self_stack_size = estimated_stack_frame_size ();
!   node->global.estimated_stack_size = node->local.estimated_self_stack_size;
!   node->global.stack_frame_offset = 0;
!   node->local.inlinable = tree_inlinable_function_p (current_function_decl);
!   node->local.self_insns = estimate_num_insns (current_function_decl);
!   if (node->local.inlinable)
!     node->local.disregard_inline_limits
!       = lang_hooks.tree_inlining.disregard_inline_limits (current_function_decl);
!   if (flag_really_no_inline && !node->local.disregard_inline_limits)
!     node->local.inlinable = 0;
!   /* Inlining characteristics are maintained by the cgraph_mark_inline.  */
!   node->global.insns = node->local.self_insns;
!   return 0;
! }
! 
! /* When inlining shall be performed.  */
! static bool
! gate_inline_passes (void)
! {
!   return flag_inline_trees;
! }
! 
! struct tree_opt_pass pass_inline_parameters = 
! {
!   NULL,	 				/* name */
!   gate_inline_passes,			/* gate */
!   compute_inline_parameters,		/* execute */
!   NULL,					/* sub */
!   NULL,					/* next */
!   0,					/* static_pass_number */
!   TV_INLINE_HEURISTICS,			/* tv_id */
!   0,	                                /* properties_required */
!   PROP_cfg,				/* properties_provided */
!   0,					/* properties_destroyed */
!   0,					/* todo_flags_start */
!   0,					/* todo_flags_finish */
!   0					/* letter */
! };
! 
! /* Apply inline plan to the function.  */
! static unsigned int
! apply_inline (void)
! {
!   unsigned int todo = 0;
!   struct cgraph_edge *e;
!   struct cgraph_node *node = cgraph_node (current_function_decl);
! 
!   /* Even when not optimizing, ensure that always_inline functions get inlined.
!    */
!   if (!optimize)
!    cgraph_decide_inlining_incrementally (node, false);
! 
!   /* We might need the body of this function so that we can expand
!      it inline somewhere else.  */
!   if (cgraph_preserve_function_body_p (current_function_decl))
!     save_inline_function_body (node);
! 
!   for (e = node->callees; e; e = e->next_callee)
!     if (!e->inline_failed || warn_inline)
!       break;
!   if (e)
!     {
!       timevar_push (TV_INTEGRATION);
!       todo = optimize_inline_calls (current_function_decl);
!       timevar_pop (TV_INTEGRATION);
!     }
!   /* In non-unit-at-a-time we must mark all referenced functions as needed.  */
!   if (!flag_unit_at_a_time)
!     {
!       struct cgraph_edge *e;
!       for (e = node->callees; e; e = e->next_callee)
! 	if (e->callee->analyzed)
!           cgraph_mark_needed_node (e->callee);
!     }
!   return todo | execute_fixup_cfg ();
! }
! 
! struct tree_opt_pass pass_apply_inline = 
! {
!   "apply_inline",			/* name */
!   NULL,					/* gate */
!   apply_inline,				/* execute */
!   NULL,					/* sub */
!   NULL,					/* next */
!   0,					/* static_pass_number */
!   TV_INLINE_HEURISTICS,			/* tv_id */
!   0,	                                /* properties_required */
!   PROP_cfg,				/* properties_provided */
!   0,					/* properties_destroyed */
!   0,					/* todo_flags_start */
!   TODO_dump_func | TODO_verify_flow
!   | TODO_verify_stmts,			/* todo_flags_finish */
    0					/* letter */
  };
  
Index: ipa.c
===================================================================
*** ipa.c	(revision 120777)
--- ipa.c	(working copy)
*************** cgraph_remove_unreachable_nodes (bool be
*** 206,211 ****
--- 206,214 ----
      node->aux = NULL;
    if (file)
      fprintf (file, "\nReclaimed %i insns", insns);
+ #ifdef ENABLE_CHECKING
+   verify_cgraph ();
+ #endif
    return changed;
  }
  
Index: tree-inline.c
===================================================================
*** tree-inline.c	(revision 120777)
--- tree-inline.c	(working copy)
*************** fold_marked_statements (int first, struc
*** 2613,2619 ****
  
  /* Expand calls to inline functions in the body of FN.  */
  
! void
  optimize_inline_calls (tree fn)
  {
    copy_body_data id;
--- 2613,2619 ----
  
  /* Expand calls to inline functions in the body of FN.  */
  
! unsigned int
  optimize_inline_calls (tree fn)
  {
    copy_body_data id;
*************** optimize_inline_calls (tree fn)
*** 2624,2630 ****
       occurred -- and we might crash if we try to inline invalid
       code.  */
    if (errorcount || sorrycount)
!     return;
  
    /* Clear out ID.  */
    memset (&id, 0, sizeof (id));
--- 2624,2630 ----
       occurred -- and we might crash if we try to inline invalid
       code.  */
    if (errorcount || sorrycount)
!     return 0;
  
    /* Clear out ID.  */
    memset (&id, 0, sizeof (id));
*************** optimize_inline_calls (tree fn)
*** 2679,2703 ****
    if (ENTRY_BLOCK_PTR->count)
      counts_to_freqs ();
  
    fold_marked_statements (last, id.statements_to_fold);
    pointer_set_destroy (id.statements_to_fold);
!   if (gimple_in_ssa_p (cfun))
!     {
!       /* We make no attempts to keep dominance info up-to-date.  */
!       free_dominance_info (CDI_DOMINATORS);
!       free_dominance_info (CDI_POST_DOMINATORS);
!       delete_unreachable_blocks ();
!       update_ssa (TODO_update_ssa);
!       fold_cond_expr_cond ();
!       if (need_ssa_update_p ())
!         update_ssa (TODO_update_ssa);
!     }
!   else
!     fold_cond_expr_cond ();
    /* It would be nice to check SSA/CFG/statement consistency here, but it is
       not possible yet - the IPA passes might make various functions to not
       throw and they don't care to proactively update local EH info.  This is
       done later in fixup_cfg pass that also execute the verification.  */
  }
  
  /* FN is a function that has a complete body, and CLONE is a function whose
--- 2679,2700 ----
    if (ENTRY_BLOCK_PTR->count)
      counts_to_freqs ();
  
+   /* We are not going to maintain the cgraph edges up to date.
+      Kill it so it won't confuse us.  */
+   cgraph_node_remove_callees (id.dst_node);
+ 
    fold_marked_statements (last, id.statements_to_fold);
    pointer_set_destroy (id.statements_to_fold);
!   fold_cond_expr_cond ();
!   /* We make no attempts to keep dominance info up-to-date.  */
!   free_dominance_info (CDI_DOMINATORS);
!   free_dominance_info (CDI_POST_DOMINATORS);
    /* It would be nice to check SSA/CFG/statement consistency here, but it is
       not possible yet - the IPA passes might make various functions to not
       throw and they don't care to proactively update local EH info.  This is
       done later in fixup_cfg pass that also execute the verification.  */
+   return (TODO_update_ssa | TODO_cleanup_cfg
+ 	  | (gimple_in_ssa_p (cfun) ? TODO_remove_unused_locals : 0));
  }
  
  /* FN is a function that has a complete body, and CLONE is a function whose
*************** tree_function_versioning (tree old_decl,
*** 3194,3199 ****
--- 3191,3197 ----
    struct ipa_replace_map *replace_info;
    basic_block old_entry_block;
    tree t_step;
+   tree old_current_function_decl = current_function_decl;
  
    gcc_assert (TREE_CODE (old_decl) == FUNCTION_DECL
  	      && TREE_CODE (new_decl) == FUNCTION_DECL);
*************** tree_function_versioning (tree old_decl,
*** 3202,3211 ****
    old_version_node = cgraph_node (old_decl);
    new_version_node = cgraph_node (new_decl);
  
-   allocate_struct_function (new_decl);
-   /* Cfun points to the new allocated function struct at this point.  */
-   cfun->function_end_locus = DECL_SOURCE_LOCATION (new_decl);
- 
    DECL_ARTIFICIAL (new_decl) = 1;
    DECL_ABSTRACT_ORIGIN (new_decl) = DECL_ORIGIN (old_decl);
  
--- 3200,3205 ----
*************** tree_function_versioning (tree old_decl,
*** 3322,3328 ****
    free_dominance_info (CDI_DOMINATORS);
    free_dominance_info (CDI_POST_DOMINATORS);
    pop_cfun ();
!   current_function_decl = NULL;
    return;
  }
  
--- 3316,3324 ----
    free_dominance_info (CDI_DOMINATORS);
    free_dominance_info (CDI_POST_DOMINATORS);
    pop_cfun ();
!   current_function_decl = old_current_function_decl;
!   gcc_assert (!current_function_decl
! 	      || DECL_STRUCT_FUNCTION (current_function_decl) == cfun);
    return;
  }
  
Index: tree-inline.h
===================================================================
*** tree-inline.h	(revision 120777)
--- tree-inline.h	(working copy)
*************** typedef struct copy_body_data
*** 98,104 ****
  extern tree copy_body_r (tree *, int *, void *);
  extern void insert_decl_map (copy_body_data *, tree, tree);
  
! void optimize_inline_calls (tree);
  bool tree_inlinable_function_p (tree);
  tree copy_tree_r (tree *, int *, void *);
  void clone_body (tree, tree, void *);
--- 98,104 ----
  extern tree copy_body_r (tree *, int *, void *);
  extern void insert_decl_map (copy_body_data *, tree, tree);
  
! unsigned int optimize_inline_calls (tree);
  bool tree_inlinable_function_p (tree);
  tree copy_tree_r (tree *, int *, void *);
  void clone_body (tree, tree, void *);
Index: tree-optimize.c
===================================================================
*** tree-optimize.c	(revision 120777)
--- tree-optimize.c	(working copy)
*************** has_abnormal_outgoing_edge_p (basic_bloc
*** 285,293 ****
  /* Pass: fixup_cfg.  IPA passes, compilation of earlier functions or inlining
     might have changed some properties, such as marked functions nothrow or
     added calls that can potentially go to non-local labels.  Remove redundant
!    edges and basic blocks, and create new ones if necessary.  */
  
! static unsigned int
  execute_fixup_cfg (void)
  {
    basic_block bb;
--- 285,296 ----
  /* Pass: fixup_cfg.  IPA passes, compilation of earlier functions or inlining
     might have changed some properties, such as marked functions nothrow or
     added calls that can potentially go to non-local labels.  Remove redundant
!    edges and basic blocks, and create new ones if necessary.
  
!    This pass can't be executed as a stand-alone pass from the pass manager,
!    because between inlining and this fixup verify_flow_info would fail.  */
! 
! unsigned int
  execute_fixup_cfg (void)
  {
    basic_block bb;
*************** execute_fixup_cfg (void)
*** 310,316 ****
  	      {
  		if (gimple_in_ssa_p (cfun))
  		  {
! 		    todo |= TODO_update_ssa;
  	            update_stmt (stmt);
  		  }
  	        TREE_SIDE_EFFECTS (call) = 0;
--- 313,319 ----
  	      {
  		if (gimple_in_ssa_p (cfun))
  		  {
! 		    todo |= TODO_update_ssa | TODO_cleanup_cfg;
  	            update_stmt (stmt);
  		  }
  	        TREE_SIDE_EFFECTS (call) = 0;
*************** execute_fixup_cfg (void)
*** 320,326 ****
  	    if (!tree_could_throw_p (stmt) && lookup_stmt_eh_region (stmt))
  	      remove_stmt_from_eh_region (stmt);
  	  }
! 	tree_purge_dead_eh_edges (bb);
        }
  
    if (current_function_has_nonlocal_label)
--- 323,330 ----
  	    if (!tree_could_throw_p (stmt) && lookup_stmt_eh_region (stmt))
  	      remove_stmt_from_eh_region (stmt);
  	  }
! 	if (tree_purge_dead_eh_edges (bb))
!           todo |= TODO_cleanup_cfg;
        }
  
    if (current_function_has_nonlocal_label)
*************** execute_fixup_cfg (void)
*** 358,364 ****
  
  		      for (phi = phi_nodes (bb); phi; phi = PHI_CHAIN (phi))
  			{
! 		          todo |= TODO_update_ssa;
  			  gcc_assert (SSA_NAME_OCCURS_IN_ABNORMAL_PHI
  				      (PHI_RESULT (phi)));
  			  mark_sym_for_renaming
--- 362,368 ----
  
  		      for (phi = phi_nodes (bb); phi; phi = PHI_CHAIN (phi))
  			{
! 		          todo |= TODO_update_ssa | TODO_cleanup_cfg;
  			  gcc_assert (SSA_NAME_OCCURS_IN_ABNORMAL_PHI
  				      (PHI_RESULT (phi)));
  			  mark_sym_for_renaming
*************** execute_fixup_cfg (void)
*** 377,400 ****
    return todo;
  }
  
- struct tree_opt_pass pass_fixup_cfg =
- {
-   "fixupcfg",				/* name */
-   NULL,					/* gate */
-   execute_fixup_cfg,			/* execute */
-   NULL,					/* sub */
-   NULL,					/* next */
-   0,					/* static_pass_number */
-   0,					/* tv_id */
-   PROP_cfg,				/* properties_required */
-   0,					/* properties_provided */
-   0,					/* properties_destroyed */
-   0,					/* todo_flags_start */
-   TODO_cleanup_cfg | TODO_ggc_collect
-   | TODO_dump_func | TODO_verify_flow
-   | TODO_verify_stmts,/* todo_flags_finish */
-   0					/* letter */ };
- 
  /* Do the actions required to initialize internal data structures used
     in tree-ssa optimization passes.  */
  
--- 381,386 ----
*************** tree_rest_of_compilation (tree fndecl)
*** 487,499 ****
    /* Initialize the default bitmap obstack.  */
    bitmap_obstack_initialize (NULL);
  
-   /* We might need the body of this function so that we can expand
-      it inline somewhere else.  */
-   if (cgraph_preserve_function_body_p (fndecl))
-     save_inline_function_body (node);
- 
    /* Initialize the RTL code for the function.  */
    current_function_decl = fndecl;
    saved_loc = input_location;
    input_location = DECL_SOURCE_LOCATION (fndecl);
    init_function_start (fndecl);
--- 473,481 ----
    /* Initialize the default bitmap obstack.  */
    bitmap_obstack_initialize (NULL);
  
    /* Initialize the RTL code for the function.  */
    current_function_decl = fndecl;
+   cfun = DECL_STRUCT_FUNCTION (fndecl);
    saved_loc = input_location;
    input_location = DECL_SOURCE_LOCATION (fndecl);
    init_function_start (fndecl);
*************** tree_rest_of_compilation (tree fndecl)
*** 506,538 ****
    
    tree_register_cfg_hooks ();
  
-   if (flag_inline_trees)
-     {
-       struct cgraph_edge *e;
-       for (e = node->callees; e; e = e->next_callee)
- 	if (!e->inline_failed || warn_inline)
- 	  break;
-       if (e)
- 	{
- 	  timevar_push (TV_INTEGRATION);
- 	  optimize_inline_calls (fndecl);
- 	  timevar_pop (TV_INTEGRATION);
- 	}
-     }
-   /* In non-unit-at-a-time we must mark all referenced functions as needed.
-      */
-   if (!flag_unit_at_a_time)
-     {
-       struct cgraph_edge *e;
-       for (e = node->callees; e; e = e->next_callee)
- 	if (e->callee->analyzed)
-           cgraph_mark_needed_node (e->callee);
-     }
- 
-   /* We are not going to maintain the cgraph edges up to date.
-      Kill it so it won't confuse us.  */
-   cgraph_node_remove_callees (node);
- 
    bitmap_obstack_initialize (&reg_obstack); /* FIXME, only at RTL generation*/
    /* Perform all tree transforms and optimizations.  */
    execute_pass_list (all_passes);
--- 488,493 ----
Index: tree-flow.h
===================================================================
*** tree-flow.h	(revision 120777)
--- tree-flow.h	(working copy)
*************** void sort_fieldstack (VEC(fieldoff_s,hea
*** 1058,1063 ****
--- 1058,1064 ----
  
  void init_alias_heapvars (void);
  void delete_alias_heapvars (void);
+ unsigned int execute_fixup_cfg (void);
  
  #include "tree-flow-inline.h"
  
Index: Makefile.in
===================================================================
*** Makefile.in	(revision 120777)
--- Makefile.in	(working copy)
*************** passes.o : passes.c $(CONFIG_H) $(SYSTEM
*** 2103,2109 ****
     $(PARAMS_H) $(TM_P_H) reload.h dwarf2asm.h $(TARGET_H) \
     langhooks.h insn-flags.h $(CFGLAYOUT_H) $(REAL_H) $(CFGLOOP_H) \
     hosthooks.h $(CGRAPH_H) $(COVERAGE_H) tree-pass.h $(TREE_DUMP_H) \
!    $(GGC_H) $(INTEGRATE_H) $(CPPLIB_H) opts.h $(TREE_FLOW_H) $(TREE_INLINE_H)
  
  main.o : main.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) toplev.h
  
--- 2103,2110 ----
     $(PARAMS_H) $(TM_P_H) reload.h dwarf2asm.h $(TARGET_H) \
     langhooks.h insn-flags.h $(CFGLAYOUT_H) $(REAL_H) $(CFGLOOP_H) \
     hosthooks.h $(CGRAPH_H) $(COVERAGE_H) tree-pass.h $(TREE_DUMP_H) \
!    $(GGC_H) $(INTEGRATE_H) $(CPPLIB_H) opts.h $(TREE_FLOW_H) $(TREE_INLINE_H) \
!    gt-passes.h
  
  main.o : main.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) toplev.h
  
*************** GTFILES = $(srcdir)/input.h $(srcdir)/co
*** 2819,2825 ****
    $(srcdir)/ipa-reference.c $(srcdir)/tree-ssa-structalias.h \
    $(srcdir)/tree-ssa-structalias.c \
    $(srcdir)/c-pragma.h $(srcdir)/omp-low.c $(srcdir)/varpool.c \
!   $(srcdir)/targhooks.c $(out_file) \
    @all_gtfiles@
  
  GTFILES_FILES_LANGS = @all_gtfiles_files_langs@
--- 2820,2826 ----
    $(srcdir)/ipa-reference.c $(srcdir)/tree-ssa-structalias.h \
    $(srcdir)/tree-ssa-structalias.c \
    $(srcdir)/c-pragma.h $(srcdir)/omp-low.c $(srcdir)/varpool.c \
!   $(srcdir)/targhooks.c $(out_file) $(srcdir)/passes.c\
    @all_gtfiles@
  
  GTFILES_FILES_LANGS = @all_gtfiles_files_langs@
*************** gt-tree-profile.h gt-tree-ssa-address.h 
*** 2850,2856 ****
  gt-tree-iterator.h gt-gimplify.h \
  gt-tree-phinodes.h gt-tree-nested.h \
  gt-tree-ssa-propagate.h gt-varpool.h \
! gt-tree-ssa-structalias.h gt-ipa-inline.h \
  gt-stringpool.h gt-targhooks.h gt-omp-low.h : s-gtype ; @true
  
  define echo_quoted_to_gtyp
--- 2851,2857 ----
  gt-tree-iterator.h gt-gimplify.h \
  gt-tree-phinodes.h gt-tree-nested.h \
  gt-tree-ssa-propagate.h gt-varpool.h \
! gt-tree-ssa-structalias.h gt-ipa-inline.h gt-passes.h \
  gt-stringpool.h gt-targhooks.h gt-omp-low.h : s-gtype ; @true
  
  define echo_quoted_to_gtyp
Index: passes.c
===================================================================
*** passes.c	(revision 120777)
--- passes.c	(working copy)
*************** init_optimization_passes (void)
*** 437,446 ****
    struct tree_opt_pass **p;
  
  #define NEXT_PASS(PASS)  (p = next_pass_1 (p, &PASS))
    /* Interprocedural optimization passes.  */
    p = &all_ipa_passes;
    NEXT_PASS (pass_ipa_function_and_variable_visibility);
!   NEXT_PASS (pass_early_ipa_inline);
    NEXT_PASS (pass_early_local_passes);
    NEXT_PASS (pass_ipa_increase_alignment);
    NEXT_PASS (pass_ipa_cp);
--- 437,447 ----
    struct tree_opt_pass **p;
  
  #define NEXT_PASS(PASS)  (p = next_pass_1 (p, &PASS))
+ 
    /* Interprocedural optimization passes.  */
    p = &all_ipa_passes;
    NEXT_PASS (pass_ipa_function_and_variable_visibility);
!   NEXT_PASS (pass_ipa_early_inline);
    NEXT_PASS (pass_early_local_passes);
    NEXT_PASS (pass_ipa_increase_alignment);
    NEXT_PASS (pass_ipa_cp);
*************** init_optimization_passes (void)
*** 451,456 ****
--- 452,463 ----
    NEXT_PASS (pass_ipa_pta);
    *p = NULL;
  
+   p = &pass_ipa_early_inline.sub;
+   NEXT_PASS (pass_early_inline);
+   NEXT_PASS (pass_inline_parameters);
+   NEXT_PASS (pass_rebuild_cgraph_edges);
+   *p = NULL;
+ 
    /* All passes needed to lower the function into shape optimizers can
       operate on.  */
    p = &all_lowering_passes;
*************** init_optimization_passes (void)
*** 464,469 ****
--- 471,477 ----
    NEXT_PASS (pass_lower_vector);
    NEXT_PASS (pass_warn_function_return);
    NEXT_PASS (pass_build_cgraph_edges);
+   NEXT_PASS (pass_inline_parameters);
    *p = NULL;
  
    p = &pass_early_local_passes.sub;
*************** init_optimization_passes (void)
*** 473,478 ****
--- 481,487 ----
    NEXT_PASS (pass_expand_omp);
    NEXT_PASS (pass_all_early_optimizations);
    NEXT_PASS (pass_rebuild_cgraph_edges);
+   NEXT_PASS (pass_inline_parameters);
    *p = NULL;
  
    p = &pass_all_early_optimizations.sub;
*************** init_optimization_passes (void)
*** 480,485 ****
--- 489,496 ----
    NEXT_PASS (pass_reset_cc_flags);
    NEXT_PASS (pass_build_ssa);
    NEXT_PASS (pass_early_warn_uninitialized);
+   NEXT_PASS (pass_rebuild_cgraph_edges);
+   NEXT_PASS (pass_early_inline);
    NEXT_PASS (pass_cleanup_cfg);
    NEXT_PASS (pass_rename_ssa_copies);
    NEXT_PASS (pass_ccp);
*************** init_optimization_passes (void)
*** 494,500 ****
    *p = NULL;
  
    p = &all_passes;
!   NEXT_PASS (pass_fixup_cfg);
    NEXT_PASS (pass_all_optimizations);
    NEXT_PASS (pass_warn_function_noreturn);
    NEXT_PASS (pass_free_datastructures);
--- 505,511 ----
    *p = NULL;
  
    p = &all_passes;
!   NEXT_PASS (pass_apply_inline);
    NEXT_PASS (pass_all_optimizations);
    NEXT_PASS (pass_warn_function_noreturn);
    NEXT_PASS (pass_free_datastructures);
*************** do_per_function (void (*callback) (void 
*** 749,754 ****
--- 760,811 ----
      }
  }
  
+ /* Because inlining might remove no-longer reachable nodes, we need to
+    keep the array visible to the garbage collector to avoid reading
+    already collected nodes.  */
+ static int nnodes;
+ static GTY ((length ("nnodes"))) struct cgraph_node **order;
+ 
+ /* If we are in IPA mode (i.e., current_function_decl is NULL), call
+    CALLBACK for every function in the call graph, in reverse postorder.
+    Otherwise, call CALLBACK on the current function.  */
+ 
+ static void
+ do_per_function_toporder (void (*callback) (void *data), void *data)
+ {
+   int i;
+ 
+   if (current_function_decl)
+     callback (data);
+   else
+     {
+       gcc_assert (!order);
+       order = ggc_alloc (sizeof (*order) * cgraph_n_nodes);
+       nnodes = cgraph_postorder (order);
+       for (i = nnodes - 1; i >= 0; i--)
+ 	{
+ 	  struct cgraph_node *node = order[i];
+ 
+ 	  /* Allow possibly removed nodes to be garbage collected.  */
+ 	  order[i] = NULL;
+ 	  if (node->analyzed && (node->needed || node->reachable))
+ 	    {
+ 	      push_cfun (DECL_STRUCT_FUNCTION (node->decl));
+ 	      current_function_decl = node->decl;
+ 	      callback (data);
+ 	      free_dominance_info (CDI_DOMINATORS);
+ 	      free_dominance_info (CDI_POST_DOMINATORS);
+ 	      current_function_decl = NULL;
+ 	      pop_cfun ();
+ 	      ggc_collect ();
+ 	    }
+ 	}
+     }
+   ggc_free (order);
+   order = NULL;
+   nnodes = 0;
+ }
+ 
  /* Perform all TODO actions that ought to be done on each function.  */
  
  static void
*************** execute_one_pass (struct tree_opt_pass *
*** 903,908 ****
--- 960,968 ----
    if (pass->gate && !pass->gate ())
      return false;
  
+   if (!quiet_flag && !cfun)
+     fprintf (stderr, " <%s>", pass->name ? pass->name : "");
+ 
    if (pass->todo_flags_start & TODO_set_props)
      cfun->curr_properties = pass->properties_required;
  
*************** execute_ipa_pass_list (struct tree_opt_p
*** 1012,1027 ****
      {
        gcc_assert (!current_function_decl);
        gcc_assert (!cfun);
-       if (!quiet_flag)
- 	{
-           fprintf (stderr, " <%s>", pass->name ? pass->name : "");
- 	  fflush (stderr);
- 	}
        if (execute_one_pass (pass) && pass->sub)
! 	do_per_function ((void (*)(void *))execute_pass_list, pass->sub);
        if (!current_function_decl)
  	cgraph_process_new_functions ();
        pass = pass->next;
      }
    while (pass);
  }
--- 1072,1084 ----
      {
        gcc_assert (!current_function_decl);
        gcc_assert (!cfun);
        if (execute_one_pass (pass) && pass->sub)
! 	do_per_function_toporder ((void (*)(void *))execute_pass_list,
! 				  pass->sub);
        if (!current_function_decl)
  	cgraph_process_new_functions ();
        pass = pass->next;
      }
    while (pass);
  }
+ #include "gt-passes.h"
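
For readers who want the traversal order of do_per_function_toporder in
isolation, here is a minimal standalone sketch.  It is not GCC code: the
toy_graph type and every toy_* function are hypothetical names made up for
illustration, and the sketch makes no claim about how cgraph_postorder
builds its array.  It only shows the pattern the patch relies on: fill an
array with a DFS postorder, then walk that array from the last slot to the
first while calling a callback on each node.

```c
#include <stddef.h>

#define MAX_NODES 8

/* Hypothetical toy graph: edge[i][j] != 0 means node i has an edge to j.  */
struct toy_graph
{
  int n;
  int edge[MAX_NODES][MAX_NODES];
};

/* Hypothetical helper that records the order in which nodes are visited.  */
struct toy_trace
{
  int len;
  int seen[MAX_NODES];
};

static void
toy_dfs (const struct toy_graph *g, int v, int *visited, int *order,
	 int *count)
{
  int w;

  visited[v] = 1;
  for (w = 0; w < g->n; w++)
    if (g->edge[v][w] && !visited[w])
      toy_dfs (g, w, visited, order, count);
  /* V is recorded only after everything reachable from it.  */
  order[(*count)++] = v;
}

/* Fill ORDER with a DFS postorder of G and return the number of nodes.  */
static int
toy_postorder (const struct toy_graph *g, int *order)
{
  int visited[MAX_NODES] = { 0 };
  int count = 0;
  int v;

  for (v = 0; v < g->n; v++)
    if (!visited[v])
      toy_dfs (g, v, visited, order, &count);
  return count;
}

/* Call CALLBACK on every node, walking the postorder array backwards.  */
static void
toy_walk_reverse_postorder (const struct toy_graph *g,
			    void (*callback) (int node, void *data),
			    void *data)
{
  int order[MAX_NODES];
  int n = toy_postorder (g, order);
  int i;

  for (i = n - 1; i >= 0; i--)
    callback (order[i], data);
}

/* Example callback: append NODE to the trace passed in DATA.  */
static void
toy_record (int node, void *data)
{
  struct toy_trace *t = (struct toy_trace *) data;

  t->seen[t->len++] = node;
}
```

Which way the edges point decides whether callers or callees are visited
first.  In the toy above an edge goes from a node to its successor, so the
backwards walk visits a node before the nodes it points to; reversing the
edge direction flips that.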


