This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Early inlining


Hi,
this patch adds early inlining.  As discussed earlier, it noticeably
cuts the cost of tree profiling on testcases with extreme function call
overhead (such as tramp3d).  It also saves memory on Gerald's testcase
and apparently makes the GCC module compilation test a tiny bit faster
at -O3 (but that is close to noise).  In the future I would like to have
more optimizations in between the early inlining and the real inlining,
which will make this infrastructure a bit more useful, but at the moment
it seems to solve some corner cases and has no measurable overhead
otherwise (even though we rebuild cgraph edges after the early passes),
so I am enabling it by default now.
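
To make the early-mode size test concrete, here is a minimal standalone
sketch, not the GCC code itself.  It assumes the size estimate is simply
"caller size plus callee size minus a fixed per-call cost"; the names
toy_node and TOY_INSNS_PER_CALL and the value 10 are made up for
illustration.

```c
#include <stdbool.h>

/* Hypothetical per-call overhead, in abstract "insns" (assumed value).  */
#define TOY_INSNS_PER_CALL 10

/* Hypothetical stand-in for a call graph node; only the size matters.  */
struct toy_node
{
  const char *name;
  int insns;
};

/* Estimated size of CALLER after inlining one call to CALLEE: the call
   itself goes away and the callee body is added.  */
static int
toy_size_after_inlining (const struct toy_node *caller,
			 const struct toy_node *callee)
{
  return caller->insns + callee->insns - TOY_INSNS_PER_CALL;
}

/* Early mode: inline only when the caller is not expected to grow,
   i.e. when the callee body is no bigger than the call overhead.  */
static bool
toy_early_inline_p (const struct toy_node *caller,
		    const struct toy_node *callee)
{
  return toy_size_after_inlining (caller, callee) <= caller->insns;
}
```

With a 40-insn caller, a 6-insn wrapper passes the test (40 + 6 - 10 = 36)
while a 25-insn callee does not (40 + 25 - 10 = 55), so the latter would be
left for the real inliner to decide on.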

The patch also fixes two bugs I managed to create in the IPA->local PM
switching patch.  Apparently the inliner is 100% reentrant, as that patch
simply caused the inliner to be executed multiple times without really
breaking anything.

There are also moderately ugly issues with tree profiling, which needs to
be executed at -O0 too, when we don't run the IPA passes at all, so we
need to have two fake passes (or insert the pass at different places
depending on -funit-at-a-time).
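
The two-pass arrangement can be pictured with a small standalone sketch of
the two gates.  The first gate mirrors do_early_tree_profiling from the
patch; the second is an assumption, namely that the profiling pass placed
under the early local passes only runs when we optimize in unit-at-a-time
mode.  struct toy_flags and both function names are made up for
illustration.

```c
#include <stdbool.h>

/* Hypothetical snapshot of the relevant option state.  */
struct toy_flags
{
  bool profiling;	/* Tree profiling was requested.  */
  bool unit_at_a_time;	/* -funit-at-a-time is in effect.  */
  int optimize;		/* The -O level.  */
};

/* Gate of the profiling pass at the end of the lowering passes: used
   when the IPA passes will not run for this function.  */
static bool
toy_gate_early_profile (const struct toy_flags *f)
{
  return f->profiling && (!f->unit_at_a_time || !f->optimize);
}

/* Assumed gate of the profiling pass run from the IPA pass list, after
   early inlining: the complementary case.  */
static bool
toy_gate_ipa_profile (const struct toy_flags *f)
{
  return f->profiling && f->unit_at_a_time && f->optimize >= 1;
}
```

Under these assumptions exactly one of the two gates is true whenever
profiling is requested, so each function is instrumented once, either
before or after the early inliner has had a chance to run.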

Bootstrapped/regtested i686-pc-linux-gnu, OK?

2005-06-24  Jan Hubicka  <jh@suse.cz>
	* cgraph.c (cgraph_remove_node): Do not release function bodies until
	full cgraph is built.
	* cgraph.h (cgraph_decide_inlining_incrementally): Add early argument.
	* cgraphunit.c (cgraph_finalize_function): Update call of
	cgraph_decide_inlining_incrementally.
	(initialize_inline_failed): Break out of ...
	(cgraph_analyze_function): ... here.
	(rebuild_cgraph_edges): New function.
	(pass_rebuild_cgraph_edges): New pass.
	* common.opt (fearly-inlining): New flag.
	* ipa-inline.c: Include ggc.h.
	(cgraph_clone_inlined_nodes): Avoid re-using of original copy
	when cgraph is not fully built.
	(cgraph_decide_inlining_incrementally): Add early mode.
	(cgraph_early_inlining): New function.
	(cgraph_gate_early_inlining): Likewise.
	(pass_early_ipa_inline): New pass.
	* ipa.c (cgraph_postorder): NULLify aux pointer.
	* tree-inline.c (expand_call_inline): Avoid warning early.
	* tree-optimize.c (pass_early_local_passes): New.
	(execute_cleanup_cfg_pre_ipa): New.
	(pass_cleanup_cfg): New.
	(register_dump_files): Fix handling subpasses of IPA pass.
	(init_tree_optimization_passes): Add early passes.
	(execute_ipa_pass_list): Fix handling of subpasses of IPA pass.
	* tree-pass.h (pass_early_tree_profile, pass_rebuild_cgraph_edges,
	pass_early_ipa_inline): New passes.
	* tree-profile.c (do_early_tree_profiling, pass_early_tree_profile): New.

	* doc/invoke.texi: Document -fearly-inlining.
Index: cgraph.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/cgraph.c,v
retrieving revision 1.77
diff -c -3 -p -r1.77 cgraph.c
*** cgraph.c	3 Jun 2005 13:41:35 -0000	1.77
--- cgraph.c	23 Jun 2005 15:02:25 -0000
*************** cgraph_remove_node (struct cgraph_node *
*** 473,479 ****
      {
        struct cgraph_node *n = *slot;
        if (!n->next_clone && !n->global.inlined_to
! 	  && (TREE_ASM_WRITTEN (n->decl) || DECL_EXTERNAL (n->decl)))
  	kill_body = true;
      }
  
--- 473,480 ----
      {
        struct cgraph_node *n = *slot;
        if (!n->next_clone && !n->global.inlined_to
! 	  && (cgraph_global_info_ready
! 	      && (TREE_ASM_WRITTEN (n->decl) || DECL_EXTERNAL (n->decl))))
  	kill_body = true;
      }
  
Index: cgraph.h
===================================================================
RCS file: /cvs/gcc/gcc/gcc/cgraph.h,v
retrieving revision 1.57
diff -c -3 -p -r1.57 cgraph.h
*** cgraph.h	2 Jun 2005 19:41:31 -0000	1.57
--- cgraph.h	23 Jun 2005 15:02:25 -0000
*************** bool cgraph_remove_unreachable_nodes (bo
*** 284,290 ****
  int cgraph_postorder (struct cgraph_node **);
  
  /* In ipa-inline.c  */
! void cgraph_decide_inlining_incrementally (struct cgraph_node *);
  void cgraph_clone_inlined_nodes (struct cgraph_edge *, bool);
  void cgraph_mark_inline_edge (struct cgraph_edge *);
  bool cgraph_default_inline_p (struct cgraph_node *);
--- 284,290 ----
  int cgraph_postorder (struct cgraph_node **);
  
  /* In ipa-inline.c  */
! bool cgraph_decide_inlining_incrementally (struct cgraph_node *, bool);
  void cgraph_clone_inlined_nodes (struct cgraph_edge *, bool);
  void cgraph_mark_inline_edge (struct cgraph_edge *);
  bool cgraph_default_inline_p (struct cgraph_node *);
Index: cgraphunit.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/cgraphunit.c,v
retrieving revision 1.118
diff -c -3 -p -r1.118 cgraphunit.c
*** cgraphunit.c	12 Jun 2005 14:02:58 -0000	1.118
--- cgraphunit.c	23 Jun 2005 15:02:25 -0000
*************** cgraph_finalize_function (tree decl, boo
*** 419,425 ****
    if (!flag_unit_at_a_time)
      {
        cgraph_analyze_function (node);
!       cgraph_decide_inlining_incrementally (node);
      }
  
    if (decide_is_function_needed (node, decl))
--- 419,425 ----
    if (!flag_unit_at_a_time)
      {
        cgraph_analyze_function (node);
!       cgraph_decide_inlining_incrementally (node, false);
      }
  
    if (decide_is_function_needed (node, decl))
*************** cgraph_create_edges (struct cgraph_node 
*** 561,566 ****
--- 561,633 ----
    visited_nodes = NULL;
  }
  
+ /* Give initial reasons why inlining would fail.  Those get
+    either NULLified or, more usually, overwritten by a more precise
+    reason later.  */
+ static void
+ initialize_inline_failed (struct cgraph_node *node)
+ {
+   struct cgraph_edge *e;
+ 
+   for (e = node->callers; e; e = e->next_caller)
+     {
+       gcc_assert (!e->callee->global.inlined_to);
+       gcc_assert (e->inline_failed);
+       if (node->local.redefined_extern_inline)
+ 	e->inline_failed = N_("redefined extern inline functions are not "
+ 			   "considered for inlining");
+       else if (!node->local.inlinable)
+ 	e->inline_failed = N_("function not inlinable");
+       else
+ 	e->inline_failed = N_("function not considered for inlining");
+     }
+ }
+ 
+ /* Rebuild call edges of the current function after passes that are
+    not aware of cgraph updating.  */
+ static void
+ rebuild_cgraph_edges (void)
+ {
+   basic_block bb;
+   struct cgraph_node *node = cgraph_node (current_function_decl);
+   block_stmt_iterator bsi;
+ 
+   cgraph_node_remove_callees (node);
+ 
+   node->count = ENTRY_BLOCK_PTR->count;
+ 
+   FOR_EACH_BB (bb)
+     for (bsi = bsi_start (bb); !bsi_end_p (bsi); bsi_next (&bsi))
+       {
+ 	tree stmt = bsi_stmt (bsi);
+ 	tree call = get_call_expr_in (stmt);
+ 	tree decl;
+ 
+ 	if (call && (decl = get_callee_fndecl (call)))
+ 	  cgraph_create_edge (node, cgraph_node (decl), stmt,
+ 			      bb->count,
+ 			      bb->loop_depth);
+       }
+   initialize_inline_failed (node);
+   gcc_assert (!node->global.inlined_to);
+ }
+ 
+ struct tree_opt_pass pass_rebuild_cgraph_edges =
+ {
+   NULL,					/* name */
+   NULL,					/* gate */
+   rebuild_cgraph_edges,			/* execute */
+   NULL,					/* sub */
+   NULL,					/* next */
+   0,					/* static_pass_number */
+   0,					/* tv_id */
+   PROP_cfg,				/* properties_required */
+   0,					/* properties_provided */
+   0,					/* properties_destroyed */
+   0,					/* todo_flags_start */
+   0,					/* todo_flags_finish */
+   0					/* letter */
+ };
  
  /* Verify cgraph nodes of given cgraph node.  */
  void
*************** static void
*** 756,762 ****
  cgraph_analyze_function (struct cgraph_node *node)
  {
    tree decl = node->decl;
-   struct cgraph_edge *e;
  
    current_function_decl = decl;
    push_cfun (DECL_STRUCT_FUNCTION (decl));
--- 823,828 ----
*************** cgraph_analyze_function (struct cgraph_n
*** 770,785 ****
    if (node->local.inlinable)
      node->local.disregard_inline_limits
        = lang_hooks.tree_inlining.disregard_inline_limits (decl);
!   for (e = node->callers; e; e = e->next_caller)
!     {
!       if (node->local.redefined_extern_inline)
! 	e->inline_failed = N_("redefined extern inline functions are not "
! 			   "considered for inlining");
!       else if (!node->local.inlinable)
! 	e->inline_failed = N_("function not inlinable");
!       else
! 	e->inline_failed = N_("function not considered for inlining");
!     }
    if (flag_really_no_inline && !node->local.disregard_inline_limits)
      node->local.inlinable = 0;
    /* Inlining characteristics are maintained by the cgraph_mark_inline.  */
--- 836,842 ----
    if (node->local.inlinable)
      node->local.disregard_inline_limits
        = lang_hooks.tree_inlining.disregard_inline_limits (decl);
!   initialize_inline_failed (node);
    if (flag_really_no_inline && !node->local.disregard_inline_limits)
      node->local.inlinable = 0;
    /* Inlining characteristics are maintained by the cgraph_mark_inline.  */
Index: common.opt
===================================================================
RCS file: /cvs/gcc/gcc/gcc/common.opt,v
retrieving revision 1.73
diff -c -3 -p -r1.73 common.opt
*** common.opt	4 Jun 2005 17:07:55 -0000	1.73
--- common.opt	23 Jun 2005 15:02:25 -0000
*************** finline-functions
*** 472,477 ****
--- 472,481 ----
  Common Report Var(flag_inline_functions)
  Integrate simple functions into their callers
  
+ fearly-inlining
+ Common Report Var(flag_early_inlining) Init(1)
+ Perform early inlining
+ 
  finline-limit-
  Common RejectNegative Joined UInteger
  
Index: ipa-inline.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/ipa-inline.c,v
retrieving revision 2.8
diff -c -3 -p -r2.8 ipa-inline.c
*** ipa-inline.c	2 Jun 2005 19:41:31 -0000	2.8
--- ipa-inline.c	23 Jun 2005 15:02:25 -0000
*************** Software Foundation, 59 Temple Place - S
*** 79,84 ****
--- 79,85 ----
  #include "intl.h"
  #include "tree-pass.h"
  #include "coverage.h"
+ #include "ggc.h"
  
  /* Statistics we collect about inlining algorithm.  */
  static int ncalls_inlined;
*************** cgraph_clone_inlined_nodes (struct cgrap
*** 120,126 ****
    if (!e->callee->callers->next_caller
        && (!e->callee->needed || DECL_EXTERNAL (e->callee->decl))
        && duplicate
!       && flag_unit_at_a_time)
      {
        gcc_assert (!e->callee->global.inlined_to);
        if (!DECL_EXTERNAL (e->callee->decl))
--- 121,127 ----
    if (!e->callee->callers->next_caller
        && (!e->callee->needed || DECL_EXTERNAL (e->callee->decl))
        && duplicate
!       && (flag_unit_at_a_time && cgraph_global_info_ready))
      {
        gcc_assert (!e->callee->global.inlined_to);
        if (!DECL_EXTERNAL (e->callee->decl))
*************** cgraph_decide_inlining (void)
*** 870,879 ****
  /* Decide on the inlining.  We do so in the topological order to avoid
     expenses on updating data structures.  */
  
! void
! cgraph_decide_inlining_incrementally (struct cgraph_node *node)
  {
    struct cgraph_edge *e;
  
    /* First of all look for always inline functions.  */
    for (e = node->callees; e; e = e->next_callee)
--- 871,881 ----
  /* Decide on the inlining.  We do so in the topological order to avoid
     expenses on updating data structures.  */
  
! bool
! cgraph_decide_inlining_incrementally (struct cgraph_node *node, bool early)
  {
    struct cgraph_edge *e;
+   bool inlined = false;
  
    /* First of all look for always inline functions.  */
    for (e = node->callees; e; e = e->next_callee)
*************** cgraph_decide_inlining_incrementally (st
*** 883,889 ****
  	/* ??? It is possible that renaming variable removed the function body
  	   in duplicate_decls. See gcc.c-torture/compile/20011119-2.c  */
  	&& DECL_SAVED_TREE (e->callee->decl))
!       cgraph_mark_inline (e);
  
    /* Now do the automatic inlining.  */
    if (!flag_really_no_inline)
--- 885,897 ----
  	/* ??? It is possible that renaming variable removed the function body
  	   in duplicate_decls. See gcc.c-torture/compile/20011119-2.c  */
  	&& DECL_SAVED_TREE (e->callee->decl))
!       {
!         if (dump_file && early)
!           fprintf (dump_file, "  Early inlining %s into %s\n",
! 		   cgraph_node_name (e->callee), cgraph_node_name (node));
! 	cgraph_mark_inline (e);
! 	inlined = true;
!       }
  
    /* Now do the automatic inlining.  */
    if (!flag_really_no_inline)
*************** cgraph_decide_inlining_incrementally (st
*** 892,906 ****
  	  && e->inline_failed
  	  && !e->callee->local.disregard_inline_limits
  	  && !cgraph_recursive_inlining_p (node, e->callee, &e->inline_failed)
  	  && cgraph_check_inline_limits (node, e->callee, &e->inline_failed)
  	  && DECL_SAVED_TREE (e->callee->decl))
  	{
  	  if (cgraph_default_inline_p (e->callee))
! 	    cgraph_mark_inline (e);
! 	  else
  	    e->inline_failed
  	      = N_("--param max-inline-insns-single limit reached");
  	}
  }
  
  /* When inlining shall be performed.  */
--- 900,935 ----
  	  && e->inline_failed
  	  && !e->callee->local.disregard_inline_limits
  	  && !cgraph_recursive_inlining_p (node, e->callee, &e->inline_failed)
+ 	  && (!early
+ 	      || (cgraph_estimate_size_after_inlining (1, e->caller, e->callee)
+ 	          <= e->caller->global.insns))
  	  && cgraph_check_inline_limits (node, e->callee, &e->inline_failed)
  	  && DECL_SAVED_TREE (e->callee->decl))
  	{
  	  if (cgraph_default_inline_p (e->callee))
! 	    {
! 	      if (dump_file && early)
!                 fprintf (dump_file, "  Early inlining %s into %s\n",
! 			 cgraph_node_name (e->callee), cgraph_node_name (node));
! 	      cgraph_mark_inline (e);
! 	      inlined = true;
! 	    }
! 	  else if (!early)
  	    e->inline_failed
  	      = N_("--param max-inline-insns-single limit reached");
  	}
+   if (early && inlined)
+     {
+       push_cfun (DECL_STRUCT_FUNCTION (node->decl));
+       tree_register_cfg_hooks ();
+       current_function_decl = node->decl;
+       optimize_inline_calls (current_function_decl);
+       node->local.self_insns = node->global.insns;
+       current_function_decl = NULL;
+       pop_cfun ();
+       ggc_collect ();
+     }
+   return inlined;
  }
  
  /* When inlining shall be performed.  */
*************** struct tree_opt_pass pass_ipa_inline = 
*** 920,926 ****
    0,					/* static_pass_number */
    TV_INTEGRATION,			/* tv_id */
    0,	                                /* properties_required */
!   PROP_trees,				/* properties_provided */
    0,					/* properties_destroyed */
    0,					/* todo_flags_start */
    TODO_dump_cgraph | TODO_dump_func,	/* todo_flags_finish */
--- 949,1015 ----
    0,					/* static_pass_number */
    TV_INTEGRATION,			/* tv_id */
    0,	                                /* properties_required */
!   PROP_cfg,				/* properties_provided */
!   0,					/* properties_destroyed */
!   0,					/* todo_flags_start */
!   TODO_dump_cgraph | TODO_dump_func,	/* todo_flags_finish */
!   0					/* letter */
! };
! 
! /* Do inlining of small functions.  Doing so early helps profiling and other
!    passes to be somewhat more effective and avoids some code duplication in
!    later real inlining pass for testcases with very many function calls.  */
! static void
! cgraph_early_inlining (void)
! {
!   struct cgraph_node *node;
!   int nnodes;
!   struct cgraph_node **order =
!     xcalloc (cgraph_n_nodes, sizeof (struct cgraph_node *));
!   int i;
! 
!   if (sorrycount || errorcount)
!     return;
! #ifdef ENABLE_CHECKING
!   for (node = cgraph_nodes; node; node = node->next)
!     gcc_assert (!node->aux);
! #endif
! 
!   nnodes = cgraph_postorder (order);
!   for (i = nnodes - 1; i >= 0; i--)
!     {
!       node = order[i];
!       if (node->analyzed && node->local.inlinable
! 	  && (node->needed || node->reachable)
! 	  && node->callers)
! 	cgraph_decide_inlining_incrementally (node, true);
!     }
!   cgraph_remove_unreachable_nodes (true, dump_file);
! #ifdef ENABLE_CHECKING
!   for (node = cgraph_nodes; node; node = node->next)
!     gcc_assert (!node->global.inlined_to);
! #endif
!   free (order);
! }
! 
! /* When inlining shall be performed.  */
! static bool
! cgraph_gate_early_inlining (void)
! {
!   return flag_inline_trees && flag_early_inlining;
! }
! 
! struct tree_opt_pass pass_early_ipa_inline = 
! {
!   "einline",	 			/* name */
!   cgraph_gate_early_inlining,		/* gate */
!   cgraph_early_inlining,		/* execute */
!   NULL,					/* sub */
!   NULL,					/* next */
!   0,					/* static_pass_number */
!   TV_INTEGRATION,			/* tv_id */
!   0,	                                /* properties_required */
!   PROP_cfg,				/* properties_provided */
    0,					/* properties_destroyed */
    0,					/* todo_flags_start */
    TODO_dump_cgraph | TODO_dump_func,	/* todo_flags_finish */
Index: ipa.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/ipa.c,v
retrieving revision 2.1
diff -c -3 -p -r2.1 ipa.c
*** ipa.c	22 Apr 2005 08:16:54 -0000	2.1
--- ipa.c	23 Jun 2005 15:02:25 -0000
*************** cgraph_postorder (struct cgraph_node **o
*** 83,88 ****
--- 83,90 ----
  	  }
        }
    free (stack);
+   for (node = cgraph_nodes; node; node = node->next)
+     node->aux = NULL;
    return order_pos;
  }
  
Index: tree-inline.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/tree-inline.c,v
retrieving revision 1.196
diff -c -3 -p -r1.196 tree-inline.c
*** tree-inline.c	23 Jun 2005 13:20:02 -0000	1.196
--- tree-inline.c	23 Jun 2005 15:02:25 -0000
*************** expand_call_inline (basic_block bb, tree
*** 1969,1975 ****
        else if (warn_inline && DECL_DECLARED_INLINE_P (fn)
  	       && !DECL_IN_SYSTEM_HEADER (fn)
  	       && strlen (reason)
! 	       && !lookup_attribute ("noinline", DECL_ATTRIBUTES (fn)))
  	{
  	  warning (0, "%Jinlining failed in call to %qF: %s", fn, fn, reason);
  	  warning (0, "called from here");
--- 1969,1977 ----
        else if (warn_inline && DECL_DECLARED_INLINE_P (fn)
  	       && !DECL_IN_SYSTEM_HEADER (fn)
  	       && strlen (reason)
! 	       && !lookup_attribute ("noinline", DECL_ATTRIBUTES (fn))
! 	       /* Avoid warnings during early inline pass. */
! 	       && (!flag_unit_at_a_time || cgraph_global_info_ready))
  	{
  	  warning (0, "%Jinlining failed in call to %qF: %s", fn, fn, reason);
  	  warning (0, "called from here");
Index: tree-optimize.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/tree-optimize.c,v
retrieving revision 2.108
diff -c -3 -p -r2.108 tree-optimize.c
*** tree-optimize.c	17 Jun 2005 11:53:54 -0000	2.108
--- tree-optimize.c	23 Jun 2005 15:02:25 -0000
*************** int dump_flags;
*** 55,61 ****
  bool in_gimple_form;
  
  /* The root of the compilation pass tree, once constructed.  */
! static struct tree_opt_pass *all_passes, *all_ipa_passes, * all_lowering_passes;
  
  /* Gate: execute, or not, all of the non-trivial optimizations.  */
  
--- 55,61 ----
  bool in_gimple_form;
  
  /* The root of the compilation pass tree, once constructed.  */
! static struct tree_opt_pass *all_passes, *all_ipa_passes, *all_lowering_passes;
  
  /* Gate: execute, or not, all of the non-trivial optimizations.  */
  
*************** static struct tree_opt_pass pass_all_opt
*** 84,89 ****
--- 84,135 ----
    0					/* letter */
  };
  
+ static struct tree_opt_pass pass_early_local_passes =
+ {
+   NULL,					/* name */
+   gate_all_optimizations,		/* gate */
+   NULL,					/* execute */
+   NULL,					/* sub */
+   NULL,					/* next */
+   0,					/* static_pass_number */
+   0,					/* tv_id */
+   0,					/* properties_required */
+   0,					/* properties_provided */
+   0,					/* properties_destroyed */
+   0,					/* todo_flags_start */
+   0,					/* todo_flags_finish */
+   0					/* letter */
+ };
+ 
+ /* Pass: cleanup the CFG before the IPA passes run.
+    Unlike the cleanup done just before expanding trees to RTL,
+    this runs as one of the early local passes, after tree
+    profiling and before the cgraph edges are rebuilt.  */
+ 
+ static void 
+ execute_cleanup_cfg_pre_ipa (void)
+ {
+   cleanup_tree_cfg ();
+ }
+ 
+ static struct tree_opt_pass pass_cleanup_cfg =
+ {
+   "cleanup_cfg",			/* name */
+   NULL,					/* gate */
+   execute_cleanup_cfg_pre_ipa,		/* execute */
+   NULL,					/* sub */
+   NULL,					/* next */
+   0,					/* static_pass_number */
+   0,					/* tv_id */
+   PROP_cfg,				/* properties_required */
+   0,					/* properties_provided */
+   0,					/* properties_destroyed */
+   0,					/* todo_flags_start */
+   TODO_dump_func,					/* todo_flags_finish */
+   0					/* letter */
+ };
+ 
+ 
  /* Pass: cleanup the CFG just before expanding trees to RTL.
     This is just a round of label cleanups and case node grouping
     because after the tree optimizers have run such cleanups may
*************** register_dump_files (struct tree_opt_pas
*** 294,300 ****
          n++;
  
        if (pass->sub)
!         new_properties = register_dump_files (pass->sub, ipa, new_properties);
  
        /* If we have a gate, combine the properties that we could have with
           and without the pass being examined.  */
--- 340,346 ----
          n++;
  
        if (pass->sub)
!         new_properties = register_dump_files (pass->sub, false, new_properties);
  
        /* If we have a gate, combine the properties that we could have with
           and without the pass being examined.  */
*************** init_tree_optimization_passes (void)
*** 362,367 ****
--- 408,415 ----
  #define NEXT_PASS(PASS)  (p = next_pass_1 (p, &PASS))
    /* Intraprocedural optimization passes.  */
    p = &all_ipa_passes;
+   NEXT_PASS (pass_early_ipa_inline);
+   NEXT_PASS (pass_early_local_passes);
    NEXT_PASS (pass_ipa_inline);
    *p = NULL;
  
*************** init_tree_optimization_passes (void)
*** 377,383 ****
--- 425,437 ----
    NEXT_PASS (pass_lower_complex_O0);
    NEXT_PASS (pass_lower_vector);
    NEXT_PASS (pass_warn_function_return);
+   NEXT_PASS (pass_early_tree_profile);
+   *p = NULL;
+ 
+   p = &pass_early_local_passes.sub;
    NEXT_PASS (pass_tree_profile);
+   NEXT_PASS (pass_cleanup_cfg);
+   NEXT_PASS (pass_rebuild_cgraph_edges);
    *p = NULL;
  
    p = &all_passes;
*************** execute_ipa_pass_list (struct tree_opt_p
*** 685,691 ****
  	      {
  		push_cfun (DECL_STRUCT_FUNCTION (node->decl));
  		current_function_decl = node->decl;
! 		execute_pass_list (pass);
  		free_dominance_info (CDI_DOMINATORS);
  		free_dominance_info (CDI_POST_DOMINATORS);
  		current_function_decl = NULL;
--- 739,745 ----
  	      {
  		push_cfun (DECL_STRUCT_FUNCTION (node->decl));
  		current_function_decl = node->decl;
! 		execute_pass_list (pass->sub);
  		free_dominance_info (CDI_DOMINATORS);
  		free_dominance_info (CDI_POST_DOMINATORS);
  		current_function_decl = NULL;
Index: tree-pass.h
===================================================================
RCS file: /cvs/gcc/gcc/gcc/tree-pass.h,v
retrieving revision 2.42
diff -c -3 -p -r2.42 tree-pass.h
*** tree-pass.h	9 Jun 2005 13:05:37 -0000	2.42
--- tree-pass.h	23 Jun 2005 15:02:25 -0000
*************** extern struct tree_opt_pass pass_lower_c
*** 164,169 ****
--- 164,170 ----
  extern struct tree_opt_pass pass_lower_eh;
  extern struct tree_opt_pass pass_build_cfg;
  extern struct tree_opt_pass pass_tree_profile;
+ extern struct tree_opt_pass pass_early_tree_profile;
  extern struct tree_opt_pass pass_referenced_vars;
  extern struct tree_opt_pass pass_sra;
  extern struct tree_opt_pass pass_tail_recursion;
*************** extern struct tree_opt_pass pass_build_p
*** 225,232 ****
--- 226,235 ----
  extern struct tree_opt_pass pass_del_pta;
  extern struct tree_opt_pass pass_uncprop;
  extern struct tree_opt_pass pass_reassoc;
+ extern struct tree_opt_pass pass_rebuild_cgraph_edges;
  
  /* IPA Passes */
  extern struct tree_opt_pass pass_ipa_inline;
+ extern struct tree_opt_pass pass_early_ipa_inline;
  
  #endif /* GCC_TREE_PASS_H */
Index: tree-profile.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/tree-profile.c,v
retrieving revision 2.12
diff -c -3 -p -r2.12 tree-profile.c
*** tree-profile.c	24 May 2005 20:19:12 -0000	2.12
--- tree-profile.c	23 Jun 2005 15:02:25 -0000
*************** struct tree_opt_pass pass_tree_profile =
*** 273,278 ****
--- 273,305 ----
    0					/* letter */
  };
  
+ /* Return true if tree-based profiling is in effect and has to be done
+    early, i.e. when not compiling unit-at-a-time or not optimizing.
+    Gate for pass_early_tree_profile.  */
+ 
+ static bool
+ do_early_tree_profiling (void)
+ {
+   return (do_tree_profiling () && (!flag_unit_at_a_time || !optimize));
+ }
+ 
+ struct tree_opt_pass pass_early_tree_profile = 
+ {
+   "early_tree_profile",			/* name */
+   do_early_tree_profiling,		/* gate */
+   tree_profiling,			/* execute */
+   NULL,					/* sub */
+   NULL,					/* next */
+   0,					/* static_pass_number */
+   TV_BRANCH_PROB,			/* tv_id */
+   PROP_gimple_leh | PROP_cfg,		/* properties_required */
+   PROP_gimple_leh | PROP_cfg,		/* properties_provided */
+   0,					/* properties_destroyed */
+   0,					/* todo_flags_start */
+   TODO_verify_stmts,			/* todo_flags_finish */
+   0					/* letter */
+ };
+ 
  struct profile_hooks tree_profile_hooks =
  {
    tree_init_edge_profiler,      /* init_edge_profiler */
Index: doc/invoke.texi
===================================================================
RCS file: /cvs/gcc/gcc/gcc/doc/invoke.texi,v
retrieving revision 1.637
diff -c -3 -p -r1.637 invoke.texi
*** doc/invoke.texi	15 Jun 2005 12:53:41 -0000	1.637
--- doc/invoke.texi	24 Jun 2005 02:25:01 -0000
*************** Objective-C and Objective-C++ Dialects}.
*** 300,306 ****
  -fbranch-target-load-optimize2 -fbtr-bb-exclusive @gol
  -fcaller-saves  -fcprop-registers  -fcse-follow-jumps @gol
  -fcse-skip-blocks  -fcx-limited-range  -fdata-sections @gol
! -fdelayed-branch  -fdelete-null-pointer-checks @gol
  -fexpensive-optimizations  -ffast-math  -ffloat-store @gol
  -fforce-addr  -fforce-mem  -ffunction-sections @gol
  -fgcse  -fgcse-lm  -fgcse-sm  -fgcse-las  -fgcse-after-reload @gol
--- 300,306 ----
  -fbranch-target-load-optimize2 -fbtr-bb-exclusive @gol
  -fcaller-saves  -fcprop-registers  -fcse-follow-jumps @gol
  -fcse-skip-blocks  -fcx-limited-range  -fdata-sections @gol
! -fdelayed-branch  -fdelete-null-pointer-checks -fearly-inlining @gol
  -fexpensive-optimizations  -ffast-math  -ffloat-store @gol
  -fforce-addr  -fforce-mem  -ffunction-sections @gol
  -fgcse  -fgcse-lm  -fgcse-sm  -fgcse-las  -fgcse-after-reload @gol
*************** assembler code in its own right.
*** 4450,4455 ****
--- 4450,4465 ----
  
  Enabled at level @option{-O3}.
  
+ @item -fearly-inlining
+ @opindex fearly-inlining
+ Inline functions marked by @code{always_inline} and functions whose body seems
+ smaller than the function call overhead early, before doing
+ @option{-fprofile-generate} instrumentation and the real inlining pass.  Doing so
+ makes profiling significantly cheaper and usually makes inlining faster on
+ programs having large chains of nested wrapper functions.
+ 
+ Enabled by default.
+ 
  @item -finline-limit=@var{n}
  @opindex finline-limit
  By default, GCC limits the size of functions that can be inlined.  This flag

