cgraph based inlining heuristics

Jan Hubicka jh@suse.cz
Tue Jul 1 23:52:00 GMT 2003


Hi,
the attached patch implements callgraph based inlining heuristics.  For
SPEC2000 it makes little difference (about 5% compilation time/code size
savings for eon, but I am still running the test), but for template
heavy inlining code it can save quite a lot of compilation time.

For Gerald's application it saves about 28% of compilation time and 17%
code size reduction while getting overall better performance.

I tested multiple simple algorithms and the current is actually an
greedy heuristics as used for knapsack solvers - always function causing
smallest overall growth after inlining completely is considered and if
possible inlined into all callers.

There are new parameters limiting growth of large functions and overall
growth of compilation unit.  I hope these to ensure that we won't run
into degenerate cases like with current heuristics.

I would also wish to thank Gerald for a lot of testing, feedback and
ideas!

The performance of benchmark is as follows.  I will send SPEC2000
results once they are ready.

The patch is additive to the previous patch implementing new code size
estimation, so I am sending comparison to both.

Bootstrapped/regtested on mainline together with previous patch for
inline sizes and C++ unit-at-a-time (this one is not required)
OK?
Honza

                     |   mainline     |  new code size |  new heuristics|
---------------------+----------------+----------------+----------------+
      STRATCOMP1-ALL |   3.11  (0.00) |   2.89  (0.01) |   2.83  (0.00) |
   STRATCOMP-770.2-Q |   0.48  (0.00) |   0.47  (0.00) |   0.47  (0.00) |
               2QBF1 |  12.66  (0.02) |  11.93  (0.02) |  12.40  (0.02) |
          PRIMEIMPL2 |   6.05  (0.01) |   6.01  (0.01) |   6.11  (0.00) |
            ANCESTOR |  95.57  (0.23) |  91.77  (0.06) |  90.68  (0.16) |
       3COL-SIMPLEX1 |   4.94  (0.01) |   4.72  (0.08) |   4.53  (0.09) |
         3COL-LADDER | 116.42  (0.05) | 111.07  (0.13) | 108.96  (0.16) |
       3COL-N-LADDER |   -    (-)     |   -    (-)     |   -    (-)     |
        3COL-RANDOM1 |   8.83  (0.14) |   8.80  (0.02) |   8.90  (0.12) |
          HP-RANDOM1 |   9.57  (0.05) |   9.40  (0.07) |   9.06  (0.02) |
       HAMCYCLE-FREE |   0.96  (0.00) |   0.93  (0.00) |   0.81  (0.00) |
             DECOMP2 |   9.04  (0.07) |   8.27  (0.02) |   8.05  (0.05) |
        BW-P4-Esra-a |  64.37  (0.06) |  65.12  (0.08) |  64.39  (0.07) |
        BW-P5-nopush |   6.03  (0.01) |   6.06  (0.00) |   6.01  (0.01) |
       BW-P5-pushbin |   5.17  (0.01) |   5.22  (0.00) |   5.16  (0.00) |
     BW-P5-nopushbin |   1.55  (0.00) |   1.55  (0.00) |   1.52  (0.00) |
              3SAT-1 |  20.13  (0.52) |  19.14  (0.10) |  20.02  (0.49) |
   3SAT-1-CONSTRAINT |  10.48  (0.01) |  10.73  (0.02) |  10.76  (0.03) |
        HANOI-Towers |   2.96  (0.01) |   2.81  (0.02) |   2.82  (0.02) |
              RAMSEY |   5.38  (0.02) |   5.15  (0.01) |   4.91  (0.01) |
             CRISTAL |   6.31  (0.07) |   6.02  (0.07) |   5.91  (0.10) |
             HANOI-K |  23.05  (0.08) |  24.10  (0.31) |  23.77  (0.02) |
           21-QUEENS |   5.50  (0.02) |   5.46  (0.02) |   5.57  (0.05) |
   MSTDir[V=13,A=40] |   9.52  (0.01) |   9.72  (0.01) |   9.22  (0.00) |
   MSTDir[V=15,A=40] |   9.57  (0.00) |   9.68  (0.00) |   9.22  (0.00) |
 MSTUndir[V=13,A=40] |   5.17  (0.01) |   5.26  (0.00) |   4.93  (0.00) |
 MSTUndir[V=15,A=40] |  84.09  (0.03) |  85.17  (0.02) |  80.56  (0.01) |
         TIMETABLING |   7.03  (0.01) |   6.85  (0.01) |   6.59  (0.00) |
---------------------+----------------+----------------+----------------+


Tue Jul  1 23:30:24 CEST 2003  Jan Hubicka  <jh@suse.cz>
			       Gerald Pfeifer  <pfeifer@dbai.tuwien.ac.at>

	* cgraph.c (cgraph_max_uid): New global variable.
	(cgraph_node): Set uid field.
	(create_edge): Keep inline flags consistent.
	(dump_cgraph): Dump more info.
	* cgraph.h (struct cgraph_local_info): Remove inline_many and
	can_inline_once; add inlinable, disgread_inline_limits, and self_insn
	(struct cgraph_global_info): Add insns, calls, cloned_times, will_be_output.
	(struct cgraph_node): Add uid.
	(struct cgraph_edge): Add inline_call.
	(cgraph_max_uid, cgraph_inline_p): Declare.
	* cgraph.c: Include params.h and fibheap.h
	(cgraph_mark_functions_to_inline_once): Kill.
	(INSNS_PER_CALL): New constant.
	(ncalls_inlined, nfunctions_inlined, initial_insns, overall_insns): New
	static variables.
	(cgraph_finalize_function): Do not analyze inlining.
	(cgraph_finalize_compilation_unit): Set inlining attributes.
	(cgraph_mark_functions_to_output): More consistency checks.
	(cgraph_optimize_function): Set current_function_decl to NULL.
	(cgraph_expand_function): Use new inline flags.
	(cgraph_postorder): Expand from cgraph_expand_functions.
	(INLINED_TIMES, SET_INLINED_TIMES): New macros.
	(cgraph_inlined_into, cgraph_inlined_callees,
	cgraph_estimate_size_after_inlining, cgraph_estimate_growth,
	cgraph_mark_inline, cgraph_check_inline_limits,
	cgraph_default_inline_p, cgraph_decide_inling_of_small_functions, 
	cgraph_decide_inlining, cgraph_inline_p): New functions.
	* params.def (PARAM_LARGE_FUNCTION_INSNS, PARAM_LARGE_FUNCTION_GROWTH,
	PARAM_INLINE_UNIT_GROWTH): New parameters.
	* tree-inline.c (struct inline_data): New field current_decl.
	(expand_call_inline): Avoid forward declarations; use inlinable_function_p.
	(optimize_inline_calls): Set id.current_decl.
Index: cgraph.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/cgraph.c,v
retrieving revision 1.16
diff -c -3 -p -r1.16 cgraph.c
*** cgraph.c	30 Jun 2003 21:56:46 -0000	1.16
--- cgraph.c	1 Jul 2003 20:56:02 -0000
*************** struct cgraph_node *cgraph_nodes_queue;
*** 48,53 ****
--- 48,56 ----
  /* Number of nodes in existence.  */
  int cgraph_n_nodes;
  
+ /* Maximal uid used in cgraph nodes.  */
+ int cgraph_max_uid;
+ 
  /* Set when whole unit has been analyzed so we can access global info.  */
  bool cgraph_global_info_ready = false;
  
*************** cgraph_node (decl)
*** 114,119 ****
--- 117,123 ----
    node = ggc_alloc_cleared (sizeof (*node));
    node->decl = decl;
    node->next = cgraph_nodes;
+   node->uid = cgraph_max_uid++;
    if (cgraph_nodes)
      cgraph_nodes->previous = node;
    node->previous = NULL;
*************** create_edge (caller, callee)
*** 157,162 ****
--- 161,179 ----
       struct cgraph_node *caller, *callee;
  {
    struct cgraph_edge *edge = ggc_alloc (sizeof (struct cgraph_edge));
+   struct cgraph_edge *edge2;
+ 
+   edge->inline_call = false;
+   /* At the moment we don't associate calls with specific CALL_EXPRs
+      as we probably ought to, so we must preserve inline_call flags to
+      be the same in all copies of the same edge.  */
+   if (cgraph_global_info_ready)
+     for (edge2 = caller->callees; edge2; edge2 = edge2->next_caller)
+       if (edge2->callee == callee)
+ 	{
+ 	  edge->inline_call = edge2->inline_call;
+ 	  break;
+ 	}
  
    edge->caller = caller;
    edge->callee = callee;
*************** dump_cgraph (f)
*** 329,335 ****
    for (node = cgraph_nodes; node; node = node->next)
      {
        struct cgraph_edge *edge;
!       fprintf (f, "%s", IDENTIFIER_POINTER (DECL_NAME (node->decl)));
        if (node->origin)
  	fprintf (f, " nested in: %s",
  		 IDENTIFIER_POINTER (DECL_NAME (node->origin->decl)));
--- 346,352 ----
    for (node = cgraph_nodes; node; node = node->next)
      {
        struct cgraph_edge *edge;
!       fprintf (f, "%s %i insns", IDENTIFIER_POINTER (DECL_NAME (node->decl)), node->local.self_insns);
        if (node->origin)
  	fprintf (f, " nested in: %s",
  		 IDENTIFIER_POINTER (DECL_NAME (node->origin->decl)));
*************** dump_cgraph (f)
*** 340,354 ****
        if (DECL_SAVED_TREE (node->decl))
  	fprintf (f, " tree");
  
        fprintf (f, "\n  called by :");
        for (edge = node->callers; edge; edge = edge->next_caller)
! 	fprintf (f, "%s ",
! 		 IDENTIFIER_POINTER (DECL_NAME (edge->caller->decl)));
  
        fprintf (f, "\n  calls: ");
        for (edge = node->callees; edge; edge = edge->next_callee)
! 	fprintf (f, "%s ",
! 		 IDENTIFIER_POINTER (DECL_NAME (edge->callee->decl)));
        fprintf (f, "\n");
      }
  }
--- 357,390 ----
        if (DECL_SAVED_TREE (node->decl))
  	fprintf (f, " tree");
  
+       if (node->local.disgread_inline_limits)
+ 	fprintf (f, " always_inline");
+       else if (node->local.inlinable)
+ 	fprintf (f, " inlinable");
+       if (node->global.insns && node->global.insns != node->local.self_insns)
+ 	fprintf (f, " %i insns after inlining", node->global.insns);
+       if (node->global.cloned_times > 1)
+ 	fprintf (f, " cloned %ix", node->global.cloned_times);
+       if (node->global.calls)
+ 	fprintf (f, " %i calls", node->global.calls);
+ 
        fprintf (f, "\n  called by :");
        for (edge = node->callers; edge; edge = edge->next_caller)
! 	{
! 	  fprintf (f, "%s ",
! 		   IDENTIFIER_POINTER (DECL_NAME (edge->caller->decl)));
! 	  if (edge->inline_call)
! 	    fprintf(f, "(inlined) ");
! 	}
  
        fprintf (f, "\n  calls: ");
        for (edge = node->callees; edge; edge = edge->next_callee)
! 	{
! 	  fprintf (f, "%s ",
! 		   IDENTIFIER_POINTER (DECL_NAME (edge->callee->decl)));
! 	  if (edge->inline_call)
! 	    fprintf(f, "(inlined) ");
! 	}
        fprintf (f, "\n");
      }
  }
Index: cgraph.h
===================================================================
RCS file: /cvs/gcc/gcc/gcc/cgraph.h,v
retrieving revision 1.8
diff -c -3 -p -r1.8 cgraph.h
*** cgraph.h	1 Jul 2003 12:17:51 -0000	1.8
--- cgraph.h	1 Jul 2003 20:56:02 -0000
*************** struct cgraph_local_info GTY(())
*** 30,40 ****
    /* Set when function function is visible in current compilation unit only
       and it's address is never taken.  */
    bool local;
!   /* Set when function is small enough to be inlinable many times.  */
!   bool inline_many;
!   /* Set when function can be inlined once (false only for functions calling
!      alloca, using varargs and so on).  */
!   bool can_inline_once;
  };
  
  /* Information about the function that needs to be computed globally
--- 30,42 ----
    /* Set when function function is visible in current compilation unit only
       and it's address is never taken.  */
    bool local;
! 
!   /* False when there is something making inlining impossible (such as va_arg) */
!   bool inlinable;
!   /* True when function should be inlined independently on it's size.  */
!   bool disgread_inline_limits;
!   /* Size of the function before inlining.  */
!   int self_insns;
  };
  
  /* Information about the function that needs to be computed globally
*************** struct cgraph_global_info GTY(())
*** 44,49 ****
--- 46,65 ----
  {
    /* Set when the function will be inlined exactly once.  */
    bool inline_once;
+ 
+   /* Estimated size of the function after inlining.  */
+   int insns;
+ 
+   /* Number of direct calls not inlined into the function body.  */
+   int calls;
+ 
+   /* Number of times given function will be cloned during output.  */
+   int cloned_times;
+ 
+   /* Set to true for all reachable functions before inlining is decided.
+      Once we inline all calls to the function and the function is local,
+      it is set to false.  */
+   bool will_be_output;
  };
  
  /* Information about the function that is propagated by the RTL backend.
*************** struct cgraph_node GTY(())
*** 73,78 ****
--- 89,96 ----
    struct cgraph_node *nested;
    /* Pointer to the next function with same origin, if any.  */
    struct cgraph_node *next_nested;
+   /* Unique id of the node.  */
+   int uid;
    PTR GTY ((skip (""))) aux;
  
    /* Set when function must be output - it is externally visible
*************** struct cgraph_edge GTY(())
*** 98,103 ****
--- 116,122 ----
    struct cgraph_node *callee;
    struct cgraph_edge *next_caller;
    struct cgraph_edge *next_callee;
+   bool inline_call;
  };
  
  /* The cgraph_varpool data strutcture.
*************** struct cgraph_varpool_node GTY(())
*** 119,124 ****
--- 138,144 ----
  
  extern GTY(()) struct cgraph_node *cgraph_nodes;
  extern GTY(()) int cgraph_n_nodes;
+ extern GTY(()) int cgraph_max_uid;
  extern bool cgraph_global_info_ready;
  extern GTY(()) struct cgraph_node *cgraph_nodes_queue;
  
*************** void cgraph_finalize_compilation_unit	PA
*** 150,154 ****
--- 170,175 ----
  void cgraph_create_edges		PARAMS ((tree, tree));
  void cgraph_optimize			PARAMS ((void));
  void cgraph_mark_needed_node		PARAMS ((struct cgraph_node *, int));
+ bool cgraph_inline_p			PARAMS ((tree, tree));
  
  #endif  /* GCC_CGRAPH_H  */
Index: cgraphunit.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/cgraphunit.c,v
retrieving revision 1.8
diff -c -3 -p -r1.8 cgraphunit.c
*** cgraphunit.c	1 Jul 2003 12:17:51 -0000	1.8
--- cgraphunit.c	1 Jul 2003 20:56:03 -0000
*************** Software Foundation, 59 Temple Place - S
*** 34,48 ****
  #include "target.h"
  #include "cgraph.h"
  #include "diagnostic.h"
  
  static void cgraph_expand_functions PARAMS ((void));
  static void cgraph_mark_functions_to_output PARAMS ((void));
  static void cgraph_expand_function PARAMS ((struct cgraph_node *));
  static tree record_call_1 PARAMS ((tree *, int *, void *));
  static void cgraph_mark_local_functions PARAMS ((void));
- static void cgraph_mark_functions_to_inline_once PARAMS ((void));
  static void cgraph_optimize_function PARAMS ((struct cgraph_node *));
  
  /* Analyze function once it is parsed.  Set up the local information
     available - create cgraph edges for function calls via BODY.  */
  
--- 34,57 ----
  #include "target.h"
  #include "cgraph.h"
  #include "diagnostic.h"
+ #include "params.h"
+ #include "fibheap.h"
+ 
+ #define INSNS_PER_CALL 10
  
  static void cgraph_expand_functions PARAMS ((void));
  static void cgraph_mark_functions_to_output PARAMS ((void));
  static void cgraph_expand_function PARAMS ((struct cgraph_node *));
  static tree record_call_1 PARAMS ((tree *, int *, void *));
  static void cgraph_mark_local_functions PARAMS ((void));
  static void cgraph_optimize_function PARAMS ((struct cgraph_node *));
  
+ /* Statistics we collect about inlining algorithm.  */
+ static int ncalls_inlined;
+ static int nfunctions_inlined;
+ static int initial_insns;
+ static int overall_insns;
+ 
  /* Analyze function once it is parsed.  Set up the local information
     available - create cgraph edges for function calls via BODY.  */
  
*************** cgraph_finalize_function (decl, body)
*** 74,88 ****
        cgraph_mark_needed_node (node, 1);
      }
  
-   if (!node->needed && !DECL_COMDAT (node->decl))
-     node->local.can_inline_once = tree_inlinable_function_p (decl, 1);
-   else
-     node->local.can_inline_once = 0;
-   if (flag_inline_trees)
-     node->local.inline_many = tree_inlinable_function_p (decl, 0);
-   else
-     node->local.inline_many = 0;
- 
    (*debug_hooks->deferred_inline_function) (decl);
  }
  
--- 85,90 ----
*************** cgraph_finalize_compilation_unit ()
*** 186,191 ****
--- 188,201 ----
        /* First kill forward declaration so reverse inling works properly.  */
        cgraph_create_edges (decl, DECL_SAVED_TREE (decl));
  
+       node->local.inlinable = tree_inlinable_function_p (decl, 1);
+       if (!DECL_ESTIMATED_INSNS (decl))
+ 	DECL_ESTIMATED_INSNS (decl) = estimate_num_insns (decl);
+       node->local.self_insns = DECL_ESTIMATED_INSNS (decl);
+       if (node->local.inlinable)
+ 	node->local.disgread_inline_limits
+ 	  = (*lang_hooks.tree_inlining.disregard_inline_limits) (decl);
+ 
        for (edge = node->callees; edge; edge = edge->next_callee)
  	{
  	  if (!edge->callee->reachable)
*************** cgraph_mark_functions_to_output ()
*** 230,245 ****
    for (node = cgraph_nodes; node; node = node->next)
      {
        tree decl = node->decl;
  
        /* We need to output all local functions that are used and not
  	 always inlined, as well as those that are reachable from
  	 outside the current compilation unit.  */
        if (DECL_SAVED_TREE (decl)
  	  && (node->needed
! 	      || (!node->local.inline_many && !node->global.inline_once
! 		  && node->reachable)
! 	      || (DECL_ASSEMBLER_NAME_SET_P (decl)
! 	          && TREE_SYMBOL_REFERENCED (DECL_ASSEMBLER_NAME (decl))))
  	  && !TREE_ASM_WRITTEN (decl) && !node->origin
  	  && !DECL_EXTERNAL (decl))
  	node->output = 1;
--- 240,259 ----
    for (node = cgraph_nodes; node; node = node->next)
      {
        tree decl = node->decl;
+       struct cgraph_edge *e;
+       if (node->output)
+ 	abort ();
+ 
+       for (e = node->callers; e; e = e->next_caller)
+ 	if (!e->inline_call)
+ 	  break;
  
        /* We need to output all local functions that are used and not
  	 always inlined, as well as those that are reachable from
  	 outside the current compilation unit.  */
        if (DECL_SAVED_TREE (decl)
  	  && (node->needed
! 	      || (e && node->reachable))
  	  && !TREE_ASM_WRITTEN (decl) && !node->origin
  	  && !DECL_EXTERNAL (decl))
  	node->output = 1;
*************** cgraph_optimize_function (node)
*** 254,259 ****
--- 268,275 ----
  {
    tree decl = node->decl;
  
+   /* optimize_inline_calls avoids inlining of current_function_decl.  */
+   current_function_decl = 0;
    if (flag_inline_trees)
      optimize_inline_calls (decl);
    if (node->nested)
*************** cgraph_expand_function (node)
*** 270,275 ****
--- 286,292 ----
       struct cgraph_node *node;
  {
    tree decl = node->decl;
+   struct cgraph_edge *e;
  
    announce_function (decl);
  
*************** cgraph_expand_function (node)
*** 279,319 ****
       via lang_expand_decl_stmt.  */
    (*lang_hooks.callgraph.expand_function) (decl);
  
!   /* When we decided to inline the function once, we never ever should
!      need to output it separately.  */
!   if (node->global.inline_once)
!     abort ();
!   if (!node->local.inline_many
!       || !node->callers)
!     DECL_SAVED_TREE (decl) = NULL;
    current_function_decl = NULL;
  }
  
! 
! /* Expand all functions that must be output. 
!   
!    Attempt to topologically sort the nodes so function is output when
!    all called functions are already assembled to allow data to be
!    propagated across the callgraph.  Use a stack to get smaller distance
!    between a function and it's callees (later we may choose to use a more
!    sophisticated algorithm for function reordering; we will likely want
!    to use subsections to make the output functions appear in top-down
!    order.  */
! 
! static void
! cgraph_expand_functions ()
  {
    struct cgraph_node *node, *node2;
-   struct cgraph_node **stack =
-     xcalloc (sizeof (struct cgraph_node *), cgraph_n_nodes);
-   struct cgraph_node **order =
-     xcalloc (sizeof (struct cgraph_node *), cgraph_n_nodes);
    int stack_size = 0;
    int order_pos = 0;
    struct cgraph_edge *edge, last;
-   int i;
  
!   cgraph_mark_functions_to_output ();
  
    /* We have to deal with cycles nicely, so use a depth first traversal
       output algorithm.  Ignore the fact that some functions won't need
--- 296,321 ----
       via lang_expand_decl_stmt.  */
    (*lang_hooks.callgraph.expand_function) (decl);
  
!   for (e = node->callers; e; e = e->next_caller)
!     if (e->inline_call)
!       break;
!   if (!e)
!     DECL_SAVED_TREE (decl) = NULL;
    current_function_decl = NULL;
  }
  
! /* Fill array order with all nodes with output flag set in the reverse
!    topological order.  */
! static int
! cgraph_postorder (struct cgraph_node **order)
  {
    struct cgraph_node *node, *node2;
    int stack_size = 0;
    int order_pos = 0;
    struct cgraph_edge *edge, last;
  
!   struct cgraph_node **stack =
!     xcalloc (sizeof (struct cgraph_node *), cgraph_n_nodes);
  
    /* We have to deal with cycles nicely, so use a depth first traversal
       output algorithm.  Ignore the fact that some functions won't need
*************** cgraph_expand_functions ()
*** 359,486 ****
  	      }
  	  }
        }
!   for (i = order_pos - 1; i >= 0; i--)
      {
!       node = order[i];
!       if (node->output)
  	{
! 	  if (!node->reachable)
! 	    abort ();
! 	  node->output = 0;
! 	  cgraph_expand_function (node);
  	}
      }
    free (stack);
!   free (order);
  }
  
! /* Mark all local functions.
!    We can not use node->needed directly as it is modified during
!    execution of cgraph_optimize.  */
  
  static void
! cgraph_mark_local_functions ()
  {
    struct cgraph_node *node;
  
!   if (!quiet_flag)
!     fprintf (stderr, "\n\nMarking local functions:");
  
-   /* Figure out functions we want to assemble.  */
    for (node = cgraph_nodes; node; node = node->next)
      {
!       node->local.local = (!node->needed
! 		           && DECL_SAVED_TREE (node->decl)
! 			   && !DECL_COMDAT (node->decl)
! 		           && !TREE_PUBLIC (node->decl));
!       if (node->local.local)
  	announce_function (node->decl);
      }
  }
  
! /* Decide what function should be inlined because they are invoked once
!    (so inlining won't result in duplication of the code).  */
! 
  static void
! cgraph_mark_functions_to_inline_once ()
  {
!   struct cgraph_node *node, *node1;
  
!   if (!quiet_flag)
!     fprintf (stderr, "\n\nMarking functions to inline once:");
  
-   /* Now look for function called only once and mark them to inline.
-      From this point number of calls to given function won't grow.  */
    for (node = cgraph_nodes; node; node = node->next)
      {
        if (node->callers && !node->callers->next_caller && !node->needed
! 	  && node->local.can_inline_once)
  	{
  	  bool ok = true;
  
  	  /* Verify that we won't duplicate the caller.  */
  	  for (node1 = node->callers->caller;
! 	       node1->local.inline_many
! 	       && node1->callers
! 	       && ok;
! 	       node1 = node1->callers->caller)
  	    if (node1->callers->next_caller || node1->needed)
  	      ok = false;
  	  if (ok)
  	    {
! 	      node->global.inline_once = true;
! 	      announce_function (node->decl);
  	    }
  	}
      }
  }
  
  
  /* Perform simple optimizations based on callgraph.  */
  
  void
  cgraph_optimize ()
  {
-   struct cgraph_node *node;
-   bool changed = true;
- 
    cgraph_mark_local_functions ();
  
!   cgraph_mark_functions_to_inline_once ();
  
    cgraph_global_info_ready = true;
    if (!quiet_flag)
      fprintf (stderr, "\n\nAssembling functions:");
  
!   /* Output everything.  
!      ??? Our inline heuristic may decide to not inline functions previously
!      marked as inlinable thus adding new function bodies that must be output.
!      Later we should move all inlining decisions to callgraph code to make
!      this impossible.  */
    cgraph_expand_functions ();
-   if (!quiet_flag)
-     fprintf (stderr, "\n\nAssembling functions that failed to inline:");
-   while (changed && !errorcount && !sorrycount)
-     {
-       changed = false;
-       for (node = cgraph_nodes; node; node = node->next)
- 	{
- 	  tree decl = node->decl;
- 	  if (!node->origin
- 	      && !TREE_ASM_WRITTEN (decl)
- 	      && DECL_SAVED_TREE (decl)
- 	      && !DECL_EXTERNAL (decl))
- 	    {
- 	      struct cgraph_edge *edge;
- 
- 	      for (edge = node->callers; edge; edge = edge->next_caller)
- 		if (TREE_ASM_WRITTEN (edge->caller->decl))
- 		  {
- 		    changed = true;
- 		    cgraph_expand_function (node);
- 		    break;
- 		  }
- 	    }
- 	}
-     }
  }
--- 359,1043 ----
  	      }
  	  }
        }
!   free (stack);
!   return order_pos;
! }
! 
! #define INLINED_TIMES(node) ((size_t)(node)->aux)
! #define SET_INLINED_TIMES(node,times) ((node)->aux = (void *)(times))
! 
! /* Return list of nodes we decided to inline NODE into, set their output
!    flag and compute INLINED_TIMES. 
! 
!    We do simple backtracing to get INLINED_TIMES right.  This should not be
!    expensive as we limit the amount of inlining.  Alternatively we may first
!    discover set of nodes, topologically sort these and propagate
!    INLINED_TIMES  */
! 
! static int
! cgraph_inlined_into (struct cgraph_node *node, struct cgraph_node **array)
! {
!   int nfound = 0;
!   struct cgraph_edge **stack;
!   struct cgraph_edge *e, *e1;
!   int sp;
!   int i;
! 
!   /* Fast path: since we traverse in mostly topological order, we will likely
!      find no edges.  */
!   for (e = node->callers; e; e = e->next_caller)
!     if (e->inline_call)
!       break;
! 
!   if (!e)
!     return 0;
! 
!   /* Allocate stack for back-tracking up callgraph.  */
!   stack = xmalloc ((cgraph_n_nodes + 1) * sizeof (struct cgraph_edge));
!   sp = 0;
! 
!   /* Push the first edge on to the stack.  */
!   stack[sp++] = e;
! 
!   while (sp)
      {
!       struct cgraph_node *caller;
! 
!       /* Look at the edge on the top of the stack.  */
!       e = stack[sp - 1];
!       caller = e->caller;
! 
!       /* Check if the caller destination has been visited yet.  */
!       if (!caller->output)
  	{
! 	  array[nfound++] = e->caller;
! 	  /* Mark that we have visited the destination.  */
! 	  caller->output = true;
! 	  SET_INLINED_TIMES (caller, 0);
! 	}
!       SET_INLINED_TIMES (caller, INLINED_TIMES (caller) + 1);
! 
!       for (e1 = caller->callers; e1; e1 = e1->next_caller)
! 	if (e1->inline_call)
! 	  break;
!       if (e1)
! 	stack[sp++] = e1;
!       else
! 	{
! 	  while (true)
! 	    {
! 	      for (e1 = e->next_caller; e1; e1 = e1->next_caller)
! 		if (e1->inline_call)
! 		  break;
! 
! 	      if (e1)
! 		{
! 		  stack[sp - 1] = e1;
! 		  break;
! 		}
! 	      else
! 		{
! 		  sp--;
! 		  if (!sp)
! 		    break;
! 		  e = stack[sp - 1];
! 		}
! 	    }
  	}
      }
+ 
    free (stack);
! 
! 
!   if (!quiet_flag)
!     {
!       fprintf (stderr, "\nFound inline predecesors of ");
!       announce_function (node->decl);
!       fprintf (stderr, ":");
!       for (i = 0; i < nfound; i++)
! 	{
! 	  announce_function (array[i]->decl);
! 	  if (INLINED_TIMES (array[i]) != 1)
! 	    fprintf (stderr, " (%i times)", INLINED_TIMES (array[i]));
! 	}
!       fprintf (stderr, "\n");
!     }
! 
!   return nfound;
  }
  
! /* Return list of nodes we decided to inline into NODE, set their output
!    flag and compute INLINED_TIMES. 
! 
!    This function is identical to cgraph_inlined_into with callers and callees
!    nodes swapped.  */
  
+ static int
+ cgraph_inlined_callees (struct cgraph_node *node, struct cgraph_node **array)
+ {
+   int nfound = 0;
+   struct cgraph_edge **stack;
+   struct cgraph_edge *e, *e1;
+   int sp;
+   int i;
+ 
+   /* Fast path: since we traverse in mostly topological order, we will likely
+      find no edges.  */
+   for (e = node->callees; e; e = e->next_callee)
+     if (e->inline_call)
+       break;
+ 
+   if (!e)
+     return 0;
+ 
+   /* Allocate stack for back-tracking up callgraph.  */
+   stack = xmalloc ((cgraph_n_nodes + 1) * sizeof (struct cgraph_edge));
+   sp = 0;
+ 
+   /* Push the first edge on to the stack.  */
+   stack[sp++] = e;
+ 
+   while (sp)
+     {
+       struct cgraph_node *callee;
+ 
+       /* Look at the edge on the top of the stack.  */
+       e = stack[sp - 1];
+       callee = e->callee;
+ 
+       /* Check if the callee destination has been visited yet.  */
+       if (!callee->output)
+ 	{
+ 	  array[nfound++] = e->callee;
+ 	  /* Mark that we have visited the destination.  */
+ 	  callee->output = true;
+ 	  SET_INLINED_TIMES (callee, 0);
+ 	}
+       SET_INLINED_TIMES (callee, INLINED_TIMES (callee) + 1);
+ 
+       for (e1 = callee->callees; e1; e1 = e1->next_callee)
+ 	if (e1->inline_call)
+ 	  break;
+       if (e1)
+ 	stack[sp++] = e1;
+       else
+ 	{
+ 	  while (true)
+ 	    {
+ 	      for (e1 = e->next_callee; e1; e1 = e1->next_callee)
+ 		if (e1->inline_call)
+ 		  break;
+ 
+ 	      if (e1)
+ 		{
+ 		  stack[sp - 1] = e1;
+ 		  break;
+ 		}
+ 	      else
+ 		{
+ 		  sp--;
+ 		  if (!sp)
+ 		    break;
+ 		  e = stack[sp - 1];
+ 		}
+ 	    }
+ 	}
+     }
+ 
+   free (stack);
+ 
+ 
+   if (!quiet_flag)
+     {
+       fprintf (stderr, "\nFound inline succcessors of ");
+       announce_function (node->decl);
+       fprintf (stderr, ":");
+       for (i = 0; i < nfound; i++)
+ 	{
+ 	  announce_function (array[i]->decl);
+ 	  if (INLINED_TIMES (array[i]) != 1)
+ 	    fprintf (stderr, " (%i times)", INLINED_TIMES (array[i]));
+ 	}
+       fprintf (stderr, "\n");
+     }
+ 
+   return nfound;
+ }
+ 
+ /* Estimate size of the function after inlining WHAT into TO.  */
+ static int
+ cgraph_estimate_size_after_inlining (int times,
+ 				     struct cgraph_node *to,
+ 				     struct cgraph_node *what)
+ {
+   return (what->global.insns - INSNS_PER_CALL) * times + to->global.insns;
+ }
+ 
+ /* Estimate the growth caused by inlining NODE into all callees.  */
+ static int
+ cgraph_estimate_growth (struct cgraph_node *node)
+ {
+   int growth = 0;
+   int calls_saved = 0;
+   int clones_added = 0;
+   struct cgraph_edge *e;
+ 
+   for (e = node->callers; e; e = e->next_caller)
+     if (!e->inline_call)
+       {
+ 	growth += ((cgraph_estimate_size_after_inlining (1, e->caller, node)
+ 		   - e->caller->global.insns) * e->caller->global.cloned_times);
+ 	calls_saved += e->caller->global.cloned_times;
+ 	clones_added += e->caller->global.cloned_times;
+       }
+   
+   /* ??? Wrong for self recursive functions or cases where we decide to not
+      inline for different reasons, but it is not big deal as in that case
+      we will keep the body around, but we will also avoid some inlining.  */
+   if (!node->needed && !node->origin && !DECL_EXTERNAL (node->decl))
+     growth -= node->global.insns, clones_added --;
+ 
+   if (!calls_saved)
+     calls_saved = 1;
+ 
+   return growth;
+ }
+ 
+ /* Update insn sizes after inlining WHAT into TO that is already inlined into
+    all nodes in INLINED array.  */
  static void
! cgraph_mark_inline (struct cgraph_node *to,
! 		    struct cgraph_node *what,
! 		    struct cgraph_node **inlined, int ninlined,
! 		    struct cgraph_node **inlined_callees, int ninlined_callees)
! {
!   int i;
!   int times = 0;
!   int clones = 0;
!   struct cgraph_edge *e;
!   bool called = false;
! 
!   for (e = what->callers; e; e = e->next_caller)
!     {
!       if (e->caller == to)
! 	{
! 	  if (e->inline_call)
! 	    abort ();
! 	  e->inline_call = true;
! 	  times++;
! 	  clones += e->caller->global.cloned_times;
! 	}
!       else if (!e->inline_call)
! 	called = true;
!     }
!   if (!times)
!     abort ();
!   ncalls_inlined += times;
! 
!   if (to->global.will_be_output)
!     overall_insns -= to->global.insns;
!   to->global.insns = cgraph_estimate_size_after_inlining (times, to, what);
!   if (to->global.will_be_output)
!     overall_insns += to->global.insns;
! 
!   to->global.calls += (what->global.calls - 1) * times;
!   if (!called && !what->needed && !what->origin && !DECL_EXTERNAL (what->decl))
!     {
!       if (!what->global.will_be_output)
! 	abort ();
!       clones--;
!       nfunctions_inlined++;
!       what->global.will_be_output = 0;
!       overall_insns -= what->global.insns;
!     }
!   what->global.cloned_times += clones;
!   if (to->global.calls < 0)
!     abort ();
!   for (i = 0; i < ninlined; i++)
!     {
!       if (inlined[i]->global.will_be_output)
! 	overall_insns -= inlined[i]->global.insns;
!       inlined[i]->global.insns =
! 	cgraph_estimate_size_after_inlining (INLINED_TIMES (inlined[i]) *
! 					     times,
! 					     inlined[i], what);
!       if (inlined[i]->global.will_be_output)
! 	overall_insns += inlined[i]->global.insns;
!       inlined[i]->global.calls +=
! 	(what->global.calls - 1) *INLINED_TIMES (inlined[i]) * times;
!       if (inlined[i]->global.calls < 0)
! 	abort ();
!     }
!   for (i = 0; i < ninlined_callees; i++)
!     {
!       inlined_callees[i]->global.cloned_times += INLINED_TIMES (inlined_callees[i]) * clones;
!     }
! }
! 
! /* Return false when inlining WHAT into TO is not good idea as it would cause
!    too large growth of function bodies.  */
! static bool
! cgraph_check_inline_limits (struct cgraph_node *to,
! 			    struct cgraph_node *what,
! 			    struct cgraph_node **inlined, int ninlined)
! {
!   int i;
!   int times = 0;
!   struct cgraph_edge *e;
!   int newsize;
! 
!   for (e = to->callees; e; e = e->next_callee)
!     if (e->callee == what)
!       times++;
! 
!   newsize = cgraph_estimate_size_after_inlining (times, to, what);
!   if (newsize > PARAM_VALUE (PARAM_LARGE_FUNCTION_INSNS)
!       && newsize > to->local.self_insns * PARAM_VALUE (PARAM_LARGE_FUNCTION_GROWTH) / 100)
!     return false;
!   for (i = 0; i < ninlined; i++)
!     {
!       newsize =
! 	cgraph_estimate_size_after_inlining (INLINED_TIMES (inlined[i]) *
! 					     times, inlined[i], what);
!       if (newsize > PARAM_VALUE (PARAM_LARGE_FUNCTION_INSNS)
! 	  && newsize >
! 	  inlined[i]->local.self_insns * PARAM_VALUE (PARAM_LARGE_FUNCTION_GROWTH)  / 100)
! 	return false;
!     }
!   return true;
! }
! 
! /* Return true when function N is small enought to be inlined.  */
! static bool
! cgraph_default_inline_p (struct cgraph_node *n)
  {
+   if (!DECL_INLINE (n->decl) || !DECL_SAVED_TREE (n->decl))
+     return false;
+   if (DID_INLINE_FUNC (n->decl))
+     return n->global.insns < MAX_INLINE_INSNS_AUTO;
+   else
+     return n->global.insns < MAX_INLINE_INSNS_SINGLE;
+ }
+ 
+ /* We use greedy algorithm for inlining of small functions:
+    All inline candidates are put into prioritized heap based on estimated
+    growth of the overall number of instructions and then update the estimates.
+    
+    INLINED and INLINED_CALEES are just pointers to arrays large enought
+    to be passed to cgraph_inlined_into and cgraph_inlined_callees.  */
+ 
+ static void
+ cgraph_decide_inling_of_small_functions (struct cgraph_node **inlined,
+ 					 struct cgraph_node **inlined_callees)
+ {
+   int i;
    struct cgraph_node *node;
+   fibheap_t heap = fibheap_new ();
+   struct fibnode **heap_node =
+     xcalloc (sizeof (struct fibnode *), cgraph_max_uid);
+   int ninlined, ninlined_callees;
  
!   /* Put all inline candidates into the heap.  */
  
    for (node = cgraph_nodes; node; node = node->next)
      {
!       struct cgraph_edge *e;
! 
!       if (!node->local.inlinable || !node->callers
! 	  || !cgraph_default_inline_p (node))
! 	continue;
! 
!       /* Rule out always_inline functions we dealt with earler.  */
!       for (e = node->callers; e; e = e->next_caller)
! 	if (e->inline_call)
! 	  break;
!       if (e)
! 	continue;
!       heap_node[node->uid] =
! 	fibheap_insert (heap, cgraph_estimate_growth (node), node);
!     }
! 
!   if (!quiet_flag)
!     fprintf (stderr, "\n\nDeciding on inlining: ");
!   while ((node = fibheap_extract_min (heap))
! 	 && overall_insns <=
! 	 initial_insns * PARAM_VALUE (PARAM_INLINE_UNIT_GROWTH) / 100)
!     {
!       struct cgraph_edge *e;
! 
!       heap_node [node->uid] = NULL;
!       if (!cgraph_default_inline_p (node))
! 	continue;
!       ninlined_callees = cgraph_inlined_callees (node, inlined_callees);
!       for (e = node->callers; e; e = e->next_caller)
! 	if (!e->inline_call && e->caller != node)
! 	  {
! 	    ninlined = cgraph_inlined_into (e->caller, inlined);
! 	    if (e->callee->output
! 		|| !cgraph_check_inline_limits (e->caller, node, inlined,
! 						ninlined))
! 	      {
! 		for (i = 0; i < ninlined; i++)
! 		  inlined[i]->output = 0, node->aux = 0;
! 		continue;
! 	      }
! 	    cgraph_mark_inline (e->caller, node, inlined, ninlined,
! 				inlined_callees, ninlined_callees);
! 	    if (heap_node [e->caller->uid])
! 	      fibheap_replace_key (heap, heap_node [e->caller->uid],
! 				   cgraph_estimate_growth (e->caller));
! 
! 	    /* Size of the functions we updated into has changed, so update
! 	       the keys.  */
! 	    for (i = 0; i < ninlined; i++)
! 	      {
! 		inlined[i]->output = 0, node->aux = 0;
! 		if (heap_node[inlined[i]->uid])
! 		  fibheap_replace_key (heap, heap_node [inlined[i]->uid],
! 				       cgraph_estimate_growth (inlined[i]));
! 	      }
! 	  }
!       if (!quiet_flag)
  	announce_function (node->decl);
+ 
+       /* Similarly all functions called by function we just inlined
+          are now called more times; update keys.  */
+      
+       for (e = node->callees; e; e = e->next_callee)
+ 	if (!e->inline_call && heap_node [e->callee->uid])
+ 	  fibheap_replace_key (heap, heap_node [e->callee->uid],
+ 			       cgraph_estimate_growth (e->callee));
+ 
+       for (i = 0; i < ninlined_callees; i++)
+ 	{
+ 	  struct cgraph_edge *e;
+ 
+ 	  for (e = inlined_callees[i]->callees; e; e = e->next_callee)
+ 	    if (!e->inline_call && heap_node [e->callee->uid])
+ 	      fibheap_replace_key (heap, heap_node [e->callee->uid],
+ 				   cgraph_estimate_growth (e->callee));
+ 
+ 	  inlined_callees[i]->output = 0, node->aux = 0;
+ 	}
      }
+   fibheap_delete (heap);
+   free (heap_node);
  }
  
! /* Decide on the inlining.  We do so in the topological order to avoid
!    expenses on updating datastructures.  */
  static void
! cgraph_decide_inlining (void)
  {
!   struct cgraph_node *node;
!   int nnodes;
!   struct cgraph_node **order =
!     xcalloc (sizeof (struct cgraph_node *), cgraph_n_nodes);
!   struct cgraph_node **inlined =
!     xcalloc (sizeof (struct cgraph_node *), cgraph_n_nodes);
!   struct cgraph_node **inlined_callees =
!     xcalloc (sizeof (struct cgraph_node *), cgraph_n_nodes);
!   int ninlined;
!   int ninlined_callees;
!   int i, y;
  
!   for (node = cgraph_nodes; node; node = node->next)
!     {
!       int ncalls = 0;
!       struct cgraph_edge *e;
! 
!       node->global.insns = node->local.self_insns;
!       for (e = node->callees; e; e = e->next_callee)
! 	ncalls++;
!       node->global.calls = ncalls;
!       if (!DECL_EXTERNAL (node->decl))
! 	{
! 	  node->global.cloned_times = 1;
! 	  initial_insns += node->local.self_insns;
! 	  node->global.will_be_output = true;
! 	}
!     }
!   overall_insns = initial_insns;
! 
!   nnodes = cgraph_postorder (order);
  
    for (node = cgraph_nodes; node; node = node->next)
+     node->aux = 0;
+ 
+   /* In the first pass mark all always_inline edges.  Do this with a priority
+      so no our decisions makes this impossible.  */
+   for (i = nnodes - 1; i >= 0; i--)
      {
+       struct cgraph_edge *e;
+ 
+       node = order[i];
+ 
+       for (e = node->callees; e; e = e->next_callee)
+ 	if (e->callee->local.disgread_inline_limits)
+ 	  break;
+       if (!e)
+ 	continue;
+       ninlined = cgraph_inlined_into (order[i], inlined);
+       for (; e; e = e->next_callee)
+ 	{
+ 	  if (e->inline_call || !e->callee->local.disgread_inline_limits)
+ 	    continue;
+ 	  if (e->callee->output || e->callee == node)
+ 	    continue;
+ 	  ninlined_callees =
+ 	    cgraph_inlined_callees (e->callee, inlined_callees);
+ 	  cgraph_mark_inline (node, e->callee, inlined, ninlined,
+ 			      inlined_callees, ninlined_callees);
+ 	  for (y = 0; y < ninlined_callees; y++)
+ 	    inlined_callees[y]->output = 0, node->aux = 0;
+ 	}
+       for (y = 0; y < ninlined; y++)
+ 	inlined[y]->output = 0, node->aux = 0;
+     }
+ 
+   cgraph_decide_inling_of_small_functions (inlined, inlined_callees);
+ 
+   if (!quiet_flag)
+     fprintf (stderr, "\n\nFunctions to inline once:");
+ 
+   /* And finally decide what functions are called once.  */
+ 
+   for (i = nnodes - 1; i >= 0; i--)
+     {
+       node = order[i];
+ 
        if (node->callers && !node->callers->next_caller && !node->needed
! 	  && node->local.inlinable && !node->callers->inline_call
! 	  && !DECL_EXTERNAL (node->decl) && !DECL_COMDAT (node->decl))
  	{
  	  bool ok = true;
+ 	  struct cgraph_node *node1;
  
  	  /* Verify that we won't duplicate the caller.  */
  	  for (node1 = node->callers->caller;
! 	       node1->callers && node1->callers->inline_call
! 	       && ok; node1 = node1->callers->caller)
  	    if (node1->callers->next_caller || node1->needed)
  	      ok = false;
  	  if (ok)
  	    {
! 	      ninlined = cgraph_inlined_into (node->callers->caller, inlined);
! 	      if (cgraph_check_inline_limits
! 		  (node->callers->caller, node, inlined, ninlined))
! 		{
! 		  ninlined_callees =
! 		    cgraph_inlined_callees (node, inlined_callees);
! 		  cgraph_mark_inline (node->callers->caller, node, inlined,
! 				      ninlined, inlined_callees,
! 				      ninlined_callees);
! 		  announce_function (node->decl);
! 		  for (y = 0; y < ninlined_callees; y++)
! 		    inlined_callees[y]->output = 0, node->aux = 0;
! 		}
! 	      for (y = 0; y < ninlined; y++)
! 		inlined[y]->output = 0, node->aux = 0;
  	    }
  	}
      }
+ 
+   if (!quiet_flag)
+     fprintf (stderr,
+ 	     "Inlined %i calls, elliminated %i functions, %i insns turned to %i insns.\n",
+ 	     ncalls_inlined, nfunctions_inlined, initial_insns,
+ 	     overall_insns);
+   free (order);
+   free (inlined);
+   free (inlined_callees);
  }
  
+ /* Return true when CALLER_DECL should be inlined into CALLEE_DECL.  */
+ bool
+ cgraph_inline_p (tree caller_decl, tree callee_decl)
+ {
+   struct cgraph_node *caller = cgraph_node (caller_decl);
+   struct cgraph_node *callee = cgraph_node (callee_decl);
+   struct cgraph_edge *e;
+ 
+   for (e = caller->callees; e; e = e->next_callee)
+     if (e->callee == callee)
+       return e->inline_call;
+   return false;
+ }
+ 
+ /* Expand all functions that must be output. 
+   
+    Attempt to topologically sort the nodes so function is output when
+    all called functions are already assembled to allow data to be
+    propagated accross the callgraph.  Use a stack to get smaller distance
+    between a function and it's callees (later we may choose to use a more
+    sophisticated algorithm for function reordering; we will likely want
+    to use subsections to make the output functions appear in top-down
+    order).  */
+ 
+ static void
+ cgraph_expand_functions ()
+ {
+   struct cgraph_node *node;
+   struct cgraph_node **order =
+     xcalloc (sizeof (struct cgraph_node *), cgraph_n_nodes);
+   int order_pos = 0;
+   int i;
+ 
+   cgraph_mark_functions_to_output ();
+ 
+   order_pos = cgraph_postorder (order);
+ 
+   for (i = order_pos - 1; i >= 0; i--)
+     {
+       node = order[i];
+       if (node->output)
+ 	{
+ 	  if (!node->reachable)
+ 	    abort ();
+ 	  node->output = 0;
+ 	  cgraph_expand_function (node);
+ 	}
+     }
+   free (order);
+ }
+ 
+ /* Mark all local functions.
+    We can not use node->needed directly as it is modified during
+    execution of cgraph_optimize.  */
+ 
+ static void
+ cgraph_mark_local_functions ()
+ {
+   struct cgraph_node *node;
+ 
+   if (!quiet_flag)
+     fprintf (stderr, "\n\nMarking local functions:");
+ 
+   /* Figure out functions we want to assemble.  */
+   for (node = cgraph_nodes; node; node = node->next)
+     {
+       node->local.local = (!node->needed
+ 		           && DECL_SAVED_TREE (node->decl)
+ 			   && !DECL_COMDAT (node->decl)
+ 		           && !TREE_PUBLIC (node->decl));
+       if (node->local.local)
+ 	announce_function (node->decl);
+     }
+ }
  
  /* Perform simple optimizations based on callgraph.  */
  
  void
  cgraph_optimize ()
  {
    cgraph_mark_local_functions ();
  
!   cgraph_decide_inlining ();
  
    cgraph_global_info_ready = true;
    if (!quiet_flag)
      fprintf (stderr, "\n\nAssembling functions:");
  
!   /* Output everything.   */
    cgraph_expand_functions ();
  }
Index: params.def
===================================================================
RCS file: /cvs/gcc/gcc/gcc/params.def,v
retrieving revision 1.25
diff -c -3 -p -r1.25 params.def
*** params.def	4 Jun 2003 07:51:41 -0000	1.25
--- params.def	1 Jul 2003 20:56:03 -0000
*************** DEFPARAM(PARAM_MAX_PENDING_LIST_LENGTH,
*** 151,156 ****
--- 151,169 ----
  	 "max-pending-list-length",
  	 "The maximum length of scheduling's pending operations list",
  	 32)
+ 
+ DEFPARAM(PARAM_LARGE_FUNCTION_INSNS,
+ 	 "large-function-insns",
+ 	 "The size of function body to be considered large",
+ 	 30000)
+ DEFPARAM(PARAM_LARGE_FUNCTION_GROWTH,
+ 	 "large-function-growth",
+ 	 "Maximal growth due to inlining of large function (in percents)",
+ 	 200)
+ DEFPARAM(PARAM_INLINE_UNIT_GROWTH,
+ 	 "inline-unit-growth",
+ 	 "how much can given compilation unit grow because of the inlining (in percents)",
+ 	 150)
  
  /* The GCSE optimization will be disabled if it would require
     significantly more memory than this value.  */
Index: tree-inline.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/tree-inline.c,v
retrieving revision 1.64
diff -c -3 -p -r1.64 tree-inline.c
*** tree-inline.c	1 Jul 2003 12:18:01 -0000	1.64
--- tree-inline.c	1 Jul 2003 20:56:03 -0000
*************** typedef struct inline_data
*** 106,111 ****
--- 106,112 ----
    htab_t tree_pruner;
    /* Decl of function we are inlining into.  */
    tree decl;
+   tree current_decl;
  } inline_data;
  
  /* Prototypes.  */
*************** expand_call_inline (tp, walk_subtrees, d
*** 1183,1188 ****
--- 1301,1310 ----
    if (!fn)
      return NULL_TREE;
  
+   /* Turn forward declarations into real ones.  */
+   if (flag_unit_at_a_time)
+     fn = cgraph_node (fn)->decl;
+ 
    /* If fn is a declaration of a function in a nested scope that was
       globally declared inline, we don't set its DECL_INITIAL.
       However, we can't blindly follow DECL_ABSTRACT_ORIGIN because the
*************** expand_call_inline (tp, walk_subtrees, d
*** 1197,1205 ****
  
    /* Don't try to inline functions that are not well-suited to
       inlining.  */
!   if ((!flag_unit_at_a_time || !DECL_SAVED_TREE (fn)
!        || !cgraph_global_info (fn)->inline_once)
!       && !inlinable_function_p (fn, id, 0))
      {
        if (warn_inline && DECL_INLINE (fn) && !DID_INLINE_FUNC (fn)
  	  && !DECL_IN_SYSTEM_HEADER (fn))
--- 1319,1327 ----
  
    /* Don't try to inline functions that are not well-suited to
       inlining.  */
!   if (!DECL_SAVED_TREE (fn)
!       || (flag_unit_at_a_time && !cgraph_inline_p (id->current_decl, fn))
!       || (!flag_unit_at_a_time && !inlinable_function_p (fn, id, 0)))
      {
        if (warn_inline && DECL_INLINE (fn) && !DID_INLINE_FUNC (fn)
  	  && !DECL_IN_SYSTEM_HEADER (fn))
    if (id->decl && flag_unit_at_a_time)
*************** expand_call_inline (tp, walk_subtrees, d
*** 1441,1453 ****
      }
  
    /* Recurse into the body of the just inlined function.  */
!   expand_calls_inline (inlined_body, id);
    VARRAY_POP (id->fns);
  
    /* If we've returned to the top level, clear out the record of how
       much inlining has been done.  */
    if (VARRAY_ACTIVE_SIZE (id->fns) == id->first_inlined_fn)
      id->inlined_stmts = 0;
  
    /* Don't walk into subtrees.  We've already handled them above.  */
    *walk_subtrees = 0;
--- 1563,1580 ----
      }
  
    /* Recurse into the body of the just inlined function.  */
!   {
!     tree old_decl = id->current_decl;
!     id->current_decl = fn;
!     expand_calls_inline (inlined_body, id);
!     id->current_decl = old_decl;
!   }
    VARRAY_POP (id->fns);
  
    /* If we've returned to the top level, clear out the record of how
       much inlining has been done.  */
    if (VARRAY_ACTIVE_SIZE (id->fns) == id->first_inlined_fn)
      id->inlined_stmts = 0;
  
    /* Don't walk into subtrees.  We've already handled them above.  */
    *walk_subtrees = 0;
*************** optimize_inline_calls (fn)
*** 1487,1495 ****
--- 1614,1623 ----
    memset (&id, 0, sizeof (id));
  
    id.decl = fn;
+   id.current_decl = fn;
    /* Don't allow recursion into FN.  */
    VARRAY_TREE_INIT (id.fns, 32, "fns");
    VARRAY_PUSH_TREE (id.fns, fn);
    /* Or any functions that aren't finished yet.  */
    prev_fn = NULL_TREE;
    if (current_function_decl)
Index: invoke.texi
===================================================================
RCS file: /cvs/gcc/gcc/gcc/doc/invoke.texi,v
retrieving revision 1.307
diff -c -3 -p -r1.307 invoke.texi
*** invoke.texi	1 Jul 2003 14:39:20 -0000	1.307
--- invoke.texi	1 Jul 2003 21:52:50 -0000
*************** After exceeding the maximum number of in
*** 4547,4552 ****
--- 4547,4553 ----
  inlining, a linear function is used to decrease the allowable size
  for single functions.  The slope of that function is the negative
  reciprocal of the number specified here.
+ This parameter is ignored when @option{-funit-at-a-time} is used.
  The default value is 32.
  
  @item min-inline-insns
*************** The repeated inlining is throttled more 
*** 4554,4560 ****
--- 4555,4580 ----
  after exceeding the limit.  To avoid too much throttling, a minimum for
  this function is specified here to allow repeated inlining for very small
  functions even when a lot of repeated inlining already has been done.
+ This parameter is ignored when @option{-funit-at-a-time} is used.
  The default value is 130.
+ 
+ @item large-function-insns
+ The limit specifying really large functions.  For functions greater than this
+ limit inlining is constrained by @option{--param large-function-growth}.
+ This parameter is usefull primarily to avoid extreme compilation time caused by non-linear
+ algorithms used by the backend.
+ This parameter is ignored when @option{-funit-at-a-time} is not used.
+ The default value is 30000.
+ 
+ @item large-function-growth
+ Specifies maximal growth of large functtion caused by inlining in percents.
+ This parameter is ignored when @option{-funit-at-a-time} is not used.
+ The default value is 200.
+ 
+ @item inline-unit-growth
+ Specifies maximal overall growth of the compilation unit caused by inlining.
+ This parameter is ignored when @option{-funit-at-a-time} is not used.
+ The default value is 150.
  
  @item max-inline-insns-rtl
  For languages that use the RTL inliner (this happens at a later stage



More information about the Gcc-patches mailing list