Inlining heuristics tweaks

Wed Mar 30 23:46:00 GMT 2005

Mark,
what would you think about pushing this patch (plus one followup fix for C
testcases with mismatching implicit declarations) to 4.0?
It has been in mainline for more than a week now and it seems to solve
the C++ problems.  I know of one regression attributable to this change,
the 32bit gzip compiled by the 64bit cross compiler (see the 32bit results at
http://www.suse.de/~aj/SPEC/amd64 ).
I tried to oprofile it on my hammer but the regression does not happen
there, and gzip is known to be picky about code alignment on Athlons (it
apparently hits the limits of the branch predictor), so I would attribute it
to noise - it happens neither in the 64bit test nor in Diego's 32bit
tests on pentium4.

Honza
> Hi,
> this is the patch we ended up with after a rather lengthy process of trying
> various possibilities of inlining heuristics changes.  We tested the patch on
> SPEC at -O3, CSiBE at -Os, Gerald's application, tramp3d, and SPEC with and
> without profile feedback in whole program mode on the tree-profiling branch.
> While for individual benchmarks we do have versions of the patch that perform
> better, this patch seems to be the best compromise over all tests.
> 
> The strengths of the patch are:
> 1) tramp3d performance improvement (from 1m55s with 4.0 to 0m29s; 3.4 did
>    0m51s), similarly at -O2 and -O3
> 2) 1% SPECint improvement in whole program mode, with 0.3% code size
>    increase
> 3) PR17864 compile time improvement from 15s to 8s
> 4) it is almost completely SPEC neutral for -O3 at 4.0
> 5) CSiBE compile time improvement from 140s to 82s and tiny code size improvement at -Os
> 6) 0.6% improvement in Gerald's application benchmark (geometric average over speedups)
> 
> And now the weak sides:
> 1) Performance degradation on CSiBE at -Os, from 16.22s to 18.63s
> 2) compile time and code size regression on 8361 (20s to 22s, 446636 bytes to
>    464140), similarly for the whole dlv application and tramp3d (1m9.019s to
>    2m31.520s, 1835557 to 2390393), code size regression on PR17863 (508673 to
>    557281)
> 3) 0.5% code size regression on whole program SPEC with profile feedback for no
>    measurable speedup.
> 
> The reason why the regressions seem acceptable to me is that -Os is not meant
> to excel in performance (and I think I can tweak it further later on), and for
> the C++ testcases we actually started these attempts to increase inlining in
> order to get performance back, so we should not be surprised by compile time
> slowdowns.  Unfortunately for tramp3d we went a bit too far - it is certainly
> possible to reach comparable performance with less inlining.  Here are some
> results with different incarnations of the patch: tramp3d-v3 -O3 -funroll-loops:
> 
>               size       compile time  runtime
> 3.4:          1706569    1m11.710s     0m44.753s
> 4.0:          1835557    1m9.019s      1m47.997s
> 4.0 p #1:     2578785    2m59.829s     0m20.861s
> 4.0 p #2:     1983129    1m27.115s     0m46.579s
> 4.0 p #3:     1981481    1m29.660s     0m33.506s
> 4.0 p #4:     1836253    1m7.366s      0m49.142s
> 4.0 p #5:     2390393    2m31.520s     0m25.233s
> 
> #5 is the current patch, the other values are for different versions we tried.
> I believe that tramp3d is a somewhat extreme case that is difficult to tune,
> and the behaviour is not at all unreasonable even compared to the other
> patches, as the performance seems to be very good.  For tramp3d it is in
> general possible to throttle the max-inline-insns-auto/max-inline-insns-single
> limits down to small values (like 100) without hurting runtime and while
> improving compile time, but this is unfortunately not possible for the other
> benchmarks.
> 
> 
> So now what the patch does.  As you know, we all started with Richard's idea to
> make most moves free.  This is important as it makes the "forwarder"
> functions (functions taking arguments and passing them to other functions) come
> out free.  For tramp3d we have many calls in the original program for every
> operation executed in the final binary, so it is obvious that if we get
> non-zero cost for inlining each of them, we will need to push the overall
> limits high.
> 
> The patch thus pushes the move cost of is_gimple_reg variables to 0.  This
> however led to an increase in the amount of inlining and to regressions in
> compile time/code size tests on almost everything.  We tried to push the inline
> limits down somewhat but this per se still brought regressions on Gerald's
> application and SPEC.  Another idea was to limit the cost of the CALL
> operation, currently set to 10, to something more realistic code size wise.
> Interestingly, reducing the cost to 1 + move size for setup of operands makes
> the heuristics even more difficult to tune (it caused large regressions on
> SPEC, especially in whole program mode, and on Gerald's testcase).
> 
> On the other hand, increasing the call cost seemed to have a positive effect on
> these benchmarks as well as to decrease the code size of benchmarks with low
> abstraction penalty.  So the patch adds the call cost as a parameter and sets
> it to 16, which was the best scoring value for whole program SPEC benchmarks.
> This is the somewhat counterintuitive part of the patch - the rationale is that
> the call cost for inlining heuristics actually represents a sort of ratio of
> how large a function is still profitable to inline rather than call, as well
> as a way to penalize inlining of non-leaf functions over leaf ones.  As the
> profile driven inlining results suggest, even code size wise it is profitable
> to inline functions up to a cost of 16, and obviously inlining is most
> profitable for leaf functions.
> 
> It is quite irritating that the code size and performance considerations are
> currently mixed in estimate_num_insns, and it leads to confusion.  For post
> tree-profiling code, I would like to split these two (I already have a patch
> that adds estimate_time, which counts the instructions weighted by the guessed
> profile, and drives inlining by that; I have not got around to tuning or
> testing this idea yet, however), or if this turns out to be overengineered, we
> can just split this out in the future so the inlining code itself just looks at
> the number of calls and adds some cost to each function, so other uses of
> estimate_num_insns won't get confused.
> 
> I've bootstrapped/regtested i686-pc-gnu-linux and ppc-linux on both
> mainline and 4.0.  I would like to commit this patch to mainline
> tomorrow (I will wait for Zack to fix the make install problems so it
> might be somewhat later) if no one objects, and propose it for 4.0 after
> a few days of testing if no considerable regressions are found.  I would
> also like to thank Richard and Steven for all the work and ideas brought
> into these experiments.
> 
> Honza
> 
> 2005-03-20  Richard Guenther <rguenth@tat.physik.uni-tuebingen.de>
> 	    Jan Hubicka  <jh@suse.cz>
> 	    Steven Bosscher <stevenb@suse.de>
> 	* cgraphunit.c (cgraph_estimate_size_after_inlining): Compute
> 	call cost based on argument sizes.
> 	(cgraph_mark_inline_edge): Prevent the inline unit from shrinking by
> 	inlining.
> 	* params.def (max-inline-insns-single): Set to 450.
> 	(max-inline-insns-auto): Set to 90.
> 	(max-inline-insns-recursive): Set to 450.
> 	(max-inline-insns-recursive-auto): Set to 450.
> 	(large-function-insns): Set to 2700.
> 	(inline-call-cost): New parameter.
> 	* tree-inline.c (estimate_move_cost): New function.
> 	(estimate_num_insns_1): Compute move sizes costs by estimate_move_cost
> 	for non-gimple-regs, set cost to 0 for gimple-regs.  Compute call size
> 	based on arguments.
> 	* tree-inline.h (estimate_move_cost): Declare.
> 	* doc/invoke.texi (max-inline-insns-single): Change default to 450.
> 	(max-inline-insns-auto): Change default to 90.
> 	(max-inline-insns-recursive): Change default to 450.
> 	(max-inline-insns-recursive-auto): Change default to 450.
> 	(large-function-insns): Change default to 2700.
> 	(inline-call-cost): Document new parameter.
> 
> 	* testsuite/gcc.dg/winline-6.c: Modify so that the inlined function has nonzero cost.
> 
> Index: cgraphunit.c
> ===================================================================
> RCS file: /cvs/gcc/gcc/gcc/cgraphunit.c,v
> retrieving revision 1.95
> diff -c -3 -p -r1.95 cgraphunit.c
> *** cgraphunit.c	16 Mar 2005 17:14:55 -0000	1.95
> --- cgraphunit.c	20 Mar 2005 15:32:51 -0000
> *************** static int
> *** 1030,1036 ****
>   cgraph_estimate_size_after_inlining (int times, struct cgraph_node *to,
>   				     struct cgraph_node *what)
>   {
> !   return (what->global.insns - INSNS_PER_CALL) * times + to->global.insns;
>   }
>   
>   /* Estimate the growth caused by inlining NODE into all callees.  */
> --- 1030,1041 ----
>   cgraph_estimate_size_after_inlining (int times, struct cgraph_node *to,
>   				     struct cgraph_node *what)
>   {
> !   tree fndecl = what->decl;
> !   tree arg;
> !   int call_insns = PARAM_VALUE (PARAM_INLINE_CALL_COST);
> !   for (arg = DECL_ARGUMENTS (fndecl); arg; arg = TREE_CHAIN (arg))
> !     call_insns += estimate_move_cost (TREE_TYPE (arg));
> !   return (what->global.insns - call_insns) * times + to->global.insns;
>   }
>   
>   /* Estimate the growth caused by inlining NODE into all callees.  */
> *************** cgraph_mark_inline_edge (struct cgraph_e
> *** 1124,1130 ****
>         to->global.insns = new_insns;
>       }
>     gcc_assert (what->global.inlined_to == to);
> !   overall_insns += new_insns - old_insns;
>     ncalls_inlined++;
>   }
>   
> --- 1129,1136 ----
>         to->global.insns = new_insns;
>       }
>     gcc_assert (what->global.inlined_to == to);
> !   if (new_insns > old_insns)
> !     overall_insns += new_insns - old_insns;
>     ncalls_inlined++;
>   }
>   
> Index: params.def
> ===================================================================
> RCS file: /cvs/gcc/gcc/gcc/params.def,v
> retrieving revision 1.54
> diff -c -3 -p -r1.54 params.def
> *** params.def	1 Feb 2005 10:03:08 -0000	1.54
> --- params.def	20 Mar 2005 15:32:51 -0000
> *************** DEFPARAM (PARAM_SRA_FIELD_STRUCTURE_RATI
> *** 58,67 ****
>      of a function counted in internal gcc instructions (not in
>      real machine instructions) that is eligible for inlining
>      by the tree inliner.
> !    The default value is 500.
>      Only functions marked inline (or methods defined in the class
> !    definition for C++) are affected by this, unless you set the
> !    -finline-functions (included in -O3) compiler option.
>      There are more restrictions to inlining: If inlined functions
>      call other functions, the already inlined instructions are
>      counted and once the recursive inline limit (see 
> --- 58,66 ----
>      of a function counted in internal gcc instructions (not in
>      real machine instructions) that is eligible for inlining
>      by the tree inliner.
> !    The default value is 450.
>      Only functions marked inline (or methods defined in the class
> !    definition for C++) are affected by this.
>      There are more restrictions to inlining: If inlined functions
>      call other functions, the already inlined instructions are
>      counted and once the recursive inline limit (see 
> *************** DEFPARAM (PARAM_SRA_FIELD_STRUCTURE_RATI
> *** 70,76 ****
>   DEFPARAM (PARAM_MAX_INLINE_INSNS_SINGLE,
>   	  "max-inline-insns-single",
>   	  "The maximum number of instructions in a single function eligible for inlining",
> ! 	  500, 0, 0)
>   
>   /* The single function inlining limit for functions that are
>      inlined by virtue of -finline-functions (-O3).
> --- 69,75 ----
>   DEFPARAM (PARAM_MAX_INLINE_INSNS_SINGLE,
>   	  "max-inline-insns-single",
>   	  "The maximum number of instructions in a single function eligible for inlining",
> ! 	  450, 0, 0)
>   
>   /* The single function inlining limit for functions that are
>      inlined by virtue of -finline-functions (-O3).
> *************** DEFPARAM (PARAM_MAX_INLINE_INSNS_SINGLE,
> *** 78,98 ****
>      that is applied to functions marked inlined (or defined in the
>      class declaration in C++) given by the "max-inline-insns-single"
>      parameter.
> !    The default value is 150.  */
>   DEFPARAM (PARAM_MAX_INLINE_INSNS_AUTO,
>   	  "max-inline-insns-auto",
>   	  "The maximum number of instructions when automatically inlining",
> ! 	  120, 0, 0)
>   
>   DEFPARAM (PARAM_MAX_INLINE_INSNS_RECURSIVE,
>   	  "max-inline-insns-recursive",
>   	  "The maximum number of instructions inline function can grow to via recursive inlining",
> ! 	  500, 0, 0)
>   
>   DEFPARAM (PARAM_MAX_INLINE_INSNS_RECURSIVE_AUTO,
>   	  "max-inline-insns-recursive-auto",
>   	  "The maximum number of instructions non-inline function can grow to via recursive inlining",
> ! 	  500, 0, 0)
>   
>   DEFPARAM (PARAM_MAX_INLINE_RECURSIVE_DEPTH,
>   	  "max-inline-recursive-depth",
> --- 77,97 ----
>      that is applied to functions marked inlined (or defined in the
>      class declaration in C++) given by the "max-inline-insns-single"
>      parameter.
> !    The default value is 90.  */
>   DEFPARAM (PARAM_MAX_INLINE_INSNS_AUTO,
>   	  "max-inline-insns-auto",
>   	  "The maximum number of instructions when automatically inlining",
> ! 	  90, 0, 0)
>   
>   DEFPARAM (PARAM_MAX_INLINE_INSNS_RECURSIVE,
>   	  "max-inline-insns-recursive",
>   	  "The maximum number of instructions inline function can grow to via recursive inlining",
> ! 	  450, 0, 0)
>   
>   DEFPARAM (PARAM_MAX_INLINE_INSNS_RECURSIVE_AUTO,
>   	  "max-inline-insns-recursive-auto",
>   	  "The maximum number of instructions non-inline function can grow to via recursive inlining",
> ! 	  450, 0, 0)
>   
>   DEFPARAM (PARAM_MAX_INLINE_RECURSIVE_DEPTH,
>   	  "max-inline-recursive-depth",
> *************** DEFPARAM(PARAM_MAX_PENDING_LIST_LENGTH,
> *** 148,154 ****
>   DEFPARAM(PARAM_LARGE_FUNCTION_INSNS,
>   	 "large-function-insns",
>   	 "The size of function body to be considered large",
> ! 	 3000, 0, 0)
>   DEFPARAM(PARAM_LARGE_FUNCTION_GROWTH,
>   	 "large-function-growth",
>   	 "Maximal growth due to inlining of large function (in percent)",
> --- 147,153 ----
>   DEFPARAM(PARAM_LARGE_FUNCTION_INSNS,
>   	 "large-function-insns",
>   	 "The size of function body to be considered large",
> ! 	 2700, 0, 0)
>   DEFPARAM(PARAM_LARGE_FUNCTION_GROWTH,
>   	 "large-function-growth",
>   	 "Maximal growth due to inlining of large function (in percent)",
> *************** DEFPARAM(PARAM_INLINE_UNIT_GROWTH,
> *** 157,162 ****
> --- 156,165 ----
>   	 "inline-unit-growth",
>   	 "how much can given compilation unit grow because of the inlining (in percent)",
>   	 50, 0, 0)
> + DEFPARAM(PARAM_INLINE_CALL_COST,
> + 	 "inline-call-cost",
> + 	 "expense of call operation relative to ordinary arithmetic operations",
> + 	 16, 0, 0)
>   
>   /* The GCSE optimization will be disabled if it would require
>      significantly more memory than this value.  */
> Index: tree-inline.c
> ===================================================================
> RCS file: /cvs/gcc/gcc/gcc/tree-inline.c,v
> retrieving revision 1.172
> diff -c -3 -p -r1.172 tree-inline.c
> *** tree-inline.c	16 Mar 2005 09:01:18 -0000	1.172
> --- tree-inline.c	20 Mar 2005 15:32:51 -0000
> *************** inlinable_function_p (tree fn)
> *** 1164,1169 ****
> --- 1164,1186 ----
>     return inlinable;
>   }
>   
> + /* Estimate the cost of a memory move.  Use machine dependent
> +    word size and take possible memcpy call into account.  */
> + 
> + int
> + estimate_move_cost (tree type)
> + {
> +   HOST_WIDE_INT size;
> + 
> +   size = int_size_in_bytes (type);
> + 
> +   if (size < 0 || size > MOVE_MAX_PIECES * MOVE_RATIO)
> +     /* Cost of a memcpy call, 3 arguments and the call.  */
> +     return 4;
> +   else
> +     return ((size + MOVE_MAX_PIECES - 1) / MOVE_MAX_PIECES);
> + }
> + 
>   /* Used by estimate_num_insns.  Estimate number of instructions seen
>      by given statement.  */
>   
> *************** estimate_num_insns_1 (tree *tp, int *wal
> *** 1242,1269 ****
>         *walk_subtrees = 0;
>         return NULL;
>   
> !     /* Recognize assignments of large structures and constructors of
> !        big arrays.  */
>       case INIT_EXPR:
>       case MODIFY_EXPR:
> !       x = TREE_OPERAND (x, 0);
> !       /* FALLTHRU */
>       case TARGET_EXPR:
>       case CONSTRUCTOR:
> !       {
> ! 	HOST_WIDE_INT size;
> ! 
> ! 	size = int_size_in_bytes (TREE_TYPE (x));
> ! 
> ! 	if (size < 0 || size > MOVE_MAX_PIECES * MOVE_RATIO)
> ! 	  *count += 10;
> ! 	else
> ! 	  *count += ((size + MOVE_MAX_PIECES - 1) / MOVE_MAX_PIECES);
> !       }
>         break;
>   
> !       /* Assign cost of 1 to usual operations.
> ! 	 ??? We may consider mapping RTL costs to this.  */
>       case COND_EXPR:
>   
>       case PLUS_EXPR:
> --- 1259,1308 ----
>         *walk_subtrees = 0;
>         return NULL;
>   
> !     /* Try to estimate the cost of assignments.  We have three cases to
> !        deal with:
> ! 	1) Simple assignments to registers;
> ! 	2) Stores to things that must live in memory.  This includes
> ! 	   "normal" stores to scalars, but also assignments of large
> ! 	   structures, or constructors of big arrays;
> ! 	3) TARGET_EXPRs.
> ! 
> !        Let us look at the first two cases, assuming we have "a = b + C":
> !        <modify_expr <var_decl "a"> <plus_expr <var_decl "b"> <constant C>>
> !        If "a" is a GIMPLE register, the assignment to it is free on almost
> !        any target, because "a" usually ends up in a real register.  Hence
> !        the only cost of this expression comes from the PLUS_EXPR, and we
> !        can ignore the MODIFY_EXPR.
> !        If "a" is not a GIMPLE register, the assignment to "a" will most
> !        likely be a real store, so the cost of the MODIFY_EXPR is the cost
> !        of moving something into "a", which we compute using the function
> !        estimate_move_cost.
> ! 
> !        The third case deals with TARGET_EXPRs, for which the semantics are
> !        that a temporary is assigned, unless the TARGET_EXPR itself is being
> !        assigned to something else.  In the latter case we do not need the
> !        temporary.  E.g. in <modify_expr <var_decl "a"> <target_expr>>, the
> !        MODIFY_EXPR is free.  */
>       case INIT_EXPR:
>       case MODIFY_EXPR:
> !       /* Is the right hand side a TARGET_EXPR?  */
> !       if (TREE_CODE (TREE_OPERAND (x, 1)) == TARGET_EXPR)
> ! 	break;
> !       /* ... fall through ...  */
> ! 
>       case TARGET_EXPR:
> +       x = TREE_OPERAND (x, 0);
> +       /* Is this an assignment to a register?  */
> +       if (is_gimple_reg (x))
> + 	break;
> +       /* Otherwise it's a store, so fall through to compute the move cost.  */
> +       
>       case CONSTRUCTOR:
> !       *count += estimate_move_cost (TREE_TYPE (x));
>         break;
>   
> !     /* Assign cost of 1 to usual operations.
> !        ??? We may consider mapping RTL costs to this.  */
>       case COND_EXPR:
>   
>       case PLUS_EXPR:
> *************** estimate_num_insns_1 (tree *tp, int *wal
> *** 1350,1355 ****
> --- 1389,1395 ----
>       case CALL_EXPR:
>         {
>   	tree decl = get_callee_fndecl (x);
> + 	tree arg;
>   
>   	if (decl && DECL_BUILT_IN_CLASS (decl) == BUILT_IN_NORMAL)
>   	  switch (DECL_FUNCTION_CODE (decl))
> *************** estimate_num_insns_1 (tree *tp, int *wal
> *** 1362,1368 ****
>   	    default:
>   	      break;
>   	    }
> ! 	*count += 10;
>   	break;
>         }
>       default:
> --- 1402,1413 ----
>   	    default:
>   	      break;
>   	    }
> ! 
> ! 	arg = TREE_OPERAND (x, 1);
> ! 	for (arg = TREE_OPERAND (x, 1); arg; arg = TREE_CHAIN (arg))
> ! 	  *count += estimate_move_cost (TREE_TYPE (TREE_VALUE (arg)));
> ! 
> ! 	*count += PARAM_VALUE (PARAM_INLINE_CALL_COST);
>   	break;
>         }
>       default:
> Index: tree-inline.h
> ===================================================================
> RCS file: /cvs/gcc/gcc/gcc/tree-inline.h,v
> retrieving revision 1.14
> diff -c -3 -p -r1.14 tree-inline.h
> *** tree-inline.h	8 Nov 2004 22:40:09 -0000	1.14
> --- tree-inline.h	20 Mar 2005 15:32:51 -0000
> *************** bool tree_inlinable_function_p (tree);
> *** 29,34 ****
> --- 29,35 ----
>   tree copy_tree_r (tree *, int *, void *);
>   void clone_body (tree, tree, void *);
>   tree save_body (tree, tree *, tree *);
> + int estimate_move_cost (tree type);
>   int estimate_num_insns (tree expr);
>   
>   /* 0 if we should not perform inlining.
> Index: doc/invoke.texi
> ===================================================================
> RCS file: /cvs/gcc/gcc/gcc/doc/invoke.texi,v
> retrieving revision 1.592
> diff -c -3 -p -r1.592 invoke.texi
> *** doc/invoke.texi	15 Mar 2005 00:36:15 -0000	1.592
> --- doc/invoke.texi	20 Mar 2005 15:32:53 -0000
> *************** This number sets the maximum number of i
> *** 5518,5524 ****
>   internal representation) in a single function that the tree inliner
>   will consider for inlining.  This only affects functions declared
>   inline and methods implemented in a class declaration (C++).
> ! The default value is 500.
>   
>   @item max-inline-insns-auto
>   When you use @option{-finline-functions} (included in @option{-O3}),
> --- 5518,5524 ----
>   internal representation) in a single function that the tree inliner
>   will consider for inlining.  This only affects functions declared
>   inline and methods implemented in a class declaration (C++).
> ! The default value is 450.
>   
>   @item max-inline-insns-auto
>   When you use @option{-finline-functions} (included in @option{-O3}),
> *************** a lot of functions that would otherwise 
> *** 5526,5532 ****
>   by the compiler will be investigated.  To those functions, a different
>   (more restrictive) limit compared to functions declared inline can
>   be applied.
> ! The default value is 120.
>   
>   @item large-function-insns
>   The limit specifying really large functions.  For functions larger than this
> --- 5526,5532 ----
>   by the compiler will be investigated.  To those functions, a different
>   (more restrictive) limit compared to functions declared inline can
>   be applied.
> ! The default value is 90.
>   
>   @item large-function-insns
>   The limit specifying really large functions.  For functions larger than this
> *************** limit after inlining inlining is constra
> *** 5535,5541 ****
>   to avoid extreme compilation time caused by non-linear algorithms used by the
>   backend.
>   This parameter is ignored when @option{-funit-at-a-time} is not used.
> ! The default value is 3000.
>   
>   @item large-function-growth
>   Specifies maximal growth of large function caused by inlining in percents.
> --- 5535,5541 ----
>   to avoid extreme compilation time caused by non-linear algorithms used by the
>   backend.
>   This parameter is ignored when @option{-funit-at-a-time} is not used.
> ! The default value is 2700.
>   
>   @item large-function-growth
>   Specifies maximal growth of large function caused by inlining in percents.
> *************** For functions declared inline @option{--
> *** 5558,5564 ****
>   taken into acount.  For function not declared inline, recursive inlining
>   happens only when @option{-finline-functions} (included in @option{-O3}) is
>   enabled and @option{--param max-inline-insns-recursive-auto} is used.  The
> ! default value is 500.
>   
>   @item max-inline-recursive-depth
>   @itemx max-inline-recursive-depth-auto
> --- 5558,5564 ----
>   taken into acount.  For function not declared inline, recursive inlining
>   happens only when @option{-finline-functions} (included in @option{-O3}) is
>   enabled and @option{--param max-inline-insns-recursive-auto} is used.  The
> ! default value is 450.
>   
>   @item max-inline-recursive-depth
>   @itemx max-inline-recursive-depth-auto
> *************** For functions declared inline @option{--
> *** 5568,5574 ****
>   taken into acount.  For function not declared inline, recursive inlining
>   happens only when @option{-finline-functions} (included in @option{-O3}) is
>   enabled and @option{--param max-inline-recursive-depth-auto} is used.  The
> ! default value is 500.
>   
>   @item max-unrolled-insns
>   The maximum number of instructions that a loop should have if that loop
> --- 5568,5583 ----
>   taken into acount.  For function not declared inline, recursive inlining
>   happens only when @option{-finline-functions} (included in @option{-O3}) is
>   enabled and @option{--param max-inline-recursive-depth-auto} is used.  The
> ! default value is 450.
> ! 
> ! @item inline-call-cost
> ! Specify the cost of a call instruction relative to simple arithmetic
> ! operations (having a cost of 1).  Increasing this cost disqualifies inlining
> ! of non-leaf functions and at the same time increases the size of leaf
> ! functions that are believed to reduce function size by being inlined.  In
> ! effect it increases the amount of inlining for code having a large abstraction
> ! penalty (many functions that just pass the arguments to other functions) and
> ! decreases inlining for code with a low abstraction penalty.  Default value is 16.
>   
>   @item max-unrolled-insns
>   The maximum number of instructions that a loop should have if that loop
> Index: testsuite/gcc.dg/winline-6.c
> ===================================================================
> RCS file: /cvs/gcc/gcc/gcc/testsuite/gcc.dg/winline-6.c,v
> retrieving revision 1.1
> diff -c -3 -p -r1.1 winline-6.c
> *** testsuite/gcc.dg/winline-6.c	4 Jan 2004 14:39:13 -0000	1.1
> --- testsuite/gcc.dg/winline-6.c	20 Mar 2005 15:32:54 -0000
> *************** inline int q(void)
> *** 17,21 ****
>   }
>   inline int t (void)
>   {
> ! 	return q ();		 /* { dg-warning "called from here" } */
>   }
> --- 17,21 ----
>   }
>   inline int t (void)
>   {
> ! 	return q () + 1;	 /* { dg-warning "called from here" } */
>   }