[RFA] new code size estimation for inlining

Jan Hubicka jh@suse.cz
Mon Jun 30 21:09:00 GMT 2003


Hi,
this patch implements the new estimation of body size via langhook.  My
estimation is based on counting number of operations encoded in the trees that
should be better than number of statements as it more closely responds to the
memory requirements, number of isntructions and perhaps also the performance :)

I've set the default values to get approximately as much of inlining as we do
right now.  This is perhaps too much but I would like to get all the parts of
infrastructure I am working on at place before tunning it too hard.
The sizes of binaries behaves like:

 164.gzip: Base: 59720 bytes
 164.gzip: Peak: 59720 bytes
 175.vpr: Base: 158808 bytes
 175.vpr: Peak: 161144 bytes
 176.gcc: Base: 1727744 bytes
 176.gcc: Peak: 1677600 bytes
 181.mcf: Base: 20816 bytes
 181.mcf: Peak: 20816 bytes
 186.crafty: Base: 206048 bytes
 186.crafty: Peak: 206016 bytes
 197.parser: Base: 136608 bytes
 197.parser: Peak: 141472 bytes
 252.eon: Base: 574464 bytes
 252.eon: Peak: 567552 bytes
 253.perlbmk: Base: 667992 bytes
 253.perlbmk: Peak: 684280 bytes
 254.gap: Base: 537464 bytes
 254.gap: Peak: 551736 bytes
 255.vortex: Base: 638120 bytes
 255.vortex: Peak: 630216 bytes
 256.bzip2: Base: 58216 bytes
 256.bzip2: Peak: 58216 bytes
 300.twolf: Base: 201488 bytes
 300.twolf: Peak: 202448 bytes
 =============================
 Total: Base: 4987488 bytes
 Total: Peak: 4961216 bytes

Compilation time is unchanged. (375s vs 371s), same as bootstrap time.
Scores are also similar (slightly better):

   164.gzip          1400     200         699*     1400     198         707*
   175.vpr           1400     183         766*     1400     181         772*
   176.gcc           1100     130         843*     1100     131         838*
   181.mcf           1800     351         512*     1800     352         511*
   186.crafty        1000      80.8      1238*     1000      80.6      1241*
   197.parser        1800     337         535*     1800     334         538*
   252.eon           1300     140         929*     1300     137         947*
   253.perlbmk       1800     218         826*     1800     219         821*
   254.gap           1100     158         696*     1100     158         696*
   255.vortex        1900     173        1096*     1900     173        1098*
   256.bzip2         1500     200         750*     1500     199         755*
   300.twolf         3000     387         776*     3000     385         779*
   Est. SPECint_base2000                  782
   Est. SPECint2000                                                     785

Since SPEC2000 is poor testcase for inlining I also use Gerald's application.
On his application I get:

				time		stripped size
mainline     			3m0.440s 	1606860
new estimate 			2m55.360s 	1667564
unit-at-a-time based inlining	2m23.130s	1486764

So the this patch is not major win, but not particularly well tunned  (this is
my very first try with new counts) unit-at-a-time patch with this estimate
already is.  (22% speedup, 8% code size savings).

I don't have up-to-date-unit-at-a-time-SPEC2000-scores (TM) but these were
mostly the same in last run I tested except for EON saving about 11% of code
size and 5% of compilation time.

The performance is as follows (times in seconds followed by variations
between 3 runs):

                     |   mainline     |   new size est.|  unit-at-a-time|
---------------------+----------------+----------------+----------------+
      STRATCOMP1-ALL |   3.11  (0.00) |   2.88  (0.00) |   2.87  (0.00) |
   STRATCOMP-770.2-Q |   0.48  (0.00) |   0.46  (0.00) |   0.47  (0.00) |
               2QBF1 |  12.59  (0.01) |  11.85  (0.02) |  13.03  (0.00) |
          PRIMEIMPL2 |   6.06  (0.00) |   6.02  (0.00) |   6.19  (0.00) |
            ANCESTOR |  95.17  (0.02) |  91.50  (0.14) |  89.18  (0.10) |
       3COL-SIMPLEX1 |   4.93  (0.02) |   4.70  (0.02) |   4.52  (0.04) |
         3COL-LADDER | 116.72  (0.33) | 110.92  (0.12) | 105.46  (0.03) |
       3COL-N-LADDER |   -    (-)     |   -    (-)     |   -    (-)     |
        3COL-RANDOM1 |   8.73  (0.01) |   8.65  (0.05) |   8.77  (0.07) |
          HP-RANDOM1 |   9.57  (0.03) |   9.39  (0.10) |   9.25  (0.07) |
       HAMCYCLE-FREE |   0.95  (0.00) |   0.92  (0.00) |   0.83  (0.00) |
             DECOMP2 |   8.95  (0.05) |   8.26  (0.07) |   8.45  (0.03) |
        BW-P4-Esra-a |  63.83  (0.11) |  64.57  (0.10) |  63.89  (0.05) |
        BW-P5-nopush |   5.99  (0.00) |   6.03  (0.00) |   5.96  (0.00) |
       BW-P5-pushbin |   5.14  (0.00) |   5.19  (0.01) |   5.14  (0.01) |
     BW-P5-nopushbin |   1.54  (0.00) |   1.54  (0.01) |   1.52  (0.00) |
              3SAT-1 |  19.23  (0.10) |  19.17  (0.14) |  19.61  (0.05) |
   3SAT-1-CONSTRAINT |  10.54  (0.06) |  10.73  (0.03) |  10.90  (0.06) |
        HANOI-Towers |   2.95  (0.02) |   2.84  (0.02) |   2.88  (0.00) |
              RAMSEY |   5.40  (0.01) |   5.18  (0.01) |   4.98  (0.01) |
             CRISTAL |   6.41  (0.01) |   6.14  (0.01) |   6.01  (0.06) |
             HANOI-K |  23.01  (0.12) |  23.92  (0.18) |  23.83  (0.07) |
           21-QUEENS |   5.55  (0.07) |   5.51  (0.01) |   5.49  (0.02) |
   MSTDir[V=13,A=40] |   9.53  (0.00) |   9.73  (0.00) |   9.26  (0.01) |
   MSTDir[V=15,A=40] |   9.58  (0.01) |   9.69  (0.00) |   9.22  (0.00) |
 MSTUndir[V=13,A=40] |   5.17  (0.01) |   5.27  (0.00) |   4.99  (0.00) |
 MSTUndir[V=15,A=40] |  84.05  (0.01) |  85.19  (0.02) |  81.14  (0.02) |
         TIMETABLING |   7.05  (0.00) |   6.81  (0.02) |   6.59  (0.01) |
---------------------+----------------+----------------+----------------+
Mon Jun 30 22:50:59 CEST 2003  Jan Hubicka  <jh@suse.cz>
	* c-common.c (c_estimate_num_insns_1): New static function.
	(c_estimate_num_insns): New global function.
	* c-common.h (DECL_NUM_STMTS): Rename to...
	(DECL_ESTIMATED_INSNS): ... this.
	(c_estimate_num_insns): Declare.
	* c-decl.c (duplicate_decls): Use DECL_ESTIMATED_INSNS.
	* c-lang.c (LANG_HOOKS_TREE_INLINING_ESTIMATE_NUM_INSNS): New.
	* c-semantics.c (add_stmt): Do not account statements.
	* langhooks-def.h (LANG_HOOKS_TREE_INLINING_ESTIMATE_NUM_INSNS):
	New.
	* langhooks.h (lang_hooks_for_tree_inlining): Add
	estimate_num_insns
	* params.def (max-inline-insns-auto, max-inline-insns-auto): set
	to 100.
	(max-inline-insns): set to 300.
	(min-inline-insns): set to 10.
	* tree-inline.c (struct inline_data): Rename inlined_stmts to
	inlined-insns.
	(INSNS_PER_STMT): Kill.
	(inlinable_function_p): Compute and store body size.
	(expand_call_inline): Likewise.
	(optimize_inline_calls): Likewise.

	* cp-lang.c (LANG_HOOKS_TREE_INLINING_ESTIMATE_NUM_INSNS): New.
	* decl.c (duplicate_decls): Use DECL_ESTIMATED_INSNS.
	(start_function): Use DECL_ESTIMATED_INSNS.
	* optimize.c (maybe_clone_body): Use DECL_ESTIMATED_INSNS.

	* java-tree.h (DECL_NUM_STMTS): Rename to...
	(DECL_ESTIMATED_INSNS): ... this.
	* lang.c (java_estimate_num_insns, java_estimate_num_insns_1):
	New static functions.
	(LANG_HOOKS_TREE_INLINING_ESTIMATE_NUM_INSNS): Define.
	* parser.y (add_stmt_to_compound): Do not account statements.

diff -Nrc3p gcc.orig/c-common.c gcc/c-common.c
*** gcc.orig/c-common.c	Mon Jun 30 22:49:23 2003
--- gcc/c-common.c	Mon Jun 30 23:31:28 2003
*************** check_function_arguments_recurse (void (
*** 6014,6017 ****
--- 6014,6127 ----
    (*callback) (ctx, param, param_num);
  }
  
+ /* Used by estimate_num_insns.  Estimate number of instructions seen
+    by given statement.  */
+ static tree
+ c_estimate_num_insns_1 (tree *tp, int *walk_subtrees, void *data)
+ {
+   int *count = data;
+   tree x = *tp;
+ 
+   if (TYPE_P (x) || DECL_P (x))
+     {
+       *walk_subtrees = 0;
+       return NULL;
+     }
+   /* Assume that constants and references counts nothing.  These should
+      be majorized by amount of operations amoung them we count later
+      and are common target of CSE and similar optimizations.  */
+   if (TREE_CODE_CLASS (TREE_CODE (x)) == 'c'
+       || TREE_CODE_CLASS (TREE_CODE (x)) == 'r')
+     return NULL;
+   switch (TREE_CODE (x))
+     { 
+     /* Reconginze assignments of large structures and constructors of
+        big arrays.  */
+     case MODIFY_EXPR:
+     case CONSTRUCTOR:
+       {
+ 	int size = int_size_in_bytes (TREE_TYPE (x));
+ 
+ 	if (!size || size > MOVE_MAX_PIECES)
+ 	  *count += 10;
+ 	else
+ 	  *count += 2 * (size + MOVE_MAX - 1) / MOVE_MAX;
+ 	return NULL;
+       }
+       break;
+     /* Few special cases of expensive operations.  This is usefull
+        to avoid inlining on functions having too many of these.  */
+     case TRUNC_DIV_EXPR:
+     case CEIL_DIV_EXPR:
+     case FLOOR_DIV_EXPR:
+     case ROUND_DIV_EXPR:
+     case TRUNC_MOD_EXPR:
+     case CEIL_MOD_EXPR:
+     case FLOOR_MOD_EXPR:
+     case ROUND_MOD_EXPR:
+     case RDIV_EXPR:
+     case CALL_EXPR:
+     case METHOD_CALL_EXPR:
+       *count += 10;
+       break;
+     /* Various containers that will produce no code themselves.  */
+     case INIT_EXPR:
+     case TARGET_EXPR:
+     case BIND_EXPR:
+     case BLOCK:
+     case TREE_LIST:
+     case TREE_VEC:
+     case IDENTIFIER_NODE:
+     case PLACEHOLDER_EXPR:
+     case WITH_CLEANUP_EXPR:
+     case CLEANUP_POINT_EXPR:
+     case NOP_EXPR:
+     case VIEW_CONVERT_EXPR:
+     case SAVE_EXPR:
+     case UNSAVE_EXPR:
+     case COMPLEX_EXPR:
+     case REALPART_EXPR:
+     case IMAGPART_EXPR:
+     case TRY_CATCH_EXPR:
+     case TRY_FINALLY_EXPR:
+     case LABEL_EXPR:
+     case EXIT_EXPR:
+     case LABELED_BLOCK_EXPR:
+     case EXIT_BLOCK_EXPR:
+     case EXPR_WITH_FILE_LOCATION:
+ 
+     case EXPR_STMT:
+     case COMPOUND_STMT:
+     case RETURN_STMT:
+     case LABEL_STMT:
+     case SCOPE_STMT:
+     case FILE_STMT:
+     case CASE_LABEL:
+     case STMT_EXPR:
+     case CLEANUP_STMT:
+ 
+     case SIZEOF_EXPR:
+     case ARROW_EXPR:
+     case ALIGNOF_EXPR:
+       break;
+     case DECL_STMT:
+       /* Do not account static initializers.  */
+       if (TREE_STATIC (TREE_OPERAND (x, 0)))
+ 	*walk_subtrees = 0;
+       break;
+     default:
+       (*count)++;
+     }
+   return NULL;
+ }
+ 
+ /*  Estimate number of instructions that will be created by expanding the body.  */
+ int
+ c_estimate_num_insns (tree decl)
+ {
+   int num = 0;
+   walk_tree_without_duplicates (&DECL_SAVED_TREE (decl), c_estimate_num_insns_1, &num);
+   return num;
+ }
+ 
  #include "gt-c-common.h"
diff -Nrc3p gcc.orig/c-common.h gcc/c-common.h
*** gcc.orig/c-common.h	Mon Jun 30 22:49:23 2003
--- gcc/c-common.h	Mon Jun 30 23:18:01 2003
*************** struct c_lang_decl GTY(()) {
*** 348,354 ****
       the approximate number of statements in this function.  There is
       no need for this number to be exact; it is only used in various
       heuristics regarding optimization.  */
! #define DECL_NUM_STMTS(NODE) \
    (FUNCTION_DECL_CHECK (NODE)->decl.u1.i)
  
  /* The variant of the C language being processed.  Each C language
--- 348,354 ----
       the approximate number of statements in this function.  There is
       no need for this number to be exact; it is only used in various
       heuristics regarding optimization.  */
! #define DECL_ESTIMATED_INSNS(NODE) \
    (FUNCTION_DECL_CHECK (NODE)->decl.u1.i)
  
  /* The variant of the C language being processed.  Each C language
*************** extern void c_common_write_pch (void);
*** 1295,1300 ****
--- 1295,1301 ----
  extern void builtin_define_with_value (const char *, const char *, int);
  extern void c_stddef_cpp_builtins (void);
  extern void fe_file_change (const struct line_map *);
+ extern int c_estimate_num_insns (tree decl);
  
  /* In c-ppoutput.c  */
  extern void init_pp_output (FILE *);
diff -Nrc3p gcc.orig/c-decl.c gcc/c-decl.c
*** gcc.orig/c-decl.c	Mon Jun 30 22:49:39 2003
--- gcc/c-decl.c	Mon Jun 30 23:13:56 2003
*************** duplicate_decls (tree newdecl, tree oldd
*** 1474,1480 ****
  	    DECL_INITIAL (newdecl) = DECL_INITIAL (olddecl);
  	  DECL_SAVED_INSNS (newdecl) = DECL_SAVED_INSNS (olddecl);
  	  DECL_SAVED_TREE (newdecl) = DECL_SAVED_TREE (olddecl);
! 	  DECL_NUM_STMTS (newdecl) = DECL_NUM_STMTS (olddecl);
  	  DECL_ARGUMENTS (newdecl) = DECL_ARGUMENTS (olddecl);
  
  	  /* Set DECL_INLINE on the declaration if we've got a body
--- 1474,1480 ----
  	    DECL_INITIAL (newdecl) = DECL_INITIAL (olddecl);
  	  DECL_SAVED_INSNS (newdecl) = DECL_SAVED_INSNS (olddecl);
  	  DECL_SAVED_TREE (newdecl) = DECL_SAVED_TREE (olddecl);
! 	  DECL_ESTIMATED_INSNS (newdecl) = DECL_ESTIMATED_INSNS (olddecl);
  	  DECL_ARGUMENTS (newdecl) = DECL_ARGUMENTS (olddecl);
  
  	  /* Set DECL_INLINE on the declaration if we've got a body
diff -Nrc3p gcc.orig/c-lang.c gcc/c-lang.c
*** gcc.orig/c-lang.c	Mon Jun 30 22:50:21 2003
--- gcc/c-lang.c	Mon Jun 30 23:36:57 2003
*************** static int c_init_options (void);
*** 98,103 ****
--- 98,105 ----
  #undef LANG_HOOKS_TREE_INLINING_CONVERT_PARM_FOR_INLINING
  #define LANG_HOOKS_TREE_INLINING_CONVERT_PARM_FOR_INLINING \
    c_convert_parm_for_inlining
+ #undef LANG_HOOKS_TREE_INLINING_ESTIMATE_NUM_INSNS
+ #define LANG_HOOKS_TREE_INLINING_ESTIMATE_NUM_INSNS c_estimate_num_insns
  #undef LANG_HOOKS_TREE_DUMP_DUMP_TREE_FN
  #define LANG_HOOKS_TREE_DUMP_DUMP_TREE_FN c_dump_tree
  
diff -Nrc3p gcc.orig/c-semantics.c gcc/c-semantics.c
*** gcc.orig/c-semantics.c	Mon Jun 30 22:49:23 2003
--- gcc/c-semantics.c	Mon Jun 30 23:13:56 2003
*************** add_stmt (tree t)
*** 98,107 ****
       statements are full-expressions.  We record that fact here.  */
    STMT_IS_FULL_EXPR_P (last_tree) = stmts_are_full_exprs_p ();
  
-   /* Keep track of the number of statements in this function.  */
-   if (current_function_decl)
-     ++DECL_NUM_STMTS (current_function_decl);
- 
    return t;
  }
  
--- 98,103 ----
diff -Nrc3p gcc.orig/cp/cp-lang.c gcc/cp/cp-lang.c
*** gcc.orig/cp/cp-lang.c	Mon Jun 30 22:49:38 2003
--- gcc/cp/cp-lang.c	Tue Jul  1 00:02:00 2003
*************** static bool cp_var_mod_type_p (tree);
*** 138,143 ****
--- 138,145 ----
  #define LANG_HOOKS_TREE_INLINING_START_INLINING cp_start_inlining
  #undef LANG_HOOKS_TREE_INLINING_END_INLINING
  #define LANG_HOOKS_TREE_INLINING_END_INLINING cp_end_inlining
+ #undef LANG_HOOKS_TREE_INLINING_ESTIMATE_NUM_INSNS
+ #define LANG_HOOKS_TREE_INLINING_ESTIMATE_NUM_INSNS c_estimate_num_insns
  #undef LANG_HOOKS_TREE_DUMP_DUMP_TREE_FN
  #define LANG_HOOKS_TREE_DUMP_DUMP_TREE_FN cp_dump_tree
  #undef LANG_HOOKS_TREE_DUMP_TYPE_QUALS_FN
diff -Nrc3p gcc.orig/cp/decl.c gcc/cp/decl.c
*** gcc.orig/cp/decl.c	Mon Jun 30 22:49:38 2003
--- gcc/cp/decl.c	Mon Jun 30 23:13:56 2003
*************** duplicate_decls (tree newdecl, tree oldd
*** 3531,3537 ****
  	      SET_DECL_RTL (newdecl, DECL_RTL (olddecl));
  	    }
  	  else
! 	    DECL_NUM_STMTS (newdecl) = DECL_NUM_STMTS (olddecl);
  
  	  DECL_RESULT (newdecl) = DECL_RESULT (olddecl);
  	  /* Don't clear out the arguments if we're redefining a function.  */
--- 3531,3537 ----
  	      SET_DECL_RTL (newdecl, DECL_RTL (olddecl));
  	    }
  	  else
! 	    DECL_ESTIMATED_INSNS (newdecl) = DECL_ESTIMATED_INSNS (olddecl);
  
  	  DECL_RESULT (newdecl) = DECL_RESULT (olddecl);
  	  /* Don't clear out the arguments if we're redefining a function.  */
*************** start_function (tree declspecs, tree dec
*** 13524,13530 ****
    begin_stmt_tree (&DECL_SAVED_TREE (decl1));
  
    /* Don't double-count statements in templates.  */
!   DECL_NUM_STMTS (decl1) = 0;
  
    /* Let the user know we're compiling this function.  */
    announce_function (decl1);
--- 13524,13530 ----
    begin_stmt_tree (&DECL_SAVED_TREE (decl1));
  
    /* Don't double-count statements in templates.  */
!   DECL_ESTIMATED_INSNS (decl1) = 0;
  
    /* Let the user know we're compiling this function.  */
    announce_function (decl1);
diff -Nrc3p gcc.orig/cp/optimize.c gcc/cp/optimize.c
*** gcc.orig/cp/optimize.c	Mon Jun 30 22:49:38 2003
--- gcc/cp/optimize.c	Mon Jun 30 23:13:56 2003
*************** maybe_clone_body (tree fn)
*** 254,260 ****
  
        /* There are as many statements in the clone as in the
  	 original.  */
!       DECL_NUM_STMTS (clone) = DECL_NUM_STMTS (fn);
  
        /* Clean up.  */
        splay_tree_delete (decl_map);
--- 254,260 ----
  
        /* There are as many statements in the clone as in the
  	 original.  */
!       DECL_ESTIMATED_INSNS (clone) = DECL_ESTIMATED_INSNS (fn);
  
        /* Clean up.  */
        splay_tree_delete (decl_map);
diff -Nrc3p gcc.orig/java/java-tree.h gcc/java/java-tree.h
*** gcc.orig/java/java-tree.h	Mon Jun 30 22:49:23 2003
--- gcc/java/java-tree.h	Mon Jun 30 23:13:56 2003
*************** union lang_tree_node 
*** 922,931 ****
  #define DECL_FIELD_FINAL_WFL(NODE) \
    (DECL_LANG_SPECIFIC(NODE)->u.v.wfl)
  /* In a FUNCTION_DECL for which DECL_BUILT_IN does not hold, this is
!      the approximate number of statements in this function.  There is
       no need for this number to be exact; it is only used in various
       heuristics regarding optimization.  */
! #define DECL_NUM_STMTS(NODE) \
    (FUNCTION_DECL_CHECK (NODE)->decl.u1.i)
  /* True if NODE is a local variable final. */
  #define LOCAL_FINAL_P(NODE) (DECL_LANG_SPECIFIC (NODE) && DECL_FINAL (NODE))
--- 922,931 ----
  #define DECL_FIELD_FINAL_WFL(NODE) \
    (DECL_LANG_SPECIFIC(NODE)->u.v.wfl)
  /* In a FUNCTION_DECL for which DECL_BUILT_IN does not hold, this is
!      the approximate number of instructions in this function.  There is
       no need for this number to be exact; it is only used in various
       heuristics regarding optimization.  */
! #define DECL_ESTIMATED_INSNS(NODE) \
    (FUNCTION_DECL_CHECK (NODE)->decl.u1.i)
  /* True if NODE is a local variable final. */
  #define LOCAL_FINAL_P(NODE) (DECL_LANG_SPECIFIC (NODE) && DECL_FINAL (NODE))
diff -Nrc3p gcc.orig/java/lang.c gcc/java/lang.c
*** gcc.orig/java/lang.c	Mon Jun 30 22:49:23 2003
--- gcc/java/lang.c	Tue Jul  1 00:08:23 2003
*************** static bool java_can_use_bit_fields_p (v
*** 66,71 ****
--- 66,72 ----
  static bool java_dump_tree (void *, tree);
  static void dump_compound_expr (dump_info_p, tree);
  static bool java_decl_ok_for_sibcall (tree);
+ static int java_estimate_num_insns (tree);
  
  #ifndef TARGET_OBJECT_SUFFIX
  # define TARGET_OBJECT_SUFFIX ".o"
*************** struct language_function GTY(())
*** 249,254 ****
--- 250,258 ----
  #undef LANG_HOOKS_TREE_INLINING_WALK_SUBTREES
  #define LANG_HOOKS_TREE_INLINING_WALK_SUBTREES java_tree_inlining_walk_subtrees
  
+ #undef LANG_HOOKS_TREE_INLINING_ESTIMATE_NUM_INSNS
+ #define LANG_HOOKS_TREE_INLINING_ESTIMATE_NUM_INSNS java_estimate_num_insns
+ 
  #undef LANG_HOOKS_TREE_DUMP_DUMP_TREE_FN
  #define LANG_HOOKS_TREE_DUMP_DUMP_TREE_FN java_dump_tree
  
*************** java_decl_ok_for_sibcall (tree decl)
*** 1079,1082 ****
--- 1083,1189 ----
    return decl != NULL && DECL_CONTEXT (decl) == current_class;
  }
  
+ /* Used by estimate_num_insns.  Estimate number of instructions seen
+    by given statement.  */
+ static tree
+ java_estimate_num_insns_1 (tree *tp, int *walk_subtrees, void *data)
+ {
+   int *count = data;
+   tree x = *tp;
+ 
+   if (TYPE_P (x) || DECL_P (x))
+     {
+       *walk_subtrees = 0;
+       return NULL;
+     }
+   /* Assume that constants and references counts nothing.  These should
+      be majorized by amount of operations amoung them we count later
+      and are common target of CSE and similar optimizations.  */
+   if (TREE_CODE_CLASS (TREE_CODE (x)) == 'c'
+       || TREE_CODE_CLASS (TREE_CODE (x)) == 'r')
+     return NULL;
+   switch (TREE_CODE (x))
+     { 
+     /* Reconginze assignments of large structures and constructors of
+        big arrays.  */
+     case MODIFY_EXPR:
+     case CONSTRUCTOR:
+       {
+ 	int size = int_size_in_bytes (TREE_TYPE (x));
+ 
+ 	if (!size || size > MOVE_MAX_PIECES)
+ 	  *count += 10;
+ 	else
+ 	  *count += 2 * (size + MOVE_MAX - 1) / MOVE_MAX;
+ 	return NULL;
+       }
+       break;
+     /* Few special cases of expensive operations.  This is usefull
+        to avoid inlining on functions having too many of these.  */
+     case TRUNC_DIV_EXPR:
+     case CEIL_DIV_EXPR:
+     case FLOOR_DIV_EXPR:
+     case ROUND_DIV_EXPR:
+     case TRUNC_MOD_EXPR:
+     case CEIL_MOD_EXPR:
+     case FLOOR_MOD_EXPR:
+     case ROUND_MOD_EXPR:
+     case RDIV_EXPR:
+     case CALL_EXPR:
+     case METHOD_CALL_EXPR:
+ 
+     case NEW_ARRAY_EXPR:
+     case NEW_ANONYMOUS_ARRAY_EXPR:
+     case NEW_CLASS_EXPR:
+       *count += 10;
+       break;
+     /* Various containers that will produce no code themselves.  */
+     case INIT_EXPR:
+     case TARGET_EXPR:
+     case BIND_EXPR:
+     case BLOCK:
+     case TREE_LIST:
+     case TREE_VEC:
+     case IDENTIFIER_NODE:
+     case PLACEHOLDER_EXPR:
+     case WITH_CLEANUP_EXPR:
+     case CLEANUP_POINT_EXPR:
+     case NOP_EXPR:
+     case VIEW_CONVERT_EXPR:
+     case SAVE_EXPR:
+     case UNSAVE_EXPR:
+     case COMPLEX_EXPR:
+     case REALPART_EXPR:
+     case IMAGPART_EXPR:
+     case TRY_CATCH_EXPR:
+     case TRY_FINALLY_EXPR:
+     case LABEL_EXPR:
+     case EXIT_EXPR:
+     case LABELED_BLOCK_EXPR:
+     case EXIT_BLOCK_EXPR:
+     case EXPR_WITH_FILE_LOCATION:
+     case UNARY_PLUS_EXPR:
+     case THIS_EXPR:
+     case DEFAULT_EXPR:
+     case TRY_EXPR:
+ 
+       break;
+     case CLASS_LITERAL:
+       *walk_subtrees = 0;
+       break;
+     default:
+       (*count)++;
+     }
+   return NULL;
+ }
+ 
+ /*  Estimate number of instructions that will be created by expanding the body.  */
+ static int
+ java_estimate_num_insns (tree decl)
+ {
+   int num = 0;
+   walk_tree_without_duplicates (&DECL_SAVED_TREE (decl), java_estimate_num_insns_1, &num);
+   return num;
+ }
+ 
  #include "gt-java-lang.h"
diff -Nrc3p gcc.orig/java/parse.y gcc/java/parse.y
*** gcc.orig/java/parse.y	Mon Jun 30 22:49:23 2003
--- gcc/java/parse.y	Mon Jun 30 23:13:56 2003
*************** add_stmt_to_block (tree b, tree type, tr
*** 7429,7438 ****
  static tree
  add_stmt_to_compound (tree existing, tree type, tree stmt)
  {
-   /* Keep track of this for inlining.  */
-   if (current_function_decl)
-     ++DECL_NUM_STMTS (current_function_decl);
- 
    if (existing)
      return build (COMPOUND_EXPR, type, existing, stmt);
    else
--- 7429,7434 ----
diff -Nrc3p gcc.orig/langhooks-def.h gcc/langhooks-def.h
*** gcc.orig/langhooks-def.h	Mon Jun 30 22:49:24 2003
--- gcc/langhooks-def.h	Mon Jun 30 23:18:49 2003
*************** void write_global_declarations PARAMS ((
*** 152,157 ****
--- 152,159 ----
    lhd_tree_inlining_end_inlining
  #define LANG_HOOKS_TREE_INLINING_CONVERT_PARM_FOR_INLINING \
    lhd_tree_inlining_convert_parm_for_inlining
+ #define LANG_HOOKS_TREE_INLINING_ESTIMATE_NUM_INSNS \
+   NULL
  
  #define LANG_HOOKS_TREE_INLINING_INITIALIZER { \
    LANG_HOOKS_TREE_INLINING_WALK_SUBTREES, \
*************** void write_global_declarations PARAMS ((
*** 165,171 ****
    LANG_HOOKS_TREE_INLINING_VAR_MOD_TYPE_P, \
    LANG_HOOKS_TREE_INLINING_START_INLINING, \
    LANG_HOOKS_TREE_INLINING_END_INLINING, \
!   LANG_HOOKS_TREE_INLINING_CONVERT_PARM_FOR_INLINING \
  } \
  
  #define LANG_HOOKS_CALLGRAPH_LOWER_FUNCTION NULL
--- 167,174 ----
    LANG_HOOKS_TREE_INLINING_VAR_MOD_TYPE_P, \
    LANG_HOOKS_TREE_INLINING_START_INLINING, \
    LANG_HOOKS_TREE_INLINING_END_INLINING, \
!   LANG_HOOKS_TREE_INLINING_CONVERT_PARM_FOR_INLINING, \
!   LANG_HOOKS_TREE_INLINING_ESTIMATE_NUM_INSNS \
  } \
  
  #define LANG_HOOKS_CALLGRAPH_LOWER_FUNCTION NULL
diff -Nrc3p gcc.orig/langhooks.h gcc/langhooks.h
*** gcc.orig/langhooks.h	Mon Jun 30 22:49:23 2003
--- gcc/langhooks.h	Mon Jun 30 23:14:36 2003
*************** struct lang_hooks_for_tree_inlining
*** 56,61 ****
--- 56,62 ----
    union tree_node *(*convert_parm_for_inlining) PARAMS ((union tree_node *,
  							 union tree_node *,
  							 union tree_node *));
+   int (*estimate_num_insns) PARAMS ((union tree_node *));
  };
  
  struct lang_hooks_for_callgraph
diff -Nrc3p gcc.orig/params.def gcc/params.def
*** gcc.orig/params.def	Mon Jun 30 22:49:40 2003
--- gcc/params.def	Mon Jun 30 23:13:56 2003
*************** Software Foundation, 59 Temple Place - S
*** 39,45 ****
     of a function counted in internal gcc instructions (not in
     real machine instructions) that is eligible for inlining
     by the tree inliner.
!    The default value is 300.
     Only functions marked inline (or methods defined in the class
     definition for C++) are affected by this, unless you set the
     -finline-functions (included in -O3) compiler option.
--- 39,45 ----
     of a function counted in internal gcc instructions (not in
     real machine instructions) that is eligible for inlining
     by the tree inliner.
!    The default value is 100.
     Only functions marked inline (or methods defined in the class
     definition for C++) are affected by this, unless you set the
     -finline-functions (included in -O3) compiler option.
*************** Software Foundation, 59 Temple Place - S
*** 51,57 ****
  DEFPARAM (PARAM_MAX_INLINE_INSNS_SINGLE,
  	  "max-inline-insns-single",
  	  "The maximum number of instructions in a single function eligible for inlining",
! 	  300)
  
  /* The single function inlining limit for functions that are
     inlined by virtue of -finline-functions (-O3).
--- 51,57 ----
  DEFPARAM (PARAM_MAX_INLINE_INSNS_SINGLE,
  	  "max-inline-insns-single",
  	  "The maximum number of instructions in a single function eligible for inlining",
! 	  100)
  
  /* The single function inlining limit for functions that are
     inlined by virtue of -finline-functions (-O3).
*************** DEFPARAM (PARAM_MAX_INLINE_INSNS_SINGLE,
*** 59,69 ****
     that is applied to functions marked inlined (or defined in the
     class declaration in C++) given by the "max-inline-insns-single"
     parameter.
!    The default value is 300.  */
  DEFPARAM (PARAM_MAX_INLINE_INSNS_AUTO,
  	  "max-inline-insns-auto",
  	  "The maximum number of instructions when automatically inlining",
! 	  300)
  
  /* The repeated inlining limit.  After this number of instructions 
     (in the internal gcc representation, not real machine instructions)
--- 59,69 ----
     that is applied to functions marked inlined (or defined in the
     class declaration in C++) given by the "max-inline-insns-single"
     parameter.
!    The default value is 100.  */
  DEFPARAM (PARAM_MAX_INLINE_INSNS_AUTO,
  	  "max-inline-insns-auto",
  	  "The maximum number of instructions when automatically inlining",
! 	  100)
  
  /* The repeated inlining limit.  After this number of instructions 
     (in the internal gcc representation, not real machine instructions)
*************** DEFPARAM (PARAM_MAX_INLINE_INSNS_AUTO,
*** 82,88 ****
  DEFPARAM (PARAM_MAX_INLINE_INSNS,
  	  "max-inline-insns",
  	  "The maximum number of instructions by repeated inlining before gcc starts to throttle inlining",
! 	  600)
  
  /* After the repeated inline limit has been exceeded (see
     "max-inline-insns" parameter), a linear function is used to
--- 82,88 ----
  DEFPARAM (PARAM_MAX_INLINE_INSNS,
  	  "max-inline-insns",
  	  "The maximum number of instructions by repeated inlining before gcc starts to throttle inlining",
! 	  300)
  
  /* After the repeated inline limit has been exceeded (see
     "max-inline-insns" parameter), a linear function is used to
*************** DEFPARAM (PARAM_MAX_INLINE_SLOPE,
*** 108,114 ****
  DEFPARAM (PARAM_MIN_INLINE_INSNS,
  	  "min-inline-insns",
  	  "The number of instructions in a single functions still eligible to inlining after a lot recursive inlining",
! 	  130)
  
  /* For languages that (still) use the RTL inliner, we can specify
     limits for the RTL inliner separately.
--- 108,114 ----
  DEFPARAM (PARAM_MIN_INLINE_INSNS,
  	  "min-inline-insns",
  	  "The number of instructions in a single functions still eligible to inlining after a lot recursive inlining",
! 	  10)
  
  /* For languages that (still) use the RTL inliner, we can specify
     limits for the RTL inliner separately.
diff -Nrc3p gcc.orig/tree-inline.c gcc/tree-inline.c
*** gcc.orig/tree-inline.c	Mon Jun 30 22:49:38 2003
--- gcc/tree-inline.c	Mon Jun 30 23:16:30 2003
*************** typedef struct inline_data
*** 93,101 ****
    int in_target_cleanup_p;
    /* A list of the functions current function has inlined.  */
    varray_type inlined_fns;
!   /* The approximate number of statements we have inlined in the
       current call stack.  */
!   int inlined_stmts;
    /* We use the same mechanism to build clones that we do to perform
       inlining.  However, there are a few places where we need to
       distinguish between those two situations.  This flag is true if
--- 93,101 ----
    int in_target_cleanup_p;
    /* A list of the functions current function has inlined.  */
    varray_type inlined_fns;
!   /* The approximate number of instructions we have inlined in the
       current call stack.  */
!   int inlined_insns;
    /* We use the same mechanism to build clones that we do to perform
       inlining.  However, there are a few places where we need to
       distinguish between those two situations.  This flag is true if
*************** static tree find_alloca_call PARAMS ((tr
*** 131,141 ****
  static tree find_builtin_longjmp_call_1 PARAMS ((tree *, int *, void *));
  static tree find_builtin_longjmp_call PARAMS ((tree));
  
- /* The approximate number of instructions per statement.  This number
-    need not be particularly accurate; it is used only to make
-    decisions about when a function is too big to inline.  */
- #define INSNS_PER_STMT (10)
- 
  /* Remap DECL during the copying of the BLOCK tree for the function.  */
  
  static tree
--- 131,136 ----
*************** inlinable_function_p (fn, id, nolimit)
*** 971,977 ****
       int nolimit;
  {
    int inlinable;
!   int currfn_insns;
    int max_inline_insns_single = MAX_INLINE_INSNS_SINGLE;
  
    /* If we've already decided this function shouldn't be inlined,
--- 966,972 ----
       int nolimit;
  {
    int inlinable;
!   int currfn_insns = 0;
    int max_inline_insns_single = MAX_INLINE_INSNS_SINGLE;
  
    /* If we've already decided this function shouldn't be inlined,
*************** inlinable_function_p (fn, id, nolimit)
*** 991,997 ****
      max_inline_insns_single = MAX_INLINE_INSNS_AUTO;
  	
    /* The number of instructions (estimated) of current function.  */
!   currfn_insns = DECL_NUM_STMTS (fn) * INSNS_PER_STMT;
  
    /* If we're not inlining things, then nothing is inlinable.  */
    if (! flag_inline_trees)
--- 986,995 ----
      max_inline_insns_single = MAX_INLINE_INSNS_AUTO;
  	
    /* The number of instructions (estimated) of current function.  */
!   if (!nolimit && !DECL_ESTIMATED_INSNS (fn))
!     DECL_ESTIMATED_INSNS (fn)
!       = (*lang_hooks.tree_inlining.estimate_num_insns) (fn);
!   currfn_insns = DECL_ESTIMATED_INSNS (fn);
  
    /* If we're not inlining things, then nothing is inlinable.  */
    if (! flag_inline_trees)
*************** inlinable_function_p (fn, id, nolimit)
*** 1040,1047 ****
    if (! (*lang_hooks.tree_inlining.disregard_inline_limits) (fn)
        && inlinable && !nolimit)
      {
!       int sum_insns = (id ? id->inlined_stmts : 0) * INSNS_PER_STMT
! 		     + currfn_insns;
        /* In the extreme case that we have exceeded the recursive inlining
           limit by a huge factor (128), we just say no. Should not happen
           in real life.  */
--- 1038,1044 ----
    if (! (*lang_hooks.tree_inlining.disregard_inline_limits) (fn)
        && inlinable && !nolimit)
      {
!       int sum_insns = (id ? id->inlined_insns : 0) + currfn_insns;
        /* In the extreme case that we have exceeded the recursive inlining
           limit by a huge factor (128), we just say no. Should not happen
           in real life.  */
*************** expand_call_inline (tp, walk_subtrees, d
*** 1429,1437 ****
    TREE_USED (*tp) = 1;
  
    /* Our function now has more statements than it did before.  */
!   DECL_NUM_STMTS (VARRAY_TREE (id->fns, 0)) += DECL_NUM_STMTS (fn);
    /* For accounting, subtract one for the saved call/ret.  */
!   id->inlined_stmts += DECL_NUM_STMTS (fn) - 1;
  
    /* Update callgraph if needed.  */
    if (id->decl && flag_unit_at_a_time)
--- 1426,1434 ----
    TREE_USED (*tp) = 1;
  
    /* Our function now has more statements than it did before.  */
!   DECL_ESTIMATED_INSNS (VARRAY_TREE (id->fns, 0)) += DECL_ESTIMATED_INSNS (fn);
    /* For accounting, subtract one for the saved call/ret.  */
!   id->inlined_insns += DECL_ESTIMATED_INSNS (fn) - 1;
  
    /* Update callgraph if needed.  */
    if (id->decl && flag_unit_at_a_time)
*************** expand_call_inline (tp, walk_subtrees, d
*** 1447,1453 ****
    /* If we've returned to the top level, clear out the record of how
       much inlining has been done.  */
    if (VARRAY_ACTIVE_SIZE (id->fns) == id->first_inlined_fn)
!     id->inlined_stmts = 0;
  
    /* Don't walk into subtrees.  We've already handled them above.  */
    *walk_subtrees = 0;
--- 1444,1450 ----
    /* If we've returned to the top level, clear out the record of how
       much inlining has been done.  */
    if (VARRAY_ACTIVE_SIZE (id->fns) == id->first_inlined_fn)
!     id->inlined_insns = 0;
  
    /* Don't walk into subtrees.  We've already handled them above.  */
    *walk_subtrees = 0;
*************** optimize_inline_calls (fn)
*** 1490,1495 ****
--- 1487,1495 ----
    /* Don't allow recursion into FN.  */
    VARRAY_TREE_INIT (id.fns, 32, "fns");
    VARRAY_PUSH_TREE (id.fns, fn);
+   if (!DECL_ESTIMATED_INSNS (fn))
+     DECL_ESTIMATED_INSNS (fn) 
+       = (*lang_hooks.tree_inlining.estimate_num_insns) (fn);
    /* Or any functions that aren't finished yet.  */
    prev_fn = NULL_TREE;
    if (current_function_decl)



More information about the Gcc-patches mailing list