This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
[RFA] new code size estimation for inlining
- From: Jan Hubicka <jh at suse dot cz>
- To: pfeifer at dbai dot tuwien dot ac dot at, rth at redhat dot com,gcc-patches at gcc dot gnu dot org
- Date: Mon, 30 Jun 2003 22:56:31 +0200
- Subject: [RFA] new code size estimation for inlining
Hi,
this patch implements the new estimation of body size via langhook. My
estimation is based on counting number of operations encoded in the trees that
should be better than number of statements as it more closely responds to the
memory requirements, number of isntructions and perhaps also the performance :)
I've set the default values to get approximately as much of inlining as we do
right now. This is perhaps too much but I would like to get all the parts of
infrastructure I am working on at place before tunning it too hard.
The sizes of binaries behaves like:
164.gzip: Base: 59720 bytes
164.gzip: Peak: 59720 bytes
175.vpr: Base: 158808 bytes
175.vpr: Peak: 161144 bytes
176.gcc: Base: 1727744 bytes
176.gcc: Peak: 1677600 bytes
181.mcf: Base: 20816 bytes
181.mcf: Peak: 20816 bytes
186.crafty: Base: 206048 bytes
186.crafty: Peak: 206016 bytes
197.parser: Base: 136608 bytes
197.parser: Peak: 141472 bytes
252.eon: Base: 574464 bytes
252.eon: Peak: 567552 bytes
253.perlbmk: Base: 667992 bytes
253.perlbmk: Peak: 684280 bytes
254.gap: Base: 537464 bytes
254.gap: Peak: 551736 bytes
255.vortex: Base: 638120 bytes
255.vortex: Peak: 630216 bytes
256.bzip2: Base: 58216 bytes
256.bzip2: Peak: 58216 bytes
300.twolf: Base: 201488 bytes
300.twolf: Peak: 202448 bytes
=============================
Total: Base: 4987488 bytes
Total: Peak: 4961216 bytes
Compilation time is unchanged. (375s vs 371s), same as bootstrap time.
Scores are also similar (slightly better):
164.gzip 1400 200 699* 1400 198 707*
175.vpr 1400 183 766* 1400 181 772*
176.gcc 1100 130 843* 1100 131 838*
181.mcf 1800 351 512* 1800 352 511*
186.crafty 1000 80.8 1238* 1000 80.6 1241*
197.parser 1800 337 535* 1800 334 538*
252.eon 1300 140 929* 1300 137 947*
253.perlbmk 1800 218 826* 1800 219 821*
254.gap 1100 158 696* 1100 158 696*
255.vortex 1900 173 1096* 1900 173 1098*
256.bzip2 1500 200 750* 1500 199 755*
300.twolf 3000 387 776* 3000 385 779*
Est. SPECint_base2000 782
Est. SPECint2000 785
Since SPEC2000 is poor testcase for inlining I also use Gerald's application.
On his application I get:
time stripped size
mainline 3m0.440s 1606860
new estimate 2m55.360s 1667564
unit-at-a-time based inlining 2m23.130s 1486764
So the this patch is not major win, but not particularly well tunned (this is
my very first try with new counts) unit-at-a-time patch with this estimate
already is. (22% speedup, 8% code size savings).
I don't have up-to-date-unit-at-a-time-SPEC2000-scores (TM) but these were
mostly the same in last run I tested except for EON saving about 11% of code
size and 5% of compilation time.
The performance is as follows (times in seconds followed by variations
between 3 runs):
| mainline | new size est.| unit-at-a-time|
---------------------+----------------+----------------+----------------+
STRATCOMP1-ALL | 3.11 (0.00) | 2.88 (0.00) | 2.87 (0.00) |
STRATCOMP-770.2-Q | 0.48 (0.00) | 0.46 (0.00) | 0.47 (0.00) |
2QBF1 | 12.59 (0.01) | 11.85 (0.02) | 13.03 (0.00) |
PRIMEIMPL2 | 6.06 (0.00) | 6.02 (0.00) | 6.19 (0.00) |
ANCESTOR | 95.17 (0.02) | 91.50 (0.14) | 89.18 (0.10) |
3COL-SIMPLEX1 | 4.93 (0.02) | 4.70 (0.02) | 4.52 (0.04) |
3COL-LADDER | 116.72 (0.33) | 110.92 (0.12) | 105.46 (0.03) |
3COL-N-LADDER | - (-) | - (-) | - (-) |
3COL-RANDOM1 | 8.73 (0.01) | 8.65 (0.05) | 8.77 (0.07) |
HP-RANDOM1 | 9.57 (0.03) | 9.39 (0.10) | 9.25 (0.07) |
HAMCYCLE-FREE | 0.95 (0.00) | 0.92 (0.00) | 0.83 (0.00) |
DECOMP2 | 8.95 (0.05) | 8.26 (0.07) | 8.45 (0.03) |
BW-P4-Esra-a | 63.83 (0.11) | 64.57 (0.10) | 63.89 (0.05) |
BW-P5-nopush | 5.99 (0.00) | 6.03 (0.00) | 5.96 (0.00) |
BW-P5-pushbin | 5.14 (0.00) | 5.19 (0.01) | 5.14 (0.01) |
BW-P5-nopushbin | 1.54 (0.00) | 1.54 (0.01) | 1.52 (0.00) |
3SAT-1 | 19.23 (0.10) | 19.17 (0.14) | 19.61 (0.05) |
3SAT-1-CONSTRAINT | 10.54 (0.06) | 10.73 (0.03) | 10.90 (0.06) |
HANOI-Towers | 2.95 (0.02) | 2.84 (0.02) | 2.88 (0.00) |
RAMSEY | 5.40 (0.01) | 5.18 (0.01) | 4.98 (0.01) |
CRISTAL | 6.41 (0.01) | 6.14 (0.01) | 6.01 (0.06) |
HANOI-K | 23.01 (0.12) | 23.92 (0.18) | 23.83 (0.07) |
21-QUEENS | 5.55 (0.07) | 5.51 (0.01) | 5.49 (0.02) |
MSTDir[V=13,A=40] | 9.53 (0.00) | 9.73 (0.00) | 9.26 (0.01) |
MSTDir[V=15,A=40] | 9.58 (0.01) | 9.69 (0.00) | 9.22 (0.00) |
MSTUndir[V=13,A=40] | 5.17 (0.01) | 5.27 (0.00) | 4.99 (0.00) |
MSTUndir[V=15,A=40] | 84.05 (0.01) | 85.19 (0.02) | 81.14 (0.02) |
TIMETABLING | 7.05 (0.00) | 6.81 (0.02) | 6.59 (0.01) |
---------------------+----------------+----------------+----------------+
Mon Jun 30 22:50:59 CEST 2003 Jan Hubicka <jh@suse.cz>
* c-common.c (c_estimate_num_insns_1): New static function.
(c_estimate_num_insns): New global function.
* c-common.h (DECL_NUM_STMTS): Rename to...
(DECL_ESTIMATED_INSNS): ... this.
(c_estimate_num_insns): Declare.
* c-decl.c (duplicate_decls): Use DECL_ESTIMATED_INSNS.
* c-lang.c (LANG_HOOKS_TREE_INLINING_ESTIMATE_NUM_INSNS): New.
* c-semantics.c (add_stmt): Do not account statements.
* langhooks-def.h (LANG_HOOKS_TREE_INLINING_ESTIMATE_NUM_INSNS):
New.
* langhooks.h (lang_hooks_for_tree_inlining): Add
estimate_num_insns
* params.def (max-inline-insns-auto, max-inline-insns-auto): set
to 100.
(max-inline-insns): set to 300.
(min-inline-insns): set to 10.
* tree-inline.c (struct inline_data): Rename inlined_stmts to
inlined-insns.
(INSNS_PER_STMT): Kill.
(inlinable_function_p): Compute and store body size.
(expand_call_inline): Likewise.
(optimize_inline_calls): Likewise.
* cp-lang.c (LANG_HOOKS_TREE_INLINING_ESTIMATE_NUM_INSNS): New.
* decl.c (duplicate_decls): Use DECL_ESTIMATED_INSNS.
(start_function): Use DECL_ESTIMATED_INSNS.
* optimize.c (maybe_clone_body): Use DECL_ESTIMATED_INSNS.
* java-tree.h (DECL_NUM_STMTS): Rename to...
(DECL_ESTIMATED_INSNS): ... this.
* lang.c (java_estimate_num_insns, java_estimate_num_insns_1):
New static functions.
(LANG_HOOKS_TREE_INLINING_ESTIMATE_NUM_INSNS): Define.
* parser.y (add_stmt_to_compound): Do not account statements.
diff -Nrc3p gcc.orig/c-common.c gcc/c-common.c
*** gcc.orig/c-common.c Mon Jun 30 22:49:23 2003
--- gcc/c-common.c Mon Jun 30 23:31:28 2003
*************** check_function_arguments_recurse (void (
*** 6014,6017 ****
--- 6014,6127 ----
(*callback) (ctx, param, param_num);
}
+ /* Used by estimate_num_insns. Estimate number of instructions seen
+ by given statement. */
+ static tree
+ c_estimate_num_insns_1 (tree *tp, int *walk_subtrees, void *data)
+ {
+ int *count = data;
+ tree x = *tp;
+
+ if (TYPE_P (x) || DECL_P (x))
+ {
+ *walk_subtrees = 0;
+ return NULL;
+ }
+ /* Assume that constants and references counts nothing. These should
+ be majorized by amount of operations amoung them we count later
+ and are common target of CSE and similar optimizations. */
+ if (TREE_CODE_CLASS (TREE_CODE (x)) == 'c'
+ || TREE_CODE_CLASS (TREE_CODE (x)) == 'r')
+ return NULL;
+ switch (TREE_CODE (x))
+ {
+ /* Reconginze assignments of large structures and constructors of
+ big arrays. */
+ case MODIFY_EXPR:
+ case CONSTRUCTOR:
+ {
+ int size = int_size_in_bytes (TREE_TYPE (x));
+
+ if (!size || size > MOVE_MAX_PIECES)
+ *count += 10;
+ else
+ *count += 2 * (size + MOVE_MAX - 1) / MOVE_MAX;
+ return NULL;
+ }
+ break;
+ /* Few special cases of expensive operations. This is usefull
+ to avoid inlining on functions having too many of these. */
+ case TRUNC_DIV_EXPR:
+ case CEIL_DIV_EXPR:
+ case FLOOR_DIV_EXPR:
+ case ROUND_DIV_EXPR:
+ case TRUNC_MOD_EXPR:
+ case CEIL_MOD_EXPR:
+ case FLOOR_MOD_EXPR:
+ case ROUND_MOD_EXPR:
+ case RDIV_EXPR:
+ case CALL_EXPR:
+ case METHOD_CALL_EXPR:
+ *count += 10;
+ break;
+ /* Various containers that will produce no code themselves. */
+ case INIT_EXPR:
+ case TARGET_EXPR:
+ case BIND_EXPR:
+ case BLOCK:
+ case TREE_LIST:
+ case TREE_VEC:
+ case IDENTIFIER_NODE:
+ case PLACEHOLDER_EXPR:
+ case WITH_CLEANUP_EXPR:
+ case CLEANUP_POINT_EXPR:
+ case NOP_EXPR:
+ case VIEW_CONVERT_EXPR:
+ case SAVE_EXPR:
+ case UNSAVE_EXPR:
+ case COMPLEX_EXPR:
+ case REALPART_EXPR:
+ case IMAGPART_EXPR:
+ case TRY_CATCH_EXPR:
+ case TRY_FINALLY_EXPR:
+ case LABEL_EXPR:
+ case EXIT_EXPR:
+ case LABELED_BLOCK_EXPR:
+ case EXIT_BLOCK_EXPR:
+ case EXPR_WITH_FILE_LOCATION:
+
+ case EXPR_STMT:
+ case COMPOUND_STMT:
+ case RETURN_STMT:
+ case LABEL_STMT:
+ case SCOPE_STMT:
+ case FILE_STMT:
+ case CASE_LABEL:
+ case STMT_EXPR:
+ case CLEANUP_STMT:
+
+ case SIZEOF_EXPR:
+ case ARROW_EXPR:
+ case ALIGNOF_EXPR:
+ break;
+ case DECL_STMT:
+ /* Do not account static initializers. */
+ if (TREE_STATIC (TREE_OPERAND (x, 0)))
+ *walk_subtrees = 0;
+ break;
+ default:
+ (*count)++;
+ }
+ return NULL;
+ }
+
+ /* Estimate number of instructions that will be created by expanding the body. */
+ int
+ c_estimate_num_insns (tree decl)
+ {
+ int num = 0;
+ walk_tree_without_duplicates (&DECL_SAVED_TREE (decl), c_estimate_num_insns_1, &num);
+ return num;
+ }
+
#include "gt-c-common.h"
diff -Nrc3p gcc.orig/c-common.h gcc/c-common.h
*** gcc.orig/c-common.h Mon Jun 30 22:49:23 2003
--- gcc/c-common.h Mon Jun 30 23:18:01 2003
*************** struct c_lang_decl GTY(()) {
*** 348,354 ****
the approximate number of statements in this function. There is
no need for this number to be exact; it is only used in various
heuristics regarding optimization. */
! #define DECL_NUM_STMTS(NODE) \
(FUNCTION_DECL_CHECK (NODE)->decl.u1.i)
/* The variant of the C language being processed. Each C language
--- 348,354 ----
the approximate number of statements in this function. There is
no need for this number to be exact; it is only used in various
heuristics regarding optimization. */
! #define DECL_ESTIMATED_INSNS(NODE) \
(FUNCTION_DECL_CHECK (NODE)->decl.u1.i)
/* The variant of the C language being processed. Each C language
*************** extern void c_common_write_pch (void);
*** 1295,1300 ****
--- 1295,1301 ----
extern void builtin_define_with_value (const char *, const char *, int);
extern void c_stddef_cpp_builtins (void);
extern void fe_file_change (const struct line_map *);
+ extern int c_estimate_num_insns (tree decl);
/* In c-ppoutput.c */
extern void init_pp_output (FILE *);
diff -Nrc3p gcc.orig/c-decl.c gcc/c-decl.c
*** gcc.orig/c-decl.c Mon Jun 30 22:49:39 2003
--- gcc/c-decl.c Mon Jun 30 23:13:56 2003
*************** duplicate_decls (tree newdecl, tree oldd
*** 1474,1480 ****
DECL_INITIAL (newdecl) = DECL_INITIAL (olddecl);
DECL_SAVED_INSNS (newdecl) = DECL_SAVED_INSNS (olddecl);
DECL_SAVED_TREE (newdecl) = DECL_SAVED_TREE (olddecl);
! DECL_NUM_STMTS (newdecl) = DECL_NUM_STMTS (olddecl);
DECL_ARGUMENTS (newdecl) = DECL_ARGUMENTS (olddecl);
/* Set DECL_INLINE on the declaration if we've got a body
--- 1474,1480 ----
DECL_INITIAL (newdecl) = DECL_INITIAL (olddecl);
DECL_SAVED_INSNS (newdecl) = DECL_SAVED_INSNS (olddecl);
DECL_SAVED_TREE (newdecl) = DECL_SAVED_TREE (olddecl);
! DECL_ESTIMATED_INSNS (newdecl) = DECL_ESTIMATED_INSNS (olddecl);
DECL_ARGUMENTS (newdecl) = DECL_ARGUMENTS (olddecl);
/* Set DECL_INLINE on the declaration if we've got a body
diff -Nrc3p gcc.orig/c-lang.c gcc/c-lang.c
*** gcc.orig/c-lang.c Mon Jun 30 22:50:21 2003
--- gcc/c-lang.c Mon Jun 30 23:36:57 2003
*************** static int c_init_options (void);
*** 98,103 ****
--- 98,105 ----
#undef LANG_HOOKS_TREE_INLINING_CONVERT_PARM_FOR_INLINING
#define LANG_HOOKS_TREE_INLINING_CONVERT_PARM_FOR_INLINING \
c_convert_parm_for_inlining
+ #undef LANG_HOOKS_TREE_INLINING_ESTIMATE_NUM_INSNS
+ #define LANG_HOOKS_TREE_INLINING_ESTIMATE_NUM_INSNS c_estimate_num_insns
#undef LANG_HOOKS_TREE_DUMP_DUMP_TREE_FN
#define LANG_HOOKS_TREE_DUMP_DUMP_TREE_FN c_dump_tree
diff -Nrc3p gcc.orig/c-semantics.c gcc/c-semantics.c
*** gcc.orig/c-semantics.c Mon Jun 30 22:49:23 2003
--- gcc/c-semantics.c Mon Jun 30 23:13:56 2003
*************** add_stmt (tree t)
*** 98,107 ****
statements are full-expressions. We record that fact here. */
STMT_IS_FULL_EXPR_P (last_tree) = stmts_are_full_exprs_p ();
- /* Keep track of the number of statements in this function. */
- if (current_function_decl)
- ++DECL_NUM_STMTS (current_function_decl);
-
return t;
}
--- 98,103 ----
diff -Nrc3p gcc.orig/cp/cp-lang.c gcc/cp/cp-lang.c
*** gcc.orig/cp/cp-lang.c Mon Jun 30 22:49:38 2003
--- gcc/cp/cp-lang.c Tue Jul 1 00:02:00 2003
*************** static bool cp_var_mod_type_p (tree);
*** 138,143 ****
--- 138,145 ----
#define LANG_HOOKS_TREE_INLINING_START_INLINING cp_start_inlining
#undef LANG_HOOKS_TREE_INLINING_END_INLINING
#define LANG_HOOKS_TREE_INLINING_END_INLINING cp_end_inlining
+ #undef LANG_HOOKS_TREE_INLINING_ESTIMATE_NUM_INSNS
+ #define LANG_HOOKS_TREE_INLINING_ESTIMATE_NUM_INSNS c_estimate_num_insns
#undef LANG_HOOKS_TREE_DUMP_DUMP_TREE_FN
#define LANG_HOOKS_TREE_DUMP_DUMP_TREE_FN cp_dump_tree
#undef LANG_HOOKS_TREE_DUMP_TYPE_QUALS_FN
diff -Nrc3p gcc.orig/cp/decl.c gcc/cp/decl.c
*** gcc.orig/cp/decl.c Mon Jun 30 22:49:38 2003
--- gcc/cp/decl.c Mon Jun 30 23:13:56 2003
*************** duplicate_decls (tree newdecl, tree oldd
*** 3531,3537 ****
SET_DECL_RTL (newdecl, DECL_RTL (olddecl));
}
else
! DECL_NUM_STMTS (newdecl) = DECL_NUM_STMTS (olddecl);
DECL_RESULT (newdecl) = DECL_RESULT (olddecl);
/* Don't clear out the arguments if we're redefining a function. */
--- 3531,3537 ----
SET_DECL_RTL (newdecl, DECL_RTL (olddecl));
}
else
! DECL_ESTIMATED_INSNS (newdecl) = DECL_ESTIMATED_INSNS (olddecl);
DECL_RESULT (newdecl) = DECL_RESULT (olddecl);
/* Don't clear out the arguments if we're redefining a function. */
*************** start_function (tree declspecs, tree dec
*** 13524,13530 ****
begin_stmt_tree (&DECL_SAVED_TREE (decl1));
/* Don't double-count statements in templates. */
! DECL_NUM_STMTS (decl1) = 0;
/* Let the user know we're compiling this function. */
announce_function (decl1);
--- 13524,13530 ----
begin_stmt_tree (&DECL_SAVED_TREE (decl1));
/* Don't double-count statements in templates. */
! DECL_ESTIMATED_INSNS (decl1) = 0;
/* Let the user know we're compiling this function. */
announce_function (decl1);
diff -Nrc3p gcc.orig/cp/optimize.c gcc/cp/optimize.c
*** gcc.orig/cp/optimize.c Mon Jun 30 22:49:38 2003
--- gcc/cp/optimize.c Mon Jun 30 23:13:56 2003
*************** maybe_clone_body (tree fn)
*** 254,260 ****
/* There are as many statements in the clone as in the
original. */
! DECL_NUM_STMTS (clone) = DECL_NUM_STMTS (fn);
/* Clean up. */
splay_tree_delete (decl_map);
--- 254,260 ----
/* There are as many statements in the clone as in the
original. */
! DECL_ESTIMATED_INSNS (clone) = DECL_ESTIMATED_INSNS (fn);
/* Clean up. */
splay_tree_delete (decl_map);
diff -Nrc3p gcc.orig/java/java-tree.h gcc/java/java-tree.h
*** gcc.orig/java/java-tree.h Mon Jun 30 22:49:23 2003
--- gcc/java/java-tree.h Mon Jun 30 23:13:56 2003
*************** union lang_tree_node
*** 922,931 ****
#define DECL_FIELD_FINAL_WFL(NODE) \
(DECL_LANG_SPECIFIC(NODE)->u.v.wfl)
/* In a FUNCTION_DECL for which DECL_BUILT_IN does not hold, this is
! the approximate number of statements in this function. There is
no need for this number to be exact; it is only used in various
heuristics regarding optimization. */
! #define DECL_NUM_STMTS(NODE) \
(FUNCTION_DECL_CHECK (NODE)->decl.u1.i)
/* True if NODE is a local variable final. */
#define LOCAL_FINAL_P(NODE) (DECL_LANG_SPECIFIC (NODE) && DECL_FINAL (NODE))
--- 922,931 ----
#define DECL_FIELD_FINAL_WFL(NODE) \
(DECL_LANG_SPECIFIC(NODE)->u.v.wfl)
/* In a FUNCTION_DECL for which DECL_BUILT_IN does not hold, this is
! the approximate number of instructions in this function. There is
no need for this number to be exact; it is only used in various
heuristics regarding optimization. */
! #define DECL_ESTIMATED_INSNS(NODE) \
(FUNCTION_DECL_CHECK (NODE)->decl.u1.i)
/* True if NODE is a local variable final. */
#define LOCAL_FINAL_P(NODE) (DECL_LANG_SPECIFIC (NODE) && DECL_FINAL (NODE))
diff -Nrc3p gcc.orig/java/lang.c gcc/java/lang.c
*** gcc.orig/java/lang.c Mon Jun 30 22:49:23 2003
--- gcc/java/lang.c Tue Jul 1 00:08:23 2003
*************** static bool java_can_use_bit_fields_p (v
*** 66,71 ****
--- 66,72 ----
static bool java_dump_tree (void *, tree);
static void dump_compound_expr (dump_info_p, tree);
static bool java_decl_ok_for_sibcall (tree);
+ static int java_estimate_num_insns (tree);
#ifndef TARGET_OBJECT_SUFFIX
# define TARGET_OBJECT_SUFFIX ".o"
*************** struct language_function GTY(())
*** 249,254 ****
--- 250,258 ----
#undef LANG_HOOKS_TREE_INLINING_WALK_SUBTREES
#define LANG_HOOKS_TREE_INLINING_WALK_SUBTREES java_tree_inlining_walk_subtrees
+ #undef LANG_HOOKS_TREE_INLINING_ESTIMATE_NUM_INSNS
+ #define LANG_HOOKS_TREE_INLINING_ESTIMATE_NUM_INSNS java_estimate_num_insns
+
#undef LANG_HOOKS_TREE_DUMP_DUMP_TREE_FN
#define LANG_HOOKS_TREE_DUMP_DUMP_TREE_FN java_dump_tree
*************** java_decl_ok_for_sibcall (tree decl)
*** 1079,1082 ****
--- 1083,1189 ----
return decl != NULL && DECL_CONTEXT (decl) == current_class;
}
+ /* Used by estimate_num_insns. Estimate number of instructions seen
+ by given statement. */
+ static tree
+ java_estimate_num_insns_1 (tree *tp, int *walk_subtrees, void *data)
+ {
+ int *count = data;
+ tree x = *tp;
+
+ if (TYPE_P (x) || DECL_P (x))
+ {
+ *walk_subtrees = 0;
+ return NULL;
+ }
+ /* Assume that constants and references counts nothing. These should
+ be majorized by amount of operations amoung them we count later
+ and are common target of CSE and similar optimizations. */
+ if (TREE_CODE_CLASS (TREE_CODE (x)) == 'c'
+ || TREE_CODE_CLASS (TREE_CODE (x)) == 'r')
+ return NULL;
+ switch (TREE_CODE (x))
+ {
+ /* Reconginze assignments of large structures and constructors of
+ big arrays. */
+ case MODIFY_EXPR:
+ case CONSTRUCTOR:
+ {
+ int size = int_size_in_bytes (TREE_TYPE (x));
+
+ if (!size || size > MOVE_MAX_PIECES)
+ *count += 10;
+ else
+ *count += 2 * (size + MOVE_MAX - 1) / MOVE_MAX;
+ return NULL;
+ }
+ break;
+ /* Few special cases of expensive operations. This is usefull
+ to avoid inlining on functions having too many of these. */
+ case TRUNC_DIV_EXPR:
+ case CEIL_DIV_EXPR:
+ case FLOOR_DIV_EXPR:
+ case ROUND_DIV_EXPR:
+ case TRUNC_MOD_EXPR:
+ case CEIL_MOD_EXPR:
+ case FLOOR_MOD_EXPR:
+ case ROUND_MOD_EXPR:
+ case RDIV_EXPR:
+ case CALL_EXPR:
+ case METHOD_CALL_EXPR:
+
+ case NEW_ARRAY_EXPR:
+ case NEW_ANONYMOUS_ARRAY_EXPR:
+ case NEW_CLASS_EXPR:
+ *count += 10;
+ break;
+ /* Various containers that will produce no code themselves. */
+ case INIT_EXPR:
+ case TARGET_EXPR:
+ case BIND_EXPR:
+ case BLOCK:
+ case TREE_LIST:
+ case TREE_VEC:
+ case IDENTIFIER_NODE:
+ case PLACEHOLDER_EXPR:
+ case WITH_CLEANUP_EXPR:
+ case CLEANUP_POINT_EXPR:
+ case NOP_EXPR:
+ case VIEW_CONVERT_EXPR:
+ case SAVE_EXPR:
+ case UNSAVE_EXPR:
+ case COMPLEX_EXPR:
+ case REALPART_EXPR:
+ case IMAGPART_EXPR:
+ case TRY_CATCH_EXPR:
+ case TRY_FINALLY_EXPR:
+ case LABEL_EXPR:
+ case EXIT_EXPR:
+ case LABELED_BLOCK_EXPR:
+ case EXIT_BLOCK_EXPR:
+ case EXPR_WITH_FILE_LOCATION:
+ case UNARY_PLUS_EXPR:
+ case THIS_EXPR:
+ case DEFAULT_EXPR:
+ case TRY_EXPR:
+
+ break;
+ case CLASS_LITERAL:
+ *walk_subtrees = 0;
+ break;
+ default:
+ (*count)++;
+ }
+ return NULL;
+ }
+
+ /* Estimate number of instructions that will be created by expanding the body. */
+ static int
+ java_estimate_num_insns (tree decl)
+ {
+ int num = 0;
+ walk_tree_without_duplicates (&DECL_SAVED_TREE (decl), java_estimate_num_insns_1, &num);
+ return num;
+ }
+
#include "gt-java-lang.h"
diff -Nrc3p gcc.orig/java/parse.y gcc/java/parse.y
*** gcc.orig/java/parse.y Mon Jun 30 22:49:23 2003
--- gcc/java/parse.y Mon Jun 30 23:13:56 2003
*************** add_stmt_to_block (tree b, tree type, tr
*** 7429,7438 ****
static tree
add_stmt_to_compound (tree existing, tree type, tree stmt)
{
- /* Keep track of this for inlining. */
- if (current_function_decl)
- ++DECL_NUM_STMTS (current_function_decl);
-
if (existing)
return build (COMPOUND_EXPR, type, existing, stmt);
else
--- 7429,7434 ----
diff -Nrc3p gcc.orig/langhooks-def.h gcc/langhooks-def.h
*** gcc.orig/langhooks-def.h Mon Jun 30 22:49:24 2003
--- gcc/langhooks-def.h Mon Jun 30 23:18:49 2003
*************** void write_global_declarations PARAMS ((
*** 152,157 ****
--- 152,159 ----
lhd_tree_inlining_end_inlining
#define LANG_HOOKS_TREE_INLINING_CONVERT_PARM_FOR_INLINING \
lhd_tree_inlining_convert_parm_for_inlining
+ #define LANG_HOOKS_TREE_INLINING_ESTIMATE_NUM_INSNS \
+ NULL
#define LANG_HOOKS_TREE_INLINING_INITIALIZER { \
LANG_HOOKS_TREE_INLINING_WALK_SUBTREES, \
*************** void write_global_declarations PARAMS ((
*** 165,171 ****
LANG_HOOKS_TREE_INLINING_VAR_MOD_TYPE_P, \
LANG_HOOKS_TREE_INLINING_START_INLINING, \
LANG_HOOKS_TREE_INLINING_END_INLINING, \
! LANG_HOOKS_TREE_INLINING_CONVERT_PARM_FOR_INLINING \
} \
#define LANG_HOOKS_CALLGRAPH_LOWER_FUNCTION NULL
--- 167,174 ----
LANG_HOOKS_TREE_INLINING_VAR_MOD_TYPE_P, \
LANG_HOOKS_TREE_INLINING_START_INLINING, \
LANG_HOOKS_TREE_INLINING_END_INLINING, \
! LANG_HOOKS_TREE_INLINING_CONVERT_PARM_FOR_INLINING, \
! LANG_HOOKS_TREE_INLINING_ESTIMATE_NUM_INSNS \
} \
#define LANG_HOOKS_CALLGRAPH_LOWER_FUNCTION NULL
diff -Nrc3p gcc.orig/langhooks.h gcc/langhooks.h
*** gcc.orig/langhooks.h Mon Jun 30 22:49:23 2003
--- gcc/langhooks.h Mon Jun 30 23:14:36 2003
*************** struct lang_hooks_for_tree_inlining
*** 56,61 ****
--- 56,62 ----
union tree_node *(*convert_parm_for_inlining) PARAMS ((union tree_node *,
union tree_node *,
union tree_node *));
+ int (*estimate_num_insns) PARAMS ((union tree_node *));
};
struct lang_hooks_for_callgraph
diff -Nrc3p gcc.orig/params.def gcc/params.def
*** gcc.orig/params.def Mon Jun 30 22:49:40 2003
--- gcc/params.def Mon Jun 30 23:13:56 2003
*************** Software Foundation, 59 Temple Place - S
*** 39,45 ****
of a function counted in internal gcc instructions (not in
real machine instructions) that is eligible for inlining
by the tree inliner.
! The default value is 300.
Only functions marked inline (or methods defined in the class
definition for C++) are affected by this, unless you set the
-finline-functions (included in -O3) compiler option.
--- 39,45 ----
of a function counted in internal gcc instructions (not in
real machine instructions) that is eligible for inlining
by the tree inliner.
! The default value is 100.
Only functions marked inline (or methods defined in the class
definition for C++) are affected by this, unless you set the
-finline-functions (included in -O3) compiler option.
*************** Software Foundation, 59 Temple Place - S
*** 51,57 ****
DEFPARAM (PARAM_MAX_INLINE_INSNS_SINGLE,
"max-inline-insns-single",
"The maximum number of instructions in a single function eligible for inlining",
! 300)
/* The single function inlining limit for functions that are
inlined by virtue of -finline-functions (-O3).
--- 51,57 ----
DEFPARAM (PARAM_MAX_INLINE_INSNS_SINGLE,
"max-inline-insns-single",
"The maximum number of instructions in a single function eligible for inlining",
! 100)
/* The single function inlining limit for functions that are
inlined by virtue of -finline-functions (-O3).
*************** DEFPARAM (PARAM_MAX_INLINE_INSNS_SINGLE,
*** 59,69 ****
that is applied to functions marked inlined (or defined in the
class declaration in C++) given by the "max-inline-insns-single"
parameter.
! The default value is 300. */
DEFPARAM (PARAM_MAX_INLINE_INSNS_AUTO,
"max-inline-insns-auto",
"The maximum number of instructions when automatically inlining",
! 300)
/* The repeated inlining limit. After this number of instructions
(in the internal gcc representation, not real machine instructions)
--- 59,69 ----
that is applied to functions marked inlined (or defined in the
class declaration in C++) given by the "max-inline-insns-single"
parameter.
! The default value is 100. */
DEFPARAM (PARAM_MAX_INLINE_INSNS_AUTO,
"max-inline-insns-auto",
"The maximum number of instructions when automatically inlining",
! 100)
/* The repeated inlining limit. After this number of instructions
(in the internal gcc representation, not real machine instructions)
*************** DEFPARAM (PARAM_MAX_INLINE_INSNS_AUTO,
*** 82,88 ****
DEFPARAM (PARAM_MAX_INLINE_INSNS,
"max-inline-insns",
"The maximum number of instructions by repeated inlining before gcc starts to throttle inlining",
! 600)
/* After the repeated inline limit has been exceeded (see
"max-inline-insns" parameter), a linear function is used to
--- 82,88 ----
DEFPARAM (PARAM_MAX_INLINE_INSNS,
"max-inline-insns",
"The maximum number of instructions by repeated inlining before gcc starts to throttle inlining",
! 300)
/* After the repeated inline limit has been exceeded (see
"max-inline-insns" parameter), a linear function is used to
*************** DEFPARAM (PARAM_MAX_INLINE_SLOPE,
*** 108,114 ****
DEFPARAM (PARAM_MIN_INLINE_INSNS,
"min-inline-insns",
"The number of instructions in a single functions still eligible to inlining after a lot recursive inlining",
! 130)
/* For languages that (still) use the RTL inliner, we can specify
limits for the RTL inliner separately.
--- 108,114 ----
DEFPARAM (PARAM_MIN_INLINE_INSNS,
"min-inline-insns",
"The number of instructions in a single functions still eligible to inlining after a lot recursive inlining",
! 10)
/* For languages that (still) use the RTL inliner, we can specify
limits for the RTL inliner separately.
diff -Nrc3p gcc.orig/tree-inline.c gcc/tree-inline.c
*** gcc.orig/tree-inline.c Mon Jun 30 22:49:38 2003
--- gcc/tree-inline.c Mon Jun 30 23:16:30 2003
*************** typedef struct inline_data
*** 93,101 ****
int in_target_cleanup_p;
/* A list of the functions current function has inlined. */
varray_type inlined_fns;
! /* The approximate number of statements we have inlined in the
current call stack. */
! int inlined_stmts;
/* We use the same mechanism to build clones that we do to perform
inlining. However, there are a few places where we need to
distinguish between those two situations. This flag is true if
--- 93,101 ----
int in_target_cleanup_p;
/* A list of the functions current function has inlined. */
varray_type inlined_fns;
! /* The approximate number of instructions we have inlined in the
current call stack. */
! int inlined_insns;
/* We use the same mechanism to build clones that we do to perform
inlining. However, there are a few places where we need to
distinguish between those two situations. This flag is true if
*************** static tree find_alloca_call PARAMS ((tr
*** 131,141 ****
static tree find_builtin_longjmp_call_1 PARAMS ((tree *, int *, void *));
static tree find_builtin_longjmp_call PARAMS ((tree));
- /* The approximate number of instructions per statement. This number
- need not be particularly accurate; it is used only to make
- decisions about when a function is too big to inline. */
- #define INSNS_PER_STMT (10)
-
/* Remap DECL during the copying of the BLOCK tree for the function. */
static tree
--- 131,136 ----
*************** inlinable_function_p (fn, id, nolimit)
*** 971,977 ****
int nolimit;
{
int inlinable;
! int currfn_insns;
int max_inline_insns_single = MAX_INLINE_INSNS_SINGLE;
/* If we've already decided this function shouldn't be inlined,
--- 966,972 ----
int nolimit;
{
int inlinable;
! int currfn_insns = 0;
int max_inline_insns_single = MAX_INLINE_INSNS_SINGLE;
/* If we've already decided this function shouldn't be inlined,
*************** inlinable_function_p (fn, id, nolimit)
*** 991,997 ****
max_inline_insns_single = MAX_INLINE_INSNS_AUTO;
/* The number of instructions (estimated) of current function. */
! currfn_insns = DECL_NUM_STMTS (fn) * INSNS_PER_STMT;
/* If we're not inlining things, then nothing is inlinable. */
if (! flag_inline_trees)
--- 986,995 ----
max_inline_insns_single = MAX_INLINE_INSNS_AUTO;
/* The number of instructions (estimated) of current function. */
! if (!nolimit && !DECL_ESTIMATED_INSNS (fn))
! DECL_ESTIMATED_INSNS (fn)
! = (*lang_hooks.tree_inlining.estimate_num_insns) (fn);
! currfn_insns = DECL_ESTIMATED_INSNS (fn);
/* If we're not inlining things, then nothing is inlinable. */
if (! flag_inline_trees)
*************** inlinable_function_p (fn, id, nolimit)
*** 1040,1047 ****
if (! (*lang_hooks.tree_inlining.disregard_inline_limits) (fn)
&& inlinable && !nolimit)
{
! int sum_insns = (id ? id->inlined_stmts : 0) * INSNS_PER_STMT
! + currfn_insns;
/* In the extreme case that we have exceeded the recursive inlining
limit by a huge factor (128), we just say no. Should not happen
in real life. */
--- 1038,1044 ----
if (! (*lang_hooks.tree_inlining.disregard_inline_limits) (fn)
&& inlinable && !nolimit)
{
! int sum_insns = (id ? id->inlined_insns : 0) + currfn_insns;
/* In the extreme case that we have exceeded the recursive inlining
limit by a huge factor (128), we just say no. Should not happen
in real life. */
*************** expand_call_inline (tp, walk_subtrees, d
*** 1429,1437 ****
TREE_USED (*tp) = 1;
/* Our function now has more statements than it did before. */
! DECL_NUM_STMTS (VARRAY_TREE (id->fns, 0)) += DECL_NUM_STMTS (fn);
/* For accounting, subtract one for the saved call/ret. */
! id->inlined_stmts += DECL_NUM_STMTS (fn) - 1;
/* Update callgraph if needed. */
if (id->decl && flag_unit_at_a_time)
--- 1426,1434 ----
TREE_USED (*tp) = 1;
/* Our function now has more statements than it did before. */
! DECL_ESTIMATED_INSNS (VARRAY_TREE (id->fns, 0)) += DECL_ESTIMATED_INSNS (fn);
/* For accounting, subtract one for the saved call/ret. */
! id->inlined_insns += DECL_ESTIMATED_INSNS (fn) - 1;
/* Update callgraph if needed. */
if (id->decl && flag_unit_at_a_time)
*************** expand_call_inline (tp, walk_subtrees, d
*** 1447,1453 ****
/* If we've returned to the top level, clear out the record of how
much inlining has been done. */
if (VARRAY_ACTIVE_SIZE (id->fns) == id->first_inlined_fn)
! id->inlined_stmts = 0;
/* Don't walk into subtrees. We've already handled them above. */
*walk_subtrees = 0;
--- 1444,1450 ----
/* If we've returned to the top level, clear out the record of how
much inlining has been done. */
if (VARRAY_ACTIVE_SIZE (id->fns) == id->first_inlined_fn)
! id->inlined_insns = 0;
/* Don't walk into subtrees. We've already handled them above. */
*walk_subtrees = 0;
*************** optimize_inline_calls (fn)
*** 1490,1495 ****
--- 1487,1495 ----
/* Don't allow recursion into FN. */
VARRAY_TREE_INIT (id.fns, 32, "fns");
VARRAY_PUSH_TREE (id.fns, fn);
+ if (!DECL_ESTIMATED_INSNS (fn))
+ DECL_ESTIMATED_INSNS (fn)
+ = (*lang_hooks.tree_inlining.estimate_num_insns) (fn);
/* Or any functions that aren't finished yet. */
prev_fn = NULL_TREE;
if (current_function_decl)