This is the mail archive of the gcc-regression@gcc.gnu.org mailing list for the GCC project.



A recent patch increased GCC's memory consumption!


Hi,

I am a friendly script caring about memory consumption in GCC.  Please
contact jh@suse.cz if something is going wrong.

Comparing memory consumption on compilation of combine.i, insn-attrtab.i,
and generate-3.4.ii I got:


comparing combine.c compilation at -O0 level:
    Overall memory needed: 24820k -> 24828k
    Peak memory use before GGC: 9653k
    Peak memory use after GGC: 8965k
    Maximum of released memory in single GGC run: 2790k
    Garbage: 42280k
    Leak: 6706k
    Overhead: 5890k
    GGC runs: 329

comparing combine.c compilation at -O1 level:
  Amount of produced GGC garbage increased from 62999k to 63156k, overall 0.25%
    Overall memory needed: 27600k -> 27612k
    Peak memory use before GGC: 9161k
    Peak memory use after GGC: 8720k
    Maximum of released memory in single GGC run: 2204k
    Garbage: 62999k -> 63156k
    Leak: 7087k
    Overhead: 7636k -> 7793k
    GGC runs: 522

comparing combine.c compilation at -O2 level:
  Amount of memory still referenced at the end of compilation increased from 7050k to 7076k, overall 0.37%
    Overall memory needed: 24824k -> 24816k
    Peak memory use before GGC: 18279k -> 18274k
    Peak memory use after GGC: 18093k
    Maximum of released memory in single GGC run: 2524k
    Garbage: 88162k -> 86279k
    Leak: 7050k -> 7076k
    Overhead: 10970k -> 10936k
    GGC runs: 483 -> 479

comparing combine.c compilation at -O3 level:
  Amount of memory still referenced at the end of compilation increased from 7123k to 7147k, overall 0.34%
    Overall memory needed: 25128k -> 25060k
    Peak memory use before GGC: 18289k -> 18276k
    Peak memory use after GGC: 18093k
    Maximum of released memory in single GGC run: 3094k -> 3099k
    Garbage: 117577k -> 114883k
    Leak: 7123k -> 7147k
    Overhead: 14555k -> 14473k
    GGC runs: 538 -> 527

comparing insn-attrtab.c compilation at -O0 level:
    Overall memory needed: 85596k
    Peak memory use before GGC: 73395k
    Peak memory use after GGC: 45365k
    Maximum of released memory in single GGC run: 37613k
    Garbage: 153738k
    Leak: 11303k
    Overhead: 19826k
    GGC runs: 268

comparing insn-attrtab.c compilation at -O1 level:
  Amount of produced GGC garbage increased from 303463k to 305021k, overall 0.51%
    Overall memory needed: 101136k
    Peak memory use before GGC: 76439k
    Peak memory use after GGC: 65594k
    Maximum of released memory in single GGC run: 37093k
    Garbage: 303463k -> 305021k
    Leak: 11605k
    Overhead: 37174k -> 38732k
    GGC runs: 381

comparing insn-attrtab.c compilation at -O2 level:
  Peak amount of GGC memory allocated before garbage collecting increased from 118406k to 121110k, overall 2.28%
    Overall memory needed: 154100k -> 154672k
    Peak memory use before GGC: 118406k -> 121110k
    Peak memory use after GGC: 93025k -> 92200k
    Maximum of released memory in single GGC run: 32950k
    Garbage: 404704k -> 403219k
    Leak: 11445k -> 11445k
    Overhead: 50573k -> 51578k
    GGC runs: 305 -> 303

comparing insn-attrtab.c compilation at -O3 level:
  Peak amount of GGC memory allocated before garbage collecting increased from 118408k to 121110k, overall 2.28%
    Overall memory needed: 154116k -> 154688k
    Peak memory use before GGC: 118408k -> 121110k
    Peak memory use after GGC: 93027k -> 92202k
    Maximum of released memory in single GGC run: 32950k
    Garbage: 405493k -> 403986k
    Leak: 11466k -> 11467k
    Overhead: 50696k -> 51700k
    GGC runs: 311 -> 309

comparing Gerald's testcase PR8361 compilation at -O0 level:
    Overall memory needed: 127328k
    Peak memory use before GGC: 103080k
    Peak memory use after GGC: 102057k
    Maximum of released memory in single GGC run: 21522k
    Garbage: 247550k
    Leak: 53798k
    Overhead: 43016k
    GGC runs: 347

comparing Gerald's testcase PR8361 compilation at -O1 level:
  Amount of produced GGC garbage increased from 660590k to 662811k, overall 0.34%
    Overall memory needed: 121436k -> 121432k
    Peak memory use before GGC: 112522k
    Peak memory use after GGC: 100700k
    Maximum of released memory in single GGC run: 20025k
    Garbage: 660590k -> 662811k
    Leak: 58903k -> 58927k
    Overhead: 84468k -> 86714k
    GGC runs: 517

comparing Gerald's testcase PR8361 compilation at -O2 level:
    Overall memory needed: 121416k -> 121424k
    Peak memory use before GGC: 112523k
    Peak memory use after GGC: 100700k
    Maximum of released memory in single GGC run: 20025k
    Garbage: 765967k -> 758446k
    Leak: 59599k -> 59614k
    Overhead: 103665k -> 104848k
    GGC runs: 608 -> 601

comparing Gerald's testcase PR8361 compilation at -O3 level:
  Amount of memory still referenced at the end of compilation increased from 60847k to 60958k, overall 0.18%
    Overall memory needed: 124364k
    Peak memory use before GGC: 115265k
    Peak memory use after GGC: 102526k
    Maximum of released memory in single GGC run: 21385k
    Garbage: 822552k -> 814648k
    Leak: 60847k -> 60958k
    Overhead: 110053k -> 111413k
    GGC runs: 609 -> 602
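
The percentages in the flagged lines above can be checked by hand. The following is a minimal sketch, not the testing script itself: the names parse_change and percent_increase are hypothetical, and the line format is assumed to match the "name: OLDk -> NEWk" entries shown in this report. How the script decides which increases to flag is not stated in this message, so no threshold is modelled here.

```python
import re

# Matches report lines such as "Garbage: 62999k -> 63156k".
# The "k" suffix is optional so that "GGC runs: 483 -> 479" also parses.
CHANGE_RE = re.compile(
    r"^\s*(?P<name>[^:]+):\s*(?P<old>\d+)k?\s*->\s*(?P<new>\d+)k?\s*$"
)


def percent_increase(old, new):
    """Relative growth from old to new, in percent."""
    return (new - old) / old * 100.0


def parse_change(line):
    """Return (name, old, new) for a 'name: OLD -> NEW' line, else None."""
    m = CHANGE_RE.match(line)
    if m is None:
        return None
    return m.group("name").strip(), int(m.group("old")), int(m.group("new"))


name, old, new = parse_change("    Garbage: 62999k -> 63156k")
print(f"{name}: {percent_increase(old, new):.2f}%")  # prints "Garbage: 0.25%"
```

Lines that report a single value (no "->") are unchanged between the two runs and return None from parse_change.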

Head of the ChangeLog is:

--- /usr/src/SpecTests/sandbox-britten-memory/x86_64/mem-result/ChangeLog	2005-06-01 23:33:47.000000000 +0000
+++ /usr/src/SpecTests/sandbox-britten-memory/gcc/gcc/ChangeLog	2005-06-02 03:06:32.000000000 +0000
@@ -1,3 +1,118 @@
+2005-06-01  Diego Novillo  <dnovillo@redhat.com>
+
+	PR 14341, PR 21332, PR 20701, PR 21029, PR 21086, PR 21090
+	PR 21289, PR 21348, PR 21367, PR 21368, PR 21458.
+	* fold-const.c (invert_tree_comparison): Make extern.
+	* tree-flow.h (enum value_range_type): Move to tree-ssa-propagate.
+	(struct value_range_def): Likewise.
+	(get_value_range): Remove.
+	(dump_value_range): Remove.
+	(dump_all_value_ranges): Remove.
+	(debug_all_value_ranges): Remove.
+	(vrp_evaluate_conditional): Declare.
+	* tree-ssa-propagate.c (struct prop_stats_d): Add field
+	num_pred_folded.
+	(substitute_and_fold): Add argument use_ranges_p.
+	Update all callers.
+	If use_ranges_p is true, call fold_predicate_in to fold
+	predicates using range information.
+	Ignore ASSERT_EXPRs.
+	Change debugging output to only show statements that have been
+	folded.
+	(replace_phi_args_in): Move debugging output code from
+	substitute_and_fold.
+	(fold_predicate_in): New local function.
+	* tree-ssa-propagate.h (enum value_range_type): Move from
+	tree-flow.h.
+	(struct value_range_d): Likewise.
+	Add field 'equiv'. 
+	(value_range_t): Rename from value_range.
+	* tree-vrp.c (found_in_subgraph): Rename from found.
+	(get_opposite_operand): Remove.
+	(struct assert_locus_d): Declare.
+	(assert_locus_t): Declare.
+	(need_assert_for): Declare.
+	(asserts_for): Declare.
+	(blocks_visited): Declare.
+	(vr_value): Declare.
+	(set_value_range): Add argument 'equiv'.
+	Don't drop to VARYING ranges that cover all values in the
+	type.
+	Make deep copy of equivalence set 'equiv'.
+	(copy_value_range): New local function.
+	(set_value_range_to_undefined): New local function.
+	(compare_values): Return -2 if either value has overflowed.
+	(range_includes_zero_p): New local function.
+	(extract_range_from_assert): Flip the predicate code if the
+	name being asserted is on the RHS of the predicate.
+	Avoid creating unnecessary symbolic ranges if the comparison
+	includes another name with a known numeric range.
+	Update the equivalence set of the new range when asserting
+	EQ_EXPR predicates.
+	(extract_range_from_ssa_name): Update the equivalence set of
+	the new range with VAR.
+	(extract_range_from_binary_expr): Also handle TRUTH_*_EXPR.
+	If -fwrapv is used, set the resulting range to VARYING if the
+	operation overflows.  Otherwise, use TYPE_MIN_VALUE and
+	TYPE_MAX_VALUE to represent -INF and +INF.
+	Fix handling of *_DIV_EXPR.
+	(extract_range_from_unary_expr): Handle MINUS_EXPR and
+	ABS_EXPR properly by switching the range around if necessary.
+	(extract_range_from_comparison): New local function.
+	(extract_range_from_expr): Call it.
+	(adjust_range_with_scev): Do not adjust the range if using
+	wrapping arithmetic (-fwrapv).
+	(dump_value_range): Also show equivalence set.
+	Show -INF and +INF for TYPE_MIN_VALUE and TYPE_MAX_VALUE.
+	(build_assert_expr_for): Also build ASSERT_EXPR for EQ_EXPR.
+	(infer_value_range): Change return value to bool.
+	Add arguments 'comp_code_p' and 'val_p'.
+	Do not attempt to infer ranges from statements that may throw.
+	Store the comparison code in comp_code_p.
+	Store the other operand to be used in the predicate in val_p.
+	(dump_asserts_for): New.
+	(debug_asserts_for): New.
+	(dump_all_asserts): New.
+	(debug_all_asserts): New.
+	(register_new_assert_for): New.
+	(register_edge_assert_for): New.
+	(find_conditional_asserts): New.
+	(find_assert_locations): New.
+	(process_assert_insertions_for): New.
+	(process_assert_insertions): New.
+	(insert_range_assertions): Initialize found_in_subgraph,
+	blocks_visited, need_assert_for and asserts_for.
+	Call find_assert_locations and process_assert_insertions.
+	(remove_range_assertions): Add more documentation.
+	(vrp_initialize): Change return type to void.
+	Do not try to guess if running VRP is worth it.  
+	(compare_name_with_value): New.
+	(compare_names): New.
+	(vrp_evaluate_conditional): Add argument 'use_equiv_p'.  If
+	use_equiv_p is true, call compare_names and
+	compare_name_with_value to compare all the ranges for every
+	name in the equivalence set of the predicate operands.
+	Update all callers.
+	(vrp_meet): Try harder not to derive a VARYING range.
+	If two values meet, the resulting equivalence set is the
+	intersection of the two equivalence sets.
+	(vrp_visit_phi_node): Call copy_value_range to get the current
+	range information of the LHS.
+	(vrp_finalize): Create a value vector representing all the
+	names that ended up with exactly one value in their range.
+	Call substitute_and_fold.
+	(execute_vrp): Document equivalence sets in ranges.
+	* tree.h (SSA_NAME_VALUE_RANGE): Remove.
+	(struct tree_ssa_name): Remove field value_range.
+	(invert_tree_comparison): Declare.
+
+2005-06-01  Daniel Berlin  <dberlin@dberlin.org>
+
+	Fix PR tree-optimization/21839
+
+	* gimplify.c (zero_sized_field_decl): New function.
+	(gimplify_init_ctor_eval): Use it.
+
 2005-06-01  Josh Conner <jconner@apple.com>
 
 	PR 21478
@@ -199,7 +314,6 @@
 	
 	* Makefile.in: Update dependencies.
 	
-
 2005-06-01  Danny Smith  <dannysmith@users.sourceforge.net>
 
 	* config/i386/cygming.h (NO_PROFILE_COUNTERS): Define.


The results can be reproduced by building a compiler with

--enable-gather-detailed-mem-stats targeting x86-64

and compiling preprocessed combine.c or the testcase from PR8361 with:

-fmem-report --param=ggc-min-heapsize=1024 --param=ggc-min-expand=1 -Ox -Q

The memory consumption summary appears in the dump after the detailed
listing of the places where memory is allocated.  Peak memory consumption
is computed by looking for the maximal value in the {GC XXXX -> YYYY} reports.
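
The peak computation described above can be sketched as follows. This is an illustration only, not the code the testing script uses: the function name peak_gc_usage is hypothetical, the sample string is made up, and the marker is assumed to look like "{GC 9653k -> 8965k}" (the "k" suffix is treated as optional).

```python
import re

# Assumed shape of the GC markers in the compiler output: "{GC 9653k -> 8965k}".
GC_RE = re.compile(r"\{GC\s+(\d+)k?\s*->\s*(\d+)k?\}")


def peak_gc_usage(log_text):
    """Return (peak_before, peak_after) over all {GC X -> Y} markers, or None."""
    pairs = [(int(before), int(after)) for before, after in GC_RE.findall(log_text)]
    if not pairs:
        return None
    peak_before = max(before for before, _ in pairs)
    peak_after = max(after for _, after in pairs)
    return peak_before, peak_after


# Made-up sample text, for illustration only.
sample = "foo {GC 9000k -> 8500k} bar {GC 9653k -> 8965k} baz {GC 7000k -> 6000k}"
print(peak_gc_usage(sample))  # prints (9653, 8965)
```

The two maxima are taken independently, so they need not come from the same collection run.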

Your testing script.

