This is the mail archive of the
gcc-regression@gcc.gnu.org
mailing list for the GCC project.
A recent patch increased GCC's memory consumption!
- From: gcctest at suse dot de
- To: jh at suse dot cz, gcc-regression at gcc dot gnu dot org
- Date: Fri, 19 Jan 2007 14:41:07 +0000
- Subject: A recent patch increased GCC's memory consumption!
Hi,
I am a friendly script caring about memory consumption in GCC. Please
contact jh@suse.cz if something is going wrong.
Comparing memory consumption on compilation of combine.i, insn-attrtab.i,
and generate-3.4.ii I got:
comparing empty function compilation at -O0 level:
Overall memory needed: 7381k
Peak memory use before GGC: 2264k
Peak memory use after GGC: 1955k
Maximum of released memory in single GGC run: 309k
Garbage: 444k
Leak: 2288k
Overhead: 455k
GGC runs: 3
comparing empty function compilation at -O0 -g level:
Overall memory needed: 7397k
Peak memory use before GGC: 2291k
Peak memory use after GGC: 1982k
Maximum of released memory in single GGC run: 309k
Garbage: 447k
Leak: 2320k
Overhead: 460k
GGC runs: 3
comparing empty function compilation at -O1 level:
Overall memory needed: 7493k
Peak memory use before GGC: 2264k
Peak memory use after GGC: 1955k
Maximum of released memory in single GGC run: 309k
Garbage: 450k
Leak: 2291k
Overhead: 456k
GGC runs: 4
comparing empty function compilation at -O2 level:
Overall memory needed: 7505k
Peak memory use before GGC: 2265k
Peak memory use after GGC: 1955k
Maximum of released memory in single GGC run: 310k
Garbage: 453k
Leak: 2291k
Overhead: 456k
GGC runs: 4
comparing empty function compilation at -O3 level:
Overall memory needed: 7505k
Peak memory use before GGC: 2265k
Peak memory use after GGC: 1955k
Maximum of released memory in single GGC run: 310k
Garbage: 453k
Leak: 2291k
Overhead: 456k
GGC runs: 4
comparing combine.c compilation at -O0 level:
Overall memory needed: 17665k
Peak memory use before GGC: 9334k
Peak memory use after GGC: 8886k
Maximum of released memory in single GGC run: 2628k
Garbage: 37266k
Leak: 6554k
Overhead: 4828k
GGC runs: 276
comparing combine.c compilation at -O0 -g level:
Overall memory needed: 19717k
Peak memory use before GGC: 10917k
Peak memory use after GGC: 10546k
Maximum of released memory in single GGC run: 2388k
Garbage: 37874k
Leak: 9414k
Overhead: 5530k
GGC runs: 272
comparing combine.c compilation at -O1 level:
Overall memory needed: 34309k -> 34517k
Peak memory use before GGC: 19923k
Peak memory use after GGC: 19725k
Maximum of released memory in single GGC run: 2262k
Garbage: 55479k -> 55020k
Leak: 6566k -> 6564k
Overhead: 9967k -> 9902k
GGC runs: 351 -> 348
comparing combine.c compilation at -O2 level:
Overall memory needed: 38181k -> 38017k
Peak memory use before GGC: 19933k -> 19934k
Peak memory use after GGC: 19735k -> 19733k
Maximum of released memory in single GGC run: 2203k
Garbage: 71204k -> 70935k
Leak: 6686k -> 6677k
Overhead: 11868k -> 11765k
GGC runs: 408 -> 406
comparing combine.c compilation at -O3 level:
Overall memory needed: 48845k -> 47957k
Peak memory use before GGC: 21063k -> 21055k
Peak memory use after GGC: 20197k -> 20196k
Maximum of released memory in single GGC run: 3167k
Garbage: 105315k -> 104299k
Leak: 6768k -> 6758k
Overhead: 16499k -> 16722k
GGC runs: 459 -> 461
comparing insn-attrtab.c compilation at -O0 level:
Overall memory needed: 105513k
Peak memory use before GGC: 71144k
Peak memory use after GGC: 45190k
Maximum of released memory in single GGC run: 37768k
Garbage: 131559k
Leak: 9580k
Overhead: 16626k
GGC runs: 208
comparing insn-attrtab.c compilation at -O0 -g level:
Overall memory needed: 106913k -> 106909k
Peak memory use before GGC: 72305k
Peak memory use after GGC: 46458k
Maximum of released memory in single GGC run: 37769k
Garbage: 132721k
Leak: 11269k
Overhead: 17020k
GGC runs: 207
comparing insn-attrtab.c compilation at -O1 level:
Overall memory needed: 124917k -> 125245k
Peak memory use before GGC: 72957k
Peak memory use after GGC: 69150k
Maximum of released memory in single GGC run: 31661k
Garbage: 229890k -> 229871k
Leak: 9397k
Overhead: 29470k -> 29467k
GGC runs: 224
comparing insn-attrtab.c compilation at -O2 level:
Overall memory needed: 190921k -> 191249k
Peak memory use before GGC: 79783k -> 79782k
Peak memory use after GGC: 74012k
Maximum of released memory in single GGC run: 30532k -> 30527k
Garbage: 281020k -> 280984k
Leak: 9394k -> 9393k
Overhead: 35776k -> 35771k
GGC runs: 246
comparing insn-attrtab.c compilation at -O3 level:
Overall memory needed: 190917k -> 191257k
Peak memory use before GGC: 79795k
Peak memory use after GGC: 74024k
Maximum of released memory in single GGC run: 30597k
Garbage: 281720k -> 281703k
Leak: 9396k -> 9395k
Overhead: 35986k -> 35981k
GGC runs: 246
comparing Gerald's testcase PR8361 compilation at -O0 level:
Overall memory needed: 152085k -> 152092k
Peak memory use before GGC: 92946k
Peak memory use after GGC: 92023k
Maximum of released memory in single GGC run: 18804k
Garbage: 208266k
Leak: 49015k
Overhead: 21200k
GGC runs: 408
comparing Gerald's testcase PR8361 compilation at -O0 -g level:
Overall memory needed: 169673k -> 169672k
Peak memory use before GGC: 105297k
Peak memory use after GGC: 104254k
Maximum of released memory in single GGC run: 18718k
Garbage: 214834k
Leak: 72440k
Overhead: 27103k
GGC runs: 382
comparing Gerald's testcase PR8361 compilation at -O1 level:
Overall memory needed: 137472k -> 139360k
Peak memory use before GGC: 98401k
Peak memory use after GGC: 97402k
Maximum of released memory in single GGC run: 17915k
Garbage: 402809k -> 401797k
Leak: 49996k -> 49997k
Overhead: 54394k -> 55054k
GGC runs: 549
comparing Gerald's testcase PR8361 compilation at -O2 level:
Overall memory needed: 139900k -> 141356k
Peak memory use before GGC: 98444k
Peak memory use after GGC: 97468k
Maximum of released memory in single GGC run: 17915k
Garbage: 461452k -> 459856k
Leak: 50821k -> 50796k
Overhead: 47451k -> 48231k
GGC runs: 600 -> 598
comparing Gerald's testcase PR8361 compilation at -O3 level:
Overall memory needed: 142212k -> 144044k
Peak memory use before GGC: 100200k -> 100195k
Peak memory use after GGC: 99152k
Maximum of released memory in single GGC run: 18261k
Garbage: 489290k -> 487472k
Leak: 51430k -> 51412k
Overhead: 49372k -> 50252k
GGC runs: 616 -> 614
comparing PR rtl-optimization/28071 testcase compilation at -O0 level:
Overall memory needed: 246452k -> 246451k
Peak memory use before GGC: 82630k
Peak memory use after GGC: 59510k
Maximum of released memory in single GGC run: 45582k
Garbage: 148155k
Leak: 8080k
Overhead: 25066k
GGC runs: 80
comparing PR rtl-optimization/28071 testcase compilation at -O0 -g level:
Overall memory needed: 247348k -> 247347k
Peak memory use before GGC: 83276k
Peak memory use after GGC: 60155k
Maximum of released memory in single GGC run: 45231k
Garbage: 148325k
Leak: 9335k
Overhead: 25561k
GGC runs: 88
comparing PR rtl-optimization/28071 testcase compilation at -O1 level:
Overall memory allocated via mmap and sbrk increased from 305313k to 315529k, overall 3.35%
Overall memory needed: 305313k -> 315529k
Peak memory use before GGC: 197776k -> 197757k
Peak memory use after GGC: 178776k -> 178756k
Maximum of released memory in single GGC run: 134229k -> 134210k
Garbage: 274126k -> 274105k
Leak: 27473k -> 27473k
Overhead: 33142k -> 33143k
GGC runs: 74
comparing PR rtl-optimization/28071 testcase compilation at -O2 level:
Overall memory needed: 301577k -> 301985k
Peak memory use before GGC: 302461k
Peak memory use after GGC: 178766k -> 178746k
Maximum of released memory in single GGC run: 241049k
Garbage: 586684k -> 586664k
Leak: 27902k -> 27902k
Overhead: 95313k -> 95314k
GGC runs: 83
comparing PR rtl-optimization/28071 testcase compilation at -O3 -fno-tree-pre -fno-tree-fre level:
Overall memory needed: 1415993k -> 1413849k
Peak memory use before GGC: 283571k -> 283520k
Peak memory use after GGC: 276589k -> 276537k
Maximum of released memory in single GGC run: 138367k -> 138347k
Garbage: 451306k -> 451286k
Leak: 48594k -> 48594k
Overhead: 56724k -> 56724k
GGC runs: 73
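
For readers who want to recheck the figures above, here is a minimal sketch of how one metric line could be parsed and its relative change recomputed. It assumes every metric line has the shape "Name: OLDk -> NEWk", or "Name: VALUEk" when nothing changed. The names parse_metric and percent_change are hypothetical and are not taken from the testing script itself.

```python
import re

# Assumed line shape: "<metric>: <old>k" or "<metric>: <old>k -> <new>k".
# The trailing "k" is optional so that "GGC runs: 351 -> 348" also parses.
METRIC_RE = re.compile(
    r"^\s*(?P<name>[^:]+):\s*(?P<old>\d+)k?(?:\s*->\s*(?P<new>\d+)k?)?\s*$"
)


def parse_metric(line):
    """Return (name, old, new) for a metric line, or None for any other line."""
    m = METRIC_RE.match(line)
    if m is None:
        return None
    old = int(m.group("old"))
    # A line without "->" means the value did not change between runs.
    new = int(m.group("new")) if m.group("new") else old
    return m.group("name").strip(), old, new


def percent_change(old, new):
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100.0
```

Applied to the -O1 line of the PR rtl-optimization/28071 testcase, percent_change(305313, 315529) rounds to 3.35, which agrees with the percentage quoted in that block. Section headers such as "comparing ... level:" are returned as None because nothing numeric follows the colon.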
Head of the ChangeLog is:
--- /usr/src/SpecTests/sandbox-britten-memory/x86_64/mem-result/ChangeLog 2007-01-18 20:43:53.000000000 +0000
+++ /usr/src/SpecTests/sandbox-britten-memory/gcc/gcc/ChangeLog 2007-01-19 12:36:41.000000000 +0000
@@ -1,3 +1,154 @@
+2007-01-19 Dirk Mueller <dmueller@suse.de>
+
+ * tree-ssa-alias.c (perform_var_substitution): Fix typo
+ in dump_flags test.
+
+2007-01-19 Richard Guenther <rguenther@suse.de>
+
+ * builtins.c (expand_builtin_cexpi): Fall back to expanding
+ via cexp in case sincos is not available.
+
+2007-01-19 Richard Guenther <rguenther@suse.de>
+
+ * doc/tm.texi (TARGET_HAS_SINCOS): Document new target macro.
+ * defaults.h (TARGET_HAS_SINCOS): Default to off.
+ * config/linux.h (TARGET_HAS_SINCOS): Set to on if we have glibc.
+ * config/alpha/linux.h (TARGET_HAS_SINCOS): Likewise.
+ * config/sparc/linux.h (TARGET_HAS_SINCOS): Likewise.
+ * config/sparc/linux64.h (TARGET_HAS_SINCOS): Likewise.
+ * config/rs6000/linux.h (TARGET_HAS_SINCOS): Likewise.
+ * config/rs6000/linux64.h (TARGET_HAS_SINCOS): Likewise.
+
+2007-01-19 Uros Bizjak <ubizjak@gmail.com>
+
+ * config/i386/i386.md (*fpatanxf3_i387, fpatan_extend<mode>xf3_i387):
+ New insn patterns.
+ (atan2sf3_1, atan2df3_1, atan2xf3_1): Remove insn patterns.
+ (atan2xf3): Directly generate RTL pattern.
+ (atan2<mode>3): Rename from atan2sf3 and atan2df3 and macroize insn
+ patterns using X87MODEF12 mode macro. Use fpatan_extend<mode>xf3_i387
+ and truncate result to requested mode. Use SSE_FLOAT_MODE_P to
+ disable patterns for SSE math.
+ (atan<mode>2): Rename from atansf2 and atandf2 and macroize insn
+ patterns using X87MODEF12 mode macro. Use fpatan_extend<mode>xf3_i387
+ and truncate result to requested mode. Use SSE_FLOAT_MODE_P to
+ disable patterns for SSE math.
+
+2007-01-19 Alexandre Oliva <aoliva@redhat.com>
+
+ * libgcc-std.ver: Fix typo in %inherit for GCC_4.3.0.
+
+2007-01-18 Roger Sayle <roger@eyesopen.com>
+
+ * fold-const.c (fold_unary) <VIEW_CONVERT_EXPR>: Optimize away a
+ VIEW_CONVERT_EXPR to the same type as it's operand.
+
+2007-01-18 David Edelsohn <edelsohn@gnu.org>
+
+ * config/rs6000/darwin-ldouble.c: Only build _SOFT_FLOAT if
+ configured for long double 128.
+
+2007-01-18 Mike Stump <mrs@apple.com>
+
+ * config/rs6000/rs6000.c (rs6000_emit_vector_compare): Fix build
+ error.
+
+2007-01-18 Michael Meissner <michael.meissner@amd.com>
+
+ * i386.c (ix86_compute_frame_layout): Make fprintf's in #if 0 code
+ type correct.
+
+2007-01-18 Jan Hubicka <jh@suse.cz>
+
+ * tree-ssa-operands.c (vop_free_bucket_size): Never return value
+ greater than NUM_VOP_FREE_BUCKETS.
+
+2007-01-18 Daniel Berlin <dberlin@dberlin.org>
+
+ * tree-ssa-structalias.c: Update comments.
+ (ptabitmap_obstack): Removed.
+ (pta_obstack): New.
+ (oldpta_obstack): Ditto.
+ (stats): Add a few members.
+ (struct variable_info): Remove node, complex, address_taken, and
+ indirect_target members. Add oldsolution member.
+ (new_var_info): Do not initialize removed members.
+ (constraint_expr_type): Remove INCLUDES.
+ (constraint_graph): Add size, implicit_preds, rep,
+ indirect_cycles, eq_rep, label, direct_nodes, and complex members.
+ (FIRST_REF_NODE): New macro.
+ (LAST_REF_NODE): Ditto.
+ (FIRST_ADDR_NODE): Ditto.
+ (find): New function.
+ (unite): Ditto.
+ (dump_constraint): Do not handle INCLUDES.
+ (insert_into_complex): Do not insert duplicate constraints.
+ (condense_varmap_nodes): Renamed and rewritten into ...
+ (merge_node_constraints): This. Also fix bug in handling of
+ offseted copy constraints.
+ (clear_edges_for_node): No longer need to deal with preds at all,
+ or removing associated preds/succs.
+ (merge_graph_nodes): Deal with indirect_cycles.
+ Don't deal with predecessors.
+ (add_implicit_graph_edge): New function.
+ (add_pred_graph_edge): Ditto.
+ (add_graph_edge): Don't deal with predecessors.
+ (build_constraint_graph): Removed.
+ (build_pred_graph): New function.
+ (build_succ_graph): Ditto.
+ (struct scc_info): Removed in_component. Added roots, dfs, and
+ node_mapping. Remove visited_index, unification_queue.
+ (scc_visit): Deal with union-find we do now.
+ Deal with cycles with REF nodes.
+ (collapse_nodes): Renamed and rewritten to ...
+ (unify_nodes): This.
+ (process_unification_queue): Removed.
+ (topo_visit): Cleanup
+ (do_da_constraint): Use find.
+ (do_sd_constraint): Ditto.
+ (do_ds_constraint): Ditto.
+ (do_complex_constraint): Ditto.
+ (init_scc_info): Update for removed and added members.
+ (find_and_collapse_graph_cycles): Renamed and rewritten into ...
+ (find_indirect_cycles): This.
+ (equivalence_class): New variable.
+ (label_visit): New function.
+ (perform_variable_substitution): Rewritten.
+ (free_var_substitution_info): New function.
+ (find_equivalent_node): Ditto.
+ (move_complex_constraints): Ditto.
+ (eliminate_indirect_cycles): Ditto.
+ (solve_graph): Only propagate changed bits.
+ Use indirect cycle elimination.
+ Use find.
+ (tree_id_t): Rename to tree_vi_t, delete id member, add vi member.
+ (tree_id_eq): Renamed to ...
+ (tree_vi_eq): This. Update for member change
+ (insert_id_for_tree): Renamed and rewritten to ...
+ (insert_vi_for_tree): This.
+ (lookup_id_for_tree): Renamed and rewritten to ...
+ (lookup_vi_for_tree): This.
+ (get_id_for_tree): Renamed and rewritten to ...
+ (get_vi_for_tree): Ditto.
+ (get_constraint_exp_from_ssa_var): Update to use get_vi_for_tree.
+ (process_constraint): Don't handle INCLUDES.
+ Remove special ADDRESSOF case.
+ (find_func_aliases): Rewrite to use vi functions instead of id
+ ones.
+ (create_function_info_for): Ditto.
+ (create_variable_info_for): Ditto.
+ (intra_create_variable_infos): Ditto.
+ (merge_smts_into): Ditto.
+ (find_what_p_points_to): Ditto.
+ (init_base_vars): Ditto.
+ (init_alias_vars): Ditto.
+ (remove_preds_and_fake_succs): New function.
+ (dump_sa_points_to_info): Dump new stats.
+ (dump_solution_for_var): Use find.
+ (set_used_smts): Fix formatting.
+ (compute_points_to_sets): Updated for new functions.
+ (ipa_pta_execute): Ditto.
+
2007-01-18 Kazu Hirata <kazu@codesourcery.com>
Richard Sandiford <richard@codesourcery.com>
The results can be reproduced by building a compiler with
--enable-gather-detailed-mem-stats targeting x86-64
and compiling preprocessed combine.c or the testcase from PR8361 with:
-fmem-report --param=ggc-min-heapsize=1024 --param=ggc-min-expand=1 -Ox -Q
The memory consumption summary appears in the dump after the detailed listing
of the places where memory is allocated. Peak memory consumption is
computed by taking the maximal value found in the {GC XXXX -> YYYY} reports.
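
The peak computation described above can be sketched as follows. This is an illustration under stated assumptions, not the script's actual code: it assumes the -Q dump contains traces of the form "{GC XXXXk -> YYYYk}", that XXXX is the usage before a collection and YYYY the usage after it, and that the report fields map onto these values as named in the dictionary keys. The function name peak_memory is hypothetical.

```python
import re

# Assumed trace shape, following the "{GC XXXX -> YYYY}" notation above.
GC_RE = re.compile(r"\{GC\s+(\d+)k?\s*->\s*(\d+)k?\}")


def peak_memory(dump_text):
    """Summarize all GC traces in a dump; return None if there are none."""
    pairs = [(int(before), int(after)) for before, after in GC_RE.findall(dump_text)]
    if not pairs:
        return None
    return {
        # Largest usage seen just before any collection.
        "peak_before_ggc": max(before for before, _ in pairs),
        # Largest usage seen just after any collection.
        "peak_after_ggc": max(after for _, after in pairs),
        # Largest amount freed by a single collection.
        "max_released": max(before - after for before, after in pairs),
        "ggc_runs": len(pairs),
    }
```

For example, a dump containing "{GC 2264k -> 1955k}" and "{GC 2100k -> 2000k}" (the second trace is made up for illustration) yields a peak of 2264k before GGC, 2000k after GGC, and 309k as the maximum released in a single run.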
Your testing script.