This is the mail archive of the
gcc-regression@gcc.gnu.org
mailing list for the GCC project.
A recent patch increased GCC's memory consumption!
- From: gcctest at suse dot de
- To: jh at suse dot cz, gcc-regression at gcc dot gnu dot org
- Date: Fri, 17 Nov 2006 15:12:13 +0000
- Subject: A recent patch increased GCC's memory consumption!
Hi,
I am a friendly script caring about memory consumption in GCC. Please
contact jh@suse.cz if something is going wrong.
Comparing memory consumption on compilation of combine.i, insn-attrtab.i,
and generate-3.4.ii I got:
comparing empty function compilation at -O0 level:
Overall memory needed: 18243k
Peak memory use before GGC: 2229k
Peak memory use after GGC: 1936k
Maximum of released memory in single GGC run: 293k
Garbage: 422k -> 422k
Leak: 2266k
Overhead: 445k -> 445k
GGC runs: 3
comparing empty function compilation at -O0 -g level:
Overall memory needed: 18259k
Peak memory use before GGC: 2256k
Peak memory use after GGC: 1963k
Maximum of released memory in single GGC run: 293k
Garbage: 424k -> 424k
Leak: 2298k
Overhead: 449k -> 449k
GGC runs: 3
comparing empty function compilation at -O1 level:
Overall memory needed: 18343k
Peak memory use before GGC: 2229k
Peak memory use after GGC: 1936k
Maximum of released memory in single GGC run: 293k
Garbage: 427k -> 427k
Leak: 2269k -> 2269k
Overhead: 445k -> 445k
GGC runs: 4
comparing empty function compilation at -O2 level:
Overall memory needed: 18355k
Peak memory use before GGC: 2229k
Peak memory use after GGC: 1936k
Maximum of released memory in single GGC run: 293k
Garbage: 430k -> 430k
Leak: 2269k -> 2269k
Overhead: 446k -> 446k
GGC runs: 4
comparing empty function compilation at -O3 level:
Overall memory needed: 18355k
Peak memory use before GGC: 2229k
Peak memory use after GGC: 1936k
Maximum of released memory in single GGC run: 293k
Garbage: 430k -> 430k
Leak: 2269k -> 2269k
Overhead: 446k -> 446k
GGC runs: 4
comparing combine.c compilation at -O0 level:
Overall memory needed: 28427k
Peak memory use before GGC: 9305k
Peak memory use after GGC: 8844k
Maximum of released memory in single GGC run: 2665k
Garbage: 36852k -> 36852k
Leak: 6457k -> 6456k
Overhead: 4869k -> 4868k
GGC runs: 280
comparing combine.c compilation at -O0 -g level:
Overall memory needed: 30519k
Peak memory use before GGC: 10855k
Peak memory use after GGC: 10485k
Maximum of released memory in single GGC run: 2415k
Garbage: 37429k -> 37429k
Leak: 9267k -> 9266k
Overhead: 5537k -> 5536k
GGC runs: 271
comparing combine.c compilation at -O1 level:
Overall memory needed: 40267k
Peak memory use before GGC: 17295k
Peak memory use after GGC: 17120k
Maximum of released memory in single GGC run: 2275k
Garbage: 57482k -> 57480k
Leak: 6510k
Overhead: 6227k -> 6226k
GGC runs: 357
comparing combine.c compilation at -O2 level:
Overall memory needed: 29802k
Peak memory use before GGC: 17291k
Peak memory use after GGC: 17120k
Maximum of released memory in single GGC run: 2869k
Garbage: 74952k -> 74950k
Leak: 6616k
Overhead: 8486k -> 8484k
GGC runs: 413
comparing combine.c compilation at -O3 level:
Overall memory needed: 28902k
Peak memory use before GGC: 18419k
Peak memory use after GGC: 17847k
Maximum of released memory in single GGC run: 4106k
Garbage: 112699k -> 112706k
Leak: 6684k
Overhead: 13039k -> 13037k
GGC runs: 463
comparing insn-attrtab.c compilation at -O0 level:
Overall memory needed: 88242k
Peak memory use before GGC: 69789k
Peak memory use after GGC: 44199k
Maximum of released memory in single GGC run: 36964k
Garbage: 129066k -> 129066k
Leak: 9516k -> 9515k
Overhead: 17001k -> 17000k
GGC runs: 216
comparing insn-attrtab.c compilation at -O0 -g level:
Overall memory needed: 89422k
Peak memory use before GGC: 70938k
Peak memory use after GGC: 45455k
Maximum of released memory in single GGC run: 36964k
Garbage: 130495k -> 130494k
Leak: 10947k -> 10946k
Overhead: 17380k -> 17379k
GGC runs: 212
comparing insn-attrtab.c compilation at -O1 level:
Overall memory needed: 112882k
Peak memory use before GGC: 90375k
Peak memory use after GGC: 83737k
Maximum of released memory in single GGC run: 31852k
Garbage: 277775k -> 277775k
Leak: 9357k
Overhead: 29792k -> 29791k
GGC runs: 222
comparing insn-attrtab.c compilation at -O2 level:
Overall memory allocated via mmap and sbrk increased from 119750k to 129382k, overall 8.04%
Overall memory needed: 119750k -> 129382k
Peak memory use before GGC: 92604k
Peak memory use after GGC: 84716k
Maximum of released memory in single GGC run: 30395k
Garbage: 317213k -> 317213k
Leak: 9359k
Overhead: 36376k -> 36375k
GGC runs: 245
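The percentage in the summary line of this block is the relative change between the two measurements. A minimal sketch in Python of that arithmetic, assuming the report lines keep the "Label: NNNk -> MMMk" shape seen here; the names parse_change and percent_increase are illustrative and are not taken from the testing script:

```python
import re

# Matches "Label: 119750k -> 129382k", "Leak: 9359k" and "GGC runs: 402 -> 409".
# The "k" suffix and the "-> after" part are both optional.
LINE_RE = re.compile(
    r"^(?P<label>[^:]+):\s*(?P<before>\d+)k?(?:\s*->\s*(?P<after>\d+)k?)?\s*$"
)

def parse_change(line):
    """Return (label, before, after), or None if the line is not a measurement."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    before = int(m.group("before"))
    # A line with a single value means the measurement did not change.
    after = int(m.group("after")) if m.group("after") else before
    return m.group("label"), before, after

def percent_increase(before, after):
    """Relative change from before to after, in percent."""
    return (after - before) / before * 100.0

label, before, after = parse_change("Overall memory needed: 119750k -> 129382k")
print(f"{label}: {percent_increase(before, after):.2f}%")  # Overall memory needed: 8.04%
```

Lines that carry a single value (for example "Leak: 9359k") are treated as unchanged, which matches how the report omits the arrow when both runs agree.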
comparing insn-attrtab.c compilation at -O3 level:
Overall memory needed: 129418k
Peak memory use before GGC: 92630k
Peak memory use after GGC: 84742k
Maximum of released memory in single GGC run: 30582k
Garbage: 318076k -> 318075k
Leak: 9362k
Overhead: 36612k -> 36611k
GGC runs: 249
comparing Gerald's testcase PR8361 compilation at -O0 level:
Peak amount of GGC memory allocated before garbage collecting increased from 92890k to 93308k, overall 0.45%
Peak amount of GGC memory still allocated after garbage collecting increased from 91969k to 92381k, overall 0.45%
Amount of produced GGC garbage increased from 205958k to 207828k, overall 0.91%
Overall memory needed: 119582k -> 119998k
Peak memory use before GGC: 92890k -> 93308k
Peak memory use after GGC: 91969k -> 92381k
Maximum of released memory in single GGC run: 19468k -> 20013k
Garbage: 205958k -> 207828k
Leak: 47742k -> 47725k
Overhead: 21054k -> 20983k
GGC runs: 402 -> 409
comparing Gerald's testcase PR8361 compilation at -O0 -g level:
Amount of produced GGC garbage increased from 212535k to 214409k, overall 0.88%
Overall memory needed: 132478k -> 132498k
Peak memory use before GGC: 105436k -> 105437k
Peak memory use after GGC: 104385k -> 104386k
Maximum of released memory in single GGC run: 19645k -> 19646k
Garbage: 212535k -> 214409k
Leak: 70700k -> 70684k
Overhead: 26671k -> 26599k
GGC runs: 377 -> 380
comparing Gerald's testcase PR8361 compilation at -O1 level:
Amount of produced GGC garbage increased from 444141k to 445924k, overall 0.40%
Overall memory needed: 119338k -> 119126k
Peak memory use before GGC: 97919k -> 97918k
Peak memory use after GGC: 95707k -> 95706k
Maximum of released memory in single GGC run: 18692k -> 18553k
Garbage: 444141k -> 445924k
Leak: 50075k -> 50059k
Overhead: 32953k -> 32794k
GGC runs: 552 -> 559
comparing Gerald's testcase PR8361 compilation at -O2 level:
Amount of produced GGC garbage increased from 502230k to 504007k, overall 0.35%
Overall memory needed: 119346k -> 119150k
Peak memory use before GGC: 97920k
Peak memory use after GGC: 95707k
Maximum of released memory in single GGC run: 18691k -> 18552k
Garbage: 502230k -> 504007k
Leak: 50758k -> 50741k
Overhead: 40063k -> 39904k
GGC runs: 606 -> 613
comparing Gerald's testcase PR8361 compilation at -O3 level:
Amount of produced GGC garbage increased from 524391k to 526197k, overall 0.34%
Overall memory needed: 118938k -> 118982k
Peak memory use before GGC: 97964k
Peak memory use after GGC: 96993k
Maximum of released memory in single GGC run: 18918k -> 18932k
Garbage: 524391k -> 526197k
Leak: 50333k -> 50316k
Overhead: 41060k -> 40898k
GGC runs: 621 -> 627
comparing PR rtl-optimization/28071 testcase compilation at -O0 level:
Overall memory needed: 137958k
Peak memory use before GGC: 81909k
Peak memory use after GGC: 58788k
Maximum of released memory in single GGC run: 45493k
Garbage: 147245k -> 147244k
Leak: 7536k -> 7536k
Overhead: 25304k -> 25303k
GGC runs: 82
comparing PR rtl-optimization/28071 testcase compilation at -O0 -g level:
Overall memory needed: 138134k
Peak memory use before GGC: 82542k
Peak memory use after GGC: 59422k
Maximum of released memory in single GGC run: 45558k
Garbage: 147415k -> 147415k
Leak: 9244k -> 9244k
Overhead: 25769k -> 25769k
GGC runs: 88
comparing PR rtl-optimization/28071 testcase compilation at -O1 level:
Overall memory needed: 424330k -> 424194k
Peak memory use before GGC: 205229k
Peak memory use after GGC: 201005k
Maximum of released memory in single GGC run: 101903k
Garbage: 272210k -> 272136k
Leak: 47601k -> 47601k
Overhead: 31355k -> 31281k
GGC runs: 101
comparing PR rtl-optimization/28071 testcase compilation at -O2 level:
Overall memory needed: 352150k -> 352034k
Peak memory use before GGC: 206002k
Peak memory use after GGC: 201778k
Maximum of released memory in single GGC run: 108809k -> 108808k
Garbage: 352435k -> 352361k
Leak: 48185k -> 48184k
Overhead: 47100k -> 47026k
GGC runs: 110
comparing PR rtl-optimization/28071 testcase compilation at -O3 -fno-tree-pre -fno-tree-fre level:
Overall memory needed: 781306k -> 781310k
Peak memory use before GGC: 314925k
Peak memory use after GGC: 293268k
Maximum of released memory in single GGC run: 165330k -> 165331k
Garbage: 494631k -> 494545k
Leak: 65517k -> 65517k
Overhead: 60002k -> 59917k
GGC runs: 98
Head of the ChangeLog is:
--- /usr/src/SpecTests/sandbox-britten-memory/x86_64/mem-result/ChangeLog 2006-11-16 20:57:46.000000000 +0000
+++ /usr/src/SpecTests/sandbox-britten-memory/gcc/gcc/ChangeLog 2006-11-17 13:28:20.000000000 +0000
@@ -1,3 +1,157 @@
+2006-11-17 Zdenek Dvorak <dvorakz@suse.cz>
+
+ * tree-vrp.c (execute_vrp): Do not update current_loops.
+ * loop-unswitch.c (unswitch_loop): Do not use loop_split_edge_with.
+ * doc/loop.texi: Remove documentation for cancelled functions.
+ * tree-ssa-loop-im.c (loop_commit_inserts): Removed.
+ (move_computations, determine_lsm): Use bsi_commit_edge_inserts
+ instead.
+ * cfgloopmanip.c (remove_bbs): Do not update loops explicitly.
+ (remove_path): Ensure that in delete_basic_blocks, the loops
+ are still allocated.
+ (add_loop): Work on valid loop structures.
+ (loopify): Modify call of add_loop.
+ (mfb_update_loops): Removed.
+ (create_preheader): Do not update loops explicitly.
+ (force_single_succ_latches, loop_version): Do not use
+ loop_split_edge_with.
+ (loop_split_edge_with): Removed.
+ * tree-ssa-loop-manip.c (create_iv, determine_exit_conditions):
+ Do not use bsi_insert_on_edge_immediate_loop.
+ (split_loop_exit_edge, tree_unroll_loop): Do not use
+ loop_split_edge_with.
+ (bsi_insert_on_edge_immediate_loop): Removed.
+ * tree-ssa-loop-ch.c (copy_loop_headers): Use current_loops. Do not
+ use loop_split_edge_with.
+ * cfghooks.c: Include cfgloop.h.
+ (verify_flow_info): Verify that loop_father is filled iff current_loops
+ are available.
+ (redirect_edge_and_branch_force, split_block, delete_basic_block,
+ split_edge, merge_blocks, make_forwarder_block, duplicate_block):
+ Update cfg.
+ * cfgloopanal.c (mark_irreducible_loops): Work if the function contains
+ no loops.
+ * modulo-sched.c (generate_prolog_epilog, canon_loop): Do not use
+ loop_split_edge_with.
+ (sms_schedule): Use current_loops.
+ * tree-ssa-dom.c (tree_ssa_dominator_optimize): Use current_loops.
+ * loop-init.c (loop_optimizer_init, loop_optimizer_finalize): Set
+ current_loops.
+ (rtl_loop_init, rtl_loop_done): Do not set current_loops.
+ * tree-ssa-sink.c (execute_sink_code): Use current_loops.
+ * ifcvt.c (if_convert): Ditto.
+ * predict.c (predict_loops): Do not clear current_loops.
+ (tree_estimate_probability): Use current_loops.
+ (propagate_freq): Receive head of the region to propagate instead of
+ loop.
+ (estimate_loops_at_level): Do not use shared to_visit bitmap.
+ (estimate_loops): New function. Handle case current_loops == NULL.
+ (estimate_bb_frequencies): Do not allocate tovisit. Use
+ estimate_loops.
+ * tree-ssa-loop.c (current_loops): Removed.
+ (tree_loop_optimizer_init): Do not return loops.
+ (tree_ssa_loop_init, tree_ssa_loop_done): Do not set current_loops.
+ * tree-vectorizer.c (slpeel_update_phi_nodes_for_guard1,
+ slpeel_update_phi_nodes_for_guard2, slpeel_tree_peel_loop_to_edge):
+ Do not update loops explicitly.
+ * function.h (struct function): Add x_current_loops field.
+ (current_loops): New macro.
+ * tree-if-conv.c (combine_blocks): Do not update loops explicitly.
+ * loop-unroll.c (split_edge_and_insert): New function.
+ (unroll_loop_runtime_iterations, analyze_insns_in_loop): Do not
+ use loop_split_edge_with.
+ * loop-doloop.c (add_test, doloop_modify): Ditto.
+ * tree-ssa-pre.c (init_pre, fini_pre): Do not set current_loops.
+ * cfglayout.c (copy_bbs): Do not update loops explicitly.
+ * lambda-code.c (perfect_nestify): Do not use loop_split_edge_with.
+ * tree-vect-transform.c (vect_transform_loop): Do not update loops
+ explicitly.
+ * cfgloop.c (flow_loops_cfg_dump): Do not dump dfs_order and rc_order.
+ (flow_loops_free): Do not free dfs_order and rc_order.
+ (flow_loops_find): Do not set dfs_order and rc_order in loops
+ structure. Do not call loops and flow info verification.
+ (add_bb_to_loop, remove_bb_from_loops): Check whether the block
+ already belongs to some loop.
+ * cfgloop.h (struct loops): Remove struct cfg.
+ (current_loops, loop_split_edge_with): Declaration removed.
+ (loop_optimizer_init, loop_optimizer_finalize): Declaration changed.
+ * tree-flow.h (loop_commit_inserts, bsi_insert_on_edge_immediate_loop):
+ Declaration removed.
+ * Makefile.in (cfghooks.o): Add CFGLOOP_H dependency.
+ * basic-block.h (split_edge_and_insert): Declare.
+ * tree-cfg.c (remove_bb): Do not update loops explicitly.
+
+2006-11-17 Zdenek Dvorak <dvorakz@suse.cz>
+
+ PR tree-optimization/29801
+ * tree-ssa-ccp.c (get_symbol_constant_value): New function.
+ (get_default_value): Use get_symbol_constant_value.
+ (set_lattice_value): ICE when the value of the constant is
+ changed.
+ (visit_assignment): Ignore VDEFs of read-only variables.
+
+2006-11-17 Zdenek Dvorak <dvorakz@suse.cz>
+
+ * tree-vect-transform.c (vect_create_epilog_for_reduction): Fix
+ formating.
+ (vect_generate_tmps_on_preheader, vect_update_ivs_after_vectorizer,
+ vect_gen_niters_for_prolog_loop): Fold the emited expressions.
+
+2006-11-17 Zdenek Dvorak <dvorakz@suse.cz>
+
+ * tree-ssa-alias.c (new_type_alias): Do not use offset of expr to
+ select subvars of var.
+
+2006-11-17 Jakub Jelinek <jakub@redhat.com>
+
+ PR middle-end/29584
+ * tree-ssa-forwprop.c (simplify_switch_expr): Don't
+ optimize if DEF doesn't have integral type.
+
+2006-11-16 Mike Stump <mrs@apple.com>
+
+ * config/darwin.h (LINK_COMMAND_SPEC): Don't do dwarf stuff on
+ pre-darwin9 system, unless the user asks for it directly.
+ (PREFERRED_DEBUGGING_TYPE): Likewise.
+ * config/i386/darwin.h (PREFERRED_DEBUGGING_TYPE): Likewise.
+ * config.gcc: Add suppport for darwin9.h.
+ * config/darwin9.h: Add.
+ * doc/install.texi (Specific): Clarify darwin documentation.
+
+2006-11-16 Richard Earnshaw <rearnsha@arm.com>
+
+ * arm.h (CONSTANT_ALIGNMENT): Don't over-align strings when
+ optimizing for size.
+
+2006-11-16 Mike Stump <mrs@apple.com>
+
+ * Makefile.in (targhooks.o): Add $(OPTABS_H).
+
+2006-11-16 Dirk Mueller <dmueller@suse.de>
+
+ * tree-vrp.c (get_value_range): Use XCNEW instead
+ of XNEW and memset.
+ (insert_range_assertions): Use XCNEWVEC instead
+ of XNEWVEC and memset.
+ (vrp_initialize): Same.
+ (vrp_finalize): Same.
+ * tree-ssa-ccp.c (ccp_initialize): Same.
+ * predict.c (tree_bb_level_predictions): Same.
+ * calls.c (expand_call): Same.
+ * tree-ssa-copy.c (init_copy_prop): Same.
+ (fini_copy_prop): Same.
+ * tree-ssa-alias.c (get_ptr_info): Use GGC_CNEW instead
+ of GGC_NEW and memset.
+
+2006-11-16 Eric Botcazou <ebotcazou@adacore.com>
+
+ PR middle-end/26306
+ * gimplify.c (gimplify_expr): Only force a load for references to
+ non-BLKmode volatile values.
+ * doc/implement-c.texi (Qualifiers implementation): Document the
+ interpretation of what a volatile access is.
+ * doc/extend.texi (C++ Extensions): Rework same documentation.
+
2006-11-16 Joseph Myers <joseph@codesourcery.com>
* config/rs6000/spe.md (frob_di_df_2): Handle non-offsettable
--- /usr/src/SpecTests/sandbox-britten-memory/x86_64/mem-result/ChangeLog.cp 2006-11-14 02:45:55.000000000 +0000
+++ /usr/src/SpecTests/sandbox-britten-memory/gcc/gcc/cp/ChangeLog 2006-11-17 13:28:19.000000000 +0000
@@ -1,3 +1,8 @@
+2006-11-16 Dirk Mueller <dmueller@suse.de>
+
+ * name-lookup.c (begin_scope): Use GGC_CNEW instead of
+ GGC_NEW and memset.
+
2006-11-13 Roger Sayle <roger@eyesopen.com>
* rtti.c (get_pseudo_ti_init): Ensure that the offset field of the
The results can be reproduced by building a compiler with
--enable-gather-detailed-mem-stats targeting x86-64
and compiling preprocessed combine.c or testcase from PR8361 with:
-fmem-report --param=ggc-min-heapsize=1024 --param=ggc-min-expand=1 -Ox -Q
The memory consumption summary appears in the dump after the detailed listing
of the places where memory is allocated. Peak memory consumption is
computed by looking for the maximal value in the {GC XXXX -> YYYY} reports.
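The peak computation described above can be sketched as follows. This is an illustration in Python rather than the script's actual code. It assumes each collection is logged as "{GC XXXXk -> YYYYk}" with sizes in kilobytes, that the released memory of a run is the difference of the two values, and the sample dump text is made up for the example:

```python
import re

# Assumed shape of one collection record in the -Q output; the "k" is optional.
GC_RE = re.compile(r"\{GC (\d+)k? -> (\d+)k?\}")

def summarize_gc(dump):
    """Summarize all {GC before -> after} records found in a compiler dump."""
    runs = [(int(b), int(a)) for b, a in GC_RE.findall(dump)]
    if not runs:
        return None
    return {
        "ggc_runs": len(runs),
        "peak_before": max(b for b, _ in runs),   # largest XXXX
        "peak_after": max(a for _, a in runs),    # largest YYYY
        "max_released": max(b - a for b, a in runs),
    }

# Hypothetical dump fragment, not real compiler output.
sample = "main {GC 2229k -> 1936k} foo {GC 2000k -> 1900k} bar {GC 1950k -> 1800k}"
print(summarize_gc(sample))
```

Scanning with a single regular expression keeps the sketch independent of whatever other text -Q prints between the records.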
Your testing script.