This is the mail archive of the gcc-regression@gcc.gnu.org mailing list for the GCC project.



A recent patch increased GCC's memory consumption!


Hi,

I am a friendly script caring about memory consumption in GCC.  Please
contact jh@suse.cz if something is going wrong.

Comparing memory consumption on compilation of combine.i, insn-attrtab.i,
and generate-3.4.ii I got:


comparing empty function compilation at -O0 level:
    Overall memory needed: 18243k
    Peak memory use before GGC: 2229k
    Peak memory use after GGC: 1936k
    Maximum of released memory in single GGC run: 293k
    Garbage: 422k -> 422k
    Leak: 2266k
    Overhead: 445k -> 445k
    GGC runs: 3

comparing empty function compilation at -O0 -g level:
    Overall memory needed: 18259k
    Peak memory use before GGC: 2256k
    Peak memory use after GGC: 1963k
    Maximum of released memory in single GGC run: 293k
    Garbage: 424k -> 424k
    Leak: 2298k
    Overhead: 449k -> 449k
    GGC runs: 3

comparing empty function compilation at -O1 level:
    Overall memory needed: 18343k
    Peak memory use before GGC: 2229k
    Peak memory use after GGC: 1936k
    Maximum of released memory in single GGC run: 293k
    Garbage: 427k -> 427k
    Leak: 2269k -> 2269k
    Overhead: 445k -> 445k
    GGC runs: 4

comparing empty function compilation at -O2 level:
    Overall memory needed: 18355k
    Peak memory use before GGC: 2229k
    Peak memory use after GGC: 1936k
    Maximum of released memory in single GGC run: 293k
    Garbage: 430k -> 430k
    Leak: 2269k -> 2269k
    Overhead: 446k -> 446k
    GGC runs: 4

comparing empty function compilation at -O3 level:
    Overall memory needed: 18355k
    Peak memory use before GGC: 2229k
    Peak memory use after GGC: 1936k
    Maximum of released memory in single GGC run: 293k
    Garbage: 430k -> 430k
    Leak: 2269k -> 2269k
    Overhead: 446k -> 446k
    GGC runs: 4

comparing combine.c compilation at -O0 level:
    Overall memory needed: 28427k
    Peak memory use before GGC: 9305k
    Peak memory use after GGC: 8844k
    Maximum of released memory in single GGC run: 2665k
    Garbage: 36852k -> 36852k
    Leak: 6457k -> 6456k
    Overhead: 4869k -> 4868k
    GGC runs: 280

comparing combine.c compilation at -O0 -g level:
    Overall memory needed: 30519k
    Peak memory use before GGC: 10855k
    Peak memory use after GGC: 10485k
    Maximum of released memory in single GGC run: 2415k
    Garbage: 37429k -> 37429k
    Leak: 9267k -> 9266k
    Overhead: 5537k -> 5536k
    GGC runs: 271

comparing combine.c compilation at -O1 level:
    Overall memory needed: 40267k
    Peak memory use before GGC: 17295k
    Peak memory use after GGC: 17120k
    Maximum of released memory in single GGC run: 2275k
    Garbage: 57482k -> 57480k
    Leak: 6510k
    Overhead: 6227k -> 6226k
    GGC runs: 357

comparing combine.c compilation at -O2 level:
    Overall memory needed: 29802k
    Peak memory use before GGC: 17291k
    Peak memory use after GGC: 17120k
    Maximum of released memory in single GGC run: 2869k
    Garbage: 74952k -> 74950k
    Leak: 6616k
    Overhead: 8486k -> 8484k
    GGC runs: 413

comparing combine.c compilation at -O3 level:
    Overall memory needed: 28902k
    Peak memory use before GGC: 18419k
    Peak memory use after GGC: 17847k
    Maximum of released memory in single GGC run: 4106k
    Garbage: 112699k -> 112706k
    Leak: 6684k
    Overhead: 13039k -> 13037k
    GGC runs: 463

comparing insn-attrtab.c compilation at -O0 level:
    Overall memory needed: 88242k
    Peak memory use before GGC: 69789k
    Peak memory use after GGC: 44199k
    Maximum of released memory in single GGC run: 36964k
    Garbage: 129066k -> 129066k
    Leak: 9516k -> 9515k
    Overhead: 17001k -> 17000k
    GGC runs: 216

comparing insn-attrtab.c compilation at -O0 -g level:
    Overall memory needed: 89422k
    Peak memory use before GGC: 70938k
    Peak memory use after GGC: 45455k
    Maximum of released memory in single GGC run: 36964k
    Garbage: 130495k -> 130494k
    Leak: 10947k -> 10946k
    Overhead: 17380k -> 17379k
    GGC runs: 212

comparing insn-attrtab.c compilation at -O1 level:
    Overall memory needed: 112882k
    Peak memory use before GGC: 90375k
    Peak memory use after GGC: 83737k
    Maximum of released memory in single GGC run: 31852k
    Garbage: 277775k -> 277775k
    Leak: 9357k
    Overhead: 29792k -> 29791k
    GGC runs: 222

comparing insn-attrtab.c compilation at -O2 level:
  Overall memory allocated via mmap and sbrk increased from 119750k to 129382k, overall 8.04%
    Overall memory needed: 119750k -> 129382k
    Peak memory use before GGC: 92604k
    Peak memory use after GGC: 84716k
    Maximum of released memory in single GGC run: 30395k
    Garbage: 317213k -> 317213k
    Leak: 9359k
    Overhead: 36376k -> 36375k
    GGC runs: 245

comparing insn-attrtab.c compilation at -O3 level:
    Overall memory needed: 129418k
    Peak memory use before GGC: 92630k
    Peak memory use after GGC: 84742k
    Maximum of released memory in single GGC run: 30582k
    Garbage: 318076k -> 318075k
    Leak: 9362k
    Overhead: 36612k -> 36611k
    GGC runs: 249

comparing Gerald's testcase PR8361 compilation at -O0 level:
  Peak amount of GGC memory allocated before garbage collecting increased from 92890k to 93308k, overall 0.45%
  Peak amount of GGC memory still allocated after garbage collecting increased from 91969k to 92381k, overall 0.45%
  Amount of produced GGC garbage increased from 205958k to 207828k, overall 0.91%
    Overall memory needed: 119582k -> 119998k
    Peak memory use before GGC: 92890k -> 93308k
    Peak memory use after GGC: 91969k -> 92381k
    Maximum of released memory in single GGC run: 19468k -> 20013k
    Garbage: 205958k -> 207828k
    Leak: 47742k -> 47725k
    Overhead: 21054k -> 20983k
    GGC runs: 402 -> 409

comparing Gerald's testcase PR8361 compilation at -O0 -g level:
  Amount of produced GGC garbage increased from 212535k to 214409k, overall 0.88%
    Overall memory needed: 132478k -> 132498k
    Peak memory use before GGC: 105436k -> 105437k
    Peak memory use after GGC: 104385k -> 104386k
    Maximum of released memory in single GGC run: 19645k -> 19646k
    Garbage: 212535k -> 214409k
    Leak: 70700k -> 70684k
    Overhead: 26671k -> 26599k
    GGC runs: 377 -> 380

comparing Gerald's testcase PR8361 compilation at -O1 level:
  Amount of produced GGC garbage increased from 444141k to 445924k, overall 0.40%
    Overall memory needed: 119338k -> 119126k
    Peak memory use before GGC: 97919k -> 97918k
    Peak memory use after GGC: 95707k -> 95706k
    Maximum of released memory in single GGC run: 18692k -> 18553k
    Garbage: 444141k -> 445924k
    Leak: 50075k -> 50059k
    Overhead: 32953k -> 32794k
    GGC runs: 552 -> 559

comparing Gerald's testcase PR8361 compilation at -O2 level:
  Amount of produced GGC garbage increased from 502230k to 504007k, overall 0.35%
    Overall memory needed: 119346k -> 119150k
    Peak memory use before GGC: 97920k
    Peak memory use after GGC: 95707k
    Maximum of released memory in single GGC run: 18691k -> 18552k
    Garbage: 502230k -> 504007k
    Leak: 50758k -> 50741k
    Overhead: 40063k -> 39904k
    GGC runs: 606 -> 613

comparing Gerald's testcase PR8361 compilation at -O3 level:
  Amount of produced GGC garbage increased from 524391k to 526197k, overall 0.34%
    Overall memory needed: 118938k -> 118982k
    Peak memory use before GGC: 97964k
    Peak memory use after GGC: 96993k
    Maximum of released memory in single GGC run: 18918k -> 18932k
    Garbage: 524391k -> 526197k
    Leak: 50333k -> 50316k
    Overhead: 41060k -> 40898k
    GGC runs: 621 -> 627

comparing PR rtl-optimization/28071 testcase compilation at -O0 level:
    Overall memory needed: 137958k
    Peak memory use before GGC: 81909k
    Peak memory use after GGC: 58788k
    Maximum of released memory in single GGC run: 45493k
    Garbage: 147245k -> 147244k
    Leak: 7536k -> 7536k
    Overhead: 25304k -> 25303k
    GGC runs: 82

comparing PR rtl-optimization/28071 testcase compilation at -O0 -g level:
    Overall memory needed: 138134k
    Peak memory use before GGC: 82542k
    Peak memory use after GGC: 59422k
    Maximum of released memory in single GGC run: 45558k
    Garbage: 147415k -> 147415k
    Leak: 9244k -> 9244k
    Overhead: 25769k -> 25769k
    GGC runs: 88

comparing PR rtl-optimization/28071 testcase compilation at -O1 level:
    Overall memory needed: 424330k -> 424194k
    Peak memory use before GGC: 205229k
    Peak memory use after GGC: 201005k
    Maximum of released memory in single GGC run: 101903k
    Garbage: 272210k -> 272136k
    Leak: 47601k -> 47601k
    Overhead: 31355k -> 31281k
    GGC runs: 101

comparing PR rtl-optimization/28071 testcase compilation at -O2 level:
    Overall memory needed: 352150k -> 352034k
    Peak memory use before GGC: 206002k
    Peak memory use after GGC: 201778k
    Maximum of released memory in single GGC run: 108809k -> 108808k
    Garbage: 352435k -> 352361k
    Leak: 48185k -> 48184k
    Overhead: 47100k -> 47026k
    GGC runs: 110

comparing PR rtl-optimization/28071 testcase compilation at -O3 -fno-tree-pre -fno-tree-fre level:
    Overall memory needed: 781306k -> 781310k
    Peak memory use before GGC: 314925k
    Peak memory use after GGC: 293268k
    Maximum of released memory in single GGC run: 165330k -> 165331k
    Garbage: 494631k -> 494545k
    Leak: 65517k -> 65517k
    Overhead: 60002k -> 59917k
    GGC runs: 98
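
The "overall N%" figures in the flagged lines above appear to be the
plain relative increase of the new value over the old one.  A minimal
sketch that rechecks them, assuming that definition (the function name
percent_increase is hypothetical and not part of the testing script):

```python
def percent_increase(old_kb: int, new_kb: int) -> float:
    """Relative growth from old_kb to new_kb, in percent."""
    return (new_kb - old_kb) / old_kb * 100.0


# Old/new pairs copied from the flagged lines of this report.
flagged = {
    "insn-attrtab.c -O2, overall memory": (119750, 129382),
    "PR8361 -O0, peak before GGC": (92890, 93308),
    "PR8361 -O0, peak after GGC": (91969, 92381),
    "PR8361 -O0, garbage": (205958, 207828),
}

for label, (old, new) in flagged.items():
    # Two decimal places, matching the precision used in the report.
    print(f"{label}: {old}k -> {new}k, {percent_increase(old, new):.2f}%")
```

Which increases the script chooses to flag is not stated in this
report, so the sketch only recomputes the percentages.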

Head of the ChangeLog is:

--- /usr/src/SpecTests/sandbox-britten-memory/x86_64/mem-result/ChangeLog	2006-11-16 20:57:46.000000000 +0000
+++ /usr/src/SpecTests/sandbox-britten-memory/gcc/gcc/ChangeLog	2006-11-17 13:28:20.000000000 +0000
@@ -1,3 +1,157 @@
+2006-11-17  Zdenek Dvorak <dvorakz@suse.cz>
+
+	* tree-vrp.c (execute_vrp): Do not update current_loops.
+	* loop-unswitch.c (unswitch_loop): Do not use loop_split_edge_with.
+	* doc/loop.texi: Remove documentation for cancelled functions.
+	* tree-ssa-loop-im.c (loop_commit_inserts): Removed.
+	(move_computations, determine_lsm): Use bsi_commit_edge_inserts
+	instead.
+	* cfgloopmanip.c (remove_bbs): Do not update loops explicitly.
+	(remove_path): Ensure that in delete_basic_blocks, the loops
+	are still allocated.
+	(add_loop): Work on valid loop structures.
+	(loopify): Modify call of add_loop.
+	(mfb_update_loops): Removed.
+	(create_preheader): Do not update loops explicitly.
+	(force_single_succ_latches, loop_version): Do not use
+	loop_split_edge_with.
+	(loop_split_edge_with): Removed.
+	* tree-ssa-loop-manip.c (create_iv, determine_exit_conditions):
+	Do not use bsi_insert_on_edge_immediate_loop.
+	(split_loop_exit_edge, tree_unroll_loop): Do not use
+	loop_split_edge_with.
+	(bsi_insert_on_edge_immediate_loop): Removed.
+	* tree-ssa-loop-ch.c (copy_loop_headers): Use current_loops.  Do not
+	use loop_split_edge_with.
+	* cfghooks.c: Include cfgloop.h.
+	(verify_flow_info): Verify that loop_father is filled iff current_loops
+	are available.
+	(redirect_edge_and_branch_force, split_block, delete_basic_block,
+	split_edge, merge_blocks, make_forwarder_block, duplicate_block):
+	Update cfg.
+	* cfgloopanal.c (mark_irreducible_loops): Work if the function contains
+	no loops.
+	* modulo-sched.c (generate_prolog_epilog, canon_loop): Do not use
+	loop_split_edge_with.
+	(sms_schedule): Use current_loops.
+	* tree-ssa-dom.c (tree_ssa_dominator_optimize): Use current_loops.
+	* loop-init.c (loop_optimizer_init, loop_optimizer_finalize): Set
+	current_loops.
+	(rtl_loop_init, rtl_loop_done): Do not set current_loops.
+	* tree-ssa-sink.c (execute_sink_code): Use current_loops.
+	* ifcvt.c (if_convert): Ditto.
+	* predict.c (predict_loops): Do not clear current_loops.
+	(tree_estimate_probability): Use current_loops.
+	(propagate_freq): Receive head of the region to propagate instead of
+	loop.
+	(estimate_loops_at_level): Do not use shared to_visit bitmap.
+	(estimate_loops): New function.  Handle case current_loops == NULL.
+	(estimate_bb_frequencies): Do not allocate tovisit.  Use
+	estimate_loops.
+	* tree-ssa-loop.c (current_loops): Removed.
+	(tree_loop_optimizer_init): Do not return loops.
+	(tree_ssa_loop_init, tree_ssa_loop_done): Do not set current_loops.
+	* tree-vectorizer.c (slpeel_update_phi_nodes_for_guard1,
+	slpeel_update_phi_nodes_for_guard2, slpeel_tree_peel_loop_to_edge):
+	Do not update loops explicitly.
+	* function.h (struct function): Add x_current_loops field.
+	(current_loops): New macro.
+	* tree-if-conv.c (combine_blocks): Do not update loops explicitly.
+	* loop-unroll.c (split_edge_and_insert): New function.
+	(unroll_loop_runtime_iterations, analyze_insns_in_loop): Do not
+	use loop_split_edge_with.
+	* loop-doloop.c (add_test, doloop_modify): Ditto.
+	* tree-ssa-pre.c (init_pre, fini_pre): Do not set current_loops.
+	* cfglayout.c (copy_bbs): Do not update loops explicitly.
+	* lambda-code.c (perfect_nestify): Do not use loop_split_edge_with.
+	* tree-vect-transform.c (vect_transform_loop): Do not update loops
+	explicitly.
+	* cfgloop.c (flow_loops_cfg_dump): Do not dump dfs_order and rc_order.
+	(flow_loops_free): Do not free dfs_order and rc_order.
+	(flow_loops_find): Do not set dfs_order and rc_order in loops
+	structure.  Do not call loops and flow info verification.
+	(add_bb_to_loop, remove_bb_from_loops): Check whether the block
+	already belongs to some loop.
+	* cfgloop.h (struct loops): Remove struct cfg.
+	(current_loops, loop_split_edge_with): Declaration removed.
+	(loop_optimizer_init, loop_optimizer_finalize): Declaration changed.
+	* tree-flow.h (loop_commit_inserts, bsi_insert_on_edge_immediate_loop):
+	Declaration removed.
+	* Makefile.in (cfghooks.o): Add CFGLOOP_H dependency.
+	* basic-block.h (split_edge_and_insert): Declare.
+	* tree-cfg.c (remove_bb): Do not update loops explicitly.
+
+2006-11-17  Zdenek Dvorak <dvorakz@suse.cz>
+
+	PR tree-optimization/29801
+	* tree-ssa-ccp.c (get_symbol_constant_value): New function.
+	(get_default_value): Use get_symbol_constant_value.
+	(set_lattice_value): ICE when the value of the constant is
+	changed.
+	(visit_assignment): Ignore VDEFs of read-only variables.
+
+2006-11-17  Zdenek Dvorak <dvorakz@suse.cz>
+
+	* tree-vect-transform.c (vect_create_epilog_for_reduction): Fix
+	formatting.
+	(vect_generate_tmps_on_preheader, vect_update_ivs_after_vectorizer,
+	vect_gen_niters_for_prolog_loop): Fold the emitted expressions.
+
+2006-11-17  Zdenek Dvorak <dvorakz@suse.cz>
+
+	* tree-ssa-alias.c (new_type_alias): Do not use offset of expr to
+	select subvars of var.
+
+2006-11-17  Jakub Jelinek  <jakub@redhat.com>
+
+	PR middle-end/29584
+	* tree-ssa-forwprop.c (simplify_switch_expr): Don't
+	optimize if DEF doesn't have integral type.
+
+2006-11-16  Mike Stump  <mrs@apple.com>
+
+	* config/darwin.h (LINK_COMMAND_SPEC): Don't do dwarf stuff on
+	pre-darwin9 system, unless the user asks for it directly.
+	(PREFERRED_DEBUGGING_TYPE): Likewise.
+	* config/i386/darwin.h (PREFERRED_DEBUGGING_TYPE): Likewise.
+	* config.gcc: Add support for darwin9.h.
+	* config/darwin9.h: Add.
+	* doc/install.texi (Specific): Clarify darwin documentation.
+	
+2006-11-16  Richard Earnshaw  <rearnsha@arm.com>
+
+	* arm.h (CONSTANT_ALIGNMENT): Don't over-align strings when
+	optimizing for size.
+
+2006-11-16  Mike Stump  <mrs@apple.com>
+
+	* Makefile.in (targhooks.o): Add $(OPTABS_H).
+
+2006-11-16  Dirk Mueller  <dmueller@suse.de>
+
+	* tree-vrp.c (get_value_range): Use XCNEW instead
+	of XNEW and memset.
+	(insert_range_assertions): Use XCNEWVEC instead
+	of XNEWVEC and memset.
+	(vrp_initialize): Same.
+	(vrp_finalize): Same.
+	* tree-ssa-ccp.c (ccp_initialize): Same.
+	* predict.c (tree_bb_level_predictions): Same.
+	* calls.c (expand_call): Same.
+	* tree-ssa-copy.c (init_copy_prop): Same.
+	(fini_copy_prop): Same.
+	* tree-ssa-alias.c (get_ptr_info): Use GGC_CNEW instead
+	of GGC_NEW and memset.
+
+2006-11-16  Eric Botcazou  <ebotcazou@adacore.com>
+
+	PR middle-end/26306
+	* gimplify.c (gimplify_expr): Only force a load for references to
+	non-BLKmode volatile values.
+	* doc/implement-c.texi (Qualifiers implementation): Document the
+	interpretation of what a volatile access is.
+	* doc/extend.texi (C++ Extensions): Rework same documentation.
+
 2006-11-16  Joseph Myers  <joseph@codesourcery.com>
 
 	* config/rs6000/spe.md (frob_di_df_2): Handle non-offsettable
--- /usr/src/SpecTests/sandbox-britten-memory/x86_64/mem-result/ChangeLog.cp	2006-11-14 02:45:55.000000000 +0000
+++ /usr/src/SpecTests/sandbox-britten-memory/gcc/gcc/cp/ChangeLog	2006-11-17 13:28:19.000000000 +0000
@@ -1,3 +1,8 @@
+2006-11-16  Dirk Mueller  <dmueller@suse.de>
+
+	* name-lookup.c (begin_scope): Use GGC_CNEW instead of
+	GGC_NEW and memset.
+
 2006-11-13  Roger Sayle  <roger@eyesopen.com>
 
 	* rtti.c (get_pseudo_ti_init): Ensure that the offset field of the


The results can be reproduced by building a compiler with

--enable-gather-detailed-mem-stats targeting x86-64

and compiling preprocessed combine.c or the testcase from PR8361 with:

-fmem-report --param=ggc-min-heapsize=1024 --param=ggc-min-expand=1 -Ox -Q

The memory consumption summary appears in the dump after the detailed
listing of the places where memory is allocated.  Peak memory consumption
is computed by taking the maximal value over the {GC XXXX -> YYYY} reports.
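
As an illustration of the peak computation described above, here is a
minimal sketch that scans compiler output for the collector reports.
It assumes each report looks like "{GC 2229k -> 1936k}", with the size
before collection on the left and the size after it on the right; the
function name gc_peaks and the sample text are hypothetical.

```python
import re

# Assumed shape of one collector report, e.g. "{GC 2229k -> 1936k}".
GC_REPORT = re.compile(r"\{GC (\d+)k -> (\d+)k\}")


def gc_peaks(log_text):
    """Return (peak_before_k, peak_after_k, runs), or None if no report is found."""
    pairs = [(int(before), int(after))
             for before, after in GC_REPORT.findall(log_text)]
    if not pairs:
        return None
    peak_before = max(before for before, _ in pairs)
    peak_after = max(after for _, after in pairs)
    return peak_before, peak_after, len(pairs)


# Made-up sample text; real output interleaves these reports with other lines.
sample = "main {GC 2100k -> 1800k} foo {GC 2229k -> 1936k} bar {GC 2000k -> 1700k}"
print(gc_peaks(sample))
```

The two maxima are taken independently, so the peak before collection
and the peak after collection need not come from the same GGC run.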

Your testing script.

