This is the mail archive of the gcc-regression@gcc.gnu.org mailing list for the GCC project.



A recent patch increased GCC's memory consumption in some cases!


Hi,

I am a friendly script caring about memory consumption in GCC.  Please
contact jh@suse.cz if something is going wrong.

Comparing memory consumption on compilation of combine.i, insn-attrtab.i,
and generate-3.4.ii I got:


comparing empty function compilation at -O0 level:
    Overall memory needed: 18349k
    Peak memory use before GGC: 2264k
    Peak memory use after GGC: 1955k
    Maximum of released memory in single GGC run: 309k
    Garbage: 444k
    Leak: 2288k
    Overhead: 455k
    GGC runs: 3

comparing empty function compilation at -O0 -g level:
    Overall memory needed: 18365k
    Peak memory use before GGC: 2291k
    Peak memory use after GGC: 1982k
    Maximum of released memory in single GGC run: 309k
    Garbage: 447k
    Leak: 2320k
    Overhead: 460k
    GGC runs: 3

comparing empty function compilation at -O1 level:
    Overall memory needed: 18457k -> 18461k
    Peak memory use before GGC: 2264k
    Peak memory use after GGC: 1955k
    Maximum of released memory in single GGC run: 309k
    Garbage: 450k
    Leak: 2291k
    Overhead: 456k
    GGC runs: 4

comparing empty function compilation at -O2 level:
    Overall memory needed: 18469k -> 18473k
    Peak memory use before GGC: 2265k
    Peak memory use after GGC: 1955k
    Maximum of released memory in single GGC run: 310k
    Garbage: 453k
    Leak: 2291k
    Overhead: 456k
    GGC runs: 4

comparing empty function compilation at -O3 level:
    Overall memory needed: 18469k -> 18473k
    Peak memory use before GGC: 2265k
    Peak memory use after GGC: 1955k
    Maximum of released memory in single GGC run: 310k
    Garbage: 453k
    Leak: 2291k
    Overhead: 456k
    GGC runs: 4

comparing combine.c compilation at -O0 level:
  Peak amount of GGC memory allocated before garbage collecting increased from 9312k to 9334k, overall 0.24%
  Peak amount of GGC memory still allocated after garbage collecting increased from 8864k to 8886k, overall 0.25%
  Amount of memory still referenced at the end of compilation increased from 6538k to 6554k, overall 0.24%
    Overall memory needed: 28633k
    Peak memory use before GGC: 9312k -> 9334k
    Peak memory use after GGC: 8864k -> 8886k
    Maximum of released memory in single GGC run: 2606k -> 2628k
    Garbage: 37297k -> 37266k
    Leak: 6538k -> 6554k
    Overhead: 4829k -> 4828k
    GGC runs: 276

comparing combine.c compilation at -O0 -g level:
  Peak amount of GGC memory allocated before garbage collecting increased from 10895k to 10917k, overall 0.20%
  Peak amount of GGC memory still allocated after garbage collecting increased from 10524k to 10546k, overall 0.21%
    Overall memory needed: 30681k -> 30685k
    Peak memory use before GGC: 10895k -> 10917k
    Peak memory use after GGC: 10524k -> 10546k
    Maximum of released memory in single GGC run: 2365k -> 2388k
    Garbage: 37870k -> 37874k
    Leak: 9414k
    Overhead: 5530k
    GGC runs: 271 -> 272

comparing combine.c compilation at -O1 level:
  Amount of produced GGC garbage increased from 55193k to 55507k, overall 0.57%
    Overall memory needed: 33270k -> 33390k
    Peak memory use before GGC: 19924k -> 19919k
    Peak memory use after GGC: 19725k -> 19721k
    Maximum of released memory in single GGC run: 2264k -> 2262k
    Garbage: 55193k -> 55507k
    Leak: 6566k -> 6564k
    Overhead: 9959k -> 9970k
    GGC runs: 352 -> 351

comparing combine.c compilation at -O2 level:
  Amount of produced GGC garbage increased from 70931k to 71255k, overall 0.46%
    Overall memory needed: 33274k -> 33394k
    Peak memory use before GGC: 19933k
    Peak memory use after GGC: 19735k
    Maximum of released memory in single GGC run: 2206k -> 2204k
    Garbage: 70931k -> 71255k
    Leak: 6694k -> 6686k
    Overhead: 11862k -> 11872k
    GGC runs: 410 -> 409

comparing combine.c compilation at -O3 level:
  Peak amount of GGC memory allocated before garbage collecting increased from 20881k to 21065k, overall 0.88%
  Amount of produced GGC garbage increased from 103323k to 105390k, overall 2.00%
    Overall memory needed: 32470k -> 32450k
    Peak memory use before GGC: 20881k -> 21065k
    Peak memory use after GGC: 20311k -> 20197k
    Maximum of released memory in single GGC run: 3125k -> 3167k
    Garbage: 103323k -> 105390k
    Leak: 6768k -> 6768k
    Overhead: 16334k -> 16504k
    GGC runs: 470 -> 459

comparing insn-attrtab.c compilation at -O0 level:
  Peak amount of GGC memory allocated before garbage collecting increased from 70731k to 71144k, overall 0.58%
  Peak amount of GGC memory still allocated after garbage collecting increased from 44750k to 45190k, overall 0.98%
    Overall memory needed: 89190k -> 89610k
    Peak memory use before GGC: 70731k -> 71144k
    Peak memory use after GGC: 44750k -> 45190k
    Maximum of released memory in single GGC run: 37355k -> 37768k
    Garbage: 131553k -> 131559k
    Leak: 9580k
    Overhead: 16626k -> 16626k
    GGC runs: 210 -> 208

comparing insn-attrtab.c compilation at -O0 -g level:
  Peak amount of GGC memory allocated before garbage collecting increased from 71893k to 72305k, overall 0.57%
  Peak amount of GGC memory still allocated after garbage collecting increased from 46017k to 46458k, overall 0.96%
    Overall memory needed: 90366k -> 90786k
    Peak memory use before GGC: 71893k -> 72305k
    Peak memory use after GGC: 46017k -> 46458k
    Maximum of released memory in single GGC run: 37357k -> 37769k
    Garbage: 132717k -> 132721k
    Leak: 11269k
    Overhead: 17020k
    GGC runs: 209 -> 207

comparing insn-attrtab.c compilation at -O1 level:
  Peak amount of GGC memory allocated before garbage collecting increased from 72561k to 72996k, overall 0.60%
  Peak amount of GGC memory still allocated after garbage collecting increased from 68716k to 69150k, overall 0.63%
  Amount of produced GGC garbage increased from 227951k to 230037k, overall 0.92%
    Overall memory needed: 98154k -> 98490k
    Peak memory use before GGC: 72561k -> 72996k
    Peak memory use after GGC: 68716k -> 69150k
    Maximum of released memory in single GGC run: 31302k -> 31661k
    Garbage: 227951k -> 230037k
    Leak: 9397k -> 9397k
    Overhead: 29388k -> 29471k
    GGC runs: 223 -> 224

comparing insn-attrtab.c compilation at -O2 level:
  Overall memory allocated via mmap and sbrk increased from 105158k to 110406k, overall 4.99%
  Peak amount of GGC memory allocated before garbage collecting increased from 79387k to 79821k, overall 0.55%
  Peak amount of GGC memory still allocated after garbage collecting increased from 73577k to 74012k, overall 0.59%
  Amount of produced GGC garbage increased from 279218k to 281304k, overall 0.75%
    Overall memory needed: 105158k -> 110406k
    Peak memory use before GGC: 79387k -> 79821k
    Peak memory use after GGC: 73577k -> 74012k
    Maximum of released memory in single GGC run: 29594k -> 30532k
    Garbage: 279218k -> 281304k
    Leak: 9395k -> 9394k
    Overhead: 35694k -> 35777k
    GGC runs: 245 -> 246

comparing insn-attrtab.c compilation at -O3 level:
  Peak amount of GGC memory allocated before garbage collecting increased from 79427k to 79834k, overall 0.51%
  Peak amount of GGC memory still allocated after garbage collecting increased from 73617k to 74024k, overall 0.55%
  Amount of produced GGC garbage increased from 279917k to 282004k, overall 0.75%
    Overall memory needed: 109994k -> 110406k
    Peak memory use before GGC: 79427k -> 79834k
    Peak memory use after GGC: 73617k -> 74024k
    Maximum of released memory in single GGC run: 29758k -> 30597k
    Garbage: 279917k -> 282004k
    Leak: 9399k -> 9396k
    Overhead: 35905k -> 35987k
    GGC runs: 245 -> 246

comparing Gerald's testcase PR8361 compilation at -O0 level:
    Overall memory needed: 119778k -> 119706k
    Peak memory use before GGC: 93022k -> 92946k
    Peak memory use after GGC: 92099k -> 92023k
    Maximum of released memory in single GGC run: 18912k -> 18804k
    Garbage: 208221k -> 208266k
    Leak: 49015k
    Overhead: 21199k -> 21200k
    GGC runs: 407 -> 408

comparing Gerald's testcase PR8361 compilation at -O0 -g level:
    Overall memory needed: 132278k -> 132150k
    Peak memory use before GGC: 105299k -> 105297k
    Peak memory use after GGC: 104254k
    Maximum of released memory in single GGC run: 18752k -> 18718k
    Garbage: 214808k -> 214834k
    Leak: 72442k -> 72440k
    Overhead: 27104k -> 27103k
    GGC runs: 382

comparing Gerald's testcase PR8361 compilation at -O1 level:
  Overall memory allocated via mmap and sbrk decreased from 130334k to 123354k, overall -5.66%
  Amount of produced GGC garbage increased from 395085k to 402864k, overall 1.97%
    Overall memory needed: 130334k -> 123354k
    Peak memory use before GGC: 98685k -> 98455k
    Peak memory use after GGC: 97707k -> 97460k
    Maximum of released memory in single GGC run: 17925k -> 17915k
    Garbage: 395085k -> 402864k
    Leak: 50038k -> 49997k
    Overhead: 53237k -> 54392k
    GGC runs: 572 -> 549

comparing Gerald's testcase PR8361 compilation at -O2 level:
  Overall memory allocated via mmap and sbrk decreased from 130394k to 123362k, overall -5.70%
  Amount of produced GGC garbage increased from 452204k to 461503k, overall 2.06%
    Overall memory needed: 130394k -> 123362k
    Peak memory use before GGC: 98752k -> 98507k
    Peak memory use after GGC: 97773k -> 97525k
    Maximum of released memory in single GGC run: 17924k -> 17915k
    Garbage: 452204k -> 461503k
    Leak: 50820k -> 50820k
    Overhead: 46371k -> 47445k
    GGC runs: 632 -> 599

comparing Gerald's testcase PR8361 compilation at -O3 level:
  Overall memory allocated via mmap and sbrk decreased from 132214k to 125390k, overall -5.44%
  Amount of produced GGC garbage increased from 467189k to 489579k, overall 4.79%
  Amount of memory still referenced at the end of compilation increased from 51028k to 51430k, overall 0.79%
    Overall memory needed: 132214k -> 125390k
    Peak memory use before GGC: 100342k -> 100260k
    Peak memory use after GGC: 99308k -> 99209k
    Maximum of released memory in single GGC run: 18325k -> 18261k
    Garbage: 467189k -> 489579k
    Leak: 51028k -> 51430k
    Overhead: 45791k -> 49389k
    GGC runs: 642 -> 616

comparing PR rtl-optimization/28071 testcase compilation at -O0 level:
  Peak amount of GGC memory allocated before garbage collecting increased from 81607k to 82630k, overall 1.25%
  Peak amount of GGC memory still allocated after garbage collecting increased from 58487k to 59510k, overall 1.75%
    Overall memory needed: 137646k -> 138314k
    Peak memory use before GGC: 81607k -> 82630k
    Peak memory use after GGC: 58487k -> 59510k
    Maximum of released memory in single GGC run: 44559k -> 45582k
    Garbage: 148154k -> 148155k
    Leak: 8080k
    Overhead: 25066k
    GGC runs: 81 -> 80

comparing PR rtl-optimization/28071 testcase compilation at -O0 -g level:
  Peak amount of GGC memory allocated before garbage collecting increased from 82253k to 83276k, overall 1.24%
  Peak amount of GGC memory still allocated after garbage collecting increased from 59133k to 60155k, overall 1.73%
    Overall memory needed: 138198k -> 138698k
    Peak memory use before GGC: 82253k -> 83276k
    Peak memory use after GGC: 59133k -> 60155k
    Maximum of released memory in single GGC run: 44208k -> 45231k
    Garbage: 148325k -> 148325k
    Leak: 9335k
    Overhead: 25561k
    GGC runs: 89 -> 88

comparing PR rtl-optimization/28071 testcase compilation at -O1 level:
  Overall memory allocated via mmap and sbrk decreased from 436618k to 249142k, overall -75.25%
  Peak amount of GGC memory still allocated after garbage collecting decreased from 192571k to 178775k, overall -7.72%
  Amount of memory still referenced at the end of compilation decreased from 29804k to 27473k, overall -8.49%
    Overall memory needed: 436618k -> 249142k
    Peak memory use before GGC: 202594k -> 197776k
    Peak memory use after GGC: 192571k -> 178775k
    Maximum of released memory in single GGC run: 137171k -> 134229k
    Garbage: 278226k -> 274126k
    Leak: 29804k -> 27473k
    Overhead: 32030k -> 33142k
    GGC runs: 92 -> 74

comparing PR rtl-optimization/28071 testcase compilation at -O2 level:
  Overall memory allocated via mmap and sbrk decreased from 365202k to 192018k, overall -90.19%
  Peak amount of GGC memory allocated before garbage collecting increased from 207227k to 302461k, overall 45.96%
  Peak amount of GGC memory still allocated after garbage collecting decreased from 192563k to 178766k, overall -7.72%
  Amount of produced GGC garbage increased from 355277k to 586684k, overall 65.13%
  Amount of memory still referenced at the end of compilation decreased from 30387k to 27902k, overall -8.91%
    Overall memory needed: 365202k -> 192018k
    Peak memory use before GGC: 207227k -> 302461k
    Peak memory use after GGC: 192563k -> 178766k
    Maximum of released memory in single GGC run: 140362k -> 241049k
    Garbage: 355277k -> 586684k
    Leak: 30387k -> 27902k
    Overhead: 47185k -> 95313k
    GGC runs: 98 -> 83

comparing PR rtl-optimization/28071 testcase compilation at -O3 -fno-tree-pre -fno-tree-fre level:
  Overall memory allocated via mmap and sbrk increased from 572546k to 586858k, overall 2.50%
  Peak amount of GGC memory allocated before garbage collecting increased from 282245k to 283571k, overall 0.47%
  Peak amount of GGC memory still allocated after garbage collecting increased from 273215k to 276589k, overall 1.23%
  Amount of produced GGC garbage increased from 448504k to 451307k, overall 0.62%
  Amount of memory still referenced at the end of compilation increased from 45440k to 48594k, overall 6.94%
    Overall memory needed: 572546k -> 586858k
    Peak memory use before GGC: 282245k -> 283571k
    Peak memory use after GGC: 273215k -> 276589k
    Maximum of released memory in single GGC run: 138326k -> 138367k
    Garbage: 448504k -> 451307k
    Leak: 45440k -> 48594k
    Overhead: 56089k -> 56724k
    GGC runs: 97 -> 72

Head of the ChangeLog is:

--- /usr/src/SpecTests/sandbox-britten-memory/x86_64/mem-result/ChangeLog	2007-01-16 20:23:49.000000000 +0000
+++ /usr/src/SpecTests/sandbox-britten-memory/gcc/gcc/ChangeLog	2007-01-18 04:56:20.000000000 +0000
@@ -1,3 +1,135 @@
+2007-01-18  Ben Elliston  <bje@au.ibm.com>
+
+	* genautomata.c (write_automata): Include xstrerror output in the
+	error message if writing the DFA description file fails.
+
+2007-01-17  H.J. Lu  <hongjiu.lu@intel.com>
+
+	* config/mips/mips-protos.h (mips_output_external): Make it
+	return void.
+	* config/mips/iris.h (TARGET_ASM_EXTERNAL_LIBCALL): Removed.
+	* config/mips/mips.c (irix_output_external_libcall): Likewise.
+	(extern_list): Likewise.
+	(extern_head): Likewise.
+	(TARGET_ASM_FILE_END): Likewise.
+	(mips_file_end): Likewise.
+	(mips_output_external): Rewritten.
+
+2007-01-18  Ben Elliston  <bje@au.ibm.com>
+
+	* genpreds.c (write_insn_preds_c): Only write out the function
+	body for regclass_for_constraint if we have register constraints.
+
+2007-01-17  Tom Tromey  <tromey@redhat.com>
+
+	* doc/sourcebuild.texi (libgcj Tests): Use sourceware.org.
+	* doc/install.texi (Testing): Use sourceware.org.
+	(Binaries): Likewise.
+	(Specific): Likewise.
+	* doc/contrib.texi (Contributors): Use sourceware.org.
+
+2007-01-17  Anatoly Sokolov <aesok@post.ru>
+
+	* config/avr/avr.h (AVR_HAVE_LPMX): New macro.
+	(AVR_ENHANCED): Rename to ...
+	(AVR_HAVE_MUL): ... new.
+	(avr_enhanced_p): Rename to ...
+	(avr_have_mul_p): ... new.
+	(TARGET_CPU_CPP_BUILTINS): Use 'avr_have_mul_p' instead of 
+	'avr_enhanced_p' for "__AVR_ENHANCED__". Define "__AVR_HAVE_MUL__".
+	* config/avr/avr.c (avr_enhanced_p): Rename to ...
+	(avr_have_mul_p): ... new.
+	(base_arch_s): Rename 'enhanced' to 'have_mul'.
+	(avr_override_options): Use 'avr_have_mul_p' and 'have_mul' instead of
+	'avr_enhanced_p' and 'enhanced'.
+	(ashlhi3_out, ashrhi3_out, lshrhi3_out, avr_rtx_costs): Use 
+	AVR_HAVE_MUL instead of AVR_ENHANCED.
+	* avr.md (*tablejump_enh): Use AVR_HAVE_LPMX instead of AVR_ENHANCED.
+	(mulqi3, *mulqi3_enh, *mulqi3_call, mulqihi3, umulqihi3, mulhi3, 
+	*mulhi3_enh, *mulhi3_call, mulsi3, *mulsi3_call): Use AVR_HAVE_MUL 
+	instead of AVR_ENHANCED.
+	(*tablejump_enh): Use AVR_HAVE_LPMX instead of AVR_ENHANCED.
+	* libgcc.S: Use __AVR_HAVE_MUL__ instead of __AVR_ENHANCED__.
+	(__tablejump__): Use __AVR_HAVE_LPMX__ instead of __AVR_ENHANCED__.
+
+2007-01-17  Ian Lance Taylor  <iant@google.com>
+
+	* vec.h (VEC_reserve_exact): Define.
+	(vec_gc_p_reserve_exact): Declare.
+	(vec_gc_o_reserve_exact): Declare.
+	(vec_heap_p_reserve_exact): Declare.
+	(vec_heap_o_reserve_exact): Declare.
+	(VEC_OP (T,A,reserve_exact)): New static inline function, three
+	versions.
+	(VEC_OP (T,A,reserve)) [all versions]: Remove handling of
+	negative parameter.
+	(VEC_OP (T,A,alloc)) [all versions]: Call ...reserve_exact.
+	(VEC_OP (T,A,copy)) [all versions]: Likewise.
+	(VEC_OP (T,a,safe_grow)) [all versions]: Likewise.
+	* vec.c (calculate_allocation): Add exact parameter.  Change all
+	callers.
+	(vec_gc_o_reserve_1): New static function, from vec_gc_o_reserve.
+	(vec_gc_p_reserve, vec_gc_o_reserve): Call vec_gc_o_reserve_1.
+	(vec_gc_p_reserve_exact, vec_gc_o_reserve_exact): New functions.
+	(vec_heap_o_reserve_1): New static function, from vec_heap_o_reserve.
+	(vec_heap_p_reserve, vec_heap_o_reserve): Call vec_heap_o_reserve_1.
+	(vec_heap_p_reserve_exact): New function.
+	(vec_heap_o_reserve_exact): New function.
+
+2007-01-17  Jan Hubicka  <jh@suse.cz>
+
+	* ipa-type-escape.c (look_for_casts): Revamp using handled_component_p.
+
+2007-01-17  Eric Christopher  <echristo@apple.com>
+
+	* config.gcc: Support core2 processor.
+
+2007-01-16  Jan Hubicka  <jh@suse.cz>
+
+	* tree-ssanames.c (release_dead_ssa_names): Instead of ggc_freeing
+	the names, just unlink the chain so we don't crash on dangling pointers
+	to dead SSA names.
+
+2007-01-16  Jan Hubicka  <jh@suse.cz>
+
+	* cgraph.h (cgraph_decide_inlining_incrementally): Kill.
+	* tree-pass.h: Reorder to make IPA passes appear toegher.
+	(pass_early_inline, pass_inline_parameters, pass_apply_inline): Declare.
+	* cgraphunit.c (cgraph_finalize_function): Do not compute inling
+	parameters, do not call early inliner.
+	* ipa-inline.c: Update comments.  Include tree-flow.h
+	(cgraph_decide_inlining): Do not compute inlining parameters.
+	(cgraph_decide_inlining_incrementally): Return TODOs; assume to
+	be called with function context set up.
+	(pass_ipa_inline): Remove unreachable functions before pass.
+	(cgraph_early_inlining): Simplify assuming to be called from the
+	PM as local pass.
+	(pass_early_inline): New pass.
+	(cgraph_gate_ipa_early_inlining): New gate.
+	(pass_ipa_early_inline): Turn into simple wrapper.
+	(compute_inline_parameters): New function.
+	(gate_inline_passes): New gate.
+	(pass_inline_parameters): New pass.
+	(apply_inline): Move here from tree-optimize.c
+	(pass_apply_inline): New pass.
+	* ipa.c (cgraph_remove_unreachable_nodes): Verify cgraph after
+	transforming.
+	* tree-inline.c (optimize_inline_calls): Return TODOs rather than
+	doing them by hand.
+	(tree_function_versioning): Do not allocate dummy struct function.
+	* tree-inline.h (optimize_inline_calls): Update prototype.
+	* tree-optimize.c (execute_fixup_cfg): Export.
+	(pass_fixup_cfg): Remove
+	(tree_rest_of_compilation): Do not apply inlines.
+	* tree-flow.h (execute_fixup_cfg): Declare.
+	* Makefile.in (gt-passes.c): New.
+	* passes.c: Include gt-passes.h
+	(init_optimization_passes): New passes.
+	(nnodes, order): New static vars.
+	(do_per_function_toporder): New function.
+	(execute_one_pass): Dump current pass here.
+	(execute_ipa_pass_list): Don't dump current pass here.
+
 2007-01-16  Janis Johnson  <janis187@us.ibm.com>
 
 	* config/dfp-bit.c (dfp_compare_op): Return separate value for NaN.
--- /usr/src/SpecTests/sandbox-britten-memory/x86_64/mem-result/ChangeLog.cp	2007-01-12 08:03:04.000000000 +0000
+++ /usr/src/SpecTests/sandbox-britten-memory/gcc/gcc/cp/ChangeLog	2007-01-18 04:56:19.000000000 +0000
@@ -1,3 +1,8 @@
+2007-01-17  Ian Lance Taylor  <iant@google.com>
+
+	* class.c (add_method): Call VEC_reserve_exact rather than passing
+	a negative size to VEC_reserve.
+
 2007-01-11  Simon Martin  <simartin@users.sourceforge.net>
 
 	PR c++/29573


The results can be reproduced by building a compiler with

--enable-gather-detailed-mem-stats targeting x86-64

and compiling preprocessed combine.c or the testcase from PR8361 with:

-fmem-report --param=ggc-min-heapsize=1024 --param=ggc-min-expand=1 -Ox -Q
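
As a minimal sketch, the command lines for the individual optimization levels
could be assembled like this.  The flags are the ones quoted above, with the
"x" in -Ox replaced by the level.  The compiler path "./cc1", the input name
"combine.i" and the helper name are assumptions for illustration only.

```python
import shlex

# Flags quoted from the report; the -O level and -Q are appended per run.
MEM_FLAGS = [
    "-fmem-report",
    "--param=ggc-min-heapsize=1024",
    "--param=ggc-min-expand=1",
]


def mem_report_command(compiler, source, opt_level):
    """Build the argument list for one memory-report run (hypothetical helper)."""
    return [compiler, *MEM_FLAGS, f"-O{opt_level}", "-Q", source]


# "./cc1" and "combine.i" are placeholders; substitute your own build and input.
for level in ("0", "1", "2", "3"):
    print(shlex.join(mem_report_command("./cc1", "combine.i", level)))
```

The resulting list can be passed to subprocess.run to capture the dump.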

The memory consumption summary appears in the dump after the detailed listing
of the places where memory is allocated.  Peak memory consumption is computed
by taking the maximal value over the {GC XXXX -> YYYY} reports.
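
The peak computation described above can be sketched as follows.  The sketch
assumes the reports look like "{GC 2264k -> 1955k}", and that the maximum of
the first number corresponds to "Peak memory use before GGC" and the maximum
of the second to "Peak memory use after GGC".  The function name is
hypothetical and this is not the testing script's own code.

```python
import re

# Matches one collector report, e.g. "{GC 2264k -> 1955k}".
# Assumption: both sizes are printed in kilobytes with a trailing "k".
GC_REPORT = re.compile(r"\{GC (\d+)k -> (\d+)k\}")


def peak_ggc_usage(dump_text):
    """Return (peak before GGC, peak after GGC) in kilobytes.

    Returns None when the dump contains no {GC ... -> ...} reports.
    """
    pairs = [(int(before), int(after))
             for before, after in GC_REPORT.findall(dump_text)]
    if not pairs:
        return None
    peak_before = max(before for before, _ in pairs)
    peak_after = max(after for _, after in pairs)
    return peak_before, peak_after


# Made-up sample dump; the function names between reports are ignored.
sample = " foo {GC 2264k -> 1955k} bar {GC 2100k -> 1900k}"
print(peak_ggc_usage(sample))
```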

Your testing script.

