A recent patch increased GCC's memory consumption in some cases!

gcctest@suse.de gcctest@suse.de
Thu Dec 21 06:35:00 GMT 2006


Hi,

I am a friendly script caring about memory consumption in GCC.  Please
contact jh@suse.cz if something is going wrong.

Comparing memory consumption on compilation of combine.i, insn-attrtab.i,
and generate-3.4.ii I got:


comparing empty function compilation at -O0 level:
    Overall memory needed: 18281k -> 18293k
    Peak memory use before GGC: 2235k
    Peak memory use after GGC: 1942k
    Maximum of released memory in single GGC run: 293k
    Garbage: 423k -> 423k
    Leak: 2273k
    Overhead: 446k
    GGC runs: 3

comparing empty function compilation at -O0 -g level:
    Overall memory needed: 18297k -> 18309k
    Peak memory use before GGC: 2263k
    Peak memory use after GGC: 1970k
    Maximum of released memory in single GGC run: 293k
    Garbage: 425k -> 426k
    Leak: 2305k
    Overhead: 450k
    GGC runs: 3

comparing empty function compilation at -O1 level:
    Overall memory needed: 18381k -> 18397k
    Peak memory use before GGC: 2235k
    Peak memory use after GGC: 1942k
    Maximum of released memory in single GGC run: 293k
    Garbage: 427k -> 427k
    Leak: 2275k
    Overhead: 446k
    GGC runs: 4

comparing empty function compilation at -O2 level:
    Overall memory needed: 18393k -> 18409k
    Peak memory use before GGC: 2236k
    Peak memory use after GGC: 1942k
    Maximum of released memory in single GGC run: 294k
    Garbage: 430k -> 430k
    Leak: 2275k
    Overhead: 447k
    GGC runs: 4

comparing empty function compilation at -O3 level:
    Overall memory needed: 18393k -> 18409k
    Peak memory use before GGC: 2236k
    Peak memory use after GGC: 1942k
    Maximum of released memory in single GGC run: 294k
    Garbage: 430k -> 430k
    Leak: 2275k
    Overhead: 447k
    GGC runs: 4

comparing combine.c compilation at -O0 level:
    Overall memory needed: 28441k -> 28461k
    Peak memory use before GGC: 9288k
    Peak memory use after GGC: 8804k -> 8803k
    Maximum of released memory in single GGC run: 2642k
    Garbage: 37525k -> 37557k
    Leak: 6454k
    Overhead: 4872k -> 4872k
    GGC runs: 280 -> 281

comparing combine.c compilation at -O0 -g level:
    Overall memory needed: 30525k -> 30537k
    Peak memory use before GGC: 10835k -> 10834k
    Peak memory use after GGC: 10464k -> 10463k
    Maximum of released memory in single GGC run: 2320k
    Garbage: 38104k -> 38132k
    Leak: 9330k
    Overhead: 5574k -> 5574k
    GGC runs: 272

comparing combine.c compilation at -O1 level:
  Amount of memory still referenced at the end of compilation increased from 6481k to 6489k, overall 0.12%
    Overall memory needed: 29474k
    Peak memory use before GGC: 16962k -> 16963k
    Peak memory use after GGC: 16792k
    Maximum of released memory in single GGC run: 2254k -> 2252k
    Garbage: 55702k -> 55738k
    Leak: 6481k -> 6489k
    Overhead: 9959k -> 9959k
    GGC runs: 358

comparing combine.c compilation at -O2 level:
    Overall memory needed: 29474k
    Peak memory use before GGC: 16967k
    Peak memory use after GGC: 16792k
    Maximum of released memory in single GGC run: 2363k -> 2371k
    Garbage: 71797k -> 71830k
    Leak: 6611k -> 6602k
    Overhead: 11849k -> 11849k
    GGC runs: 412

comparing combine.c compilation at -O3 level:
    Overall memory needed: 29602k
    Peak memory use before GGC: 18066k -> 18067k
    Peak memory use after GGC: 17599k -> 17600k
    Maximum of released memory in single GGC run: 3676k -> 3678k
    Garbage: 106280k -> 106302k
    Leak: 6684k
    Overhead: 16905k -> 16904k
    GGC runs: 460

comparing insn-attrtab.c compilation at -O0 level:
  Amount of produced GGC garbage increased from 132119k to 132390k, overall 0.21%
    Overall memory needed: 89654k -> 89650k
    Peak memory use before GGC: 71193k
    Peak memory use after GGC: 44700k -> 44699k
    Maximum of released memory in single GGC run: 37868k
    Garbage: 132119k -> 132390k
    Leak: 9518k -> 9278k
    Overhead: 16954k -> 16954k
    GGC runs: 211 -> 212

comparing insn-attrtab.c compilation at -O0 -g level:
  Amount of memory still referenced at the end of compilation increased from 10981k to 11221k, overall 2.19%
    Overall memory needed: 90826k
    Peak memory use before GGC: 72355k -> 72354k
    Peak memory use after GGC: 45967k -> 45966k
    Maximum of released memory in single GGC run: 37869k -> 37868k
    Garbage: 133525k -> 133284k
    Leak: 10981k -> 11221k
    Overhead: 17349k -> 17349k
    GGC runs: 209

comparing insn-attrtab.c compilation at -O1 level:
    Overall memory needed: 93726k -> 93730k
    Peak memory use before GGC: 71858k
    Peak memory use after GGC: 67990k
    Maximum of released memory in single GGC run: 31671k -> 31668k
    Garbage: 229875k -> 229900k
    Leak: 9343k
    Overhead: 29520k -> 29520k
    GGC runs: 220

comparing insn-attrtab.c compilation at -O2 level:
    Overall memory needed: 123986k -> 124006k
    Peak memory use before GGC: 79550k
    Peak memory use after GGC: 73714k -> 73715k
    Maximum of released memory in single GGC run: 30216k -> 30212k
    Garbage: 282685k -> 282733k
    Leak: 9345k
    Overhead: 35777k -> 35781k
    GGC runs: 243 -> 244

comparing insn-attrtab.c compilation at -O3 level:
  Overall memory allocated via mmap and sbrk decreased from 128814k to 123766k, overall -4.08%
    Overall memory needed: 128814k -> 123766k
    Peak memory use before GGC: 79575k -> 79576k
    Peak memory use after GGC: 73740k
    Maximum of released memory in single GGC run: 30410k -> 30407k
    Garbage: 283525k -> 283552k
    Leak: 9350k
    Overhead: 36009k -> 36009k
    GGC runs: 247

comparing Gerald's testcase PR8361 compilation at -O0 level:
  Amount of produced GGC garbage increased from 211915k to 212865k, overall 0.45%
    Overall memory needed: 119066k
    Peak memory use before GGC: 92332k -> 92331k
    Peak memory use after GGC: 91417k
    Maximum of released memory in single GGC run: 19250k -> 19248k
    Garbage: 211915k -> 212865k
    Leak: 48116k
    Overhead: 21222k -> 21222k
    GGC runs: 416 -> 417

comparing Gerald's testcase PR8361 compilation at -O0 -g level:
  Amount of produced GGC garbage increased from 218494k to 219448k, overall 0.44%
    Overall memory needed: 131642k
    Peak memory use before GGC: 104689k -> 104692k
    Peak memory use after GGC: 103652k
    Maximum of released memory in single GGC run: 18932k -> 18936k
    Garbage: 218494k -> 219448k
    Leak: 71548k
    Overhead: 27127k -> 27126k
    GGC runs: 389 -> 390

comparing Gerald's testcase PR8361 compilation at -O1 level:
    Overall memory needed: 119502k
    Peak memory use before GGC: 96665k
    Peak memory use after GGC: 94462k
    Maximum of released memory in single GGC run: 17940k
    Garbage: 442643k -> 442789k
    Leak: 50182k -> 50182k
    Overhead: 103830k -> 103827k
    GGC runs: 562

comparing Gerald's testcase PR8361 compilation at -O2 level:
    Overall memory needed: 119554k
    Peak memory use before GGC: 96692k
    Peak memory use after GGC: 94489k
    Maximum of released memory in single GGC run: 18081k
    Garbage: 497476k -> 497651k
    Leak: 51150k
    Overhead: 58753k -> 58753k
    GGC runs: 615 -> 616

comparing Gerald's testcase PR8361 compilation at -O3 level:
    Overall memory needed: 121138k
    Peak memory use before GGC: 97648k
    Peak memory use after GGC: 96144k
    Maximum of released memory in single GGC run: 18476k
    Garbage: 517975k -> 518089k
    Leak: 51123k
    Overhead: 58758k -> 58756k
    GGC runs: 621

comparing PR rtl-optimization/28071 testcase compilation at -O0 level:
    Overall memory needed: 137642k -> 137634k
    Peak memory use before GGC: 81588k -> 81587k
    Peak memory use after GGC: 58467k
    Maximum of released memory in single GGC run: 45167k -> 45166k
    Garbage: 148519k -> 148526k
    Leak: 7542k
    Overhead: 25329k
    GGC runs: 82

comparing PR rtl-optimization/28071 testcase compilation at -O0 -g level:
    Overall memory needed: 138014k
    Peak memory use before GGC: 82234k -> 82233k
    Peak memory use after GGC: 59113k
    Maximum of released memory in single GGC run: 45232k
    Garbage: 148730k -> 148736k
    Leak: 9309k
    Overhead: 25824k
    GGC runs: 88

comparing PR rtl-optimization/28071 testcase compilation at -O1 level:
    Overall memory needed: 418466k -> 418794k
    Peak memory use before GGC: 199519k
    Peak memory use after GGC: 192216k -> 192217k
    Maximum of released memory in single GGC run: 94104k -> 109875k
    Garbage: 284119k -> 284125k
    Leak: 29778k
    Overhead: 31548k -> 31548k
    GGC runs: 98

comparing PR rtl-optimization/28071 testcase compilation at -O2 level:
    Overall memory needed: 343286k -> 343378k
    Peak memory use before GGC: 199512k
    Peak memory use after GGC: 192209k -> 192210k
    Maximum of released memory in single GGC run: 96120k -> 111890k
    Garbage: 364357k -> 364364k
    Leak: 30361k
    Overhead: 47297k -> 47297k
    GGC runs: 104

comparing PR rtl-optimization/28071 testcase compilation at -O3 -fno-tree-pre -fno-tree-fre level:
  Overall memory allocated via mmap and sbrk decreased from 771946k to 748734k, overall -3.10%
    Overall memory needed: 771946k -> 748734k
    Peak memory use before GGC: 317621k -> 317622k
    Peak memory use after GGC: 296096k
    Maximum of released memory in single GGC run: 168283k -> 186389k
    Garbage: 504436k -> 504442k
    Leak: 45414k
    Overhead: 60274k -> 60274k
    GGC runs: 98 -> 99

Head of the ChangeLog is:

--- /usr/src/SpecTests/sandbox-britten-memory/x86_64/mem-result/ChangeLog	2006-12-20 13:14:45.000000000 +0000
+++ /usr/src/SpecTests/sandbox-britten-memory/gcc/gcc/ChangeLog	2006-12-21 04:54:51.000000000 +0000
@@ -1,3 +1,111 @@
+2006-12-20  Roger Sayle  <roger@eyesopen.com>
+
+	* simplify-rtx.c (simplify_subreg): Use the correct mode when
+	determining whether a SUBREG of a CONCAT refers to the first or
+	second component.
+
+2006-12-21  Ben Elliston  <bje@au.ibm.com>
+
+	* config/spu/spu.c (spu_builtin_mul_widen_even): Remove unused
+	local variable `d'.
+
+2006-12-20  Jan Hubicka  <jh@suse.cz>
+
+	* tree-dfa.c (add_referenced_var): Walk initializers of
+	non-constant/readonly static vars.
+
+2006-12-20  Jan Hubicka  <jh@suse.cz>
+
+	* tree-flow-inline.h (gimple_var_anns): New function.
+	(var_ann): Use hashtable for static functions.
+	* tree-dfa.c (create_var_ann): Likewise.
+	* tree-ssa.c (var_ann_eq, var_ann_hash): New functions.
+	(init_tree_ssa): Initialize var anns.
+	(delete_tree_ssa): Delete var anns; also clear out gimple_df.
+	* tree-flow.h (struct static_var_ann_d): New structure.
+	(gimple_df): Add var_anns.
+
+2006-12-20  Carlos O'Donell  <carlos@codesourcery.com>
+
+	PR bootstrap/30242
+	* gcc/c-incpath.c (add_standard_paths): Only relocate paths that 
+	begin with the configured prefix. 
+
+2006-12-20  Jan Hubicka  <jh@suse.cz>
+
+	PR target/30213
+	* i386.c (expand_setmem_epilogue): Fix formating.
+	(dsmalest_pow2_greater_than): New function.
+	(ix86_expand_movmem): Improve comments; avoid re-computing of
+	epilogue size.
+	(promote_duplicated_reg_to_size): Break out from ...
+	(expand_setmem): ... this one; reorganize promotion code;
+	improve comments; avoid recomputation of epilogue size.
+
+2006-12-20  Andrew Pinski  <pinskia@gmail.com>
+
+	PR middle-end/30143
+	* omp-low.c (init_tmp_var): New function.       
+	(save_tmp_var): New function.
+	(lower_omp_1): Use them for VAR_DECL.
+
+2006-12-20  Andrew Pinski  <pinskia@gmail.com>
+
+	* tree-gimple.c (is_gimple_min_invariant): Treat constant vector
+	CONSTRUCTORs as min invariants.
+
+2006-12-20  Joseph Myers  <joseph@codesourcery.com>
+
+	* rtlanal.c (struct subreg_info, subreg_get_info, subreg_nregs):
+	New.
+	(subreg_regno_offset, subreg_offset_representable_p): Change to
+	wrappers about subreg_get_info.
+	(refers_to_regno_p, reg_overlap_mentioned_p): Use subreg_nregs.
+	* rtl.h (subreg_nregs): Declare.
+	* doc/tm.texi (HARD_REGNO_NREGS_HAS_PADDING): Update to refer to
+	subreg_get_info.
+	* caller-save.c (mark_set_regs, add_stored_regs): Use
+	subreg_nregs.
+	* df-scan.c (df_ref_record): Use subreg_nregs.
+	* flow.c (mark_set_1): Use subreg_nregs.
+	* postreload.c (move2add_note_store): Use subreg_nregs.
+	* reload.c (decompose, refers_to_regno_for_reload_p,
+	reg_overlap_mentioned_for_reload_p): Use subreg_nregs.
+	* resource.c (update_live_status, mark_referenced_resources,
+	mark_set_resources): Use subreg_nregs.
+
+2006-12-20  Zdenek Dvorak <dvorakz@suse.cz>
+
+	* loop-unswitch.c (unswitch_loop): Update arguments of
+	duplicate_loop_to_header_edge call.
+	* tree-ssa-loop-ivcanon.c (try_unroll_loop_completely): Ditto.
+	* loop-unroll.c (peel_loop_completely, unroll_loop_constant_iterations,
+	unroll_loop_runtime_iterations, peel_loop_simple, unroll_loop_stupid):
+	Ditto.
+	* cfgloopmanip.c (loop_version): Ditto.
+	(duplicate_loop_to_header_edge): Change
+	type of to_remove to VEC(edge), remove n_to_remove argument.
+	* tree-ssa-loop-manip.c (tree_duplicate_loop_to_header_edge):
+	Change type of to_remove to VEC(edge), remove n_to_remove argument.
+	(tree_unroll_loop): Update arguments of
+	tree_duplicate_loop_to_header_edge call.
+	* cfghooks.c (cfg_hook_duplicate_loop_to_header_edge):
+	Change type of to_remove to VEC(edge), remove n_to_remove argument.
+	* cfghooks.h (struct cfg_hooks): Type of
+	cfg_hook_duplicate_loop_to_header_edge changed.
+	(cfg_hook_duplicate_loop_to_header_edge): Declaration changed.
+	* cfgloop.h (duplicate_loop_to_header_edge): Ditto.
+	* tree-flow.h (tree_duplicate_loop_to_header_edge): Ditto.
+
+2006-12-20  Dorit Nuzman  <dorit@il.ibm.com>
+
+	* config/spu/spu.md (vec_widen_umult_hi_v8hi): New.
+	(vec_widen_umult_lo_v8hi, vec_widen_smult_hi_v8hi): New.
+	(vec_widen_smult_lo_v8hi): New.
+	* config/spu/spu.c (TARGET_VECTORIZE_BUILTIN_MUL_WIDEN_EVEN): Defined.
+	(TARGET_VECTORIZE_BUILTIN_MUL_WIDEN_ODD): Defined.
+	(spu_builtin_mul_widen_even, spu_builtin_mul_widen_odd): New.
+
 2006-12-20  Jan Hubicka  <jh@suse.cz>
 
 	* cgraph.c: Update overall comment; fix vertical spacing.


The results can be reproduced by building a compiler with

--enable-gather-detailed-mem-stats targeting x86-64

and compiling preprocessed combine.c or the testcase from PR8361 with:

-fmem-report --param=ggc-min-heapsize=1024 --param=ggc-min-expand=1 -Ox -Q

The memory consumption summary appears in the dump after the detailed
listing of the places where memory is allocated.  Peak memory consumption
is computed by taking the maximal value found in the {GC XXXX -> YYYY}
reports.
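
The peak computation described above could be sketched roughly as follows.
This is only an illustration, not the script that produced this report: the
function name peak_memory is hypothetical, the regular expression assumes
each report looks like "{GC 2235k -> 1942k}" (with an optional "k" suffix),
and the sample text is invented.

```python
import re

# Assumed shape of one collection report in the -Q output,
# e.g. "{GC 2235k -> 1942k}"; the "k" suffix is treated as optional.
GC_REPORT = re.compile(r"\{GC\s+(\d+)k?\s*->\s*(\d+)k?\}")


def peak_memory(dump_text):
    """Return (peak_before, peak_after) over all GC reports, or None.

    Values are returned in whatever unit the dump uses (assumed kilobytes).
    """
    reports = [(int(before), int(after))
               for before, after in GC_REPORT.findall(dump_text)]
    if not reports:
        return None
    peak_before = max(before for before, _ in reports)
    peak_after = max(after for _, after in reports)
    return peak_before, peak_after


if __name__ == "__main__":
    # Invented sample text standing in for a real compiler dump.
    sample = "main {GC 2235k -> 1942k} foo {GC 2100k -> 1800k}"
    print(peak_memory(sample))
```

Run on a saved dump, the two returned values would correspond to the "Peak
memory use before GGC" and "Peak memory use after GGC" lines, provided the
reports really have the assumed shape.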

Your testing script.


