This is the mail archive of the
gcc-regression@gcc.gnu.org
mailing list for the GCC project.
A recent patch increased GCC's memory consumption in some cases!
- From: gcctest at suse dot de
- To: jh at suse dot cz, gcc-regression at gcc dot gnu dot org
- Date: Sat, 10 Feb 2007 14:13:57 +0000
- Subject: A recent patch increased GCC's memory consumption in some cases!
Hi,
I am a friendly script caring about memory consumption in GCC. Please
contact jh@suse.cz if something is going wrong.
Comparing memory consumption on compilation of combine.i, insn-attrtab.i,
and generate-3.4.ii I got:
comparing empty function compilation at -O0 level:
Overall memory needed: 7383k -> 7384k
Peak memory use before GGC: 2265k
Peak memory use after GGC: 1955k
Maximum of released memory in single GGC run: 310k
Garbage: 445k
Leak: 2289k
Overhead: 456k
GGC runs: 3
comparing empty function compilation at -O0 -g level:
Overall memory needed: 7399k -> 7400k
Peak memory use before GGC: 2293k
Peak memory use after GGC: 1982k
Maximum of released memory in single GGC run: 311k
Garbage: 448k
Leak: 2321k
Overhead: 461k
GGC runs: 3
comparing empty function compilation at -O1 level:
Overall memory needed: 7495k -> 7512k
Peak memory use before GGC: 2265k
Peak memory use after GGC: 1955k
Maximum of released memory in single GGC run: 310k
Garbage: 451k
Leak: 2291k
Overhead: 457k
GGC runs: 4
comparing empty function compilation at -O2 level:
Overall memory needed: 7507k -> 7512k
Peak memory use before GGC: 2266k
Peak memory use after GGC: 1955k
Maximum of released memory in single GGC run: 311k
Garbage: 454k
Leak: 2291k
Overhead: 457k
GGC runs: 4
comparing empty function compilation at -O3 level:
Overall memory needed: 7507k -> 7512k
Peak memory use before GGC: 2266k
Peak memory use after GGC: 1955k
Maximum of released memory in single GGC run: 311k
Garbage: 454k
Leak: 2291k
Overhead: 457k
GGC runs: 4
comparing combine.c compilation at -O0 level:
Overall memory needed: 17731k -> 17728k
Peak memory use before GGC: 9291k
Peak memory use after GGC: 8871k
Maximum of released memory in single GGC run: 2603k
Garbage: 37162k
Leak: 6539k
Overhead: 5024k
GGC runs: 280
comparing combine.c compilation at -O0 -g level:
Overall memory needed: 19823k -> 19824k
Peak memory use before GGC: 10897k
Peak memory use after GGC: 10531k
Maximum of released memory in single GGC run: 2375k
Garbage: 37738k
Leak: 9415k
Overhead: 5727k
GGC runs: 270
comparing combine.c compilation at -O1 level:
Peak amount of GGC memory allocated before garbage collecting increased from 19412k to 19436k, overall 0.12%
Amount of produced GGC garbage increased from 57374k to 57494k, overall 0.21%
Overall memory needed: 35163k -> 35176k
Peak memory use before GGC: 19412k -> 19436k
Peak memory use after GGC: 19206k -> 19221k
Maximum of released memory in single GGC run: 2197k
Garbage: 57374k -> 57494k
Leak: 6562k -> 6564k
Overhead: 6354k -> 6358k
GGC runs: 349 -> 352
comparing combine.c compilation at -O2 level:
Peak amount of GGC memory allocated before garbage collecting increased from 19446k to 19471k, overall 0.13%
Peak amount of GGC memory still allocated after garbage collecting increased from 19245k to 19269k, overall 0.12%
Overall memory needed: 37563k -> 37500k
Peak memory use before GGC: 19446k -> 19471k
Peak memory use after GGC: 19245k -> 19269k
Maximum of released memory in single GGC run: 2187k -> 2185k
Garbage: 68818k -> 68794k
Leak: 6681k -> 6673k
Overhead: 7955k -> 7992k
GGC runs: 406 -> 405
comparing combine.c compilation at -O3 level:
Overall memory needed: 45623k -> 45720k
Peak memory use before GGC: 20560k -> 20504k
Peak memory use after GGC: 19744k -> 19622k
Maximum of released memory in single GGC run: 3125k -> 3158k
Garbage: 102695k -> 101168k
Leak: 6817k -> 6814k
Overhead: 12395k -> 12209k
GGC runs: 456 -> 453
comparing insn-attrtab.c compilation at -O0 level:
Overall memory needed: 103547k -> 103544k
Peak memory use before GGC: 69329k
Peak memory use after GGC: 44976k
Maximum of released memory in single GGC run: 36886k
Garbage: 130571k
Leak: 9588k
Overhead: 16932k
GGC runs: 206
comparing insn-attrtab.c compilation at -O0 -g level:
Overall memory needed: 105067k -> 105068k
Peak memory use before GGC: 70490k
Peak memory use after GGC: 46244k
Maximum of released memory in single GGC run: 36886k
Garbage: 131730k
Leak: 11278k
Overhead: 17326k
GGC runs: 206
comparing insn-attrtab.c compilation at -O1 level:
Overall memory needed: 148183k -> 148000k
Peak memory use before GGC: 86337k
Peak memory use after GGC: 80543k -> 80544k
Maximum of released memory in single GGC run: 33045k -> 33048k
Garbage: 264605k -> 264635k
Leak: 9404k -> 9405k
Overhead: 27590k -> 27596k
GGC runs: 225
comparing insn-attrtab.c compilation at -O2 level:
Overall memory needed: 191643k -> 191528k
Peak memory use before GGC: 87648k -> 87649k
Peak memory use after GGC: 80609k
Maximum of released memory in single GGC run: 31388k -> 31387k
Garbage: 299515k -> 299498k
Leak: 9401k -> 9402k
Overhead: 33192k -> 33191k
GGC runs: 245
comparing insn-attrtab.c compilation at -O3 level:
Overall memory allocated via mmap and sbrk increased from 191659k to 196316k, overall 2.43%
Overall memory needed: 191659k -> 196316k
Peak memory use before GGC: 87665k -> 87666k
Peak memory use after GGC: 80626k -> 80627k
Maximum of released memory in single GGC run: 31450k
Garbage: 300153k -> 300155k
Leak: 9407k -> 9407k
Overhead: 33388k -> 33391k
GGC runs: 245
comparing Gerald's testcase PR8361 compilation at -O0 level:
Overall memory needed: 151192k -> 151197k
Peak memory use before GGC: 92317k
Peak memory use after GGC: 91394k
Maximum of released memory in single GGC run: 18793k
Garbage: 210317k
Leak: 49389k
Overhead: 23720k
GGC runs: 411
comparing Gerald's testcase PR8361 compilation at -O0 -g level:
Overall memory needed: 169352k -> 169353k
Peak memory use before GGC: 104950k
Peak memory use after GGC: 103910k
Maximum of released memory in single GGC run: 18979k
Garbage: 216936k
Leak: 72814k
Overhead: 29643k
GGC runs: 382
comparing Gerald's testcase PR8361 compilation at -O1 level:
Amount of produced GGC garbage decreased from 392073k to 346698k, overall -13.09%
Overall memory needed: 145031k -> 142270k
Peak memory use before GGC: 103088k -> 102378k
Peak memory use after GGC: 102050k -> 101350k
Maximum of released memory in single GGC run: 17981k -> 17982k
Garbage: 392073k -> 346698k
Leak: 50219k -> 50049k
Overhead: 34383k -> 30234k
GGC runs: 541 -> 523
comparing Gerald's testcase PR8361 compilation at -O2 level:
Amount of produced GGC garbage decreased from 431219k to 375730k, overall -14.77%
Overall memory needed: 146127k -> 142654k
Peak memory use before GGC: 103514k -> 103057k
Peak memory use after GGC: 102485k -> 101997k
Maximum of released memory in single GGC run: 17979k
Garbage: 431219k -> 375730k
Leak: 50909k -> 50648k
Overhead: 39623k -> 34176k
GGC runs: 592 -> 560
comparing Gerald's testcase PR8361 compilation at -O3 level:
Amount of produced GGC garbage decreased from 450979k to 392061k, overall -15.03%
Amount of memory still referenced at the end of compilation increased from 51004k to 51296k, overall 0.57%
Overall memory needed: 148995k -> 145910k
Peak memory use before GGC: 104781k -> 104814k
Peak memory use after GGC: 103711k -> 103748k
Maximum of released memory in single GGC run: 18300k
Garbage: 450979k -> 392061k
Leak: 51004k -> 51296k
Overhead: 41437k -> 35426k
GGC runs: 607 -> 570
comparing PR rtl-optimization/28071 testcase compilation at -O0 level:
Overall memory needed: 245794k -> 245795k
Peak memory use before GGC: 81775k
Peak memory use after GGC: 59514k
Maximum of released memory in single GGC run: 44985k
Garbage: 145986k
Leak: 7570k
Overhead: 24807k
GGC runs: 80
comparing PR rtl-optimization/28071 testcase compilation at -O0 -g level:
Overall memory needed: 246646k -> 246647k
Peak memory use before GGC: 82421k
Peak memory use after GGC: 60160k
Maximum of released memory in single GGC run: 44974k
Garbage: 146205k
Leak: 9338k
Overhead: 25303k
GGC runs: 89
comparing PR rtl-optimization/28071 testcase compilation at -O1 level:
Overall memory allocated via mmap and sbrk decreased from 258635k to 240616k, overall -7.49%
Peak amount of GGC memory allocated before garbage collecting run decreased from 104509k to 84340k, overall -23.91%
Peak amount of GGC memory still allocated after garbage collecting decreased from 101295k to 73894k, overall -37.08%
Amount of produced GGC garbage decreased from 240015k to 224276k, overall -7.02%
Amount of memory still referenced at the end of compilation decreased from 25176k to 19608k, overall -28.40%
Overall memory needed: 258635k -> 240616k
Peak memory use before GGC: 104509k -> 84340k
Peak memory use after GGC: 101295k -> 73894k
Maximum of released memory in single GGC run: 51835k -> 36499k
Garbage: 240015k -> 224276k
Leak: 25176k -> 19608k
Overhead: 29476k -> 30436k
GGC runs: 79 -> 81
comparing PR rtl-optimization/28071 testcase compilation at -O2 level:
Overall memory allocated via mmap and sbrk decreased from 531975k to 485972k, overall -9.47%
Peak amount of GGC memory allocated before garbage collecting run decreased from 104503k to 78744k, overall -32.71%
Peak amount of GGC memory still allocated after garbage collecting decreased from 101289k to 73894k, overall -37.07%
Amount of produced GGC garbage decreased from 270332k to 231677k, overall -16.69%
Amount of memory still referenced at the end of compilation decreased from 25093k to 19698k, overall -27.39%
Overall memory needed: 531975k -> 485972k
Peak memory use before GGC: 104503k -> 78744k
Peak memory use after GGC: 101289k -> 73894k
Maximum of released memory in single GGC run: 37167k -> 33796k
Garbage: 270332k -> 231677k
Leak: 25093k -> 19698k
Overhead: 35313k -> 32578k
GGC runs: 91 -> 92
comparing PR rtl-optimization/28071 testcase compilation at -O3 -fno-tree-pre -fno-tree-fre level:
Amount of produced GGC garbage increased from 371298k to 376322k, overall 1.35%
Overall memory needed: 1180679k -> 1180644k
Peak memory use before GGC: 200610k
Peak memory use after GGC: 188951k
Maximum of released memory in single GGC run: 80732k
Garbage: 371298k -> 376322k
Leak: 45260k -> 45252k
Overhead: 48285k -> 49159k
GGC runs: 70
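The "overall" percentages quoted in the summary lines above are consistent with measuring the change against the smaller of the two figures: the old value for an increase, the new value for a decrease. A minimal Python sketch under that assumption follows. The function name overall_change is hypothetical, and the actual testing script may compute the figure differently.

```python
def overall_change(old_k: int, new_k: int) -> float:
    """Signed percentage change, measured against the smaller value.

    Assumption: this mirrors the figures in the report; it is not taken
    from the testing script itself.
    """
    if old_k <= 0 or new_k <= 0:
        raise ValueError("memory figures must be positive")
    if new_k >= old_k:
        # Increase: relative to the old (smaller) value.
        return (new_k / old_k - 1.0) * 100.0
    # Decrease: relative to the new (smaller) value, reported as negative.
    return -(old_k / new_k - 1.0) * 100.0


# Figures taken from the report above (all in kilobytes).
print(f"{overall_change(191659, 196316):.2f}%")  # insn-attrtab.c -O3, overall memory
print(f"{overall_change(392073, 346698):.2f}%")  # PR8361 -O1, garbage
print(f"{overall_change(104509, 84340):.2f}%")   # PR 28071 -O1, peak before GGC
```

With these inputs the three calls print 2.43%, -13.09%, and -23.91%, which match the corresponding summary lines.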
Head of the ChangeLog is:
--- /usr/src/SpecTests/sandbox-britten-memory/x86_64/mem-result/ChangeLog 2007-02-09 20:27:14.000000000 +0000
+++ /usr/src/SpecTests/sandbox-britten-memory/gcc/gcc/ChangeLog 2007-02-10 12:12:58.000000000 +0000
@@ -1,3 +1,80 @@
+2007-02-10 Kaz Kojima <kkojima@gcc.gnu.org>
+
+ PR rtl-optimization/29599
+ * reload1.c (eliminate_regs_in_insn): Take the destination
+ mode into account when computing the offset.
+
+2007-02-09 Stuart Hastings <stuart@apple.com>
+ Richard Henderson <rth@redhat.com>
+
+ * gcc/config/i386/i386.h (TARGET_KEEPS_VECTOR_ALIGNED_STACK): New.
+ * gcc/config/i386/darwin.h: (TARGET_KEEPS_VECTOR_ALIGNED_STACK): New.
+ * gcc/config/i386/i386.md (fixuns_trunc<mode>si2, fixuns_truncsfhi2,
+ fixuns_truncdfhi2): New.
+ (fix_truncsfdi_sse): Call ix86_expand_convert_sign_didf_sse.
+ (floatunsdidf2): Call ix86_expand_convert_uns_didf_sse.
+ (floatunssisf2): Add call to ix86_expand_convert_uns_sisf_sse.
+ (floatunssidf2): Allow nonimmediate source.
+ * gcc/config/i386/sse.md (movdi_to_sse): New. (vec_concatv2di): Drop '*'.
+ * gcc/config/i386/i386-protos.h (ix86_expand_convert_uns_si_sse,
+ ix86_expand_convert_uns_didf_sse, ix86_expand_convert_uns_sidf_sse,
+ ix86_expand_convert_uns_sisf_sse, ix86_expand_convert_sign_didf_sse): New.
+ * gcc/config/i386/i386.c (ix86_expand_convert_uns_si_sse,
+ ix86_expand_convert_uns_didf_sse, ix86_expand_convert_uns_sidf_sse,
+ ix86_expand_convert_uns_sisf_sse, ix86_expand_convert_sign_didf_sse,
+ ix86_build_const_vector, ix86_expand_vector_init_one_nonzero): New.
+ (ix86_build_signbit_mask): Fix decl of v, refactor to call ix86_build_const_vector.
+ (x86_emit_floatuns): Rewrite.
+
+2007-02-10 Manuel Lopez-Ibanez <manu@gcc.gnu.org>
+
+ * genautomata.c (longest_path_length): Delete unused function.
+ (struct state): Delete unused longest_path_length.
+ (UNDEFINED_LONGEST_PATH_LENGTH): Delete unused macro.
+ (get_free_state): Delete unused.
+
+2007-02-09 Jan Hubicka <jh@suse.cz>
+
+ * params.def (PARAM_INLINE_UNIT_GROWTH): Set to 30.
+ * doc/invoke.texi (inline-unit-growth): Update default value.
+
+ * Makefile.in (passes.o, ipa-inline.o): Add dependencies.
+ * cgraphbuild.c (build_cgraph_edges): Compute frequencies.
+ (rebuild_cgraph_edges): Likewise.
+ * cgraph.c (cgraph_set_call_stmt): Add new argument frequency.
+ (dump_cgraph_node): Dump frequencies.
+ (cgraph_clone_edge): Add frequency scales.
+ (cgraph_clone_node): Add freuqnecy.
+ * cgraph.h (cgraph_edge): Add freuqnecy argument.
+ (CGRAPH_FREQ_BASE, CGRAPH_FREQ_MAX): New constants.
+ (cgraph_create_edge, cgraph_clone_edge, cgraph_clone_node): Update.
+ * tree-pass.h (TODO_rebuild_frequencies): New constant.
+ * cgraphunit.c (verify_cgraph_node): Verify frequencies.
+ (cgraph_copy_node_for_versioning): Update call of cgraph_clone_edge.
+ (save_inline_function_body): Likewise.
+ * ipa-inline.c: inluce rtl.h
+ (cgraph_clone_inlined_nods): Update call of cgraph_clone_node.
+ (cgraph_edge_badness): Use frequencies.
+ (cgraph_decide_recursive_inlining): Update clonning.
+ (cgraph_decide_inlining_of_small_function): Dump frequency.
+ * predict.c (estimate_bb_frequencies): Export.
+ * predict.h (estimate_bb_frequencies): Declare.
+ * tree-inline.c (copy_bb): Watch overflows.
+ (expand_call_inline): Update call of cgraph_create_edge.
+ (optimize_inline_calls): Use TODO flags to update frequnecies.
+ * passes.h: Include predict.h
+ (init_optimization_passes): Move profile ahead.
+ (execute_function_todo): Handle TODO_rebuild_frequencies.
+
+2007-02-09 Roger Sayle <roger@eyesopen.com>
+
+ * config/alpha/alpha.c (emit_insxl): Force the first operand of
+ the insbl or inswl pattern into a register.
+
+2007-02-09 Roger Sayle <roger@eyesopen.com>
+
+ * config/ia64/ia64.md (bswapdi2): New define_insn.
+
2007-02-09 Richard Henderson <rth@redhat.com>
* config/i386/constraints.md (Ym): New constraint.
The results can be reproduced by building a compiler with
--enable-gather-detailed-mem-stats targeting x86-64
and compiling preprocessed combine.c or the testcase from PR8361 with:
-fmem-report --param=ggc-min-heapsize=1024 --param=ggc-min-expand=1 -Ox -Q
The memory consumption summary appears in the dump after the detailed listing
of the places where memory is allocated. Peak memory consumption is
computed by looking for the maximal value in the {GC XXXX -> YYYY} reports.
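The peak figures can be sketched as a scan over the {GC XXXX -> YYYY} entries. The Python below is only an illustration: it assumes each entry looks like "{GC 2265k -> 1955k}" (real dumps may differ in spacing or units), and the function name summarize_gc and the sample text are hypothetical.

```python
import re

# Assumed entry shape, following the "{GC XXXX -> YYYY}" notation,
# e.g. "{GC 2265k -> 1955k}". This pattern is an assumption, not a spec.
GC_LINE = re.compile(r"\{GC\s+(\d+)k\s*->\s*(\d+)k\}")


def summarize_gc(dump: str) -> dict:
    """Collect peak figures from every {GC before -> after} entry."""
    runs = [(int(before), int(after)) for before, after in GC_LINE.findall(dump)]
    if not runs:
        return {"runs": 0}
    return {
        "runs": len(runs),                                # number of GGC runs seen
        "peak_before_k": max(b for b, _ in runs),         # largest XXXX
        "peak_after_k": max(a for _, a in runs),          # largest YYYY
        "max_released_k": max(b - a for b, a in runs),    # largest single drop
    }


# Made-up sample dump, for illustration only.
sample = "foo {GC 1200k -> 900k} bar\n {GC 2265k -> 1955k}\n {GC 2000k -> 1800k}"
print(summarize_gc(sample))
```

On the sample text this reports 3 runs, a peak of 2265k before collection, 1955k after, and a largest single release of 310k.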
Your testing script.