This is the mail archive of the
gcc-regression@gcc.gnu.org
mailing list for the GCC project.
A recent patch increased GCC's memory consumption in some cases!
- From: gcctest at suse dot de
- To: jh at suse dot cz, gcc-regression at gcc dot gnu dot org
- Date: Sat, 04 Nov 2006 10:36:57 +0000
- Subject: A recent patch increased GCC's memory consumption in some cases!
Hi,
I am a friendly script caring about memory consumption in GCC. Please
contact jh@suse.cz if something is going wrong.
Comparing memory consumption on compilation of combine.i, insn-attrtab.i,
and generate-3.4.ii I got:
comparing combine.c compilation at -O0 level:
Overall memory needed: 28370k -> 28378k
Peak memory use before GGC: 9293k
Peak memory use after GGC: 8832k
Maximum of released memory in single GGC run: 2666k
Garbage: 36856k
Leak: 6441k
Overhead: 4860k
GGC runs: 280
comparing combine.c compilation at -O1 level:
Amount of produced GGC garbage increased from 57445k to 57582k, overall 0.24%
Overall memory needed: 40210k -> 40214k
Peak memory use before GGC: 17281k
Peak memory use after GGC: 17106k
Maximum of released memory in single GGC run: 2382k -> 2363k
Garbage: 57445k -> 57582k
Leak: 6505k -> 6495k
Overhead: 6200k -> 6224k
GGC runs: 355
comparing combine.c compilation at -O2 level:
Amount of memory still referenced at the end of compilation increased from 6593k to 6603k, overall 0.15%
Overall memory needed: 29790k
Peak memory use before GGC: 17277k
Peak memory use after GGC: 17106k
Maximum of released memory in single GGC run: 2883k -> 2803k
Garbage: 76254k -> 74898k
Leak: 6593k -> 6603k
Overhead: 8744k -> 8470k
GGC runs: 420 -> 413
comparing combine.c compilation at -O3 level:
Overall memory needed: 28894k
Peak memory use before GGC: 18217k -> 18218k
Peak memory use after GGC: 17833k -> 17834k
Maximum of released memory in single GGC run: 4104k
Garbage: 106198k -> 104230k
Leak: 6668k
Overhead: 12303k -> 11907k
GGC runs: 469 -> 462
comparing insn-attrtab.c compilation at -O0 level:
Overall memory needed: 88230k
Peak memory use before GGC: 69777k
Peak memory use after GGC: 44187k
Maximum of released memory in single GGC run: 36963k
Garbage: 129065k
Leak: 9501k
Overhead: 16993k
GGC runs: 216
comparing insn-attrtab.c compilation at -O1 level:
Overall memory needed: 114174k -> 115034k
Peak memory use before GGC: 90363k
Peak memory use after GGC: 83725k
Maximum of released memory in single GGC run: 31806k -> 31852k
Garbage: 277740k -> 277769k
Leak: 9343k -> 9343k
Overhead: 29775k -> 29778k
GGC runs: 223
comparing insn-attrtab.c compilation at -O2 level:
Overall memory allocated via mmap and sbrk decreased from 134058k to 120390k, overall -11.35%
Overall memory needed: 134058k -> 120390k
Peak memory use before GGC: 92593k
Peak memory use after GGC: 84705k
Maximum of released memory in single GGC run: 30380k -> 30394k
Garbage: 319045k -> 317192k
Leak: 9345k
Overhead: 36716k -> 36353k
GGC runs: 247 -> 246
comparing insn-attrtab.c compilation at -O3 level:
Overall memory allocated via mmap and sbrk increased from 115570k to 134218k, overall 16.14%
Overall memory needed: 115570k -> 134218k
Peak memory use before GGC: 92618k
Peak memory use after GGC: 84731k
Maximum of released memory in single GGC run: 30570k -> 30584k
Garbage: 319697k -> 317844k
Leak: 9348k
Overhead: 36914k -> 36551k
GGC runs: 250
comparing Gerald's testcase PR8361 compilation at -O0 level:
Overall memory needed: 119538k
Peak memory use before GGC: 92680k
Peak memory use after GGC: 91760k
Maximum of released memory in single GGC run: 19314k
Garbage: 205600k
Leak: 47677k
Overhead: 20817k
GGC runs: 402
comparing Gerald's testcase PR8361 compilation at -O1 level:
Overall memory needed: 119278k
Peak memory use before GGC: 97848k
Peak memory use after GGC: 95638k
Maximum of released memory in single GGC run: 18600k
Garbage: 444357k -> 444206k
Leak: 50010k -> 50011k
Overhead: 32820k -> 32784k
GGC runs: 552
comparing Gerald's testcase PR8361 compilation at -O2 level:
Overall memory needed: 119286k
Peak memory use before GGC: 97848k
Peak memory use after GGC: 95638k
Maximum of released memory in single GGC run: 18600k
Garbage: 506005k -> 503957k
Leak: 50715k -> 50716k
Overhead: 40490k -> 40089k
GGC runs: 610 -> 609
comparing Gerald's testcase PR8361 compilation at -O3 level:
Overall memory needed: 118930k
Peak memory use before GGC: 97894k
Peak memory use after GGC: 96924k
Maximum of released memory in single GGC run: 18847k
Garbage: 525605k -> 523592k
Leak: 50291k -> 50291k
Overhead: 40993k -> 40599k
GGC runs: 623 -> 622
comparing PR rtl-optimization/28071 testcase compilation at -O0 level:
Overall memory needed: 137946k
Peak memory use before GGC: 81898k
Peak memory use after GGC: 58777k
Maximum of released memory in single GGC run: 45493k
Garbage: 147195k
Leak: 7522k
Overhead: 25300k
GGC runs: 83
comparing PR rtl-optimization/28071 testcase compilation at -O1 level:
Overall memory needed: 424310k -> 423006k
Peak memory use before GGC: 205260k
Peak memory use after GGC: 201036k
Maximum of released memory in single GGC run: 101716k -> 101714k
Garbage: 271708k -> 271706k
Leak: 47588k
Overhead: 30829k -> 30829k
GGC runs: 101
comparing PR rtl-optimization/28071 testcase compilation at -O2 level:
Amount of produced GGC garbage increased from 350433k to 351905k, overall 0.42%
Overall memory needed: 351126k -> 352334k
Peak memory use before GGC: 206011k -> 206001k
Peak memory use after GGC: 201787k -> 201777k
Maximum of released memory in single GGC run: 108042k -> 108617k
Garbage: 350433k -> 351905k
Leak: 48171k -> 48171k
Overhead: 46275k -> 46573k
GGC runs: 108 -> 110
comparing PR rtl-optimization/28071 testcase compilation at -O3 -fno-tree-pre -fno-tree-fre level:
Overall memory allocated via mmap and sbrk increased from 535350k to 781042k, overall 45.89%
Amount of produced GGC garbage increased from 491202k to 494299k, overall 0.63%
Overall memory needed: 535350k -> 781042k
Peak memory use before GGC: 314918k -> 314916k
Peak memory use after GGC: 293261k -> 293259k
Maximum of released memory in single GGC run: 163448k -> 165331k
Garbage: 491202k -> 494299k
Leak: 65503k -> 65503k
Overhead: 59091k -> 59714k
GGC runs: 95 -> 98
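The percentages in the summary lines above can be recomputed from the before/after figures. The sketch below is inferred from the numbers in this report and is not taken from the testing script itself: every quoted percentage is consistent with dividing the difference by the smaller of the two values, which would explain why the 134058k -> 120390k drop is reported as -11.35% rather than -10.20%. The function name `percent_change` is hypothetical.

```python
def percent_change(old_kb, new_kb):
    """Relative change in percent, measured against the smaller value.

    Assumption: this rule is inferred from the figures quoted in the
    report, not read from the testing script's source.
    """
    base = min(old_kb, new_kb)
    if base <= 0:
        raise ValueError("memory figures must be positive")
    return round((new_kb - old_kb) / base * 100, 2)


# Before/after pairs quoted in the report, in kilobytes.
for old, new in [(57445, 57582), (134058, 120390), (535350, 781042)]:
    print(f"{old}k -> {new}k, overall {percent_change(old, new):.2f}%")
```

Measuring against the smaller value makes a decrease look slightly larger than the conventional "relative to the old value" figure, so the sign and the rough magnitude are what matter when reading these lines.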
Head of the ChangeLog is:
--- /usr/src/SpecTests/sandbox-britten-memory/x86_64/mem-result/ChangeLog 2006-11-04 05:20:54.000000000 +0000
+++ /usr/src/SpecTests/sandbox-britten-memory/gcc/gcc/ChangeLog 2006-11-04 08:57:00.000000000 +0000
@@ -1,3 +1,32 @@
+2006-11-03 Paolo Bonzini <bonzini@gnu.org>
+ Steven Bosscher <stevenb.gcc@gmail.com>
+
+ * fwprop.c: New file.
+ * Makefile.in: Add fwprop.o.
+ * tree-pass.h (pass_rtl_fwprop, pass_rtl_fwprop_with_addr): New.
+ * passes.c (init_optimization_passes): Schedule forward propagation.
+ * rtlanal.c (loc_mentioned_in_p): Support NULL value of the second
+ parameter.
+ * timevar.def (TV_FWPROP): New.
+ * common.opt (-fforward-propagate): New.
+ * opts.c (decode_options): Enable forward propagation at -O2.
+ * gcse.c (one_cprop_pass): Do not run local cprop unless touching jumps.
+ * cse.c (fold_rtx_subreg, fold_rtx_mem, fold_rtx_mem_1, find_best_addr,
+ canon_for_address, table_size): Remove.
+ (new_basic_block, insert, remove_from_table): Remove references to
+ table_size.
+ (fold_rtx): Process SUBREGs and MEMs with equiv_constant, make
+ simplification loop more straightforward by not calling fold_rtx
+ recursively.
+ (equiv_constant): Move here a small part of fold_rtx_subreg,
+ do not call fold_rtx. Call avoid_constant_pool_reference
+ to process MEMs.
+ * recog.c (canonicalize_change_group): New.
+ * recog.h (canonicalize_change_group): New.
+
+ * doc/invoke.texi (Optimization Options): Document fwprop.
+ * doc/passes.texi (RTL passes): Document fwprop.
+
2006-11-03 Geoffrey Keating <geoffk@apple.com>
* c-decl.c (WANT_C99_INLINE_SEMANTICS): New, set to 1.
@@ -23,7 +52,6 @@
2006-11-03 Paul Brook <paul@codesourcery.com>
- gcc/
* config/arm/arm.c (arm_file_start): New function.
(TARGET_ASM_FILE_START): Define.
(arm_default_cpu): New variable.
The results can be reproduced by building a compiler with
--enable-gather-detailed-mem-stats targeting x86-64
and compiling preprocessed combine.c or the testcase from PR8632 with:
-fmem-report --param=ggc-min-heapsize=1024 --param=ggc-min-expand=1 -Ox -Q
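As a rough illustration, the command line for one measurement run could be assembled as below. This is a sketch, not the script's actual code: the helper name `mem_report_command`, the default compiler name `gcc`, and the added `-c` flag are assumptions. The remaining flags are the ones listed above, with `-Ox` expanded to a concrete optimization level.

```python
def mem_report_command(source, opt_level, compiler="gcc"):
    """Build the argument list for one measurement run.

    `compiler` is a placeholder; it should point at a GCC built with
    --enable-gather-detailed-mem-stats. The `-c` flag is an addition
    made here so that no linking is attempted.
    """
    return [
        compiler,
        "-fmem-report",
        "--param=ggc-min-heapsize=1024",
        "--param=ggc-min-expand=1",
        f"-O{opt_level}",
        "-Q",
        "-c",
        source,
    ]


# One command per optimization level used in the report.
for level in range(4):
    print(" ".join(mem_report_command("combine.i", level)))
```

The resulting list can be passed to `subprocess.run` with `capture_output=True` to collect the dump for later inspection.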
The memory consumption summary appears in the dump after the detailed listing
of the places where memory is allocated. Peak memory consumption is
computed by looking for the maximal value in the {GC XXXX -> YYYY} reports.
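The peak computation described above might look like the following sketch. It assumes each collection is recorded as `{GC XXXXk -> YYYYk}` with sizes in kilobytes, and that the peak before GGC is the largest XXXX while the peak after GGC is the largest YYYY. The names `GC_RECORD` and `peak_memory` are hypothetical, and the sample text is made up for illustration.

```python
import re

# Assumed shape of one collection record in the dump: "{GC 9293k -> 8832k}".
GC_RECORD = re.compile(r"\{GC (\d+)k -> (\d+)k\}")


def peak_memory(dump_text):
    """Return (peak before GGC, peak after GGC) in kilobytes.

    Returns None when the dump contains no collection records.
    """
    records = [(int(before), int(after))
               for before, after in GC_RECORD.findall(dump_text)]
    if not records:
        return None
    peak_before = max(before for before, _ in records)
    peak_after = max(after for _, after in records)
    return peak_before, peak_after


# Made-up sample, not taken from a real dump.
sample = " {GC 5120k -> 4100k} {GC 9000k -> 8500k} {GC 7000k -> 6100k}"
print(peak_memory(sample))
```

The two maxima are taken independently, so they need not come from the same collection run.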
Your testing script.