This is the mail archive of the
gcc-regression@gcc.gnu.org
mailing list for the GCC project.
A recent patch increased GCC's memory consumption in some cases!
- From: gcctest at suse dot de
- To: jh at suse dot cz, gcc-regression at gcc dot gnu dot org
- Date: Sun, 28 Jan 2007 09:33:58 +0000
- Subject: A recent patch increased GCC's memory consumption in some cases!
Hi,
I am a friendly script caring about memory consumption in GCC. Please
contact jh@suse.cz if something is going wrong.
Comparing memory consumption on compilation of combine.i, insn-attrtab.i,
and generate-3.4.ii I got:
comparing empty function compilation at -O0 level:
Overall memory needed: 7384k
Peak memory use before GGC: 2264k -> 2263k
Peak memory use after GGC: 1955k -> 1954k
Maximum of released memory in single GGC run: 309k
Garbage: 444k -> 444k
Leak: 2289k -> 2288k
Overhead: 456k
GGC runs: 3
comparing empty function compilation at -O0 -g level:
Overall memory needed: 7400k
Peak memory use before GGC: 2292k -> 2290k
Peak memory use after GGC: 1982k -> 1981k
Maximum of released memory in single GGC run: 310k -> 309k
Garbage: 447k -> 446k
Leak: 2321k -> 2320k
Overhead: 460k
GGC runs: 3
comparing empty function compilation at -O1 level:
Overall memory needed: 7492k -> 7488k
Peak memory use before GGC: 2264k -> 2263k
Peak memory use after GGC: 1955k -> 1954k
Maximum of released memory in single GGC run: 309k
Garbage: 451k -> 450k
Leak: 2291k -> 2290k
Overhead: 456k
GGC runs: 4
comparing empty function compilation at -O2 level:
Overall memory needed: 7504k -> 7500k
Peak memory use before GGC: 2265k -> 2263k
Peak memory use after GGC: 1955k -> 1954k
Maximum of released memory in single GGC run: 310k -> 309k
Garbage: 454k -> 453k
Leak: 2291k -> 2290k
Overhead: 457k
GGC runs: 4
comparing empty function compilation at -O3 level:
Overall memory needed: 7504k -> 7500k
Peak memory use before GGC: 2265k -> 2263k
Peak memory use after GGC: 1955k -> 1954k
Maximum of released memory in single GGC run: 310k -> 309k
Garbage: 454k -> 453k
Leak: 2291k -> 2290k
Overhead: 457k
GGC runs: 4
comparing combine.c compilation at -O0 level:
Overall memory needed: 17812k
Peak memory use before GGC: 9328k -> 9326k
Peak memory use after GGC: 8891k -> 8889k
Maximum of released memory in single GGC run: 2633k
Garbage: 37290k -> 37276k
Leak: 6539k -> 6537k
Overhead: 4654k -> 4654k
GGC runs: 280
comparing combine.c compilation at -O0 -g level:
Overall memory needed: 19716k -> 19724k
Peak memory use before GGC: 10917k -> 10915k
Peak memory use after GGC: 10551k -> 10549k
Maximum of released memory in single GGC run: 2393k
Garbage: 37860k -> 37861k
Leak: 9415k -> 9413k
Overhead: 5357k -> 5358k
GGC runs: 271
comparing combine.c compilation at -O1 level:
Overall memory needed: 33008k
Peak memory use before GGC: 19646k -> 19635k
Peak memory use after GGC: 19446k -> 19411k
Maximum of released memory in single GGC run: 2266k
Garbage: 54707k -> 54602k
Leak: 6570k -> 6565k
Overhead: 9682k -> 9683k
GGC runs: 352 -> 351
comparing combine.c compilation at -O2 level:
Overall memory needed: 36516k -> 36508k
Peak memory use before GGC: 19650k -> 19648k
Peak memory use after GGC: 19454k -> 19452k
Maximum of released memory in single GGC run: 2206k
Garbage: 70486k -> 70452k
Leak: 6681k -> 6679k
Overhead: 11486k -> 11486k
GGC runs: 410 -> 409
comparing combine.c compilation at -O3 level:
Overall memory needed: 47456k -> 47452k
Peak memory use before GGC: 20775k -> 20773k
Peak memory use after GGC: 19927k -> 19925k
Maximum of released memory in single GGC run: 3165k
Garbage: 103719k -> 103686k
Leak: 6769k -> 6767k
Overhead: 16368k -> 16368k
GGC runs: 463 -> 462
comparing insn-attrtab.c compilation at -O0 level:
Overall memory needed: 104740k -> 104728k
Peak memory use before GGC: 70357k -> 70355k
Peak memory use after GGC: 45189k -> 45187k
Maximum of released memory in single GGC run: 37701k
Garbage: 131191k -> 131157k
Leak: 9580k -> 9580k
Overhead: 15666k -> 15666k
GGC runs: 208 -> 206
comparing insn-attrtab.c compilation at -O0 -g level:
Overall memory needed: 106124k -> 106132k
Peak memory use before GGC: 71519k -> 71517k
Peak memory use after GGC: 46457k -> 46455k
Maximum of released memory in single GGC run: 37702k
Garbage: 132352k -> 132319k
Leak: 11270k
Overhead: 16061k -> 16060k
GGC runs: 206
comparing insn-attrtab.c compilation at -O1 level:
Overall memory needed: 123928k
Peak memory use before GGC: 71508k
Peak memory use after GGC: 67853k
Maximum of released memory in single GGC run: 31672k
Garbage: 227207k -> 227166k
Leak: 9399k -> 9399k
Overhead: 28103k -> 28102k
GGC runs: 224
comparing insn-attrtab.c compilation at -O2 level:
Overall memory needed: 187960k -> 187932k
Peak memory use before GGC: 78337k
Peak memory use after GGC: 72716k
Maximum of released memory in single GGC run: 30537k
Garbage: 278319k -> 278290k
Leak: 9396k -> 9396k
Overhead: 34361k
GGC runs: 246
comparing insn-attrtab.c compilation at -O3 level:
Overall memory needed: 192884k -> 192868k
Peak memory use before GGC: 78349k
Peak memory use after GGC: 72729k
Maximum of released memory in single GGC run: 30607k
Garbage: 279046k -> 279017k
Leak: 9398k -> 9398k
Overhead: 34569k
GGC runs: 246
comparing Gerald's testcase PR8361 compilation at -O0 level:
Overall memory needed: 151599k -> 151647k
Peak memory use before GGC: 92618k -> 92635k
Peak memory use after GGC: 91701k -> 91718k
Maximum of released memory in single GGC run: 18916k -> 18923k
Garbage: 209460k -> 209437k
Leak: 49262k -> 49261k
Overhead: 21551k -> 21553k
GGC runs: 409
comparing Gerald's testcase PR8361 compilation at -O0 -g level:
Overall memory needed: 169779k -> 169803k
Peak memory use before GGC: 105241k -> 105257k
Peak memory use after GGC: 104198k -> 104215k
Maximum of released memory in single GGC run: 19093k -> 19098k
Garbage: 216087k -> 216064k
Leak: 72687k -> 72686k
Overhead: 27474k -> 27476k
GGC runs: 383
comparing Gerald's testcase PR8361 compilation at -O1 level:
Peak amount of GGC memory allocated before garbage collecting increased from 98365k to 102196k, overall 3.89%
Peak amount of GGC memory still allocated after garbage collecting increased from 97363k to 101168k, overall 3.91%
Overall memory needed: 137347k -> 138968k
Peak memory use before GGC: 98365k -> 102196k
Peak memory use after GGC: 97363k -> 101168k
Maximum of released memory in single GGC run: 18086k -> 18090k
Garbage: 391125k -> 386849k
Leak: 50527k -> 50191k
Overhead: 52032k -> 56787k
GGC runs: 543 -> 538
comparing Gerald's testcase PR8361 compilation at -O2 level:
Peak amount of GGC memory allocated before garbage collecting increased from 98404k to 102499k, overall 4.16%
Peak amount of GGC memory still allocated after garbage collecting increased from 97429k to 101482k, overall 4.16%
Overall memory needed: 139199k -> 140712k
Peak memory use before GGC: 98404k -> 102499k
Peak memory use after GGC: 97429k -> 101482k
Maximum of released memory in single GGC run: 18077k
Garbage: 446548k -> 445293k
Leak: 51291k -> 51298k
Overhead: 47086k -> 49303k
GGC runs: 593 -> 590
comparing Gerald's testcase PR8361 compilation at -O3 level:
Overall memory allocated via mmap and sbrk increased from 141859k to 146852k, overall 3.52%
Peak amount of GGC memory allocated before garbage collecting increased from 100102k to 104121k, overall 4.01%
Peak amount of GGC memory still allocated after garbage collecting increased from 99104k to 103085k, overall 4.02%
Amount of memory still referenced at the end of compilation increased from 51434k to 51563k, overall 0.25%
Overall memory needed: 141859k -> 146852k
Peak memory use before GGC: 100102k -> 104121k
Peak memory use after GGC: 99104k -> 103085k
Maximum of released memory in single GGC run: 18473k -> 18474k
Garbage: 469348k -> 468212k
Leak: 51434k -> 51563k
Overhead: 47685k -> 49604k
GGC runs: 607 -> 604
comparing PR rtl-optimization/28071 testcase compilation at -O0 level:
Overall memory needed: 246463k
Peak memory use before GGC: 82633k -> 82631k
Peak memory use after GGC: 59515k -> 59514k
Maximum of released memory in single GGC run: 45585k
Garbage: 148110k -> 148104k
Leak: 8083k -> 8081k
Overhead: 24863k
GGC runs: 80
comparing PR rtl-optimization/28071 testcase compilation at -O0 -g level:
Overall memory needed: 247363k -> 247355k
Peak memory use before GGC: 83279k -> 83277k
Peak memory use after GGC: 60161k -> 60160k
Maximum of released memory in single GGC run: 45230k
Garbage: 148385k -> 148385k
Leak: 9338k -> 9337k
Overhead: 25359k -> 25359k
GGC runs: 88
comparing PR rtl-optimization/28071 testcase compilation at -O1 level:
Overall memory allocated via mmap and sbrk decreased from 329200k to 269756k, overall -22.04%
Peak amount of GGC memory allocated before garbage collecting run decreased from 201002k to 108753k, overall -84.82%
Peak amount of GGC memory still allocated after garbage collecting decreased from 189198k to 104116k, overall -81.72%
Amount of produced GGC garbage decreased from 275765k to 241610k, overall -14.14%
Amount of memory still referenced at the end of compilation decreased from 30235k to 25175k, overall -20.10%
Overall memory needed: 329200k -> 269756k
Peak memory use before GGC: 201002k -> 108753k
Peak memory use after GGC: 189198k -> 104116k
Maximum of released memory in single GGC run: 135346k -> 54664k
Garbage: 275765k -> 241610k
Leak: 30235k -> 25175k
Overhead: 31268k -> 30382k
GGC runs: 74 -> 78
comparing PR rtl-optimization/28071 testcase compilation at -O2 level:
Overall memory allocated via mmap and sbrk increased from 302720k to 431276k, overall 42.47%
Peak amount of GGC memory allocated before garbage collecting run decreased from 319046k to 299263k, overall -6.61%
Peak amount of GGC memory still allocated after garbage collecting decreased from 189193k to 104110k, overall -81.72%
Amount of produced GGC garbage decreased from 606140k to 555529k, overall -9.11%
Amount of memory still referenced at the end of compilation decreased from 30664k to 25604k, overall -19.76%
Overall memory needed: 302720k -> 431276k
Peak memory use before GGC: 319046k -> 299263k
Peak memory use after GGC: 189193k -> 104110k
Maximum of released memory in single GGC run: 255677k -> 240232k
Garbage: 606140k -> 555529k
Leak: 30664k -> 25604k
Overhead: 97140k -> 92838k
GGC runs: 83 -> 87
comparing PR rtl-optimization/28071 testcase compilation at -O3 -fno-tree-pre -fno-tree-fre level:
Overall memory allocated via mmap and sbrk decreased from 1412084k to 1209172k, overall -16.78%
Peak amount of GGC memory allocated before garbage collecting run decreased from 280855k to 204073k, overall -37.62%
Peak amount of GGC memory still allocated after garbage collecting decreased from 273873k to 192415k, overall -42.33%
Amount of produced GGC garbage decreased from 446260k to 393834k, overall -13.31%
Amount of memory still referenced at the end of compilation decreased from 50427k to 45259k, overall -11.42%
Overall memory needed: 1412084k -> 1209172k
Peak memory use before GGC: 280855k -> 204073k
Peak memory use after GGC: 273873k -> 192415k
Maximum of released memory in single GGC run: 114236k -> 83543k
Garbage: 446260k -> 393834k
Leak: 50427k -> 45259k
Overhead: 55164k -> 53069k
GGC runs: 74 -> 72
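The "overall N%" figures in the summaries above can be cross-checked with a few lines of Python. This is a minimal sketch, not the testing script's own code: the function names are hypothetical, and the formula in `pct_as_reported` is only inferred from the numbers in this report (increases match a change relative to the old value, decreases match a change relative to the new value). `pct_relative_to_old` gives the conventional relative change for comparison.

```python
def pct_relative_to_old(old_k, new_k):
    """Conventional relative change, in percent, measured against the old value."""
    return (new_k - old_k) / old_k * 100.0


def pct_as_reported(old_k, new_k):
    """Assumed reconstruction of the report's percentage.

    Inferred from the figures in this message only: the smaller of the
    two values is used as the base, so decreases look larger than they
    would under the conventional formula.
    """
    base = min(old_k, new_k)
    return (new_k - old_k) / base * 100.0


if __name__ == "__main__":
    # PR28071 at -O1, peak memory use before GGC: 201002k -> 108753k
    print(round(pct_as_reported(201002, 108753), 2))
    print(round(pct_relative_to_old(201002, 108753), 2))
    # PR8361 at -O1, peak memory use before GGC: 98365k -> 102196k
    print(round(pct_as_reported(98365, 102196), 2))
```

Under the conventional formula the -O1 drop from 201002k to 108753k is roughly 46%, not 84.82%, so the decrease percentages are best read together with the raw "old -> new" values.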
Head of the ChangeLog is:
--- /usr/src/SpecTests/sandbox-britten-memory/x86_64/mem-result/ChangeLog 2007-01-27 16:10:51.000000000 +0000
+++ /usr/src/SpecTests/sandbox-britten-memory/gcc/gcc/ChangeLog 2007-01-28 07:53:49.000000000 +0000
@@ -1,3 +1,82 @@
+2007-01-27 Ian Lance Taylor <iant@google.com>
+
+ * common.opt: Add fstrict-overflow.
+ * opts.c (decode_options): Set flag_strict_overflow if -O2.
+ * flags.h (TYPE_OVERFLOW_WRAPS): Define.
+ (TYPE_OVERFLOW_UNDEFINED): Define.
+ (TYPE_OVERFLOW_TRAPS): Define. This replaces TYPE_TRAP_SIGNED.
+ Replace all uses.
+ * tree.h (TYPE_TRAP_SIGNED): Don't define.
+ * fold-const.c (negate_expr_p): Use TYPE_OVERFLOW_UNDEFINED.
+ (fold_negate_expr): Likewise.
+ (make_range): Likewise.
+ (extract_muldiv_1): Likewise.
+ (maybe_canonicalize_comparison): Likewise.
+ (fold_comparison): Likewise.
+ (fold_binary): Likewise.
+ (tree_expr_nonnegative_p): Likewise.
+ (tree_expr_nonzero_p): Likewise.
+ * tree-vrp.c (compare_values): Likewise.
+ (extract_range_from_binary_expr): Likewise.
+ (extract_range_from_unary_expr): Likewise.
+ * tree-ssa-loop-niter.c (infer_loop_bounds_from_signedness):
+ Likewise.
+ (nowrap_type_p): Likewise.
+ * tree-scalar-evolution.c (simple_iv): Likewise.
+ * fold-const.c (negate_expr_p): Use TYPE_OVERFLOW_WRAPS.
+ (build_range_check): Likewise.
+ (extract_muldiv_1): Likewise.
+ (fold_comparison): Likewise.
+ * tree-vrp.c (vrp_int_const_binop): Likewise.
+ (extract_range_from_unary_expr): Likewise.
+ * convert.c (convert_to_integer): Likewise.
+ * fold-const.c (fold_negate_expr): Use TYPE_OVERFLOW_TRAPS.
+ (fold_comparison): Likewise.
+ (fold_binary): Likewise.
+ * optabs.c (optab_for_tree_code): Likewise.
+ * tree-vectorizer.c (vect_is_simple_reduction): Likewise.
+ * simplify-rtx.c (simplify_const_relational_operation): Check
+ flag_strict_overflow and flag_trapv.
+ (simplify_const_relational_operation): Likewise.
+ * doc/invoke.texi (Option Summary): Mention -fstrict-overflow.
+ (Optimize Options): Add -fstrict-overflow to -O2 list. Document
+ -fstrict-overflow.
+
+2007-01-27 Roger Sayle <roger@eyesopen.com>
+
+ * tree.c (tree_fold_gcd): Delete.
+ * tree.h (tree_fold_gcd): Remove prototype.
+ * tree-data-ref.c (tree_fold_divides_p): Don't use tree_fold_gcd to
+ test whether one constant integer is a multiple of another. Instead
+ call int_const_binop with TRUNC_MOD_EXPR and test for a zero result.
+ * fold-const.c (multiple_of_p): We've determined both TOP and
+ BOTTOM are integer constants so we can call int_const_binop directly
+ instead of the more generic const_binop.
+
+2007-01-27 Roger Sayle <roger@eyesopen.com>
+
+ * fold-const.c (size_binop): In the fast-paths for X+0, 0+X, X-0 and
+ 1*X check that the constant hasn't overflowed, to preserve the
+ TREE_OVERFLOW bit.
+ (round_up): Provide an efficient implementation when rouding-up an
+ INTEGER_CST to a power-of-two.
+
+2007-01-28 Ralf Wildenhues <Ralf.Wildenhues@gmx.de>
+
+ * doc/sourcebuild.texi: Add comma for clarity.
+ * doc/extend.texi: Fix some typos.
+ * doc/passes.texi: Likewise.
+ * doc/cppinternals.texi: Likewise.
+ * doc/c-tree.texi: Likewise.
+ * doc/tree-ssa.texi: Likewise.
+ * doc/install.texi: Likewise.
+
+2007-01-27 Jan Hubicka <jh@suse.cz>
+
+ * tree-sra.c (sra_walk_function): Don't rely on aliases being build.
+ (pass_sra): Do not require alias information.
+ * passes.c (init_optimization_passes): Add SRA
+
2007-01-27 Steven Bosscher <steven@gcc.gnu.org>
* tracer.c (rest_of_handle_tracer): We already cleaned
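Two of the Roger Sayle entries in the ChangeLog excerpt above describe small arithmetic techniques: testing divisibility with a remainder instead of a GCD, and rounding up to a power of two. The following is a minimal Python sketch of those ideas on plain positive integers. It is not GCC's code; the function names are hypothetical, and GCC performs these operations on INTEGER_CST trees rather than Python ints.

```python
from math import gcd


def divides_via_gcd(a, b):
    """Older style of check: a divides b exactly when gcd(a, b) == a (a, b > 0)."""
    return gcd(a, b) == a


def divides_via_mod(a, b):
    """Cheaper check: a divides b exactly when the remainder of b / a is zero."""
    return b % a == 0


def round_up_pow2(value, divisor):
    """Round value up to a multiple of divisor, where divisor is a power of two.

    Adding divisor - 1 and then clearing the low bits avoids any division.
    """
    assert divisor > 0 and divisor & (divisor - 1) == 0, "divisor must be a power of two"
    return (value + divisor - 1) & -divisor


if __name__ == "__main__":
    print(divides_via_gcd(4, 12), divides_via_mod(4, 12))
    print(divides_via_gcd(5, 12), divides_via_mod(5, 12))
    print(round_up_pow2(13, 8), round_up_pow2(16, 8))
```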
The results can be reproduced by building a compiler with
--enable-gather-detailed-mem-stats targeting x86-64
and compiling preprocessed combine.c or the testcase from PR8361 with:
-fmem-report --param=ggc-min-heapsize=1024 --param=ggc-min-expand=1 -Ox -Q
The memory consumption summary appears in the dump after the detailed listing
of the places where memory is allocated. Peak memory consumption is
computed by looking for the maximal value in the {GC XXXX -> YYYY} reports.
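The peak computation described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the testing script itself: the `{GC XXXXk -> YYYYk}` format with a trailing `k` is assumed, the function and key names are hypothetical, and the mapping of the results onto the report's "Peak memory use before/after GGC", "Maximum of released memory in single GGC run" and "GGC runs" lines is an inference. The sample text is made up for the example.

```python
import re

# Assumed shape of one collection report in the compiler's -Q output.
GC_REPORT = re.compile(r"\{GC (\d+)k? -> (\d+)k?\}")


def summarize_gc(dump_text):
    """Scan a dump for {GC before -> after} reports and return the maxima.

    Returns None when the text contains no GC reports.
    """
    runs = [(int(before), int(after)) for before, after in GC_REPORT.findall(dump_text)]
    if not runs:
        return None
    return {
        "peak_before_k": max(before for before, _ in runs),
        "peak_after_k": max(after for _, after in runs),
        "max_released_k": max(before - after for before, after in runs),
        "ggc_runs": len(runs),
    }


if __name__ == "__main__":
    # Made-up sample, not taken from a real dump.
    sample = " {GC 2264k -> 1955k} {GC 2100k -> 1900k} {GC 2000k -> 1950k}"
    print(summarize_gc(sample))
```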
Your testing script.