A recent patch increased GCC's memory consumption!
gcctest@suse.de
Mon Oct 30 04:22:00 GMT 2006
Hi,
I am a friendly script caring about memory consumption in GCC. Please
contact jh@suse.cz if something is going wrong.
Comparing memory consumption on compilation of combine.i, insn-attrtab.i,
and generate-3.4.ii I got:
comparing combine.c compilation at -O0 level:
Overall memory needed: 28319k -> 28331k
Peak memory use before GGC: 9282k
Peak memory use after GGC: 8821k
Maximum of released memory in single GGC run: 2666k
Garbage: 36829k
Leak: 6441k
Overhead: 4856k
GGC runs: 280
comparing combine.c compilation at -O1 level:
Amount of produced GGC garbage increased from 57302k to 57448k, overall 0.25%
Overall memory needed: 39731k -> 40179k
Peak memory use before GGC: 17269k -> 17270k
Peak memory use after GGC: 17094k
Maximum of released memory in single GGC run: 2263k -> 2383k
Garbage: 57302k -> 57448k
Leak: 6482k -> 6482k
Overhead: 6165k -> 6196k
GGC runs: 354 -> 355
comparing combine.c compilation at -O2 level:
Overall memory needed: 29778k
Peak memory use before GGC: 17266k
Peak memory use after GGC: 17094k
Maximum of released memory in single GGC run: 2855k -> 2916k
Garbage: 77544k -> 76563k
Leak: 6582k -> 6578k
Overhead: 8943k -> 8786k
GGC runs: 421 -> 420
comparing combine.c compilation at -O3 level:
Overall memory needed: 28882k
Peak memory use before GGC: 18270k -> 18206k
Peak memory use after GGC: 17826k -> 17822k
Maximum of released memory in single GGC run: 4036k -> 4105k
Garbage: 108102k -> 106666k
Leak: 6660k -> 6653k
Overhead: 12567k -> 12394k
GGC runs: 470 -> 471
comparing insn-attrtab.c compilation at -O0 level:
Overall memory needed: 88222k
Peak memory use before GGC: 69766k
Peak memory use after GGC: 44176k
Maximum of released memory in single GGC run: 36964k
Garbage: 129071k
Leak: 9486k
Overhead: 16989k
GGC runs: 217
comparing insn-attrtab.c compilation at -O1 level:
Peak amount of GGC memory allocated before garbage collecting increased from 89834k to 90352k, overall 0.58%
Peak amount of GGC memory still allocated after garbage collecting increased from 83210k to 83714k, overall 0.61%
Amount of produced GGC garbage increased from 276300k to 277742k, overall 0.52%
Overall memory needed: 114138k -> 114146k
Peak memory use before GGC: 89834k -> 90352k
Peak memory use after GGC: 83210k -> 83714k
Maximum of released memory in single GGC run: 31806k
Garbage: 276300k -> 277742k
Leak: 9328k -> 9328k
Overhead: 29482k -> 29771k
GGC runs: 223
comparing insn-attrtab.c compilation at -O2 level:
Peak amount of GGC memory allocated before garbage collecting increased from 92111k to 92581k, overall 0.51%
Peak amount of GGC memory still allocated after garbage collecting increased from 84216k to 84694k, overall 0.57%
Amount of produced GGC garbage increased from 319269k to 320900k, overall 0.51%
Overall memory needed: 111658k -> 111178k
Peak memory use before GGC: 92111k -> 92581k
Peak memory use after GGC: 84216k -> 84694k
Maximum of released memory in single GGC run: 30368k -> 30383k
Garbage: 319269k -> 320900k
Leak: 9329k -> 9330k
Overhead: 36808k -> 37085k
GGC runs: 249 -> 250
comparing insn-attrtab.c compilation at -O3 level:
Peak amount of GGC memory allocated before garbage collecting increased from 92137k to 92607k, overall 0.51%
Peak amount of GGC memory still allocated after garbage collecting increased from 84242k to 84719k, overall 0.57%
Amount of produced GGC garbage increased from 319883k to 321528k, overall 0.51%
Overall memory needed: 111686k -> 111214k
Peak memory use before GGC: 92137k -> 92607k
Peak memory use after GGC: 84242k -> 84719k
Maximum of released memory in single GGC run: 30559k -> 30575k
Garbage: 319883k -> 321528k
Leak: 9332k -> 9333k
Overhead: 36998k -> 37278k
GGC runs: 252 -> 254
comparing Gerald's testcase PR8361 compilation at -O0 level:
Overall memory needed: 119490k
Peak memory use before GGC: 92635k
Peak memory use after GGC: 91717k
Maximum of released memory in single GGC run: 19299k
Garbage: 205556k
Leak: 47662k
Overhead: 20811k
GGC runs: 402
comparing Gerald's testcase PR8361 compilation at -O1 level:
Amount of produced GGC garbage increased from 440853k to 444259k, overall 0.77%
Overall memory needed: 119222k
Peak memory use before GGC: 97821k
Peak memory use after GGC: 95611k
Maximum of released memory in single GGC run: 18569k
Garbage: 440853k -> 444259k
Leak: 49994k -> 49995k
Overhead: 32124k -> 32811k
GGC runs: 550 -> 552
comparing Gerald's testcase PR8361 compilation at -O2 level:
Overall memory needed: 119178k -> 119218k
Peak memory use before GGC: 97820k
Peak memory use after GGC: 95611k
Maximum of released memory in single GGC run: 18569k
Garbage: 507740k -> 508058k
Leak: 50711k -> 50699k
Overhead: 40630k -> 40860k
GGC runs: 613 -> 612
comparing Gerald's testcase PR8361 compilation at -O3 level:
Amount of produced GGC garbage increased from 526587k to 527424k, overall 0.16%
Overall memory needed: 118882k
Peak memory use before GGC: 97868k
Peak memory use after GGC: 96898k
Maximum of released memory in single GGC run: 18831k
Garbage: 526587k -> 527424k
Leak: 50287k -> 50275k
Overhead: 41062k -> 41375k
GGC runs: 622 -> 626
comparing PR rtl-optimization/28071 testcase compilation at -O0 level:
Overall memory needed: 137934k
Peak memory use before GGC: 81886k
Peak memory use after GGC: 58766k
Maximum of released memory in single GGC run: 45494k
Garbage: 147250k
Leak: 7507k
Overhead: 25296k
GGC runs: 83
comparing PR rtl-optimization/28071 testcase compilation at -O1 level:
Peak amount of GGC memory allocated before garbage collecting increased from 203459k to 205249k, overall 0.88%
Peak amount of GGC memory still allocated after garbage collecting increased from 199235k to 201025k, overall 0.90%
Amount of produced GGC garbage increased from 268729k to 271709k, overall 1.11%
Overall memory needed: 426038k -> 424546k
Peak memory use before GGC: 203459k -> 205249k
Peak memory use after GGC: 199235k -> 201025k
Maximum of released memory in single GGC run: 100817k -> 101716k
Garbage: 268729k -> 271709k
Leak: 47572k -> 47573k
Overhead: 30229k -> 30825k
GGC runs: 101
comparing PR rtl-optimization/28071 testcase compilation at -O2 level:
Peak amount of GGC memory allocated before garbage collecting increased from 204210k to 206000k, overall 0.88%
Peak amount of GGC memory still allocated after garbage collecting increased from 199987k to 201776k, overall 0.89%
Overall memory needed: 349494k -> 352042k
Peak memory use before GGC: 204210k -> 206000k
Peak memory use after GGC: 199987k -> 201776k
Maximum of released memory in single GGC run: 107089k -> 108041k
Garbage: 358246k -> 350431k
Leak: 48156k -> 48156k
Overhead: 47830k -> 46270k
GGC runs: 108
comparing PR rtl-optimization/28071 testcase compilation at -O3 -fno-tree-pre -fno-tree-fre level:
Overall memory needed: 535282k -> 535306k
Peak memory use before GGC: 314907k
Peak memory use after GGC: 293250k
Maximum of released memory in single GGC run: 163448k
Garbage: 491201k
Leak: 65488k
Overhead: 59087k
GGC runs: 95
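The "increased from X to Y, overall Z%" lines in the report above are plain relative increases. The following is a minimal sketch of that arithmetic, not the tester's actual code; the function names are hypothetical, and the threshold the script uses to decide which increases to flag is not stated here, so it is not modelled.

```python
def percent_increase(before_k, after_k):
    """Relative increase from before_k to after_k, in percent."""
    return (after_k - before_k) / before_k * 100.0


def describe(label, before_k, after_k):
    """Format one line in the style of the report (hypothetical helper)."""
    return "%s increased from %dk to %dk, overall %.2f%%" % (
        label, before_k, after_k, percent_increase(before_k, after_k))


# Figures taken from the combine.c -O1 comparison in this report.
print(describe("Amount of produced GGC garbage", 57302, 57448))
```

For the combine.c -O1 garbage figures this gives (57448 - 57302) / 57302, which rounds to the 0.25% shown in the report.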
Head of the ChangeLog is:
--- /usr/src/SpecTests/sandbox-britten-memory/x86_64/mem-result/ChangeLog 2006-10-29 15:20:12.000000000 +0000
+++ /usr/src/SpecTests/sandbox-britten-memory/gcc/gcc/ChangeLog 2006-10-30 02:52:31.000000000 +0000
@@ -1,3 +1,188 @@
+2006-10-29 Daniel Berlin <dberlin@dberlin.org>
+
+ * tree.h (tree_value_handle): Remove struct value_set declaration.
+ Change value_set to bitmap_set.
+ * tree-pretty-print.c (dump_generic_node): Use has_stmt_ann.
+ * tree-vn.c (get_value_handle): Made inline and moved to
+ tree-flow-inline.h.
+ * tree-flow-inline.h: (has_stmt_ann): New function.
+ * tree-ssa-pre.c (expressions): New variable.
+ (next_expression_id): Ditto.
+ (alloc_expression_id): New function.
+ (struct value_set): Remove.
+ (get_expression_id): New function.
+ (get_or_alloc_expression_id): Ditto.
+ (expression_for_id): Ditto.
+ (clear_expression_ids): Ditto.
+ (FOR_EACH_EXPR_ID_IN_SET): New macro.
+ (bb_value_sets): Renamed to bb_bitmap_sets.
+ All value sets replaced with bitmap_sets.
+ Add visited member.
+ (BB_VISITED): New macro.
+ (postorder): New variable.
+ (add_to_value): Removed.
+ (value_exists_in_set_bitmap): Ditto.
+ (value_insert_into_set_bitmap): Ditto.
+ (set_new): Ditto.
+ (set_copy): Ditto.
+ (set_remove): Ditto.
+ (set_contains_value): Ditto.
+ (insert_into_set): Ditto.
+ (set_equal): Ditto.
+ (find_leader): Ditto.
+ (bitmap_set_subtract_from_value_set): Ditto.
+ (value_insert_into_set): Ditto.
+ (print_value_set): Ditto.
+ (debug_value_set): Ditto.
+ (constant_expr_p): New function.
+ (bitmap_remove_from_set): Ditto.
+ (bitmap_insert_into_set): Ditto.
+ (bitmap_set_free): Ditto.
+ (vh_compare): Ditto.
+ (sorted_array_from_bitmap_set): Ditto.
+ (bitmap_set_subtract): Ditto.
+ (bitmap_set_equal): Ditto.
+ (debug_bitmap_set): Ditto.
+ (find_leader_in_sets): Ditto.
+ (bitmap_set_replace_value): Modify for bitmapped sets.
+ (phi_translate): Ditto.
+ (phi_translate_set): Ditto.
+ (bitmap_find_leader): Ditto.
+ (valid_in_sets): Ditto.
+ (union_contains_value): Ditto.
+ (clean): Ditto.
+ (compute_antic_aux): Ditto. Mark changed blocks.
+ (compute_antic): Ditto. Iterate in postorder and only over
+ changing blocks.
+ (compute_rvuse_and_antic_safe): Reuse postorder.
+ (create_component_ref_by_pieces): Modify for bitmapped sets.
+ (find_or_generate_expression): Ditto.
+ (create_expression_by_pieces): Ditto.
+ (insert_into_preds_of_block): Ditto.
+ (changed_blocks): New variable.
+ (do_regular_insertion): Broken out from insert_aux.
+ (insert_aux): Modified for bitmapped sets.
+ (find_existing_value_expr): New function.
+ (create_value_expr_from): Use it.
+ (insert_extra_phis): Removed.
+ (print_bitmap_set): Renamed from bitmap_print_value_set.
+ (compute_avail): Handle RETURN_EXPR.
+ (init_pre): Modify for bitmapped sets.
+ * tree-flow.h (has_stmt_ann): New function.
+
+2006-10-29 Roger Sayle <roger@eyesopen.com>
+
+ * builtins.c (fold_builtin_floor): Check for the availability of
+ the C99 trunc function before transforming floor into trunc.
+
+2006-10-29 Kaveh R. Ghazi <ghazi@caip.rutgers.edu>
+
+ * builtins.c (fold_builtin_hypot): Rearrange recursive
+ transformation before others, and also do ABS_EXPR. When
+ necessary, check flag_unsafe_math_optimizations. When necessary,
+ add fabs.
+
+2006-10-29 Roger Sayle <roger@eyesopen.com>
+
+ * fold-const.c (fold_comparison): Fold ~X op ~Y as Y op X.
+ Fold ~X op C as X op' ~C, where op' is the swapped comparison.
+ (fold_binary): ~X eq/ne C is now handled in fold_comparison.
+ Fold -X eq/ne -Y as X eq/ne Y.
+
+2006-10-29 Richard Sandiford <richard@codesourcery.com>
+
+ * config/mips/mips.md (mul<mode>3): Check ISA_HAS_MUL3 rather than
+ GENERATE_MULT3_<MODE>. Restrict the test to SImode. Use ISA_HAS_MUL3
+ rather than GENERATE_MULT3_SI in the various define_peephole2s.
+ (mulsi3_mult3): Depend on ISA_HAS_MUL3 rather than GENERATE_MULT3_SI.
+ Use an inclusive test for "mult" rather than "mul".
+ (rotr<mode>3): Depend on ISA_HAS_ROR.
+ * config/mips/mips.h (GENERATE_MULT3_SI): Delete in favor of
+ ISA_HAS_MUL3.
+ (GENERATE_MULT3_DI): Delete.
+ (ISA_HAS_64BIT_REGS): Use consistent formatting.
+ (ISA_HAS_MUL3): New macro.
+ (ISA_HAS_CONDMOVE, ISA_HAS_8CC): Use consistent formatting.
+ (ISA_HAS_FP4, ISA_HAS_MADD_MSUB, ISA_HAS_NMADD_NMSUB): Likewise.
+ (ISA_HAS_CLZ_CLO): Likewise.
+ (ISA_HAS_DCLZ_DCLO): Delete.
+ (ISA_HAS_MULHI, ISA_HAS_MULS, ISA_HAS_MSAC): Require !TARGET_MIPS16.
+ (ISA_HAS_MACC): Require !TARGET_MIPS16 for all ISAs, not just
+ the VR4120 and VR4130.
+ (ISA_HAS_MACCHI): Use consistent formatting.
+ (ISA_HAS_ROTR_SI, ISA_HAS_ROTR_DI): Delete in favor of...
+ (ISA_HAS_ROR): ...this new macro.
+ (ISA_HAS_PREFETCH, ISA_HAS_PREFETCHX): Use consistent formatting.
+ (ISA_HAS_SEB_SEH, ISA_HAS_EXT_INS): Likewise.
+ (ISA_HAS_LOAD_DELAY): Use ISA_MIPS1.
+
+2006-10-29 Roger Sayle <roger@eyesopen.com>
+
+ PR tree-optimization/15458
+ * fold-const.c (fold_binary): Optimize ~X ^ C as X ^ ~C, where C
+ is a constant.
+
+2006-10-29 Richard Guenther <rguenther@suse.de>
+
+ * config/i386/i386-protos.h (ix86_expand_trunc): Declare.
+ (ix86_expand_truncdf_32): Likewise.
+ * config/i386/i386.c (ix86_expand_trunc): New function expanding
+ trunc inline for SSE math and -fno-trapping-math and if not
+ optimizing for size.
+ (ix86_expand_truncdf_32): Same for DFmode on 32bit archs.
+ * config/i386/i386.md (btruncsf2, btruncdf2): Adjust expanders
+ for expanding btrunc inline for SSE math.
+
+2006-10-29 Joseph Myers <joseph@codesourcery.com>
+
+ * config.gcc (i[34567]86-*-linux*): Handle --enable-targets=all.
+ Handle tuning for bi-arch i[34567]86-*-linux* like that for
+ i[34567]86-*-solaris2.1[0-9]*.
+ * config/i386/linux64.h (TARGET_VERSION, MULTILIB_DEFAULTS):
+ Define conditionally depending on TARGET_64BIT_DEFAULT.
+ (SPEC_32, SPEC_64): Define.
+ (LINK_SPEC): Use them.
+ * doc/install.texi (--enable-targets=all): Document for x86-linux.
+
+2006-10-29 Richard Guenther <rguenther@suse.de>
+
+ * config/i386/i386-protos.h (ix86_expand_round): Declare.
+ (ix86_expand_rounddf_32): Likewise.
+ * config/i386/i386.c (ix86_expand_round): New function expanding
+ round inline for SSE math and -fno-trapping-math and if not
+ optimizing for size.
+ (ix86_expand_rounddf_32): Same for DFmode on 32bit archs.
+ * config/i386/i386.md (rounddf2, roundsf2): New pattern expanding
+ round via ix86_expand_round.
+
+2006-10-29 Richard Guenther <rguenther@suse.de>
+
+ * config/i386/i386-protos.h (ix86_expand_floorceil): Declare.
+ (ix86_expand_floorceildf_32): Likewise.
+ * config/i386/i386.c (ix86_expand_sse_compare_mask): New
+ static helper function.
+ (ix86_expand_floorceil): Expander for floor and ceil to SSE
+ math.
+ (ix86_expand_floorceildf_32): Same for DFmode on 32bit archs.
+ * config/i386/i386.md (floordf2): Adjust to enable floor
+ expansion via ix86_expand_floorceil if TARGET_SSE_MATH and
+ -fno-trapping-math is enabled and if not optimizing for size.
+ (floorsf2, ceildf2, ceilsf2): Likewise.
+ * config/i386/sse.md (sse_maskcmpsf3): New insn.
+ (sse2_maskcmpdf3): Likewise.
+
+2006-10-29 Richard Guenther <rguenther@suse.de>
+
+ * builtins.c (expand_builtin_mathfn): Expand nearbyint as
+ rint in case -fno-trapping-math is enabled.
+ * config/i386/i386-protos.h (ix86_expand_rint): Declare.
+ * config/i386/i386.c (ix86_gen_TWO52): New static helper function.
+ (ix86_expand_sse_fabs): Likewise.
+ (ix86_expand_rint): New function expanding rint to x87 or SSE math.
+ * config/i386/i386.md (rintdf2): Enable for SSE math if
+ -fno-trapping-math is enabled, use ix86_expand_rint for expansion.
+ (rintsf2): Likewise.
+
2006-10-29 Richard Guenther <rguenther@suse.de>
* genopinit.c (optabs): Change lfloor_optab and lceil_optab
--- /usr/src/SpecTests/sandbox-britten-memory/x86_64/mem-result/ChangeLog.cp 2006-10-29 15:20:12.000000000 +0000
+++ /usr/src/SpecTests/sandbox-britten-memory/gcc/gcc/cp/ChangeLog 2006-10-30 02:52:29.000000000 +0000
@@ -1,3 +1,15 @@
+2006-10-29 Dirk Mueller <dmueller@suse.de>
+
+ PR c++/29089
+ * typeck.c (build_unary_op): Duplicate warning message
+ for easier translation.
+
+2006-10-29 Dirk Mueller <dmueller@suse.de>
+
+ PR c++/16307
+ * typeck.c (build_array_ref): Warn for char subscriptions
+ on pointers.
+
2006-10-29 Kazu Hirata <kazu@codesourcery.com>
* decl.c: Fix a comment typo.
The results can be reproduced by building a compiler with
--enable-gather-detailed-mem-stats targeting x86-64
and compiling preprocessed combine.c or the testcase from PR8361 with:
-fmem-report --param=ggc-min-heapsize=1024 --param=ggc-min-expand=1 -Ox -Q
The memory consumption summary appears in the dump after the detailed listing
of the places where the memory is allocated. Peak memory consumption is
computed by looking for the maximal value in the {GC XXXX -> YYYY} reports.
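The peak computation described above can be sketched as a scan over the compiler's dump for {GC XXXX -> YYYY} entries. This is only an illustration under stated assumptions, not the tester's actual script: it assumes each entry has the form "{GC 1234k -> 567k}", the function name and dictionary keys are hypothetical, and the sample text is invented for the example rather than taken from a real dump.

```python
import re

# Assumed entry format: "{GC <before>k -> <after>k}".
GC_RE = re.compile(r"\{GC (\d+)k -> (\d+)k\}")


def summarize_gc(dump_text):
    """Summarize GC runs found in dump_text (hypothetical helper).

    Returns None when no {GC ...} entry is present.
    """
    runs = [(int(before), int(after))
            for before, after in GC_RE.findall(dump_text)]
    if not runs:
        return None
    return {
        "peak_before_k": max(before for before, _ in runs),
        "peak_after_k": max(after for _, after in runs),
        "max_released_k": max(before - after for before, after in runs),
        "runs": len(runs),
    }


# Invented sample, for illustration only.
sample = " {GC 9282k -> 8821k} {GC 9100k -> 6434k} {GC 7000k -> 6900k}"
print(summarize_gc(sample))
```

The maxima over XXXX, over YYYY, and over their difference plausibly correspond to the "Peak memory use before GGC", "Peak memory use after GGC" and "Maximum of released memory in single GGC run" lines of the report, though that mapping is an assumption.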
Your testing script.