A recent patch increased GCC's memory consumption!

gcctest@suse.de
Mon Oct 30 04:22:00 GMT 2006


Hi,

I am a friendly script that monitors memory consumption in GCC.  Please
contact jh@suse.cz if something goes wrong.

Comparing memory consumption on compilation of combine.i, insn-attrtab.i,
generate-3.4.ii, and the PR rtl-optimization/28071 testcase, I got:


comparing combine.c compilation at -O0 level:
    Overall memory needed: 28319k -> 28331k
    Peak memory use before GGC: 9282k
    Peak memory use after GGC: 8821k
    Maximum of released memory in single GGC run: 2666k
    Garbage: 36829k
    Leak: 6441k
    Overhead: 4856k
    GGC runs: 280

comparing combine.c compilation at -O1 level:
  Amount of produced GGC garbage increased from 57302k to 57448k, overall 0.25%
    Overall memory needed: 39731k -> 40179k
    Peak memory use before GGC: 17269k -> 17270k
    Peak memory use after GGC: 17094k
    Maximum of released memory in single GGC run: 2263k -> 2383k
    Garbage: 57302k -> 57448k
    Leak: 6482k -> 6482k
    Overhead: 6165k -> 6196k
    GGC runs: 354 -> 355

comparing combine.c compilation at -O2 level:
    Overall memory needed: 29778k
    Peak memory use before GGC: 17266k
    Peak memory use after GGC: 17094k
    Maximum of released memory in single GGC run: 2855k -> 2916k
    Garbage: 77544k -> 76563k
    Leak: 6582k -> 6578k
    Overhead: 8943k -> 8786k
    GGC runs: 421 -> 420

comparing combine.c compilation at -O3 level:
    Overall memory needed: 28882k
    Peak memory use before GGC: 18270k -> 18206k
    Peak memory use after GGC: 17826k -> 17822k
    Maximum of released memory in single GGC run: 4036k -> 4105k
    Garbage: 108102k -> 106666k
    Leak: 6660k -> 6653k
    Overhead: 12567k -> 12394k
    GGC runs: 470 -> 471

comparing insn-attrtab.c compilation at -O0 level:
    Overall memory needed: 88222k
    Peak memory use before GGC: 69766k
    Peak memory use after GGC: 44176k
    Maximum of released memory in single GGC run: 36964k
    Garbage: 129071k
    Leak: 9486k
    Overhead: 16989k
    GGC runs: 217

comparing insn-attrtab.c compilation at -O1 level:
  Peak amount of GGC memory allocated before garbage collecting increased from 89834k to 90352k, overall 0.58%
  Peak amount of GGC memory still allocated after garbage collecting increased from 83210k to 83714k, overall 0.61%
  Amount of produced GGC garbage increased from 276300k to 277742k, overall 0.52%
    Overall memory needed: 114138k -> 114146k
    Peak memory use before GGC: 89834k -> 90352k
    Peak memory use after GGC: 83210k -> 83714k
    Maximum of released memory in single GGC run: 31806k
    Garbage: 276300k -> 277742k
    Leak: 9328k -> 9328k
    Overhead: 29482k -> 29771k
    GGC runs: 223

comparing insn-attrtab.c compilation at -O2 level:
  Peak amount of GGC memory allocated before garbage collecting increased from 92111k to 92581k, overall 0.51%
  Peak amount of GGC memory still allocated after garbage collecting increased from 84216k to 84694k, overall 0.57%
  Amount of produced GGC garbage increased from 319269k to 320900k, overall 0.51%
    Overall memory needed: 111658k -> 111178k
    Peak memory use before GGC: 92111k -> 92581k
    Peak memory use after GGC: 84216k -> 84694k
    Maximum of released memory in single GGC run: 30368k -> 30383k
    Garbage: 319269k -> 320900k
    Leak: 9329k -> 9330k
    Overhead: 36808k -> 37085k
    GGC runs: 249 -> 250

comparing insn-attrtab.c compilation at -O3 level:
  Peak amount of GGC memory allocated before garbage collecting increased from 92137k to 92607k, overall 0.51%
  Peak amount of GGC memory still allocated after garbage collecting increased from 84242k to 84719k, overall 0.57%
  Amount of produced GGC garbage increased from 319883k to 321528k, overall 0.51%
    Overall memory needed: 111686k -> 111214k
    Peak memory use before GGC: 92137k -> 92607k
    Peak memory use after GGC: 84242k -> 84719k
    Maximum of released memory in single GGC run: 30559k -> 30575k
    Garbage: 319883k -> 321528k
    Leak: 9332k -> 9333k
    Overhead: 36998k -> 37278k
    GGC runs: 252 -> 254

comparing Gerald's testcase PR8361 compilation at -O0 level:
    Overall memory needed: 119490k
    Peak memory use before GGC: 92635k
    Peak memory use after GGC: 91717k
    Maximum of released memory in single GGC run: 19299k
    Garbage: 205556k
    Leak: 47662k
    Overhead: 20811k
    GGC runs: 402

comparing Gerald's testcase PR8361 compilation at -O1 level:
  Amount of produced GGC garbage increased from 440853k to 444259k, overall 0.77%
    Overall memory needed: 119222k
    Peak memory use before GGC: 97821k
    Peak memory use after GGC: 95611k
    Maximum of released memory in single GGC run: 18569k
    Garbage: 440853k -> 444259k
    Leak: 49994k -> 49995k
    Overhead: 32124k -> 32811k
    GGC runs: 550 -> 552

comparing Gerald's testcase PR8361 compilation at -O2 level:
    Overall memory needed: 119178k -> 119218k
    Peak memory use before GGC: 97820k
    Peak memory use after GGC: 95611k
    Maximum of released memory in single GGC run: 18569k
    Garbage: 507740k -> 508058k
    Leak: 50711k -> 50699k
    Overhead: 40630k -> 40860k
    GGC runs: 613 -> 612

comparing Gerald's testcase PR8361 compilation at -O3 level:
  Amount of produced GGC garbage increased from 526587k to 527424k, overall 0.16%
    Overall memory needed: 118882k
    Peak memory use before GGC: 97868k
    Peak memory use after GGC: 96898k
    Maximum of released memory in single GGC run: 18831k
    Garbage: 526587k -> 527424k
    Leak: 50287k -> 50275k
    Overhead: 41062k -> 41375k
    GGC runs: 622 -> 626

comparing PR rtl-optimization/28071 testcase compilation at -O0 level:
    Overall memory needed: 137934k
    Peak memory use before GGC: 81886k
    Peak memory use after GGC: 58766k
    Maximum of released memory in single GGC run: 45494k
    Garbage: 147250k
    Leak: 7507k
    Overhead: 25296k
    GGC runs: 83

comparing PR rtl-optimization/28071 testcase compilation at -O1 level:
  Peak amount of GGC memory allocated before garbage collecting increased from 203459k to 205249k, overall 0.88%
  Peak amount of GGC memory still allocated after garbage collecting increased from 199235k to 201025k, overall 0.90%
  Amount of produced GGC garbage increased from 268729k to 271709k, overall 1.11%
    Overall memory needed: 426038k -> 424546k
    Peak memory use before GGC: 203459k -> 205249k
    Peak memory use after GGC: 199235k -> 201025k
    Maximum of released memory in single GGC run: 100817k -> 101716k
    Garbage: 268729k -> 271709k
    Leak: 47572k -> 47573k
    Overhead: 30229k -> 30825k
    GGC runs: 101

comparing PR rtl-optimization/28071 testcase compilation at -O2 level:
  Peak amount of GGC memory allocated before garbage collecting increased from 204210k to 206000k, overall 0.88%
  Peak amount of GGC memory still allocated after garbage collecting increased from 199987k to 201776k, overall 0.89%
    Overall memory needed: 349494k -> 352042k
    Peak memory use before GGC: 204210k -> 206000k
    Peak memory use after GGC: 199987k -> 201776k
    Maximum of released memory in single GGC run: 107089k -> 108041k
    Garbage: 358246k -> 350431k
    Leak: 48156k -> 48156k
    Overhead: 47830k -> 46270k
    GGC runs: 108

comparing PR rtl-optimization/28071 testcase compilation at -O3 -fno-tree-pre -fno-tree-fre level:
    Overall memory needed: 535282k -> 535306k
    Peak memory use before GGC: 314907k
    Peak memory use after GGC: 293250k
    Maximum of released memory in single GGC run: 163448k
    Garbage: 491201k
    Leak: 65488k
    Overhead: 59087k
    GGC runs: 95
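
The "overall N%" figures in the flagged lines above look like the relative
increase of the new value over the old one, rounded to two decimals.  A
minimal sketch of that arithmetic in Python, assuming this is how the script
computes them (the function name is made up for illustration):

```python
def relative_increase_percent(old_k, new_k):
    """Relative growth from old_k to new_k, expressed in percent."""
    return (new_k - old_k) * 100.0 / old_k


# Values taken from the combine.c -O1 garbage line in this report.
print("%.2f%%" % relative_increase_percent(57302, 57448))  # prints 0.25%
```

The same formula gives 1.11% for the PR rtl-optimization/28071 -O1 garbage
figures (268729k -> 271709k), matching the report.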

Head of the ChangeLog is:

--- /usr/src/SpecTests/sandbox-britten-memory/x86_64/mem-result/ChangeLog	2006-10-29 15:20:12.000000000 +0000
+++ /usr/src/SpecTests/sandbox-britten-memory/gcc/gcc/ChangeLog	2006-10-30 02:52:31.000000000 +0000
@@ -1,3 +1,188 @@
+2006-10-29  Daniel Berlin  <dberlin@dberlin.org>
+
+	* tree.h (tree_value_handle): Remove struct value_set declaration.	
+	Change value_set to bitmap_set.
+	* tree-pretty-print.c (dump_generic_node): Use has_stmt_ann.
+	* tree-vn.c (get_value_handle): Made inline and moved to
+	tree-flow-inline.h.
+	* tree-flow-inline.h: (has_stmt_ann): New function.
+	* tree-ssa-pre.c (expressions): New variable.
+	(next_expression_id): Ditto.
+	(alloc_expression_id): New function.
+	(struct value_set): Remove.
+	(get_expression_id): New function.
+	(get_or_alloc_expression_id): Ditto.
+	(expression_for_id): Ditto.
+	(clear_expression_ids): Ditto.
+	(FOR_EACH_EXPR_ID_IN_SET): New macro.
+	(bb_value_sets): Renamed to bb_bitmap_sets.
+	All value sets replaced with bitmap_sets.
+	Add visited member.
+	(BB_VISITED): New macro.
+	(postorder): New variable.
+	(add_to_value): Removed.
+	(value_exists_in_set_bitmap): Ditto.
+	(value_insert_into_set_bitmap): Ditto.
+	(set_new): Ditto.
+	(set_copy): Ditto.
+	(set_remove): Ditto.
+	(set_contains_value): Ditto.
+	(insert_into_set): Ditto.
+	(set_equal): Ditto.
+	(find_leader): Ditto.
+	(bitmap_set_subtract_from_value_set): Ditto.
+	(value_insert_into_set): Ditto.
+	(print_value_set): Ditto.
+	(debug_value_set): Ditto.
+	(constant_expr_p): New function.
+	(bitmap_remove_from_set): Ditto.
+	(bitmap_insert_into_set): Ditto.
+	(bitmap_set_free): Ditto.
+	(vh_compare): Ditto.
+	(sorted_array_from_bitmap_set): Ditto.
+	(bitmap_set_subtract): Ditto.
+	(bitmap_set_equal): Ditto.
+	(debug_bitmap_set): Ditto.
+	(find_leader_in_sets): Ditto.
+	(bitmap_set_replace_value): Modify for bitmapped sets.
+	(phi_translate): Ditto.
+	(phi_translate_set): Ditto.
+	(bitmap_find_leader): Ditto.
+	(valid_in_sets): Ditto.
+	(union_contains_value): Ditto.
+	(clean): Ditto.
+	(compute_antic_aux): Ditto.  Mark changed blocks.
+	(compute_antic): Ditto. Iterate in postorder and only over
+	changing blocks.
+	(compute_rvuse_and_antic_safe): Reuse postorder.
+	(create_component_ref_by_pieces): Modify for bitmapped sets.
+	(find_or_generate_expression): Ditto.
+	(create_expression_by_pieces): Ditto.
+	(insert_into_preds_of_block): Ditto.
+	(changed_blocks): New variable.
+	(do_regular_insertion): Broken out from insert_aux.
+	(insert_aux): Modified for bitmapped sets.
+	(find_existing_value_expr): New function.
+	(create_value_expr_from): Use it.
+	(insert_extra_phis): Removed.
+	(print_bitmap_set): Renamed from bitmap_print_value_set.
+	(compute_avail): Handle RETURN_EXPR.
+	(init_pre): Modify for bitmapped sets.
+	* tree-flow.h (has_stmt_ann): New function.
+	
+2006-10-29  Roger Sayle  <roger@eyesopen.com>
+
+	* builtins.c (fold_builtin_floor): Check for the availability of
+	the C99 trunc function before transforming floor into trunc.
+
+2006-10-29  Kaveh R. Ghazi  <ghazi@caip.rutgers.edu>
+
+	* builtins.c (fold_builtin_hypot): Rearrange recursive
+	transformation before others, and also do ABS_EXPR.  When
+	necessary, check flag_unsafe_math_optimizations.  When necessary,
+	add fabs.
+
+2006-10-29  Roger Sayle  <roger@eyesopen.com>
+
+	* fold-const.c (fold_comparison): Fold ~X op ~Y as Y op X.
+	Fold ~X op C as X op' ~C, where op' is the swapped comparison.
+	(fold_binary): ~X eq/ne C is now handled in fold_comparison.
+	Fold -X eq/ne -Y as X eq/ne Y.
+
+2006-10-29  Richard Sandiford  <richard@codesourcery.com>
+
+	* config/mips/mips.md (mul<mode>3): Check ISA_HAS_MUL3 rather than
+	GENERATE_MULT3_<MODE>.  Restrict the test to SImode.  Use ISA_HAS_MUL3
+	rather than GENERATE_MULT3_SI in the various define_peephole2s.
+	(mulsi3_mult3): Depend on ISA_HAS_MUL3 rather than GENERATE_MULT3_SI.
+	Use an inclusive test for "mult" rather than "mul".
+	(rotr<mode>3): Depend on ISA_HAS_ROR.
+	* config/mips/mips.h (GENERATE_MULT3_SI): Delete in favor of
+	ISA_HAS_MUL3.
+	(GENERATE_MULT3_DI): Delete.
+	(ISA_HAS_64BIT_REGS): Use consistent formatting.
+	(ISA_HAS_MUL3): New macro.
+	(ISA_HAS_CONDMOVE, ISA_HAS_8CC): Use consistent formatting.
+	(ISA_HAS_FP4, ISA_HAS_MADD_MSUB, ISA_HAS_NMADD_NMSUB): Likewise.
+	(ISA_HAS_CLZ_CLO): Likewise.
+	(ISA_HAS_DCLZ_DCLO): Delete.
+	(ISA_HAS_MULHI, ISA_HAS_MULS, ISA_HAS_MSAC): Require !TARGET_MIPS16.
+	(ISA_HAS_MACC): Require !TARGET_MIPS16 for all ISAs, not just
+	the VR4120 and VR4130.
+	(ISA_HAS_MACCHI): Use consistent formatting.
+	(ISA_HAS_ROTR_SI, ISA_HAS_ROTR_DI): Delete in favor of...
+	(ISA_HAS_ROR): ...this new macro.
+	(ISA_HAS_PREFETCH, ISA_HAS_PREFETCHX): Use consistent formatting.
+	(ISA_HAS_SEB_SEH, ISA_HAS_EXT_INS): Likewise.
+	(ISA_HAS_LOAD_DELAY): Use ISA_MIPS1.
+
+2006-10-29  Roger Sayle  <roger@eyesopen.com>
+
+	PR tree-optimization/15458
+	* fold-const.c (fold_binary): Optimize ~X ^ C as X ^ ~C, where C
+	is a constant.
+
+2006-10-29  Richard Guenther  <rguenther@suse.de>
+
+	* config/i386/i386-protos.h (ix86_expand_trunc): Declare.
+	(ix86_expand_truncdf_32): Likewise.
+	* config/i386/i386.c (ix86_expand_trunc): New function expanding
+	trunc inline for SSE math and -fno-trapping-math and if not
+	optimizing for size.
+	(ix86_expand_truncdf_32): Same for DFmode on 32bit archs.
+	* config/i386/i386.md (btruncsf2, btruncdf2): Adjust expanders
+	for expanding btrunc inline for SSE math.
+
+2006-10-29  Joseph Myers  <joseph@codesourcery.com>
+
+	* config.gcc (i[34567]86-*-linux*): Handle --enable-targets=all.
+	Handle tuning for bi-arch i[34567]86-*-linux* like that for
+	i[34567]86-*-solaris2.1[0-9]*.
+	* config/i386/linux64.h (TARGET_VERSION, MULTILIB_DEFAULTS):
+	Define conditionally depending on TARGET_64BIT_DEFAULT.
+	(SPEC_32, SPEC_64): Define.
+	(LINK_SPEC): Use them.
+	* doc/install.texi (--enable-targets=all): Document for x86-linux.
+
+2006-10-29  Richard Guenther  <rguenther@suse.de>
+
+	* config/i386/i386-protos.h (ix86_expand_round): Declare.
+	(ix86_expand_rounddf_32): Likewise.
+	* config/i386/i386.c (ix86_expand_round): New function expanding
+	round inline for SSE math and -fno-trapping-math and if not
+	optimizing for size.
+	(ix86_expand_rounddf_32): Same for DFmode on 32bit archs.
+	* config/i386/i386.md (rounddf2, roundsf2): New pattern expanding
+	round via ix86_expand_round.
+
+2006-10-29  Richard Guenther  <rguenther@suse.de>
+
+	* config/i386/i386-protos.h (ix86_expand_floorceil): Declare.
+	(ix86_expand_floorceildf_32): Likewise.
+	* config/i386/i386.c (ix86_expand_sse_compare_mask): New
+	static helper function.
+	(ix86_expand_floorceil): Expander for floor and ceil to SSE
+	math.
+	(ix86_expand_floorceildf_32): Same for DFmode on 32bit archs.
+	* config/i386/i386.md (floordf2): Adjust to enable floor
+	expansion via ix86_expand_floorceil if TARGET_SSE_MATH and
+	-fno-trapping-math is enabled and if not optimizing for size.
+	(floorsf2, ceildf2, ceilsf2): Likewise.
+	* config/i386/sse.md (sse_maskcmpsf3): New insn.
+	(sse2_maskcmpdf3): Likewise.
+
+2006-10-29  Richard Guenther  <rguenther@suse.de>
+
+	* builtins.c (expand_builtin_mathfn): Expand nearbyint as
+	rint in case -fno-trapping-math is enabled.
+	* config/i386/i386-protos.h (ix86_expand_rint): Declare.
+	* config/i386/i386.c (ix86_gen_TWO52): New static helper function.
+	(ix86_expand_sse_fabs): Likewise.
+	(ix86_expand_rint): New function expanding rint to x87 or SSE math.
+	* config/i386/i386.md (rintdf2): Enable for SSE math if
+	-fno-trapping-math is enabled, use ix86_expand_rint for expansion.
+	(rintsf2): Likewise.
+
 2006-10-29  Richard Guenther  <rguenther@suse.de>
 
 	* genopinit.c (optabs): Change lfloor_optab and lceil_optab
--- /usr/src/SpecTests/sandbox-britten-memory/x86_64/mem-result/ChangeLog.cp	2006-10-29 15:20:12.000000000 +0000
+++ /usr/src/SpecTests/sandbox-britten-memory/gcc/gcc/cp/ChangeLog	2006-10-30 02:52:29.000000000 +0000
@@ -1,3 +1,15 @@
+2006-10-29  Dirk Mueller  <dmueller@suse.de>
+
+	PR c++/29089
+	* typeck.c (build_unary_op): Duplicate warning message
+	for easier translation.
+
+2006-10-29  Dirk Mueller  <dmueller@suse.de>
+
+	PR c++/16307
+	* typeck.c (build_array_ref): Warn for char subscriptions
+	on pointers.
+
 2006-10-29  Kazu Hirata  <kazu@codesourcery.com>
 
 	* decl.c: Fix a comment typo.


The results can be reproduced by building a compiler with

--enable-gather-detailed-mem-stats targeting x86-64

and compiling preprocessed combine.c or the testcase from PR8361 with:

-fmem-report --param=ggc-min-heapsize=1024 --param=ggc-min-expand=1 -Ox -Q
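
As a rough illustration, the flags above could be assembled into an argument
list as follows.  The compiler path and input file name are hypothetical, and
-Ox is assumed to stand for the optimization level under test (0 to 3 in this
report):

```python
def mem_report_args(compiler, source, opt_level, extra_flags=()):
    """Build an argument list using the flags quoted in this report.

    compiler and source are placeholders chosen by the caller; extra_flags
    covers runs such as the -O3 -fno-tree-pre -fno-tree-fre one above.
    """
    return [
        compiler,
        "-fmem-report",
        "--param=ggc-min-heapsize=1024",
        "--param=ggc-min-expand=1",
        "-O%d" % opt_level,
        *extra_flags,
        "-Q",
        source,
    ]


# Hypothetical invocation of a freshly built compiler on preprocessed input.
print(" ".join(mem_report_args("./cc1", "combine.i", 2)))
```

The resulting list can be handed to subprocess.run to capture the dump for
later inspection.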

The memory consumption summary appears in the dump after the detailed listing
of the places where memory is allocated.  Peak memory consumption is
computed by looking for the maximal value in the {GC XXXX -> YYYY} reports.
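
A minimal sketch of that peak computation in Python, assuming each collection
is reported in the dump as "{GC <before>k -> <after>k}".  The function name,
the dictionary keys, and the sample text are made up for illustration, and
taking "before minus after" as the memory released by a run is also an
assumption:

```python
import re

# One match per garbage collection report, e.g. "{GC 1500k -> 700k}".
GC_REPORT = re.compile(r"\{GC (\d+)k -> (\d+)k\}")


def summarize_gc_reports(dump_text):
    """Return peak and per-run figures (in k) from GC reports, or None."""
    runs = [(int(before), int(after))
            for before, after in GC_REPORT.findall(dump_text)]
    if not runs:
        return None
    return {
        "peak_before_k": max(before for before, _ in runs),
        "peak_after_k": max(after for _, after in runs),
        "max_released_k": max(before - after for before, after in runs),
        "runs": len(runs),
    }


# Invented sample, not taken from a real dump.
sample = "parse {GC 1200k -> 900k} expand {GC 1500k -> 700k}"
print(summarize_gc_reports(sample))
```

For the invented sample this yields a peak of 1500k before collection, 900k
after collection, 800k released in a single run, and 2 runs.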

Your testing script.


