This is the mail archive of the gcc-regression@gcc.gnu.org mailing list for the GCC project.



A recent patch increased GCC's memory consumption!


Hi,

I am a friendly script caring about memory consumption in GCC.  Please
contact jh@suse.cz if something is going wrong.

Comparing memory consumption on compilation of combine.i, insn-attrtab.i,
and generate-3.4.ii I got:


comparing combine.c compilation at -O0 level:
    Overall memory needed: 25336k -> 25340k
    Peak memory use before GGC: 9567k
    Peak memory use after GGC: 8914k
    Maximum of released memory in single GGC run: 2649k -> 2648k
    Garbage: 40069k -> 40060k
    Leak: 6741k -> 6741k
    Overhead: 5738k -> 5738k
    GGC runs: 313

comparing combine.c compilation at -O1 level:
    Overall memory needed: 26900k
    Peak memory use before GGC: 17435k
    Peak memory use after GGC: 17256k
    Maximum of released memory in single GGC run: 2309k
    Garbage: 62592k -> 62593k
    Leak: 6880k -> 6880k
    Overhead: 7485k -> 7485k
    GGC runs: 393

comparing combine.c compilation at -O2 level:
  Amount of memory still referenced at the end of compilation increased from 6966k to 6974k, overall 0.12%
    Overall memory needed: 26900k
    Peak memory use before GGC: 17438k
    Peak memory use after GGC: 17256k
    Maximum of released memory in single GGC run: 3628k -> 3620k
    Garbage: 87439k -> 87438k
    Leak: 6966k -> 6974k
    Overhead: 10540k -> 10540k
    GGC runs: 466 -> 465

comparing combine.c compilation at -O3 level:
    Overall memory needed: 26900k
    Peak memory use before GGC: 19231k
    Peak memory use after GGC: 18481k -> 18482k
    Maximum of released memory in single GGC run: 4967k
    Garbage: 120004k -> 120003k
    Leak: 7055k -> 7056k
    Overhead: 14423k -> 14423k
    GGC runs: 520

comparing insn-attrtab.c compilation at -O0 level:
    Overall memory needed: 80944k -> 80952k
    Peak memory use before GGC: 69508k -> 69509k
    Peak memory use after GGC: 45045k
    Maximum of released memory in single GGC run: 36220k
    Garbage: 146683k -> 146686k
    Leak: 9892k -> 9892k
    Overhead: 19751k -> 19751k
    GGC runs: 247

comparing insn-attrtab.c compilation at -O1 level:
    Overall memory needed: 114184k
    Peak memory use before GGC: 96843k -> 96842k
    Peak memory use after GGC: 84876k -> 84874k
    Maximum of released memory in single GGC run: 32383k
    Garbage: 304565k -> 304573k
    Leak: 10073k -> 10073k
    Overhead: 37280k -> 37280k
    GGC runs: 246

comparing insn-attrtab.c compilation at -O2 level:
    Overall memory needed: 111584k
    Peak memory use before GGC: 98561k
    Peak memory use after GGC: 84974k -> 84975k
    Maximum of released memory in single GGC run: 31928k -> 31927k
    Garbage: 351851k -> 351852k
    Leak: 10056k -> 10057k
    Overhead: 45865k -> 45866k
    GGC runs: 276

comparing insn-attrtab.c compilation at -O3 level:
    Overall memory needed: 111632k
    Peak memory use before GGC: 98589k
    Peak memory use after GGC: 85002k
    Maximum of released memory in single GGC run: 32269k -> 32270k
    Garbage: 352518k -> 352518k
    Leak: 10061k -> 10062k
    Overhead: 46068k
    GGC runs: 281

comparing Gerald's testcase PR8361 compilation at -O0 level:
    Overall memory needed: 118252k
    Peak memory use before GGC: 95027k -> 95028k
    Peak memory use after GGC: 94080k
    Maximum of released memory in single GGC run: 20299k
    Garbage: 223449k -> 223449k
    Leak: 49470k -> 49470k
    Overhead: 37085k -> 37085k
    GGC runs: 369

comparing Gerald's testcase PR8361 compilation at -O1 level:
    Overall memory needed: 108460k
    Peak memory use before GGC: 95143k
    Peak memory use after GGC: 93151k -> 93152k
    Maximum of released memory in single GGC run: 20158k
    Garbage: 561300k -> 561223k
    Leak: 52213k -> 52213k
    Overhead: 63026k -> 63020k
    GGC runs: 532

comparing Gerald's testcase PR8361 compilation at -O2 level:
    Overall memory needed: 113132k -> 113144k
    Peak memory use before GGC: 95143k -> 95144k
    Peak memory use after GGC: 93152k
    Maximum of released memory in single GGC run: 20158k
    Garbage: 777516k -> 777488k
    Leak: 53308k -> 53309k
    Overhead: 83772k -> 83772k
    GGC runs: 612

comparing Gerald's testcase PR8361 compilation at -O3 level:
    Overall memory needed: 112676k -> 112648k
    Peak memory use before GGC: 96537k
    Peak memory use after GGC: 94579k -> 94580k
    Maximum of released memory in single GGC run: 20582k
    Garbage: 857240k -> 857251k
    Leak: 54247k -> 54248k
    Overhead: 89341k -> 89341k
    GGC runs: 625
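
The "old -> new" lines above can be turned into relative changes mechanically. The following is a minimal Python sketch, not the code this script actually runs: the function name changed_metrics and the regular expression are assumptions made for illustration, and the sample lines are copied from the combine.c -O2 block above.

```python
import re

# Matches report lines such as "    Leak: 6966k -> 6974k".
# Lines without "->" (metrics that did not change) are skipped.
CHANGE_RE = re.compile(
    r"^\s*(?P<name>[^:]+):\s*(?P<old>\d+)k?\s*->\s*(?P<new>\d+)k?\s*$"
)

def changed_metrics(block):
    """Return {metric: (old, new, percent_change)} for 'name: A -> B' lines."""
    result = {}
    for line in block.splitlines():
        m = CHANGE_RE.match(line)
        if m is None:
            continue
        old, new = int(m.group("old")), int(m.group("new"))
        # Guard against division by zero for a metric that started at 0.
        pct = 100.0 * (new - old) / old if old else 0.0
        result[m.group("name").strip()] = (old, new, pct)
    return result

# Lines taken from the combine.c -O2 block of this report.
sample = """\
    Overall memory needed: 26900k
    Garbage: 87439k -> 87438k
    Leak: 6966k -> 6974k
    GGC runs: 466 -> 465
"""
changes = changed_metrics(sample)
for name, (old, new, pct) in changes.items():
    print(f"{name}: {old} -> {new} ({pct:+.2f}%)")
```

Working from the rounded kilobyte figures, the Leak increase comes out near 0.11%; the 0.12% quoted in the -O2 block presumably comes from the unrounded values.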

Head of the ChangeLog is:

--- /usr/src/SpecTests/sandbox-britten-memory/x86_64/mem-result/ChangeLog	2006-02-18 19:45:41.000000000 +0000
+++ /usr/src/SpecTests/sandbox-britten-memory/gcc/gcc/ChangeLog	2006-02-19 06:30:39.000000000 +0000
@@ -1,3 +1,152 @@
+2006-02-18  Mark Wielaard  <mark@klomp.org>
+
+	* doc/contrib.texi (Contributors): Add classpath/libgcj hackers
+	who added new 4.1 features, bug fixes and integration support.
+
+2005-02-18  David Edelsohn  <edelsohn@gnu.org>
+
+	PR target/26350
+	* config/rs6000/rs6000.md (extenddftf2): Force 0.0 to validized
+	MEM for ABI_V4 pic.
+
+2005-02-18  Richard Sandiford  <richard@codesourcery.com>
+
+	* cselib.c (cselib_init): Change RTX_SIZE to RTX_CODE_SIZE.
+	* emit-rtl.c (copy_rtx_if_shared_1): Use shallow_copy_rtx.
+	(copy_insn_1): Likewise.  Don't copy each field individually.
+	Reindent.
+	* read-rtl.c (apply_macro_to_rtx): Use RTX_CODE_SIZE instead
+	of RTX_SIZE.
+	* reload1.c (eliminate_regs): Use shallow_copy_rtx.
+	* rtl.c (rtx_size): Rename variable to...
+	(rtx_code_size): ...this.
+	(rtx_size): New function.
+	(rtx_alloc_stat): Use RTX_CODE_SIZE instead of RTX_SIZE.
+	(copy_rtx): Use shallow_copy_rtx.  Don't copy each field individually.
+	Reindent.
+	(shallow_copy_rtx_stat): Use rtx_size instead of RTX_SIZE.
+	* rtl.h (rtx_code_size): New variable.
+	(rtx_size): Change from a variable to a function.
+	(RTX_SIZE): Rename to...
+	(RTX_CODE_SIZE): ...this.
+
+	PR target/9703
+	PR tree-optimization/17106
+	* doc/tm.texi (TARGET_USE_BLOCKS_FOR_CONSTANT_P): Document.
+	(Anchored Addresses): New section.
+	* doc/invoke.texi (-fsection-anchors): Document.
+	* doc/rtl.texi (SYMBOL_REF_IN_BLOCK_P, SYMBOL_FLAG_IN_BLOCK): Likewise.
+	(SYMBOL_REF_ANCHOR_P, SYMBOL_FLAG_ANCHOR): Likewise.
+	(SYMBOL_REF_BLOCK, SYMBOL_REF_BLOCK_OFFSET): Likewise.
+	* hooks.c (hook_bool_mode_rtx_false): New function.
+	* hooks.h (hook_bool_mode_rtx_false): Declare.
+	* gengtype.c (create_optional_field): New function.
+	(adjust_field_rtx_def): Add the "block_sym" field for SYMBOL_REFs when
+	SYMBOL_REF_IN_BLOCK_P is true.
+	* target.h (output_anchor, use_blocks_for_constant_p): New hooks.
+	(min_anchor_offset, max_anchor_offset): Likewise.
+	(use_anchors_for_symbol_p): New hook.
+	* toplev.c (compile_file): Call output_object_blocks.
+	(target_supports_section_anchors_p): New function.
+	(process_options): Check that -fsection-anchors is only used on
+	targets that support it and when -funit-at-a-time is in effect.
+	* tree-ssa-loop-ivopts.c (prepare_decl_rtl): Only create DECL_RTL
+	if the decl doesn't have one.
+	* dwarf2out.c: Remove instantiations of VEC(rtx,gc).
+	* expr.c (emit_move_multi_word, emit_move_insn): Pass the result
+	of force_const_mem through use_anchored_address.
+	(expand_expr_constant): New function.
+	(expand_expr_addr_expr_1): Call it.  Use the same modifier when
+	calling expand_expr for INDIRECT_REF.
+	(expand_expr_real_1): Pass DECL_RTL through use_anchored_address
+	for all modifiers except EXPAND_INITIALIZER.  Use expand_expr_constant.
+	* expr.h (use_anchored_address): Declare.
+	* loop-unroll.c: Don't declare rtx vectors here.
+	* explow.c: Include output.h.
+	(validize_mem): Call use_anchored_address.
+	(use_anchored_address): New function.
+	* common.opt (-fsection-anchors): New switch.
+	* varasm.c (object_block_htab, anchor_labelno): New variables.
+	(hash_section, object_block_entry_eq, object_block_entry_hash)
+	(use_object_blocks_p, get_block_for_section, create_block_symbol)
+	(use_blocks_for_decl_p, change_symbol_section): New functions.
+	(get_variable_section): New function, split out from assemble_variable.
+	(make_decl_rtl): Create a block symbol if use_object_blocks_p and
+	use_blocks_for_decl_p say so.  Use change_symbol_section if the
+	symbol has already been created.
+	(assemble_variable_contents): New function, split out from...
+	(assemble_variable): ...here.  Don't output any code for
+	block symbols; just pass them to place_block_symbol.
+	Use get_variable_section and assemble_variable_contents.
+	(get_constant_alignment, get_constant_section, get_constant_size): New
+	functions, split from output_constant_def_contents.
+	(build_constant_desc): Create a block symbol if use_object_blocks_p
+	says so.  Or into SYMBOL_REF_FLAGS.
+	(assemble_constant_contents): New function, split from...
+	(output_constant_def_contents): ...here.  Don't output any code
+	for block symbols; just pass them to place_section_symbol.
+	Use get_constant_section and get_constant_alignment.
+	(force_const_mem): Create a block symbol if use_object_blocks_p and
+	use_blocks_for_constant_p say so.  Or into SYMBOL_REF_FLAGS.
+	(output_constant_pool_1): Add an explicit alignment argument.
+	Don't switch sections here.
+	(output_constant_pool): Adjust call to output_constant_pool_1.
+	Switch sections here instead.  Don't output anything for block symbols;
+	just pass them to place_block_symbol.
+	(init_varasm_once): Initialize object_block_htab.
+	(default_encode_section_info): Keep the old SYMBOL_FLAG_IN_BLOCK.
+	(default_asm_output_anchor, default_use_anchors_for_symbol_p)
+	(place_block_symbol, get_section_anchor, output_object_block)
+	(output_object_block_htab, output_object_blocks): New functions.
+	* target-def.h (TARGET_ASM_OUTPUT_ANCHOR): New macro.
+	(TARGET_ASM_OUT): Include it.
+	(TARGET_USE_BLOCKS_FOR_CONSTANT_P): New macro.
+	(TARGET_MIN_ANCHOR_OFFSET, TARGET_MAX_ANCHOR_OFFSET): New macros.
+	(TARGET_USE_ANCHORS_FOR_SYMBOL_P): New macro.
+	(TARGET_INITIALIZER): Include them.
+	* rtl.c (rtl_check_failed_block_symbol): New function.
+	* rtl.h: Include vec.h.  Declare heap and gc rtx vectors.
+	(block_symbol, object_block): New structures.
+	(rtx_def): Add a block_symbol field to the union.
+	(BLOCK_SYMBOL_CHECK): New macro.
+	(rtl_check_failed_block_symbol): Declare.
+	(SYMBOL_FLAG_IN_BLOCK, SYMBOL_FLAG_ANCHOR): New SYMBOL_REF flags.
+	(SYMBOL_REF_IN_BLOCK_P, SYMBOL_REF_ANCHOR_P): New predicates.
+	(SYMBOL_FLAG_MACH_DEP_SHIFT): Bump by 2.
+	(SYMBOL_REF_BLOCK, SYMBOL_REF_BLOCK_OFFSET): New accessors.
+	* output.h (output_section_symbols): Declare.
+	(object_block): Name structure.
+	(place_section_symbol, get_section_anchor, default_asm_output_anchor)
+	(default_use_anchors_for_symbol_p): Declare.
+	* Makefile.in (RTL_BASE_H): Add vec.h.
+	(explow.o): Depend on output.h.
+	* config/rs6000/rs6000.c (TARGET_MIN_ANCHOR_OFFSET): Override default.
+	(TARGET_MAX_ANCHOR_OFFSET): Likewise.
+	(TARGET_USE_BLOCKS_FOR_CONSTANT_P): Likewise.
+	(rs6000_use_blocks_for_constant_p): New function.
+
+2006-02-18  John David Anglin  <dave.anglin@nrc-cnrc.gc.ca>
+
+	* doc/install.texi (hppa*-hp-hpux*): Update for 4.1.0.
+
+2006-02-18  Andrew Pinski  <pinskia@physics.uc.edu>
+
+	PR tree-opt/25680
+	* tree-ssa-ccp.c (ccp_fold): Handle store CCP of REALPART_EXPR and
+	IMAGPART_EXPR.
+
+2006-02-18  Diego Novillo  <dnovillo@redhat.com>
+
+	* tree-flow.h (struct var_ann_d): Rename field is_alias_tag to
+	is_aliased.
+	Update all users.
+
+2006-02-18  Jakub Jelinek  <jakub@redhat.com>
+
+	PR middle-end/26334
+	* stmt.c (decl_overlaps_hard_reg_set_p): Use DECL_HARD_REGISTER
+	instead of DECL_REGISTER.
+
 2006-02-18  Olivier Hainque  <hainque@adacore.com>
 
 	PR ada/13408


The results can be reproduced by building a compiler with

--enable-gather-detailed-mem-stats targeting x86-64

and compiling preprocessed combine.c or the testcase from PR8361 with:

-fmem-report --param=ggc-min-heapsize=1024 --param=ggc-min-expand=1 -Ox -Q
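
For running all four optimization levels, the option line above can be assembled programmatically. This is a small Python sketch under stated assumptions: the helper name mem_report_args and the compiler path "./cc1" are hypothetical (a C++ testcase would need a different compiler binary), and only the flags quoted above are used.

```python
def mem_report_args(source, level, compiler="./cc1"):
    """Build the argument list for one measurement run.

    `compiler` is a hypothetical path to a compiler built with
    --enable-gather-detailed-mem-stats; adjust it to your build tree.
    """
    if level not in (0, 1, 2, 3):
        raise ValueError("optimization level must be 0, 1, 2 or 3")
    return [
        compiler,
        "-fmem-report",
        "--param=ggc-min-heapsize=1024",
        "--param=ggc-min-expand=1",
        f"-O{level}",   # the "-Ox" placeholder from the option line
        "-Q",
        source,
    ]

# One command line per optimization level for preprocessed combine.c.
commands = [mem_report_args("combine.i", lvl) for lvl in range(4)]
for argv in commands:
    print(" ".join(argv))
```

Each list can be passed to subprocess.run with the dump captured for later inspection.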

The memory consumption summary appears in the dump after the detailed
listing of the places where memory is allocated.  Peak memory consumption
is computed by taking the maximal value in the {GC XXXX -> YYYY} reports.
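
The peak computation can be sketched in a few lines of Python. This is an illustration only: it assumes the markers look exactly like "{GC 1234k -> 567k}" with sizes in kilobytes, the function name peak_memory is made up, and the sample dump contains invented numbers rather than real compiler output.

```python
import re

# Assumed marker format: "{GC <before>k -> <after>k}".
GC_RE = re.compile(r"\{GC (\d+)k -> (\d+)k\}")

def peak_memory(dump):
    """Return (peak before GGC, peak after GGC) in kilobytes.

    Each peak is the maximum over all collection runs, so the two
    values may come from different runs.
    """
    before = after = 0
    for m in GC_RE.finditer(dump):
        before = max(before, int(m.group(1)))
        after = max(after, int(m.group(2)))
    return before, after

# Invented sample, not real compiler output.
sample_dump = "foo {GC 1200k -> 800k} bar {GC 1500k -> 700k} baz {GC 1100k -> 900k}"
peaks = peak_memory(sample_dump)
print("Peak memory use before GGC: %dk" % peaks[0])
print("Peak memory use after GGC: %dk" % peaks[1])
```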

Your testing script.

