This is the mail archive of the gcc-regression@gcc.gnu.org mailing list for the GCC project.



A recent patch increased GCC's memory consumption!


Hi,

I am a friendly script caring about memory consumption in GCC.  Please
contact jh@suse.cz if something is going wrong.

Comparing memory consumption on compilation of combine.i, insn-attrtab.i,
and generate-3.4.ii I got:


comparing combine.c compilation at -O0 level:
    Overall memory needed: 25276k
    Peak memory use before GGC: 9567k
    Peak memory use after GGC: 8914k
    Maximum of released memory in single GGC run: 2648k
    Garbage: 40060k -> 40068k
    Leak: 6741k -> 6741k
    Overhead: 5738k -> 5738k
    GGC runs: 313

comparing combine.c compilation at -O1 level:
    Overall memory needed: 26900k
    Peak memory use before GGC: 17438k
    Peak memory use after GGC: 17258k -> 17259k
    Maximum of released memory in single GGC run: 2317k -> 2318k
    Garbage: 61612k -> 61617k
    Leak: 6881k -> 6881k
    Overhead: 7267k -> 7268k
    GGC runs: 387 -> 388

comparing combine.c compilation at -O2 level:
    Overall memory needed: 26900k
    Peak memory use before GGC: 17440k -> 17441k
    Peak memory use after GGC: 17259k
    Maximum of released memory in single GGC run: 2409k
    Garbage: 81241k -> 81244k
    Leak: 6965k -> 6966k
    Overhead: 9889k -> 9889k
    GGC runs: 456

comparing combine.c compilation at -O3 level:
    Overall memory needed: 26900k
    Peak memory use before GGC: 18432k
    Peak memory use after GGC: 17975k
    Maximum of released memory in single GGC run: 3504k
    Garbage: 111814k -> 111814k
    Leak: 7055k -> 7055k
    Overhead: 13491k -> 13492k
    GGC runs: 514

comparing insn-attrtab.c compilation at -O0 level:
  Amount of memory still referenced at the end of compilation increased from 9892k to 10132k, overall 2.43%
    Overall memory needed: 80948k
    Peak memory use before GGC: 69509k
    Peak memory use after GGC: 45045k
    Maximum of released memory in single GGC run: 36220k
    Garbage: 146686k -> 146428k
    Leak: 9892k -> 10132k
    Overhead: 19751k -> 19750k
    GGC runs: 247

comparing insn-attrtab.c compilation at -O1 level:
    Overall memory needed: 108248k -> 108252k
    Peak memory use before GGC: 91300k
    Peak memory use after GGC: 80536k
    Maximum of released memory in single GGC run: 32511k
    Garbage: 290789k -> 290789k
    Leak: 10073k -> 10074k
    Overhead: 34652k
    GGC runs: 244

comparing insn-attrtab.c compilation at -O2 level:
    Overall memory needed: 110896k
    Peak memory use before GGC: 97629k -> 97630k
    Peak memory use after GGC: 84116k
    Maximum of released memory in single GGC run: 32085k -> 32084k
    Garbage: 346101k
    Leak: 10057k -> 10057k
    Overhead: 44310k
    GGC runs: 273

comparing insn-attrtab.c compilation at -O3 level:
    Overall memory needed: 110952k
    Peak memory use before GGC: 97657k
    Peak memory use after GGC: 84143k -> 84144k
    Maximum of released memory in single GGC run: 32420k
    Garbage: 346740k
    Leak: 10061k -> 10061k
    Overhead: 44506k
    GGC runs: 279

comparing Gerald's testcase PR8361 compilation at -O0 level:
    Overall memory needed: 118252k
    Peak memory use before GGC: 95028k
    Peak memory use after GGC: 94080k
    Maximum of released memory in single GGC run: 20299k
    Garbage: 223450k -> 223451k
    Leak: 49470k -> 49470k
    Overhead: 37086k -> 37086k
    GGC runs: 369

comparing Gerald's testcase PR8361 compilation at -O1 level:
  Amount of produced GGC garbage increased from 557326k to 559381k, overall 0.37%
  Amount of memory still referenced at the end of compilation increased from 52213k to 53307k, overall 2.10%
    Overall memory needed: 108460k -> 108456k
    Peak memory use before GGC: 95143k -> 95144k
    Peak memory use after GGC: 93152k
    Maximum of released memory in single GGC run: 20158k
    Garbage: 557326k -> 559381k
    Leak: 52213k -> 53307k
    Overhead: 61888k -> 62175k
    GGC runs: 526 -> 525

comparing Gerald's testcase PR8361 compilation at -O2 level:
  Overall memory allocated via mmap and sbrk increased from 108888k to 111076k, overall 2.01%
  Amount of produced GGC garbage increased from 678371k to 685885k, overall 1.11%
  Amount of memory still referenced at the end of compilation increased from 53299k to 54384k, overall 2.04%
    Overall memory needed: 108888k -> 111076k
    Peak memory use before GGC: 95144k
    Peak memory use after GGC: 93152k
    Maximum of released memory in single GGC run: 20158k
    Garbage: 678371k -> 685885k
    Leak: 53299k -> 54384k
    Overhead: 74296k -> 75154k
    GGC runs: 616 -> 606

comparing Gerald's testcase PR8361 compilation at -O3 level:
  Amount of produced GGC garbage increased from 743789k to 746936k, overall 0.42%
  Amount of memory still referenced at the end of compilation increased from 54234k to 55481k, overall 2.30%
    Overall memory needed: 110772k -> 110776k
    Peak memory use before GGC: 96537k -> 96538k
    Peak memory use after GGC: 94580k
    Maximum of released memory in single GGC run: 20582k -> 20583k
    Garbage: 743789k -> 746936k
    Leak: 54234k -> 55481k
    Overhead: 78685k -> 79215k
    GGC runs: 631 -> 630
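
The "overall N%" figures in the warning lines above are consistent with a plain relative increase over the old value. A minimal sketch of that arithmetic follows; the function name percent_increase is hypothetical and is not taken from the testing script itself:

```python
def percent_increase(before_k, after_k):
    """Relative growth from before_k to after_k, in percent."""
    return (after_k - before_k) / before_k * 100.0


# Leak for insn-attrtab.c at -O0: 9892k -> 10132k
print(f"{percent_increase(9892, 10132):.2f}%")
# Overall memory for PR8361 at -O2: 108888k -> 111076k
print(f"{percent_increase(108888, 111076):.2f}%")
```

Rounded to two decimals, these reproduce the 2.43% and 2.01% reported above.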

Head of the ChangeLog is:

--- /usr/src/SpecTests/sandbox-britten-memory/x86_64/mem-result/ChangeLog	2006-03-02 12:08:18.000000000 +0000
+++ /usr/src/SpecTests/sandbox-britten-memory/gcc/gcc/ChangeLog	2006-03-02 22:39:36.000000000 +0000
@@ -1,3 +1,236 @@
+2006-03-02  Richard Sandiford  <richard@codesourcery.com>
+
+	* doc/tm.texi (TARGET_HAVE_SWITCHABLE_BSS_SECTIONS): Document.
+	(ASM_OUTPUT_BSS): Describe the two ways of handling global BSS,
+	and say that only one is needed.
+	* doc/rtl.texi (SYMBOL_REF_BLOCK): Say that the block can be null.
+	* target.h (have_switchable_bss_sections): New hook.
+	* explow.c (use_anchored_address): Check that the symbol is in a block.
+	* varasm.c (tls_comm_section, comm_section, lcomm_section)
+	(bss_noswitch_section): New variables.
+	(get_unnamed_section): Add SECTION_UNNAMED to the flags.
+	(get_noswitch_section): New function.
+	(get_block_for_section): Allow SECT to be null.
+	(unlikely_text_section_p): Use SECTION_STYLE.
+	(bss_initializer_p): New function.
+	(get_variable_section): Move earlier in file.  Take a new argument,
+	prefer_noswitch_p.  Move bss checks from assemble_variable to here.
+	Return one of the new *_sections in such cases.
+	(get_block_for_decl): New function, extracting some logic from
+	use_blocks_for_decl_p.
+	(change_symbol_section): Remove in favor of...
+	(change_symbol_block): ...this new function.
+	(use_blocks_for_decl_p): Remove checks now performed by
+	get_block_for_decl.
+	(make_decl_rtl): Use change_symbol_block and get_block_for_decl.
+	(ASM_EMIT_LOCAL, ASM_EMIT_BSS, ASM_EMIT_COMMON): Delete in favor of...
+	(emit_local, emit_bss, emit_common): ...these new functions.
+	Return true if the alignment was honored.
+	(emit_tls_common): New function.
+	(asm_emit_uninitialised): Delete.
+	(assemble_variable_noswitch): New function, split out from...
+	(assemble_variable): ...here.  Don't make decisions about common
+	variables here.  Globalize all public decls that go into non-common
+	sections.  Check whether SYMBOL_REF_BLOCK is null.
+	(output_constant_def_contents): Check whether SYMBOL_REF_BLOCK is null.
+	(output_constant_pool): Likewise.
+	(init_varasm_once): Initialize the new section variables.
+	(have_global_bss_p): New function.
+	(categorize_decl_for_section): Use bss_initializer_p.
+	(switch_to_section): Use SECTION_STYLE.  Abort for SECTION_NOSWITCH.
+	(place_block_symbol): Assert that the symbol must be in a block.
+	* target-def.h (TARGET_HAVE_SWITCHABLE_BSS_SECTIONS): New macro.
+	(TARGET_INITIALIZER): Include it.
+	* rtl.h (SYMBOL_REF_BLOCK): Document the null alternative.
+	* output.h (SECTION_STYLE_MASK, SECTION_COMMON): New macros.
+	(SECTION_MACH_DEP): Bump by two.
+	(SECTION_UNNAMED, SECTION_NOSWITCH): New macros.
+	(unnamed_section): Mention SECTION_UNNAMED in comment.
+	(named_section): Likewise SECTION_NAMED.
+	(noswitch_section_callback): New type.
+	(noswitch_section): New structure.
+	(section): Add a noswitch_section alternative.
+	(SECTION_STYLE): New macro.
+	(tls_comm_section, comm_section, lcomm_section): Declare.
+	(bss_noswitch_section, have_global_bss_p): Declare.
+	* config/elfos.h (TARGET_HAVE_SWITCHABLE_BSS_SECTIONS): Override.
+	* config/iq2000/iq2000.c (TARGET_HAVE_SWITCHABLE_BSS_SECTIONS):
+	Override.
+	* config/v850/v850.c (TARGET_HAVE_SWITCHABLE_BSS_SECTIONS): Override.
+	* config/stormy16/stormy16.c (TARGET_HAVE_SWITCHABLE_BSS_SECTIONS):
+	Override.
+
+2006-03-02  Daniel Berlin <dberlin@dberlin.org>
+
+	* gcc/tree-vrp.c (execute_vrp): Return value.
+	* gcc/regrename.c (rest_of_handle_regrename): Ditto.
+	* gcc/tree-into-ssa.c (rewrite_into_ssa): Ditto.
+	* gcc/tree-complex.c (tree_lower_complex): Ditto.
+	(tree_lower_complex_O0): Ditto.
+	* gcc/tracer.c (rest_of_handle_tracer): Ditto.
+	* gcc/postreload-gcse.c (rest_of_handle_gcse2): Ditto.
+	* gcc/postreload.c (rest_of_handle_postreload): Ditto.
+	* gcc/tree-tailcall.c (execute_tail_recursion): Ditto.
+	(execute_tail_calls): Ditto.
+	* gcc/tree-ssa-loop-ch.c (copy_loop_headers): Ditto.
+	* gcc/tree.h (init_function_for_compilation): Ditto.
+	* gcc/ipa-cp.c (ipcp_driver): Ditto.
+	* gcc/tree-scalar-evolution.c (scev_const_prop): Ditto.
+	* gcc/tree-scalar-evolution.h (scev_const_prop): Ditto.
+	* gcc/final.c (compute_alignments): Ditto.
+	(rest_of_handle_final): Ditto.
+	(rest_of_handle_shorten_branches): Ditto.
+	(rest_of_clean_state): Ditto.
+	* gcc/omp-low.c (execute_expand_omp): Ditto.
+	(execute_lower_omp): Ditto.
+	* gcc/tree-ssa-dse.c (tree_ssa_dse): Ditto.
+	* gcc/ipa-reference.c (static_execute): Ditto.
+	* gcc/tree-ssa-uncprop.c (tree_ssa_uncprop): Ditto.
+	* gcc/reorg.c (rest_of_handle_delay_slots): Ditto.
+	(rest_of_handle_machine_reorg): Ditto.
+	* gcc/cgraphunit.c (rebuild_cgraph_edges): Ditto.
+	* gcc/flow.c (recompute_reg_usage): Ditto.
+	(rest_of_handle_remove_death_notes): Ditto.
+	(rest_of_handle_life): Ditto.
+	(rest_of_handle_flow2): Ditto.
+	* gcc/tree-ssa-copyrename.c (rename_ssa_copies): Ditto.
+	* gcc/tree-ssa-ccp.c (do_ssa_ccp): Ditto.
+	(do_ssa_store_ccp): Ditto.
+	(execute_fold_all_builtins): Ditto.
+	* gcc/mode-switching.c (rest_of_handle_mode_switching): Ditto.
+	* gcc/modulo-sched.c (rest_of_handle_sms): Ditto.
+	* gcc/ipa-pure-const.c (static_execute): Ditto.
+	* gcc/cse.c (rest_of_handle_cse): Ditto.
+	(rest_of_handle_cse2): Ditto.
+	* gcc/web.c (rest_of_handle_web): Ditto.
+	* gcc/tree-stdarg.c (execute_optimize_stdarg): Ditto.
+	* gcc/tree-ssa-math-opts.c (execute_cse_reciprocals): Ditto.
+	* gcc/tree-ssa-dom.c (tree_ssa_dominator_optimize): Ditto.
+	* gcc/tree-nrv.c (tree_nrv): Ditto.
+	(execute_return_slot_opt): Ditto.
+	* gcc/tree-ssa-alias.c (compute_may_aliases): Ditto.
+	(create_structure_vars): Ditto.
+	* gcc/loop-init.c (rtl_loop_init): Ditto.
+	(rtl_loop_done): Ditto.
+	(rtl_move_loop_invariants): Ditto.
+	(rtl_unswitch): Ditto.
+	(rtl_unroll_and_peel_loops): Ditto.
+	(rtl_doloop): Ditto.
+	* gcc/gimple-low.c (lower_function_body): Ditto.
+	(mark_used_blocks): Ditto.
+	* gcc/tree-ssa-sink.c (execute_sink_code): Ditto.
+	* gcc/ipa-inline.c (cgraph_decide_inlining): Ditto.
+	(cgraph_early_inlining): Ditto.
+	* gcc/global.c (rest_of_handle_global_alloc): Ditto.
+	* gcc/jump.c (cleanup_barriers): Ditto.
+	(purge_line_number_notes): Ditto.
+	* gcc/ifcvt.c (rest_of_handle_if_conversion): Ditto.
+	(rest_of_handle_if_after_reload): Ditto.
+	* gcc/tree-ssa-loop.c (tree_ssa_loop_init): Ditto.
+	(tree_ssa_loop_im): Ditto.
+	(tree_ssa_loop_unswitch): Ditto.
+	(tree_vectorize): Ditto.
+	(tree_linear_transform): Ditto.
+	(tree_ssa_loop_ivcanon): Ditto.
+	(tree_ssa_empty_loop): Ditto.
+	(tree_ssa_loop_bounds): Ditto.
+	(tree_complete_unroll): Ditto.
+	(tree_ssa_loop_prefetch): Ditto.
+	(tree_ssa_loop_ivopts): Ditto.
+	(tree_ssa_loop_done): Ditto.
+	* gcc/predict.c (tree_estimate_probability): Ditto.
+	* gcc/recog.c (split_all_insns_noflow): Ditto.
+	(rest_of_handle_peephole2): Ditto.
+	(rest_of_handle_split_all_insns): Ditto.
+	* gcc/tree-eh.c (lower_eh_constructs): Ditto.
+	* gcc/regmove.c (rest_of_handle_regmove): Ditto.
+	(rest_of_handle_stack_adjustments): Ditto.
+	* gcc/local-alloc.c (rest_of_handle_local_alloc): Ditto.
+	* gcc/function.c (instantiate_virtual_regs): Ditto.
+	(init_function_for_compilation): Ditto.
+	(rest_of_handle_check_leaf_regs): Ditto.
+	* gcc/gcse.c (rest_of_handle_jump_bypass): Ditto.
+	(rest_of_handle_gcse): Ditto.
+	* gcc/ipa-type-escape.c (type_escape_execute): Ditto.
+	* gcc/alias.c (rest_of_handle_cfg): Ditto.
+	* gcc/tree-if-conv.c (main_tree_if_conversion): Ditto.
+	* gcc/profile.c (rest_of_handle_branch_prob): Ditto.
+	* gcc/tree-ssa-phiopt.c (tree_ssa_phiopt): Ditto.
+	* gcc/rtl-factoring.c (rest_of_rtl_seqabstr): Ditto.
+	* gcc/bt-load.c (rest_of_handle_branch_target_load_optimize): Ditto
+	* gcc/tree-dfa.c (find_referenced_vars): Ditto.
+	* gcc/except.c (set_nothrow_function_flags): Ditto.
+	(convert_to_eh_region_ranges): Ditto.
+	(rest_of_handle_eh): Ditto.
+	* gcc/emit-rtl.c (unshare_all_rtl): Ditto.
+	(remove_unnecessary_notes): Ditto.
+	* gcc/except.h (set_nothrow_function_flags): Ditto.
+	(convert_to_eh_region_ranges): Ditto.
+	* gcc/cfgexpand.c (tree_expand_cfg): Ditto.
+	* gcc/tree-cfgcleanup.c (merge_phi_nodes): Ditto.
+	* gcc/tree-ssa-pre.c (do_pre): Ditto.
+	(execute_fre): Ditto.
+	* gcc/cfgcleanup.c (rest_of_handle_jump): Ditto.
+	(rest_of_handle_jump2): Ditto.
+	* gcc/tree-sra.c (tree_sra): Ditto.
+	* gcc/tree-mudflap.c (execute_mudflap_function_ops): Ditto.
+	(execute_mudflap_function_decls): Ditto.
+	* gcc/tree-ssa-copy.c (do_copy_prop): Ditto.
+	(do_store_copy_prop): Ditto.
+	* gcc/ipa-prop.h (ipcp_driver): Ditto.
+	* gcc/cfglayout.c (insn_locators_initialize): Ditto.
+	* gcc/tree-ssa-forwprop.c
+	(tree_ssa_forward_propagate_single_use_vars): Ditto.
+	* gcc/cfglayout.h (insn_locators_initialize): Ditto.
+	* gcc/tree-ssa-dce.c (tree_ssa_dce): Ditto.
+	* gcc/tree-ssa.c (execute_early_warn_uninitialized): Ditto.
+	(execute_late_warn_uninitialized): Ditto.
+	* gcc/rtl.h (cleanup_barriers): Ditto.
+	(split_all_insns_noflow): Ditto.
+	(purge_line_number_notes): Ditto.
+	(unshare_all_rtl): Ditto.
+	(remove_unnecessary_notes): Ditto.
+	(recompute_reg_usage): Ditto.
+	(variable_tracking_main): Ditto.
+	* gcc/integrate.c (emit_initial_value_sets): Ditto.
+	* gcc/integrate.h (emit_initial_value_sets): Ditto.
+	* gcc/tree-optimize.c (execute_free_datastructures): Ditto
+	(execute_free_cfg_annotations): Ditto.
+	(execute_fixup_cfg): Ditto.
+	(execute_cleanup_cfg_pre_ipa): Ditto.
+	(execute_cleanup_cfg_post_optimizing): Ditto.
+	(execute_init_datastructures): Ditto.
+	* gcc/tree-object-size.c (compute_object_sizes): Ditto.
+	* gcc/combine.c (rest_of_handle_combine): Ditto.
+	* gcc/tree-outof-ssa.c (rewrite_out_of_ssa): Ditto.
+	* gcc/bb-reorder.c (duplicate_computed_gotos): Ditto.
+	(rest_of_handle_reorder_blocks): Ditto.
+	(rest_of_handle_partition_blocks): Ditto.
+	* gcc/var-tracking.c (variable_tracking_main): Ditto.
+	* gcc/tree-profile.c (tree_profiling): Ditto.
+	* gcc/tree-vect-generic.c (expand_vector_operations): Ditto.
+	* gcc/reg-stack.c (rest_of_handle_stack_regs): Ditto.
+	* gcc/sched-rgn.c (rest_of_handle_sched): Ditto.
+	(rest_of_handle_sched2): Ditto.
+	* gcc/basic-block.h (free_bb_insn): Ditto.
+	* gcc/tree-ssa-structalias.c (ipa_pta_execute): Ditto.
+	* gcc/tree-cfg.c (execute_build_cfg): Ditto.
+	(remove_useless_stmts): Ditto.
+	(split_critical_edges): Ditto.
+	(execute_warn_function_return): Ditto.
+	(execute_warn_function_noreturn): Ditto.
+	* gcc/tree-ssa-reassoc.c (execute_reassoc): Ditto.
+	* gcc/cfgrtl.c (free_bb_for_insn): Ditto.
+	* gcc/passes.c (execute_one_pass): Run additional
+	todos returned by execute function.
+	* gcc/tree-pass.h (struct tree_opt_pass): Make execute
+	return a value.
+
+2006-03-02  Richard Guenther  <rguenther@suse.de>
+
+	* tree-ssa-alias.c (find_used_portions): Consider taking
+	the address as making the variable not write-only.
+
 2006-03-02  Nick Clifton  <nickc@redhat.com>
 
 	* config.gcc (default_use_cxa_atexit): Extend the description of
@@ -1100,7 +1333,7 @@
 	PR middle-end/25600
 	* fold-const.c (fold_binary): Fold (X >> C) != 0 into X < 0 when
 	C is one less than the width of X (and related transformations).
-	* simplify_rtx.c (simplify_unary_operation_1): Transform
+	* simplify-rtx.c (simplify_unary_operation_1): Transform
 	(neg (lt x 0)) into either (ashiftrt X C) or (lshiftrt X C)
 	depending on STORE_FLAG_VALUE, were C is one less then the
 	width of X.
--- /usr/src/SpecTests/sandbox-britten-memory/x86_64/mem-result/ChangeLog.cp	2006-03-02 12:08:18.000000000 +0000
+++ /usr/src/SpecTests/sandbox-britten-memory/gcc/gcc/cp/ChangeLog	2006-03-02 22:39:33.000000000 +0000
@@ -1,3 +1,8 @@
+2006-03-02  Richard Sandiford  <richard@codesourcery.com>
+
+	* decl.c (start_decl): Use have_global_bss_p when deciding
+	whether to make the decl common.
+
 2006-03-01  Mike Stump  <mrs@apple.com>
 
 	PR darwin/25908


The results can be reproduced by building a compiler with

--enable-gather-detailed-mem-stats targeting x86-64

and compiling preprocessed combine.c or the testcase from PR8361 with:

-fmem-report --param=ggc-min-heapsize=1024 --param=ggc-min-expand=1 -Ox -Q

The memory consumption summary appears in the dump after the detailed
listing of the places where memory is allocated.  Peak memory consumption
is computed by looking for the maximal values in the {GC XXXX -> YYYY}
reports.
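
The peak computation described above can be sketched as follows. This is an illustration only, not the testing script itself: it assumes each collection is reported as "{GC <before>k -> <after>k}" with sizes in kilobytes, and the helper name peak_memory is hypothetical.

```python
import re

# Assumed line shape: "{GC 9567k -> 8914k}" (sizes in kilobytes).
GC_REPORT = re.compile(r"\{GC (\d+)k -> (\d+)k\}")


def peak_memory(dump_text):
    """Return (peak before GGC, peak after GGC) in kilobytes, or None."""
    pairs = [(int(before), int(after))
             for before, after in GC_REPORT.findall(dump_text)]
    if not pairs:
        return None
    peak_before = max(before for before, _ in pairs)
    peak_after = max(after for _, after in pairs)
    return peak_before, peak_after


sample = "pass1 {GC 9567k -> 8914k} pass2 {GC 7000k -> 6500k}"
print(peak_memory(sample))
```

The two maxima are taken independently, so they need not come from the same collection run.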

Your testing script.

