This is the mail archive of the gcc-regression@gcc.gnu.org mailing list for the GCC project.



Some aspect of GCC memory consumption increased by recent patch


Hi,
Comparing memory consumption on compilation of combine.i and generate-3.4.ii, I got:


comparing combine.c compilation at -O0 level:
    Overall memory needed: 17776k
    Peak memory use before GGC: 9294k
    Peak memory use after GGC: 8606k
    Maximum of released memory in single GGC run: 2870k
    Garbage: 42611k
    Leak: 6103k
    Overhead: 5586k
    GGC runs: 365

comparing combine.c compilation at -O1 level:
  Amount of produced GGC garbage decreased from 80496k to 72823k, overall -10.54%
    Overall memory needed: 18660k -> 18448k
    Peak memory use before GGC: 9786k -> 9645k
    Peak memory use after GGC: 8800k -> 8804k
    Maximum of released memory in single GGC run: 2068k -> 2034k
    Garbage: 80496k -> 72823k
    Leak: 6669k -> 6669k
    Overhead: 14316k -> 12311k
    GGC runs: 598 -> 590

comparing combine.c compilation at -O2 level:
  Amount of produced GGC garbage decreased from 96605k to 88714k, overall -8.90%
    Overall memory needed: 22272k -> 22096k
    Peak memory use before GGC: 12769k
    Peak memory use after GGC: 12610k
    Maximum of released memory in single GGC run: 2578k
    Garbage: 96605k -> 88714k
    Leak: 6424k -> 6424k
    Overhead: 19132k -> 17053k
    GGC runs: 587 -> 582

comparing combine.c compilation at -O3 level:
  Overall memory allocated via mmap and sbrk increased from 23752k to 24072k, overall 1.35%
  Amount of produced GGC garbage decreased from 129502k to 117766k, overall -9.97%
    Overall memory needed: 23752k -> 24072k
    Peak memory use before GGC: 13443k -> 13103k
    Peak memory use after GGC: 12733k -> 12739k
    Maximum of released memory in single GGC run: 3484k -> 3447k
    Garbage: 129502k -> 117766k
    Leak: 6990k -> 6934k
    Overhead: 25298k -> 22130k
    GGC runs: 650 -> 644

comparing insn-attrtab.c compilation at -O0 level:
    Overall memory needed: 132860k
    Peak memory use before GGC: 77100k
    Peak memory use after GGC: 45185k
    Maximum of released memory in single GGC run: 42129k
    Garbage: 159511k
    Leak: 10618k
    Overhead: 19798k
    GGC runs: 310

comparing insn-attrtab.c compilation at -O1 level:
    Overall memory needed: 150312k -> 149896k
    Peak memory use before GGC: 97475k -> 97456k
    Peak memory use after GGC: 71695k -> 71693k
    Maximum of released memory in single GGC run: 41239k
    Garbage: 491599k -> 490782k
    Leak: 11052k -> 11051k
    Overhead: 85803k -> 85748k
    GGC runs: 469 -> 467

comparing insn-attrtab.c compilation at -O2 level:
  Overall memory allocated via mmap and sbrk increased from 224512k to 240896k, overall 7.30%
    Overall memory needed: 224512k -> 240896k
    Peak memory use before GGC: 113028k -> 113009k
    Peak memory use after GGC: 87248k -> 87246k
    Maximum of released memory in single GGC run: 35988k
    Garbage: 544311k -> 543380k
    Leak: 11207k -> 11207k
    Overhead: 95955k -> 95865k
    GGC runs: 389

comparing insn-attrtab.c compilation at -O3 level:
    Overall memory needed: 224504k -> 224644k
    Peak memory use before GGC: 113029k -> 113009k
    Peak memory use after GGC: 87248k -> 87246k
    Maximum of released memory in single GGC run: 35988k
    Garbage: 546684k -> 544517k
    Leak: 11271k -> 11247k
    Overhead: 96739k -> 96265k
    GGC runs: 398 -> 396

comparing Gerald's testcase PR8361 compilation at -O0 level:
    Overall memory needed: 114840k
    Peak memory use before GGC: 92007k
    Peak memory use after GGC: 90474k
    Maximum of released memory in single GGC run: 20897k
    Garbage: 274940k
    Leak: 59854k
    Overhead: 35160k
    GGC runs: 556

comparing Gerald's testcase PR8361 compilation at -O1 level:
  Amount of produced GGC garbage decreased from 695313k to 656225k, overall -5.96%
    Overall memory needed: 126208k -> 126320k
    Peak memory use before GGC: 96400k
    Peak memory use after GGC: 89739k
    Maximum of released memory in single GGC run: 20070k
    Garbage: 695313k -> 656225k
    Leak: 62394k -> 62367k
    Overhead: 151706k -> 136567k
    GGC runs: 819 -> 807

comparing Gerald's testcase PR8361 compilation at -O2 level:
  Amount of produced GGC garbage decreased from 763238k to 722293k, overall -5.67%
    Overall memory needed: 127456k -> 127420k
    Peak memory use before GGC: 96400k
    Peak memory use after GGC: 89740k
    Maximum of released memory in single GGC run: 20070k
    Garbage: 763238k -> 722293k
    Leak: 62959k -> 62927k
    Overhead: 182222k -> 166220k
    GGC runs: 854 -> 843

comparing Gerald's testcase PR8361 compilation at -O3 level:
  Amount of produced GGC garbage decreased from 805932k to 762557k, overall -5.69%
    Overall memory needed: 126228k -> 125860k
    Peak memory use before GGC: 92885k -> 92734k
    Peak memory use after GGC: 90751k -> 90722k
    Maximum of released memory in single GGC run: 20814k
    Garbage: 805932k -> 762557k
    Leak: 63372k -> 63334k
    Overhead: 195463k -> 176964k
    GGC runs: 841 -> 833
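
A note on the percentages in the summary lines above: the message does not show the script's formula, but the printed figures are consistent with a change measured against the smaller of the two values (so a decrease is divided by the new value, an increase by the old one). The sketch below is an assumption inferred from the numbers, not the script's actual code; `relative_change` is a hypothetical helper name. Because the report prints sizes in whole kilobytes, a recomputed figure can differ from the printed one in the last digit.

```python
def relative_change(old_k: int, new_k: int) -> float:
    """Signed percentage change, measured against the smaller value.

    Assumption: this matches the figures printed in the report; the
    script's real formula is not shown in the message.
    """
    base = min(old_k, new_k)
    if base <= 0:
        raise ValueError("both values must be positive")
    return (new_k - old_k) / base * 100.0


if __name__ == "__main__":
    # Values copied from the report above (kilobytes).
    samples = [
        ("combine.c -O1 garbage", 80496, 72823),
        ("combine.c -O3 overall memory", 23752, 24072),
        ("insn-attrtab.c -O2 overall memory", 224512, 240896),
    ]
    for label, old_k, new_k in samples:
        change = relative_change(old_k, new_k)
        print(f"{label}: {old_k}k -> {new_k}k, overall {change:.2f}%")
```

For these three samples the helper gives values that round to the -10.54%, 1.35% and 7.30% quoted in the report.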

Head of changelog is:

--- /usr/src/SpecTests/sandbox-britten-memory/x86_64/mem-result/ChangeLog	2004-09-16 12:34:43.000000000 +0000
+++ /usr/src/SpecTests/sandbox-britten-memory/gcc/gcc/ChangeLog	2004-09-16 23:19:56.000000000 +0000
@@ -1,3 +1,168 @@
+2004-09-16  Diego Novillo  <dnovillo@redhat.com>
+
+	* tree-ssa-operands.c (add_call_clobber_ops): Make read-only
+	test apply only to TREE_STATIC and DECL_EXTERNAL.
+
+2004-09-16  Zdenek Dvorak  <rakdver@atrey.karlin.mff.cuni.cz>
+
+	* Makefile.in (tree-cfg.o): Add CFGLAYOUT_H dependency.
+	* basic-block.h (get_dominated_by_region): Declare.
+	* dominance.c (get_dominated_by_region): New function.
+	* tree-cfg.c: Include cfglayout.h.
+	(tree_duplicate_bb): Duplicate also phi nodes.
+	(struct ssa_name_map_entry): New type.
+	(add_phi_args_after_copy_bb, add_phi_args_after_copy,
+	ssa_name_map_entry_hash, ssa_name_map_entry_eq,
+	allocate_ssa_names, rewrite_to_new_ssa_names_def,
+	rewrite_to_new_ssa_names_use, rewrite_to_new_ssa_names_bb,
+	rewrite_to_new_ssa_names, tree_duplicate_sese_region): New functions.
+	* tree-flow.h (tree_duplicate_sese_region, add_phi_args_after_copy_bb,
+	add_phi_args_after_copy, rewrite_to_new_ssa_names_bb,
+	rewrite_to_new_ssa_names, allocate_ssa_names,
+	rewrite_into_loop_closed_ssa, verify_loop_closed_ssa): Declare.
+	* tree-ssa-loop-ch.c (duplicate_blocks): Removed.
+	(copy_loop_headers): Use tree_duplicate_sese_region.
+
+2004-09-16  Frank Ch. Eigler  <fche@redhat.com>
+
+	* profile.c (branch_prob): Restore support for USE_MAPPED_LOCATION.
+
+2004-09-16 Jeff Law  <law@redhat.com>
+
+	* tree-into-ssa.c (block_defs_stack): New toplevel varray.
+	(rewrite_block_data): Remove, no longer used.
+	(rewrite_initialize_block_local_data): Remove, no longer used.
+	(rewrite_initialize_block): Mark parameters as unused as needed.
+	Change references to the block local block_defs to be block_defs_stack.
+	Push a marker onto the block_defs_stack.
+	(ssa_rewrite_initialize_block): Similarly.
+	(rewrite_stmt, ssa_rewrite_stmt): Similarly.
+	(ssa_register_new_def): No longer needs varray argument.  Use
+	block_defs_stack instead.  No longer handle possibly null block_defs
+	varray.  Reverse order of items we push on the stack to make it
+	easier to identify our marker.
+	(register_new_def): No longer handle possibly null block_defs
+	varray.
+	(rewrite_finalize_block): Revamp to look for markers in the global
+	block_defs_stack varray rather than wiping a block local varray.
+	Mark arguments as unused as needed.
+	(ssa_rewrite_finalize_block): Similarly.
+	(rewrite_into_ssa): Update initialization of dom walker structure
+	to reflect that we don't need block local data anymore.  Initialize
+	the block_defs_stack varray.
+	(rewrite_ssa_into_ssa): Similarly.
+	* tree-ssa-dom.c (block_defs_stack): New toplevel varray.
+	(struct dom_walk_data): Kill block_defs field.
+	(tree_ssa_dominator_optimize): Initialize block_defs_stack.
+	(thread_across_edge): Use the global block_defs_stack instead of
+	the old block_defs varray.
+	(dom_opt_initialize_block_local_data): Update now that we don't have
+	block_defs field to check anymore.
+	(dom_opt_initialize_block): Push a marker onto block_defs_stack.
+	(restore_currdefs_to_original_value): Use the new block_defs_stack
+	instead of a block local varray.
+	(dom_opt_finalize_block): Similarly.
+	(record_equivalencs_from_phis): Similarly.
+	(optimize_stmt, register_definitions_for_stmt): Similarly.
+
+2004-09-16  Andrew MacLeod  <amacleod@redhat.com>
+
+	PR tree-optimization/17517
+	* tree-ssa-copyrename.c (copy_rename_partition_coalesce): Don't 
+	coalesce same-root variables without checking for abnormal PHI usage.
+
+2004-09-16  Daniel Berlin  <dberlin@dberlin.org>
+	
+	* cfgloop.h (duplicate_loop):  Add prototype.
+	* cfgloopmanip.c (duplicate_loop): Make non-static.
+	* lambda-code.c (perfect_nestify): Factor out test whether
+	we can handle this loop into separate function.
+	Call it.
+	(can_convert_to_perfect_nest): New function.
+	(replace_uses_of_x_with_y): Add modify_stmt call.
+	* tree-loop-linear.c (linear_transform_loops): Call
+	rewrite_into_loop_closed_ssa and free_df.
+
+2004-09-16  Daniel Berlin  <dberlin@dberlin.org>
+
+	* lambda-code.c (invariant_in_loop): is_gimple_min_invariant is
+	loop invariant as well.
+	(perfect_nestify): new function.
+	(gcc_loop_to_lambda_loop): New parameters to track lower bounds,
+	upper bounds, and steps. 
+	Set outerinductionvar properly.
+	(gcc_loopnest_to_lambda_loopnest): Add loops and need_perfect
+	parameters.
+	Return NULL if we need a perfect loop and can't make one.
+	(lambda_loopnest_to_gcc_loopnest): Correct algorithm.
+	(not_interesting_stmt): New function.
+	(phi_loop_edge_uses_def): Ditto.
+	(stmt_uses_phi_result): Ditto.
+	(stmt_is_bumper_for_loop): Ditto.
+	(perfect_nest_p): Ditto.
+	(nestify_update_pending_stmts): Ditto.
+	(replace_uses_of_x_with_y): Ditto.
+	(stmt_uses_op): Ditto.
+	(perfect_nestify): Ditto.
+	* lambda-mat.c (lambda_matrix_id_p): New function.
+	* lambda-trans.c (lambda_trans_matrix_id_p): Ditto.
+	* lambda.h: Update prototypes.
+	* tree-loop-linear (linear_transform_loop): Use new
+	perfect_nest_p. Detect and ignore identity transform.
+	* tree-ssa-loop.c (pass_linear_transform): Use TODO_write_loop_closed.
+
+2004-09-16  Sebastian Pop  <pop@cri.ensmp.fr>
+
+	* tree-loop-linear.c (gather_interchange_stats): Add more comments.
+	Gather also strides of accessed data.  Pass in the data references 
+	array.
+	(try_interchange_loops): Add a new heuristic for handling the temporal 
+	locality.  Pass in the data references array.
+	(linear_transform_loops): Pass the data references array to
+	try_interchange_loops.
+
+2004-09-16  Kazu Hirata  <kazu@cs.umass.edu>
+
+	* doc/invoke.texi: Fix typos.  Follow spelling conventions.
+
+2004-09-16  Nathan Sidwell  <nathan@codesourcery.com>
+
+	* doc/c-tree.texi (Classes): Remove index entries for
+	TREE_VIA_{PUBLIC,PROTECTED,PRIVATE}.
+
+2004-09-16  Zdenek Dvorak  <rakdver@atrey.karlin.mff.cuni.cz>
+
+	* fold-const.c (fold): Fold difference of addresses.
+	(ptr_difference_const): Moved from tree-ssa-loop-ivopts, based on
+	get_inner_reference.
+	* tree-ssa-loop-ivopts.c (peel_address): Removed.
+	(ptr_difference_const): Moved to fold-const.c.
+	(split_address_cost): Use get_inner_reference instead of peel_address.
+	(ptr_difference_cost): Change type of diff to HOST_WIDE_INT.
+	* tree.h (ptr_difference_const): Export.
+
+	* tree-ssa-loop-ivopts.c (dump_iv, dump_use, dump_cand): Add induction
+	variable type to the dump.  Fix indentation.
+	(idx_find_step): Handle nonconstant array_ref_element_size and
+	array_ref_low_bound.
+	(idx_record_use): Handle array_ref_element_size and
+	array_ref_low_bound.
+	(find_interesting_uses_stmt): Handle memory = nontrivial_expression
+	statements correctly.
+	(get_computation_at, iv_value): Do not unshare expressions here.
+	(rewrite_use_outer): Unshare the expression before it is emitted
+	to code.
+	* tree-ssa-loop-niter.c (unsigned_type_for, signed_type_for):
+	Moved to tree.c.
+	* tree.c (unsigned_type_for, signed_type_for): Moved from
+	tree-ssa-loop-niter.c.  Use langhooks.
+	* tree.h (signed_type_for): Export.
+
+2004-09-16  David Edelsohn  <edelsohn@gnu.org>
+
+	* config/rs6000/rs6000.c (rs6000_xcoff_asm_named_section): Update
+	prototype.
+
 2004-09-15  Andrew Pinski  <pinskia@physics.uc.edu>
 
 	PR target/11572
--- /usr/src/SpecTests/sandbox-britten-memory/x86_64/mem-result/ChangeLog.cp	2004-09-15 20:40:34.000000000 +0000
+++ /usr/src/SpecTests/sandbox-britten-memory/gcc/gcc/cp/ChangeLog	2004-09-16 23:19:59.000000000 +0000
@@ -1,3 +1,42 @@
+2004-09-16  Mark Mitchell  <mark@codesourcery.com>
+
+	PR c++/17501
+	* parser.c (cp_parser_nested_name_specifier): Do not resolve
+	typename types if the user explicitly said "typename".
+
+2004-09-16  Andrew MacLeod  <amacleod@redhat.com>
+
+	* error.c (dump_decl): Make sure there is lang_specific info before 
+	checking for DTOR and CTOR decls.
+
+2004-09-16  Nathan Sidwell  <nathan@codesourcery.com>
+
+	* class.c (copy_virtuals): Remove.
+	(build_primary_vtable): Use copy_list directly.
+	(build_secondary_vtable): Likewise.
+	(update_vtable_entry_for_fn): Clear BV_CALL_INDEX here.
+	(create_vtable_ptr): Likewise.
+
+2004-09-16  Kazu Hirata  <kazu@cs.umass.edu>
+
+	* search.c: Follow spelling conventions.
+
+2004-09-16  Nathan Sidwell  <nathan@codesourcery.com>
+
+	* cp-tree.h (struct lang_type_class): Make pure_virtuals a
+	VEC(tree).
+	(CLASSTYPE_INLINE_FRIENDS, CLASSTYPE_PURE_VIRTUALS): Update
+	comments.
+	* call.c (build_new_method_call): Don't confirm a pure virtual is
+	in CLASSTYPE_PURE_VIRTUALS.  Reorder checks. Make it a warning.
+	* class.c (check_methods): CLASSTYPE_INLINE_FRIENDS is a VEC(tree).
+	(fixup_inline_methods, finish_struct): Likewise.
+	* decl.c (finish_method): Likewise.
+	* search.c (dfs_get_pure_virtuals, get_pure_virtuals):
+	CLASSTYPE_PURE_VIRTUALS is a VEC(tree).
+	* typeck2.c (abstract_virtuals_error): Likewise. Truncate the
+	vector to avoid repeating the list in error messages.
+
 2004-09-15  Mark Mitchell  <mark@codesourcery.com>
 
 	* cp-objcp-common.h (LANG_HOOKS_COMDAT_GROUP): Define.

I am a friendly script that cares about memory consumption in GCC.  Please contact
jh@suse.cz if something goes wrong.

The results can be reproduced by building the compiler with
--enable-gather-detailed-mem-stats, targeting x86-64, and compiling the
preprocessed combine.c or the testcase from PR8361 with:

-fmem-report --param=ggc-min-heapsize=1024 --param=ggc-min-expand=1 -Ox -Q
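
As a rough sketch of the reproduction step, the command line for each optimization level could be assembled as follows. The flags are copied verbatim from the line above, with `-Ox` replaced by the level under test. The compiler name `gcc`, the input name `combine.i`, the added `-c` option and the helper name `build_command` are assumptions for illustration only; the message does not say how the script invokes the compiler.

```python
from typing import List

# Flags copied verbatim from the message, in the order given there.
# "-Ox" stands for the optimization level and is filled in per run.
FLAGS_BEFORE_LEVEL = [
    "-fmem-report",
    "--param=ggc-min-heapsize=1024",
    "--param=ggc-min-expand=1",
]
FLAGS_AFTER_LEVEL = ["-Q"]


def build_command(compiler: str, source: str, level: int) -> List[str]:
    """Return the argument list for one run at -O<level>.

    Assumption: "-c" is added here so the run stops after compilation;
    it is not part of the flags quoted in the message.
    """
    if level not in (0, 1, 2, 3):
        raise ValueError("the report only covers -O0 through -O3")
    return [
        compiler,
        *FLAGS_BEFORE_LEVEL,
        f"-O{level}",
        *FLAGS_AFTER_LEVEL,
        "-c",
        source,
    ]


if __name__ == "__main__":
    # "gcc" and "combine.i" are placeholder names.
    for level in range(4):
        print(" ".join(build_command("gcc", "combine.i", level)))
```

The list form can be passed directly to `subprocess.run` if the commands are to be executed rather than printed.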

The memory consumption summary appears in the dump after the detailed listing
of the places where memory is allocated.  Peak memory consumption is computed
by looking for the maximal value in the {GC XXXX -> YYYY} reports.
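
The peak computation described above can be sketched as a scan over the compiler's output. This is a minimal illustration, not the script's actual code: it assumes each collection is logged as `{GC <before>k -> <after>k}` with sizes in kilobytes (based on the `{GC XXXX -> YYYY}` placeholder), and `summarize_gc` and the sample lines are made up for the example.

```python
import re
from typing import Dict, Iterable

# Assumed shape of one collection report: "{GC 9294k -> 8606k}".
GC_REPORT = re.compile(r"\{GC (\d+)k -> (\d+)k\}")


def summarize_gc(lines: Iterable[str]) -> Dict[str, int]:
    """Collect peak and count statistics from GC reports in the output."""
    peak_before = peak_after = max_released = runs = 0
    for line in lines:
        # A line may hold several reports mixed with other output.
        for match in GC_REPORT.finditer(line):
            before, after = int(match.group(1)), int(match.group(2))
            peak_before = max(peak_before, before)
            peak_after = max(peak_after, after)
            max_released = max(max_released, before - after)
            runs += 1
    return {
        "peak_before_k": peak_before,
        "peak_after_k": peak_after,
        "max_released_k": max_released,
        "runs": runs,
    }


if __name__ == "__main__":
    # Invented sample output, for illustration only.
    sample = [
        " foo bar {GC 5000k -> 2130k} baz",
        " qux {GC 9294k -> 8606k}",
    ]
    for name, value in summarize_gc(sample).items():
        print(f"{name}: {value}")
```

The four values correspond to the "Peak memory use before GGC", "Peak memory use after GGC", "Maximum of released memory in single GGC run" and "GGC runs" fields of the report.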

Yours, the testing script.

