This is the mail archive of the
gcc-regression@gcc.gnu.org
mailing list for the GCC project.
Some aspect of GCC memory consumption increased by recent patch
- From: gcctest at suse dot de
- To: jh at suse dot cz, gcc-regression at gcc dot gnu dot org
- Date: Fri, 17 Sep 2004 00:11:53 +0000
- Subject: Some aspect of GCC memory consumption increased by recent patch
Hi,
Comparing memory consumption when compiling combine.i and generate-3.4.ii, I got:
comparing combine.c compilation at -O0 level:
Overall memory needed: 17776k
Peak memory use before GGC: 9294k
Peak memory use after GGC: 8606k
Maximum of released memory in single GGC run: 2870k
Garbage: 42611k
Leak: 6103k
Overhead: 5586k
GGC runs: 365
comparing combine.c compilation at -O1 level:
Amount of produced GGC garbage decreased from 80496k to 72823k, overall -10.54%
Overall memory needed: 18660k -> 18448k
Peak memory use before GGC: 9786k -> 9645k
Peak memory use after GGC: 8800k -> 8804k
Maximum of released memory in single GGC run: 2068k -> 2034k
Garbage: 80496k -> 72823k
Leak: 6669k -> 6669k
Overhead: 14316k -> 12311k
GGC runs: 598 -> 590
comparing combine.c compilation at -O2 level:
Amount of produced GGC garbage decreased from 96605k to 88714k, overall -8.90%
Overall memory needed: 22272k -> 22096k
Peak memory use before GGC: 12769k
Peak memory use after GGC: 12610k
Maximum of released memory in single GGC run: 2578k
Garbage: 96605k -> 88714k
Leak: 6424k -> 6424k
Overhead: 19132k -> 17053k
GGC runs: 587 -> 582
comparing combine.c compilation at -O3 level:
Overall memory allocated via mmap and sbrk increased from 23752k to 24072k, overall 1.35%
Amount of produced GGC garbage decreased from 129502k to 117766k, overall -9.97%
Overall memory needed: 23752k -> 24072k
Peak memory use before GGC: 13443k -> 13103k
Peak memory use after GGC: 12733k -> 12739k
Maximum of released memory in single GGC run: 3484k -> 3447k
Garbage: 129502k -> 117766k
Leak: 6990k -> 6934k
Overhead: 25298k -> 22130k
GGC runs: 650 -> 644
comparing insn-attrtab.c compilation at -O0 level:
Overall memory needed: 132860k
Peak memory use before GGC: 77100k
Peak memory use after GGC: 45185k
Maximum of released memory in single GGC run: 42129k
Garbage: 159511k
Leak: 10618k
Overhead: 19798k
GGC runs: 310
comparing insn-attrtab.c compilation at -O1 level:
Overall memory needed: 150312k -> 149896k
Peak memory use before GGC: 97475k -> 97456k
Peak memory use after GGC: 71695k -> 71693k
Maximum of released memory in single GGC run: 41239k
Garbage: 491599k -> 490782k
Leak: 11052k -> 11051k
Overhead: 85803k -> 85748k
GGC runs: 469 -> 467
comparing insn-attrtab.c compilation at -O2 level:
Overall memory allocated via mmap and sbrk increased from 224512k to 240896k, overall 7.30%
Overall memory needed: 224512k -> 240896k
Peak memory use before GGC: 113028k -> 113009k
Peak memory use after GGC: 87248k -> 87246k
Maximum of released memory in single GGC run: 35988k
Garbage: 544311k -> 543380k
Leak: 11207k -> 11207k
Overhead: 95955k -> 95865k
GGC runs: 389
comparing insn-attrtab.c compilation at -O3 level:
Overall memory needed: 224504k -> 224644k
Peak memory use before GGC: 113029k -> 113009k
Peak memory use after GGC: 87248k -> 87246k
Maximum of released memory in single GGC run: 35988k
Garbage: 546684k -> 544517k
Leak: 11271k -> 11247k
Overhead: 96739k -> 96265k
GGC runs: 398 -> 396
comparing Gerald's testcase PR8361 compilation at -O0 level:
Overall memory needed: 114840k
Peak memory use before GGC: 92007k
Peak memory use after GGC: 90474k
Maximum of released memory in single GGC run: 20897k
Garbage: 274940k
Leak: 59854k
Overhead: 35160k
GGC runs: 556
comparing Gerald's testcase PR8361 compilation at -O1 level:
Amount of produced GGC garbage decreased from 695313k to 656225k, overall -5.96%
Overall memory needed: 126208k -> 126320k
Peak memory use before GGC: 96400k
Peak memory use after GGC: 89739k
Maximum of released memory in single GGC run: 20070k
Garbage: 695313k -> 656225k
Leak: 62394k -> 62367k
Overhead: 151706k -> 136567k
GGC runs: 819 -> 807
comparing Gerald's testcase PR8361 compilation at -O2 level:
Amount of produced GGC garbage decreased from 763238k to 722293k, overall -5.67%
Overall memory needed: 127456k -> 127420k
Peak memory use before GGC: 96400k
Peak memory use after GGC: 89740k
Maximum of released memory in single GGC run: 20070k
Garbage: 763238k -> 722293k
Leak: 62959k -> 62927k
Overhead: 182222k -> 166220k
GGC runs: 854 -> 843
comparing Gerald's testcase PR8361 compilation at -O3 level:
Amount of produced GGC garbage decreased from 805932k to 762557k, overall -5.69%
Overall memory needed: 126228k -> 125860k
Peak memory use before GGC: 92885k -> 92734k
Peak memory use after GGC: 90751k -> 90722k
Maximum of released memory in single GGC run: 20814k
Garbage: 805932k -> 762557k
Leak: 63372k -> 63334k
Overhead: 195463k -> 176964k
GGC runs: 841 -> 833
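
The percentages in the "increased"/"decreased" summary lines above are not explained in this message. The sketch below is an inference from the printed numbers, not the tester's actual code: the figures appear to match a signed change measured against the smaller of the two values, so decreases are relative to the new value and increases to the old one. The function name `relative_change` is hypothetical.

```python
def relative_change(old_kb, new_kb):
    """Signed percent change, measured against the smaller of the two values.

    ASSUMPTION: this formula is inferred from the numbers printed in the
    report; the real script may compute it differently.
    """
    return 100.0 * (new_kb - old_kb) / min(old_kb, new_kb)


# (old, new, percentage as printed in the report above)
reported = [
    (80496, 72823, -10.54),    # combine.c -O1, garbage
    (96605, 88714, -8.90),     # combine.c -O2, garbage
    (23752, 24072, 1.35),      # combine.c -O3, mmap/sbrk
    (129502, 117766, -9.97),   # combine.c -O3, garbage
    (224512, 240896, 7.30),    # insn-attrtab.c -O2, mmap/sbrk
    (695313, 656225, -5.96),   # PR8361 -O1, garbage
    (763238, 722293, -5.67),   # PR8361 -O2, garbage
    (805932, 762557, -5.69),   # PR8361 -O3, garbage
]

for old, new, printed in reported:
    computed = relative_change(old, new)
    print(f"{old}k -> {new}k: computed {computed:+.4f}%, reported {printed:+.2f}%")
```

Every computed value is within 0.01 percentage point of the printed one. The small remaining differences are presumably because the script works from values that have not been rounded to whole kilobytes.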
Head of the ChangeLog is:
--- /usr/src/SpecTests/sandbox-britten-memory/x86_64/mem-result/ChangeLog 2004-09-16 12:34:43.000000000 +0000
+++ /usr/src/SpecTests/sandbox-britten-memory/gcc/gcc/ChangeLog 2004-09-16 23:19:56.000000000 +0000
@@ -1,3 +1,168 @@
+2004-09-16 Diego Novillo <dnovillo@redhat.com>
+
+ * tree-ssa-operands.c (add_call_clobber_ops): Make read-only
+ test apply only to TREE_STATIC and DECL_EXTERNAL.
+
+2004-09-16 Zdenek Dvorak <rakdver@atrey.karlin.mff.cuni.cz>
+
+ * Makefile.in (tree-cfg.o): Add CFGLAYOUT_H dependency.
+ * basic-block.h (get_dominated_by_region): Declare.
+ * dominance.c (get_dominated_by_region): New function.
+ * tree-cfg.c: Include cfglayout.h.
+ (tree_duplicate_bb): Duplicate also phi nodes.
+ (struct ssa_name_map_entry): New type.
+ (add_phi_args_after_copy_bb, add_phi_args_after_copy,
+ ssa_name_map_entry_hash, ssa_name_map_entry_eq,
+ allocate_ssa_names, rewrite_to_new_ssa_names_def,
+ rewrite_to_new_ssa_names_use, rewrite_to_new_ssa_names_bb,
+ rewrite_to_new_ssa_names, tree_duplicate_sese_region): New functions.
+ * tree-flow.h (tree_duplicate_sese_region, add_phi_args_after_copy_bb,
+ add_phi_args_after_copy, rewrite_to_new_ssa_names_bb,
+ rewrite_to_new_ssa_names, allocate_ssa_names,
+ rewrite_into_loop_closed_ssa, verify_loop_closed_ssa): Declare.
+ * tree-ssa-loop-ch.c (duplicate_blocks): Removed.
+ (copy_loop_headers): Use tree_duplicate_sese_region.
+
+2004-09-16 Frank Ch. Eigler <fche@redhat.com>
+
+ * profile.c (branch_prob): Restore support for USE_MAPPED_LOCATION.
+
+2004-09-16 Jeff Law <law@redhat.com>
+
+ * tree-into-ssa.c (block_defs_stack): New toplevel varray.
+ (rewrite_block_data): Remove, no longer used.
+ (rewrite_initialize_block_local_data): Remove, no longer used.
+ (rewrite_initialize_block): Mark parameters as unused as needed.
+ Change references to the block local block_defs to be block_defs_stack.
+ Push a marker onto the block_defs_stack.
+ (ssa_rewrite_initialize_block): Similarly.
+ (rewrite_stmt, ssa_rewrite_stmt): Similarly.
+ (ssa_register_new_def): No longer needs varray argument. Use
+ block_defs_stack instead. No longer handle possibly null block_defs
+ varray. Reverse order of items we push on the stack to make it
+ easier to identify our marker.
+ (register_new_def): No longer handle possibly null block_defs
+ varray.
+ (rewrite_finalize_block): Revamp to look for markers in the global
+ block_defs_stack varray rather than wiping a block local varray.
+ Mark arguments as unused as needed.
+ (ssa_rewrite_finalize_block): Similarly.
+ (rewrite_into_ssa): Update initialization of dom walker structure
+ to reflect that we don't need block local data anymore. Initialize
+ the block_defs_stack varray.
+ (rewrite_ssa_into_ssa): Similarly.
+ * tree-ssa-dom.c (block_defs_stack): New toplevel varray.
+ (struct dom_walk_data): Kill block_defs field.
+ (tree_ssa_dominator_optimize): Initialize block_defs_stack.
+ (thread_across_edge): Use the global block_defs_stack instead of
+ the old block_defs varray.
+ (dom_opt_initialize_block_local_data): Update now that we don't have
+ block_defs field to check anymore.
+ (dom_opt_initialize_block): Push a marker onto block_defs_stack.
+ (restore_currdefs_to_original_value): Use the new block_defs_stack
+ instead of a block local varray.
+ (dom_opt_finalize_block): Similarly.
+ (record_equivalencs_from_phis): Similarly.
+ (optimize_stmt, register_definitions_for_stmt): Similarly.
+
+2004-09-16 Andrew MacLeod <amacleod@redhat.com>
+
+ PR tree-optimization/17517
+ * tree-ssa-copyrename.c (copy_rename_partition_coalesce): Don't
+ coalesce same-root variables without checking for abnormal PHI usage.
+
+2004-09-16 Daniel Berlin <dberlin@dberlin.org>
+
+ * cfgloop.h (duplicate_loop): Add prototype.
+ * cfgloopmanip.c (duplicate_loop): Make non-static.
+ * lambda-code.c (perfect_nestify): Factor out test whether
+ we can handle this loop into separate function.
+ Call it.
+ (can_convert_to_perfect_nest): New function.
+ (replace_uses_of_x_with_y): Add modify_stmt call.
+ * tree-loop-linear.c (linear_transform_loops): Call
+ rewrite_into_loop_closed_ssa and free_df.
+
+2004-09-16 Daniel Berlin <dberlin@dberlin.org>
+
+ * lambda-code.c (invariant_in_loop): is_gimple_min_invariant is
+ loop invariant as well.
+ (perfect_nestify): new function.
+ (gcc_loop_to_lambda_loop): New parameters to track lower bounds,
+ upper bounds, and steps.
+ Set outerinductionvar properly.
+ (gcc_loopnest_to_lambda_loopnest): Add loops and need_perfect
+ parameters.
+ Return NULL if we need a perfect loop and can't make one.
+ (lambda_loopnest_to_gcc_loopnest): Correct algorithm.
+ (not_interesting_stmt): New function.
+ (phi_loop_edge_uses_def): Ditto.
+ (stmt_uses_phi_result): Ditto.
+ (stmt_is_bumper_for_loop): Ditto.
+ (perfect_nest_p): Ditto.
+ (nestify_update_pending_stmts): Ditto.
+ (replace_uses_of_x_with_y): Ditto.
+ (stmt_uses_op): Ditto.
+ (perfect_nestify): Ditto.
+ * lambda-mat.c (lambda_matrix_id_p): New function.
+ * lambda-trans.c (lambda_trans_matrix_id_p): Ditto.
+ * lambda.h: Update prototypes.
+ * tree-loop-linear (linear_transform_loop): Use new
+ perfect_nest_p. Detect and ignore identity transform.
+ * tree-ssa-loop.c (pass_linear_transform): Use TODO_write_loop_closed.
+
+2004-09-16 Sebastian Pop <pop@cri.ensmp.fr>
+
+ * tree-loop-linear.c (gather_interchange_stats): Add more comments.
+ Gather also strides of accessed data. Pass in the data references
+ array.
+ (try_interchange_loops): Add a new heuristic for handling the temporal
+ locality. Pass in the data references array.
+ (linear_transform_loops): Pass the data references array to
+ try_interchange_loops.
+
+2004-09-16 Kazu Hirata <kazu@cs.umass.edu>
+
+ * doc/invoke.texi: Fix typos. Follow spelling conventions.
+
+2004-09-16 Nathan Sidwell <nathan@codesourcery.com>
+
+ * doc/c-tree.texi (Classes): Remove index entries for
+ TREE_VIA_{PUBLIC,PROTECTED,PRIVATE}.
+
+2004-09-16 Zdenek Dvorak <rakdver@atrey.karlin.mff.cuni.cz>
+
+ * fold-const.c (fold): Fold difference of addresses.
+ (ptr_difference_const): Moved from tree-ssa-loop-ivopts, based on
+ get_inner_reference.
+ * tree-ssa-loop-ivopts.c (peel_address): Removed.
+ (ptr_difference_const): Moved to fold-const.c.
+ (split_address_cost): Use get_inner_reference instead of peel_address.
+ (ptr_difference_cost): Change type of diff to HOST_WIDE_INT.
+ * tree.h (ptr_difference_const): Export.
+
+ * tree-ssa-loop-ivopts.c (dump_iv, dump_use, dump_cand): Add induction
+ variable type to the dump. Fix indentation.
+ (idx_find_step): Handle nonconstant array_ref_element_size and
+ array_ref_low_bound.
+ (idx_record_use): Handle array_ref_element_size and
+ array_ref_low_bound.
+ (find_interesting_uses_stmt): Handle memory = nontrivial_expression
+ statements correctly.
+ (get_computation_at, iv_value): Do not unshare expressions here.
+ (rewrite_use_outer): Unshare the expression before it is emitted
+ to code.
+ * tree-ssa-loop-niter.c (unsigned_type_for, signed_type_for):
+ Moved to tree.c.
+ * tree.c (unsigned_type_for, signed_type_for): Moved from
+ tree-ssa-loop-niter.c. Use langhooks.
+ * tree.h (signed_type_for): Export.
+
+2004-09-16 David Edelsohn <edelsohn@gnu.org>
+
+ * config/rs6000/rs6000.c (rs6000_xcoff_asm_named_section): Update
+ prototype.
+
2004-09-15 Andrew Pinski <pinskia@physics.uc.edu>
PR target/11572
--- /usr/src/SpecTests/sandbox-britten-memory/x86_64/mem-result/ChangeLog.cp 2004-09-15 20:40:34.000000000 +0000
+++ /usr/src/SpecTests/sandbox-britten-memory/gcc/gcc/cp/ChangeLog 2004-09-16 23:19:59.000000000 +0000
@@ -1,3 +1,42 @@
+2004-09-16 Mark Mitchell <mark@codesourcery.com>
+
+ PR c++/17501
+ * parser.c (cp_parser_nested_name_specifier): Do not resolve
+ typename types if the user explicitly said "typename".
+
+2004-09-16 Andrew MacLeod <amacleod@redhat.com>
+
+ * error.c (dump_decl): Make sure there is lang_specific info before
+ checking for DTOR and CTOR decls.
+
+2004-09-16 Nathan Sidwell <nathan@codesourcery.com>
+
+ * class.c (copy_virtuals): Remove.
+ (build_primary_vtable): Use copy_list directly.
+ (build_secondary_vtable): Likewise.
+ (update_vtable_entry_for_fn): Clear BV_CALL_INDEX here.
+ (create_vtable_ptr): Likewise.
+
+2004-09-16 Kazu Hirata <kazu@cs.umass.edu>
+
+ * search.c: Follow spelling conventions.
+
+2004-09-16 Nathan Sidwell <nathan@codesourcery.com>
+
+ * cp-tree.h (struct lang_type_class): Make pure_virtuals a
+ VEC(tree).
+ (CLASSTYPE_INLINE_FRIENDS, CLASSTYPE_PURE_VIRTUALS): Update
+ comments.
+ * call.c (build_new_method_call): Don't confirm a pure virtual is
+ in CLASSTYPE_PURE_VIRTUALS. Reorder checks. Make it a warning.
+ * class.c (check_methods): CLASSTYPE_INLINE_FRIENDS is a VEC(tree).
+ (fixup_inline_methods, finish_struct): Likewise.
+ * decl.c (finish_method): Likewise.
+ * search.c (dfs_get_pure_virtuals, get_pure_virtuals):
+ CLASSTYPE_PURE_VIRTUALS is a VEC(tree).
+ * typeck2.c (abstract_virtuals_error): Likewise. Truncate the
+ vector to avoid repeating the list in error messages.
+
2004-09-15 Mark Mitchell <mark@codesourcery.com>
* cp-objcp-common.h (LANG_HOOKS_COMDAT_GROUP): Define.
I am a friendly script that monitors memory consumption in GCC. Please contact
jh@suse.cz if something goes wrong.
The results can be reproduced by building the compiler with
--enable-gather-detailed-mem-stats, targeting x86-64, and compiling preprocessed
combine.c or the testcase from PR8361 with:
-fmem-report --param=ggc-min-heapsize=1024 --param=ggc-min-expand=1 -Ox -Q
The memory consumption summary appears in the dump after the detailed listing
of the places where memory is allocated. Peak memory consumption is computed
by taking the maximal value in the {GC XXXX -> YYYY} reports.
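
The peak computation described above can be sketched as follows. This is a minimal illustration, not the tester's actual code. It assumes the per-collection lines in the compiler's output look like `{GC 9294k -> 8606k}`, with sizes in kilobytes. The function name `peak_memory` and the sample log are hypothetical.

```python
import re

# ASSUMPTION: per-collection reports have the form "{GC 9294k -> 8606k}",
# i.e. heap size before and after each collection, in kilobytes.
GC_LINE = re.compile(r"\{GC (\d+)k -> (\d+)k\}")


def peak_memory(log_text):
    """Return (peak_before_kb, peak_after_kb, gc_runs), or None if no reports."""
    pairs = [(int(before), int(after))
             for before, after in GC_LINE.findall(log_text)]
    if not pairs:
        return None
    peak_before = max(before for before, _ in pairs)
    peak_after = max(after for _, after in pairs)
    return peak_before, peak_after, len(pairs)


# Made-up sample output, for illustration only.
sample_log = """
 {GC 4100k -> 3900k} {GC 9294k -> 8606k}
 {GC 7000k -> 6500k}
"""

print(peak_memory(sample_log))
```

The three returned values correspond to the "Peak memory use before GGC", "Peak memory use after GGC" and "GGC runs" lines of the report.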
Your testing script.