This is the mail archive of the gcc-regression@gcc.gnu.org mailing list for the GCC project.



A recent patch increased GCC's memory consumption in some cases!


Hi,

I am a friendly script caring about memory consumption in GCC.  Please
contact jh@suse.cz if something is going wrong.

Comparing memory consumption on compilation of combine.i, insn-attrtab.i,
and generate-3.4.ii I got:


comparing empty function compilation at -O0 level:
    Overall memory needed: 7387k
    Peak memory use before GGC: 2263k
    Peak memory use after GGC: 1952k
    Maximum of released memory in single GGC run: 311k
    Garbage: 446k
    Leak: 2285k
    Overhead: 456k
    GGC runs: 3

comparing empty function compilation at -O0 -g level:
    Overall memory needed: 7403k
    Peak memory use before GGC: 2290k
    Peak memory use after GGC: 1979k
    Maximum of released memory in single GGC run: 311k
    Garbage: 449k
    Leak: 2318k
    Overhead: 461k
    GGC runs: 3

comparing empty function compilation at -O1 level:
    Overall memory needed: 7519k
    Peak memory use before GGC: 2263k
    Peak memory use after GGC: 1952k
    Maximum of released memory in single GGC run: 311k
    Garbage: 452k -> 452k
    Leak: 2288k
    Overhead: 457k -> 457k
    GGC runs: 4

comparing empty function compilation at -O2 level:
    Overall memory needed: 7527k
    Peak memory use before GGC: 2263k
    Peak memory use after GGC: 1952k
    Maximum of released memory in single GGC run: 311k
    Garbage: 455k -> 455k
    Leak: 2288k
    Overhead: 457k -> 457k
    GGC runs: 4

comparing empty function compilation at -O3 level:
    Overall memory needed: 7527k
    Peak memory use before GGC: 2263k
    Peak memory use after GGC: 1952k
    Maximum of released memory in single GGC run: 311k
    Garbage: 455k -> 455k
    Leak: 2288k
    Overhead: 457k -> 457k
    GGC runs: 4

comparing combine.c compilation at -O0 level:
    Overall memory needed: 17731k
    Peak memory use before GGC: 9264k
    Peak memory use after GGC: 8851k
    Maximum of released memory in single GGC run: 2578k
    Garbage: 37095k
    Leak: 6580k
    Overhead: 5053k
    GGC runs: 282

comparing combine.c compilation at -O0 -g level:
    Overall memory needed: 19875k
    Peak memory use before GGC: 10880k
    Peak memory use after GGC: 10511k
    Maximum of released memory in single GGC run: 2352k
    Garbage: 37692k
    Leak: 9473k
    Overhead: 5759k
    GGC runs: 270

comparing combine.c compilation at -O1 level:
  Amount of produced GGC garbage decreased from 56032k to 51580k, overall -8.63%
    Overall memory needed: 35279k -> 35555k
    Peak memory use before GGC: 19348k
    Peak memory use after GGC: 19155k
    Maximum of released memory in single GGC run: 2181k -> 2273k
    Garbage: 56032k -> 51580k
    Leak: 6609k -> 6614k
    Overhead: 6257k -> 6003k
    GGC runs: 353 -> 348

comparing combine.c compilation at -O2 level:
  Overall memory allocated via mmap and sbrk increased from 38347k to 39975k, overall 4.25%
    Overall memory needed: 38347k -> 39975k
    Peak memory use before GGC: 19415k
    Peak memory use after GGC: 19213k
    Maximum of released memory in single GGC run: 2151k -> 2200k
    Garbage: 68117k -> 67677k
    Leak: 6733k -> 6729k
    Overhead: 8090k -> 8046k
    GGC runs: 399 -> 401

comparing combine.c compilation at -O3 level:
  Overall memory allocated via mmap and sbrk increased from 42643k to 44215k, overall 3.69%
  Peak amount of GGC memory allocated before garbage collecting increased from 19612k to 19713k, overall 0.51%
  Peak amount of GGC memory still allocated after garbage collecting increased from 19300k to 19379k, overall 0.41%
  Amount of produced GGC garbage increased from 89344k to 92702k, overall 3.76%
  Amount of memory still referenced at the end of compilation increased from 6836k to 6843k, overall 0.11%
    Overall memory needed: 42643k -> 44215k
    Peak memory use before GGC: 19612k -> 19713k
    Peak memory use after GGC: 19300k -> 19379k
    Maximum of released memory in single GGC run: 3636k -> 3715k
    Garbage: 89344k -> 92702k
    Leak: 6836k -> 6843k
    Overhead: 11073k -> 11381k
    GGC runs: 423 -> 426

comparing insn-attrtab.c compilation at -O0 level:
  Overall memory allocated via mmap and sbrk increased from 100527k to 102855k, overall 2.32%
    Overall memory needed: 100527k -> 102855k
    Peak memory use before GGC: 68627k
    Peak memory use after GGC: 44730k
    Maximum of released memory in single GGC run: 36429k
    Garbage: 130923k
    Leak: 9583k
    Overhead: 16927k
    GGC runs: 212

comparing insn-attrtab.c compilation at -O0 -g level:
    Overall memory needed: 104223k -> 104355k
    Peak memory use before GGC: 69788k
    Peak memory use after GGC: 45998k
    Maximum of released memory in single GGC run: 36429k
    Garbage: 132399k
    Leak: 11067k
    Overhead: 17324k
    GGC runs: 210

comparing insn-attrtab.c compilation at -O1 level:
  Overall memory allocated via mmap and sbrk decreased from 145595k to 124791k, overall -16.67%
  Peak amount of GGC memory allocated before garbage collecting run decreased from 85948k to 73426k, overall -17.05%
  Peak amount of GGC memory still allocated after garbage collecting decreased from 80070k to 69350k, overall -15.46%
  Amount of produced GGC garbage decreased from 264605k to 227633k, overall -16.24%
    Overall memory needed: 145595k -> 124791k
    Peak memory use before GGC: 85948k -> 73426k
    Peak memory use after GGC: 80070k -> 69350k
    Maximum of released memory in single GGC run: 32839k -> 32798k
    Garbage: 264605k -> 227633k
    Leak: 9403k -> 9405k
    Overhead: 27652k -> 27090k
    GGC runs: 226

comparing insn-attrtab.c compilation at -O2 level:
  Peak amount of GGC memory allocated before garbage collecting run decreased from 87235k to 74717k, overall -16.75%
  Peak amount of GGC memory still allocated after garbage collecting decreased from 80140k to 69433k, overall -15.42%
  Amount of produced GGC garbage decreased from 300011k to 262915k, overall -14.11%
    Overall memory needed: 200091k -> 195507k
    Peak memory use before GGC: 87235k -> 74717k
    Peak memory use after GGC: 80140k -> 69433k
    Maximum of released memory in single GGC run: 30044k -> 29994k
    Garbage: 300011k -> 262915k
    Leak: 9401k -> 9402k
    Overhead: 33248k -> 32709k
    GGC runs: 243 -> 245

comparing insn-attrtab.c compilation at -O3 level:
  Overall memory allocated via mmap and sbrk increased from 197835k to 205003k, overall 3.62%
    Overall memory needed: 197835k -> 205003k
    Peak memory use before GGC: 87249k -> 84750k
    Peak memory use after GGC: 80154k -> 78009k
    Maximum of released memory in single GGC run: 30105k -> 30928k
    Garbage: 300678k -> 291982k
    Leak: 9405k -> 9406k
    Overhead: 33451k -> 33442k
    GGC runs: 244 -> 246

comparing Gerald's testcase PR8361 compilation at -O0 level:
    Overall memory needed: 147077k -> 147067k
    Peak memory use before GGC: 90246k
    Peak memory use after GGC: 89353k
    Maximum of released memory in single GGC run: 17774k
    Garbage: 207955k
    Leak: 49212k
    Overhead: 23912k
    GGC runs: 409

comparing Gerald's testcase PR8361 compilation at -O0 -g level:
    Overall memory needed: 164865k -> 164875k
    Peak memory use before GGC: 102853k
    Peak memory use after GGC: 101834k
    Maximum of released memory in single GGC run: 18129k
    Garbage: 214506k
    Leak: 72527k
    Overhead: 29809k
    GGC runs: 384

comparing Gerald's testcase PR8361 compilation at -O1 level:
  Overall memory allocated via mmap and sbrk increased from 141920k to 151363k, overall 6.65%
    Overall memory needed: 141920k -> 151363k
    Peak memory use before GGC: 101928k
    Peak memory use after GGC: 100917k
    Maximum of released memory in single GGC run: 17236k
    Garbage: 344337k -> 338439k
    Leak: 50302k -> 50309k
    Overhead: 30468k -> 29926k
    GGC runs: 527 -> 524

comparing Gerald's testcase PR8361 compilation at -O2 level:
  Overall memory allocated via mmap and sbrk increased from 143988k to 156835k, overall 8.92%
  Amount of produced GGC garbage increased from 375659k to 382735k, overall 1.88%
    Overall memory needed: 143988k -> 156835k
    Peak memory use before GGC: 102595k -> 102534k
    Peak memory use after GGC: 101527k -> 101517k
    Maximum of released memory in single GGC run: 17233k
    Garbage: 375659k -> 382735k
    Leak: 51032k -> 51026k
    Overhead: 34760k -> 35198k
    GGC runs: 564 -> 570

comparing Gerald's testcase PR8361 compilation at -O3 level:
  Overall memory allocated via mmap and sbrk increased from 146240k to 159803k, overall 9.27%
  Amount of produced GGC garbage increased from 394625k to 412704k, overall 4.58%
    Overall memory needed: 146240k -> 159803k
    Peak memory use before GGC: 104355k -> 104352k
    Peak memory use after GGC: 103310k
    Maximum of released memory in single GGC run: 17610k
    Garbage: 394625k -> 412704k
    Leak: 51318k -> 51309k
    Overhead: 36321k -> 37524k
    GGC runs: 572 -> 585

comparing PR rtl-optimization/28071 testcase compilation at -O0 level:
    Overall memory needed: 244771k -> 244770k
    Peak memory use before GGC: 80963k
    Peak memory use after GGC: 58702k
    Maximum of released memory in single GGC run: 44134k
    Garbage: 142205k
    Leak: 7613k
    Overhead: 24559k
    GGC runs: 79

comparing PR rtl-optimization/28071 testcase compilation at -O0 -g level:
    Overall memory needed: 245591k -> 245594k
    Peak memory use before GGC: 81609k
    Peak memory use after GGC: 59348k
    Maximum of released memory in single GGC run: 44123k
    Garbage: 142415k
    Leak: 9381k
    Overhead: 25054k
    GGC runs: 89

comparing PR rtl-optimization/28071 testcase compilation at -O1 level:
  Overall memory allocated via mmap and sbrk increased from 244163k to 253839k, overall 3.96%
    Overall memory needed: 244163k -> 253839k
    Peak memory use before GGC: 84282k -> 84262k
    Peak memory use after GGC: 74847k
    Maximum of released memory in single GGC run: 36149k
    Garbage: 221623k -> 221602k
    Leak: 20856k -> 20856k
    Overhead: 30444k -> 30442k
    GGC runs: 81

comparing PR rtl-optimization/28071 testcase compilation at -O2 level:
    Overall memory needed: 497627k -> 507535k
    Peak memory use before GGC: 79833k
    Peak memory use after GGC: 74847k
    Maximum of released memory in single GGC run: 33439k
    Garbage: 228654k -> 228633k
    Leak: 20946k -> 20946k
    Overhead: 32525k -> 32524k
    GGC runs: 91

comparing PR rtl-optimization/28071 testcase compilation at -O3 -fno-tree-pre -fno-tree-fre level:
    Overall memory needed: 1268775k -> 1269191k
    Peak memory use before GGC: 201851k
    Peak memory use after GGC: 190309k
    Maximum of released memory in single GGC run: 80677k -> 80679k
    Garbage: 370228k -> 370229k
    Leak: 46312k
    Overhead: 48627k -> 48626k
    GGC runs: 70
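
The "overall N%" figures in the summary lines above can be cross-checked from the "old -> new" pairs. The sketch below is not the script's own code; it assumes the change is measured against the smaller of the two values, which is what the printed numbers are consistent with (increases relative to the old value, decreases relative to the new one). The function name relative_change is hypothetical.

```python
def relative_change(old_k: int, new_k: int) -> float:
    """Signed change in percent, measured against the smaller value.

    Assumption: dividing by min(old, new) is inferred from the figures
    in this report, not taken from the testing script itself.
    """
    base = min(old_k, new_k)
    if base <= 0:
        raise ValueError("memory figures must be positive")
    return (new_k - old_k) * 100.0 / base


# Two pairs copied from the combine.c results above.
samples = [
    ("combine.c -O2 overall memory", 38347, 39975),
    ("combine.c -O1 garbage", 56032, 51580),
]
for label, old_k, new_k in samples:
    change = relative_change(old_k, new_k)
    print(f"{label}: {old_k}k -> {new_k}k, overall {change:.2f}%")
```

Because the report prints values rounded to whole kilobytes, a recomputed percentage can differ from the printed one in the last digit when the change is very small (the -O3 leak line, 6836k -> 6843k, is one such case).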

Head of the ChangeLog is:

--- /usr/src/SpecTests/sandbox-britten-memory/x86_64/mem-result/ChangeLog	2007-04-11 07:52:20.000000000 +0000
+++ /usr/src/SpecTests/sandbox-britten-memory/gcc/gcc/ChangeLog	2007-04-12 00:25:32.000000000 +0000
@@ -1,3 +1,143 @@
+2007-04-11  Diego Novillo  <dnovillo@redhat.com>
+
+	* tree-ssa-alias.c (dump_mem_ref_stats): Do not call
+	need_to_partition_p if there are no memory statements in the
+	function.
+
+2007-04-11  Zdenek Dvorak  <dvorakz@suse.cz>
+
+	* tree-data-ref.c (chrec_steps_divide_constant_p): Removed.
+	(gcd_of_steps_may_divide_p): New function.
+	(analyze_miv_subscript): Use gcd_of_steps_may_divide_p.
+
+2007-04-11  Bernd Schmidt  <bernd.schmidt@analog.com>
+
+	* reload.c (find_reloads_toplev, find_reloads_address,
+	find_reloads_address_1, find_reloads_subreg_address): Use rtx_equal_p,
+	not a pointer equality test, to decide if we need to call
+	push_reg_equiv_alt_mem.
+
+2007-04-11  Sebastian Pop  <sebastian.pop@inria.fr>
+
+	* tree-data-ref.c (affine_function_zero_p, constant_access_functions,
+	insert_innermost_unit_dist_vector, add_distance_for_zero_overlaps): New.
+	(build_classic_dist_vector): Call add_distance_for_zero_overlaps.
+
+2007-04-11  Zdenek Dvorak  <dvorakz@suse.cz>
+
+	* tree-data-ref.c (add_multivariate_self_dist): Force the distance
+	vector to be positive.
+
+2007-04-11  Diego Novillo  <dnovillo@redhat.com>
+
+	PR 30735
+	PR 31090
+	* doc/invoke.texi: Document --params max-aliased-vops and
+	avg-aliased-vops.
+	* tree-ssa-operands.h (get_mpt_for, dump_memory_partitions,
+	debug_memory_partitions): Move to tree-flow.h
+	* params.h (AVG_ALIASED_VOPS): Define.
+	* tree-ssa-alias.c (struct mp_info_def): Remove.  Update all
+	users.
+	(mp_info_t): Likewise.
+	(get_mem_sym_stats_for): New.
+	(set_memory_partition): Move from tree-flow-inline.h.
+	(mark_non_addressable): Only clear the set of symbols for the
+	partition if it exists.
+	(dump_memory_partitions): Move from tree-ssa-operands.c
+	(debug_memory_partitions): Likewise.
+	(need_to_partition_p): New.
+	(dump_mem_ref_stats): New.
+	(debug_mem_ref_stats): New.
+	(dump_mem_sym_stats): New.
+	(debug_mem_sym_stats): New.
+	(update_mem_sym_stats_from_stmt): New.
+	(compare_mp_info_entries): New.
+	(mp_info_cmp): Call it.
+	(sort_mp_info): Change argument to a list of mem_sym_stats_t
+	objects.
+	(get_mpt_for): Move from tree-ssa-operands.c.
+	(find_partition_for): New.
+	(create_partition_for): Remove.
+	(estimate_vop_reduction): New.
+	(update_reference_counts): New.
+	(build_mp_info): New.
+	(compute_memory_partitions): Refactor.
+	Document new heuristic.
+	Call build_mp_info, update_reference_counts,
+	find_partition_for and estimate_vop_reduction.
+	(compute_may_aliases): Populate virtual operands before
+	calling debugging dumps.
+	(delete_mem_sym_stats): New.
+	(delete_mem_ref_stats): New.
+	(init_mem_ref_stats): New.
+	(init_alias_info): Call it.
+	(maybe_create_global_var): Remove alias_info argument.
+	Get number of call sites and number of pure/const call sites
+	from gimple_mem_ref_stats().
+	(dump_alias_info): Call dump_memory_partitions first.
+	(dump_points_to_info_for): Show how many times a pointer has
+	been dereferenced.
+	* opts.c (decode_options): For -O2 set --param
+	max-aliased-vops to 500.
+	For -O3 set --param max-aliased-vops to 1000 and --param
+	avg-aliased-vops to 3.
+	* fortran/options.c (gfc_init_options): Remove assignment to
+	MAX_ALIASED_VOPS.
+	* tree-flow-inline.h (gimple_mem_ref_stats): New.
+	* tree-dfa.c (dump_variable): Dump memory reference
+	statistics.
+	Dump NO_ALIAS* settings.
+	(referenced_var_lookup): Tidy.
+	(mem_sym_stats): New.
+	* tree-ssa-copy.c (may_propagate_copy): Return true if DEST
+	and ORIG are different SSA names for a memory partition.
+	* tree-ssa.c (delete_tree_ssa): Call delete_mem_ref_stats.
+	* tree-flow.h (struct mem_sym_stats_d): Define.
+	(mem_sym_stats_t): Define.
+	(struct mem_ref_stats_d): Define.
+	(struct gimple_df): Add field mem_ref_stats.
+	(enum noalias_state): Define.
+	(struct var_ann_d): Add bitfield noalias_state.
+	(mem_sym_stats, delete_mem_ref_stats, dump_mem_ref_stats,
+	debug_mem_ref_stats, debug_memory_partitions,
+	debug_mem_sym_stats): Declare.
+	* tree-ssa-structalias.c (update_alias_info): Update call
+	sites, pure/const call sites and asm sites in structure
+	returned by gimple_mem_ref_stats.
+	Remove local variable IS_POTENTIAL_DEREF.
+	Increase NUM_DEREFS if the memory expression is a potential
+	dereference.
+	Call update_mem_sym_stats_from_stmt.
+	If the memory references memory, call
+	update_mem_sym_stats_from_stmt for all the direct memory
+	symbol references found.
+	(intra_create_variable_infos): Set noalias_state field for
+	pointer arguments according to the value of
+	flag_argument_noalias.
+	* tree-ssa-structalias.h (struct alias_info): Remove fields
+	num_calls_found and num_pure_const_calls_found.
+	(update_mem_sym_stats_from_stmt): Declare.
+	* params.def (PARAM_MAX_ALIASED_VOPS): Change description.
+	Set default value to 100.
+	(PARAM_AVG_ALIASED_VOPS): Define.
+
+2007-04-11  Richard Guenther  <rguenther@suse.de>
+
+	PR middle-end/31530
+	* simplify-rtx.c (simplify_binary_operation_1): Do not simplify
+	a * -b + c as c - a * b if we honor sign dependent rounding.
+
+2007-04-11  Bernd Schmidt  <bernd.schmidt@analog.com>
+
+	* config/bfin/bfin-protos.h (bfin_expand_movmem): Renamed from
+	bfin_expand_strmov.
+	* config/bfin/bfin.c (bfin_expand_prologue, bfin_delegitimize_address,
+	bfin_function_ok_for_sibcall, split_load_immediate): Remove unused
+	variables.
+	(initialize_trampoline): Don't use old-style function definition.
+	(bfin_secondary_reload): Mark IN_P argument as unused.
+
 2007-04-10  Sebastian Pop  <sebastian.pop@inria.fr>
 
 	PR tree-optimization/31343


The results can be reproduced by building a compiler with

--enable-gather-detailed-mem-stats targeting x86-64

and compiling preprocessed combine.c or the testcase from PR8361 with:

-fmem-report --param=ggc-min-heapsize=1024 --param=ggc-min-expand=1 -Ox -Q

The memory consumption summary appears in the dump after the detailed
listing of the places where memory is allocated.  Peak memory consumption
is computed by looking for the maximal value in the {GC XXXX -> YYYY} reports.
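
As a rough illustration of that last step, the following sketch scans a captured compiler log for {GC XXXX -> YYYY} entries and takes the maxima. It assumes each entry looks exactly like "{GC 2263k -> 1952k}" (sizes in kilobytes with a "k" suffix); the helper name peak_ggc and the sample log are hypothetical, and this is not the testing script's own code.

```python
import re

# Assumed entry format: "{GC <before>k -> <after>k}".
GC_RE = re.compile(r"\{GC (\d+)k -> (\d+)k\}")


def peak_ggc(log_text: str):
    """Return (peak before, peak after, max released, runs) in kilobytes."""
    peak_before = peak_after = max_released = runs = 0
    for match in GC_RE.finditer(log_text):
        before, after = int(match.group(1)), int(match.group(2))
        runs += 1
        peak_before = max(peak_before, before)
        peak_after = max(peak_after, after)
        max_released = max(max_released, before - after)
    return peak_before, peak_after, max_released, runs


# Made-up log fragment, for illustration only.
sample_log = " main {GC 2263k -> 1952k} foo {GC 2100k -> 1900k}"
print(peak_ggc(sample_log))
```

The log would typically be the stderr output of a compilation run with the options listed above, saved to a file and passed to the helper as a string.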

Your testing script.

