This is the mail archive of the gcc-regression@gcc.gnu.org mailing list for the GCC project.


A recent patch increased GCC's memory consumption!


Hi,

I am a friendly script caring about memory consumption in GCC.  Please
contact jh@suse.cz if something is going wrong.

Comparing memory consumption on compilation of combine.i, insn-attrtab.i,
and generate-3.4.ii I got:


comparing empty function compilation at -O0 level:
  Peak amount of GGC memory allocated before garbage collecting increased from 2229k to 2233k, overall 0.18%
  Peak amount of GGC memory still allocated after garbage collecting increased from 1936k to 1940k, overall 0.21%
  Amount of memory still referenced at the end of compilation increased from 2266k to 2271k, overall 0.22%
    Overall memory needed: 18243k -> 18255k
    Peak memory use before GGC: 2229k -> 2233k
    Peak memory use after GGC: 1936k -> 1940k
    Maximum of released memory in single GGC run: 293k
    Garbage: 422k
    Leak: 2266k -> 2271k
    Overhead: 445k -> 446k
    GGC runs: 3
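
The "overall" percentages in the summary lines are relative increases. A minimal sketch of that arithmetic follows, assuming the percentage is (new - old) / old rounded to two decimals; the function name is hypothetical and is not taken from the testing script. Because the report prints sizes rounded to kilobytes, recomputing from those rounded figures may differ from the reported value in the last digit.

```python
def percent_increase(before_k: int, after_k: int) -> float:
    """Relative growth from before_k to after_k, in percent."""
    return (after_k - before_k) / before_k * 100.0


# Figures copied from the -O0 empty-function block above (kilobytes).
for label, before, after in [
    ("peak before GGC", 2229, 2233),
    ("peak after GGC", 1936, 1940),
    ("leak", 2266, 2271),
]:
    print(f"{label}: {percent_increase(before, after):.2f}%")
```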

comparing empty function compilation at -O0 -g level:
  Peak amount of GGC memory allocated before garbage collecting increased from 2256k to 2260k, overall 0.18%
  Peak amount of GGC memory still allocated after garbage collecting increased from 1963k to 1967k, overall 0.20%
  Amount of memory still referenced at the end of compilation increased from 2298k to 2303k, overall 0.21%
    Overall memory needed: 18263k -> 18271k
    Peak memory use before GGC: 2256k -> 2260k
    Peak memory use after GGC: 1963k -> 1967k
    Maximum of released memory in single GGC run: 293k
    Garbage: 424k
    Leak: 2298k -> 2303k
    Overhead: 449k -> 450k
    GGC runs: 3

comparing empty function compilation at -O1 level:
  Peak amount of GGC memory allocated before garbage collecting increased from 2229k to 2233k, overall 0.18%
  Peak amount of GGC memory still allocated after garbage collecting increased from 1936k to 1940k, overall 0.21%
  Amount of memory still referenced at the end of compilation increased from 2269k to 2274k, overall 0.22%
    Overall memory needed: 18347k -> 18359k
    Peak memory use before GGC: 2229k -> 2233k
    Peak memory use after GGC: 1936k -> 1940k
    Maximum of released memory in single GGC run: 293k
    Garbage: 427k
    Leak: 2269k -> 2274k
    Overhead: 445k -> 446k
    GGC runs: 4

comparing empty function compilation at -O2 level:
  Peak amount of GGC memory allocated before garbage collecting increased from 2229k to 2233k, overall 0.18%
  Peak amount of GGC memory still allocated after garbage collecting increased from 1936k to 1940k, overall 0.21%
  Amount of memory still referenced at the end of compilation increased from 2269k to 2274k, overall 0.22%
    Overall memory needed: 18359k -> 18367k
    Peak memory use before GGC: 2229k -> 2233k
    Peak memory use after GGC: 1936k -> 1940k
    Maximum of released memory in single GGC run: 293k
    Garbage: 430k
    Leak: 2269k -> 2274k
    Overhead: 446k -> 447k
    GGC runs: 4

comparing empty function compilation at -O3 level:
  Peak amount of GGC memory allocated before garbage collecting increased from 2229k to 2233k, overall 0.18%
  Peak amount of GGC memory still allocated after garbage collecting increased from 1936k to 1940k, overall 0.21%
  Amount of memory still referenced at the end of compilation increased from 2269k to 2274k, overall 0.22%
    Overall memory needed: 18359k -> 18367k
    Peak memory use before GGC: 2229k -> 2233k
    Peak memory use after GGC: 1936k -> 1940k
    Maximum of released memory in single GGC run: 293k
    Garbage: 430k
    Leak: 2269k -> 2274k
    Overhead: 446k -> 447k
    GGC runs: 4

comparing combine.c compilation at -O0 level:
    Overall memory needed: 28431k -> 28443k
    Peak memory use before GGC: 9305k -> 9309k
    Peak memory use after GGC: 8844k -> 8848k
    Maximum of released memory in single GGC run: 2665k
    Garbage: 36852k -> 36852k
    Leak: 6456k -> 6461k
    Overhead: 4868k -> 4869k
    GGC runs: 280 -> 279
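
Each statistics line has the form "Label: OLD -> NEW", or just "Label: VALUE" when the figure did not change. A rough sketch of reading such lines into (label, old, new) tuples is shown here; the regular expression and the function name are assumptions based only on the layout of this mail, not on the script that produced it.

```python
import re

# Matches "Label: 123k" or "Label: 123k -> 456k".
# GGC run counts carry no "k" suffix, so the suffix is optional.
STAT_RE = re.compile(
    r"^\s*(?P<label>[^:]+):\s*(?P<old>\d+)k?(?:\s*->\s*(?P<new>\d+)k?)?\s*$"
)


def parse_stat(line):
    """Return (label, old, new), or None if the line is not a statistic."""
    m = STAT_RE.match(line)
    if m is None:
        return None
    old = int(m.group("old"))
    # A single value means the figure was unchanged between the two runs.
    new = int(m.group("new")) if m.group("new") is not None else old
    return m.group("label").strip(), old, new


# Lines copied from the combine.c -O0 block above.
for line in [
    "    Overall memory needed: 28431k -> 28443k",
    "    Maximum of released memory in single GGC run: 2665k",
    "    GGC runs: 280 -> 279",
]:
    label, old, new = parse_stat(line)
    print(f"{label}: delta {new - old:+d}")
```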

comparing combine.c compilation at -O0 -g level:
    Overall memory needed: 30523k -> 30535k
    Peak memory use before GGC: 10855k -> 10859k
    Peak memory use after GGC: 10485k -> 10489k
    Maximum of released memory in single GGC run: 2415k
    Garbage: 37429k -> 37429k
    Leak: 9266k -> 9271k
    Overhead: 5536k -> 5537k
    GGC runs: 271

comparing combine.c compilation at -O1 level:
    Overall memory needed: 40271k -> 40287k
    Peak memory use before GGC: 17295k -> 17299k
    Peak memory use after GGC: 17120k -> 17124k
    Maximum of released memory in single GGC run: 2275k
    Garbage: 57480k -> 57485k
    Leak: 6510k -> 6515k
    Overhead: 6226k -> 6227k
    GGC runs: 357 -> 356

comparing combine.c compilation at -O2 level:
    Overall memory needed: 29802k -> 29806k
    Peak memory use before GGC: 17291k -> 17295k
    Peak memory use after GGC: 17120k -> 17124k
    Maximum of released memory in single GGC run: 2868k -> 2876k
    Garbage: 74890k -> 74892k
    Leak: 6616k -> 6613k
    Overhead: 8472k -> 8473k
    GGC runs: 412 -> 411

comparing combine.c compilation at -O3 level:
    Overall memory needed: 28902k -> 28906k
    Peak memory use before GGC: 18419k -> 18423k
    Peak memory use after GGC: 17847k -> 17851k
    Maximum of released memory in single GGC run: 4106k
    Garbage: 112668k -> 112668k
    Leak: 6684k -> 6689k
    Overhead: 13029k -> 13030k
    GGC runs: 463 -> 462

comparing insn-attrtab.c compilation at -O0 level:
    Overall memory needed: 88242k -> 88246k
    Peak memory use before GGC: 69789k -> 69793k
    Peak memory use after GGC: 44199k -> 44203k
    Maximum of released memory in single GGC run: 36964k
    Garbage: 129066k -> 129066k
    Leak: 9515k -> 9520k
    Overhead: 17000k -> 17001k
    GGC runs: 216

comparing insn-attrtab.c compilation at -O0 -g level:
    Overall memory needed: 89422k -> 89426k
    Peak memory use before GGC: 70938k -> 70942k
    Peak memory use after GGC: 45455k -> 45459k
    Maximum of released memory in single GGC run: 36964k
    Garbage: 130494k -> 130494k
    Leak: 10946k -> 10951k
    Overhead: 17379k -> 17380k
    GGC runs: 212

comparing insn-attrtab.c compilation at -O1 level:
    Overall memory needed: 112878k -> 114178k
    Peak memory use before GGC: 90375k -> 90379k
    Peak memory use after GGC: 83737k -> 83741k
    Maximum of released memory in single GGC run: 31852k
    Garbage: 277775k -> 277776k
    Leak: 9357k -> 9362k
    Overhead: 29791k -> 29792k
    GGC runs: 222 -> 221

comparing insn-attrtab.c compilation at -O2 level:
    Overall memory needed: 129382k -> 129390k
    Peak memory use before GGC: 92604k -> 92608k
    Peak memory use after GGC: 84716k -> 84720k
    Maximum of released memory in single GGC run: 30398k
    Garbage: 317196k -> 317198k
    Leak: 9359k -> 9364k
    Overhead: 36370k -> 36371k
    GGC runs: 245 -> 244

comparing insn-attrtab.c compilation at -O3 level:
  Overall memory allocated via mmap and sbrk increased from 129418k to 134238k, overall 3.72%
    Overall memory needed: 129418k -> 134238k
    Peak memory use before GGC: 92630k -> 92634k
    Peak memory use after GGC: 84742k -> 84746k
    Maximum of released memory in single GGC run: 30585k
    Garbage: 318053k -> 318053k
    Leak: 9362k -> 9367k
    Overhead: 36605k -> 36606k
    GGC runs: 249 -> 248

comparing Gerald's testcase PR8361 compilation at -O0 level:
    Overall memory needed: 119998k -> 120002k
    Peak memory use before GGC: 93308k -> 93312k
    Peak memory use after GGC: 92381k -> 92385k
    Maximum of released memory in single GGC run: 20013k
    Garbage: 207743k -> 207743k
    Leak: 47725k -> 47730k
    Overhead: 20983k -> 20983k
    GGC runs: 409

comparing Gerald's testcase PR8361 compilation at -O0 -g level:
    Overall memory needed: 132498k -> 132502k
    Peak memory use before GGC: 105437k -> 105441k
    Peak memory use after GGC: 104386k -> 104390k
    Maximum of released memory in single GGC run: 19646k
    Garbage: 214333k -> 214331k
    Leak: 70684k -> 70689k
    Overhead: 26599k -> 26600k
    GGC runs: 380

comparing Gerald's testcase PR8361 compilation at -O1 level:
    Overall memory needed: 119134k -> 119138k
    Peak memory use before GGC: 97919k -> 97923k
    Peak memory use after GGC: 95707k -> 95711k
    Maximum of released memory in single GGC run: 18552k
    Garbage: 446279k -> 446309k
    Leak: 50111k -> 50116k
    Overhead: 32835k -> 32836k
    GGC runs: 559

comparing Gerald's testcase PR8361 compilation at -O2 level:
    Overall memory needed: 119158k -> 119162k
    Peak memory use before GGC: 97919k -> 97924k
    Peak memory use after GGC: 95706k -> 95711k
    Maximum of released memory in single GGC run: 18552k
    Garbage: 505250k -> 505250k
    Leak: 50795k -> 50800k
    Overhead: 40015k -> 40016k
    GGC runs: 613

comparing Gerald's testcase PR8361 compilation at -O3 level:
    Overall memory needed: 118990k -> 118994k
    Peak memory use before GGC: 97964k -> 97968k
    Peak memory use after GGC: 96993k -> 96997k
    Maximum of released memory in single GGC run: 18932k
    Garbage: 526318k -> 526304k
    Leak: 50340k -> 50345k
    Overhead: 40920k -> 40921k
    GGC runs: 628

comparing PR rtl-optimization/28071 testcase compilation at -O0 level:
    Overall memory needed: 137958k -> 137962k
    Peak memory use before GGC: 81909k -> 81913k
    Peak memory use after GGC: 58788k -> 58792k
    Maximum of released memory in single GGC run: 45493k
    Garbage: 147244k -> 147245k
    Leak: 7536k -> 7541k
    Overhead: 25303k -> 25304k
    GGC runs: 82

comparing PR rtl-optimization/28071 testcase compilation at -O0 -g level:
    Overall memory needed: 138134k -> 138138k
    Peak memory use before GGC: 82542k -> 82546k
    Peak memory use after GGC: 59422k -> 59426k
    Maximum of released memory in single GGC run: 45558k
    Garbage: 147415k
    Leak: 9244k -> 9249k
    Overhead: 25769k -> 25769k
    GGC runs: 88

comparing PR rtl-optimization/28071 testcase compilation at -O1 level:
    Overall memory needed: 424330k -> 424266k
    Peak memory use before GGC: 205229k -> 205233k
    Peak memory use after GGC: 201005k -> 201009k
    Maximum of released memory in single GGC run: 101903k
    Garbage: 272136k -> 272136k
    Leak: 47601k -> 47606k
    Overhead: 31281k -> 31282k
    GGC runs: 101

comparing PR rtl-optimization/28071 testcase compilation at -O2 level:
    Overall memory needed: 352126k -> 352386k
    Peak memory use before GGC: 206002k -> 206006k
    Peak memory use after GGC: 201778k -> 201782k
    Maximum of released memory in single GGC run: 108808k
    Garbage: 352361k -> 352361k
    Leak: 48184k -> 48189k
    Overhead: 47026k -> 47027k
    GGC runs: 110

comparing PR rtl-optimization/28071 testcase compilation at -O3 -fno-tree-pre -fno-tree-fre level:
    Overall memory needed: 781466k -> 781310k
    Peak memory use before GGC: 314925k -> 314929k
    Peak memory use after GGC: 293268k -> 293272k
    Maximum of released memory in single GGC run: 165201k -> 165197k
    Garbage: 494383k -> 494379k
    Leak: 65517k -> 65522k
    Overhead: 59884k -> 59885k
    GGC runs: 98

Head of the ChangeLog is:

--- /usr/src/SpecTests/sandbox-britten-memory/x86_64/mem-result/ChangeLog	2006-11-22 06:29:51.000000000 +0000
+++ /usr/src/SpecTests/sandbox-britten-memory/gcc/gcc/ChangeLog	2006-11-22 23:02:06.000000000 +0000
@@ -1,3 +1,108 @@
+2006-11-22  Peter Bergner  <bergner@vnet.ibm.com>
+
+	* config/rs6000/rs6000.c (get_store_dest): New.
+	(adjacent_mem_locations): Use get_store_dest() to get
+	the rtl of the store destination.
+
+2006-11-22  Joseph Myers  <joseph@codesourcery.com>
+
+	* config/rs6000/spe.md (SPE64): New mode macro.
+	(mov_sidf_e500_subreg0): Change to mov_si<mode>_e500_subreg0.  Add
+	memory load.
+	(mov_si<mode>_e500_subreg0_2): New.
+	(mov_sidf_e500_subreg4): Change to mov_si<mode>_e500_subreg4.  Add
+	memory load.
+	(mov_si<mode>_e500_subreg4_2): New.
+	* config/rs6000/predicates.md (input_operand): Do not allow
+	invalid E500 subregs.
+	(rs6000_nonimmediate_operand): Check for invalid E500 subregs also
+	if TARGET_SPE.
+	* config/rs6000/rs6000.c (invalid_e500_subreg): Check for subregs
+	involving DFmode if TARGET_E500_DOUBLE.  Check for subregs
+	involving vector modes if TARGET_SPE.
+
+2006-11-22  Kaz Kojima  <kkojima@gcc.gnu.org>
+
+	Revert
+	2006-11-12  Kaz Kojima  <kkojima@gcc.gnu.org>
+	* reorg.c (emit_delay_sequence): Copy the delay slot insn.
+
+2006-11-22  Bernd Schmidt  <bernd.schmidt@analog.com>
+
+	* config/bfin/predicates.md (d_register_operand, mem_p_address_operand,
+	mem_i_address_operand): New predicates.
+	* config/bfin/bfin.c (bfin_issue_rate): New function.
+	(TARGET_SCHED_ISSUE_RATE): New macro.
+	* config/bfin/bfin.md (addrtype): New attribute.
+	(slot0, slot1, slot2, store, pregs): New cpu_units.
+	(core): Now a define_reservation.
+	(alu): Remove some insn types from this reservation.
+	(dsp32, load32, loadp, loadi, store32, storep, storei, multi): New
+	insn reservations.
+	(dummy reservation): Don't trigger for mcld insns.
+	(absence_sets): Two new absence sets to enforce slot ordering.
+	(popsi_insn): Set addrtype.
+
+2006-11-22  Ira Rosen  <irar@il.ibm.com>
+
+	* doc/c-tree.texi: Document new tree codes.
+	* doc/md.texi: Document new optabs.
+	* tree-pretty-print.c (dump_generic_node): Handle print of new tree
+	codes.
+	* optabs.c (optab_for_tree_code, init_optabs): Handle new optabs.
+	* optabs.h (optab_index): Add new.
+	(vec_extract_even_optab, vec_extract_odd_optab,
+	vec_interleave_high_optab, vec_interleave_low_optab): New optabs.
+	* genopinit.c (vec_extract_even_optab, vec_extract_odd_optab,
+	vec_interleave_high_optab, vec_interleave_low_optab): Initialize
+	new optabs.
+	* expr.c (expand_expr_real_1): Add implementation for new tree codes.
+	* tree-vectorizer.c (new_stmt_vec_info): Initialize new fields.
+	* tree-vectorizer.h (stmt_vec_info): Add new fields for interleaving
+	along with macros for their access.
+	* tree-data-ref.h (first_location_in_loop, data_reference): Update
+	comment.
+	* tree-vect-analyze.c (toplev.h): Include.
+	(vect_determine_vectorization_factor): Fix indentation.
+	(vect_insert_into_interleaving_chain,
+	vect_update_interleaving_chain, vect_equal_offsets): New functions.
+	(vect_analyze_data_ref_dependence): Add argument for interleaving
+	check. Check for interleaving if it's true.
+	(vect_check_dependences): New function.
+	(vect_analyze_data_ref_dependences): Call vect_check_dependences for
+	every ddr. Call vect_analyze_data_ref_dependence with new argument.
+	(vect_update_misalignment_for_peel): Update for interleaving.
+	(vect_verify_datarefs_alignment): Check only first data-ref for
+	interleaving.
+	(vect_enhance_data_refs_alignment): Update for interleaving. Check
+	only first data-ref for interleaving.
+	(vect_analyze_data_ref_access): Check interleaving, update
+	interleaving data.
+	(vect_analyze_data_refs): Call compute_data_dependences_for_loop
+	with different parameters.
+	* tree.def (VEC_EXTRACT_EVEN_EXPR, VEC_EXTRACT_ODD_EXPR,
+	VEC_INTERLEAVE_HIGH_EXPR, VEC_INTERLEAVE_LOW_EXPR): New tree codes.
+	* tree-inline.c (estimate_num_insns_1): Add cases for new codes.
+	* tree-vect-transform.c (vect_create_addr_base_for_vector_ref):
+	Update step in case of interleaving.
+	(vect_strided_store_supported, vect_permute_store_chain): New
+	functions.
+	(vectorizable_store): Handle strided stores.
+	(vect_strided_load_supported, vect_permute_load_chain,
+	vect_transform_strided_load): New functions.
+	(vectorizable_load): Handle strided loads.
+	(vect_transform_stmt): Add argument. Handle strided stores. Check
+	that vectorized stmt exists for patterns.
+	(vect_gen_niters_for_prolog_loop): Update calculation for
+	interleaving.
+	(vect_transform_loop): Remove stmt_vec_info for strided stores after
+	whole chain vectorization.
+	* config/rs6000/altivec.md (UNSPEC_EXTEVEN, UNSPEC_EXTODD,
+	UNSPEC_INTERHI, UNSPEC_INTERLO): New constants.
+	(vpkuhum_nomode, vpkuwum_nomode, vec_extract_even<mode>,
+	vec_extract_odd<mode>, altivec_vmrghsf, altivec_vmrglsf,
+	vec_interleave_high<mode>, vec_interleave_low<mode>): Implement.
+
 2006-11-22  Steven Bosscher  <steven@gcc.gnu.org>
 
 	* cse.c (enum taken): Remove PATH_AROUND.


The results can be reproduced by building a compiler with

--enable-gather-detailed-mem-stats targeting x86-64

and compiling preprocessed combine.c or the testcase from PR8361 with:

-fmem-report --param=ggc-min-heapsize=1024 --param=ggc-min-expand=1 -Ox -Q
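
For scripting the runs, the flags above could be assembled per optimization level along these lines. This is only a sketch: the compiler path "./cc1", the input name "combine.i", and the reading of "-Ox" as "-O" followed by a level are assumptions, and the helper name is hypothetical.

```python
def mem_report_args(compiler, source, opt_level):
    """Build an argv list from the flags quoted above.

    compiler and source are placeholders; opt_level replaces the
    "x" in -Ox (an assumed reading of the flag line).
    """
    return [
        compiler,
        "-fmem-report",
        "--param=ggc-min-heapsize=1024",
        "--param=ggc-min-expand=1",
        f"-O{opt_level}",
        "-Q",
        source,
    ]


# Print the command line for one level; it could be passed to subprocess.run.
print(" ".join(mem_report_args("./cc1", "combine.i", 2)))
```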

The memory consumption summary appears in the dump after the detailed
listing of the places where memory is allocated.  Peak memory consumption
is computed by taking the maximal value in the {GC XXXX -> YYYY} reports.
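
The peak computation described above can be sketched as a scan over the dump for {GC XXXX -> YYYY} entries, keeping the largest value on each side. The exact spelling of those entries (a "k" suffix, single spaces) is an assumption, and the sample string is made up for illustration rather than taken from a real dump.

```python
import re

# Assumed entry format: "{GC 2229k -> 1936k}" (size before -> size after).
GC_RE = re.compile(r"\{GC (\d+)k -> (\d+)k\}")


def peak_ggc(dump_text):
    """Return (peak before GGC, peak after GGC) in kilobytes, or None."""
    pairs = [(int(b), int(a)) for b, a in GC_RE.findall(dump_text)]
    if not pairs:
        return None
    return max(b for b, _ in pairs), max(a for _, a in pairs)


# Hypothetical dump fragment, for illustration only.
sample = "{GC 1800k -> 1500k} {GC 2229k -> 1936k} {GC 2100k -> 1700k}"
print(peak_ggc(sample))
```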

Your testing script.

