This is the mail archive of the
gcc-regression@gcc.gnu.org
mailing list for the GCC project.
A recent patch increased GCC's memory consumption!
- From: gcctest at suse dot de
- To: jh at suse dot cz, gcc-regression at gcc dot gnu dot org
- Date: Thu, 23 Nov 2006 00:45:27 +0000
- Subject: A recent patch increased GCC's memory consumption!
Hi,
I am a friendly script caring about memory consumption in GCC. Please
contact jh@suse.cz if something is going wrong.
Comparing memory consumption on compilation of combine.i, insn-attrtab.i,
and generate-3.4.ii I got:
comparing empty function compilation at -O0 level:
Peak amount of GGC memory allocated before garbage collecting increased from 2229k to 2233k, overall 0.18%
Peak amount of GGC memory still allocated after garbage collecting increased from 1936k to 1940k, overall 0.21%
Amount of memory still referenced at the end of compilation increased from 2266k to 2271k, overall 0.22%
Overall memory needed: 18243k -> 18255k
Peak memory use before GGC: 2229k -> 2233k
Peak memory use after GGC: 1936k -> 1940k
Maximum of released memory in single GGC run: 293k
Garbage: 422k
Leak: 2266k -> 2271k
Overhead: 445k -> 446k
GGC runs: 3
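The "overall N%" figures in these reports appear to be the relative growth between the two measurements. A minimal sketch of that arithmetic follows, assuming the percentage is (after - before) / before and is printed with two decimals. The function names are illustrative and not taken from the testing script. Because the printed figures are whole kilobytes, a recomputed percentage can differ from the reported one in the last digit.

```python
def percent_increase(before_k: int, after_k: int) -> float:
    """Relative growth from before_k to after_k, in percent."""
    return (after_k - before_k) / before_k * 100.0


def format_change(label: str, before_k: int, after_k: int) -> str:
    """Render one comparison line in the style used by the report (assumed format)."""
    pct = percent_increase(before_k, after_k)
    return f"{label} increased from {before_k}k to {after_k}k, overall {pct:.2f}%"


if __name__ == "__main__":
    # Figures taken from the -O0 empty-function comparison above.
    print(format_change(
        "Peak amount of GGC memory allocated before garbage collecting",
        2229, 2233))
```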
comparing empty function compilation at -O0 -g level:
Peak amount of GGC memory allocated before garbage collecting increased from 2256k to 2260k, overall 0.18%
Peak amount of GGC memory still allocated after garbage collecting increased from 1963k to 1967k, overall 0.20%
Amount of memory still referenced at the end of compilation increased from 2298k to 2303k, overall 0.21%
Overall memory needed: 18263k -> 18271k
Peak memory use before GGC: 2256k -> 2260k
Peak memory use after GGC: 1963k -> 1967k
Maximum of released memory in single GGC run: 293k
Garbage: 424k
Leak: 2298k -> 2303k
Overhead: 449k -> 450k
GGC runs: 3
comparing empty function compilation at -O1 level:
Peak amount of GGC memory allocated before garbage collecting increased from 2229k to 2233k, overall 0.18%
Peak amount of GGC memory still allocated after garbage collecting increased from 1936k to 1940k, overall 0.21%
Amount of memory still referenced at the end of compilation increased from 2269k to 2274k, overall 0.22%
Overall memory needed: 18347k -> 18359k
Peak memory use before GGC: 2229k -> 2233k
Peak memory use after GGC: 1936k -> 1940k
Maximum of released memory in single GGC run: 293k
Garbage: 427k
Leak: 2269k -> 2274k
Overhead: 445k -> 446k
GGC runs: 4
comparing empty function compilation at -O2 level:
Peak amount of GGC memory allocated before garbage collecting increased from 2229k to 2233k, overall 0.18%
Peak amount of GGC memory still allocated after garbage collecting increased from 1936k to 1940k, overall 0.21%
Amount of memory still referenced at the end of compilation increased from 2269k to 2274k, overall 0.22%
Overall memory needed: 18359k -> 18367k
Peak memory use before GGC: 2229k -> 2233k
Peak memory use after GGC: 1936k -> 1940k
Maximum of released memory in single GGC run: 293k
Garbage: 430k
Leak: 2269k -> 2274k
Overhead: 446k -> 447k
GGC runs: 4
comparing empty function compilation at -O3 level:
Peak amount of GGC memory allocated before garbage collecting increased from 2229k to 2233k, overall 0.18%
Peak amount of GGC memory still allocated after garbage collecting increased from 1936k to 1940k, overall 0.21%
Amount of memory still referenced at the end of compilation increased from 2269k to 2274k, overall 0.22%
Overall memory needed: 18359k -> 18367k
Peak memory use before GGC: 2229k -> 2233k
Peak memory use after GGC: 1936k -> 1940k
Maximum of released memory in single GGC run: 293k
Garbage: 430k
Leak: 2269k -> 2274k
Overhead: 446k -> 447k
GGC runs: 4
comparing combine.c compilation at -O0 level:
Overall memory needed: 28431k -> 28443k
Peak memory use before GGC: 9305k -> 9309k
Peak memory use after GGC: 8844k -> 8848k
Maximum of released memory in single GGC run: 2665k
Garbage: 36852k -> 36852k
Leak: 6456k -> 6461k
Overhead: 4868k -> 4869k
GGC runs: 280 -> 279
comparing combine.c compilation at -O0 -g level:
Overall memory needed: 30523k -> 30535k
Peak memory use before GGC: 10855k -> 10859k
Peak memory use after GGC: 10485k -> 10489k
Maximum of released memory in single GGC run: 2415k
Garbage: 37429k -> 37429k
Leak: 9266k -> 9271k
Overhead: 5536k -> 5537k
GGC runs: 271
comparing combine.c compilation at -O1 level:
Overall memory needed: 40271k -> 40287k
Peak memory use before GGC: 17295k -> 17299k
Peak memory use after GGC: 17120k -> 17124k
Maximum of released memory in single GGC run: 2275k
Garbage: 57480k -> 57485k
Leak: 6510k -> 6515k
Overhead: 6226k -> 6227k
GGC runs: 357 -> 356
comparing combine.c compilation at -O2 level:
Overall memory needed: 29802k -> 29806k
Peak memory use before GGC: 17291k -> 17295k
Peak memory use after GGC: 17120k -> 17124k
Maximum of released memory in single GGC run: 2868k -> 2876k
Garbage: 74890k -> 74892k
Leak: 6616k -> 6613k
Overhead: 8472k -> 8473k
GGC runs: 412 -> 411
comparing combine.c compilation at -O3 level:
Overall memory needed: 28902k -> 28906k
Peak memory use before GGC: 18419k -> 18423k
Peak memory use after GGC: 17847k -> 17851k
Maximum of released memory in single GGC run: 4106k
Garbage: 112668k -> 112668k
Leak: 6684k -> 6689k
Overhead: 13029k -> 13030k
GGC runs: 463 -> 462
comparing insn-attrtab.c compilation at -O0 level:
Overall memory needed: 88242k -> 88246k
Peak memory use before GGC: 69789k -> 69793k
Peak memory use after GGC: 44199k -> 44203k
Maximum of released memory in single GGC run: 36964k
Garbage: 129066k -> 129066k
Leak: 9515k -> 9520k
Overhead: 17000k -> 17001k
GGC runs: 216
comparing insn-attrtab.c compilation at -O0 -g level:
Overall memory needed: 89422k -> 89426k
Peak memory use before GGC: 70938k -> 70942k
Peak memory use after GGC: 45455k -> 45459k
Maximum of released memory in single GGC run: 36964k
Garbage: 130494k -> 130494k
Leak: 10946k -> 10951k
Overhead: 17379k -> 17380k
GGC runs: 212
comparing insn-attrtab.c compilation at -O1 level:
Overall memory needed: 112878k -> 114178k
Peak memory use before GGC: 90375k -> 90379k
Peak memory use after GGC: 83737k -> 83741k
Maximum of released memory in single GGC run: 31852k
Garbage: 277775k -> 277776k
Leak: 9357k -> 9362k
Overhead: 29791k -> 29792k
GGC runs: 222 -> 221
comparing insn-attrtab.c compilation at -O2 level:
Overall memory needed: 129382k -> 129390k
Peak memory use before GGC: 92604k -> 92608k
Peak memory use after GGC: 84716k -> 84720k
Maximum of released memory in single GGC run: 30398k
Garbage: 317196k -> 317198k
Leak: 9359k -> 9364k
Overhead: 36370k -> 36371k
GGC runs: 245 -> 244
comparing insn-attrtab.c compilation at -O3 level:
Overall memory allocated via mmap and sbrk increased from 129418k to 134238k, overall 3.72%
Overall memory needed: 129418k -> 134238k
Peak memory use before GGC: 92630k -> 92634k
Peak memory use after GGC: 84742k -> 84746k
Maximum of released memory in single GGC run: 30585k
Garbage: 318053k -> 318053k
Leak: 9362k -> 9367k
Overhead: 36605k -> 36606k
GGC runs: 249 -> 248
comparing Gerald's testcase PR8361 compilation at -O0 level:
Overall memory needed: 119998k -> 120002k
Peak memory use before GGC: 93308k -> 93312k
Peak memory use after GGC: 92381k -> 92385k
Maximum of released memory in single GGC run: 20013k
Garbage: 207743k -> 207743k
Leak: 47725k -> 47730k
Overhead: 20983k -> 20983k
GGC runs: 409
comparing Gerald's testcase PR8361 compilation at -O0 -g level:
Overall memory needed: 132498k -> 132502k
Peak memory use before GGC: 105437k -> 105441k
Peak memory use after GGC: 104386k -> 104390k
Maximum of released memory in single GGC run: 19646k
Garbage: 214333k -> 214331k
Leak: 70684k -> 70689k
Overhead: 26599k -> 26600k
GGC runs: 380
comparing Gerald's testcase PR8361 compilation at -O1 level:
Overall memory needed: 119134k -> 119138k
Peak memory use before GGC: 97919k -> 97923k
Peak memory use after GGC: 95707k -> 95711k
Maximum of released memory in single GGC run: 18552k
Garbage: 446279k -> 446309k
Leak: 50111k -> 50116k
Overhead: 32835k -> 32836k
GGC runs: 559
comparing Gerald's testcase PR8361 compilation at -O2 level:
Overall memory needed: 119158k -> 119162k
Peak memory use before GGC: 97919k -> 97924k
Peak memory use after GGC: 95706k -> 95711k
Maximum of released memory in single GGC run: 18552k
Garbage: 505250k -> 505250k
Leak: 50795k -> 50800k
Overhead: 40015k -> 40016k
GGC runs: 613
comparing Gerald's testcase PR8361 compilation at -O3 level:
Overall memory needed: 118990k -> 118994k
Peak memory use before GGC: 97964k -> 97968k
Peak memory use after GGC: 96993k -> 96997k
Maximum of released memory in single GGC run: 18932k
Garbage: 526318k -> 526304k
Leak: 50340k -> 50345k
Overhead: 40920k -> 40921k
GGC runs: 628
comparing PR rtl-optimization/28071 testcase compilation at -O0 level:
Overall memory needed: 137958k -> 137962k
Peak memory use before GGC: 81909k -> 81913k
Peak memory use after GGC: 58788k -> 58792k
Maximum of released memory in single GGC run: 45493k
Garbage: 147244k -> 147245k
Leak: 7536k -> 7541k
Overhead: 25303k -> 25304k
GGC runs: 82
comparing PR rtl-optimization/28071 testcase compilation at -O0 -g level:
Overall memory needed: 138134k -> 138138k
Peak memory use before GGC: 82542k -> 82546k
Peak memory use after GGC: 59422k -> 59426k
Maximum of released memory in single GGC run: 45558k
Garbage: 147415k
Leak: 9244k -> 9249k
Overhead: 25769k -> 25769k
GGC runs: 88
comparing PR rtl-optimization/28071 testcase compilation at -O1 level:
Overall memory needed: 424330k -> 424266k
Peak memory use before GGC: 205229k -> 205233k
Peak memory use after GGC: 201005k -> 201009k
Maximum of released memory in single GGC run: 101903k
Garbage: 272136k -> 272136k
Leak: 47601k -> 47606k
Overhead: 31281k -> 31282k
GGC runs: 101
comparing PR rtl-optimization/28071 testcase compilation at -O2 level:
Overall memory needed: 352126k -> 352386k
Peak memory use before GGC: 206002k -> 206006k
Peak memory use after GGC: 201778k -> 201782k
Maximum of released memory in single GGC run: 108808k
Garbage: 352361k -> 352361k
Leak: 48184k -> 48189k
Overhead: 47026k -> 47027k
GGC runs: 110
comparing PR rtl-optimization/28071 testcase compilation at -O3 -fno-tree-pre -fno-tree-fre level:
Overall memory needed: 781466k -> 781310k
Peak memory use before GGC: 314925k -> 314929k
Peak memory use after GGC: 293268k -> 293272k
Maximum of released memory in single GGC run: 165201k -> 165197k
Garbage: 494383k -> 494379k
Leak: 65517k -> 65522k
Overhead: 59884k -> 59885k
GGC runs: 98
Head of the ChangeLog is:
--- /usr/src/SpecTests/sandbox-britten-memory/x86_64/mem-result/ChangeLog 2006-11-22 06:29:51.000000000 +0000
+++ /usr/src/SpecTests/sandbox-britten-memory/gcc/gcc/ChangeLog 2006-11-22 23:02:06.000000000 +0000
@@ -1,3 +1,108 @@
+2006-11-22 Peter Bergner <bergner@vnet.ibm.com>
+
+ * config/rs6000/rs6000.c (get_store_dest): New.
+ (adjacent_mem_locations): Use get_store_dest() to get
+ the rtl of the store destination.
+
+2006-11-22 Joseph Myers <joseph@codesourcery.com>
+
+ * config/rs6000/spe.md (SPE64): New mode macro.
+ (mov_sidf_e500_subreg0): Change to mov_si<mode>_e500_subreg0. Add
+ memory load.
+ (mov_si<mode>_e500_subreg0_2): New.
+ (mov_sidf_e500_subreg4): Change to mov_si<mode>_e500_subreg4. Add
+ memory load.
+ (mov_si<mode>_e500_subreg4_2): New.
+ * config/rs6000/predicates.md (input_operand): Do not allow
+ invalid E500 subregs.
+ (rs6000_nonimmediate_operand): Check for invalid E500 subregs also
+ if TARGET_SPE.
+ * config/rs6000/rs6000.c (invalid_e500_subreg): Check for subregs
+ involving DFmode if TARGET_E500_DOUBLE. Check for subregs
+ involving vector modes if TARGET_SPE.
+
+2006-11-22 Kaz Kojima <kkojima@gcc.gnu.org>
+
+ Revert
+ 2006-11-12 Kaz Kojima <kkojima@gcc.gnu.org>
+ * reorg.c (emit_delay_sequence): Copy the delay slot insn.
+
+2006-11-22 Bernd Schmidt <bernd.schmidt@analog.com>
+
+ * config/bfin/predicates.md (d_register_operand, mem_p_address_operand,
+ mem_i_address_operand): New predicates.
+ * config/bfin/bfin.c (bfin_issue_rate): New function.
+ (TARGET_SCHED_ISSUE_RATE): New macro.
+ * config/bfin/bfin.md (addrtype): New attribute.
+ (slot0, slot1, slot2, store, pregs): New cpu_units.
+ (core): Now a define_reservation.
+ (alu): Remove some insn types from this reservation.
+ (dsp32, load32, loadp, loadi, store32, storep, storei, multi): New
+ insn reservations.
+ (dummy reservation): Don't trigger for mcld insns.
+ (absence_sets): Two new absence sets to enforce slot ordering.
+ (popsi_insn): Set addrtype.
+
+2006-11-22 Ira Rosen <irar@il.ibm.com>
+
+ * doc/c-tree.texi: Document new tree codes.
+ * doc/md.texi: Document new optabs.
+ * tree-pretty-print.c (dump_generic_node): Handle print of new tree
+ codes.
+ * optabs.c (optab_for_tree_code, init_optabs): Handle new optabs.
+ * optabs.h (optab_index): Add new.
+ (vec_extract_even_optab, vec_extract_odd_optab,
+ vec_interleave_high_optab, vec_interleave_low_optab): New optabs.
+ * genopinit.c (vec_extract_even_optab, vec_extract_odd_optab,
+ vec_interleave_high_optab, vec_interleave_low_optab): Initialize
+ new optabs.
+ * expr.c (expand_expr_real_1): Add implementation for new tree codes.
+ * tree-vectorizer.c (new_stmt_vec_info): Initialize new fields.
+ * tree-vectorizer.h (stmt_vec_info): Add new fields for interleaving
+ along with macros for their access.
+ * tree-data-ref.h (first_location_in_loop, data_reference): Update
+ comment.
+ * tree-vect-analyze.c (toplev.h): Include.
+ (vect_determine_vectorization_factor): Fix indentation.
+ (vect_insert_into_interleaving_chain,
+ vect_update_interleaving_chain, vect_equal_offsets): New functions.
+ (vect_analyze_data_ref_dependence): Add argument for interleaving
+ check. Check for interleaving if it's true.
+ (vect_check_dependences): New function.
+ (vect_analyze_data_ref_dependences): Call vect_check_dependences for
+ every ddr. Call vect_analyze_data_ref_dependence with new argument.
+ (vect_update_misalignment_for_peel): Update for interleaving.
+ (vect_verify_datarefs_alignment): Check only first data-ref for
+ interleaving.
+ (vect_enhance_data_refs_alignment): Update for interleaving. Check
+ only first data-ref for interleaving.
+ (vect_analyze_data_ref_access): Check interleaving, update
+ interleaving data.
+ (vect_analyze_data_refs): Call compute_data_dependences_for_loop
+ with different parameters.
+ * tree.def (VEC_EXTRACT_EVEN_EXPR, VEC_EXTRACT_ODD_EXPR,
+ VEC_INTERLEAVE_HIGH_EXPR, VEC_INTERLEAVE_LOW_EXPR): New tree codes.
+ * tree-inline.c (estimate_num_insns_1): Add cases for new codes.
+ * tree-vect-transform.c (vect_create_addr_base_for_vector_ref):
+ Update step in case of interleaving.
+ (vect_strided_store_supported, vect_permute_store_chain): New
+ functions.
+ (vectorizable_store): Handle strided stores.
+ (vect_strided_load_supported, vect_permute_load_chain,
+ vect_transform_strided_load): New functions.
+ (vectorizable_load): Handle strided loads.
+ (vect_transform_stmt): Add argument. Handle strided stores. Check
+ that vectorized stmt exists for patterns.
+ (vect_gen_niters_for_prolog_loop): Update calculation for
+ interleaving.
+ (vect_transform_loop): Remove stmt_vec_info for strided stores after
+ whole chain vectorization.
+ * config/rs6000/altivec.md (UNSPEC_EXTEVEN, UNSPEC_EXTODD,
+ UNSPEC_INTERHI, UNSPEC_INTERLO): New constants.
+ (vpkuhum_nomode, vpkuwum_nomode, vec_extract_even<mode>,
+ vec_extract_odd<mode>, altivec_vmrghsf, altivec_vmrglsf,
+ vec_interleave_high<mode>, vec_interleave_low<mode>): Implement.
+
2006-11-22 Steven Bosscher <steven@gcc.gnu.org>
* cse.c (enum taken): Remove PATH_AROUND.
The results can be reproduced by building a compiler with
--enable-gather-detailed-mem-stats targeting x86-64
and compiling preprocessed combine.c or the testcase from PR8361 with:
-fmem-report --param=ggc-min-heapsize=1024 --param=ggc-min-expand=1 -Ox -Q
The memory consumption summary appears in the dump after the detailed listing
of the places where memory is allocated. Peak memory consumption is
computed by taking the maximal value found in the {GC XXXX -> YYYY} reports.
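The peak computation described above can be sketched as a scan over the compiler's dump. This is only an illustration, not the testing script itself: the function name peak_memory is hypothetical, and the pattern assumes each report looks like "{GC 2229k -> 1936k}", with the first number being the size before collection and the second the size after.

```python
import re

# Assumed shape of one report: "{GC <before>k -> <after>k}".
GC_REPORT = re.compile(r"\{GC (\d+)k? -> (\d+)k?\}")


def peak_memory(dump: str):
    """Return (peak before GGC, peak after GGC) in kilobytes, or None if no report is found."""
    before = []
    after = []
    for match in GC_REPORT.finditer(dump):
        before.append(int(match.group(1)))
        after.append(int(match.group(2)))
    if not before:
        return None
    return max(before), max(after)


if __name__ == "__main__":
    # Made-up sample text in the assumed format, not real compiler output.
    sample = "{GC 2100k -> 1800k} {GC 2229k -> 1936k} {GC 2000k -> 1900k}"
    print(peak_memory(sample))
```

Taking the two maxima independently matches the separate "before GGC" and "after GGC" peak lines in the report; whether the script does exactly this is an assumption.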
Your testing script.