This is the mail archive of the gcc-regression@gcc.gnu.org mailing list for the GCC project.
A recent patch increased GCC's memory consumption!
- From: gcctest at suse dot de
- To: jh at suse dot cz, gcc-regression at gcc dot gnu dot org
- Date: Wed, 08 Nov 2006 13:17:53 +0000
- Subject: A recent patch increased GCC's memory consumption!
Hi,
I am a friendly script caring about memory consumption in GCC. Please
contact jh@suse.cz if something is going wrong.
Comparing memory consumption on compilation of combine.i, insn-attrtab.i,
and generate-3.4.ii I got:
comparing combine.c compilation at -O0 level:
Peak amount of GGC memory allocated before garbage collecting increased from 9293k to 9304k, overall 0.12%
Peak amount of GGC memory still allocated after garbage collecting increased from 8832k to 8843k, overall 0.12%
Amount of memory still referenced at the end of compilation increased from 6441k to 6454k, overall 0.21%
Overall memory needed: 28374k -> 28410k
Peak memory use before GGC: 9293k -> 9304k
Peak memory use after GGC: 8832k -> 8843k
Maximum of released memory in single GGC run: 2666k
Garbage: 36856k -> 36845k
Leak: 6441k -> 6454k
Overhead: 4860k -> 4862k
GGC runs: 280
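The "overall" percentages quoted above are relative increases of the new figure over the old one. A minimal sketch of that arithmetic follows; the function name `percent_increase` is hypothetical and is not taken from the testing script. Because the report prints sizes rounded to whole kilobytes, recomputing from the printed values can differ from the reported percentage in the last digit (presumably the script works from unrounded numbers).

```python
def percent_increase(before_k: int, after_k: int) -> float:
    """Relative growth of a memory figure, in percent."""
    return (after_k - before_k) / before_k * 100.0


# Figures from the -O0 combine.c comparison in this report.
peak_before_gc = percent_increase(9293, 9304)
peak_after_gc = percent_increase(8832, 8843)

print(f"peak before GGC: {peak_before_gc:.2f}%")
print(f"peak after GGC:  {peak_after_gc:.2f}%")
```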
comparing combine.c compilation at -O1 level:
Amount of memory still referenced at the end of compilation increased from 6495k to 6517k, overall 0.33%
Overall memory needed: 40218k -> 40254k
Peak memory use before GGC: 17281k -> 17292k
Peak memory use after GGC: 17106k -> 17117k
Maximum of released memory in single GGC run: 2363k
Garbage: 57582k -> 57577k
Leak: 6495k -> 6517k
Overhead: 6224k -> 6226k
GGC runs: 355
comparing combine.c compilation at -O2 level:
Overall memory needed: 29790k -> 29802k
Peak memory use before GGC: 17277k -> 17288k
Peak memory use after GGC: 17106k -> 17117k
Maximum of released memory in single GGC run: 2803k
Garbage: 74892k -> 74893k
Leak: 6603k -> 6609k
Overhead: 8470k -> 8473k
GGC runs: 413
comparing combine.c compilation at -O3 level:
Amount of memory still referenced at the end of compilation increased from 6668k to 6681k, overall 0.20%
Overall memory needed: 28894k -> 28902k
Peak memory use before GGC: 18218k -> 18229k
Peak memory use after GGC: 17834k -> 17845k
Maximum of released memory in single GGC run: 4104k
Garbage: 104223k -> 104223k
Leak: 6668k -> 6681k
Overhead: 11907k -> 11909k
GGC runs: 462
comparing insn-attrtab.c compilation at -O0 level:
Amount of memory still referenced at the end of compilation increased from 9501k to 9514k, overall 0.14%
Overall memory needed: 88230k -> 88242k
Peak memory use before GGC: 69777k -> 69788k
Peak memory use after GGC: 44187k -> 44198k
Maximum of released memory in single GGC run: 36963k
Garbage: 129065k -> 129062k
Leak: 9501k -> 9514k
Overhead: 16993k -> 16996k
GGC runs: 216
comparing insn-attrtab.c compilation at -O1 level:
Amount of memory still referenced at the end of compilation increased from 9343k to 9357k, overall 0.14%
Overall memory needed: 113902k -> 114174k
Peak memory use before GGC: 90363k -> 90374k
Peak memory use after GGC: 83725k -> 83736k
Maximum of released memory in single GGC run: 31852k
Garbage: 277769k -> 277769k
Leak: 9343k -> 9357k
Overhead: 29778k -> 29780k
GGC runs: 223 -> 222
comparing insn-attrtab.c compilation at -O2 level:
Amount of memory still referenced at the end of compilation increased from 9345k to 9359k, overall 0.14%
Overall memory needed: 120390k -> 120406k
Peak memory use before GGC: 92593k -> 92604k
Peak memory use after GGC: 84705k -> 84716k
Maximum of released memory in single GGC run: 30394k
Garbage: 317192k -> 317192k
Leak: 9345k -> 9359k
Overhead: 36353k -> 36356k
GGC runs: 246 -> 245
comparing insn-attrtab.c compilation at -O3 level:
Amount of memory still referenced at the end of compilation increased from 9348k to 9362k, overall 0.14%
Overall memory needed: 134218k -> 134226k
Peak memory use before GGC: 92618k -> 92629k
Peak memory use after GGC: 84731k -> 84742k
Maximum of released memory in single GGC run: 30584k
Garbage: 317844k -> 317844k
Leak: 9348k -> 9362k
Overhead: 36551k -> 36554k
GGC runs: 250 -> 249
comparing Gerald's testcase PR8361 compilation at -O0 level:
Overall memory needed: 119538k -> 119550k
Peak memory use before GGC: 92680k -> 92691k
Peak memory use after GGC: 91760k -> 91771k
Maximum of released memory in single GGC run: 19314k
Garbage: 205600k -> 205599k
Leak: 47677k -> 47691k
Overhead: 20817k -> 20819k
GGC runs: 402 -> 401
comparing Gerald's testcase PR8361 compilation at -O1 level:
Overall memory needed: 119278k -> 119302k
Peak memory use before GGC: 97848k -> 97860k
Peak memory use after GGC: 95638k -> 95650k
Maximum of released memory in single GGC run: 18600k
Garbage: 444206k -> 444236k
Leak: 50011k -> 50025k
Overhead: 32784k -> 32786k
GGC runs: 552 -> 551
comparing Gerald's testcase PR8361 compilation at -O2 level:
Overall memory needed: 119286k -> 119294k
Peak memory use before GGC: 97848k -> 97860k
Peak memory use after GGC: 95638k -> 95650k
Maximum of released memory in single GGC run: 18600k
Garbage: 503803k -> 503803k
Leak: 50716k -> 50729k
Overhead: 40089k -> 40091k
GGC runs: 608 -> 607
comparing Gerald's testcase PR8361 compilation at -O3 level:
Overall memory needed: 118930k -> 118926k
Peak memory use before GGC: 97894k -> 97906k
Peak memory use after GGC: 96924k -> 96936k
Maximum of released memory in single GGC run: 18847k
Garbage: 523450k -> 523447k
Leak: 50291k -> 50304k
Overhead: 40598k -> 40603k
GGC runs: 621 -> 620
comparing PR rtl-optimization/28071 testcase compilation at -O0 level:
Amount of memory still referenced at the end of compilation increased from 7522k to 7536k, overall 0.18%
Overall memory needed: 137946k -> 137786k
Peak memory use before GGC: 81898k -> 81909k
Peak memory use after GGC: 58777k -> 58788k
Maximum of released memory in single GGC run: 45493k
Garbage: 147195k -> 147243k
Leak: 7522k -> 7536k
Overhead: 25300k -> 25302k
GGC runs: 83 -> 82
comparing PR rtl-optimization/28071 testcase compilation at -O1 level:
Overall memory needed: 423030k -> 423186k
Peak memory use before GGC: 205260k -> 205271k
Peak memory use after GGC: 201036k -> 201047k
Maximum of released memory in single GGC run: 101714k
Garbage: 271706k -> 271706k
Leak: 47588k -> 47601k
Overhead: 30829k -> 30831k
GGC runs: 101
comparing PR rtl-optimization/28071 testcase compilation at -O2 level:
Overall memory needed: 352334k -> 352070k
Peak memory use before GGC: 206001k -> 206012k
Peak memory use after GGC: 201777k -> 201788k
Maximum of released memory in single GGC run: 108617k
Garbage: 351905k -> 351905k
Leak: 48171k -> 48184k
Overhead: 46573k -> 46575k
GGC runs: 110
comparing PR rtl-optimization/28071 testcase compilation at -O3 -fno-tree-pre -fno-tree-fre level:
Overall memory needed: 781042k -> 781150k
Peak memory use before GGC: 314916k -> 314927k
Peak memory use after GGC: 293259k -> 293270k
Maximum of released memory in single GGC run: 165331k
Garbage: 494299k -> 494299k
Leak: 65503k -> 65517k
Overhead: 59714k -> 59716k
GGC runs: 98
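For readers who want to post-process these blocks, the following is a rough sketch of extracting the per-metric change from the "name: old -> new" lines. The regular expression and the helper name `deltas` are assumptions made for illustration; they are inferred only from the layout of this message, not from the script that produced it. Lines that carry a single value (no arrow) are treated as unchanged and skipped.

```python
import re

# Matches lines such as "Leak: 6441k -> 6454k" or "GGC runs: 621 -> 620".
CHANGED = re.compile(
    r"^(?P<name>[^:]+):\s*(?P<old>\d+)k?\s*->\s*(?P<new>\d+)k?\s*$"
)


def deltas(block: str) -> dict:
    """Map each changed metric name to (new - old)."""
    out = {}
    for line in block.splitlines():
        m = CHANGED.match(line.strip())
        if m:
            out[m["name"]] = int(m["new"]) - int(m["old"])
    return out


# Excerpt of the PR8361 -O3 block from this report.
sample = """\
Overall memory needed: 118930k -> 118926k
Peak memory use before GGC: 97894k -> 97906k
Maximum of released memory in single GGC run: 18847k
Garbage: 523450k -> 523447k
Leak: 50291k -> 50304k
GGC runs: 621 -> 620
"""

result = deltas(sample)
for name, change in result.items():
    print(f"{name}: {change:+d}")
```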
Head of the ChangeLog is:
--- /usr/src/SpecTests/sandbox-britten-memory/x86_64/mem-result/ChangeLog 2006-11-08 01:38:17.000000000 +0000
+++ /usr/src/SpecTests/sandbox-britten-memory/gcc/gcc/ChangeLog 2006-11-08 11:39:57.000000000 +0000
@@ -1,3 +1,185 @@
+2006-11-08 Dorit Nuzman <dorit@il.ibm.com>
+
+ * tree-vect-analyze.c (vect_mark_relevant, vect_stmt_relevant_p): Take
+ enum argument instead of bool.
+ (vect_analyze_operations): Call vectorizable_type_promotion.
+ * tree-vectorizer.h (type_promotion_vec_info_type): New enum
+ stmt_vec_info_type value.
+ (supportable_widening_operation, vectorizable_type_promotion): New
+ function declarations.
+ * tree-vect-transform.c (vect_gen_widened_results_half): New function.
+ (vectorizable_type_promotion): New function.
+ (vect_transform_stmt): Call vectorizable_type_promotion.
+ * tree-vect-analyze.c (supportable_widening_operation): New function.
+ * tree-vect-patterns.c (vect_recog_dot_prod_pattern):
+ Add implementation.
+ * tree-vect-generic.c (expand_vector_operations_1): Consider correct
+ mode.
+
+ * tree.def (VEC_WIDEN_MULT_HI_EXPR, VEC_WIDEN_MULT_LO_EXPR):
+ (VEC_UNPACK_HI_EXPR, VEC_UNPACK_LO_EXPR): New tree-codes.
+ * tree-inline.c (estimate_num_insns_1): Add cases for above new
+ tree-codes.
+ * tree-pretty-print.c (dump_generic_node, op_prio): Likewise.
+ * expr.c (expand_expr_real_1): Likewise.
+ * optabs.c (optab_for_tree_code): Likewise.
+ (init_optabs): Initialize new optabs.
+ * genopinit.c (vec_widen_umult_hi_optab, vec_widen_smult_hi_optab,
+ vec_widen_smult_hi_optab, vec_widen_smult_lo_optab,
+ vec_unpacks_hi_optab, vec_unpacks_lo_optab, vec_unpacku_hi_optab,
+ vec_unpacku_lo_optab): Initialize new optabs.
+ * optabs.h (OTI_vec_widen_umult_hi, OTI_vec_widen_umult_lo):
+ (OTI_vec_widen_smult_hi, OTI_vec_widen_smult_lo, OTI_vec_unpacks_hi,
+ OTI_vec_unpacks_lo, OTI_vec_unpacku_hi, OTI_vec_unpacku_lo): New
+ optab indices.
+ (vec_widen_umult_hi_optab, vec_widen_umult_lo_optab):
+ (vec_widen_smult_hi_optab, vec_widen_smult_lo_optab):
+ (vec_unpacks_hi_optab, vec_unpacku_hi_optab, vec_unpacks_lo_optab):
+ (vec_unpacku_lo_optab): New optabs.
+ * doc/md.texi (vec_unpacks_hi, vec_unpacks_lo, vec_unpacku_hi):
+ (vec_unpacku_lo, vec_widen_umult_hi, vec_widen_umult_lo):
+ (vec_widen_smult_hi, vec_widen_smult_lo): New.
+ * doc/c-tree.texi (VEC_LSHIFT_EXPR, VEC_RSHIFT_EXPR):
+ (VEC_WIDEN_MULT_HI_EXPR, VEC_WIDEN_MULT_LO_EXPR, VEC_UNPACK_HI_EXPR):
+ (VEC_UNPACK_LO_EXPR, VEC_PACK_MOD_EXPR, VEC_PACK_SAT_EXPR): New.
+
+ * config/rs6000/altivec.md (UNSPEC_VMULWHUB, UNSPEC_VMULWLUB):
+ (UNSPEC_VMULWHSB, UNSPEC_VMULWLSB, UNSPEC_VMULWHUH, UNSPEC_VMULWLUH):
+ (UNSPEC_VMULWHSH, UNSPEC_VMULWLSH): New.
+ (UNSPEC_VPERMSI, UNSPEC_VPERMHI): New.
+ (vec_vperm_v8hiv4si, vec_vperm_v16qiv8hi): New patterns used to
+ implement the unsigned unpacking patterns.
+ (vec_unpacks_hi_v16qi, vec_unpacks_hi_v8hi, vec_unpacks_lo_v16qi):
+ (vec_unpacks_lo_v8hi): New signed unpacking patterns.
+ (vec_unpacku_hi_v16qi, vec_unpacku_hi_v8hi, vec_unpacku_lo_v16qi):
+ (vec_unpacku_lo_v8hi): New unsigned unpacking patterns.
+ (vec_widen_umult_hi_v16qi, vec_widen_umult_lo_v16qi):
+ (vec_widen_smult_hi_v16qi, vec_widen_smult_lo_v16qi):
+ (vec_widen_umult_hi_v8hi, vec_widen_umult_lo_v8hi):
+ (vec_widen_smult_hi_v8hi, vec_widen_smult_lo_v8hi): New widening
+ multiplication patterns.
+
+ * target.h (builtin_mul_widen_even, builtin_mul_widen_odd): New.
+ * target-def.h (TARGET_VECTORIZE_BUILTIN_MUL_WIDEN_EVEN):
+ (TARGET_VECTORIZE_BUILTIN_MUL_WIDEN_ODD): New.
+ * config/rs6000/rs6000.c (rs6000_builtin_mul_widen_even): New.
+ (rs6000_builtin_mul_widen_odd): New.
+ (TARGET_VECTORIZE_BUILTIN_MUL_WIDEN_EVEN): Defined.
+ (TARGET_VECTORIZE_BUILTIN_MUL_WIDEN_ODD): Defined.
+ * tree-vectorizer.h (enum vect_relevant): New enum type.
+ (_stmt_vec_info): Field relevant changed from bool to enum
+ vect_relevant.
+ (STMT_VINFO_RELEVANT_P): Updated.
+ (STMT_VINFO_RELEVANT): New.
+ * tree-vectorizer.c (new_stmt_vec_info): Use STMT_VINFO_RELEVANT
+ instead of STMT_VINFO_RELEVANT_P.
+ * tree-vect-analyze.c (vect_mark_relevant, vect_stmt_relevant_p):
+ Replace calls to STMT_VINFO_RELEVANT_P with STMT_VINFO_RELEVANT,
+ and boolean variable with enum vect_relevant.
+ (vect_mark_stmts_to_be_vectorized): Likewise + update documentation.
+ * doc/tm.texi (TARGET_VECTORIZE_BUILTIN_MUL_WIDEN_EVEN): New.
+ (TARGET_VECTORIZE_BUILTIN_MUL_WIDEN_ODD): New.
+
+ 2006-11-08 Richard Henderson <rth@redhat.com>
+
+ * config/i386/sse.md (vec_widen_umult_hi_v8hi,
+ vec_widen_umult_lo_v8hi): New.
+ (vec_widen_smult_hi_v4si, vec_widen_smult_lo_v4si,
+ vec_widen_umult_hi_v4si, vec_widen_umult_lo_v4si): New.
+
+ * config/i386/i386.c (ix86_expand_sse_unpack): New.
+ * config/i386/i386-protos.h (ix86_expand_sse_unpack): New.
+ * config/i386/sse.md (vec_unpacku_hi_v16qi, vec_unpacks_hi_v16qi,
+ vec_unpacku_lo_v16qi, vec_unpacks_lo_v16qi, vec_unpacku_hi_v8hi,
+ vec_unpacks_hi_v8hi, vec_unpacku_lo_v8hi, vec_unpacks_lo_v8hi,
+ vec_unpacku_hi_v4si, vec_unpacks_hi_v4si, vec_unpacku_lo_v4si,
+ vec_unpacks_lo_v4si): New.
+
+ 2006-11-08 Dorit Nuzman <dorit@il.ibm.com>
+
+ * tree-vect-transform.c (vectorizable_type_demotion): New function.
+ (vect_transform_stmt): Add case for type_demotion_vec_info_type.
+ (vect_analyze_operations): Call vectorizable_type_demotion.
+ * tree-vectorizer.h (type_demotion_vec_info_type): New enum
+ stmt_vec_info_type value.
+ (vectorizable_type_demotion): New function declaration.
+ * tree-vect-generic.c (expand_vector_operations_1): Consider correct
+ mode.
+
+ * tree.def (VEC_PACK_MOD_EXPR, VEC_PACK_SAT_EXPR): New tree-codes.
+ * expr.c (expand_expr_real_1): Add case for VEC_PACK_MOD_EXPR and
+ VEC_PACK_SAT_EXPR.
+ * tree-inline.c (estimate_num_insns_1): Likewise.
+ * tree-pretty-print.c (dump_generic_node, op_prio): Likewise.
+ * optabs.c (optab_for_tree_code): Likewise.
+
+ * optabs.c (expand_binop): In case of vec_pack_*_optabs the mode
+ compared against the predicate of the result is not 'mode' (the input
+ to the function) but a mode with half the size of 'mode'.
+ (init_optab): Initialize new optabs.
+ * optabs.h (OTI_vec_pack_mod, OTI_vec_pack_ssat, OTI_vec_pack_usat):
+ New optab indices.
+ (vec_pack_mod_optab, vec_pack_ssat_optab, vec_pack_usat_optab): New
+ optabs.
+ * genopinit.c (vec_pack_mod_optab, vec_pack_ssat_optab):
+ (vec_pack_usat_optab): Initialize new optabs.
+ * doc/md.texi (vec_pack_mod, vec_pack_ssat, vec_pack_usat): New.
+ * config/rs6000/altivec.md (vec_pack_mod_v8hi, vec_pack_mod_v4si): New.
+
+ 2006-11-08 Richard Henderson <rth@redhat.com>
+
+ * config/i386/sse.md (vec_pack_mod_v8hi, vec_pack_mod_v4si):
+ (vec_pack_mod_v2di, vec_interleave_highv16qi, vec_interleave_lowv16qi):
+ (vec_interleave_highv8hi, vec_interleave_lowv8hi):
+ (vec_interleave_highv4si, vec_interleave_lowv4si):
+ (vec_interleave_highv2di, vec_interleave_lowv2di): New.
+
+ 2006-11-08 Dorit Nuzman <dorit@il.ibm.com>
+
+ * tree-vect-transform.c (vectorizable_reduction): Support multiple
+ datatypes.
+ (vect_transform_stmt): Removed redundant code.
+
+ 2006-11-08 Dorit Nuzman <dorit@il.ibm.com>
+
+ * tree-vect-transform.c (vectorizable_operation): Support multiple
+ datatypes.
+
+ 2006-11-08 Dorit Nuzman <dorit@il.ibm.com>
+
+ * tree-vect-transform.c (vect_align_data_ref): Removed.
+ (vect_create_data_ref_ptr): Added additional argument - ptr_incr.
+ Updated function documentation. Return the increment stmt in ptr_incr.
+ (bump_vector_ptr): New function.
+ (vect_get_vec_def_for_stmt_copy): New function.
+ (vect_finish_stmt_generation): Create a stmt_info to newly created
+ vector stmts.
+ (vect_setup_realignment): Call vect_create_data_ref_ptr with additional
+ argument.
+ (vectorizable_reduction, vectorizable_assignment): Not supported yet if
+ VF is greater than the number of elements that can fit in one vector
+ word.
+ (vectorizable_operation, vectorizable_condition): Likewise.
+ (vectorizable_store, vectorizable_load): Support the case that the VF
+ is greater than the number of elements that can fit in one vector word.
+ (vect_transform_loop): Don't fail in case of multiple data-types.
+ * tree-vect-analyze.c (vect_determine_vectorization_factor): Don't fail
+ in case of multiple data-types; the smallest type determines the VF.
+ (vect_analyze_data_ref_dependence): Don't record datarefs as same_align
+ if they are of different sizes.
+ (vect_update_misalignment_for_peel): Compare misalignments in terms of
+ number of elements rather than number of bytes.
+ (vect_enhance_data_refs_alignment): Fix/Add dump printouts.
+ (vect_can_advance_ivs_p): Fix a dump printout
+
+2006-11-07 Eric Christopher <echristo@apple.com>
+
+ * libgcc2.c (__bswapdi2): Rename from bswapDI2.
+ (__bswapsi2): Ditto.
+ * libgcc2.h: Remove transformation of bswap routines.
+ * config/i386/i386.md (bswapsi2): New.
+ (bswapdi2): Ditto.
+
2006-11-07 Jakub Jelinek <jakub@redhat.com>
* c-common.c (c_common_attributes): Add gnu_inline attribute.
@@ -14,7 +196,7 @@
2006-11-06 Anatoly Sokolov <aesok@post.ru>
- * config/avr/avr-protos.h (mask_one_bit_p, const_int_pow2_p): Remove
+ * config/avr/avr-protos.h (mask_one_bit_p, const_int_pow2_p): Remove
prototype.
* config/avr/avr.c (mask_one_bit_p, const_int_pow2_p): Remove.
(output_movhi, ashlhi3_out, ashlsi3_out, ashrhi3_out, ashrsi3_out,
@@ -50,9 +232,9 @@
* gcc.c (process_command): Treat -b as normal switch if its argument
has no dash.
-
+
2006-11-07 David Ung <davidu@mips.com>
-
+
* config/mips/mips.h (ISA_HAS_PREFETCHX): Add ISA_MIPS32R2 to the
list.
The results can be reproduced by building a compiler with
--enable-gather-detailed-mem-stats targeting x86-64
and compiling preprocessed combine.c or the testcase from PR8361 with:
-fmem-report --param=ggc-min-heapsize=1024 --param=ggc-min-expand=1 -Ox -Q
The memory consumption summary appears in the dump after the detailed listing
of the places where memory is allocated. Peak memory consumption is computed
by taking the maximal value found in the {GC XXXX -> YYYY} reports.
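As an illustration of the peak computation described above, here is a small sketch that scans compiler output for {GC XXXX -> YYYY} reports and takes the maximum on each side of the arrow. The exact text of the reports (a "k" suffix on both numbers) is an assumption based on the figures in this message, the sample string is made up, and `peak_usage` is a hypothetical helper, not part of the testing script.

```python
import re

# Assumed shape of one report: "{GC 9293k -> 8832k}".
GC_REPORT = re.compile(r"\{GC (\d+)k -> (\d+)k\}")


def peak_usage(dump: str):
    """Return (peak before GGC, peak after GGC) in kB, or None if no reports."""
    pairs = [(int(before), int(after)) for before, after in GC_REPORT.findall(dump)]
    if not pairs:
        return None
    return max(b for b, _ in pairs), max(a for _, a in pairs)


# Made-up output fragment, for illustration only.
sample = " {GC 9293k -> 8832k} {GC 9100k -> 8500k} {GC 7000k -> 6441k}"
print(peak_usage(sample))
```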
Your testing script.