Some aspect of GCC memory consumption increased by recent patch
gcctest@suse.de
Tue Feb 1 04:22:00 GMT 2005
Hi,
Comparing memory consumption when compiling combine.i and generate-3.4.ii, I got:
comparing combine.c compilation at -O0 level:
Overall memory needed: 24641k -> 24637k
Peak memory use before GGC: 9351k
Peak memory use after GGC: 8665k
Maximum of released memory in single GGC run: 2864k
Garbage: 41665k
Leak: 6387k
Overhead: 5772k
GGC runs: 328
comparing combine.c compilation at -O1 level:
Overall memory allocated via mmap and sbrk increased from 25461k to 25497k, overall 0.14%
Overall memory needed: 25461k -> 25497k
Peak memory use before GGC: 9228k
Peak memory use after GGC: 8733k
Maximum of released memory in single GGC run: 2027k
Garbage: 61218k
Leak: 6749k
Overhead: 9980k
GGC runs: 503
comparing combine.c compilation at -O2 level:
Overall memory needed: 29525k -> 29541k
Peak memory use before GGC: 12663k
Peak memory use after GGC: 12537k
Maximum of released memory in single GGC run: 2550k
Garbage: 79173k
Leak: 6585k
Overhead: 14094k
GGC runs: 515
comparing combine.c compilation at -O3 level:
Overall memory needed: 20212k
Peak memory use before GGC: 12794k
Peak memory use after GGC: 12537k
Maximum of released memory in single GGC run: 3345k
Garbage: 106976k
Leak: 7098k
Overhead: 18885k
GGC runs: 581
comparing insn-attrtab.c compilation at -O0 level:
Overall memory needed: 114136k
Peak memory use before GGC: 74739k
Peak memory use after GGC: 45485k
Maximum of released memory in single GGC run: 39340k
Garbage: 152660k
Leak: 10976k
Overhead: 19969k
GGC runs: 273
comparing insn-attrtab.c compilation at -O1 level:
Overall memory allocated via mmap and sbrk increased from 124560k to 125300k, overall 0.59%
Overall memory needed: 124560k -> 125300k
Peak memory use before GGC: 78748k
Peak memory use after GGC: 70095k
Maximum of released memory in single GGC run: 40765k
Garbage: 367812k
Leak: 11353k
Overhead: 69563k
GGC runs: 399
comparing insn-attrtab.c compilation at -O2 level:
Overall memory needed: 148992k
Peak memory use before GGC: 98349k
Peak memory use after GGC: 83466k
Maximum of released memory in single GGC run: 39290k
Garbage: 481106k
Leak: 11239k
Overhead: 84673k
GGC runs: 341
comparing insn-attrtab.c compilation at -O3 level:
Overall memory needed: 148984k -> 147352k
Peak memory use before GGC: 98351k
Peak memory use after GGC: 83467k
Maximum of released memory in single GGC run: 39291k
Garbage: 482150k
Leak: 11277k
Overhead: 84822k
GGC runs: 347
comparing Gerald's testcase PR8361 compilation at -O0 level:
Amount of produced GGC garbage decreased from 245826k to 207460k, overall -18.49%
Overall memory needed: 111092k -> 110792k
Peak memory use before GGC: 86885k -> 86594k
Peak memory use after GGC: 85935k -> 85642k
Maximum of released memory in single GGC run: 19282k -> 19192k
Garbage: 245826k -> 207460k
Leak: 55495k -> 54069k
Overhead: 43290k -> 36026k
GGC runs: 367 -> 317
comparing Gerald's testcase PR8361 compilation at -O1 level:
Amount of produced GGC garbage decreased from 445679k to 155717k, overall -186.21%
Amount of memory still referenced at the end of compilation increased from 56775k to 70830k, overall 24.76%
Overall memory needed: 103673k -> 102265k
Peak memory use before GGC: 85967k -> 85625k
Peak memory use after GGC: 84933k -> 84691k
Maximum of released memory in single GGC run: 18946k -> 18803k
Garbage: 445679k -> 155717k
Leak: 56775k -> 70830k
Overhead: 65545k -> 31643k
GGC runs: 526 -> 256
comparing Gerald's testcase PR8361 compilation at -O2 level:
Amount of produced GGC garbage decreased from 487823k to 155764k, overall -213.18%
Amount of memory still referenced at the end of compilation increased from 57392k to 70901k, overall 23.54%
Overall memory needed: 104721k -> 102261k
Peak memory use before GGC: 85967k -> 85625k
Peak memory use after GGC: 84933k -> 84692k
Maximum of released memory in single GGC run: 18945k -> 18802k
Garbage: 487823k -> 155764k
Leak: 57392k -> 70901k
Overhead: 75166k -> 31661k
GGC runs: 584 -> 256
comparing Gerald's testcase PR8361 compilation at -O3 level:
Amount of produced GGC garbage decreased from 503831k to 154891k, overall -225.28%
Amount of memory still referenced at the end of compilation increased from 57523k to 72139k, overall 25.41%
Overall memory needed: 112413k -> 109901k
Peak memory use before GGC: 92707k -> 92289k
Peak memory use after GGC: 86227k -> 85886k
Maximum of released memory in single GGC run: 19713k -> 19647k
Garbage: 503831k -> 154891k
Leak: 57523k -> 72139k
Overhead: 76875k -> 31584k
GGC runs: 569 -> 255
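The report does not say how the "overall" percentages in the summary lines are derived. Judging only from the printed figures, they look like the difference divided by the smaller of the two values, which would explain why a decrease can exceed 100%. A minimal Python sketch under that assumption (the name `relative_change` is hypothetical and not taken from the testing script):

```python
def relative_change(old_kb: int, new_kb: int) -> float:
    """Percent change relative to the smaller of the two values.

    Assumption: this formula is inferred from the numbers in the report,
    not from the source of the testing script.
    """
    return round((new_kb - old_kb) / min(old_kb, new_kb) * 100, 2)


# Figures from the PR8361 -O1 comparison in this report.
print(relative_change(445679, 155717))  # Garbage: prints -186.21
print(relative_change(56775, 70830))    # Leak: prints 24.76
```

Under this reading, the -186.21% figure means the old garbage volume was about 2.86 times the new one, not that garbage fell below zero.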
Head of changelog is:
--- /usr/src/SpecTests/sandbox-britten-memory/x86_64/mem-result/ChangeLog 2005-01-31 22:50:53.000000000 +0000
+++ /usr/src/SpecTests/sandbox-britten-memory/gcc/gcc/ChangeLog 2005-02-01 03:32:10.000000000 +0000
@@ -1,3 +1,46 @@
+2005-01-31 James E. Wilson <wilson@specifixinc.com>
+
+ * config/ia64/itanium1.md (1_scall bypass): Change 2_mmalua to
+ 1_mmalua.
+
+2005-02-01 Eric Christopher <echristo@redhat.com>
+
+ * config/mips/mips.c (override_options): Warn if -mint64
+ is used.
+ * doc/invoke.texi (MIPS Options): Document that -mint64 is
+ deprecated.
+
+2005-02-01 Kazu Hirata <kazu@cs.umass.edu>
+
+ * cse.c (cse_reg_info): Remove hash_next, next, regno. Add
+ timestamp.
+ (cse_reg_info_list, cse_reg_info_list_free, REGHASH_SHIFT,
+ REGHASH_SIZE, REGHASH_MASK, reg_hash, REGHASH_FN,
+ cached_cse_reg_info, GET_CSE_REG_INFO): Remove.
+ (cached_regno): Initialize to INVALID_REGNUM.
+ (cse_reg_info_table_size,
+ cse_reg_info_table_first_uninitialized,
+ cse_reg_info_timestamp): New.
+ (REG_TICK, REG_IN_TABLE, SUBREG_TICKED, REG_QTY): Use
+ get_cse_reg_info.
+ (init_cse_reg_info, get_cse_reg_info_1): New.
+ (get_cse_reg_info): Cache the last look-up.
+ (new_basic_block): Update the code to clear mappings from
+ registers to cse_reg_info entries.
+ (cse_main): Call init_cse_reg_info.
+
+ * cse.c (get_cse_reg_info): Update a comment.
+
+2005-01-31 Steven Bosscher <stevenb@suse.de>
+
+ PR c/19333
+ * c-decl.c (start_decl): Do not warn about arrays of elements with
+ an incomplete type here.
+ (grokdeclarator): Do it here by making a pedwarn an error.
+ * c-typeck.c (push_init_level): If there were previous errors with
+ the constructor type, do not warn about braces for initializers.
+ (process_init_element): Likewise for excess initializer elements.
+
2005-01-31 Kazu Hirata <kazu@cs.umass.edu>
* cse.c (delete_trivially_dead_insn): Don't iterate.
@@ -21,6 +64,11 @@
2005-01-31 Dale Johannesen <dalej@apple.com>
+ * doc/extend.texi (nested functions): Fix linkage description.
+ Clarify that static is not allowed.
+
+2005-01-31 Dale Johannesen <dalej@apple.com>
+
* config/rs6000/darwin.md (movsf_low_di): Make work.
(movdf_low_di): Make work.
@@ -2071,7 +2119,7 @@
2005-01-11 Andreas Krebbel <krebbel1@de.ibm.com>
- * config/s390/s390.c (struct s390_frame_layout): Remove
+ * config/s390/s390.c (struct s390_frame_layout): Remove
save_backchain_p.
(s390_frame_info, s390_emit_prologue): Replace occurrences of
save_backchain_p with TARGET_BACKCHAIN.
@@ -2191,12 +2239,12 @@
2005-01-09 Ira Rosen <irar@il.ibm.com>
- * tree-vectorizer.c (vect_analyze_offset_expr): Use
+ * tree-vectorizer.c (vect_analyze_offset_expr): Use
expr_invariant_in_loop_p.
Initialize outputs first thing in the function.
(vect_update_ivs_after_vectorizer): Call initial_condition_in_loop_num.
(vect_is_simple_iv_evolution): Call initial_condition_in_loop_num.
- (vect_analyze_pointer_ref_access): Check that the initial condition of
+ (vect_analyze_pointer_ref_access): Check that the initial condition of
the access function is loop invariant.
2005-01-09 Richard Henderson <rth@redhat.com>
@@ -2218,7 +2266,7 @@
gtv2si3, umaxv8qi3, smaxv4hi3, uminv8qi3, sminv4hi3, ashrv4hi3,
ashrv2si3, lshrv4hi3, lshrv2si3, mmx_lshrdi3, ashlv4hi3, ashlv2si3,
mmx_ashldi3, mmx_packsswb, mmx_packssdw, mmx_packuswb, mmx_punpckhbw,
- mmx_punpckhwd, mmx_punpckhdq, mmx_punpcklbw, mmx_punpcklwd,
+ mmx_punpckhwd, mmx_punpckhdq, mmx_punpcklbw, mmx_punpcklwd,
mmx_punpckldq, emms, addv2sf3, subv2sf3, subrv2sf3, gtv2sf3, gev2sf3,
eqv2sf3, pfmaxv2sf3, pfminv2sf3, mulv2sf3, femms, pf2id, pf2iw,
pfacc, pfnacc, pfpnacc, pi2fw, floatv2si2, pfrcpv2sf2, pfrcpit1v2sf3,
@@ -2226,7 +2274,7 @@
pswapdv2sf2): Move to mmx.md; rename as necessary with leading
mmx_ prefix.
(mmx_clrdi, pavgusb): Remove.
- (ldmxcsr, stmxcsr, sfence, sfence_insn): Move to sse.md; rename
+ (ldmxcsr, stmxcsr, sfence, sfence_insn): Move to sse.md; rename
with leading sse_ prefix.
* config/i386/sse.md: Receive them.
* config/i386/mmx.md: New file.
@@ -2240,7 +2288,7 @@
(mmx_add<MMXMODEI>3, mmx_ssadd<MMXMODE12>3, mmx_usadd<MMXMODE12>3,
mmx_sub<MMXMODEI>3, mmx_sssub<MMXMODE12>3, mmx_ussub<MMXMODE12>3
mmx_ashr<MMXMODE24>3, mmx_lshr<MMXMODE23>3, mmx_ashl<MMXMODE24>3
- mmx_eq<MMXMODEI>3, mmx_gt<MMXMODEI>3, mmx_and<MMXMODEI>3,
+ mmx_eq<MMXMODEI>3, mmx_gt<MMXMODEI>3, mmx_and<MMXMODEI>3,
mmx_nand<MMXMODEI>3, mmx_ior<MMXMODEI>3, mmx_xor<MMXMODEI>3):
Macroize from existing patterns; use ix86_binary_operator_ok.
(mmx_packsswb, mmx_packssdw, mmx_packuswb): Add memory alternative.
@@ -2298,16 +2346,16 @@
(movtf, movtf_internal): Move near other fp moves.
(SSEMODE, SSEMODEI, vec_setv2df, vec_extractv2df, vec_initv2df,
vec_setv4sf, vec_extractv4sf, vec_initv4sf, movv4sf, movv4sf_internal,
- movv2df, movv2df_internal, mov<SSEMODEI>, mov<SSEMODEI>_internal,
+ movv2df, movv2df_internal, mov<SSEMODEI>, mov<SSEMODEI>_internal,
movmisalign<SSEMODE>, sse_movups_1, sse_movmskps, sse_movntv4sf,
sse_movhlps, sse_movlhps, sse_storehps, sse_loadhps, sse_storelps,
sse_loadlps, sse_loadss, sse_loadss_1, sse_movss, sse_storess,
sse_shufps, addv4sf3, vmaddv4sf3, subv4sf3, vmsubv4sf3, negv4sf2,
mulv4sf3, vmmulv4sf3, divv4sf3, vmdivv4sf3, rcpv4sf2, vmrcpv4sf2,
rsqrtv4sf2, vmrsqrtv4sf2, sqrtv4sf2, vmsqrtv4sf2, sse_andv4sf3,
- sse_nandv4sf3, sse_iorv4sf3, sse_xorv4sf3, sse2_andv2df3,
- sse2_nandv2df3, sse2_iorv2df3, sse2_xorv2df3, sse2_andv2di3,
- sse2_nandv2di3, sse2_iorv2di3, sse2_xorv2di3, maskcmpv4sf3,
+ sse_nandv4sf3, sse_iorv4sf3, sse_xorv4sf3, sse2_andv2df3,
+ sse2_nandv2df3, sse2_iorv2df3, sse2_xorv2df3, sse2_andv2di3,
+ sse2_nandv2di3, sse2_iorv2di3, sse2_xorv2di3, maskcmpv4sf3,
vmmaskcmpv4sf3, sse_comi, sse_ucomi, sse_unpckhps, sse_unpcklps,
smaxv4sf3, vmsmaxv4sf3, sminv4sf3, vmsminv4sf3, cvtpi2ps, cvtps2pi,
cvttps2pi, cvtsi2ss, cvtsi2ssq, cvtss2si, cvtss2siq, cvttss2si,
@@ -2325,14 +2373,14 @@
smulv8hi3_highpart, umulv8hi3_highpart, sse2_umulsidi3,
sse2_umulv2siv2di3, sse2_pmaddwd, sse2_uavgv16qi3, sse2_uavgv8hi3,
sse2_psadbw, sse2_pinsrw, sse2_pextrw, sse2_pshufd, sse2_pshuflw,
- sse2_pshufhw, eqv16qi3, eqv8hi3, eqv4si3, gtv16qi3, gtv8hi3,
+ sse2_pshufhw, eqv16qi3, eqv8hi3, eqv4si3, gtv16qi3, gtv8hi3,
gtv4si3, umaxv16qi3, smaxv8hi3, uminv16qi3, sminv8hi3, ashrv8hi3,
ashrv4si3, lshrv8hi3, lshrv4si3, lshrv2di3, ashlv8hi3, ashlv4si3,
ashlv2di3, sse2_ashlti3, sse2_lshrti3, sse2_unpckhpd, sse2_unpcklpd,
- sse2_packsswb, sse2_packssdw, sse2_packuswb, sse2_punpckhbw,
+ sse2_packsswb, sse2_packssdw, sse2_packuswb, sse2_punpckhbw,
sse2_punpckhwd, sse2_punpckhdq, sse2_punpcklbw, sse2_punpcklwd,
sse2_punpckldq, sse2_punpcklqdq, sse2_punpckhqdq, sse2_movupd,
- sse2_movdqu, sse2_movdq2q, sse2_movdq2q_rex64, sse2_movq2dq,
+ sse2_movdqu, sse2_movdq2q, sse2_movdq2q_rex64, sse2_movq2dq,
sse2_movq2dq_rex64, sse2_loadd, sse2_stored, sse2_storehpd,
sse2_loadhpd, sse2_storelpd, sse2_loadlpd, sse2_movsd, sse2_loadsd,
sse2_loadsd_1, sse2_storesd, sse2_shufpd, sse2_clflush, sse2_mfence,
@@ -2377,19 +2425,19 @@
sse3_haddv4sf3, sse3_hsubv4sf3, sse3_addsubv2df3, sse3_haddv2df3,
sse3_hsubv2df3, sse2_ashlti3, sse2_lshrti3): Likewise.
(sse_movhlps): Model with vec_select+vec_concat.
- (sse_movlhps, sse_unpckhps, sse_unpcklps, sse3_movshdup,
+ (sse_movlhps, sse_unpckhps, sse_unpcklps, sse3_movshdup,
sse3_movsldup, sse_shufps, sse_shufps_1, sse2_unpckhpd, sse3_movddup,
sse2_unpcklpd, sse2_shufpd, sse2_shufpd_1, sse2_punpckhbw,
sse2_punpcklbw, sse2_punpckhwd, sse2_punpcklwd, sse2_punpckhdq,
sse2_punpckldq, sse2_punpckhqdq, sse2_punpcklqdq, sse2_pshufd,
- sse2_pshufd_1, sse2_pshuflw, sse2_pshuflw_1, sse2_pshufhw,
+ sse2_pshufd_1, sse2_pshuflw, sse2_pshuflw_1, sse2_pshufhw,
sse2_pshufhw_1): Likewise.
(neg<SSEMODEI>2, one_cmpl<SSEMODEI>2): New.
(add<SSEMODEI>3, sse2_ssadd<SSEMODE12>3, sse2_usadd<SSEMODE12>3,
sub<SSEMODEI>3, sse2_sssub<SSEMODE12>3, sse2_ussub<SSEMODE12>3,
ashr<SSEMODE24>3, lshr<SSEMODE248>3, sse2_eq<SSEMODE124>3,
sse2_gt<SSEMODDE124>3, and<SSEMODEI>3, sse_nand<SSEMODEI>3,
- ior<SSEMODEI>3, xor<SSEMODEI>3): Macroize from existing patterns.
+ ior<SSEMODEI>3, xor<SSEMODEI>3): Macroize from existing patterns.
(addv4sf3, sse_vmaddv4sf3, mulv4sf3, sse_vmmulv4sf3, smaxv4sf3,
sse_vmsmaxv4sf3, sminv4sf3, sse_vmsminv4sf3, addv2df3, sse2_vmaddv2df3,
mulv2df3, sse2_vmmulv2df3, smaxv2df3, sse2_vmsmaxv2df3, sminv2df3,
@@ -2515,14 +2563,14 @@
for IBM long double format correctly.
2005-01-06 Daniel Berlin <dberlin@dberlin.org>
-
+
Fix PR tree-optimization/18792
* tree-data-ref.c (build_classic_dist_vector): Change first_loop
to first_loop_depth, and use loop depth instead of loop number.
(build_classic_dir_vector): Ditto.
(compute_data_dependences_for_loop): Use depth, not loop number.
- * tree-loop-linear.c (try_interchange_loops): Use loop depth, not loop
+ * tree-loop-linear.c (try_interchange_loops): Use loop depth, not loop
number. Pass in loops, instead of loop numbers.
(gather_interchange_stats): Ditto.
(linear_transform_loops): Ditto.
@@ -2544,7 +2592,7 @@
* gcc.c (process_command): Change year in 'gcc --version' to 2005.
2005-01-05 Daniel Berlin <dberlin@dberlin.org>
-
+
Fix PR middle-end/19286
Fix PR debug/19267
* dwarf2out.c (gen_subprogram_die): If we've already tried to
@@ -2555,7 +2603,7 @@
(decls_for_scope): Ditto.
* gimple-low.c (mark_blocks_with_used_subblocks): Remove.
(mark_used_blocks): Don't call mark_blocks_with_used_subblocks.
-
+
2005-01-05 Richard Henderson <rth@redhat.com>
PR target/11327
@@ -2564,7 +2612,7 @@
(ix86_expand_binop_builtin): Force operands into registers
when optimizing.
(ix86_expand_unop_builtin, ix86_expand_unop1_builtin,
- ix86_expand_sse_compare, ix86_expand_sse_comi,
+ ix86_expand_sse_compare, ix86_expand_sse_comi,
ix86_expand_builtin): Likewise.
2005-01-05 Richard Henderson <rth@redhat.com>
@@ -2629,7 +2677,7 @@
Richard Henderson <rth@redhat.com>
PR target/18910
- * config/i386/i386.c (ix86_expand_move): Handle tls symbols
+ * config/i386/i386.c (ix86_expand_move): Handle tls symbols
with an offset.
2005-01-05 Richard Henderson <rth@redhat.com>
@@ -2807,7 +2855,7 @@
* tree-vectorizer.c (vect_strip_conversions): New function.
(vect_analyze_offset_expr): Call vect_strip_conversions. Add
- check for binary class.
+ check for binary class.
2005-01-03 Daniel Berlin <dberlin@dberlin.org>
@@ -2822,7 +2870,7 @@
* tree-inline.c: Include debug.h.
(expand_call_inline): Call outlining_inline_function here.
* tree-optimize.c (init_tree_optimization_passes): Add
- pass_mark_used_blocks.
+ pass_mark_used_blocks.
* tree-pass.h (pass_mark_used_blocks): New.
* Makefile.in (tree-inline.o): Add debug.h dependency.
--- /usr/src/SpecTests/sandbox-britten-memory/x86_64/mem-result/ChangeLog.cp 2005-01-31 22:51:02.000000000 +0000
+++ /usr/src/SpecTests/sandbox-britten-memory/gcc/gcc/cp/ChangeLog 2005-02-01 03:32:20.000000000 +0000
@@ -1,3 +1,42 @@
+2005-01-31 Mark Mitchell <mark@codesourcery.com>
+
+ * decl.c (build_enumerator): Do not issue duplicate error messages
+ about invalid enumeration constants.
+ * parser.c (cp_parser_non_integral_constant_expression): Always
+ set parser->non_integral_constant_expression_p.
+ (cp_parser_primary_expression): Add cast_p parameter. Issue
+ errors about invalid uses of floating-point literals in
+ cast-expressions.
+ (cp_parser_postfix_expression): Add cast_p parameter.
+ (cp_parser_open_square_expression): Pass it.
+ (cp_parser_parenthesized_expression_list): Add cast_p parameter.
+ (cp_parser_unary_expression): Likewise.
+ (cp_parser_new_placement): Pass it.
+ (cp_parser_direct_new_declarator): Likewise.
+ (cp_parser_new_initializer): Likewise.
+ (cp_parser_cast_expression): Add cast_p parameter.
+ (cp_parser_binary_expression): Likewise.
+ (cp_parser_question_colon_clause): Likewise.
+ (cp_parser_assignment_expression): Likewise.
+ (cp_parser_expression): Likewise.
+ (cp_parser_constant_expression): If an integral constant
+ expression is invalid, return error_mark_node.
+ (cp_parser_expression_statement): Pass cast_p.
+ (cp_parser_condition): Likewise.
+ (cp_parser_iteration_statement): Likewise.
+ (cp_parser_jump_statement): Likewise.
+ (cp_parser_mem_initializer): Likewise.
+ (cp_parser_template_argument): Likewise.
+ (cp_parser_parameter_declaration): Likewise.
+ (cp_parser_initializer): Likewise.
+ (cp_parser_throw_expression): Likewise.
+ (cp_parser_attribute_list): Likewise.
+ (cp_parser_simple_cast_expression): Likewise.
+ (cp_parser_functional_cast): Likewise.
+ (cp_parser_late_parsing_default_args): Likewise.
+ (cp_parser_sizeof_operand): Save/restore
+ non_integral_constant_expression_p.
+
2005-01-31 Mike Stump <mrs@apple.com>
* parser.c (cp_lexer_new_main): Get the first token, first, before
I am a friendly script that cares about memory consumption in GCC. Please contact
jh@suse.cz if something goes wrong.
The results can be reproduced by building a compiler with
--enable-gather-detailed-mem-stats targeting x86-64 and compiling preprocessed
combine.c or the testcase from PR8361 with:
-fmem-report --param=ggc-min-heapsize=1024 --param=ggc-min-expand=1 -Ox -Q
The memory consumption summary appears in the dump after the detailed listing of
the places where memory is allocated. Peak memory consumption is computed
by taking the maximal value found in the {GC XXXX -> YYYY} reports.
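As an illustration of the peak computation described above, here is a rough Python sketch that scans a dump for {GC XXXX -> YYYY} reports. The exact number format (an optional "k" suffix), the mapping of results to the report fields, and the names `GC_REPORT` and `summarize_gc` are assumptions made for this example, not details of the real script:

```python
import re

# Assumed shape of one collection report, based on the "{GC XXXX -> YYYY}"
# pattern mentioned in the text; the optional "k" suffix is a guess.
GC_REPORT = re.compile(r"\{GC (\d+)k? -> (\d+)k?\}")


def summarize_gc(dump: str) -> dict:
    """Collect (before, after) sizes of every GC run and take the maxima."""
    pairs = [(int(before), int(after)) for before, after in GC_REPORT.findall(dump)]
    if not pairs:
        return {}
    return {
        "peak_before": max(before for before, _ in pairs),
        "peak_after": max(after for _, after in pairs),
        "max_released": max(before - after for before, after in pairs),
        "runs": len(pairs),
    }


# Made-up sample input, not real compiler output.
sample = "{GC 9351k -> 8665k} {GC 7000k -> 4136k}"
print(summarize_gc(sample))
```

In this reading, `peak_before` and `peak_after` would correspond to the "Peak memory use before GGC" and "Peak memory use after GGC" lines, and `max_released` to "Maximum of released memory in single GGC run".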
Your testing script.