This is the mail archive of the gcc-regression@gcc.gnu.org mailing list for the GCC project.



A recent patch increased GCC's memory consumption!


Hi,

I am a friendly script caring about memory consumption in GCC.  Please
contact jh@suse.cz if something is going wrong.

Comparing memory consumption on compilation of combine.i, insn-attrtab.i,
and generate-3.4.ii I got:


comparing empty function compilation at -O0 level:
    Overall memory needed: 18265k
    Peak memory use before GGC: 2233k
    Peak memory use after GGC: 1940k
    Maximum of released memory in single GGC run: 293k
    Garbage: 422k -> 422k
    Leak: 2271k -> 2271k
    Overhead: 446k -> 446k
    GGC runs: 3

comparing empty function compilation at -O0 -g level:
    Overall memory needed: 18281k
    Peak memory use before GGC: 2260k
    Peak memory use after GGC: 1967k
    Maximum of released memory in single GGC run: 293k
    Garbage: 425k -> 425k
    Leak: 2303k -> 2303k
    Overhead: 450k -> 450k
    GGC runs: 3

comparing empty function compilation at -O1 level:
    Overall memory needed: 18365k -> 18369k
    Peak memory use before GGC: 2233k
    Peak memory use after GGC: 1940k
    Maximum of released memory in single GGC run: 293k
    Garbage: 426k -> 426k
    Leak: 2273k -> 2273k
    Overhead: 446k -> 446k
    GGC runs: 4

comparing empty function compilation at -O2 level:
    Overall memory needed: 18377k -> 18381k
    Peak memory use before GGC: 2233k
    Peak memory use after GGC: 1940k
    Maximum of released memory in single GGC run: 293k
    Garbage: 430k -> 429k
    Leak: 2273k -> 2273k
    Overhead: 447k -> 447k
    GGC runs: 4

comparing empty function compilation at -O3 level:
    Overall memory needed: 18377k -> 18381k
    Peak memory use before GGC: 2233k
    Peak memory use after GGC: 1940k
    Maximum of released memory in single GGC run: 293k
    Garbage: 430k -> 429k
    Leak: 2273k -> 2273k
    Overhead: 447k -> 447k
    GGC runs: 4

comparing combine.c compilation at -O0 level:
    Overall memory needed: 28397k
    Peak memory use before GGC: 9268k -> 9267k
    Peak memory use after GGC: 8786k -> 8784k
    Maximum of released memory in single GGC run: 2643k
    Garbage: 37458k -> 37441k
    Leak: 6452k -> 6450k
    Overhead: 4868k -> 4862k
    GGC runs: 280

comparing combine.c compilation at -O0 -g level:
    Overall memory needed: 30485k -> 30477k
    Peak memory use before GGC: 10816k -> 10795k
    Peak memory use after GGC: 10446k -> 10425k
    Maximum of released memory in single GGC run: 2319k -> 2360k
    Garbage: 38030k -> 38023k
    Leak: 9328k -> 9260k
    Overhead: 5569k -> 5530k
    GGC runs: 272

comparing combine.c compilation at -O1 level:
    Overall memory needed: 39825k
    Peak memory use before GGC: 16872k -> 16870k
    Peak memory use after GGC: 16704k -> 16702k
    Maximum of released memory in single GGC run: 2259k
    Garbage: 58321k -> 58326k
    Leak: 6473k -> 6473k
    Overhead: 6296k -> 6290k
    GGC runs: 359 -> 358

comparing combine.c compilation at -O2 level:
  Amount of produced GGC garbage increased from 79525k to 79634k, overall 0.14%
  Amount of memory still referenced at the end of compilation increased from 6590k to 6598k, overall 0.12%
    Overall memory needed: 29386k
    Peak memory use before GGC: 16874k -> 16872k
    Peak memory use after GGC: 16704k -> 16702k
    Maximum of released memory in single GGC run: 3773k -> 4308k
    Garbage: 79525k -> 79634k
    Leak: 6590k -> 6598k
    Overhead: 8629k -> 8621k
    GGC runs: 416

comparing combine.c compilation at -O3 level:
  Peak amount of GGC memory allocated before garbage collecting increased from 18390k to 18754k, overall 1.98%
  Peak amount of GGC memory still allocated after garbage collecting increased from 17806k to 17982k, overall 0.99%
    Overall memory needed: 29510k
    Peak memory use before GGC: 18390k -> 18754k
    Peak memory use after GGC: 17806k -> 17982k
    Maximum of released memory in single GGC run: 5533k -> 6490k
    Garbage: 116470k -> 116549k
    Leak: 6668k -> 6668k
    Overhead: 13259k -> 13281k
    GGC runs: 468 -> 466

comparing insn-attrtab.c compilation at -O0 level:
  Amount of memory still referenced at the end of compilation increased from 9274k to 9289k, overall 0.16%
    Overall memory needed: 88530k
    Peak memory use before GGC: 70079k -> 70078k
    Peak memory use after GGC: 43585k
    Maximum of released memory in single GGC run: 37868k
    Garbage: 131134k -> 131129k
    Leak: 9274k -> 9289k
    Overhead: 16943k -> 16940k
    GGC runs: 216

comparing insn-attrtab.c compilation at -O0 -g level:
    Overall memory needed: 89710k -> 89714k
    Peak memory use before GGC: 71240k -> 71228k
    Peak memory use after GGC: 44853k -> 44840k
    Maximum of released memory in single GGC run: 37869k
    Garbage: 132034k -> 132027k
    Leak: 11218k -> 11184k
    Overhead: 17339k -> 17318k
    GGC runs: 213

comparing insn-attrtab.c compilation at -O1 level:
    Overall memory needed: 110262k -> 110266k
    Peak memory use before GGC: 85471k -> 85470k
    Peak memory use after GGC: 79438k -> 79437k
    Maximum of released memory in single GGC run: 31671k -> 31672k
    Garbage: 271416k -> 271413k
    Leak: 9336k -> 9336k
    Overhead: 28708k -> 28721k
    GGC runs: 223

comparing insn-attrtab.c compilation at -O2 level:
    Overall memory needed: 120766k -> 117834k
    Peak memory use before GGC: 87633k -> 87632k
    Peak memory use after GGC: 80352k
    Maximum of released memory in single GGC run: 30216k
    Garbage: 310286k -> 310281k
    Leak: 9339k -> 9339k
    Overhead: 35187k -> 35201k
    GGC runs: 246

comparing insn-attrtab.c compilation at -O3 level:
    Overall memory needed: 130406k -> 132294k
    Peak memory use before GGC: 87658k
    Peak memory use after GGC: 80378k -> 80377k
    Maximum of released memory in single GGC run: 30409k
    Garbage: 311122k -> 311122k
    Leak: 9342k -> 9342k
    Overhead: 35413k -> 35427k
    GGC runs: 250

comparing Gerald's testcase PR8361 compilation at -O0 level:
    Overall memory needed: 118582k
    Peak memory use before GGC: 91843k -> 91699k
    Peak memory use after GGC: 90934k -> 90791k
    Maximum of released memory in single GGC run: 19241k -> 19168k
    Garbage: 209815k -> 209462k
    Leak: 47804k -> 47751k
    Overhead: 21134k -> 20904k
    GGC runs: 414 -> 413

comparing Gerald's testcase PR8361 compilation at -O0 -g level:
  Peak amount of GGC memory allocated before garbage collecting increased from 104049k to 104519k, overall 0.45%
  Peak amount of GGC memory still allocated after garbage collecting increased from 103007k to 103480k, overall 0.46%
    Overall memory needed: 130990k -> 131790k
    Peak memory use before GGC: 104049k -> 104519k
    Peak memory use after GGC: 103007k -> 103480k
    Maximum of released memory in single GGC run: 18732k -> 19647k
    Garbage: 216401k -> 216047k
    Leak: 71322k -> 70710k
    Overhead: 27038k -> 26521k
    GGC runs: 385

comparing Gerald's testcase PR8361 compilation at -O1 level:
    Overall memory needed: 119610k -> 119578k
    Peak memory use before GGC: 96238k -> 96182k
    Peak memory use after GGC: 93982k -> 93928k
    Maximum of released memory in single GGC run: 18069k -> 17997k
    Garbage: 447100k -> 447346k
    Leak: 49492k -> 49435k
    Overhead: 32503k -> 32335k
    GGC runs: 565

comparing Gerald's testcase PR8361 compilation at -O2 level:
  Amount of produced GGC garbage increased from 564492k to 565496k, overall 0.18%
    Overall memory needed: 119630k -> 119602k
    Peak memory use before GGC: 96264k -> 96209k
    Peak memory use after GGC: 94009k -> 93956k
    Maximum of released memory in single GGC run: 18069k -> 17997k
    Garbage: 564492k -> 565496k
    Leak: 50412k -> 50350k
    Overhead: 42169k -> 42152k
    GGC runs: 623 -> 626

comparing Gerald's testcase PR8361 compilation at -O3 level:
  Amount of produced GGC garbage increased from 585812k to 587055k, overall 0.21%
    Overall memory needed: 121402k -> 121418k
    Peak memory use before GGC: 97232k -> 97180k
    Peak memory use after GGC: 95640k -> 95588k
    Maximum of released memory in single GGC run: 18476k -> 18403k
    Garbage: 585812k -> 587055k
    Leak: 50365k -> 50305k
    Overhead: 43030k -> 42978k
    GGC runs: 634 -> 635

comparing PR rtl-optimization/28071 testcase compilation at -O0 level:
    Overall memory needed: 137618k
    Peak memory use before GGC: 81563k
    Peak memory use after GGC: 58443k
    Maximum of released memory in single GGC run: 45145k
    Garbage: 148487k -> 148485k
    Leak: 7540k -> 7540k
    Overhead: 25306k -> 25305k
    GGC runs: 82

comparing PR rtl-optimization/28071 testcase compilation at -O0 -g level:
    Overall memory needed: 137990k
    Peak memory use before GGC: 82209k -> 82197k
    Peak memory use after GGC: 59089k -> 59077k
    Maximum of released memory in single GGC run: 45210k
    Garbage: 148690k -> 148688k
    Leak: 9307k -> 9247k
    Overhead: 25801k -> 25770k
    GGC runs: 88

comparing PR rtl-optimization/28071 testcase compilation at -O1 level:
    Overall memory needed: 408334k -> 408354k
    Peak memory use before GGC: 194153k -> 194178k
    Peak memory use after GGC: 187972k -> 187997k
    Maximum of released memory in single GGC run: 94097k -> 94123k
    Garbage: 283876k -> 283657k
    Leak: 29776k -> 29776k
    Overhead: 29695k -> 29619k
    GGC runs: 98

comparing PR rtl-optimization/28071 testcase compilation at -O2 level:
    Overall memory needed: 329958k -> 330166k
    Peak memory use before GGC: 194146k -> 194171k
    Peak memory use after GGC: 187965k -> 187990k
    Maximum of released memory in single GGC run: 96113k -> 96138k
    Garbage: 364116k -> 363897k
    Leak: 30359k -> 30359k
    Overhead: 45444k -> 45368k
    GGC runs: 105

comparing PR rtl-optimization/28071 testcase compilation at -O3 -fno-tree-pre -fno-tree-fre level:
  Peak amount of GGC memory allocated before garbage collecting increased from 308402k to 310802k, overall 0.78%
  Peak amount of GGC memory still allocated after garbage collecting increased from 286746k to 289146k, overall 0.84%
    Overall memory needed: 777726k -> 778354k
    Peak memory use before GGC: 308402k -> 310802k
    Peak memory use after GGC: 286746k -> 289146k
    Maximum of released memory in single GGC run: 162843k -> 166516k
    Garbage: 501929k -> 501676k
    Leak: 45411k -> 45411k
    Overhead: 57095k -> 57008k
    GGC runs: 98
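
The "overall N%" figures in the summary lines above are consistent with a
plain relative increase, (new - old) / old.  A minimal sketch of that
arithmetic, assuming the "Name: OLDk -> NEWk" line layout used in this
report; the names METRIC_RE, parse_metric and percent_change are
hypothetical and not taken from the actual testing script:

```python
import re

# Matches report lines such as "    Garbage: 79525k -> 79634k".
# Lines with a single value or without the "k" suffix do not match.
METRIC_RE = re.compile(
    r"^\s*(?P<name>[^:]+):\s*(?P<old>\d+)k\s*->\s*(?P<new>\d+)k\s*$"
)


def parse_metric(line):
    """Return (name, old_kb, new_kb), or None if the line has no old -> new pair."""
    match = METRIC_RE.match(line)
    if match is None:
        return None
    return match.group("name").strip(), int(match.group("old")), int(match.group("new"))


def percent_change(old_kb, new_kb):
    """Relative change from old_kb to new_kb, in percent."""
    return (new_kb - old_kb) * 100.0 / old_kb


if __name__ == "__main__":
    name, old_kb, new_kb = parse_metric("    Garbage: 79525k -> 79634k")
    print(f"{name}: {percent_change(old_kb, new_kb):.2f}%")
```

Which metrics the script checks, and the threshold at which it reports an
increase, are not stated in this message, so the sketch leaves both out.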

Head of the ChangeLog is:

--- /usr/src/SpecTests/sandbox-britten-memory/x86_64/mem-result/ChangeLog	2006-12-11 03:44:43.000000000 +0000
+++ /usr/src/SpecTests/sandbox-britten-memory/gcc/gcc/ChangeLog	2006-12-11 19:56:44.000000000 +0000
@@ -1,3 +1,148 @@
+2006-12-11  Andreas Schwab  <schwab@suse.de>
+
+	* varasm.c (elf_record_gcc_switches): Cast second argument of
+	ASM_OUTPUT_SKIP to unsigned HOST_WIDE_INT.
+
+2006-12-11  Diego Novillo  <dnovillo@redhat.com>
+
+	* tree-scalar-evolution.c (scev_const_prop):
+	* tree-phinodes.c (remove_phi_node): Add argument
+	RELEASE_LHS_P.  If given, release the SSA name on the LHS of
+	the PHI node.
+	Update all users.
+	* tree-ssa-dce.c: Remove forward declarations for static
+	functions.  Re-arrange functions bodies as needed.
+	(find_obviously_necessary_stmts): Never mark PHI nodes as
+	obviously necessary.
+
+2006-12-11  Carlos O'Donell  <carlos@codesourcery.com>
+
+	* config/arm/elf.h (MAX_OFILE_ALIGNMENT): Remove definition.
+
+2006-12-11  Jan Hubicka  <jh@suse.cz>
+
+	* value-prof.c (tree_stringops_transform): New.
+	(tree_value_profile_transformations): Require count to be non-zero;
+	call stringop transform; reset stmt BSI after BB changed.
+	(tree_divmod_fixed_value, tree_mod_pow2): Don't emit unnecesary label.
+	(interesting_stringop_to_profile_p, tree_stringop_fixed_value): New.
+	(tree_stringops_values_to_profile): New.
+	(tree_values_to_profile): Call tree_stringops_values_to_profile.
+	* tree.h (build_string_literal): Tidy prototype.
+	(validate_arglist, builtin_memset_read_str, get_pointer_alignment):
+	Declare.
+	* builtins.c (validate_arglist, builtin_memset_read_str,
+	get_pointer_alignment): Export.
+
+2006-12-11  Uros Bizjak  <ubizjak@gmail.com>
+
+	PR target/30120
+	Revert:
+	2006-11-15  Uros Bizjak  <ubizjak@gmail.com>
+
+	* config/i386/i386.opt: New target option -mx87regparm.
+
+	* config/i386/i386.h (struct ix86_args): Add x87_nregs, x87_regno,
+	float_in_x87: Add new variables. mmx_words, sse_words: Remove.
+	(X87_REGPARM_MAX): Define.
+
+	* config/i386/i386.c (override_options): Error out for
+	-mx87regparm but no 80387 support.
+	(ix86_attribute_table): Add x87regparm.
+	(ix86_handle_cconv_attribute): Update comments for x87regparm.
+	(ix86_comp_type_attributes): Check for mismatched x87regparm types.
+	(ix86_function_x87regparm): New function.
+	(ix86_function_arg_regno_p): Add X87_REGPARM_MAX 80387 floating
+	point registers.
+	(init_cumulative_args): Initialize x87_nregs and float_in_x87
+	variables.
+	(function_arg_advance): Process x87_nregs and x87_regno when
+	floating point argument is to be passed in 80387 register.
+	(function_arg): Pass XFmode arguments in 80387 registers for local
+	functions.  Pass SFmode and DFmode arguments to local functions
+	in 80387 registers when flag_unsafe_math_optimizations is set.
+
+	* reg-stack.c (convert_regs_entry): Disable NaN load for
+	stack registers that are used for argument passing.
+
+	* doc/extend.texi: Document x87regparm function attribute.
+	* doc/invoke.texi: Document -mx87regparm.
+
+2006-12-11  Jan Hubicka  <jh@suse.cz>
+
+	Move all varpool routines out of cgraph/cgraphunit to varpool.c
+	* cgraph.c: Update comments.
+	(cgraph_varpool_hash,
+	cgraph_varpool_nodes, cgraph_varpool_last_needed_node
+	cgraph_varpool_node_name, cgraph_varpool_node,
+	cgraph_varpol_mode_for_asm, cgraph_varpool_mark_needed_node,
+	cgraph_variable_initializer_availability): Move to
+	varpool.c and drop cgraph_ prefixes.
+	(cgraph_varpool_enqueue_needed_node, cgraph_varpool_reset_queue,
+	cgraph_varpool_first_unanalyzed_node, cgraph_varpool_finalize_decl):
+	move to varpool.c; drop cgraph_ prefix; make static.
+	(dump_cgraph_varpool_node): Move to varpool.c under name
+	dump_varpool_node.
+	(dump_varpool, hash_varpool_node, eq_varpool_node,
+	decide_is_variable_needed): Move to varpool.c
+	(decl_assembler_name_equal): Move to tree.c.
+	(availability_names): Rename to ...
+	(cgraph_availability_names): ... this one.
+	(dump_cgraph_node): Update.
+	* cgraph.h: Reorder declarations now in varpool.c
+	(cgraph_vailablity_names): Declare.
+	(struct cgraph_varpool_node): Rename to ...
+	(struct varpool_node): ... this one.
+	(cgraph_varpool_first_unanalyzed_node, cgraph_varpool_nodes_queue,
+	cgraph_varpool_first_unanalyzed_node, cgraph_varpool_node,
+	cgraph_varpool_node_for_asm, cgraph_varpool_mark_needed_node,
+	cgraph_varpool_finalize_decl, cgraph_varpool_enqueue_needed_node,
+	cgraph_varpool_reset_queue, cgraph_varpool_assemble_pending_decls,
+	cgraph_variable_initializer_availability): Rename to ...
+	(varpool_first_unanalyzed_node, varpool_nodes_queue,
+	varpool_first_unanalyzed_node, varpool_node,
+	varpool_node_for_asm, varpool_mark_needed_node,
+	varpool_finalize_decl, varpool_enqueue_needed_node,
+	varpool_assemble_pending_decls, variable_initializer_availability):
+	Rename to ...
+	* tree.c (decl_assembler_name_equal): Move here from cgraph.c.
+	* tree.h (decl_assembler_name_equal): Declare.
+	* omp-low.c (lower_omp_critical): Update.
+	* ipa-reference (analyze_variable, static_execute): Likewise.
+	* toplev.c (wrapup_global_declaration_2, compile_file): Update.
+	* cgraphunit.c: Update comments.
+	(cgraph_varpool_assembled_nodes_queue): Move to varpool.c under name
+	varpool_assembled_nodes_queue.
+	(cgraph_varpool_analyze_pending_decls): Move to varpool.c under name
+	varpool_analyze_pending_decls.
+	(cgraph_varpool_remove_unreferenced_decls): Move to varpool.c under name
+	varpool_remove_unreferenced_decls.
+	(record_reference): Update.
+	(cgraph_create_edges): Update.
+	(record_referneces_in_initializer): New function.
+	(cgraph_varpool_assemble_decl): Move to varpool.c under name
+	varpool_assemble_decl; make global.
+	(cgraph_varpool_assemble_pending_decls): Move to varpool.c under name
+	varpool_assemble_pending_decls.
+	(process_function_and_variable_attributes, cgraph_finalize_compilation_unit,
+	struct cgraph_order_sort, cgraph_output_in_order,
+	cgraph_function_and_variable_invisibility, cgraph_optimize,
+	cgraph_increase_alignment): Update.
+	* dwarf2out.c (decls_for_scope): Likewise.
+	* ipa-type-escape.c (analyze_variable, type_escape_execute): Likewise.
+	* except.c (output_ttype): Likewise.
+	* varasm.c (mark_decl_referenced): Likewise.
+	(find_decl_and_mark_referenced, assemble_alias): update.
+	* Makefile.in: Add varpool.c, gt-varpool.c and remove gt-cgraphunit.c
+	* passes.c (rest_of_decl_compilation): Update.
+
+2006-12-11  Ira Rosen  <irar@il.ibm.com>
+
+	* tree-vect-patterns.c (vect_recog_dot_prod_pattern): Use 
+	GIMPLE_STMT_OPERAND.
+	* tree-vect-transform.c (vect_permute_store_chain): Likewise.
+	(vect_setup_realignment): Likewise.
+
 2006-12-11  Sa Liu  <saliu@de.ibm.com>
 	    Ben Elliston  <bje@au.ibm.com>
 
--- /usr/src/SpecTests/sandbox-britten-memory/x86_64/mem-result/ChangeLog.cp	2006-12-10 11:37:41.000000000 +0000
+++ /usr/src/SpecTests/sandbox-britten-memory/gcc/gcc/cp/ChangeLog	2006-12-11 19:56:43.000000000 +0000
@@ -1,3 +1,7 @@
+2006-12-11  Jan Hubicka  <jh@suse.cz>
+
+	* decl2.c (var_finalized_p): Update for renamed varpool functions.
+
 2006-12-09  Zack Weinberg  <zackw@panix.com>
 
 	* parser.c (yydebug, enum pragma_omp_clause): Delete.


The results can be reproduced by building a compiler with

--enable-gather-detailed-mem-stats targeting x86-64

and compiling preprocessed combine.c or the testcase from PR8361 with:

-fmem-report --param=ggc-min-heapsize=1024 --param=ggc-min-expand=1 -Ox -Q
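
The "-Ox" above stands for whichever optimization level is being compared.
As a rough sketch, the argument list for one run could be assembled as
follows; the helper name mem_report_args is hypothetical, and the compiler
binary and input file are left to the caller:

```python
def mem_report_args(opt_flags):
    """Build the flag list quoted above, with "-Ox" replaced by opt_flags.

    opt_flags is a list such as ["-O2"] or ["-O0", "-g"].
    """
    return [
        "-fmem-report",
        "--param=ggc-min-heapsize=1024",
        "--param=ggc-min-expand=1",
        *opt_flags,
        "-Q",
    ]


if __name__ == "__main__":
    # Print one flag line per level compared in this report.
    for flags in (["-O0"], ["-O0", "-g"], ["-O1"], ["-O2"], ["-O3"]):
        print(" ".join(mem_report_args(flags)))
```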

The memory consumption summary appears in the dump after the detailed listing
of the places where memory is allocated.  Peak memory consumption is computed
by looking for the maximal value in the {GC XXXX -> YYYY} reports.
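
The peak computation described above can be sketched as below.  This is an
illustration only: it assumes each report looks like "{GC 9268k -> 8786k}",
with the value before collection on the left and the value after collection
on the right, and the names GC_RE and peak_memory are hypothetical.

```python
import re

# Assumed report shape: "{GC <before>k -> <after>k}"; the "k" suffix is optional here.
GC_RE = re.compile(r"\{GC (\d+)k? -> (\d+)k?\}")


def peak_memory(dump_text):
    """Return (peak_before_gc, peak_after_gc) over all GC reports in dump_text.

    Both values are 0 when the text contains no GC report.
    """
    peak_before = 0
    peak_after = 0
    for match in GC_RE.finditer(dump_text):
        peak_before = max(peak_before, int(match.group(1)))
        peak_after = max(peak_after, int(match.group(2)))
    return peak_before, peak_after


if __name__ == "__main__":
    import sys

    # Read a compiler dump from standard input and print both peaks.
    before, after = peak_memory(sys.stdin.read())
    print(f"Peak memory use before GGC: {before}k")
    print(f"Peak memory use after GGC: {after}k")
```

The two maxima are tracked independently, so they need not come from the
same collection run.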

Your testing script.

