This is the mail archive of the gcc-regression@gcc.gnu.org mailing list for the GCC project.



Re: GCC memory consumption increased by recent patch!


Jan Hubicka wrote:

Kenneth,
this increase in memory usage seems to be yours (at least I've checked
that my patch enabling scev induction variable analysis doesn't cause any
increase in ggc memory consumed).
While the data structures used in your patch look very sane, perhaps
there is some room for improvement (at least the 10% in combine.c at -O3
looks relatively serious).


I am quite surprised at this number. Combine does have a fair number of static variables, so the bitmaps are nontrivial, but the file does not have a huge number of functions (it does have some very large functions, but that is not relevant). It is hard to imagine what is being done at the cgraph level at -O3 that would cause this kind of behavior.

Is it possible that I am adding bit vectors to unreachable cgraph nodes that are the result of more aggressive inlining?

Also, would you mind if I moved your code into a separate file out of
cgraphunit for the next development period (so on the tree-profiling
branch)?  If you have some additional changes, I can do this later, just
when it is convenient.



As far as moving it to another file, that is fine. I put it in cgraphunit because that was the only place where, from a temporal point of view, I had access to all of the functions before they were compiled and could attach the bit vectors to the cgraph nodes.

I do have another set of changes that I will get done soon, in which I will add some more external calls. After that, I was planning to make this work in a branch that contains Stuart Hastings' restructuring code and take advantage of that restructuring. Once it is properly restructured, the space should drop by a factor of 2, since we will no longer need two sets of bit vectors, one indexed by var ann uid and one indexed by the decl uid.
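To make the idea of per-function bit vectors concrete, here is a minimal sketch of how "statics read" and "statics written" sets attached to call graph nodes could be folded from callees into callers. It is not the code from cgraphunit.c: the call graph, the function names, the bit assignment, and the simple fixed-point loop are all assumptions made for illustration (the ChangeLog suggests the real patch walks the graph in a reduced order instead).

```python
# Hypothetical call graph (caller -> callees); all names are made up.
calls = {"main": ["parse", "emit"], "parse": ["lookup"], "emit": [], "lookup": []}

# Bit i stands for static variable i; this index assignment is an assumption.
local_read = {"main": 0b000, "parse": 0b001, "emit": 0b100, "lookup": 0b010}
local_written = {"main": 0b000, "parse": 0b000, "emit": 0b100, "lookup": 0b000}


def propagate(calls, local_bits):
    """Fold each callee's bit vector into its callers until nothing changes."""
    bits = dict(local_bits)
    changed = True
    while changed:
        changed = False
        for caller, callees in calls.items():
            merged = bits[caller]
            for callee in callees:
                merged |= bits[callee]
            if merged != bits[caller]:
                bits[caller] = merged
                changed = True
    return bits


statics_read = propagate(calls, local_read)
statics_written = propagate(calls, local_written)

for name in calls:
    print(f"{name}: read={statics_read[name]:03b} written={statics_written[name]:03b}")
```

In this toy graph a call to parse writes no static variable at all, which is the kind of fact that would let the operand scanner skip clobbers for that call. Keeping two such maps per node, each in two index spaces, is where the factor of 2 mentioned above would come from.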

Kenny

Honza


Hi,
Comparing memory consumption on compilation of combine.i and generate-3.4.ii I got:


comparing combine.c compilation at -O0 level:
   Overall memory needed: 17820k
   Peak memory use before GGC: 9294k
   Peak memory use after GGC: 8606k
   Maximum of released memory in single GGC run: 2867k
   Garbage: 42475k -> 42487k
   Leak: 6107k -> 6087k
   Overhead: 5590k -> 5586k
   GGC runs: 363

comparing combine.c compilation at -O1 level:
 Overall memory allocated via mmap and sbrk increased from 18540k to 18652k, overall 0.60%
 Peak amount of GGC memory allocated before garbage collecting increased from 9573k to 9710k, overall 1.43%
 Peak amount of GGC memory still allocated after garbage collecting increased from 8663k to 8800k, overall 1.58%
 Amount of produced GGC garbage increased from 78747k to 79460k, overall 0.90%
 Amount of memory still referenced at the end of compilation increased from 6483k to 6669k, overall 2.87%
   Overall memory needed: 18540k -> 18652k
   Peak memory use before GGC: 9573k -> 9710k
   Peak memory use after GGC: 8663k -> 8800k
   Maximum of released memory in single GGC run: 2067k -> 2073k
   Garbage: 78747k -> 79460k
   Leak: 6483k -> 6669k
   Overhead: 13868k -> 14291k
   GGC runs: 589 -> 591

comparing combine.c compilation at -O2 level:
 Peak amount of GGC memory allocated before garbage collecting increased from 12756k to 12769k, overall 0.10%
 Amount of produced GGC garbage increased from 94713k to 95369k, overall 0.69%
 Amount of memory still referenced at the end of compilation increased from 6304k to 6424k, overall 1.89%
   Overall memory needed: 22076k -> 21988k
   Peak memory use before GGC: 12756k -> 12769k
   Peak memory use after GGC: 12610k
   Maximum of released memory in single GGC run: 2576k -> 2577k
   Garbage: 94713k -> 95369k
   Leak: 6304k -> 6424k
   Overhead: 18777k -> 19165k
   GGC runs: 580 -> 582

comparing combine.c compilation at -O3 level:
 Overall memory allocated via mmap and sbrk increased from 23972k to 26384k, overall 10.06%
 Peak amount of GGC memory allocated before garbage collecting increased from 13246k to 13426k, overall 1.36%
 Peak amount of GGC memory still allocated after garbage collecting increased from 12610k to 12742k, overall 1.05%
 Amount of produced GGC garbage increased from 126026k to 127206k, overall 0.94%
 Amount of memory still referenced at the end of compilation increased from 6852k to 6991k, overall 2.03%
   Overall memory needed: 23972k -> 26384k
   Peak memory use before GGC: 13246k -> 13426k
   Peak memory use after GGC: 12610k -> 12742k
   Maximum of released memory in single GGC run: 3483k
   Garbage: 126026k -> 127206k
   Leak: 6852k -> 6991k
   Overhead: 24652k -> 25258k
   GGC runs: 646 -> 650

comparing insn-attrtab.c compilation at -O0 level:
   Overall memory needed: 132860k
   Peak memory use before GGC: 76388k
   Peak memory use after GGC: 45185k
   Maximum of released memory in single GGC run: 41417k
   Garbage: 157790k -> 157803k
   Leak: 10620k -> 10618k
   Overhead: 19800k -> 19798k
   GGC runs: 310

comparing insn-attrtab.c compilation at -O1 level:
 Peak amount of GGC memory allocated before garbage collecting increased from 94307k to 94612k, overall 0.32%
 Peak amount of GGC memory still allocated after garbage collecting increased from 71401k to 71706k, overall 0.43%
 Amount of memory still referenced at the end of compilation increased from 10968k to 11052k, overall 0.77%
   Overall memory needed: 151600k -> 150756k
   Peak memory use before GGC: 94307k -> 94612k
   Peak memory use after GGC: 71401k -> 71706k
   Maximum of released memory in single GGC run: 40513k
   Garbage: 474239k -> 474241k
   Leak: 10968k -> 11052k
   Overhead: 84931k -> 85779k
   GGC runs: 461 -> 462

comparing insn-attrtab.c compilation at -O2 level:
 Overall memory allocated via mmap and sbrk increased from 237408k to 241428k, overall 1.69%
 Peak amount of GGC memory allocated before garbage collecting increased from 109880k to 110181k, overall 0.27%
 Peak amount of GGC memory still allocated after garbage collecting increased from 86974k to 87274k, overall 0.34%
 Amount of memory still referenced at the end of compilation increased from 11150k to 11207k, overall 0.51%
   Overall memory needed: 237408k -> 241428k
   Peak memory use before GGC: 109880k -> 110181k
   Peak memory use after GGC: 86974k -> 87274k
   Maximum of released memory in single GGC run: 35488k -> 35489k
   Garbage: 525180k -> 525164k
   Leak: 11150k -> 11207k
   Overhead: 95096k -> 95934k
   GGC runs: 383

comparing insn-attrtab.c compilation at -O3 level:
 Overall memory allocated via mmap and sbrk increased from 237384k to 241444k, overall 1.71%
 Peak amount of GGC memory allocated before garbage collecting increased from 109882k to 110181k, overall 0.27%
 Peak amount of GGC memory still allocated after garbage collecting increased from 86975k to 87275k, overall 0.34%
 Amount of memory still referenced at the end of compilation increased from 11223k to 11271k, overall 0.43%
   Overall memory needed: 237384k -> 241444k
   Peak memory use before GGC: 109882k -> 110181k
   Peak memory use after GGC: 86975k -> 87275k
   Maximum of released memory in single GGC run: 35488k
   Garbage: 527525k -> 527533k
   Leak: 11223k -> 11271k
   Overhead: 95873k -> 96717k
   GGC runs: 392

comparing Gerald's testcase PR8361 compilation at -O0 level:
 Amount of memory still referenced at the end of compilation increased from 58324k to 59855k, overall 2.63%
   Overall memory needed: 114844k
   Peak memory use before GGC: 92008k -> 92009k
   Peak memory use after GGC: 90475k -> 90476k
   Maximum of released memory in single GGC run: 20896k -> 20897k
   Garbage: 270774k -> 271018k
   Leak: 58324k -> 59855k
   Overhead: 34949k -> 35160k
   GGC runs: 552 -> 551

comparing Gerald's testcase PR8361 compilation at -O1 level:
 Overall memory allocated via mmap and sbrk increased from 120248k to 126228k, overall 4.97%
 Peak amount of GGC memory allocated before garbage collecting increased from 96217k to 96401k, overall 0.19%
 Amount of produced GGC garbage increased from 671500k to 689097k, overall 2.62%
 Amount of memory still referenced at the end of compilation increased from 60665k to 62390k, overall 2.84%
   Overall memory needed: 120248k -> 126228k
   Peak memory use before GGC: 96217k -> 96401k
   Peak memory use after GGC: 89741k
   Maximum of released memory in single GGC run: 20047k -> 20069k
   Garbage: 671500k -> 689097k
   Leak: 60665k -> 62390k
   Overhead: 145200k -> 151336k
   GGC runs: 835 -> 820

comparing Gerald's testcase PR8361 compilation at -O2 level:
 Overall memory allocated via mmap and sbrk increased from 121648k to 127504k, overall 4.81%
 Peak amount of GGC memory allocated before garbage collecting increased from 96218k to 96401k, overall 0.19%
 Amount of produced GGC garbage increased from 732787k to 750297k, overall 2.39%
 Amount of memory still referenced at the end of compilation increased from 61238k to 62963k, overall 2.82%
   Overall memory needed: 121648k -> 127504k
   Peak memory use before GGC: 96218k -> 96401k
   Peak memory use after GGC: 89741k
   Maximum of released memory in single GGC run: 20048k -> 20069k
   Garbage: 732787k -> 750297k
   Leak: 61238k -> 62963k
   Overhead: 171765k -> 177962k
   GGC runs: 866 -> 848

comparing Gerald's testcase PR8361 compilation at -O3 level:
 Overall memory allocated via mmap and sbrk increased from 120456k to 126044k, overall 4.64%
 Peak amount of GGC memory allocated before garbage collecting increased from 92007k to 92781k, overall 0.84%
 Peak amount of GGC memory still allocated after garbage collecting increased from 90540k to 90698k, overall 0.17%
 Amount of produced GGC garbage increased from 770315k to 790042k, overall 2.56%
 Amount of memory still referenced at the end of compilation increased from 61606k to 63353k, overall 2.84%
   Overall memory needed: 120456k -> 126044k
   Peak memory use before GGC: 92007k -> 92781k
   Peak memory use after GGC: 90540k -> 90698k
   Maximum of released memory in single GGC run: 20814k
   Garbage: 770315k -> 790042k
   Leak: 61606k -> 63353k
   Overhead: 183430k -> 190014k
   GGC runs: 853 -> 836
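
The "overall N%" figures in the reports above appear to be the relative growth of the second number over the first. A minimal sketch of that arithmetic, using the combine.c -O3 numbers (the function name is hypothetical, and the formula is inferred from the reported values rather than taken from the script itself):

```python
def percent_increase(before_k: int, after_k: int) -> float:
    """Relative growth from before_k to after_k, in percent."""
    return (after_k - before_k) / before_k * 100.0


# Figures copied from the combine.c -O3 report.
overall = percent_increase(23972, 26384)  # memory via mmap and sbrk
leak = percent_increase(6852, 6991)       # memory still referenced at the end

print(f"overall: {overall:.2f}%")
print(f"leak: {leak:.2f}%")
```

Both values round to the percentages the report prints for those two lines (10.06% and 2.03%).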

Head of changelog is:

--- /usr/src/SpecTests/sandbox-britten-memory/x86_64/mem-result/ChangeLog	2004-09-13 21:44:05.000000000 +0000
+++ /usr/src/SpecTests/sandbox-britten-memory/gcc/gcc/ChangeLog	2004-09-14 03:38:36.000000000 +0000
@@ -1,3 +1,69 @@
+2004-09-14  Jan Hubicka  <jh@suse.cz>
+
+	* Makefile.in (predict.o): Depend on tree-scalar-evolution.h
+	* predict.c: Include tree-scalar-evolution.h and cfgloop.h
+	(predict_loops): Use number_of_iterations_exit to predict
+	number of iterations on trees.
+
+2004-09-13  Dale Johannesen  <dalej@apple.com>
+
+	PR 17408
+	PR 17409
+	* c-decl.c (start_decl): Repair TREE_STATIC for initialized
+	objects declared extern.
+
+2004-09-14  Paul Brook  <paul@codesourcery.com>
+
+	* config/arm/arm.c (arm_expand_prologue): Make args_to_push a
+	HOST_WIDE_INT.
+
+2004-09-13  Daniel Jacobowitz  <dan@debian.org>
+
+	* fold-const.c (fold_checksum_tree): Ignore TYPE_CACHED_VALUES.
+	Only use TYPE_BINFO for aggregates.
+
+2004-09-13  Daniel Jacobowitz  <dan@debian.org>
+
+	* expmed.c (synth_mult): Initialize latency.  Check cost before
+	checking ops count.
+
+2004-09-13  Kenneth Zadeck  <Kenneth.Zadeck@NaturalBridge.com>
+
+
+	* tree-ssa-operands.c (get_call_expr_operands): Added parm to
+	add_call_clobber_ops and add_call_read_ops.
+	(add_call_clobber_ops, add_call_read_ops): Added code to reduce
+	the number of vdefs and vuses inserted based on analysis of global
+	variables across calls.  * tree-dfa.c (find_referenced_vars):
+	Needed to reset static var maps before each function is compiled.
+	* cgraphunit.c:
+	(static_vars_to_consider_by_tree,static_vars_to_consider_by_uid,
+	static_vars_info,functions_to_static_vars_info,module_statics_escape,
+	all_module_statics,searchc_env,dfs_info): New fields to support
+	analysis of static global variables.
+	(print_order, convert_UIDs_in_bitmap, new_static_vars_info,
+	cgraph_reset_static_var_maps, get_global_static_vars_info,
+	get_global_statics_not_read, get_global_statics_not_written,
+	searchc, cgraph_reduced_inorder, has_proper_scope_for_analysis,
+	check_rhs_var, check_lhs_var, get_asm_expr_operands,
+	process_call_for_static_vars, scan_for_static_refs,
+	cgraph_characterize_statics_local, cgraph_get_static_name_by_uid,
+	clear_static_vars_maps, cgraph_propagate_bits,
+	cgraph_characterize_statics): New. Functions to support analysis
+	of static global variables.
+	(cgraph_mark_local_and_external_functions): Renamed from:
+	(cgraph_mark_local_functions)
+	(cgraph_expand_all_functions): Remove call to
+	cgraph_mark_local_and_external_functions.
+	(cgraph_optimize): Added driver to analyze static variables whose
+	scope is within the compilation unit.  * cgraph.h (struct
+	cgraph_local_info, GTY): Added statics_read, statics_written,
+	local, calls_read_all, calls_write_all, for_functions_valid.
+	(struct cgraph_node): Added next_cycle.  * cgraph.c
+	(dump_cgraph_node): Added print routines for new fields.  *
+	makefile.in: macroized cgraph.h, added cgraphunit.c to the ggc
+	list.
+
2004-09-13  Joseph S. Myers  <jsm@polyomino.org.uk>

	* c-decl.c (grokdeclarator): Correct comments about where storage
--- /usr/src/SpecTests/sandbox-britten-memory/x86_64/mem-result/ChangeLog.cp	2004-09-12 21:43:28.000000000 +0000
+++ /usr/src/SpecTests/sandbox-britten-memory/gcc/gcc/cp/ChangeLog	2004-09-14 03:38:37.000000000 +0000
@@ -1,3 +1,13 @@
+2004-09-13  Mark Mitchell  <mark@codesourcery.com>
+
+	PR c++/16716
+	* parser.c (cp_parser_parse_and_diagnose_invalid_type_name):
+	Robustify.
+
+	PR c++/17327
+	* pt.c (unify): Add ENUMERAL_TYPE case.  Replace sorry with
+	gcc_unreacable.
+
2004-09-12  Richard Henderson  <rth@redhat.com>

PR c++/16254

I am a friendly script caring about memory consumption in GCC.  Please contact
jh@suse.cz if something is going wrong.

The results can be reproduced by building the compiler with
--enable-gather-detailed-mem-stats targeting x86-64 and compiling preprocessed
combine.c or testcase from PR8632 with:

-fmem-report --param=ggc-min-heapsize=1024 --param=ggc-min-expand=1 -Ox -Q

The memory consumption summary appears in the dump after the detailed listing of
the places where memory is allocated.  Peak memory consumption is actually computed
by looking for the maximal value in the {GC XXXX -> YYYY} reports.
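
The peak computation described above can be sketched as a scan over the dump for the {GC XXXX -> YYYY} entries. This is only an illustration, not the tester's actual code: the exact line shape (sizes with a trailing "k"), the function name, and the sample text are assumptions.

```python
import re

# Assumed shape of the per-collection entries, e.g. "{GC 5200k -> 4000k}".
GC_ENTRY = re.compile(r"\{GC (\d+)k -> (\d+)k\}")


def peak_memory(dump: str) -> tuple[int, int]:
    """Return (peak before GGC, peak after GGC) in kilobytes."""
    pairs = [(int(before), int(after)) for before, after in GC_ENTRY.findall(dump)]
    if not pairs:
        raise ValueError("no {GC ...} entries found")
    peak_before = max(before for before, _ in pairs)
    peak_after = max(after for _, after in pairs)
    return peak_before, peak_after


# Invented sample text, not taken from a real compiler run.
sample = " {GC 4100k -> 3900k} {GC 5200k -> 4000k} {GC 5000k -> 4800k}"
peak_before, peak_after = peak_memory(sample)
print(peak_before, peak_after)
```

Note that the two peaks need not come from the same collection, which is why each side is maximized separately.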

Yours, the testing script.



