This is the mail archive of the gcc-regression@gcc.gnu.org mailing list for the GCC project.



A recent patch increased GCC's memory consumption in some cases!


Hi,

I am a friendly script caring about memory consumption in GCC.  Please
contact jh@suse.cz if something is going wrong.

Comparing memory consumption on compilation of combine.i, insn-attrtab.i,
and generate-3.4.ii I got:


comparing empty function compilation at -O0 level:
    Overall memory needed: 8332k -> 8331k
    Peak memory use before GGC: 3324k
    Peak memory use after GGC: 2989k
    Maximum of released memory in single GGC run: 335k
    Garbage: 490k
    Leak: 3721k
    Overhead: 874k
    GGC runs: 3

comparing empty function compilation at -O0 -g level:
    Overall memory needed: 8348k -> 8347k
    Peak memory use before GGC: 3352k
    Peak memory use after GGC: 3016k
    Maximum of released memory in single GGC run: 336k
    Garbage: 492k
    Leak: 3753k
    Overhead: 878k
    GGC runs: 3

comparing empty function compilation at -O1 level:
    Overall memory needed: 8384k -> 8387k
    Peak memory use before GGC: 3324k
    Peak memory use after GGC: 2989k
    Maximum of released memory in single GGC run: 335k
    Garbage: 495k -> 495k
    Leak: 3723k -> 3723k
    Overhead: 874k -> 874k
    GGC runs: 3

comparing empty function compilation at -O2 level:
    Overall memory needed: 8396k -> 8395k
    Peak memory use before GGC: 3324k
    Peak memory use after GGC: 2989k
    Maximum of released memory in single GGC run: 335k
    Garbage: 498k -> 499k
    Leak: 3723k -> 3724k
    Overhead: 875k -> 875k
    GGC runs: 3

comparing empty function compilation at -O3 level:
    Overall memory needed: 8396k -> 8395k
    Peak memory use before GGC: 3324k
    Peak memory use after GGC: 2989k
    Maximum of released memory in single GGC run: 335k
    Garbage: 498k -> 499k
    Leak: 3723k -> 3724k
    Overhead: 875k -> 875k
    GGC runs: 3

comparing combine.c compilation at -O0 level:
    Overall memory needed: 23412k -> 23411k
    Peak memory use before GGC: 9970k
    Peak memory use after GGC: 9190k
    Maximum of released memory in single GGC run: 1907k
    Garbage: 38272k
    Leak: 8338k
    Overhead: 5338k
    GGC runs: 243

comparing combine.c compilation at -O0 -g level:
    Overall memory needed: 25296k -> 25299k
    Peak memory use before GGC: 11632k
    Peak memory use after GGC: 10976k
    Maximum of released memory in single GGC run: 1880k
    Garbage: 38622k
    Leak: 11231k
    Overhead: 6044k
    GGC runs: 241

comparing combine.c compilation at -O1 level:
  Amount of produced GGC garbage increased from 51750k to 52593k, overall 1.63%
    Overall memory needed: 37100k -> 37139k
    Peak memory use before GGC: 18609k -> 18616k
    Peak memory use after GGC: 18408k
    Maximum of released memory in single GGC run: 1379k -> 1374k
    Garbage: 51750k -> 52593k
    Leak: 8394k -> 8397k
    Overhead: 6453k -> 6439k
    GGC runs: 317 -> 321

comparing combine.c compilation at -O2 level:
  Peak amount of GGC memory allocated before garbage collecting increased from 18650k to 18673k, overall 0.12%
  Amount of produced GGC garbage increased from 68574k to 71731k, overall 4.60%
    Overall memory needed: 39020k -> 39155k
    Peak memory use before GGC: 18650k -> 18673k
    Peak memory use after GGC: 18464k
    Maximum of released memory in single GGC run: 1356k -> 1407k
    Garbage: 68574k -> 71731k
    Leak: 8516k -> 8513k
    Overhead: 8601k -> 8762k
    GGC runs: 375 -> 384

comparing combine.c compilation at -O3 level:
  Overall memory allocated via mmap and sbrk decreased from 44476k to 42555k, overall -4.51%
  Peak amount of GGC memory allocated before garbage collecting increased from 18803k to 18842k, overall 0.21%
  Amount of memory still referenced at the end of compilation increased from 8616k to 8635k, overall 0.22%
    Overall memory needed: 44476k -> 42555k
    Peak memory use before GGC: 18803k -> 18842k
    Peak memory use after GGC: 18600k -> 18601k
    Maximum of released memory in single GGC run: 3718k -> 2208k
    Garbage: 93873k -> 93581k
    Leak: 8616k -> 8635k
    Overhead: 11897k -> 11462k
    GGC runs: 405 -> 412

comparing insn-attrtab.c compilation at -O0 level:
    Overall memory needed: 143072k -> 143071k
    Peak memory use before GGC: 60777k
    Peak memory use after GGC: 33783k
    Maximum of released memory in single GGC run: 34624k
    Garbage: 132231k
    Leak: 10943k
    Overhead: 14736k
    GGC runs: 186

comparing insn-attrtab.c compilation at -O0 -g level:
    Overall memory needed: 144336k -> 144339k
    Peak memory use before GGC: 61939k
    Peak memory use after GGC: 34944k
    Maximum of released memory in single GGC run: 34625k
    Garbage: 132464k
    Leak: 12651k
    Overhead: 15133k
    GGC runs: 192

comparing insn-attrtab.c compilation at -O1 level:
  Amount of produced GGC garbage increased from 215940k to 217882k, overall 0.90%
    Overall memory needed: 153504k -> 153495k
    Peak memory use before GGC: 59003k -> 59031k
    Peak memory use after GGC: 54782k -> 54810k
    Maximum of released memory in single GGC run: 23623k -> 23586k
    Garbage: 215940k -> 217882k
    Leak: 11069k -> 11074k
    Overhead: 25110k -> 25127k
    GGC runs: 217

comparing insn-attrtab.c compilation at -O2 level:
  Amount of produced GGC garbage increased from 248912k to 253766k, overall 1.95%
    Overall memory needed: 191892k -> 193799k
    Peak memory use before GGC: 58886k -> 58917k
    Peak memory use after GGC: 54871k -> 54870k
    Maximum of released memory in single GGC run: 21313k -> 21338k
    Garbage: 248912k -> 253766k
    Leak: 11058k -> 11068k
    Overhead: 30537k -> 30775k
    GGC runs: 241 -> 243

comparing insn-attrtab.c compilation at -O3 level:
  Peak amount of GGC memory allocated before garbage collecting increased from 70105k to 71317k, overall 1.73%
  Amount of produced GGC garbage increased from 277963k to 282794k, overall 1.74%
    Overall memory needed: 197980k -> 197931k
    Peak memory use before GGC: 70105k -> 71317k
    Peak memory use after GGC: 65674k -> 65704k
    Maximum of released memory in single GGC run: 22922k -> 22949k
    Garbage: 277963k -> 282794k
    Leak: 11069k -> 11078k
    Overhead: 32256k -> 32490k
    GGC runs: 243 -> 245

comparing Gerald's testcase PR8361 compilation at -O0 level:
    Overall memory needed: 154308k -> 154306k
    Peak memory use before GGC: 89847k
    Peak memory use after GGC: 88957k
    Maximum of released memory in single GGC run: 17978k
    Garbage: 208363k
    Leak: 52514k
    Overhead: 24545k
    GGC runs: 398

comparing Gerald's testcase PR8361 compilation at -O0 -g level:
    Overall memory needed: 174644k -> 174642k
    Peak memory use before GGC: 102484k
    Peak memory use after GGC: 101469k
    Maximum of released memory in single GGC run: 18362k
    Garbage: 214034k
    Leak: 75824k
    Overhead: 30441k
    GGC runs: 373

comparing Gerald's testcase PR8361 compilation at -O1 level:
    Overall memory needed: 150468k -> 151328k
    Peak memory use before GGC: 101407k -> 101406k
    Peak memory use after GGC: 100402k
    Maximum of released memory in single GGC run: 17434k
    Garbage: 339691k -> 334947k
    Leak: 53505k -> 53027k
    Overhead: 31443k -> 31254k
    GGC runs: 514 -> 508

comparing Gerald's testcase PR8361 compilation at -O2 level:
  Amount of produced GGC garbage increased from 389623k to 394324k, overall 1.21%
    Overall memory needed: 158076k -> 160948k
    Peak memory use before GGC: 101781k -> 101775k
    Peak memory use after GGC: 100766k
    Maximum of released memory in single GGC run: 17434k
    Garbage: 389623k -> 394324k
    Leak: 53816k -> 53353k
    Overhead: 37289k -> 38055k
    GGC runs: 563 -> 568

comparing Gerald's testcase PR8361 compilation at -O3 level:
  Overall memory allocated via mmap and sbrk increased from 160540k to 165920k, overall 3.35%
  Amount of produced GGC garbage increased from 423283k to 433147k, overall 2.33%
    Overall memory needed: 160540k -> 165920k
    Peak memory use before GGC: 103429k -> 103408k
    Peak memory use after GGC: 102384k
    Maximum of released memory in single GGC run: 17795k
    Garbage: 423283k -> 433147k
    Leak: 54530k -> 54040k
    Overhead: 39860k -> 41037k
    GGC runs: 588 -> 596

comparing PR rtl-optimization/28071 testcase compilation at -O0 level:
    Overall memory needed: 387576k
    Peak memory use before GGC: 103387k
    Peak memory use after GGC: 59041k
    Maximum of released memory in single GGC run: 50582k
    Garbage: 179405k
    Leak: 8878k
    Overhead: 31379k
    GGC runs: 64

comparing PR rtl-optimization/28071 testcase compilation at -O0 -g level:
    Overall memory needed: 388384k
    Peak memory use before GGC: 104034k
    Peak memory use after GGC: 59687k
    Maximum of released memory in single GGC run: 50583k
    Garbage: 179558k
    Leak: 10646k
    Overhead: 31875k
    GGC runs: 72

comparing PR rtl-optimization/28071 testcase compilation at -O1 level:
  Amount of produced GGC garbage increased from 232479k to 233003k, overall 0.23%
    Overall memory needed: 311672k -> 310731k
    Peak memory use before GGC: 84351k -> 84236k
    Peak memory use after GGC: 75888k
    Maximum of released memory in single GGC run: 39401k -> 39285k
    Garbage: 232479k -> 233003k
    Leak: 22313k -> 22315k
    Overhead: 32107k -> 31902k
    GGC runs: 71

comparing PR rtl-optimization/28071 testcase compilation at -O2 level:
  Amount of produced GGC garbage increased from 240669k to 243809k, overall 1.30%
    Overall memory needed: 320080k -> 315559k
    Peak memory use before GGC: 80874k
    Peak memory use after GGC: 75888k
    Maximum of released memory in single GGC run: 33014k -> 33013k
    Garbage: 240669k -> 243809k
    Leak: 22393k -> 22396k
    Overhead: 34408k -> 34421k
    GGC runs: 83 -> 84

comparing PR rtl-optimization/28071 testcase compilation at -O3 -fno-tree-pre -fno-tree-fre level:
    Overall memory needed: 1032352k -> 1032351k
    Peak memory use before GGC: 184628k
    Peak memory use after GGC: 172144k
    Maximum of released memory in single GGC run: 80995k
    Garbage: 349089k
    Leak: 47771k
    Overhead: 45251k
    GGC runs: 66
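
The "overall N%" figures in the summary lines above can be checked by hand. The following is a minimal sketch, not the tester's actual code: the function name relative_change is hypothetical, and the formula is inferred only from the numbers in this report. Increases match new/old - 1, while the single decrease reported here (44476k -> 42555k, -4.51%) matches old/new - 1 with a negative sign, so the sketch assumes the larger value is always divided by the smaller one.

```python
def relative_change(old_kb, new_kb):
    """Return the signed percentage change between two sizes in kilobytes.

    Assumption (inferred from the figures in this report, not from the
    tester's source): the larger value is divided by the smaller one,
    and the sign says whether memory use grew or shrank.
    """
    if old_kb <= 0 or new_kb <= 0:
        raise ValueError("sizes must be positive")
    if new_kb >= old_kb:
        return (new_kb / old_kb - 1.0) * 100.0
    return -(old_kb / new_kb - 1.0) * 100.0


# combine.c at -O1: garbage 51750k -> 52593k
print(f"{relative_change(51750, 52593):.2f}%")
# combine.c at -O3: overall memory 44476k -> 42555k
print(f"{relative_change(44476, 42555):.2f}%")
```

Under that assumption the two calls give values that round to the 1.63% and -4.51% quoted in the report.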

Head of the ChangeLog is:

--- /usr/src/SpecTests/sandbox-britten-memory/x86_64/mem-result/ChangeLog	2007-06-30 03:09:31.000000000 +0000
+++ /usr/src/SpecTests/sandbox-britten-memory/gcc/gcc/ChangeLog	2007-06-30 18:28:13.000000000 +0000
@@ -1,3 +1,160 @@
+2007-06-30  Uros Bizjak  <ubizjak@gmail.com>
+
+	PR target/32433
+	* config/i386/i386.md (ffssi2): Expand as ffs_cmove for TARGET_CMOVE.
+	(ffs_cmove): New expander to expand using ctz pattern.
+	(*ffs_cmove): Remove pattern.
+	(*ffs_no_cmove): Enable only for !TARGET_CMOVE.
+	(ffsdi2): Expand using ctz pattern.
+	(*ffs_rex64): Remove pattern.
+
+2007-06-30  John David Anglin  <dave.anglin@nrc-cnrc.gc.ca>
+
+	PR rtl-optimization/32296
+	* pa.md (return): Delete pattern.
+	(return_internal): Remove "(const_int 1)" from pattern.
+	(epilogue): Use return_internal pattern for trivial returns.
+	* pa-protos.h (hppa_can_use_return_insn_p): Delete declaration.
+	* pa.c (hppa_can_use_return_insn_p): Delete function.  Include "df.h".
+
+2007-06-30  Daniel Berlin  <dberlin@dberlin.org>
+	
+	Fix PR tree-optimization/32540
+	Fix PR tree-optimization/31651
+
+	* tree-ssa-sccvn.c: New file.
+
+	* tree-ssa-sccvn.h: Ditto.
+	
+	* tree-vn.c: Include tree-ssa-sccvn.h
+	(val_expr_paid_d): Removed.
+	(value_table): Ditto.
+	(vn_compute): Ditto.
+	(val_expr_pair_hash): Ditto.
+	(val_expr_pair_expr_eq): Ditto.
+	(copy_vuses_from_stmt): Ditto.
+	(vn_delete): Ditto.
+	(vn_init): Ditto.
+	(shared_vuses_from_stmt): Ditto.
+	(print_creation_to_file): Moved up.
+	(sort_vuses): Ditto.
+	(sort_vuses_heap): Ditto.
+	(set_value_handle): Make non-static.
+	(make_value_handle): Ditto.
+	(vn_add): Rewritten to use sccvn lookups.
+	(vn_add_with_vuses): Ditto.
+	(vn_lookup): Ditto (and second argument removed).
+	(vn_lookup_with_vuses): Ditto.
+	(vn_lookup_or_add): Ditto (and second argument removed);
+	(vn_lookup_or_add_with_vuses): Ditto.
+	(vn_lookup_with_stmt): New.
+	(vn_lookup_or_add_with_stmt): Ditto.
+	(create_value_handle_for_expr): Ditto.
+
+	* tree-ssa-pre.c: Include tree-ssa-sccvn.h.
+	(seen_during_translate): New function.
+	(phi_trans_lookup): Use iterative_hash_expr, not vn_compute.
+	(phi_trans_add): Ditto.
+	(constant_expr_p): FIELD_DECL is always constant.
+	(phi_translate_1): Renamed from phi_translate, add seen bitmap.
+	Use constant_expr_p.
+	Avoid infinite recursion on mutually valued expressions.
+	Change callers of vn_lookup_or_add.
+	(phi_translate): New function.
+	(compute_antic_safe): Allow phi nodes.
+	(create_component_ref_by_pieces): Update for FIELD_DECL change.
+	(find_or_generate_expression): Rewrite slightly.
+	(create_expression_by_pieces): Updated for vn_lookup_or_add
+	change.
+	Update VN_INFO for new names.
+	(insert_into_preds_of_block): Update for new names.
+	(add_to_exp_gen): New function.
+	(add_to_sets): Use vn_lookup_or_add_with_stmt.
+	(find_existing_value_expr): Rewrite to changed vn_lookup.
+	(create_value_expr_from): Ditto, and use add_to_exp_gen.
+	(try_look_through_load): Removed.
+	(try_combine_conversion): Ditto.
+	(get_sccvn_value): New function.
+	(make_values_for_phi): Ditto.
+	(make_values_for_stmt): Ditto.
+	(compute_avail): Rewritten for vn_lookup_or_add changes and to use
+	SCCVN.
+	(init_pre): Update for SCCVN changes.
+	(fini_pre): Ditto.
+	(execute_pre): Ditto.
+
+	* tree-flow.h (make_value_handle): Declare.
+	(set_value_handle): Ditto.
+	(sort_vuses_heap): Ditto.
+	(vn_lookup_or_add_with_stmt): Ditto.
+	(vn_lookup_with_stmt): Ditto.
+	(vn_compute): Remove.
+	(vn_init): Ditto.
+	(vn_delete): Ditto.
+	(vn_lookup): Update arguments.
+
+	* Makefile.in (tree-ssa-pre.o): Add tree-ssa-sccvn.h
+	(tree-vn.o): Ditto.
+	(tree-ssa-sccvn.o): New.
+	(OBJS-common): Add tree-ssa-sccvn.o
+	
+2007-06-30  Manuel Lopez-Ibanez  <manu@gcc.gnu.org>
+
+	PR c/4076
+	* c-typeck.c (build_external_ref): Don't mark as used if called
+	from itself.
+	* calls.c (rtx_for_function_call): Likewise.
+	
+2007-06-30  Richard Sandiford  <richard@codesourcery.com>
+
+	Revert:
+
+	2007-06-27  Richard Sandiford  <richard@codesourcery.com>
+
+	* dce.c (deletable_insn_p_1): New function, split out from...
+	(deletable_insn_p): ...here.  Only treat bare USEs and CLOBBERs
+	specially, not those inside PARALLELs.  Remove BODY argument
+	and adjust recursive call accordingly.
+	(prescan_insns_for_dce): Update call to delete_insn_p.
+
+2007-06-30  Rask Ingemann Lambertsen <rask@sygehus.dk>
+
+	* combine.c (combine_validate_cost): New parameter NEWOTHERPAT.
+	(try_combine): Move potential calls to undo_all() so they happen
+	before we commit to using the combined insns.
+
+2006-06-30  Jan Hubicka  <jh@suse.cz>
+
+	* loop-unroll.c (unroll_loop_runtime_iterations): Unshare newly emit    
+	code.
+
+2006-06-30  Thomas Neumann  <tneumann@users.sourceforge.net>
+
+	* ipa.c (cgraph_postorder): Cast according to the coding conventions.
+	(cgraph_remove_unreachable_nodes): Likewise.
+	* ipa-cp.c (ipcp_propagate_stage): Use BOTTOM instead of integer 0.
+	* ipa-inline.c (update_caller_keys): Cast according to the coding
+	conventions.
+	(cgraph_decide_recursive_inlining): Likewise.
+	(cgraph_decide_inlining_of_small_function): Likewise.
+	(try_inline): Likewise.
+	(cgraph_decide_inlining_incrementally): Likewise.
+	* ipa-pure-const.c (get_function_state): Likewise.
+	(scan_function): Likewise.
+	(analyze_function): Likewise.
+ 	(static_execute): Likewise.
+	* gcc/ipa-reference.c (scan_for_static_refs): Likewise.
+	(merge_callee_local_info): Likewise.
+	(analyze_function): Use type safe memory macros.
+	(static_execute): Likewise. Cast according to the coding conventions.
+	* ipa-type-escape.c (scan_for_regs): Cast according to the coding
+	conventions.
+	* ipa-utils.c (searchc): Likewise. Avoid using C++ keywords as variable
+	names.
+	(ipa_utils_reduced_inorder): Likewise. Use type safe memory macros.
+	* ipa-utils.h (struct ipa_dfa_info): Avoid using C++ keywords as
+	variable names.
+
 2007-06-29  Andrew Pinski  <andrew_pinski@playstation.sony.com>
 
 	PR middle-end/30024


The results can be reproduced by building a compiler with

--enable-gather-detailed-mem-stats targeting x86-64

and compiling preprocessed combine.c or the testcase from PR8361 with:

-fmem-report --param=ggc-min-heapsize=1024 --param=ggc-min-expand=1 -Ox -Q

The memory consumption summary appears in the dump after the detailed
listing of the places where memory is allocated.  Peak memory consumption
is actually computed by looking for the maximal value in the
{GC XXXX -> YYYY} reports.
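
The peak computation described above can be sketched as follows. This is an illustration only, not the tester's actual code: the function name peak_ggc_usage is hypothetical, and it assumes each collection is reported as "{GC <before>k -> <after>k}" with integer kilobyte values, which may not match the exact output of every GCC version.

```python
import re

# Assumed report format: "{GC 18650k -> 18464k}" (integer kilobytes).
GC_REPORT = re.compile(r"\{GC (\d+)k -> (\d+)k\}")


def peak_ggc_usage(dump_text):
    """Return (peak_before, peak_after) in kilobytes, or None if no
    GC reports are present in the dump."""
    before = []
    after = []
    for match in GC_REPORT.finditer(dump_text):
        before.append(int(match.group(1)))
        after.append(int(match.group(2)))
    if not before:
        return None
    # The peak is simply the maximal value seen on each side of the arrow.
    return max(before), max(after)


sample = " {GC 3324k -> 2989k} {GC 3100k -> 2900k}"
print(peak_ggc_usage(sample))
```

The first element corresponds to "Peak memory use before GGC" and the second to "Peak memory use after GGC" in the tables above.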

Your testing script.

