A recent patch increased GCC's memory consumption!

gcctest@suse.de gcctest@suse.de
Sun Jun 5 11:11:00 GMT 2005


Hi,

I am a friendly script caring about memory consumption in GCC.  Please
contact jh@suse.cz if something is going wrong.

Comparing memory consumption on compilation of combine.i, insn-attrtab.i,
and generate-3.4.ii I got:


comparing combine.c compilation at -O0 level:
    Overall memory needed: 25296k -> 25300k
    Peak memory use before GGC: 9665k
    Peak memory use after GGC: 8978k
    Maximum of released memory in single GGC run: 2790k
    Garbage: 42282k
    Leak: 6724k
    Overhead: 5889k
    GGC runs: 328

comparing combine.c compilation at -O1 level:
  Peak amount of GGC memory still allocated after garbage collecting increased from 8732k to 8756k, overall 0.27%
    Overall memory needed: 27640k -> 27812k
    Peak memory use before GGC: 9173k -> 9115k
    Peak memory use after GGC: 8732k -> 8756k
    Maximum of released memory in single GGC run: 2204k
    Garbage: 63163k -> 63159k
    Leak: 7106k
    Overhead: 7792k -> 8035k
    GGC runs: 520 -> 523

comparing combine.c compilation at -O2 level:
    Overall memory needed: 24832k -> 24836k
    Peak memory use before GGC: 18290k
    Peak memory use after GGC: 18109k
    Maximum of released memory in single GGC run: 2524k
    Garbage: 86245k -> 86254k
    Leak: 7095k
    Overhead: 10927k -> 11178k
    GGC runs: 479 -> 478

comparing combine.c compilation at -O3 level:
    Overall memory needed: 25068k -> 25096k
    Peak memory use before GGC: 18292k -> 18294k
    Peak memory use after GGC: 18109k
    Maximum of released memory in single GGC run: 3099k
    Garbage: 114900k -> 114893k
    Leak: 7182k
    Overhead: 14471k -> 14800k
    GGC runs: 528 -> 529

comparing insn-attrtab.c compilation at -O0 level:
    Overall memory needed: 85588k
    Peak memory use before GGC: 73387k
    Peak memory use after GGC: 45373k
    Maximum of released memory in single GGC run: 37597k
    Garbage: 153190k
    Leak: 11550k
    Overhead: 19639k
    GGC runs: 268

comparing insn-attrtab.c compilation at -O1 level:
  Peak amount of GGC memory allocated before garbage collecting increased from 76447k to 76700k, overall 0.33%
  Peak amount of GGC memory still allocated after garbage collecting increased from 65602k to 65854k, overall 0.38%
    Overall memory needed: 101184k -> 101588k
    Peak memory use before GGC: 76447k -> 76700k
    Peak memory use after GGC: 65602k -> 65854k
    Maximum of released memory in single GGC run: 37075k
    Garbage: 304678k -> 304678k
    Leak: 11611k
    Overhead: 38534k -> 39363k
    GGC runs: 381

comparing insn-attrtab.c compilation at -O2 level:
  Peak amount of GGC memory allocated before garbage collecting increased from 121115k to 121367k, overall 0.21%
  Peak amount of GGC memory still allocated after garbage collecting increased from 92207k to 92459k, overall 0.27%
    Overall memory needed: 154684k -> 156172k
    Peak memory use before GGC: 121115k -> 121367k
    Peak memory use after GGC: 92207k -> 92459k
    Maximum of released memory in single GGC run: 32950k -> 32951k
    Garbage: 402864k -> 402859k
    Leak: 11447k
    Overhead: 51375k -> 52218k
    GGC runs: 303

comparing insn-attrtab.c compilation at -O3 level:
  Peak amount of GGC memory allocated before garbage collecting increased from 121117k to 121369k, overall 0.21%
  Peak amount of GGC memory still allocated after garbage collecting increased from 92209k to 92461k, overall 0.27%
    Overall memory needed: 154632k -> 156276k
    Peak memory use before GGC: 121117k -> 121369k
    Peak memory use after GGC: 92209k -> 92461k
    Maximum of released memory in single GGC run: 32950k -> 32951k
    Garbage: 403630k -> 403626k
    Leak: 11468k
    Overhead: 51496k -> 52343k
    GGC runs: 309 -> 308

comparing Gerald's testcase PR8361 compilation at -O0 level:
    Overall memory needed: 127340k
    Peak memory use before GGC: 103091k
    Peak memory use after GGC: 102069k
    Maximum of released memory in single GGC run: 21524k
    Garbage: 247514k
    Leak: 53785k
    Overhead: 42959k
    GGC runs: 346

comparing Gerald's testcase PR8361 compilation at -O1 level:
    Overall memory needed: 121448k -> 121452k
    Peak memory use before GGC: 112533k -> 112534k
    Peak memory use after GGC: 100711k
    Maximum of released memory in single GGC run: 20026k -> 20027k
    Garbage: 662424k -> 662426k
    Leak: 58917k -> 58933k
    Overhead: 86365k -> 91376k
    GGC runs: 518 -> 514

comparing Gerald's testcase PR8361 compilation at -O2 level:
    Overall memory needed: 121436k -> 121424k
    Peak memory use before GGC: 112534k
    Peak memory use after GGC: 100711k
    Maximum of released memory in single GGC run: 20027k
    Garbage: 758014k -> 757720k
    Leak: 59636k -> 59620k
    Overhead: 104501k -> 109597k
    GGC runs: 600 -> 591

comparing Gerald's testcase PR8361 compilation at -O3 level:
    Overall memory needed: 124376k
    Peak memory use before GGC: 115276k
    Peak memory use after GGC: 102537k
    Maximum of released memory in single GGC run: 21386k
    Garbage: 814222k -> 814170k
    Leak: 60964k
    Overhead: 111035k -> 116762k
    GGC runs: 604 -> 599
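
The "overall N%" figures in the warning lines above are consistent with a
plain relative increase, (new - old) / old, rounded to two decimals.  A
minimal sketch of that arithmetic follows; the function name is hypothetical
and this is not the tester's actual code:

```python
def relative_increase_percent(old_k: int, new_k: int) -> float:
    """Relative growth from old_k to new_k, in percent (hypothetical helper)."""
    return (new_k - old_k) / old_k * 100.0


# Values taken from the combine.c -O1 and insn-attrtab.c -O1 warnings above.
print(f"{relative_increase_percent(8732, 8756):.2f}%")    # 0.27%
print(f"{relative_increase_percent(76447, 76700):.2f}%")  # 0.33%
```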

Head of the ChangeLog is:

--- /usr/src/SpecTests/sandbox-britten-memory/x86_64/mem-result/ChangeLog	2005-06-05 08:52:57.000000000 +0000
+++ /usr/src/SpecTests/sandbox-britten-memory/gcc/gcc/ChangeLog	2005-06-05 10:09:20.000000000 +0000
@@ -1,3 +1,66 @@
+2005-06-05  Dorit Nuzman  <dorit@il.ibm.com>
+
+        * tree-flow.h (stmt_ann_d): Move aux to ...
+        (tree_ann_common_d): ... here.
+        * tree-ssa-loop-im.c (LIM_DATA, determine_invariantness_stmt,
+        move_computations_stmt, schedule_sm): Update references to
+        aux.
+        * tree-vectorizer.h (set_stmt_info, vinfo_for_stmt): Likewise.
+        * tree-vect-transform.c (vect_create_index_for_vector_ref): Update
+        call to set_stmt_info.
+        (vect_transform_loop): Likewise.
+        * tree-vectorizer.c (new_loop_vec_info, destroy_loop_vec_info):
+        Likewise.
+
+        * tree-vect-analyze.c (vect_analyze_scalar_cycles): Made void instead of
+        bool.
+        (vect_mark_relevant): Takes two additional arguments - live_p and
+        relevant_p. Set RELEVANT_P and LIVE_P according to these arguments.
+        (vect_stmt_relevant_p): Differentiate between a live stmt and a
+        relevant stmt. Return two values = live_p and relevant_p.
+        (vect_mark_stmts_to_be_vectorized): Call vect_mark_relevant and
+        vect_stmt_relevant_p with additional arguments. Phis are no longer
+        put into the worklist (analyzed seperately in analyze_scalar_cycles).
+        (vect_determine_vectorization_factor): Also check for LIVE_P, because a
+        stmt that is marked as irrelevant and live, cause it's only used out
+        side the loop, may need to be vectorized (e.g. reduction).
+        (vect_analyze_operations): Examine phis. Call
+        vectorizable_live_operation for for LIVE_P stmts. Check if
+        need_to_vectorize.
+        (vect_analyze_scalar_cycles): Update documentation. Don't fail
+        vectorization - just classify the scalar cycles created by the loop
+        phis. Call vect_is_simple_reduction.
+        (vect_analyze_loop): Call to analyze_scalar_cycles moved earlier.
+        * tree-vect-transform.c (vect_create_index_for_vector_ref): Update
+        call to set_stmt_info.
+        (vect_get_vec_def_for_operand): Code reorganized - the code that
+        classifies the type of use was factored out to vect_is_simple_use.
+        (vectorizable_store, vect_is_simple_cond): Call vect_is_simple_use with
+        additional arguments.
+        (vectorizable_assignment): Likewise. Also make sure the stmt is relevant
+        and computes a loop_vec_def.
+        (vectorizable_operation, vectorizable_load, vectorizable_condition):
+        Likewise.
+        (vectorizable_live_operation): New.
+        (vect_transform_stmt): Handle LIVE_P stmts.
+        * tree-vectorizer.c (new_stmt_vec_info): Initialize the new fields
+        STMT_VINFO_LIVE_P and STMT_VINFO_DEF_TYPE.
+        (new_loop_vec_info, destroy_loop_vec_info): Also handle phis.
+        (vect_is_simple_use): Determine the type of the def and return it
+        in a new function argument. Consider vect_reduction_def and
+        vect_induction_def, but for now these are not supported.
+        (vect_is_simple_reduction): New. Empty for now.
+        * tree-vectorizer.h (vect_def_type): New enum type.
+        (_stmt_vec_info): Added new fields - live and _stmt_vec_info.
+        (STMT_VINFO_LIVE_P, STMT_VINFO_DEF_TYPE): New accessor macros.
+        (vect_is_simple_use): New arguments added to function declaration.
+        (vect_is_simple_reduction): New function declaration.
+        (vectorizable_live_operation): New function declaration.
+
+        * tree-vect-analyze.c (vect_can_advance_ivs_p): Add debug printout.
+        (vect_can_advance_ivs_p): Likewise.
+        * tree-vect-transform.c (vect_update_ivs_after_vectorizer): Likewise.
+
 2005-06-05  Eric Christopher  <echristo@redhat.com>
 
 	* config/mips/mips.c (mips_rtx_costs): Remove unused variable.


The results can be reproduced by building a compiler with

--enable-gather-detailed-mem-stats targeting x86-64

and compiling preprocessed combine.c or the testcase from PR8361 with:

-fmem-report --param=ggc-min-heapsize=1024 --param=ggc-min-expand=1 -Ox -Q

The memory consumption summary appears in the dump after the detailed
listing of the places where memory is allocated.  Peak memory consumption
is computed by looking for the maximal value in the {GC XXXX -> YYYY} reports.
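
The peak computation described above can be sketched roughly as follows.
This assumes the collector reports look like "{GC 1234k -> 567k}" in the
captured compiler output, and that the maximum XXXX and YYYY values
correspond to the "before GGC" and "after GGC" peaks; the function name and
the sample text are hypothetical, not taken from the real tester:

```python
import re

# Assumed report shape: "{GC <before>k -> <after>k}".
GC_REPORT = re.compile(r"\{GC (\d+)k -> (\d+)k\}")


def peak_ggc_memory(log_text: str):
    """Return (peak_before_k, peak_after_k), or None if no report is found."""
    pairs = [(int(before), int(after))
             for before, after in GC_REPORT.findall(log_text)]
    if not pairs:
        return None
    peak_before = max(before for before, _ in pairs)
    peak_after = max(after for _, after in pairs)
    return peak_before, peak_after


# Made-up sample output, for illustration only.
sample = " {GC 1200k -> 900k} {GC 2500k -> 1800k} {GC 2100k -> 1900k}"
print(peak_ggc_memory(sample))  # (2500, 1900)
```

Note that the two peaks may come from different collection runs, as they do
in the sample.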

Your testing script.


