This is the mail archive of the gcc-regression@gcc.gnu.org mailing list for the GCC project.



A recent patch increased GCC's memory consumption!


Hi,

I am a friendly script caring about memory consumption in GCC.  Please
contact jh@suse.cz if something is going wrong.

Comparing memory consumption on compilation of combine.i, insn-attrtab.i,
and generate-3.4.ii I got:


comparing empty function compilation at -O0 level:
  Peak amount of GGC memory allocated before garbage collecting increased from 1166k to 1191k, overall 2.14%
  Peak amount of GGC memory still allocated after garbage collecting increased from 1062k to 1091k, overall 2.73%
  Amount of produced GGC garbage increased from 249k to 249k, overall 0.10%
  Amount of memory still referenced at the end of compilation increased from 1077k to 1105k, overall 2.57%
    Overall memory needed: 7043k -> 7045k
    Peak memory use before GGC: 1166k -> 1191k
    Peak memory use after GGC: 1062k -> 1091k
    Maximum of released memory in single GGC run: 121k -> 125k
    Garbage: 249k -> 249k
    Leak: 1077k -> 1105k
    Overhead: 148k -> 150k
    GGC runs: 4
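
The "overall N%" figures above look like plain relative increases. Here is a
minimal sketch of that arithmetic, assuming the formula (new - old) / old; the
helper name pct_increase is hypothetical and not part of the testing script.
The script presumably works from unrounded byte counts, so recomputing from
the rounded "k" figures may differ in the last digit (for example, the Leak
line gives about 2.60% rather than the reported 2.57%).

```python
def pct_increase(old_k, new_k):
    """Relative increase from old_k to new_k, in percent (assumed formula)."""
    return (new_k - old_k) / old_k * 100.0


# Figures taken from the -O0 empty-function block above (rounded to k).
before_gc = pct_increase(1166, 1191)   # peak before garbage collecting
after_gc = pct_increase(1062, 1091)    # peak after garbage collecting

print(f"peak before GGC: {before_gc:.2f}%")
print(f"peak after GGC:  {after_gc:.2f}%")
```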

comparing empty function compilation at -O0 -g level:
  Peak amount of GGC memory allocated before garbage collecting increased from 1193k to 1219k, overall 2.18%
  Peak amount of GGC memory still allocated after garbage collecting increased from 1089k to 1118k, overall 2.66%
  Amount of produced GGC garbage increased from 251k to 252k, overall 0.10%
  Amount of memory still referenced at the end of compilation increased from 1110k to 1138k, overall 2.49%
    Overall memory needed: 7059k -> 7045k
    Peak memory use before GGC: 1193k -> 1219k
    Peak memory use after GGC: 1089k -> 1118k
    Maximum of released memory in single GGC run: 124k -> 128k
    Garbage: 251k -> 252k
    Leak: 1110k -> 1138k
    Overhead: 152k -> 155k
    GGC runs: 4

comparing empty function compilation at -O1 level:
  Peak amount of GGC memory allocated before garbage collecting increased from 1166k to 1191k, overall 2.14%
  Peak amount of GGC memory still allocated after garbage collecting increased from 1054k to 1083k, overall 2.75%
  Amount of produced GGC garbage increased from 251k to 251k, overall 0.10%
  Amount of memory still referenced at the end of compilation increased from 1078k to 1106k, overall 2.57%
    Overall memory needed: 7095k -> 7097k
    Peak memory use before GGC: 1166k -> 1191k
    Peak memory use after GGC: 1054k -> 1083k
    Maximum of released memory in single GGC run: 117k -> 121k
    Garbage: 251k -> 251k
    Leak: 1078k -> 1106k
    Overhead: 148k -> 151k
    GGC runs: 3

comparing empty function compilation at -O2 level:
  Peak amount of GGC memory allocated before garbage collecting increased from 1166k to 1192k, overall 2.23%
  Peak amount of GGC memory still allocated after garbage collecting increased from 1054k to 1083k, overall 2.75%
  Amount of produced GGC garbage increased from 255k to 255k, overall 0.10%
  Amount of memory still referenced at the end of compilation increased from 1079k to 1107k, overall 2.56%
    Overall memory needed: 7099k -> 7101k
    Peak memory use before GGC: 1166k -> 1192k
    Peak memory use after GGC: 1054k -> 1083k
    Maximum of released memory in single GGC run: 116k -> 121k
    Garbage: 255k -> 255k
    Leak: 1079k -> 1107k
    Overhead: 149k -> 152k
    GGC runs: 3

comparing empty function compilation at -O3 level:
  Peak amount of GGC memory allocated before garbage collecting increased from 1166k to 1192k, overall 2.23%
  Peak amount of GGC memory still allocated after garbage collecting increased from 1054k to 1083k, overall 2.75%
  Amount of produced GGC garbage increased from 255k to 255k, overall 0.10%
  Amount of memory still referenced at the end of compilation increased from 1079k to 1107k, overall 2.56%
    Overall memory needed: 7099k -> 7101k
    Peak memory use before GGC: 1166k -> 1192k
    Peak memory use after GGC: 1054k -> 1083k
    Maximum of released memory in single GGC run: 116k -> 121k
    Garbage: 255k -> 255k
    Leak: 1079k -> 1107k
    Overhead: 149k -> 152k
    GGC runs: 3

comparing combine.c compilation at -O0 level:
  Peak amount of GGC memory allocated before garbage collecting increased from 8292k to 8321k, overall 0.35%
  Peak amount of GGC memory still allocated after garbage collecting increased from 7632k to 7661k, overall 0.38%
  Amount of memory still referenced at the end of compilation increased from 6210k to 6237k, overall 0.45%
    Overall memory needed: 22047k -> 22081k
    Peak memory use before GGC: 8292k -> 8321k
    Peak memory use after GGC: 7632k -> 7661k
    Maximum of released memory in single GGC run: 1581k
    Garbage: 38805k -> 38808k
    Leak: 6210k -> 6237k
    Overhead: 5052k -> 5054k
    GGC runs: 372 -> 369

comparing combine.c compilation at -O0 -g level:
  Peak amount of GGC memory allocated before garbage collecting increased from 10122k to 10151k, overall 0.29%
  Peak amount of GGC memory still allocated after garbage collecting increased from 9396k to 9425k, overall 0.31%
  Amount of memory still referenced at the end of compilation increased from 9037k to 9064k, overall 0.31%
    Overall memory needed: 24055k -> 24097k
    Peak memory use before GGC: 10122k -> 10151k
    Peak memory use after GGC: 9396k -> 9425k
    Maximum of released memory in single GGC run: 1874k -> 1875k
    Garbage: 39156k -> 39162k
    Leak: 9037k -> 9064k
    Overhead: 5724k -> 5726k
    GGC runs: 344 -> 341

comparing combine.c compilation at -O1 level:
  Peak amount of GGC memory allocated before garbage collecting increased from 17067k to 17092k, overall 0.15%
  Peak amount of GGC memory still allocated after garbage collecting increased from 16879k to 16904k, overall 0.15%
  Amount of memory still referenced at the end of compilation increased from 6357k to 6376k, overall 0.31%
    Overall memory needed: 33127k -> 33173k
    Peak memory use before GGC: 17067k -> 17092k
    Peak memory use after GGC: 16879k -> 16904k
    Maximum of released memory in single GGC run: 1379k -> 1378k
    Garbage: 52438k -> 52447k
    Leak: 6357k -> 6376k
    Overhead: 6033k -> 6035k
    GGC runs: 442 -> 439

comparing combine.c compilation at -O2 level:
  Peak amount of GGC memory allocated before garbage collecting increased from 17137k to 17169k, overall 0.19%
  Peak amount of GGC memory still allocated after garbage collecting increased from 16967k to 16993k, overall 0.15%
  Amount of memory still referenced at the end of compilation increased from 6675k to 6703k, overall 0.42%
    Overall memory needed: 35387k -> 35429k
    Peak memory use before GGC: 17137k -> 17169k
    Peak memory use after GGC: 16967k -> 16993k
    Maximum of released memory in single GGC run: 1336k -> 1335k
    Garbage: 71317k -> 71359k
    Leak: 6675k -> 6703k
    Overhead: 8284k -> 8289k
    GGC runs: 510 -> 507

comparing combine.c compilation at -O3 level:
  Peak amount of GGC memory still allocated after garbage collecting increased from 17021k to 17046k, overall 0.15%
  Amount of memory still referenced at the end of compilation increased from 6802k to 6818k, overall 0.23%
    Overall memory needed: 39871k -> 38849k
    Peak memory use before GGC: 17366k -> 17368k
    Peak memory use after GGC: 17021k -> 17046k
    Maximum of released memory in single GGC run: 2131k -> 2130k
    Garbage: 92695k -> 92724k
    Leak: 6802k -> 6818k
    Overhead: 10764k -> 10792k
    GGC runs: 540 -> 536

comparing insn-attrtab.c compilation at -O0 level:
  Amount of memory still referenced at the end of compilation increased from 8939k to 8967k, overall 0.31%
    Overall memory needed: 138499k -> 138533k
    Peak memory use before GGC: 58646k -> 58675k
    Peak memory use after GGC: 32139k -> 32168k
    Maximum of released memory in single GGC run: 34144k
    Garbage: 131583k -> 131586k
    Leak: 8939k -> 8967k
    Overhead: 14856k -> 14859k
    GGC runs: 296 -> 294

comparing insn-attrtab.c compilation at -O0 -g level:
  Amount of memory still referenced at the end of compilation increased from 10375k to 10643k, overall 2.58%
    Overall memory needed: 139751k -> 139793k
    Peak memory use before GGC: 59795k -> 59824k
    Peak memory use after GGC: 33288k -> 33317k
    Maximum of released memory in single GGC run: 34144k
    Garbage: 132068k -> 131809k
    Leak: 10375k -> 10643k
    Overhead: 15237k -> 15239k
    GGC runs: 290 -> 289

comparing insn-attrtab.c compilation at -O1 level:
  Amount of memory still referenced at the end of compilation increased from 9832k to 9859k, overall 0.28%
    Overall memory needed: 149715k -> 149765k
    Peak memory use before GGC: 57143k -> 57168k
    Peak memory use after GGC: 50913k -> 50938k
    Maximum of released memory in single GGC run: 24232k
    Garbage: 212484k -> 212484k
    Leak: 9832k -> 9859k
    Overhead: 24861k -> 24863k
    GGC runs: 320 -> 319

comparing insn-attrtab.c compilation at -O2 level:
  Amount of memory still referenced at the end of compilation increased from 10919k to 10947k, overall 0.25%
    Overall memory needed: 187259k -> 187337k
    Peak memory use before GGC: 57777k -> 57802k
    Peak memory use after GGC: 52503k -> 52531k
    Maximum of released memory in single GGC run: 22973k
    Garbage: 253944k -> 253951k
    Leak: 10919k -> 10947k
    Overhead: 30607k -> 30609k
    GGC runs: 351 -> 350

comparing insn-attrtab.c compilation at -O3 level:
  Amount of memory still referenced at the end of compilation increased from 10925k to 10952k, overall 0.25%
    Overall memory needed: 194487k -> 194545k
    Peak memory use before GGC: 69771k -> 69795k
    Peak memory use after GGC: 63204k -> 63228k
    Maximum of released memory in single GGC run: 23359k -> 23355k
    Garbage: 280710k -> 280713k
    Leak: 10925k -> 10952k
    Overhead: 32373k -> 32376k
    GGC runs: 351 -> 350

comparing Gerald's testcase PR8361 compilation at -O0 level:
    Overall memory needed: 155216k -> 155475k
    Peak memory use before GGC: 89694k -> 89723k
    Peak memory use after GGC: 88801k -> 88830k
    Maximum of released memory in single GGC run: 18062k
    Garbage: 210317k -> 210310k
    Leak: 52988k -> 53020k
    Overhead: 26476k -> 26478k
    GGC runs: 418

comparing Gerald's testcase PR8361 compilation at -O0 -g level:
    Overall memory needed: 174552k -> 174583k
    Peak memory use before GGC: 101137k -> 101166k
    Peak memory use after GGC: 100135k -> 100164k
    Maximum of released memory in single GGC run: 18248k
    Garbage: 215905k -> 215902k
    Leak: 74812k -> 74843k
    Overhead: 31885k -> 31887k
    GGC runs: 392

comparing Gerald's testcase PR8361 compilation at -O1 level:
    Overall memory needed: 121560k -> 121731k
    Peak memory use before GGC: 88593k -> 88622k
    Peak memory use after GGC: 87716k -> 87745k
    Maximum of released memory in single GGC run: 17329k
    Garbage: 297852k -> 297826k
    Leak: 52244k -> 52266k
    Overhead: 30854k -> 30850k
    GGC runs: 517 -> 516

comparing Gerald's testcase PR8361 compilation at -O2 level:
    Overall memory needed: 127212k -> 127535k
    Peak memory use before GGC: 88773k -> 88802k
    Peak memory use after GGC: 87886k -> 87914k
    Maximum of released memory in single GGC run: 17313k
    Garbage: 365038k -> 364715k
    Leak: 53323k -> 53354k
    Overhead: 38073k -> 38016k
    GGC runs: 594

comparing Gerald's testcase PR8361 compilation at -O3 level:
    Overall memory needed: 131108k -> 130955k
    Peak memory use before GGC: 89879k -> 89908k
    Peak memory use after GGC: 88984k -> 89013k
    Maximum of released memory in single GGC run: 17671k
    Garbage: 392049k -> 391468k
    Leak: 53574k -> 53602k
    Overhead: 40464k -> 40419k
    GGC runs: 610

comparing PR rtl-optimization/28071 testcase compilation at -O0 level:
  Amount of memory still referenced at the end of compilation increased from 6297k to 6325k, overall 0.44%
    Overall memory needed: 379373k -> 379411k
    Peak memory use before GGC: 101495k -> 101524k
    Peak memory use after GGC: 57149k -> 57178k
    Maximum of released memory in single GGC run: 50582k
    Garbage: 179457k -> 179457k
    Leak: 6297k -> 6325k
    Overhead: 30887k -> 30890k
    GGC runs: 107 -> 105

comparing PR rtl-optimization/28071 testcase compilation at -O0 -g level:
  Amount of memory still referenced at the end of compilation increased from 8005k to 8033k, overall 0.35%
    Overall memory needed: 380189k -> 380223k
    Peak memory use before GGC: 102129k -> 102158k
    Peak memory use after GGC: 57782k -> 57811k
    Maximum of released memory in single GGC run: 50583k
    Garbage: 179561k -> 179514k
    Leak: 8005k -> 8033k
    Overhead: 31352k -> 31355k
    GGC runs: 111 -> 110

comparing PR rtl-optimization/28071 testcase compilation at -O1 level:
  Amount of memory still referenced at the end of compilation increased from 15636k to 15663k, overall 0.18%
    Overall memory needed: 296755k -> 296257k
    Peak memory use before GGC: 80804k -> 80827k
    Peak memory use after GGC: 73190k -> 73215k
    Maximum of released memory in single GGC run: 40019k -> 40017k
    Garbage: 236023k -> 236019k
    Leak: 15636k -> 15663k
    Overhead: 31660k -> 31663k
    GGC runs: 105 -> 103

comparing PR rtl-optimization/28071 testcase compilation at -O2 level:
  Amount of memory still referenced at the end of compilation increased from 15725k to 15752k, overall 0.18%
    Overall memory needed: 270375k -> 270949k
    Peak memory use before GGC: 78176k -> 78201k
    Peak memory use after GGC: 73190k -> 73215k
    Maximum of released memory in single GGC run: 33751k -> 33754k
    Garbage: 246062k -> 246066k
    Leak: 15725k -> 15752k
    Overhead: 33727k -> 33730k
    GGC runs: 118 -> 116

comparing PR rtl-optimization/28071 testcase compilation at -O3 -fno-tree-pre -fno-tree-fre level:
  Amount of memory still referenced at the end of compilation increased from 25867k to 25895k, overall 0.11%
    Overall memory needed: 1017567k -> 1017649k
    Peak memory use before GGC: 166818k -> 166843k
    Peak memory use after GGC: 156382k -> 156407k
    Maximum of released memory in single GGC run: 83494k -> 83495k
    Garbage: 357355k -> 357355k
    Leak: 25867k -> 25895k
    Overhead: 46232k -> 46235k
    GGC runs: 99 -> 97

Head of the ChangeLog is:

--- /usr/src/SpecTests/sandbox-britten-memory/x86_64/mem-result/ChangeLog	2007-09-14 22:41:04.000000000 +0000
+++ /usr/src/SpecTests/sandbox-britten-memory/gcc/gcc/ChangeLog	2007-09-16 07:07:26.000000000 +0000
@@ -1,3 +1,55 @@
+2007-09-15  Zdenek Dvorak  <ook@ucw.cz>
+
+	* tree-parloops.c: New file.
+	* tree-ssa-operands.h (free_stmt_operands): Declare.
+	* tree-ssa-loop-manip.c (split_loop_exit_edge): Return the new basic
+	block.
+	* tree-pass.h (pass_parallelize_loops): Declare.
+	* omp-low.c (expand_omp_parallel, expand_omp_for): Update SSA form for
+	virtual operands.
+	(build_omp_regions_1): Allow analysing just a single OMP region and
+	its subregions.
+	( build_omp_regions_root, omp_expand_local): New functions.
+	(build_omp_regions): Add argument to build_omp_regions_1 call.
+	* builtins.def (DEF_GOMP_BUILTIN): Initialize OMP builtins when
+	autoparallelization is run.
+	* timevar.def (TV_TREE_PARALLELIZE_LOOPS): New.
+	* tree-ssa-loop.c (gate_tree_parallelize_loops, tree_parallelize_loops,
+	pass_parallelize_loops): New.
+	* common.opt (ftree-parallelize-loops): New.
+	* tree-flow.h (omp_expand_local, tree_duplicate_sese_tail,
+	parallelize_loops): Declare.
+	(add_phi_args_after_copy, split_loop_exit_edge): Declaration changed.
+	* Makefile.in (tree-parloops.o): Added.
+	* tree-cfg.c (add_phi_args_after_copy_edge, tree_duplicate_sese_tail):
+	New functions.
+	(add_phi_args_after_copy_bb): Use add_phi_args_after_copy_edge.
+	(add_phi_args_after_copy): Call add_phi_args_after_copy_edge for
+	one extra edge as well.
+	(tree_duplicate_sese_region): Add argument to add_phi_args_after_copy.
+	Use VEC_free to free doms vector.
+	(move_block_to_fn): Update loop info. Remove phi nodes for virtual
+	operands.  Recompute operand caches in the new function.
+	(move_sese_region_to_fn): Update loop info.
+	* passes.c (init_optimization_passes): Add pass_parallelize_loops.
+	* tree-ssa-operands.c (free_stmt_operands): New function.
+
+	* doc/passes.texi: Document autoparallelization.
+	* doc/invoke.texi (-ftree-parallelize-loops): New option.
+
+2007-09-15  John David Anglin  <dave.anglin@nrc-cnrc.gc.ca>
+
+	PR target/33062
+	* pa.c (function_value): Use GET_MODE_BITSIZE instead of TYPE_PRECISION.
+
+2007-09-15  Dorit Nuzman  <dorit@il.ibm.com>
+
+	* tree-vect-transform.c (vect_get_vec_defs_for_stmt_copy): check if 
+	the VEC is not NULL.
+	(vectorizable_type_demotion, vectorizable_type_promotion): Check that 
+	get_vectype_for_scalar_type succeeded.
+	(vectorizable_conversion): Likewise.
+
 2007-09-14  Jan Hubicka  <jh@suse.cz>
 
 	* config/i386/i386.md (*floatdi<mode>2_i387): Guard against


The results can be reproduced by building a compiler with

--enable-gather-detailed-mem-stats targeting x86-64

and compiling preprocessed combine.c or the testcase from PR8361 with:

-fmem-report --param=ggc-min-heapsize=1024 --param=ggc-min-expand=1 -Ox -Q

The memory consumption summary appears in the dump after the detailed listing
of the places where the memory is allocated.  Peak memory consumption is
actually computed by looking for the maximal value in the
{GC XXXX -> YYYY} reports.
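
The peak computation described above could be sketched roughly as follows.
This is only an illustration, not the testing script itself: it assumes the
compiler's -Q output contains entries of the form "{GC 1166k -> 1062k}", and
the function name peak_ggc and the sample text are made up for the example.

```python
import re

# Assumed shape of one collection report: "{GC <before>k -> <after>k}".
GC_REPORT = re.compile(r"\{GC (\d+)k -> (\d+)k\}")


def peak_ggc(log_text):
    """Return (peak before GGC, peak after GGC) in kilobytes.

    Both values are the maxima over all GC reports found in log_text;
    (0, 0) is returned when no report is present.
    """
    peak_before = 0
    peak_after = 0
    for match in GC_REPORT.finditer(log_text):
        peak_before = max(peak_before, int(match.group(1)))
        peak_after = max(peak_after, int(match.group(2)))
    return peak_before, peak_after


# Made-up sample standing in for captured compiler output.
sample = "foo {GC 1100k -> 1054k} bar {GC 1166k -> 1062k} baz {GC 900k -> 880k}"
print(peak_ggc(sample))
```

In practice the input would be the captured stderr of a compilation run with
the options listed above.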

Your testing script.

