This is the mail archive of the gcc-regression@gcc.gnu.org mailing list for the GCC project.



A recent patch increased GCC's memory consumption!


Hi,

I am a friendly script that monitors memory consumption in GCC.  Please
contact jh@suse.cz if something goes wrong.

Comparing memory consumption on compilation of combine.i, insn-attrtab.i,
and generate-3.4.ii I got:


comparing empty function compilation at -O0 level:
    Overall memory needed: 7382k -> 7383k
    Peak memory use before GGC: 2265k
    Peak memory use after GGC: 1955k
    Maximum of released memory in single GGC run: 310k
    Garbage: 445k
    Leak: 2289k
    Overhead: 456k
    GGC runs: 3

comparing empty function compilation at -O0 -g level:
    Overall memory needed: 7398k -> 7399k
    Peak memory use before GGC: 2293k
    Peak memory use after GGC: 1982k
    Maximum of released memory in single GGC run: 311k
    Garbage: 448k
    Leak: 2321k
    Overhead: 461k
    GGC runs: 3

comparing empty function compilation at -O1 level:
    Overall memory needed: 7498k -> 7495k
    Peak memory use before GGC: 2265k
    Peak memory use after GGC: 1955k
    Maximum of released memory in single GGC run: 310k
    Garbage: 451k
    Leak: 2291k
    Overhead: 457k
    GGC runs: 4

comparing empty function compilation at -O2 level:
    Overall memory needed: 7510k -> 7507k
    Peak memory use before GGC: 2266k
    Peak memory use after GGC: 1955k
    Maximum of released memory in single GGC run: 311k
    Garbage: 454k
    Leak: 2291k
    Overhead: 457k
    GGC runs: 4

comparing empty function compilation at -O3 level:
    Overall memory needed: 7510k -> 7507k
    Peak memory use before GGC: 2266k
    Peak memory use after GGC: 1955k
    Maximum of released memory in single GGC run: 311k
    Garbage: 454k
    Leak: 2291k
    Overhead: 457k
    GGC runs: 4

comparing combine.c compilation at -O0 level:
    Overall memory needed: 17730k -> 17731k
    Peak memory use before GGC: 9291k
    Peak memory use after GGC: 8871k
    Maximum of released memory in single GGC run: 2603k
    Garbage: 37162k
    Leak: 6539k
    Overhead: 5024k
    GGC runs: 280

comparing combine.c compilation at -O0 -g level:
    Overall memory needed: 19822k -> 19823k
    Peak memory use before GGC: 10897k
    Peak memory use after GGC: 10531k
    Maximum of released memory in single GGC run: 2375k
    Garbage: 37738k
    Leak: 9415k
    Overhead: 5727k
    GGC runs: 270

comparing combine.c compilation at -O1 level:
    Overall memory needed: 35158k -> 35155k
    Peak memory use before GGC: 19412k
    Peak memory use after GGC: 19206k
    Maximum of released memory in single GGC run: 2197k
    Garbage: 57381k -> 57374k
    Leak: 6562k
    Overhead: 6356k -> 6354k
    GGC runs: 349

comparing combine.c compilation at -O2 level:
  Amount of memory still referenced at the end of compilation increased from 6673k to 6681k, overall 0.12%
    Overall memory needed: 37546k -> 37567k
    Peak memory use before GGC: 19448k -> 19446k
    Peak memory use after GGC: 19245k
    Maximum of released memory in single GGC run: 2187k
    Garbage: 69083k -> 68818k
    Leak: 6673k -> 6681k
    Overhead: 7970k -> 7955k
    GGC runs: 407 -> 406

comparing combine.c compilation at -O3 level:
  Peak amount of GGC memory allocated before garbage collecting increased from 20533k to 20560k, overall 0.13%
    Overall memory needed: 45630k -> 45623k
    Peak memory use before GGC: 20533k -> 20560k
    Peak memory use after GGC: 19744k
    Maximum of released memory in single GGC run: 3125k
    Garbage: 102970k -> 102695k
    Leak: 6817k
    Overhead: 12400k -> 12395k
    GGC runs: 458 -> 456
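
The percentages in the two flagged lines above (0.12% for the -O2 leak,
0.13% for the -O3 peak before GGC) are consistent with a plain relative
increase over the old value.  A minimal sketch of that arithmetic, assuming
the script rounds to two decimals (the function name is hypothetical, not
taken from the tester itself):

```python
def percent_increase(old_kb: int, new_kb: int) -> float:
    """Relative growth from old_kb to new_kb, expressed in percent."""
    return (new_kb - old_kb) / old_kb * 100.0


# Leak at -O2 (combine.c): 6673k -> 6681k
print(f"{percent_increase(6673, 6681):.2f}%")    # 0.12%

# Peak memory use before GGC at -O3 (combine.c): 20533k -> 20560k
print(f"{percent_increase(20533, 20560):.2f}%")  # 0.13%
```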

comparing insn-attrtab.c compilation at -O0 level:
    Overall memory needed: 103546k -> 103547k
    Peak memory use before GGC: 69329k
    Peak memory use after GGC: 44976k
    Maximum of released memory in single GGC run: 36886k
    Garbage: 130571k
    Leak: 9588k
    Overhead: 16932k
    GGC runs: 206

comparing insn-attrtab.c compilation at -O0 -g level:
    Overall memory needed: 105066k -> 105067k
    Peak memory use before GGC: 70490k
    Peak memory use after GGC: 46244k
    Maximum of released memory in single GGC run: 36886k
    Garbage: 131730k
    Leak: 11278k
    Overhead: 17326k
    GGC runs: 206

comparing insn-attrtab.c compilation at -O1 level:
    Overall memory needed: 148174k -> 146167k
    Peak memory use before GGC: 86337k
    Peak memory use after GGC: 80543k
    Maximum of released memory in single GGC run: 33045k
    Garbage: 264606k -> 264605k
    Leak: 9404k
    Overhead: 27590k -> 27590k
    GGC runs: 225

comparing insn-attrtab.c compilation at -O2 level:
    Overall memory needed: 193002k -> 191635k
    Peak memory use before GGC: 87648k
    Peak memory use after GGC: 80609k
    Maximum of released memory in single GGC run: 31388k
    Garbage: 299523k -> 299515k
    Leak: 9401k
    Overhead: 33192k -> 33192k
    GGC runs: 245

comparing insn-attrtab.c compilation at -O3 level:
    Overall memory needed: 191658k -> 191655k
    Peak memory use before GGC: 87665k
    Peak memory use after GGC: 80626k
    Maximum of released memory in single GGC run: 31450k
    Garbage: 300159k -> 300153k
    Leak: 9407k
    Overhead: 33388k -> 33388k
    GGC runs: 245

comparing Gerald's testcase PR8361 compilation at -O0 level:
    Overall memory needed: 151175k -> 151184k
    Peak memory use before GGC: 92317k
    Peak memory use after GGC: 91394k
    Maximum of released memory in single GGC run: 18793k
    Garbage: 210317k
    Leak: 49389k
    Overhead: 23720k
    GGC runs: 411

comparing Gerald's testcase PR8361 compilation at -O0 -g level:
    Overall memory needed: 169367k -> 169352k
    Peak memory use before GGC: 104950k
    Peak memory use after GGC: 103910k
    Maximum of released memory in single GGC run: 18979k
    Garbage: 216936k
    Leak: 72814k
    Overhead: 29643k
    GGC runs: 382

comparing Gerald's testcase PR8361 compilation at -O1 level:
    Overall memory needed: 145071k -> 145031k
    Peak memory use before GGC: 103061k -> 103088k
    Peak memory use after GGC: 102031k -> 102050k
    Maximum of released memory in single GGC run: 17981k
    Garbage: 392060k -> 392073k
    Leak: 50219k -> 50219k
    Overhead: 34379k -> 34383k
    GGC runs: 541

comparing Gerald's testcase PR8361 compilation at -O2 level:
    Overall memory needed: 146103k -> 146127k
    Peak memory use before GGC: 103490k -> 103514k
    Peak memory use after GGC: 102465k -> 102485k
    Maximum of released memory in single GGC run: 17979k
    Garbage: 431339k -> 431219k
    Leak: 50909k
    Overhead: 39623k -> 39623k
    GGC runs: 592

comparing Gerald's testcase PR8361 compilation at -O3 level:
    Overall memory needed: 148967k -> 148999k
    Peak memory use before GGC: 104751k -> 104781k
    Peak memory use after GGC: 103682k -> 103711k
    Maximum of released memory in single GGC run: 18300k
    Garbage: 451263k -> 450979k
    Leak: 50997k -> 51004k
    Overhead: 41457k -> 41437k
    GGC runs: 607

comparing PR rtl-optimization/28071 testcase compilation at -O0 level:
    Overall memory needed: 246993k -> 246990k
    Peak memory use before GGC: 82632k
    Peak memory use after GGC: 59514k
    Maximum of released memory in single GGC run: 45585k
    Garbage: 147214k
    Leak: 8082k
    Overhead: 24807k
    GGC runs: 80

comparing PR rtl-optimization/28071 testcase compilation at -O0 -g level:
    Overall memory needed: 247849k -> 247850k
    Peak memory use before GGC: 83278k
    Peak memory use after GGC: 60160k
    Maximum of released memory in single GGC run: 45230k
    Garbage: 147433k
    Leak: 9338k
    Overhead: 25303k
    GGC runs: 88

comparing PR rtl-optimization/28071 testcase compilation at -O1 level:
    Overall memory needed: 258734k -> 258683k
    Peak memory use before GGC: 104566k -> 104513k
    Peak memory use after GGC: 101352k -> 101299k
    Maximum of released memory in single GGC run: 51848k -> 51827k
    Garbage: 240888k -> 240867k
    Leak: 25176k
    Overhead: 29476k -> 29476k
    GGC runs: 79

comparing PR rtl-optimization/28071 testcase compilation at -O2 level:
    Overall memory needed: 531882k -> 531999k
    Peak memory use before GGC: 104556k -> 104503k
    Peak memory use after GGC: 101342k -> 101289k
    Maximum of released memory in single GGC run: 37192k -> 37171k
    Garbage: 272729k -> 272708k
    Leak: 25605k
    Overhead: 35516k -> 35515k
    GGC runs: 91

comparing PR rtl-optimization/28071 testcase compilation at -O3 -fno-tree-pre -fno-tree-fre level:
    Overall memory needed: 1182430k -> 1181215k
    Peak memory use before GGC: 200610k
    Peak memory use after GGC: 188951k
    Maximum of released memory in single GGC run: 80735k
    Garbage: 371841k -> 371820k
    Leak: 45260k -> 45260k
    Overhead: 48290k -> 48290k
    GGC runs: 70
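
The metric lines above have a regular shape: a name, a colon, and either one
value (unchanged) or "old -> new".  The following is a rough sketch of how
they could be filtered down to the metrics that actually moved; the regular
expression and helper names are assumptions for illustration, not part of
the testing script:

```python
import re

# "Name: 123k" or "Name: 123k -> 456k"; the "k" suffix is optional
# because "GGC runs" is a plain count.
METRIC_RE = re.compile(
    r"^\s*(?P<name>[^:]+):\s*(?P<old>\d+)k?(?:\s*->\s*(?P<new>\d+)k?)?\s*$"
)


def parse_metric(line):
    """Return (name, old, new) for a metric line, or None for other lines."""
    m = METRIC_RE.match(line)
    if m is None:
        return None
    old = int(m.group("old"))
    # A line without "->" means the value did not change.
    new = int(m.group("new")) if m.group("new") else old
    return m.group("name"), old, new


def changed_metrics(lines):
    """List (name, delta) for every metric whose value changed."""
    result = []
    for line in lines:
        parsed = parse_metric(line)
        if parsed is not None and parsed[1] != parsed[2]:
            name, old, new = parsed
            result.append((name, new - old))
    return result


sample = [
    "comparing combine.c compilation at -O2 level:",
    "    Overall memory needed: 37546k -> 37567k",
    "    Peak memory use after GGC: 19245k",
    "    Garbage: 69083k -> 68818k",
    "    Leak: 6673k -> 6681k",
    "    GGC runs: 407 -> 406",
]
for name, delta in changed_metrics(sample):
    print(f"{name}: {delta:+d}")
```

Heading lines such as "comparing ... level:" carry no number after the
colon, so the sketch skips them.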

Head of the ChangeLog is:

--- /usr/src/SpecTests/sandbox-britten-memory/x86_64/mem-result/ChangeLog	2007-02-07 22:02:52.000000000 +0000
+++ /usr/src/SpecTests/sandbox-britten-memory/gcc/gcc/ChangeLog	2007-02-09 04:41:43.000000000 +0000
@@ -1,3 +1,94 @@
+2007-02-09  Joseph Myers  <joseph@codesourcery.com>
+
+	* calls.c (store_one_arg): Pass correct alignment to
+	emit_push_insn for non-BLKmode values.
+	* expr.c (emit_push_insn): If STRICT_ALIGNMENT, copy to an
+	unaligned stack slot via a suitably aligned slot.
+
+2007-02-08  DJ Delorie  <dj@redhat.com>
+
+	* config/m32c/m32c.c (m32c_unpend_compare): Add default to silence
+	warnings.
+	(legal_subregs): Use unsigned char, make const.
+	(m32c_illegal_subreg_p): Use ARRAY_SIZE.  Delete unused variables.
+
+2007-02-08  Paul Brook  <paul@codesourcery.com>
+
+	* config/arm/lib1funcs.asm (RETLDM): Pop directly into PC when no
+	special interworking needed.
+
+2007-02-08  Harsha Jagasia  <harsha.jagasia@amd.com>
+
+	* config/i386/xmmintrin.h: Make inclusion of emmintrin.h
+	conditional to __SSE2__.
+	(Entries below should have been added to first ChangeLog
+	entry for amdfam10 dated 2007-02-05)
+	* config/i386/emmintrin.h: Generate #error if __SSE2__ is not
+	defined.
+	* config/i386/pmmintrin.h: Generate #error if __SSE3__ is not
+	defined.
+	* config/i386/tmmintrin.h: Generate #error if __SSSE3__ is not
+	defined.
+
+2007-02-08  DJ Delorie  <dj@redhat.com>
+
+	* config/m32c/m32c-protos.h (m32c_illegal_subreg_p): New.
+	* config/m32c/m32c.c (legal_subregs): New.
+	(m32c_illegal_subreg_p): New.
+	* config/m32c/predicates.md (m32c_any_operand): Use it to reject
+	unsupported subregs of hard regs.
+
+2007-02-08  Jan Hubicka  <jh@suse.cz>
+
+	* tree-cfg.c (bsi_replace): Shortcut when replacing the statement with
+	the same one; always update histograms.
+
+2007-02-08  Diego Novillo  <dnovillo@redhat.com>
+
+	* passes.c (init_optimization_passes): Tidy comment.
+
+2007-02-08  Roger Sayle  <roger@eyesopen.com>
+
+	* simplify-rtx.c (simplify_unary_operation_1) <POPCOUNT>: We can
+	strip zero_extend, bswap and rotates from POCOUNT's argument.
+	<PARITY>: Likewise, we can strip not, bswap, sign_extend,
+	zero_extend and rotates from PARITY's argument.
+	<BSWAP>: A byte-swap followed by a byte-swap is an identity.
+	(simplify_const_unary_operation) <BSWAP>: Evaluate the byte-swap
+	of an integer constant at compile-time.
+
+2007-02-08  Diego Novillo  <dnovillo@redhat.com>
+
+	PR 30562
+	* tree-flow.h (struct var_ann_d): Remove field 'is_used'.
+	Update all users.
+	* tree-ssa-alias.c (compute_is_aliased): Remove.  Update all
+	users.
+	(init_alias_info):
+	* tree-ssa-live.c (remove_unused_locals): Do not remove
+	TREE_ADDRESSABLE variables.
+	* tree-ssa-structalias.c (compute_points_to_sets): Tidy.
+	* tree-ssa-operands.c (add_virtual_operand): Remove argument
+	FOR_CLOBBER.  Update all users.
+	If VAR has an associated alias set, add a virtual operand for
+	it if no alias is found to conflict with the memory reference.
+
+2007-02-07  Jan Hubicka  <jh@suse.cz>
+	    Robert Kidd <rkidd@crhc.uiuc.edu>
+
+	* value-prof.c (visit_hist, free_hist): Return 1 instead of 0.
+
+2007-02-07  Ian Lance Taylor  <iant@google.com>
+
+	* lower-subreg.c (simple_move): Reject PARTIAL_INT modes.
+
+2007-02-07  Roger Sayle  <roger@eyesopen.com>
+
+	* config/rs6000/rs6000.md (ctz<mode>2, ffs<mode>2, popcount<mode>2,
+	parity<mode>2, smulsi3_highpart, abstf2_internal, allocate_stack,
+	tablejumpdi, movsi_to_cr_one): Remove constraints from
+	define_expand's match_operands.
+
 2007-02-07  Roger Sayle  <roger@eyesopen.com>
 
 	* global.c (compute_regsets): Move declatation of "i" inside of


The results can be reproduced by building a compiler with

--enable-gather-detailed-mem-stats targeting x86-64

and compiling preprocessed combine.c or the testcase from PR8361 with:

-fmem-report --param=ggc-min-heapsize=1024 --param=ggc-min-expand=1 -Ox -Q

The memory consumption summary appears in the dump after the detailed
listing of the places where the memory is allocated.  Peak memory
consumption is computed by looking for the maximal value in the
{GC XXXX -> YYYY} reports.
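
As a rough illustration of that last step, the sketch below scans compiler
output for {GC XXXX -> YYYY} markers and takes the maxima.  It assumes each
marker has the form "{GC 1234k -> 567k}", that XXXX is the size before a
collection and YYYY the size after it, and that one marker is printed per
GGC run; the function name and the sample text are made up for the example.

```python
import re

# Assumed marker shape: "{GC 1234k -> 567k}"
GC_RE = re.compile(r"\{GC (\d+)k -> (\d+)k\}")


def summarize_gc(log_text):
    """Summarize all GC markers in log_text, or return None if there are none."""
    runs = [(int(before), int(after)) for before, after in GC_RE.findall(log_text)]
    if not runs:
        return None
    return {
        "peak_before": max(before for before, _ in runs),
        "peak_after": max(after for _, after in runs),
        "max_released": max(before - after for before, after in runs),
        "runs": len(runs),
    }


# Illustrative text only, not real compiler output.
sample_log = "main {GC 2265k -> 1955k} foo {GC 2100k -> 1900k}"
print(summarize_gc(sample_log))
```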

Your testing script.

