A recent patch increased GCC's memory consumption!

gcctest@suse.de gcctest@suse.de
Thu Feb 22 03:42:00 GMT 2007


Hi,

I am a friendly script caring about memory consumption in GCC.  Please
contact jh@suse.cz if something is going wrong.

Comparing memory consumption on compilation of combine.i, insn-attrtab.i,
and generate-3.4.ii I got:


comparing empty function compilation at -O0 level:
    Overall memory needed: 7383k
    Peak memory use before GGC: 2269k
    Peak memory use after GGC: 1958k
    Maximum of released memory in single GGC run: 311k
    Garbage: 446k
    Leak: 2292k
    Overhead: 456k
    GGC runs: 3

comparing empty function compilation at -O0 -g level:
    Overall memory needed: 7399k
    Peak memory use before GGC: 2297k
    Peak memory use after GGC: 1986k
    Maximum of released memory in single GGC run: 311k
    Garbage: 449k
    Leak: 2325k
    Overhead: 461k
    GGC runs: 3

comparing empty function compilation at -O1 level:
    Overall memory needed: 7511k
    Peak memory use before GGC: 2269k
    Peak memory use after GGC: 1958k
    Maximum of released memory in single GGC run: 311k
    Garbage: 452k
    Leak: 2295k
    Overhead: 457k
    GGC runs: 4

comparing empty function compilation at -O2 level:
    Overall memory needed: 7519k
    Peak memory use before GGC: 2270k
    Peak memory use after GGC: 1959k
    Maximum of released memory in single GGC run: 311k
    Garbage: 455k
    Leak: 2295k
    Overhead: 457k
    GGC runs: 4

comparing empty function compilation at -O3 level:
    Overall memory needed: 7519k
    Peak memory use before GGC: 2270k
    Peak memory use after GGC: 1959k
    Maximum of released memory in single GGC run: 311k
    Garbage: 455k
    Leak: 2295k
    Overhead: 457k
    GGC runs: 4

comparing combine.c compilation at -O0 level:
    Overall memory needed: 17879k
    Peak memory use before GGC: 9277k
    Peak memory use after GGC: 8859k
    Maximum of released memory in single GGC run: 2584k
    Garbage: 37095k -> 37118k
    Leak: 6587k -> 6579k
    Overhead: 5036k -> 5037k
    GGC runs: 279

comparing combine.c compilation at -O0 -g level:
    Overall memory needed: 19895k
    Peak memory use before GGC: 10885k -> 10887k
    Peak memory use after GGC: 10519k
    Maximum of released memory in single GGC run: 2355k
    Garbage: 37703k -> 37710k
    Leak: 9464k
    Overhead: 5739k -> 5741k
    GGC runs: 269

comparing combine.c compilation at -O1 level:
    Overall memory needed: 35311k -> 35299k
    Peak memory use before GGC: 19350k
    Peak memory use after GGC: 19135k
    Maximum of released memory in single GGC run: 2169k
    Garbage: 57393k -> 57399k
    Leak: 6616k
    Overhead: 6370k -> 6371k
    GGC runs: 352

comparing combine.c compilation at -O2 level:
    Overall memory needed: 37583k -> 37591k
    Peak memory use before GGC: 19387k
    Peak memory use after GGC: 19185k
    Maximum of released memory in single GGC run: 2157k
    Garbage: 68659k -> 68663k
    Leak: 6725k
    Overhead: 7995k -> 7996k
    GGC runs: 403

comparing combine.c compilation at -O3 level:
    Overall memory needed: 47015k
    Peak memory use before GGC: 20424k
    Peak memory use after GGC: 19542k
    Maximum of released memory in single GGC run: 3135k
    Garbage: 101003k -> 101007k
    Leak: 6865k
    Overhead: 12221k -> 12222k
    GGC runs: 453

comparing insn-attrtab.c compilation at -O0 level:
    Overall memory needed: 103215k
    Peak memory use before GGC: 68882k
    Peak memory use after GGC: 44737k
    Maximum of released memory in single GGC run: 36678k
    Garbage: 130211k -> 130211k
    Leak: 9350k
    Overhead: 16801k -> 16801k
    GGC runs: 206

comparing insn-attrtab.c compilation at -O0 -g level:
    Overall memory needed: 104615k
    Peak memory use before GGC: 70043k
    Peak memory use after GGC: 46005k
    Maximum of released memory in single GGC run: 36678k
    Garbage: 131109k -> 131109k
    Leak: 11279k
    Overhead: 17195k -> 17195k
    GGC runs: 206

comparing insn-attrtab.c compilation at -O1 level:
    Overall memory needed: 147615k
    Peak memory use before GGC: 85871k
    Peak memory use after GGC: 80078k
    Maximum of released memory in single GGC run: 32817k
    Garbage: 264119k -> 264120k
    Leak: 9410k
    Overhead: 27484k -> 27484k
    GGC runs: 225

comparing insn-attrtab.c compilation at -O2 level:
    Overall memory needed: 193227k
    Peak memory use before GGC: 87187k
    Peak memory use after GGC: 80148k
    Maximum of released memory in single GGC run: 31177k
    Garbage: 298877k -> 298878k
    Leak: 9407k
    Overhead: 33061k -> 33061k
    GGC runs: 245

comparing insn-attrtab.c compilation at -O3 level:
    Overall memory needed: 193187k
    Peak memory use before GGC: 87200k
    Peak memory use after GGC: 80161k
    Maximum of released memory in single GGC run: 31240k
    Garbage: 299536k -> 299536k
    Leak: 9412k
    Overhead: 33260k -> 33260k
    GGC runs: 245

comparing Gerald's testcase PR8361 compilation at -O0 level:
  Amount of produced GGC garbage increased from 208973k to 209329k, overall 0.17%
    Overall memory needed: 148351k
    Peak memory use before GGC: 90646k
    Peak memory use after GGC: 89748k
    Maximum of released memory in single GGC run: 18005k
    Garbage: 208973k -> 209329k
    Leak: 49218k
    Overhead: 23460k -> 23532k
    GGC runs: 413

comparing Gerald's testcase PR8361 compilation at -O0 -g level:
  Amount of produced GGC garbage increased from 215570k to 215935k, overall 0.17%
    Overall memory needed: 166687k
    Peak memory use before GGC: 103574k
    Peak memory use after GGC: 102548k
    Maximum of released memory in single GGC run: 18487k
    Garbage: 215570k -> 215935k
    Leak: 72643k
    Overhead: 29381k -> 29453k
    GGC runs: 385
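
The "overall 0.17%" figures in the two warnings above can be rechecked
by hand.  The following is only a sketch of the arithmetic, not the
tester's actual code; the function name percent_increase is an
assumption made for illustration.

```python
def percent_increase(old_k, new_k):
    """Relative growth from old_k to new_k, in percent."""
    return (new_k - old_k) / old_k * 100.0


# Garbage figures (in kB) copied from the two PR8361 warnings above.
for old_k, new_k in [(208973, 209329), (215570, 215935)]:
    print(f"{old_k}k -> {new_k}k: {percent_increase(old_k, new_k):.2f}%")
```

Both pairs round to 0.17%, matching the warnings.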

comparing Gerald's testcase PR8361 compilation at -O1 level:
    Overall memory needed: 142187k
    Peak memory use before GGC: 101989k
    Peak memory use after GGC: 100966k
    Maximum of released memory in single GGC run: 17236k
    Garbage: 345330k -> 345386k
    Leak: 49931k
    Overhead: 30011k -> 30022k
    GGC runs: 528

comparing Gerald's testcase PR8361 compilation at -O2 level:
    Overall memory needed: 144315k
    Peak memory use before GGC: 102664k
    Peak memory use after GGC: 101611k
    Maximum of released memory in single GGC run: 17234k
    Garbage: 374162k -> 374190k
    Leak: 50534k
    Overhead: 33937k -> 33943k
    GGC runs: 562

comparing Gerald's testcase PR8361 compilation at -O3 level:
    Overall memory needed: 146915k
    Peak memory use before GGC: 104418k
    Peak memory use after GGC: 103384k
    Maximum of released memory in single GGC run: 17650k
    Garbage: 391422k -> 391435k
    Leak: 51195k
    Overhead: 35271k -> 35274k
    GGC runs: 573

comparing PR rtl-optimization/28071 testcase compilation at -O0 level:
    Overall memory needed: 244755k
    Peak memory use before GGC: 80969k
    Peak memory use after GGC: 58708k
    Maximum of released memory in single GGC run: 44133k
    Garbage: 145362k
    Leak: 7619k
    Overhead: 24814k
    GGC runs: 79

comparing PR rtl-optimization/28071 testcase compilation at -O0 -g level:
    Overall memory needed: 245583k
    Peak memory use before GGC: 81615k
    Peak memory use after GGC: 59354k
    Maximum of released memory in single GGC run: 44123k
    Garbage: 145629k
    Leak: 9387k
    Overhead: 25310k
    GGC runs: 89

comparing PR rtl-optimization/28071 testcase compilation at -O1 level:
    Overall memory needed: 245975k
    Peak memory use before GGC: 85141k
    Peak memory use after GGC: 74853k
    Maximum of released memory in single GGC run: 36136k
    Garbage: 223655k
    Leak: 20863k
    Overhead: 30547k
    GGC runs: 81

comparing PR rtl-optimization/28071 testcase compilation at -O2 level:
    Overall memory needed: 499287k
    Peak memory use before GGC: 79840k
    Peak memory use after GGC: 74854k
    Maximum of released memory in single GGC run: 33434k
    Garbage: 230701k
    Leak: 20953k
    Overhead: 32630k
    GGC runs: 91

comparing PR rtl-optimization/28071 testcase compilation at -O3 -fno-tree-pre -fno-tree-fre level:
    Overall memory needed: 1189119k
    Peak memory use before GGC: 201756k
    Peak memory use after GGC: 190218k
    Maximum of released memory in single GGC run: 80703k
    Garbage: 376734k
    Leak: 46318k
    Overhead: 49361k
    GGC runs: 70

Head of the ChangeLog is:

--- /usr/src/SpecTests/sandbox-britten-memory/x86_64/mem-result/ChangeLog	2007-02-21 09:24:27.000000000 +0000
+++ /usr/src/SpecTests/sandbox-britten-memory/gcc/gcc/ChangeLog	2007-02-22 01:38:33.000000000 +0000
@@ -1,3 +1,125 @@
+2007-02-20  Mike Stump  <mrs@apple.com>
+
+	* configure.ac (powerpc*-*-darwin*): #include <sys/cdefs.h>.
+	* configure: Regenerate.
+
+2007-02-21  Trevor Smigiel  <trevor_smigiel@playstation.sony.com>
+
+	Change the defaults of some parameters and options.
+	* config/spu/spu-protos.h (spu_optimization_options): Declare.
+	* config/spu/spu.c (spu_optimization_options): Add.
+	(spu_override_options): Change params in spu_optimization_options.
+	* config/spu/spu.h (OPTIMIZATION_OPTIONS): Define.
+
+	Register 127 is only 16 byte aligned when used as a frame pointer.
+	* config/spu/spu-protos.h (spu_init_expanders): Declare.
+	* config/spu/spu.c (spu_expand_prologue): Set REGNO_POINTER_ALIGN for
+	HARD_FRAME_POINTER_REGNUM.
+	(spu_legitimate_address):  Use regno_aligned_for_reload.
+	(regno_aligned_for_load):  HARD_FRAME_POINTER_REGNUM is only 16 byte
+	aligned when frame_pointer_needed is true.
+	(spu_init_expanders): New.  Set alignment of HARD_FRAME_POINTER_REGNUM
+	to 8 bits.
+	* config/spu/spu.h (INIT_EXPANDERS): Define.
+
+	Make sure shift and rotate instructions have valid immediate operands.
+	* config/spu/predicates.md (spu_shift_operand): Remove.
+	* config/spu/spu.c (print_operand): Add [efghEFGH] modifiers.
+	* config/spu/constraints.md (W, O): Extend range.
+	* config/spu/spu.md (umask, nmask): Define.
+	(ashl<mode>3, ashldi3, ashlti3_imm, shlqbybi_ti, shlqbi_ti, shlqby_ti,
+	lshr<mode>3, rotm_<mode>, lshr<mode>3_imm, rotqmbybi_<mode>,
+	rotqmbi_<mode>, rotqmby_<mode>, ashr<mode>3, rotma_<mode>,
+	rotl<mode>3, rotlti3, rotqbybi_ti, rotqby_ti, rotqbi_ti): Use
+	spu_nonmem_operand instead of spu_shift_operands.  Use new modifiers.
+	(lshr<mode>3_reg):  Fix rtl description.
+
+	Make sure mulhisi immediate operands are valid.
+	* config/spu/predicates.md (imm_K_operand): Add.
+	* config/spu/spu.md (mulhisi3_imm, umulhisi3_imm): Use imm_K_operand.
+
+	Generate constants using fsmbi and andi.
+	* config/spu/spu.c (enum immediate_class): Add IC_FSMBI2.
+	(print_operand, spu_split_immediate, classify_immediate,
+	fsmbi_const_p): Handle IC_FSMBI2.
+
+	Correctly handle a CONST_VECTOR containing symbols.
+	* config/spu/spu.c (print_operand): Handle HIGH correctly.
+	(spu_split_immediate): Split CONST_VECTORs with -mlarge-mem.
+	(immediate_load_p): Allow symbols that use 2 instructions to create.
+	(classify_immediate, spu_builtin_splats):  Don't accept a CONST_VECTOR
+	with symbols when flag_pic is set.
+	(const_vector_immediate_p): New.
+	(logical_immediate_p, iohl_immediate_p, arith_immediate_p): Don't
+	accept a CONST_VECTOR with symbols.
+	(spu_legitimate_constant_p): Use const_vector_immediate_p.  Don't
+	accept a CONST_VECTOR with symbols when flag_pic is set.  Handle HIGH
+	correctly.
+	* config/spu/spu.md (high, low): Delete.
+	(low_<mode>): Define.
+
+	Remove INTRmode and INTR_REGNUM, which didn't work.
+	* config/spu/spu.c (spu_conditional_register_usage): Remove reference
+	of INTR_REGNUM.
+	* config/spu/spu-builtins.md (spu_idisable, spu_ienable, set_intr,
+	set_intr_pic, set_intr_cc, set_intr_cc_pic, set_intr_return, unnamed
+	peephole2 pattern): Don't use INTR or 131.
+	(movintrcc): Delete.
+	* config/spu/spu.h (FIRST_PSEUDO_REGISTER, FIXED_REGISTERS,
+	CALL_USED_REGISTERS, REGISTER_NAMES, INTR_REGNUM): Remove INTR_REGNUM.
+	* config/spu/spu.md (UNSPEC_IDISABLE, UNSPEC_IENABLE): Remove.
+	(UNSPEC_SET_INTR): Add.
+	* config/spu/spu-modes.def (INTR): Remove.
+
+	More accurate warnings about run-time relocations.
+	* config/spu/spu.c (reloc_diagnostic): Test in_section.
+
+	Correctly warn about immediate arguments to specific intrinsics.
+	* config/spu/spu.c (spu_check_builtin_parm): Handle CONST_VECTORs.
+	(spu_expand_builtin_1): Call spu_check_builtin_parm before checking
+	the instruction predicate.
+
+	Fix tree check errors with latest update.
+	* config/spu/spu.c (expand_builtin_args, spu_expand_builtin_1): Use
+	CALL_EXPR_ARG.
+	(spu_expand_builtin): Use CALL_EXPR_FN.
+
+	Add missing specific intrinsics.
+	* config/spu/spu-builtins.def: Add si_bisled, si_bisledd and
+	si_bislede.
+	* config/spu/spu_internals.h: Ditto.
+
+	Fix incorrect operand modifiers.
+	* config/spu/spu-builtins.md (spu_mpy, spu_mpyu):  Remove use of %H.
+	* config/spu/spu.md (xor<mode>3):  Change %S to %J.
+
+	Optimize one case of zero_extend of a vec_select.
+	* config/spu/spu.md (_vec_extractv8hi_ze):  Add.
+
+	Accept any immediate for hbr.
+	* config/spu/spu.md (hbr):  Change s constraints to i.
+
+2007-02-21  Paul Brook  <paul@codesourcery.com>
+
+	* config/arm/arm.c (thumb2_final_prescan_insn): Don't incrememnt
+	condexec_count when skipping USE and CLOBBER.
+
+2007-02-21  Nick Clifton  <nickc@redhat.com>
+
+	* common.opt (Warray-bounds): Add Warning attribute.
+	(Wstrict-overflow, Wstrict-overflow=, Wcoverage-mismatch):
+	Likewise.
+	(fsized-zeroes): Add Optimization attribute.
+	(fsplit-wide-types, ftree-scev-cprop): Likewise.
+	* c.opt (Wc++0x-compat): Add Warning attribute.
+
+2007-02-21  Ulrich Weigand  <uweigand@de.ibm.com>
+
+	PR middle-end/30761
+	* reload1.c (eliminate_regs_in_insn): In the single_set special
+	case, attempt to re-recognize the insn before falling back to
+	having reload fix it up.
+
 2007-02-20  Eric Christopher  <echristo@gmail.com>
 
 	* config/frv/frv.c (frv_read_argument): Take a tree and int argument.
@@ -88,7 +210,7 @@
 	intrinsics.
 
 2007-02-20  Manuel Lopez-Ibanez  <manu@gcc.gnu.org>
-            DJ Delorie <dj@redhat.com>
+	    DJ Delorie <dj@redhat.com>
 
 	PR other/30824
 	* diagnostic.c (diagnostic_count_diagnostic): Move -Werror logic to...


The results can be reproduced by building a compiler with

--enable-gather-detailed-mem-stats targeting x86-64

and compiling preprocessed combine.c or the testcase from PR8361 with:

-fmem-report --param=ggc-min-heapsize=1024 --param=ggc-min-expand=1 -Ox -Q

The memory consumption summary appears in the dump after the detailed
listing of the places where memory is allocated.  Peak memory consumption
is computed by looking for the maximal value in the {GC XXXX -> YYYY}
reports.
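
As a rough illustration of that last step, the peaks could be pulled out
of a captured compiler log along these lines.  This is a sketch only:
it assumes each report looks like "{GC 2269k -> 1958k}" with sizes in
kB, and the name peak_memory and the sample log text are made up for
the example.

```python
import re

# Assumed report shape: "{GC <before>k -> <after>k}".
GC_REPORT = re.compile(r"\{GC (\d+)k -> (\d+)k\}")


def peak_memory(log_text):
    """Return (peak before GGC, peak after GGC) in kB, or None if no reports."""
    pairs = [(int(before), int(after))
             for before, after in GC_REPORT.findall(log_text)]
    if not pairs:
        return None
    peak_before = max(before for before, _ in pairs)
    peak_after = max(after for _, after in pairs)
    return peak_before, peak_after


# Illustrative log fragment, not real compiler output.
sample = "main {GC 2100k -> 1900k} foo {GC 2269k -> 1958k}"
print(peak_memory(sample))
```

The two maxima are taken independently, so they need not come from the
same GGC run.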

Your testing script.


