A recent patch increased GCC's memory consumption!
gcctest@suse.de
Thu Feb 22 03:42:00 GMT 2007
Hi,
I am a friendly script caring about memory consumption in GCC. Please
contact jh@suse.cz if something is going wrong.
Comparing memory consumption on compilation of combine.i, insn-attrtab.i,
and generate-3.4.ii I got:
comparing empty function compilation at -O0 level:
Overall memory needed: 7383k
Peak memory use before GGC: 2269k
Peak memory use after GGC: 1958k
Maximum of released memory in single GGC run: 311k
Garbage: 446k
Leak: 2292k
Overhead: 456k
GGC runs: 3
comparing empty function compilation at -O0 -g level:
Overall memory needed: 7399k
Peak memory use before GGC: 2297k
Peak memory use after GGC: 1986k
Maximum of released memory in single GGC run: 311k
Garbage: 449k
Leak: 2325k
Overhead: 461k
GGC runs: 3
comparing empty function compilation at -O1 level:
Overall memory needed: 7511k
Peak memory use before GGC: 2269k
Peak memory use after GGC: 1958k
Maximum of released memory in single GGC run: 311k
Garbage: 452k
Leak: 2295k
Overhead: 457k
GGC runs: 4
comparing empty function compilation at -O2 level:
Overall memory needed: 7519k
Peak memory use before GGC: 2270k
Peak memory use after GGC: 1959k
Maximum of released memory in single GGC run: 311k
Garbage: 455k
Leak: 2295k
Overhead: 457k
GGC runs: 4
comparing empty function compilation at -O3 level:
Overall memory needed: 7519k
Peak memory use before GGC: 2270k
Peak memory use after GGC: 1959k
Maximum of released memory in single GGC run: 311k
Garbage: 455k
Leak: 2295k
Overhead: 457k
GGC runs: 4
comparing combine.c compilation at -O0 level:
Overall memory needed: 17879k
Peak memory use before GGC: 9277k
Peak memory use after GGC: 8859k
Maximum of released memory in single GGC run: 2584k
Garbage: 37095k -> 37118k
Leak: 6587k -> 6579k
Overhead: 5036k -> 5037k
GGC runs: 279
comparing combine.c compilation at -O0 -g level:
Overall memory needed: 19895k
Peak memory use before GGC: 10885k -> 10887k
Peak memory use after GGC: 10519k
Maximum of released memory in single GGC run: 2355k
Garbage: 37703k -> 37710k
Leak: 9464k
Overhead: 5739k -> 5741k
GGC runs: 269
comparing combine.c compilation at -O1 level:
Overall memory needed: 35311k -> 35299k
Peak memory use before GGC: 19350k
Peak memory use after GGC: 19135k
Maximum of released memory in single GGC run: 2169k
Garbage: 57393k -> 57399k
Leak: 6616k
Overhead: 6370k -> 6371k
GGC runs: 352
comparing combine.c compilation at -O2 level:
Overall memory needed: 37583k -> 37591k
Peak memory use before GGC: 19387k
Peak memory use after GGC: 19185k
Maximum of released memory in single GGC run: 2157k
Garbage: 68659k -> 68663k
Leak: 6725k
Overhead: 7995k -> 7996k
GGC runs: 403
comparing combine.c compilation at -O3 level:
Overall memory needed: 47015k
Peak memory use before GGC: 20424k
Peak memory use after GGC: 19542k
Maximum of released memory in single GGC run: 3135k
Garbage: 101003k -> 101007k
Leak: 6865k
Overhead: 12221k -> 12222k
GGC runs: 453
comparing insn-attrtab.c compilation at -O0 level:
Overall memory needed: 103215k
Peak memory use before GGC: 68882k
Peak memory use after GGC: 44737k
Maximum of released memory in single GGC run: 36678k
Garbage: 130211k -> 130211k
Leak: 9350k
Overhead: 16801k -> 16801k
GGC runs: 206
comparing insn-attrtab.c compilation at -O0 -g level:
Overall memory needed: 104615k
Peak memory use before GGC: 70043k
Peak memory use after GGC: 46005k
Maximum of released memory in single GGC run: 36678k
Garbage: 131109k -> 131109k
Leak: 11279k
Overhead: 17195k -> 17195k
GGC runs: 206
comparing insn-attrtab.c compilation at -O1 level:
Overall memory needed: 147615k
Peak memory use before GGC: 85871k
Peak memory use after GGC: 80078k
Maximum of released memory in single GGC run: 32817k
Garbage: 264119k -> 264120k
Leak: 9410k
Overhead: 27484k -> 27484k
GGC runs: 225
comparing insn-attrtab.c compilation at -O2 level:
Overall memory needed: 193227k
Peak memory use before GGC: 87187k
Peak memory use after GGC: 80148k
Maximum of released memory in single GGC run: 31177k
Garbage: 298877k -> 298878k
Leak: 9407k
Overhead: 33061k -> 33061k
GGC runs: 245
comparing insn-attrtab.c compilation at -O3 level:
Overall memory needed: 193187k
Peak memory use before GGC: 87200k
Peak memory use after GGC: 80161k
Maximum of released memory in single GGC run: 31240k
Garbage: 299536k -> 299536k
Leak: 9412k
Overhead: 33260k -> 33260k
GGC runs: 245
comparing Gerald's testcase PR8361 compilation at -O0 level:
Amount of produced GGC garbage increased from 208973k to 209329k, overall 0.17%
Overall memory needed: 148351k
Peak memory use before GGC: 90646k
Peak memory use after GGC: 89748k
Maximum of released memory in single GGC run: 18005k
Garbage: 208973k -> 209329k
Leak: 49218k
Overhead: 23460k -> 23532k
GGC runs: 413
comparing Gerald's testcase PR8361 compilation at -O0 -g level:
Amount of produced GGC garbage increased from 215570k to 215935k, overall 0.17%
Overall memory needed: 166687k
Peak memory use before GGC: 103574k
Peak memory use after GGC: 102548k
Maximum of released memory in single GGC run: 18487k
Garbage: 215570k -> 215935k
Leak: 72643k
Overhead: 29381k -> 29453k
GGC runs: 385
comparing Gerald's testcase PR8361 compilation at -O1 level:
Overall memory needed: 142187k
Peak memory use before GGC: 101989k
Peak memory use after GGC: 100966k
Maximum of released memory in single GGC run: 17236k
Garbage: 345330k -> 345386k
Leak: 49931k
Overhead: 30011k -> 30022k
GGC runs: 528
comparing Gerald's testcase PR8361 compilation at -O2 level:
Overall memory needed: 144315k
Peak memory use before GGC: 102664k
Peak memory use after GGC: 101611k
Maximum of released memory in single GGC run: 17234k
Garbage: 374162k -> 374190k
Leak: 50534k
Overhead: 33937k -> 33943k
GGC runs: 562
comparing Gerald's testcase PR8361 compilation at -O3 level:
Overall memory needed: 146915k
Peak memory use before GGC: 104418k
Peak memory use after GGC: 103384k
Maximum of released memory in single GGC run: 17650k
Garbage: 391422k -> 391435k
Leak: 51195k
Overhead: 35271k -> 35274k
GGC runs: 573
comparing PR rtl-optimization/28071 testcase compilation at -O0 level:
Overall memory needed: 244755k
Peak memory use before GGC: 80969k
Peak memory use after GGC: 58708k
Maximum of released memory in single GGC run: 44133k
Garbage: 145362k
Leak: 7619k
Overhead: 24814k
GGC runs: 79
comparing PR rtl-optimization/28071 testcase compilation at -O0 -g level:
Overall memory needed: 245583k
Peak memory use before GGC: 81615k
Peak memory use after GGC: 59354k
Maximum of released memory in single GGC run: 44123k
Garbage: 145629k
Leak: 9387k
Overhead: 25310k
GGC runs: 89
comparing PR rtl-optimization/28071 testcase compilation at -O1 level:
Overall memory needed: 245975k
Peak memory use before GGC: 85141k
Peak memory use after GGC: 74853k
Maximum of released memory in single GGC run: 36136k
Garbage: 223655k
Leak: 20863k
Overhead: 30547k
GGC runs: 81
comparing PR rtl-optimization/28071 testcase compilation at -O2 level:
Overall memory needed: 499287k
Peak memory use before GGC: 79840k
Peak memory use after GGC: 74854k
Maximum of released memory in single GGC run: 33434k
Garbage: 230701k
Leak: 20953k
Overhead: 32630k
GGC runs: 91
comparing PR rtl-optimization/28071 testcase compilation at -O3 -fno-tree-pre -fno-tree-fre level:
Overall memory needed: 1189119k
Peak memory use before GGC: 201756k
Peak memory use after GGC: 190218k
Maximum of released memory in single GGC run: 80703k
Garbage: 376734k
Leak: 46318k
Overhead: 49361k
GGC runs: 70
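
The "overall 0.17%" figures reported for the PR8361 testcase can be checked directly from the two garbage values on each line. The snippet below is only a sketch of that arithmetic; the function name is made up for illustration and is not taken from the testing script.

```python
def garbage_increase_percent(before_k: int, after_k: int) -> float:
    """Relative growth of GGC garbage, in percent (hypothetical helper)."""
    return (after_k - before_k) / before_k * 100.0


# Garbage figures copied from the PR8361 -O0 and -O0 -g reports above.
for level, before_k, after_k in [
    ("-O0", 208973, 209329),
    ("-O0 -g", 215570, 215935),
]:
    print(f"{level}: {garbage_increase_percent(before_k, after_k):.2f}%")
```

Both cases round to 0.17%, matching the messages in the report.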
Head of the ChangeLog is:
--- /usr/src/SpecTests/sandbox-britten-memory/x86_64/mem-result/ChangeLog 2007-02-21 09:24:27.000000000 +0000
+++ /usr/src/SpecTests/sandbox-britten-memory/gcc/gcc/ChangeLog 2007-02-22 01:38:33.000000000 +0000
@@ -1,3 +1,125 @@
+2007-02-20 Mike Stump <mrs@apple.com>
+
+ * configure.ac (powerpc*-*-darwin*): #include <sys/cdefs.h>.
+ * configure: Regenerate.
+
+2007-02-21 Trevor Smigiel <trevor_smigiel@playstation.sony.com>
+
+ Change the defaults of some parameters and options.
+ * config/spu/spu-protos.h (spu_optimization_options): Declare.
+ * config/spu/spu.c (spu_optimization_options): Add.
+ (spu_override_options): Change params in spu_optimization_options.
+ * config/spu/spu.h (OPTIMIZATION_OPTIONS): Define.
+
+ Register 127 is only 16 byte aligned when used as a frame pointer.
+ * config/spu/spu-protos.h (spu_init_expanders): Declare.
+ * config/spu/spu.c (spu_expand_prologue): Set REGNO_POINTER_ALIGN for
+ HARD_FRAME_POINTER_REGNUM.
+ (spu_legitimate_address): Use regno_aligned_for_reload.
+ (regno_aligned_for_load): HARD_FRAME_POINTER_REGNUM is only 16 byte
+ aligned when frame_pointer_needed is true.
+ (spu_init_expanders): New. Set alignment of HARD_FRAME_POINTER_REGNUM
+ to 8 bits.
+ * config/spu/spu.h (INIT_EXPANDERS): Define.
+
+ Make sure shift and rotate instructions have valid immediate operands.
+ * config/spu/predicates.md (spu_shift_operand): Remove.
+ * config/spu/spu.c (print_operand): Add [efghEFGH] modifiers.
+ * config/spu/constraints.md (W, O): Extend range.
+ * config/spu/spu.md (umask, nmask): Define.
+ (ashl<mode>3, ashldi3, ashlti3_imm, shlqbybi_ti, shlqbi_ti, shlqby_ti,
+ lshr<mode>3, rotm_<mode>, lshr<mode>3_imm, rotqmbybi_<mode>,
+ rotqmbi_<mode>, rotqmby_<mode>, ashr<mode>3, rotma_<mode>,
+ rotl<mode>3, rotlti3, rotqbybi_ti, rotqby_ti, rotqbi_ti): Use
+ spu_nonmem_operand instead of spu_shift_operands. Use new modifiers.
+ (lshr<mode>3_reg): Fix rtl description.
+
+ Make sure mulhisi immediate operands are valid.
+ * config/spu/predicates.md (imm_K_operand): Add.
+ * config/spu/spu.md (mulhisi3_imm, umulhisi3_imm): Use imm_K_operand.
+
+ Generate constants using fsmbi and andi.
+ * config/spu/spu.c (enum immediate_class): Add IC_FSMBI2.
+ (print_operand, spu_split_immediate, classify_immediate,
+ fsmbi_const_p): Handle IC_FSMBI2.
+
+ Correctly handle a CONST_VECTOR containing symbols.
+ * config/spu/spu.c (print_operand): Handle HIGH correctly.
+ (spu_split_immediate): Split CONST_VECTORs with -mlarge-mem.
+ (immediate_load_p): Allow symbols that use 2 instructions to create.
+ (classify_immediate, spu_builtin_splats): Don't accept a CONST_VECTOR
+ with symbols when flag_pic is set.
+ (const_vector_immediate_p): New.
+ (logical_immediate_p, iohl_immediate_p, arith_immediate_p): Don't
+ accept a CONST_VECTOR with symbols.
+ (spu_legitimate_constant_p): Use const_vector_immediate_p. Don't
+ accept a CONST_VECTOR with symbols when flag_pic is set. Handle HIGH
+ correctly.
+ * config/spu/spu.md (high, low): Delete.
+ (low_<mode>): Define.
+
+ Remove INTRmode and INTR_REGNUM, which didn't work.
+ * config/spu/spu.c (spu_conditional_register_usage): Remove reference
+ of INTR_REGNUM.
+ * config/spu/spu-builtins.md (spu_idisable, spu_ienable, set_intr,
+ set_intr_pic, set_intr_cc, set_intr_cc_pic, set_intr_return, unnamed
+ peephole2 pattern): Don't use INTR or 131.
+ (movintrcc): Delete.
+ * config/spu/spu.h (FIRST_PSEUDO_REGISTER, FIXED_REGISTERS,
+ CALL_USED_REGISTERS, REGISTER_NAMES, INTR_REGNUM): Remove INTR_REGNUM.
+ * config/spu/spu.md (UNSPEC_IDISABLE, UNSPEC_IENABLE): Remove.
+ (UNSPEC_SET_INTR): Add.
+ * config/spu/spu-modes.def (INTR): Remove.
+
+ More accurate warnings about run-time relocations.
+ * config/spu/spu.c (reloc_diagnostic): Test in_section.
+
+ Correctly warn about immediate arguments to specific intrinsics.
+ * config/spu/spu.c (spu_check_builtin_parm): Handle CONST_VECTORs.
+ (spu_expand_builtin_1): Call spu_check_builtin_parm before checking
+ the instruction predicate.
+
+ Fix tree check errors with latest update.
+ * config/spu/spu.c (expand_builtin_args, spu_expand_builtin_1): Use
+ CALL_EXPR_ARG.
+ (spu_expand_builtin): Use CALL_EXPR_FN.
+
+ Add missing specific intrinsics.
+ * config/spu/spu-builtins.def: Add si_bisled, si_bisledd and
+ si_bislede.
+ * config/spu/spu_internals.h: Ditto.
+
+ Fix incorrect operand modifiers.
+ * config/spu/spu-builtins.md (spu_mpy, spu_mpyu): Remove use of %H.
+ * config/spu/spu.md (xor<mode>3): Change %S to %J.
+
+ Optimize one case of zero_extend of a vec_select.
+ * config/spu/spu.md (_vec_extractv8hi_ze): Add.
+
+ Accept any immediate for hbr.
+ * config/spu/spu.md (hbr): Change s constraints to i.
+
+2007-02-21 Paul Brook <paul@codesourcery.com>
+
+ * config/arm/arm.c (thumb2_final_prescan_insn): Don't incrememnt
+ condexec_count when skipping USE and CLOBBER.
+
+2007-02-21 Nick Clifton <nickc@redhat.com>
+
+ * common.opt (Warray-bounds): Add Warning attribute.
+ (Wstrict-overflow, Wstrict-overflow=, Wcoverage-mismatch):
+ Likewise.
+ (fsized-zeroes): Add Optimization attribute.
+ (fsplit-wide-types, ftree-scev-cprop): Likewise.
+ * c.opt (Wc++0x-compat): Add Warning attribute.
+
+2007-02-21 Ulrich Weigand <uweigand@de.ibm.com>
+
+ PR middle-end/30761
+ * reload1.c (eliminate_regs_in_insn): In the single_set special
+ case, attempt to re-recognize the insn before falling back to
+ having reload fix it up.
+
2007-02-20 Eric Christopher <echristo@gmail.com>
* config/frv/frv.c (frv_read_argument): Take a tree and int argument.
@@ -88,7 +210,7 @@
intrinsics.
2007-02-20 Manuel Lopez-Ibanez <manu@gcc.gnu.org>
- DJ Delorie <dj@redhat.com>
+ DJ Delorie <dj@redhat.com>
PR other/30824
* diagnostic.c (diagnostic_count_diagnostic): Move -Werror logic to...
The results can be reproduced by building a compiler with
--enable-gather-detailed-mem-stats targeting x86-64
and compiling preprocessed combine.c or the testcase from PR8361 with:
-fmem-report --param=ggc-min-heapsize=1024 --param=ggc-min-expand=1 -Ox -Q
The memory consumption summary appears in the dump after the detailed listing
of the places where the memory is allocated. Peak memory consumption is
computed by looking for the maximal value in the {GC XXXX -> YYYY} reports.
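
As a rough illustration of that last step, the sketch below scans a dump for {GC XXXX -> YYYY} reports and takes the maximum of each side. It assumes the reports look like "{GC 2269k -> 1958k}" and that the "before GGC" and "after GGC" peaks correspond to the largest XXXX and YYYY values respectively; the function name and the sample text are hypothetical, not taken from the testing script.

```python
import re

# Assumed shape of one collector report, e.g. "{GC 2269k -> 1958k}".
GC_REPORT = re.compile(r"\{GC (\d+)k -> (\d+)k\}")


def peak_memory(dump: str) -> tuple[int, int]:
    """Return (peak before GGC, peak after GGC) in kilobytes."""
    pairs = [(int(before), int(after)) for before, after in GC_REPORT.findall(dump)]
    if not pairs:
        raise ValueError("no {GC ... -> ...} reports found in dump")
    peak_before = max(before for before, _ in pairs)
    peak_after = max(after for _, after in pairs)
    return peak_before, peak_after


# Made-up dump fragment, for illustration only.
sample = "foo {GC 2100k -> 1900k} bar {GC 2269k -> 1958k} baz"
print(peak_memory(sample))
```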
Your testing script.