A recent patch increased GCC's memory consumption in some cases!
gcctest@suse.de
Sat Oct 6 03:45:00 GMT 2007
Hi,
I am a friendly script caring about memory consumption in GCC. Please
contact jh@suse.cz if something is going wrong.
Comparing memory consumption on compilation of combine.i, insn-attrtab.i,
and generate-3.4.ii I got:
comparing empty function compilation at -O0 level:
Overall memory needed: 7040k
Peak memory use before GGC: 1180k
Peak memory use after GGC: 1079k
Maximum of released memory in single GGC run: 126k
Garbage: 249k
Leak: 1084k
Overhead: 141k
GGC runs: 4
comparing empty function compilation at -O0 -g level:
Overall memory needed: 7056k -> 7060k
Peak memory use before GGC: 1207k
Peak memory use after GGC: 1107k
Maximum of released memory in single GGC run: 128k
Garbage: 252k
Leak: 1116k
Overhead: 145k
GGC runs: 4
comparing empty function compilation at -O1 level:
Overall memory needed: 7096k
Peak memory use before GGC: 1180k
Peak memory use after GGC: 1071k
Maximum of released memory in single GGC run: 121k
Garbage: 251k
Leak: 1084k
Overhead: 141k
GGC runs: 3
comparing empty function compilation at -O2 level:
Overall memory needed: 7100k
Peak memory use before GGC: 1180k
Peak memory use after GGC: 1072k
Maximum of released memory in single GGC run: 121k
Garbage: 255k
Leak: 1085k
Overhead: 142k
GGC runs: 3
comparing empty function compilation at -O3 level:
Overall memory needed: 7100k
Peak memory use before GGC: 1180k
Peak memory use after GGC: 1072k
Maximum of released memory in single GGC run: 121k
Garbage: 255k
Leak: 1085k
Overhead: 142k
GGC runs: 3
comparing combine.c compilation at -O0 level:
Overall memory needed: 22032k
Peak memory use before GGC: 8285k
Peak memory use after GGC: 7624k
Maximum of released memory in single GGC run: 1580k
Garbage: 38810k
Leak: 6169k
Overhead: 5021k
GGC runs: 369
comparing combine.c compilation at -O0 -g level:
Overall memory needed: 24044k
Peak memory use before GGC: 10115k
Peak memory use after GGC: 9389k
Maximum of released memory in single GGC run: 1875k
Garbage: 39134k
Leak: 8996k
Overhead: 5693k
GGC runs: 343
comparing combine.c compilation at -O1 level:
Overall memory needed: 33048k -> 32968k
Peak memory use before GGC: 17056k
Peak memory use after GGC: 16868k
Maximum of released memory in single GGC run: 1379k
Garbage: 52438k -> 52438k
Leak: 6308k
Overhead: 6002k -> 6002k
GGC runs: 440
comparing combine.c compilation at -O2 level:
Overall memory needed: 35372k
Peak memory use before GGC: 17127k
Peak memory use after GGC: 16957k
Maximum of released memory in single GGC run: 1335k
Garbage: 71351k -> 71353k
Leak: 6639k
Overhead: 8254k -> 8254k
GGC runs: 507
comparing combine.c compilation at -O3 level:
Overall memory needed: 38816k
Peak memory use before GGC: 17337k
Peak memory use after GGC: 17010k
Maximum of released memory in single GGC run: 2130k
Garbage: 92736k -> 92739k
Leak: 6750k
Overhead: 10759k -> 10759k
GGC runs: 537
comparing insn-attrtab.c compilation at -O0 level:
Overall memory needed: 138156k
Peak memory use before GGC: 58644k
Peak memory use after GGC: 32137k
Maximum of released memory in single GGC run: 34144k
Garbage: 131586k
Leak: 8909k
Overhead: 14830k
GGC runs: 295
comparing insn-attrtab.c compilation at -O0 -g level:
Overall memory needed: 139404k
Peak memory use before GGC: 59793k
Peak memory use after GGC: 33286k
Maximum of released memory in single GGC run: 34144k
Garbage: 132063k
Leak: 10345k
Overhead: 15211k
GGC runs: 291
comparing insn-attrtab.c compilation at -O1 level:
Overall memory allocated via mmap and sbrk decreased from 153536k to 146168k, overall -5.04%
Overall memory needed: 153536k -> 146168k
Peak memory use before GGC: 57137k
Peak memory use after GGC: 50907k
Maximum of released memory in single GGC run: 24233k
Garbage: 212481k
Leak: 9801k
Overhead: 24835k
GGC runs: 319
comparing insn-attrtab.c compilation at -O2 level:
Overall memory needed: 187064k -> 187052k
Peak memory use before GGC: 57772k
Peak memory use after GGC: 52500k
Maximum of released memory in single GGC run: 22973k
Garbage: 253948k
Leak: 10889k
Overhead: 30581k
GGC runs: 350
comparing insn-attrtab.c compilation at -O3 level:
Overall memory needed: 194252k -> 194256k
Peak memory use before GGC: 69771k
Peak memory use after GGC: 63204k
Maximum of released memory in single GGC run: 23494k
Garbage: 281995k
Leak: 10925k
Overhead: 32460k
GGC runs: 356
comparing Gerald's testcase PR8361 compilation at -O0 level:
Overall memory needed: 155523k -> 155507k
Peak memory use before GGC: 89791k
Peak memory use after GGC: 88898k
Maximum of released memory in single GGC run: 18065k
Garbage: 210359k
Leak: 53116k
Overhead: 26511k
GGC runs: 418
comparing Gerald's testcase PR8361 compilation at -O0 -g level:
Overall memory needed: 174663k
Peak memory use before GGC: 101236k
Peak memory use after GGC: 100233k
Maximum of released memory in single GGC run: 18252k
Garbage: 215953k
Leak: 74940k
Overhead: 31919k
GGC runs: 392
comparing Gerald's testcase PR8361 compilation at -O1 level:
Overall memory needed: 121289k -> 121281k
Peak memory use before GGC: 88659k
Peak memory use after GGC: 87777k
Maximum of released memory in single GGC run: 17321k
Garbage: 292134k -> 292134k
Leak: 52378k
Overhead: 30190k -> 30190k
GGC runs: 510
comparing Gerald's testcase PR8361 compilation at -O2 level:
Overall memory needed: 126369k
Peak memory use before GGC: 88759k
Peak memory use after GGC: 87878k
Maximum of released memory in single GGC run: 17318k
Garbage: 357871k -> 357872k
Leak: 53383k
Overhead: 37119k -> 37119k
GGC runs: 591
comparing Gerald's testcase PR8361 compilation at -O3 level:
Overall memory needed: 130761k -> 131133k
Peak memory use before GGC: 89976k
Peak memory use after GGC: 89082k -> 89081k
Maximum of released memory in single GGC run: 17674k
Garbage: 389861k -> 389943k
Leak: 53814k -> 53813k
Overhead: 39850k -> 39865k
GGC runs: 612
comparing PR rtl-optimization/28071 testcase compilation at -O0 level:
Overall memory needed: 379339k -> 379338k
Peak memory use before GGC: 101510k
Peak memory use after GGC: 57163k
Maximum of released memory in single GGC run: 50582k
Garbage: 179454k
Leak: 6299k
Overhead: 30876k
GGC runs: 105
comparing PR rtl-optimization/28071 testcase compilation at -O0 -g level:
Overall memory needed: 380143k -> 380146k
Peak memory use before GGC: 102144k
Peak memory use after GGC: 57797k
Maximum of released memory in single GGC run: 50583k
Garbage: 179559k
Leak: 8007k
Overhead: 31342k
GGC runs: 110
comparing PR rtl-optimization/28071 testcase compilation at -O1 level:
Overall memory allocated via mmap and sbrk increased from 297612k to 392896k, overall 32.02%
Overall memory needed: 297612k -> 392896k
Peak memory use before GGC: 81062k
Peak memory use after GGC: 73201k
Maximum of released memory in single GGC run: 40265k
Garbage: 236192k
Leak: 15698k
Overhead: 31631k
GGC runs: 103
comparing PR rtl-optimization/28071 testcase compilation at -O2 level:
Overall memory allocated via mmap and sbrk increased from 270588k to 296924k, overall 9.73%
Overall memory needed: 270588k -> 296924k
Peak memory use before GGC: 78187k
Peak memory use after GGC: 73201k
Maximum of released memory in single GGC run: 33868k
Garbage: 246253k
Leak: 15788k
Overhead: 33704k
GGC runs: 116
comparing PR rtl-optimization/28071 testcase compilation at -O3 -fno-tree-pre -fno-tree-fre level:
Overall memory allocated via mmap and sbrk increased from 1056156k to 1312880k, overall 24.31%
Overall memory needed: 1056156k -> 1312880k
Peak memory use before GGC: 136332k
Peak memory use after GGC: 126665k
Maximum of released memory in single GGC run: 68197k
Garbage: 364426k
Leak: 26574k
Overhead: 46311k
GGC runs: 102
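The "overall" percentages quoted above can be checked by hand. The sketch below is an inference from the numbers in this report, not the testing script's actual code: the reported figures are consistent with dividing the difference by the smaller of the two values (the old value for an increase, the new value for a decrease). The function name relative_change is hypothetical.

```python
def relative_change(old_kb, new_kb):
    """Percent change between two sizes in kilobytes.

    Assumption: the base is the smaller of the two values. This is inferred
    from the figures in the report, not taken from the tester's source.
    """
    return (new_kb - old_kb) / min(old_kb, new_kb) * 100.0


# "Overall memory needed" pairs that were flagged in the report above.
flagged = [
    (153536, 146168),    # insn-attrtab.c at -O1
    (297612, 392896),    # PR rtl-optimization/28071 at -O1
    (270588, 296924),    # PR rtl-optimization/28071 at -O2
    (1056156, 1312880),  # PR rtl-optimization/28071 at -O3 -fno-tree-pre -fno-tree-fre
]

for old, new in flagged:
    print(f"{old}k -> {new}k: {relative_change(old, new):.2f}%")
```

With this base the four flagged lines come out as -5.04%, 32.02%, 9.73% and 24.31%, matching the report. Dividing the insn-attrtab.c decrease by the old value instead would give about -4.80%.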
Head of the ChangeLog is:
--- /usr/src/SpecTests/sandbox-britten-memory/x86_64/mem-result/ChangeLog 2007-10-05 07:54:54.000000000 +0000
+++ /usr/src/SpecTests/sandbox-britten-memory/gcc/gcc/ChangeLog 2007-10-06 00:13:09.000000000 +0000
@@ -1,3 +1,85 @@
+2007-10-05 Hans-Peter Nilsson <hp@axis.com>
+
+ * gthr-single.h: Revert last change.
+
+2007-10-05 Michael Matz <matz@suse.de>
+
+ PR middle-end/33667
+ * lower-subreg.c (decompose_multiword_subregs): Use
+ validate_unshare_change().
+
+2007-10-05 Peter Bergner <bergner@vnet.ibm.com>
+
+ * ra-conflict.c: Include "sparseset.h".
+ (conflicts): Change to HOST_WIDEST_FAST_INT.
+ (allocnos_live): Redefine variable as a sparseset.
+ (SET_ALLOCNO_LIVE, CLEAR_ALLOCNO_LIVE, GET_ALLOCNO_LIVE): Delete macros.
+ (allocno_row_words): Removed global variable.
+ (partial_bitnum, max_bitnum, adjacency_pool, adjacency): New variables.
+ (CONFLICT_BITNUM, CONFLICT_BITNUM_FAST): New defines.
+ (conflict_p, set_conflict_p, set_conflicts_p): New functions.
+ (record_one_conflict_between_regnos): Cache allocno values and reuse.
+ Use set_conflict_p.
+ (record_one_conflict): Update uses of allocnos_live to use
+ the sparseset routines. Use set_conflicts_p.
+ (mark_reg_store): Likewise.
+ (set_reg_in_live): Likewise.
+ (global_conflicts): Update uses of allocnos_live.
+ Use the new adjacency list to visit an allocno's neighbors
+ rather than iterating over all possible allocnos.
+ Call set_conflicts_p to setup conflicts rather than adding
+ them manually.
+ * global.c: Comments updated.
+ (CONFLICTP): Delete define.
+ (regno_compare): New function. Add prototype.
+ (global_alloc): Sort the allocno to regno mapping according to
+ which basic blocks the regnos are referenced in. Modify the
+ conflict bit matrix to a compressed triangular bitmatrix.
+ Only allocate the conflict bit matrix and adjacency lists if
+ we are actually going to allocate something.
+ (expand_preferences): Use conflict_p. Update uses of allocnos_live.
+ (prune_preferences): Use the FOR_EACH_CONFLICT macro to visit an
+ allocno's neighbors rather than iterating over all possible allocnos.
+ (mirror_conflicts): Removed function.
+ (dump_conflicts): Iterate over regnos rather than allocnos so
+ that all dump output will be sorted by regno number.
+ Use the FOR_EACH_CONFLICT macro.
+ * ra.h: Comments updated.
+ (conflicts): Update prototype to HOST_WIDEST_FAST_INT.
+ (partial_bitnum, max_bitnum, adjacency, adjacency_pool): Add prototypes.
+ (ADJACENCY_VEC_LENGTH, FOR_EACH_CONFLICT): New defines.
+ (adjacency_list_d, adjacency_iterator_d): New types.
+ (add_neighbor, adjacency_iter_init, adjacency_iter_done,
+ adjacency_iter_next, regno_basic_block): New static inline functions.
+ (EXECUTE_IF_SET_IN_ALLOCNO_SET): Removed define.
+ (conflict_p): Add function prototype.
+ * sparseset.h, sparseset.c: New files.
+ * Makefile.in (OBJS-common): Add sparseset.o.
+ (sparseset.o): New rule.
+
+2007-10-05 Richard Guenther <rguenther@suse.de>
+
+ PR middle-end/33666
+ * fold-const.c (fold_unary): Do not fold (long long)(int)ptr
+ to (long long)ptr.
+
+2007-10-05 Michael Matz <matz@suse.de>
+
+ PR inline-asm/33600
+ * function.c (match_asm_constraints_1): Check for input
+ being used in the outputs.
+
+2007-10-05 Richard Guenther <rguenther@suse.de>
+
+ * tree-cfg.c (verify_gimple_expr): Accept OBJ_TYPE_REF.
+
+2007-10-05 Richard Sandiford <rsandifo@nildram.co.uk>
+
+ PR target/33635
+ * config/mips/mips.c (mips_register_move_cost): Rewrite to use
+ subset checks. Make the cost of FPR -> FPR moves depend on
+ mips_mode_ok_for_mov_fmt_p.
+
2007-10-04 Doug Kwan <dougkwan@google.com>
* gthr-posix.h (__gthread_cond_broadcast, __gthread_cond_wait,
The results can be reproduced by building a compiler with
--enable-gather-detailed-mem-stats targeting x86-64
and compiling preprocessed combine.c or the testcase from PR8361 with:
-fmem-report --param=ggc-min-heapsize=1024 --param=ggc-min-expand=1 -Ox -Q
The memory consumption summary appears in the dump after the detailed listing
of the places where memory is allocated. Peak memory consumption is
computed by looking for the maximal value in the {GC XXXX -> YYYY} reports.
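A minimal sketch of that peak computation, for anyone reproducing the numbers by hand. It assumes each collection is logged in the shape {GC XXXX -> YYYY} with sizes written like 8285k, as elsewhere in this report. The helper name peak_memory and the sample text are hypothetical, and the exact dump format should be checked against real -Q output.

```python
import re

# Assumed shape of one collection record, based on the "{GC XXXX -> YYYY}"
# placeholder in the text, with sizes in kilobytes such as "8285k".
GC_RECORD = re.compile(r"\{GC\s+(\d+)k\s*->\s*(\d+)k\}")


def peak_memory(dump_text):
    """Return (peak_before_kb, peak_after_kb, gc_runs) for a compiler dump.

    Returns None when the dump contains no GC records at all.
    """
    records = [(int(before), int(after))
               for before, after in GC_RECORD.findall(dump_text)]
    if not records:
        return None
    peak_before = max(before for before, _ in records)
    peak_after = max(after for _, after in records)
    return peak_before, peak_after, len(records)


# Hypothetical excerpt; a real dump interleaves these records with pass names.
sample = "main {GC 1100k -> 1000k} {GC 1180k -> 1079k} {GC 1150k -> 1050k}"
print(peak_memory(sample))
```

The two maxima correspond to the "Peak memory use before GGC" and "Peak memory use after GGC" lines, and the record count to "GGC runs".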
Your testing script.