A recent patch increased GCC's memory consumption in some cases!

gcctest@suse.de
Sat Oct 6 03:45:00 GMT 2007


Hi,

I am a friendly script caring about memory consumption in GCC.  Please
contact jh@suse.cz if something is going wrong.

Comparing memory consumption on compilation of combine.i, insn-attrtab.i,
and generate-3.4.ii I got:


comparing empty function compilation at -O0 level:
    Overall memory needed: 7040k
    Peak memory use before GGC: 1180k
    Peak memory use after GGC: 1079k
    Maximum of released memory in single GGC run: 126k
    Garbage: 249k
    Leak: 1084k
    Overhead: 141k
    GGC runs: 4

comparing empty function compilation at -O0 -g level:
    Overall memory needed: 7056k -> 7060k
    Peak memory use before GGC: 1207k
    Peak memory use after GGC: 1107k
    Maximum of released memory in single GGC run: 128k
    Garbage: 252k
    Leak: 1116k
    Overhead: 145k
    GGC runs: 4

comparing empty function compilation at -O1 level:
    Overall memory needed: 7096k
    Peak memory use before GGC: 1180k
    Peak memory use after GGC: 1071k
    Maximum of released memory in single GGC run: 121k
    Garbage: 251k
    Leak: 1084k
    Overhead: 141k
    GGC runs: 3

comparing empty function compilation at -O2 level:
    Overall memory needed: 7100k
    Peak memory use before GGC: 1180k
    Peak memory use after GGC: 1072k
    Maximum of released memory in single GGC run: 121k
    Garbage: 255k
    Leak: 1085k
    Overhead: 142k
    GGC runs: 3

comparing empty function compilation at -O3 level:
    Overall memory needed: 7100k
    Peak memory use before GGC: 1180k
    Peak memory use after GGC: 1072k
    Maximum of released memory in single GGC run: 121k
    Garbage: 255k
    Leak: 1085k
    Overhead: 142k
    GGC runs: 3

comparing combine.c compilation at -O0 level:
    Overall memory needed: 22032k
    Peak memory use before GGC: 8285k
    Peak memory use after GGC: 7624k
    Maximum of released memory in single GGC run: 1580k
    Garbage: 38810k
    Leak: 6169k
    Overhead: 5021k
    GGC runs: 369

comparing combine.c compilation at -O0 -g level:
    Overall memory needed: 24044k
    Peak memory use before GGC: 10115k
    Peak memory use after GGC: 9389k
    Maximum of released memory in single GGC run: 1875k
    Garbage: 39134k
    Leak: 8996k
    Overhead: 5693k
    GGC runs: 343

comparing combine.c compilation at -O1 level:
    Overall memory needed: 33048k -> 32968k
    Peak memory use before GGC: 17056k
    Peak memory use after GGC: 16868k
    Maximum of released memory in single GGC run: 1379k
    Garbage: 52438k -> 52438k
    Leak: 6308k
    Overhead: 6002k -> 6002k
    GGC runs: 440

comparing combine.c compilation at -O2 level:
    Overall memory needed: 35372k
    Peak memory use before GGC: 17127k
    Peak memory use after GGC: 16957k
    Maximum of released memory in single GGC run: 1335k
    Garbage: 71351k -> 71353k
    Leak: 6639k
    Overhead: 8254k -> 8254k
    GGC runs: 507

comparing combine.c compilation at -O3 level:
    Overall memory needed: 38816k
    Peak memory use before GGC: 17337k
    Peak memory use after GGC: 17010k
    Maximum of released memory in single GGC run: 2130k
    Garbage: 92736k -> 92739k
    Leak: 6750k
    Overhead: 10759k -> 10759k
    GGC runs: 537

comparing insn-attrtab.c compilation at -O0 level:
    Overall memory needed: 138156k
    Peak memory use before GGC: 58644k
    Peak memory use after GGC: 32137k
    Maximum of released memory in single GGC run: 34144k
    Garbage: 131586k
    Leak: 8909k
    Overhead: 14830k
    GGC runs: 295

comparing insn-attrtab.c compilation at -O0 -g level:
    Overall memory needed: 139404k
    Peak memory use before GGC: 59793k
    Peak memory use after GGC: 33286k
    Maximum of released memory in single GGC run: 34144k
    Garbage: 132063k
    Leak: 10345k
    Overhead: 15211k
    GGC runs: 291

comparing insn-attrtab.c compilation at -O1 level:
  Overall memory allocated via mmap and sbrk decreased from 153536k to 146168k, overall -5.04%
    Overall memory needed: 153536k -> 146168k
    Peak memory use before GGC: 57137k
    Peak memory use after GGC: 50907k
    Maximum of released memory in single GGC run: 24233k
    Garbage: 212481k
    Leak: 9801k
    Overhead: 24835k
    GGC runs: 319

comparing insn-attrtab.c compilation at -O2 level:
    Overall memory needed: 187064k -> 187052k
    Peak memory use before GGC: 57772k
    Peak memory use after GGC: 52500k
    Maximum of released memory in single GGC run: 22973k
    Garbage: 253948k
    Leak: 10889k
    Overhead: 30581k
    GGC runs: 350

comparing insn-attrtab.c compilation at -O3 level:
    Overall memory needed: 194252k -> 194256k
    Peak memory use before GGC: 69771k
    Peak memory use after GGC: 63204k
    Maximum of released memory in single GGC run: 23494k
    Garbage: 281995k
    Leak: 10925k
    Overhead: 32460k
    GGC runs: 356

comparing Gerald's testcase PR8361 compilation at -O0 level:
    Overall memory needed: 155523k -> 155507k
    Peak memory use before GGC: 89791k
    Peak memory use after GGC: 88898k
    Maximum of released memory in single GGC run: 18065k
    Garbage: 210359k
    Leak: 53116k
    Overhead: 26511k
    GGC runs: 418

comparing Gerald's testcase PR8361 compilation at -O0 -g level:
    Overall memory needed: 174663k
    Peak memory use before GGC: 101236k
    Peak memory use after GGC: 100233k
    Maximum of released memory in single GGC run: 18252k
    Garbage: 215953k
    Leak: 74940k
    Overhead: 31919k
    GGC runs: 392

comparing Gerald's testcase PR8361 compilation at -O1 level:
    Overall memory needed: 121289k -> 121281k
    Peak memory use before GGC: 88659k
    Peak memory use after GGC: 87777k
    Maximum of released memory in single GGC run: 17321k
    Garbage: 292134k -> 292134k
    Leak: 52378k
    Overhead: 30190k -> 30190k
    GGC runs: 510

comparing Gerald's testcase PR8361 compilation at -O2 level:
    Overall memory needed: 126369k
    Peak memory use before GGC: 88759k
    Peak memory use after GGC: 87878k
    Maximum of released memory in single GGC run: 17318k
    Garbage: 357871k -> 357872k
    Leak: 53383k
    Overhead: 37119k -> 37119k
    GGC runs: 591

comparing Gerald's testcase PR8361 compilation at -O3 level:
    Overall memory needed: 130761k -> 131133k
    Peak memory use before GGC: 89976k
    Peak memory use after GGC: 89082k -> 89081k
    Maximum of released memory in single GGC run: 17674k
    Garbage: 389861k -> 389943k
    Leak: 53814k -> 53813k
    Overhead: 39850k -> 39865k
    GGC runs: 612

comparing PR rtl-optimization/28071 testcase compilation at -O0 level:
    Overall memory needed: 379339k -> 379338k
    Peak memory use before GGC: 101510k
    Peak memory use after GGC: 57163k
    Maximum of released memory in single GGC run: 50582k
    Garbage: 179454k
    Leak: 6299k
    Overhead: 30876k
    GGC runs: 105

comparing PR rtl-optimization/28071 testcase compilation at -O0 -g level:
    Overall memory needed: 380143k -> 380146k
    Peak memory use before GGC: 102144k
    Peak memory use after GGC: 57797k
    Maximum of released memory in single GGC run: 50583k
    Garbage: 179559k
    Leak: 8007k
    Overhead: 31342k
    GGC runs: 110

comparing PR rtl-optimization/28071 testcase compilation at -O1 level:
  Overall memory allocated via mmap and sbrk increased from 297612k to 392896k, overall 32.02%
    Overall memory needed: 297612k -> 392896k
    Peak memory use before GGC: 81062k
    Peak memory use after GGC: 73201k
    Maximum of released memory in single GGC run: 40265k
    Garbage: 236192k
    Leak: 15698k
    Overhead: 31631k
    GGC runs: 103

comparing PR rtl-optimization/28071 testcase compilation at -O2 level:
  Overall memory allocated via mmap and sbrk increased from 270588k to 296924k, overall 9.73%
    Overall memory needed: 270588k -> 296924k
    Peak memory use before GGC: 78187k
    Peak memory use after GGC: 73201k
    Maximum of released memory in single GGC run: 33868k
    Garbage: 246253k
    Leak: 15788k
    Overhead: 33704k
    GGC runs: 116

comparing PR rtl-optimization/28071 testcase compilation at -O3 -fno-tree-pre -fno-tree-fre level:
  Overall memory allocated via mmap and sbrk increased from 1056156k to 1312880k, overall 24.31%
    Overall memory needed: 1056156k -> 1312880k
    Peak memory use before GGC: 136332k
    Peak memory use after GGC: 126665k
    Maximum of released memory in single GGC run: 68197k
    Garbage: 364426k
    Leak: 26574k
    Overhead: 46311k
    GGC runs: 102
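
The "overall" percentages quoted above can be checked by hand. What follows is only a sketch, not the testing script's code: the function name is hypothetical, and dividing by the smaller of the two values is an assumption inferred from the figures in this report (it is the only choice that yields both -5.04% for the decrease and 32.02% for the increase).

```python
def mmap_sbrk_change_percent(before_k: int, after_k: int) -> float:
    """Signed percentage change, measured against the smaller of the two values.

    Assumption: the choice of denominator is inferred from the numbers in
    this report, not taken from the testing script's source.
    """
    delta = after_k - before_k
    return 100.0 * delta / min(before_k, after_k)


if __name__ == "__main__":
    # (before, after) pairs copied from the "Overall memory needed" lines
    # that carry an "overall" percentage in this report.
    samples = [
        (153536, 146168),    # insn-attrtab.c at -O1
        (297612, 392896),    # PR rtl-optimization/28071 at -O1
        (270588, 296924),    # PR rtl-optimization/28071 at -O2
        (1056156, 1312880),  # PR rtl-optimization/28071 at -O3 -fno-tree-pre -fno-tree-fre
    ]
    for before, after in samples:
        change = mmap_sbrk_change_percent(before, after)
        print(f"{before}k -> {after}k: {change:.2f}%")
```

Measured against the old value instead, the insn-attrtab.c decrease would come out near -4.80%, which does not match the reported figure.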

Head of the ChangeLog is:

--- /usr/src/SpecTests/sandbox-britten-memory/x86_64/mem-result/ChangeLog	2007-10-05 07:54:54.000000000 +0000
+++ /usr/src/SpecTests/sandbox-britten-memory/gcc/gcc/ChangeLog	2007-10-06 00:13:09.000000000 +0000
@@ -1,3 +1,85 @@
+2007-10-05  Hans-Peter Nilsson  <hp@axis.com>
+
+	* gthr-single.h: Revert last change.
+
+2007-10-05  Michael Matz  <matz@suse.de>
+
+	PR middle-end/33667
+	* lower-subreg.c (decompose_multiword_subregs): Use
+	validate_unshare_change().
+
+2007-10-05  Peter Bergner  <bergner@vnet.ibm.com>
+
+	* ra-conflict.c: Include "sparseset.h".
+	(conflicts): Change to HOST_WIDEST_FAST_INT.
+	(allocnos_live): Redefine variable as a sparseset.
+	(SET_ALLOCNO_LIVE, CLEAR_ALLOCNO_LIVE, GET_ALLOCNO_LIVE): Delete macros.
+	(allocno_row_words): Removed global variable.
+	(partial_bitnum, max_bitnum, adjacency_pool, adjacency): New variables.
+	(CONFLICT_BITNUM, CONFLICT_BITNUM_FAST): New defines.
+	(conflict_p, set_conflict_p, set_conflicts_p): New functions.
+	(record_one_conflict_between_regnos): Cache allocno values and reuse.
+	Use set_conflict_p.
+	(record_one_conflict): Update uses of allocnos_live to use
+	the sparseset routines.  Use set_conflicts_p.
+	(mark_reg_store): Likewise.
+	(set_reg_in_live): Likewise.
+	(global_conflicts): Update uses of allocnos_live.
+	Use the new adjacency list to visit an allocno's neighbors
+	rather than iterating over all possible allocnos.
+	Call set_conflicts_p to setup conflicts rather than adding
+	them manually.
+	* global.c: Comments updated.  
+	(CONFLICTP): Delete define.
+	(regno_compare): New function.  Add prototype.
+	(global_alloc): Sort the allocno to regno mapping according to
+	which basic blocks the regnos are referenced in.  Modify the
+	conflict bit matrix to a compressed triangular bitmatrix.
+	Only allocate the conflict bit matrix and adjacency lists if
+	we are actually going to allocate something.
+	(expand_preferences): Use conflict_p.  Update uses of allocnos_live.
+	(prune_preferences): Use the FOR_EACH_CONFLICT macro to visit an
+	allocno's neighbors rather than iterating over all possible allocnos.
+	(mirror_conflicts): Removed function.
+	(dump_conflicts): Iterate over regnos rather than allocnos so
+	that all dump output will be sorted by regno number.
+	Use the FOR_EACH_CONFLICT macro.
+	* ra.h: Comments updated.
+	(conflicts): Update prototype to HOST_WIDEST_FAST_INT.
+	(partial_bitnum, max_bitnum, adjacency, adjacency_pool): Add prototypes.
+	(ADJACENCY_VEC_LENGTH, FOR_EACH_CONFLICT): New defines.
+	(adjacency_list_d, adjacency_iterator_d): New types.
+	(add_neighbor, adjacency_iter_init, adjacency_iter_done,
+	adjacency_iter_next, regno_basic_block): New static inline functions.
+	(EXECUTE_IF_SET_IN_ALLOCNO_SET): Removed define.
+	(conflict_p): Add function prototype.
+	* sparseset.h, sparseset.c: New files.
+	* Makefile.in (OBJS-common): Add sparseset.o.
+	(sparseset.o): New rule.
+
+2007-10-05  Richard Guenther  <rguenther@suse.de>
+
+	PR middle-end/33666
+	* fold-const.c (fold_unary): Do not fold (long long)(int)ptr
+	to (long long)ptr.
+
+2007-10-05  Michael Matz  <matz@suse.de>
+
+	PR inline-asm/33600
+	* function.c (match_asm_constraints_1): Check for input
+	being used in the outputs.
+
+2007-10-05  Richard Guenther  <rguenther@suse.de>
+
+	* tree-cfg.c (verify_gimple_expr): Accept OBJ_TYPE_REF.
+
+2007-10-05  Richard Sandiford  <rsandifo@nildram.co.uk>
+
+	PR target/33635
+	* config/mips/mips.c (mips_register_move_cost): Rewrite to use
+	subset checks.  Make the cost of FPR -> FPR moves depend on
+	mips_mode_ok_for_mov_fmt_p.
+
 2007-10-04  Doug Kwan  <dougkwan@google.com>
 
 	* gthr-posix.h (__gthread_cond_broadcast, __gthread_cond_wait,


The results can be reproduced by building a compiler with

--enable-gather-detailed-mem-stats targeting x86-64

and compiling preprocessed combine.c or the testcase from PR8361 with:

-fmem-report --param=ggc-min-heapsize=1024 --param=ggc-min-expand=1 -Ox -Q

The memory consumption summary appears in the dump after the detailed listing
of the places where the memory is allocated.  Peak memory consumption is
computed by taking the maximal value over the {GC XXXX -> YYYY} reports.
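
The peak computation described above can be sketched roughly as follows. This is an illustration, not the testing script itself: the function name is hypothetical, and the exact line shape "{GC 1180k -> 1079k}" (decimal values with a "k" suffix) is an assumption about the dump format, so the pattern may need adjusting for a real dump.

```python
import re

# Assumed report shape: "{GC 1180k -> 1079k}". The real dump may differ.
GC_REPORT = re.compile(r"\{GC\s+(\d+)k\s*->\s*(\d+)k\}")


def peak_ggc_use(dump_text):
    """Return (peak_before_k, peak_after_k), or None if no GC report is found.

    peak_before_k is the largest XXXX seen and peak_after_k the largest YYYY
    seen across all "{GC XXXX -> YYYY}" reports in the text.
    """
    pairs = [(int(before), int(after))
             for before, after in GC_REPORT.findall(dump_text)]
    if not pairs:
        return None
    peak_before = max(before for before, _ in pairs)
    peak_after = max(after for _, after in pairs)
    return peak_before, peak_after


if __name__ == "__main__":
    # Made-up sample text in the assumed format, for illustration only.
    sample = "parsing {GC 900k -> 850k} expanding {GC 1180k -> 1079k} done"
    print(peak_ggc_use(sample))
```

The two maxima correspond to the "Peak memory use before GGC" and "Peak memory use after GGC" lines in the tables above.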

Your testing script.


