A recent patch increased GCC's memory consumption!

gcctest@suse.de
Tue Dec 23 19:59:00 GMT 2008


Hi,

I am a friendly script caring about memory consumption in GCC.  Please
contact jh@suse.cz if something is going wrong.

Comparing memory consumption on compilation of combine.i, insn-attrtab.i,
and generate-3.4.ii I got:


comparing empty function compilation at -O0 level:
    Overall memory needed: 8207k
    Peak memory use before GGC: 1291k
    Peak memory use after GGC: 1217k
    Maximum of released memory in single GGC run: 134k
    Garbage: 218k
    Leak: 1221k
    Overhead: 136k
    GGC runs: 4
    Pre-IPA-Garbage: 207k
    Pre-IPA-Leak: 1224k
    Pre-IPA-Overhead: 135k
    Post-IPA-Garbage: 207k
    Post-IPA-Leak: 1224k
    Post-IPA-Overhead: 135k

comparing empty function compilation at -O0 -g level:
    Overall memory needed: 8451k
    Peak memory use before GGC: 1319k
    Peak memory use after GGC: 1245k
    Maximum of released memory in single GGC run: 133k
    Garbage: 220k
    Leak: 1254k
    Overhead: 141k
    GGC runs: 4
    Pre-IPA-Garbage: 207k
    Pre-IPA-Leak: 1224k
    Pre-IPA-Overhead: 135k
    Post-IPA-Garbage: 207k
    Post-IPA-Leak: 1224k
    Post-IPA-Overhead: 135k

comparing empty function compilation at -O1 level:
    Overall memory needed: 8239k
    Peak memory use before GGC: 1291k
    Peak memory use after GGC: 1217k
    Maximum of released memory in single GGC run: 134k
    Garbage: 221k
    Leak: 1221k
    Overhead: 137k
    GGC runs: 4
    Pre-IPA-Garbage: 207k
    Pre-IPA-Leak: 1224k
    Pre-IPA-Overhead: 135k
    Post-IPA-Garbage: 207k
    Post-IPA-Leak: 1224k
    Post-IPA-Overhead: 135k

comparing empty function compilation at -O2 level:
    Overall memory needed: 8467k
    Peak memory use before GGC: 1291k
    Peak memory use after GGC: 1218k
    Maximum of released memory in single GGC run: 135k
    Garbage: 226k
    Leak: 1221k
    Overhead: 138k
    GGC runs: 4
    Pre-IPA-Garbage: 207k
    Pre-IPA-Leak: 1224k
    Pre-IPA-Overhead: 135k
    Post-IPA-Garbage: 207k
    Post-IPA-Leak: 1224k
    Post-IPA-Overhead: 135k

comparing empty function compilation at -O3 level:
    Overall memory needed: 8471k
    Peak memory use before GGC: 1291k
    Peak memory use after GGC: 1218k
    Maximum of released memory in single GGC run: 135k
    Garbage: 226k
    Leak: 1221k
    Overhead: 138k
    GGC runs: 4
    Pre-IPA-Garbage: 207k
    Pre-IPA-Leak: 1224k
    Pre-IPA-Overhead: 135k
    Post-IPA-Garbage: 207k
    Post-IPA-Leak: 1224k
    Post-IPA-Overhead: 135k

comparing combine.c compilation at -O0 level:
    Overall memory needed: 31991k
    Peak memory use before GGC: 18026k
    Peak memory use after GGC: 17809k
    Maximum of released memory in single GGC run: 1839k
    Garbage: 39472k
    Leak: 5816k
    Overhead: 5219k
    GGC runs: 338
    Pre-IPA-Garbage: 12410k
    Pre-IPA-Leak: 19357k
    Pre-IPA-Overhead: 2559k
    Post-IPA-Garbage: 12410k
    Post-IPA-Leak: 19357k
    Post-IPA-Overhead: 2559k

comparing combine.c compilation at -O0 -g level:
    Overall memory needed: 34015k
    Peak memory use before GGC: 20026k
    Peak memory use after GGC: 19665k
    Maximum of released memory in single GGC run: 1849k
    Garbage: 39775k
    Leak: 9106k
    Overhead: 6037k
    GGC runs: 320
    Pre-IPA-Garbage: 12509k
    Pre-IPA-Leak: 21631k
    Pre-IPA-Overhead: 3049k
    Post-IPA-Garbage: 12509k
    Post-IPA-Leak: 21631k
    Post-IPA-Overhead: 3049k

comparing combine.c compilation at -O1 level:
    Overall memory needed: 30963k -> 30871k
    Peak memory use before GGC: 15690k
    Peak memory use after GGC: 15514k
    Maximum of released memory in single GGC run: 1340k
    Garbage: 46861k -> 46862k
    Leak: 5780k
    Overhead: 6012k -> 6012k
    GGC runs: 403
    Pre-IPA-Garbage: 13151k
    Pre-IPA-Leak: 16852k
    Pre-IPA-Overhead: 2472k
    Post-IPA-Garbage: 13151k
    Post-IPA-Leak: 16852k
    Post-IPA-Overhead: 2472k

comparing combine.c compilation at -O2 level:
    Overall memory needed: 31259k -> 31411k
    Peak memory use before GGC: 15836k
    Peak memory use after GGC: 15672k
    Maximum of released memory in single GGC run: 1353k
    Garbage: 60859k -> 60861k
    Leak: 5811k
    Overhead: 8044k -> 8045k
    GGC runs: 467
    Pre-IPA-Garbage: 13314k
    Pre-IPA-Leak: 16934k
    Pre-IPA-Overhead: 2492k
    Post-IPA-Garbage: 13314k
    Post-IPA-Leak: 16934k
    Post-IPA-Overhead: 2492k

comparing combine.c compilation at -O3 level:
    Overall memory needed: 31863k -> 32115k
    Peak memory use before GGC: 16004k
    Peak memory use after GGC: 15766k
    Maximum of released memory in single GGC run: 1657k
    Garbage: 73731k -> 73732k
    Leak: 7179k
    Overhead: 9532k -> 9532k
    GGC runs: 496
    Pre-IPA-Garbage: 13314k
    Pre-IPA-Leak: 16934k
    Pre-IPA-Overhead: 2492k
    Post-IPA-Garbage: 13314k
    Post-IPA-Leak: 16934k
    Post-IPA-Overhead: 2492k

comparing insn-attrtab.c compilation at -O0 level:
    Overall memory needed: 155455k
    Peak memory use before GGC: 65504k
    Peak memory use after GGC: 53919k
    Maximum of released memory in single GGC run: 27354k
    Garbage: 131299k
    Leak: 8497k
    Overhead: 15723k
    GGC runs: 264
    Pre-IPA-Garbage: 38215k
    Pre-IPA-Leak: 55487k
    Pre-IPA-Overhead: 8223k
    Post-IPA-Garbage: 38215k
    Post-IPA-Leak: 55487k
    Post-IPA-Overhead: 8223k

comparing insn-attrtab.c compilation at -O0 -g level:
    Overall memory needed: 156727k
    Peak memory use before GGC: 66777k
    Peak memory use after GGC: 55193k
    Maximum of released memory in single GGC run: 27354k
    Garbage: 131779k
    Leak: 10143k
    Overhead: 16179k
    GGC runs: 257
    Pre-IPA-Garbage: 38272k
    Pre-IPA-Leak: 57029k
    Pre-IPA-Overhead: 8558k
    Post-IPA-Garbage: 38272k
    Post-IPA-Leak: 57029k
    Post-IPA-Overhead: 8558k

comparing insn-attrtab.c compilation at -O1 level:
    Overall memory needed: 133459k
    Peak memory use before GGC: 50313k
    Peak memory use after GGC: 43428k
    Maximum of released memory in single GGC run: 23088k
    Garbage: 181524k -> 181524k
    Leak: 7873k
    Overhead: 24500k -> 24500k
    GGC runs: 300
    Pre-IPA-Garbage: 43193k
    Pre-IPA-Leak: 43086k
    Pre-IPA-Overhead: 7642k
    Post-IPA-Garbage: 43193k
    Post-IPA-Leak: 43086k
    Post-IPA-Overhead: 7642k

comparing insn-attrtab.c compilation at -O2 level:
    Overall memory needed: 148831k -> 148835k
    Peak memory use before GGC: 50274k
    Peak memory use after GGC: 45087k
    Maximum of released memory in single GGC run: 18101k
    Garbage: 205184k -> 205184k
    Leak: 15535k
    Overhead: 30000k -> 30000k
    GGC runs: 327
    Pre-IPA-Garbage: 43265k
    Pre-IPA-Leak: 43092k
    Pre-IPA-Overhead: 7651k
    Post-IPA-Garbage: 43265k
    Post-IPA-Leak: 43092k
    Post-IPA-Overhead: 7651k

comparing insn-attrtab.c compilation at -O3 level:
    Overall memory needed: 162695k -> 162683k
    Peak memory use before GGC: 61897k
    Peak memory use after GGC: 58799k
    Maximum of released memory in single GGC run: 24088k
    Garbage: 243061k -> 243061k
    Leak: 7899k
    Overhead: 33465k -> 33465k
    GGC runs: 337
    Pre-IPA-Garbage: 43265k
    Pre-IPA-Leak: 43092k
    Pre-IPA-Overhead: 7651k
    Post-IPA-Garbage: 43265k
    Post-IPA-Leak: 43092k
    Post-IPA-Overhead: 7651k

comparing Gerald's testcase PR8361 compilation at -O0 level:
    Overall memory needed: 149079k -> 149091k
    Peak memory use before GGC: 81861k
    Peak memory use after GGC: 81050k
    Maximum of released memory in single GGC run: 14466k
    Garbage: 203499k
    Leak: 51679k
    Overhead: 26794k
    GGC runs: 415
    Pre-IPA-Garbage: 110428k
    Pre-IPA-Leak: 87218k
    Pre-IPA-Overhead: 14731k
    Post-IPA-Garbage: 110428k
    Post-IPA-Leak: 87218k
    Post-IPA-Overhead: 14731k

comparing Gerald's testcase PR8361 compilation at -O0 -g level:
    Overall memory needed: 167171k -> 167175k
    Peak memory use before GGC: 95524k
    Peak memory use after GGC: 94578k
    Maximum of released memory in single GGC run: 14899k
    Garbage: 209181k
    Leak: 78420k
    Overhead: 33467k
    GGC runs: 389
    Pre-IPA-Garbage: 111050k
    Pre-IPA-Leak: 103733k
    Pre-IPA-Overhead: 18235k
    Post-IPA-Garbage: 111050k
    Post-IPA-Leak: 103733k
    Post-IPA-Overhead: 18235k

comparing Gerald's testcase PR8361 compilation at -O1 level:
    Overall memory needed: 110702k -> 110705k
    Peak memory use before GGC: 83518k
    Peak memory use after GGC: 82675k
    Maximum of released memory in single GGC run: 14867k
    Garbage: 279661k -> 279682k
    Leak: 48421k
    Overhead: 31431k -> 31435k
    GGC runs: 502
    Pre-IPA-Garbage: 158188k
    Pre-IPA-Leak: 87362k
    Pre-IPA-Overhead: 19750k
    Post-IPA-Garbage: 158188k
    Post-IPA-Leak: 87362k
    Post-IPA-Overhead: 19750k

comparing Gerald's testcase PR8361 compilation at -O2 level:
    Overall memory needed: 111530k -> 111517k
    Peak memory use before GGC: 85274k
    Peak memory use after GGC: 84430k
    Maximum of released memory in single GGC run: 14870k
    Garbage: 335618k -> 335649k
    Leak: 48435k
    Overhead: 38111k -> 38117k
    GGC runs: 574
    Pre-IPA-Garbage: 162110k
    Pre-IPA-Leak: 87734k
    Pre-IPA-Overhead: 20224k
    Post-IPA-Garbage: 162110k
    Post-IPA-Leak: 87734k
    Post-IPA-Overhead: 20224k

comparing Gerald's testcase PR8361 compilation at -O3 level:
    Overall memory needed: 112146k -> 112161k
    Peak memory use before GGC: 85898k
    Peak memory use after GGC: 85043k
    Maximum of released memory in single GGC run: 14870k
    Garbage: 367911k -> 367896k
    Leak: 48456k -> 48440k
    Overhead: 41454k -> 41456k
    GGC runs: 604 -> 603
    Pre-IPA-Garbage: 162190k
    Pre-IPA-Leak: 88392k
    Pre-IPA-Overhead: 20276k
    Post-IPA-Garbage: 162190k
    Post-IPA-Leak: 88392k
    Post-IPA-Overhead: 20276k

comparing PR rtl-optimization/28071 testcase compilation at -O0 level:
    Overall memory needed: 361961k -> 361965k
    Peak memory use before GGC: 78540k
    Peak memory use after GGC: 49474k
    Maximum of released memory in single GGC run: 38208k
    Garbage: 144672k
    Leak: 7110k
    Overhead: 24889k
    GGC runs: 87
    Pre-IPA-Garbage: 12561k
    Pre-IPA-Leak: 20190k
    Pre-IPA-Overhead: 2241k
    Post-IPA-Garbage: 12561k
    Post-IPA-Leak: 20190k
    Post-IPA-Overhead: 2241k

comparing PR rtl-optimization/28071 testcase compilation at -O0 -g level:
    Overall memory needed: 362749k
    Peak memory use before GGC: 79237k
    Peak memory use after GGC: 50171k
    Maximum of released memory in single GGC run: 38192k
    Garbage: 144774k
    Leak: 9152k
    Overhead: 25473k
    GGC runs: 93
    Pre-IPA-Garbage: 12569k
    Pre-IPA-Leak: 20439k
    Pre-IPA-Overhead: 2295k
    Post-IPA-Garbage: 12569k
    Post-IPA-Leak: 20439k
    Post-IPA-Overhead: 2295k

comparing PR rtl-optimization/28071 testcase compilation at -O1 level:
    Overall memory needed: 227399k -> 227395k
    Peak memory use before GGC: 73624k
    Peak memory use after GGC: 66143k
    Maximum of released memory in single GGC run: 34735k
    Garbage: 222524k -> 222524k
    Leak: 7551k
    Overhead: 30652k -> 30653k
    GGC runs: 96
    Pre-IPA-Garbage: 48348k
    Pre-IPA-Leak: 63005k
    Pre-IPA-Overhead: 8797k
    Post-IPA-Garbage: 48348k
    Post-IPA-Leak: 63005k
    Post-IPA-Overhead: 8797k

comparing PR rtl-optimization/28071 testcase compilation at -O2 level:
  Overall memory allocated via mmap and sbrk increased from 358483k to 505351k, overall 40.97%
  Amount of produced GGC garbage increased from 251050k to 253877k, overall 1.13%
    Overall memory needed: 358483k -> 505351k
    Peak memory use before GGC: 73624k
    Peak memory use after GGC: 66143k
    Maximum of released memory in single GGC run: 36073k
    Garbage: 251050k -> 253877k
    Leak: 7553k
    Overhead: 36773k -> 37323k
    GGC runs: 105
    Pre-IPA-Garbage: 107058k
    Pre-IPA-Leak: 75901k
    Pre-IPA-Overhead: 14919k
    Post-IPA-Garbage: 107058k
    Post-IPA-Leak: 75901k
    Post-IPA-Overhead: 14919k

comparing PR rtl-optimization/28071 testcase compilation at -O3 -fno-tree-pre -fno-tree-fre level:
    Overall memory needed: 1026639k -> 1026767k
    Peak memory use before GGC: 141898k
    Peak memory use after GGC: 129175k
    Maximum of released memory in single GGC run: 62763k -> 62764k
    Garbage: 366976k -> 366978k
    Leak: 9099k
    Overhead: 45157k -> 45158k
    GGC runs: 102
    Pre-IPA-Garbage: 107058k
    Pre-IPA-Leak: 75901k
    Pre-IPA-Overhead: 14919k
    Post-IPA-Garbage: 107058k
    Post-IPA-Leak: 75901k
    Post-IPA-Overhead: 14919k
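
As a cross-check, the percentages in the two increase lines for the -O2 run
of the PR rtl-optimization/28071 testcase follow from the before/after values
as a plain relative change.  A minimal Python sketch; the helper name
percent_increase is hypothetical and is not part of the testing script:

```python
def percent_increase(before_k: int, after_k: int) -> float:
    """Relative growth from before_k to after_k, in percent."""
    return (after_k - before_k) / before_k * 100.0

# Values copied from the -O2 run of the PR rtl-optimization/28071 testcase.
overall = percent_increase(358483, 505351)   # Overall memory needed
garbage = percent_increase(251050, 253877)   # Garbage

print(f"Overall memory needed: {overall:.2f}%")
print(f"Garbage: {garbage:.2f}%")
```

Rounded to two decimals, this gives 40.97% and 1.13%, matching the report.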

Head of the ChangeLog is:

--- /usr/src/SpecTests/sandbox-britten-memory/x86_64/mem-result/ChangeLog	2008-12-22 23:41:07.000000000 +0000
+++ /usr/src/SpecTests/sandbox-britten-memory/gcc/gcc/ChangeLog	2008-12-23 16:25:49.000000000 +0000
@@ -1,5 +1,75 @@
+2008-12-23  Andrew Pinski  <pinski@gmail.com>
+
+	PR middle-end/38590
+	* fold-const.c (fold_binary): Call fold_convert on arguments to
+	fold_build2 for negative divide optimization.
+
 2008-12-23  Jakub Jelinek  <jakub@redhat.com>
 
+	PR middle-end/31150
+	* dse.c (struct store_info): Add const_rhs field.
+	(clear_rhs_from_active_local_stores): Clear also const_rhs.
+	(record_store): Try also cselib_expand_value_rtx to get a constant.
+	(find_shift_sequence, get_stored_val): Use const_rhs instead of
+	rhs if worthwhile.
+	* cselib.c (cselib_record_sets): If !cselib_record_memory and
+	there is just one set from read-only MEM, look at REG_EQUAL or
+	REG_EQUIV note.
+
+	* dse.c (struct store_info): Add redundant_reason field.
+	(record_store): When storing the same constant as has been
+	stored by an earlier store, set redundant_reason field
+	to the earlier store's insn_info_t.  Don't delete cannot_delete
+	insns.
+	(find_shift_sequence): Remove read_info argument, add read_mode
+	and require_cst arguments.  Return early if require_cst and
+	constant wouldn't be returned.
+	(get_stored_val): New function.
+	(replace_read): Use it.
+	(scan_insn): Put even cannot_delete insns with exactly 1 store
+	into active_local_stores.
+	(dse_step1): Don't delete cannot_delete insns.  Remove redundant
+	constant stores if contains_cselib_groups and earlier store storing
+	the same value hasn't been eliminated.
+	(dse_step6): Renamed to dse_step7.  New function.
+	(dse_step7): Renamed from dse_step6.
+	(rest_of_handle_dse): Call dse_step6 and dse_step7 at the end.
+	* cselib.c (cselib_expand_value_rtx): Don't wrap CONST_INTs
+	into CONST unless really necessary.  Handle SUBREG, unary,
+	ternary, bitfield and compares specially, to be able to simplify
+	operations on constants.
+	(expand_loc): Try to optimize LO_SUM.
+
+	* dse.c (get_call_args): New function.
+	(scan_insn): Don't handle BUILT_IN_BZERO.  For memset, attempt
+	to get call arguments and if successful and both len and val are
+	constants, handle the call as (mem:BLK) (const_int) store.
+
+	* dse.c (struct store_info): Add is_large bool field, change
+	positions_needed into a union of a bitmask and bitmap + count.
+	(free_store_info): Free bitmap if is_large.
+	(set_usage_bits): Don't look at stores where
+	offset + width >= MAX_OFFSET.
+	(set_position_unneeded, set_all_positions_unneeded,
+	any_positions_needed_p, all_positions_needed_p): New static inline
+	functions.
+	(record_store): Handle BLKmode stores of CONST_INT, if
+	MEM_SIZE is set on the MEM.  Use the new positions_needed
+	accessor inlines.
+	(replace_read): Handle reads from BLKmode CONST_INT stores.
+	(check_mem_read_rtx): Use all_positions_needed_p function.
+	(dse_step1): Free large positions_needed bitmaps and clear is_large.
+
+	* dse.c (struct store_info): Change begin and end types to
+	HOST_WIDE_INT.
+
+	* dse.c (record_store): Fix check for unused store.
+
+	* expr.c (block_clear_fn): No longer static.
+	* expr.h (block_clear_fn): Declare.
+	* dse.c (scan_insn): Memset and bzero can just read their
+	arguments.
+
 	* config/i386/i386.c (expand_setmem_via_rep_stos): Add ORIG_VALUE
 	argument.  If ORIG_VALUE is const0_rtx and COUNT is constant,
 	set MEM_SIZE on DESTMEM.


The results can be reproduced by building a compiler with

--enable-gather-detailed-mem-stats targeting x86-64

and compiling preprocessed combine.c or the testcase from PR8632 with:

-fmem-report --param=ggc-min-heapsize=1024 --param=ggc-min-expand=1 -Ox -Q

The memory consumption summary appears in the dump after the detailed listing
of the places where memory is allocated.  Peak memory consumption is actually
computed by looking for the maximal value in the {GC XXXX -> YYYY} reports.
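
The peak computation described above can be sketched in Python.  This assumes
the -Q output contains collector markers shaped like {GC 1291k -> 1217k}; the
exact spacing and the "k" suffix are assumptions, and the function name and
the sample text are made up for illustration, not taken from the testing
script:

```python
import re

# Assumed marker shape: "{GC <before>k -> <after>k}".
GC_MARKER = re.compile(r"\{GC\s+(\d+)k\s+->\s+(\d+)k\}")

def peak_memory(dump: str):
    """Return (peak before GGC, peak after GGC) in kilobytes, or None."""
    pairs = [(int(b), int(a)) for b, a in GC_MARKER.findall(dump)]
    if not pairs:
        return None  # no collection was reported in this dump
    return max(b for b, _ in pairs), max(a for _, a in pairs)

# Illustrative input only; a real dump interleaves markers with pass names.
sample = "foo {GC 1100k -> 1000k} bar {GC 1291k -> 1217k} baz {GC 1250k -> 1190k}"
print(peak_memory(sample))
```

The two maxima are taken independently, so the reported peaks before and
after GGC need not come from the same collection.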

Your testing script.


