This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: [PING*2][PATCH][Updated][STAGE2] New interference graph implementation.


> With Kenny's commit of his new interference graph builder, my new interference
> graph representation patch needed updating due to some of the patched code
> being moved to different files and was changed with Kenny's patch.  The original
> patch was posted during stage2 here:
> 
>     http://gcc.gnu.org/ml/gcc-patches/2007-09/msg00529.html
> 
> The updated patch and ChangeLog is here.  With help from Kenny, this has
> bootstrapped and regtested with no regressions on powerpc64-linux (ran the
> testsuite in both 32-bit and 64-bit modes), x86-{32,64}-linux and
> ia-64-linux.
> 
> Is this ok for mainline?


Hi,
this patch seems to cause some noticeable memory usage regressions (Kenny's
original ra-conflict patch caused a few of them too).

Honza

comparing PR rtl-optimization/28071 testcase compilation at -O1 level:
  Overall memory allocated via mmap and sbrk increased from 297612k to 392896k, overall 32.02%
    Overall memory needed: 297612k -> 392896k
    Peak memory use before GGC: 81062k
    Peak memory use after GGC: 73201k
    Maximum of released memory in single GGC run: 40265k
    Garbage: 236192k
    Leak: 15698k
    Overhead: 31631k
    GGC runs: 103

comparing PR rtl-optimization/28071 testcase compilation at -O2 level:
  Overall memory allocated via mmap and sbrk increased from 270588k to 296924k, overall 9.73%
    Overall memory needed: 270588k -> 296924k
    Peak memory use before GGC: 78187k
    Peak memory use after GGC: 73201k
    Maximum of released memory in single GGC run: 33868k
    Garbage: 246253k
    Leak: 15788k
    Overhead: 33704k
    GGC runs: 116

comparing PR rtl-optimization/28071 testcase compilation at -O3 -fno-tree-pre -fno-tree-fre level:
  Overall memory allocated via mmap and sbrk increased from 1056156k to 1312880k, overall 24.31%
    Overall memory needed: 1056156k -> 1312880k
    Peak memory use before GGC: 136332k
    Peak memory use after GGC: 126665k
    Maximum of released memory in single GGC run: 68197k
    Garbage: 364426k
    Leak: 26574k
    Overhead: 46311k
    GGC runs: 102
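
As a quick sanity check, the percentages quoted above can be recomputed from
the "Overall memory needed" before/after figures.  This is only a sketch; the
helper name percent_increase is made up for illustration and is not part of
the tester's script.

```python
def percent_increase(before_k, after_k):
    """Relative growth from before_k to after_k, in percent, rounded to 2 places."""
    return round((after_k - before_k) / before_k * 100, 2)


# "Overall memory needed" figures copied from the three reports above (in k).
reports = {
    "-O1": (297612, 392896),
    "-O2": (270588, 296924),
    "-O3 -fno-tree-pre -fno-tree-fre": (1056156, 1312880),
}

for level, (before_k, after_k) in reports.items():
    print(f"{level}: {before_k}k -> {after_k}k, "
          f"+{percent_increase(before_k, after_k)}%")
```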

Head of the ChangeLog is:

--- /usr/src/SpecTests/sandbox-britten-memory/x86_64/mem-result/ChangeLog	2007-10-05 07:54:54.000000000 +0000
+++ /usr/src/SpecTests/sandbox-britten-memory/gcc/gcc/ChangeLog	2007-10-06 00:13:09.000000000 +0000
@@ -1,3 +1,85 @@
+2007-10-05  Hans-Peter Nilsson  <hp@axis.com>
+
+	* gthr-single.h: Revert last change.
+
+2007-10-05  Michael Matz  <matz@suse.de>
+
+	PR middle-end/33667
+	* lower-subreg.c (decompose_multiword_subregs): Use
+	validate_unshare_change().
+
+2007-10-05  Peter Bergner  <bergner@vnet.ibm.com>
+
+	* ra-conflict.c: Include "sparseset.h".
+	(conflicts): Change to HOST_WIDEST_FAST_INT.
+	(allocnos_live): Redefine variable as a sparseset.
+	(SET_ALLOCNO_LIVE, CLEAR_ALLOCNO_LIVE, GET_ALLOCNO_LIVE): Delete macros.
+	(allocno_row_words): Removed global variable.
+	(partial_bitnum, max_bitnum, adjacency_pool, adjacency): New variables.
+	(CONFLICT_BITNUM, CONFLICT_BITNUM_FAST): New defines.
+	(conflict_p, set_conflict_p, set_conflicts_p): New functions.
+	(record_one_conflict_between_regnos): Cache allocno values and reuse.
+	Use set_conflict_p.
+	(record_one_conflict): Update uses of allocnos_live to use
+	the sparseset routines.  Use set_conflicts_p.
+	(mark_reg_store): Likewise.
+	(set_reg_in_live): Likewise.
+	(global_conflicts): Update uses of allocnos_live.
+	Use the new adjacency list to visit an allocno's neighbors
+	rather than iterating over all possible allocnos.
+	Call set_conflicts_p to setup conflicts rather than adding
+	them manually.
+	* global.c: Comments updated.  
+	(CONFLICTP): Delete define.
+	(regno_compare): New function.  Add prototype.
+	(global_alloc): Sort the allocno to regno mapping according to
+	which basic blocks the regnos are referenced in.  Modify the
+	conflict bit matrix to a compressed triangular bitmatrix.
+	Only allocate the conflict bit matrix and adjacency lists if
+	we are actually going to allocate something.
+	(expand_preferences): Use conflict_p.  Update uses of allocnos_live.
+	(prune_preferences): Use the FOR_EACH_CONFLICT macro to visit an
+	allocno's neighbors rather than iterating over all possible allocnos.
+	(mirror_conflicts): Removed function.
+	(dump_conflicts): Iterate over regnos rather than allocnos so
+	that all dump output will be sorted by regno number.
+	Use the FOR_EACH_CONFLICT macro.
+	* ra.h: Comments updated.
+	(conflicts): Update prototype to HOST_WIDEST_FAST_INT.
+	(partial_bitnum, max_bitnum, adjacency, adjacency_pool): Add prototypes.
+	(ADJACENCY_VEC_LENGTH, FOR_EACH_CONFLICT): New defines.
+	(adjacency_list_d, adjacency_iterator_d): New types.
+	(add_neighbor, adjacency_iter_init, adjacency_iter_done,
+	adjacency_iter_next, regno_basic_block): New static inline functions.
+	(EXECUTE_IF_SET_IN_ALLOCNO_SET): Removed define.
+	(conflict_p): Add function prototype.
+	* sparseset.h, sparseset.c: New files.
+	* Makefile.in (OBJS-common): Add sparseset.o.
+	(sparseset.o): New rule.
+
+2007-10-05  Richard Guenther  <rguenther@suse.de>
+
+	PR middle-end/33666
+	* fold-const.c (fold_unary): Do not fold (long long)(int)ptr
+	to (long long)ptr.
+
+2007-10-05  Michael Matz  <matz@suse.de>
+
+	PR inline-asm/33600
+	* function.c (match_asm_constraints_1): Check for input
+	being used in the outputs.
+
+2007-10-05  Richard Guenther  <rguenther@suse.de>
+
+	* tree-cfg.c (verify_gimple_expr): Accept OBJ_TYPE_REF.
+
+2007-10-05  Richard Sandiford  <rsandifo@nildram.co.uk>
+
+	PR target/33635
+	* config/mips/mips.c (mips_register_move_cost): Rewrite to use
+	subset checks.  Make the cost of FPR -> FPR moves depend on
+	mips_mode_ok_for_mov_fmt_p.
+
 2007-10-04  Doug Kwan  <dougkwan@google.com>
 
 	* gthr-posix.h (__gthread_cond_broadcast, __gthread_cond_wait,
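
For readers unfamiliar with the "triangular bitmatrix" mentioned in the
global.c entry above: a symmetric conflict relation only needs one bit per
unordered pair, so roughly half of a square matrix can be dropped.  The sketch
below shows that general idea only.  It is not the code from the patch, the
class and method names are hypothetical, and the actual bit layout used by
CONFLICT_BITNUM may well differ.

```python
class TriangularBitMatrix:
    """Symmetric relation over n items, storing one bit per pair low < high."""

    def __init__(self, n):
        self.n = n
        # n*(n-1)/2 unordered pairs, packed eight bits per byte.
        self.bits = bytearray((n * (n - 1) // 2 + 7) // 8)

    def _bitnum(self, a, b):
        if a == b:
            raise ValueError("an item does not conflict with itself")
        low, high = (a, b) if a < b else (b, a)
        # Row `high` starts after the high*(high-1)/2 bits of all earlier rows.
        return high * (high - 1) // 2 + low

    def set_conflict(self, a, b):
        k = self._bitnum(a, b)
        self.bits[k >> 3] |= 1 << (k & 7)

    def has_conflict(self, a, b):
        k = self._bitnum(a, b)
        return bool(self.bits[k >> 3] & (1 << (k & 7)))


m = TriangularBitMatrix(5)
m.set_conflict(3, 1)
print(m.has_conflict(1, 3), m.has_conflict(0, 4))
```

Because both orderings of a pair map to the same bit, no separate mirroring
step is needed to keep the relation symmetric in this sketch.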


The results can be reproduced by building a compiler with

--enable-gather-detailed-mem-stats targeting x86-64

and compiling preprocessed combine.c or the testcase from PR8632 with:

-fmem-report --param=ggc-min-heapsize=1024 --param=ggc-min-expand=1 -Ox -Q

The memory consumption summary appears in the dump after the detailed listing
of the places where the memory is allocated.  Peak memory consumption is
actually computed by taking the maximal value over the {GC XXXX -> YYYY} reports.
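
A minimal sketch of that last step, assuming the collector lines in the
captured output look like "{GC 81062k -> 73201k}" (the exact spelling, and the
function name peak_gc_usage, are assumptions made here for illustration; this
is not the tester's actual script):

```python
import re

# Assumed shape of a collection report; the "k" suffix is treated as optional.
GC_REPORT = re.compile(r"\{GC\s+(\d+)k?\s*->\s*(\d+)k?\}")


def peak_gc_usage(log_text):
    """Return (max before-collection, max after-collection), or None if no reports."""
    pairs = [(int(before), int(after))
             for before, after in GC_REPORT.findall(log_text)]
    if not pairs:
        return None
    return max(b for b, _ in pairs), max(a for _, a in pairs)


sample = "parse {GC 100k -> 50k} expand {GC 300k -> 40k} global {GC 200k -> 60k}"
print(peak_gc_usage(sample))
```

The two maxima would presumably correspond to the "Peak memory use before
GGC" and "Peak memory use after GGC" lines in the reports above.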

Your testing script.

