This is the mail archive of the
mailing list for the GCC project.
Re: [patch] Speed up phi node insertion
- From: Daniel Berlin <dberlin at dberlin dot org>
- To: Jan Hubicka <jh at suse dot cz>
- Cc: Diego Novillo <dnovillo at redhat dot com>, mark at codesourcery dot com, Zdenek Dvorak <rakdver at atrey dot karlin dot mff dot cuni dot cz>, gcc-patches at gcc dot gnu dot org
- Date: Thu, 17 Aug 2006 08:40:09 -0400
- Subject: Re: [patch] Speed up phi node insertion
- References: <20060730105350.GA9172@atrey.karlin.mff.cuni.cz> <44E34F6B.email@example.com> <20060817095330.GD1612@kam.mff.cuni.cz>
Jan Hubicka wrote:
>> Zdenek Dvorak wrote on 07/30/06 06:53:
>>> PR rtl-optimization/28071
>>> * basic-block.h (bb_dom_dfs_in, bb_dom_dfs_out): Declare.
>>> * dominance.c (bb_dom_dfs_in, bb_dom_dfs_out): New functions.
>>> * tree-into-ssa.c (struct dom_dfsnum): New.
>>> (cmp_dfsnum, find_dfsnum_interval, prune_unused_phi_nodes): New
>>> (insert_phi_nodes_for): Use prune_unused_phi_nodes instead of
>>> (prepare_block_for_update, prepare_use_sites_for): Mark the uses
>>> in phi nodes in the correct blocks.
>> OK. Nice catch, thanks.
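
[Note: the ChangeLog above adds bb_dom_dfs_in and bb_dom_dfs_out, which by their names expose depth-first entry and exit numbers for the dominator tree. The patch itself is not shown in this message, so what follows is only a minimal sketch of the general technique such numbers enable: block A dominates block B exactly when A's DFS interval encloses B's. The function names, the dictionary layout, and the toy tree are hypothetical and are not taken from GCC.]

```python
def number_dominator_tree(children, root):
    """Assign DFS entry/exit numbers to every node of a dominator tree.

    `children` maps a node to the list of nodes it immediately dominates
    (a hypothetical layout chosen for this sketch).
    """
    dfs_in, dfs_out = {}, {}
    counter = 1
    dfs_in[root] = counter
    # Explicit stack instead of recursion, so deep trees are not a problem.
    stack = [(root, iter(children.get(root, ())))]
    while stack:
        node, pending = stack[-1]
        child = next(pending, None)
        counter += 1
        if child is None:
            # All children visited: record the exit number and unwind.
            dfs_out[node] = counter
            stack.pop()
        else:
            dfs_in[child] = counter
            stack.append((child, iter(children.get(child, ()))))
    return dfs_in, dfs_out


def dominates(a, b, dfs_in, dfs_out):
    """True when a's DFS interval encloses b's (every node dominates itself)."""
    return dfs_in[a] <= dfs_in[b] and dfs_out[b] <= dfs_out[a]


# Toy dominator tree: entry dominates A and B, and A dominates C.
tree = {"entry": ["A", "B"], "A": ["C"]}
dfs_in, dfs_out = number_dominator_tree(tree, "entry")
print(dominates("A", "C", dfs_in, dfs_out))  # A's interval encloses C's
print(dominates("B", "C", dfs_in, dfs_out))  # B and C are in disjoint subtrees
```

[With the numbers precomputed, each dominance query is two integer comparisons, and sorting blocks by entry number lets a range of the dominator tree be located by binary search.]
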
> Nice indeed:
> comparing PR rtl-optimization/28071 testcase compilation at -O0 level:
> Overall memory needed: 146456k
> Peak memory use before GGC: 95412k
> Peak memory use after GGC: 58507k
> Maximum of released memory in single GGC run: 45493k
> Garbage: 163295k
> Leak: 7142k
> Overhead: 29023k
> GGC runs: 87
> comparing PR rtl-optimization/28071 testcase compilation at -O1 level:
> Overall memory needed: 428348k -> 430308k
> Peak memory use before GGC: 201177k
> Peak memory use after GGC: 196173k
> Maximum of released memory in single GGC run: 100203k
> Garbage: 279198k
> Leak: 47195k
> Overhead: 31459k
> GGC runs: 105
> comparing PR rtl-optimization/28071 testcase compilation at -O2 level:
> Overall memory needed: 350296k -> 350424k
> Peak memory use before GGC: 208293k
> Peak memory use after GGC: 196536k
> Maximum of released memory in single GGC run: 101565k
> Garbage: 394891k
> Leak: 47778k
> Overhead: 49054k
> GGC runs: 111
> comparing PR rtl-optimization/28071 testcase compilation at -O3 -fno-tree-pre -fno-tree-fre level:
> Overall memory allocated via mmap and sbrk decreased from 781364k to 535696k, overall -45.86%
> Overall memory needed: 781364k -> 535696k
> Peak memory use before GGC: 314602k
> Peak memory use after GGC: 292946k
> Maximum of released memory in single GGC run: 163430k
> Garbage: 494953k
> Leak: 65110k
> Overhead: 60330k
> GGC runs: 100
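
[Note: the -45.86% figure in the -O3 summary can be reproduced from the two totals only if the saving is divided by the new total rather than the old one. The short calculation below is a sketch of that arithmetic; the helper name is hypothetical, and which convention the memory tester uses is inferred from the numbers, not confirmed by the message.]

```python
def percent_change(before_k, after_k):
    """Return the saving as a percentage of the old total and of the new total."""
    saved = before_k - after_k
    return 100.0 * saved / before_k, 100.0 * saved / after_k


# Overall memory needed at -O3 -fno-tree-pre -fno-tree-fre, in kilobytes.
rel_old, rel_new = percent_change(781364, 535696)
print(f"{rel_old:.2f}% of the old total, {rel_new:.2f}% of the new total")
# prints "31.44% of the old total, 45.86% of the new total"
```

[Measured against the original 781364k, the patch removes about 31% of the overall memory needed.]
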
> I wonder what to do about the PR tree-optimization/27865. The memory
> consumption seems to be more or less under control (ICC needs 200MB to
> compile that, but it is 32bit binary, so peak 530MB is not bad and it is
> better than any older version I tested except for 2.95 peaking at about
> 120MB by not inlining, but it needs inadequate compilation time).
> Compilation time is not _that_ bad either.
> The only remaining problems are the scheduler's quadratic compilation time
> (not too serious) at -O2 and the PRE memory explosion (-O3) and
> compilation time (-O2) issues. Daniel, do you plan to do something
> about it in the 4.2 timeframe (the patch you sent me worked well for -O2)?
> Otherwise I guess we can retarget the bug to 4.3 and stop it from
> holding up stage3...
The changes I've got in store to fix this are not really stage3 material:
dberlin@dannyb-corp0:~/gccstuff/gcc-pre-speed/gcc> svn diff
 tree-ssa-pre.c | 2435 ++++++++++++++++++++++++++++++++++++---------------------
 1 file changed, 1546 insertions(+), 889 deletions(-)
I've been waiting for stage1 to put them in.