[patch] Speed up phi node insertion
Jan Hubicka
jh@suse.cz
Thu Aug 17 09:59:00 GMT 2006
> Zdenek Dvorak wrote on 07/30/06 06:53:
>
> > PR rtl-optimization/28071
> > * basic-block.h (bb_dom_dfs_in, bb_dom_dfs_out): Declare.
> > * dominance.c (bb_dom_dfs_in, bb_dom_dfs_out): New functions.
> > * tree-into-ssa.c (struct dom_dfsnum): New.
> > (cmp_dfsnum, find_dfsnum_interval, prune_unused_phi_nodes): New
> > functions.
> > (insert_phi_nodes_for): Use prune_unused_phi_nodes instead of
> > compute_global_livein.
> > (prepare_block_for_update, prepare_use_sites_for): Mark the uses
> > in phi nodes in the correct blocks.
> >
> OK. Nice catch, thanks.
Nice indeed:
comparing PR rtl-optimization/28071 testcase compilation at -O0 level:
Overall memory needed: 146456k
Peak memory use before GGC: 95412k
Peak memory use after GGC: 58507k
Maximum of released memory in single GGC run: 45493k
Garbage: 163295k
Leak: 7142k
Overhead: 29023k
GGC runs: 87
comparing PR rtl-optimization/28071 testcase compilation at -O1 level:
Overall memory needed: 428348k -> 430308k
Peak memory use before GGC: 201177k
Peak memory use after GGC: 196173k
Maximum of released memory in single GGC run: 100203k
Garbage: 279198k
Leak: 47195k
Overhead: 31459k
GGC runs: 105
comparing PR rtl-optimization/28071 testcase compilation at -O2 level:
Overall memory needed: 350296k -> 350424k
Peak memory use before GGC: 208293k
Peak memory use after GGC: 196536k
Maximum of released memory in single GGC run: 101565k
Garbage: 394891k
Leak: 47778k
Overhead: 49054k
GGC runs: 111
comparing PR rtl-optimization/28071 testcase compilation at -O3 -fno-tree-pre -fno-tree-fre level:
Overall memory allocated via mmap and sbrk decreased from 781364k to 535696k, overall -45.86%
Overall memory needed: 781364k -> 535696k
Peak memory use before GGC: 314602k
Peak memory use after GGC: 292946k
Maximum of released memory in single GGC run: 163430k
Garbage: 494953k
Leak: 65110k
Overhead: 60330k
GGC runs: 100
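A quick sanity check on the -45.86% figure reported for the -O3 run: it appears to be the saving expressed relative to the new (smaller) value, not the old one. The sketch below only redoes the arithmetic on the two numbers quoted above; the function names are hypothetical and are not taken from the memory tester script.

```python
# Hypothetical helpers; they only redo the arithmetic on the numbers above.

def change_relative_to_new(old_kb, new_kb):
    """Saving as a percentage of the new (smaller) figure."""
    return (old_kb - new_kb) / new_kb * 100.0


def change_relative_to_old(old_kb, new_kb):
    """Saving as a percentage of the old (larger) figure."""
    return (old_kb - new_kb) / old_kb * 100.0


old_kb, new_kb = 781364, 535696  # "Overall memory needed" at -O3, before and after

print(f"relative to new: {change_relative_to_new(old_kb, new_kb):.2f}%")  # 45.86%
print(f"relative to old: {change_relative_to_old(old_kb, new_kb):.2f}%")  # 31.44%
```

Measured against the old footprint, the reduction is about 31%.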
I wonder what to do about PR tree-optimization/27865. Memory
consumption seems to be more or less under control: ICC needs 200MB to
compile it, but that is a 32-bit binary, so a peak of 530MB is not bad.
It is also better than any older version I tested except 2.95, which
peaks at about 120MB by not inlining but needs an unreasonable amount
of compilation time. Compilation time is not _that_ bad either.
The only remaining problems are the scheduler's quadratic compilation
time (not too serious) for -O2 and the PRE memory explosion (-O3) and
compilation time (-O2) issues. Daniel, do you plan to do something
about it in the 4.2 timeframe (the patch you sent me worked well for -O2)?
Otherwise I guess we can retarget the bug to 4.3 and stop it from
holding up stage3...
Honza