This is the mail archive of the
gcc@gcc.gnu.org
mailing list for the GCC project.
[tree-ssa] POOMA compile time / memory requirement comparison
- From: Richard Guenther <rguenth at tat dot physik dot uni-tuebingen dot de>
- To: gcc at gcc dot gnu dot org
- Date: Tue, 4 May 2004 12:22:11 +0200 (CEST)
- Subject: [tree-ssa] POOMA compile time / memory requirement comparison
Here is more data for the merge-criteria of tree-ssa compared to mainline.
I compiled the tramp-v3.cpp testcase on an ia32 machine with 1GB of RAM, using
gcc-3.5 (GCC) 3.5.0 20040430 (experimental)
and
gcc-ssa (GCC) 3.5-tree-ssa 20040504 (merged 20040428)
each with the leafify attribute disabled and enabled.
The summary is (times in seconds)

                 mem      user     sys    wall
3.5              387MB    168.98   3.85   181.07
3.5 w/leafify    452MB    266.71   4.37   284.90
ssa              493MB    226.71   5.35   245.18
ssa w/leafify    575MB    377.43   8.16   412.24

So the tree-ssa memory requirement is 127% of mainline, and user compile time
is 134% of mainline (without leafify) and 142% of mainline (with leafify).
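
For reference, the percentages can be recomputed from the summary figures. The following is only a sketch: the dictionary layout and the helper names (percent_of, compare) are made up for illustration, and it assumes the compile-time comparison is based on the user column.

```python
# Summary figures copied from the table above: (memory in MB, user seconds).
summary = {
    "3.5": (387, 168.98),
    "3.5 w/leafify": (452, 266.71),
    "ssa": (493, 226.71),
    "ssa w/leafify": (575, 377.43),
}


def percent_of(value, baseline):
    """Return value as a percentage of baseline, rounded to a whole percent."""
    return round(100.0 * value / baseline)


def compare(ssa_key, mainline_key):
    """Return (memory %, user time %) of tree-ssa relative to mainline."""
    ssa_mem, ssa_user = summary[ssa_key]
    main_mem, main_user = summary[mainline_key]
    return percent_of(ssa_mem, main_mem), percent_of(ssa_user, main_user)


ratios = {
    "without leafify": compare("ssa", "3.5"),
    "with leafify": compare("ssa w/leafify", "3.5 w/leafify"),
}

for label, (mem_pct, time_pct) in ratios.items():
    print(f"{label}: memory {mem_pct}%, user time {time_pct}% of mainline")
```

Using the wall column instead of the user column would give slightly different compile-time percentages.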
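Details follow. Judging from the first entries, such as garbage collection and
callgraph construction, mainline is better at optimizing GCC itself than tree-ssa is.
Times are user seconds. Each row gives a pass and its times without and with
leafify, first for gcc-3.5 and then for tree-ssa; a pass reported by only one
compiler has a single pair of values.
Execution times gcc-3.5 gcc-3.5 w/leafify tree-ssa tree-ssa w/leafify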
garbage collection 10.79 14.58 garbage collection 13.63 20.36
callgraph construction 1.08 1.09 callgraph construction 1.59 1.59
callgraph optimization 0.71 8.93 callgraph optimization 0.65 1.25
cfg construction 2.12 3.76 cfg construction 0.63 0.84
cfg cleanup 3.53 5.18 cfg cleanup 2.71 3.19
trivially dead code 3.56 5.47 trivially dead code 2.25 3.05
life analysis 4.88 6.24 life analysis 5.07 5.58
life info update 1.83 2.54 life info update 2.49 2.76
alias analysis 4.31 6.67 alias analysis 3.37 4.52
register scan 2.83 4.51 register scan 2.54 2.84
rebuild jump labels 1.48 2.73 rebuild jump labels 0.67 1.16
preprocessing 0.65 0.75 preprocessing 0.60 0.76
parser 14.04 14.62 parser 17.63 17.79
name lookup 5.00 5.22 name lookup 5.60 5.88
tree gimplify 3.47 4.04
tree eh 2.74 4.80
tree CFG construction 1.83 2.90
tree CFG cleanup 3.22 5.16
tree PTA 0.87 0.89
tree alias analysis 0.89 1.44
tree PHI insertion 2.79 15.85
tree SSA rewrite 3.10 5.02
tree SSA other 5.01 8.16
tree operand scan 2.99 4.60
dominator optimization 12.34 22.15
tree SRA 0.19 0.44
tree CCP 2.10 3.33
tree split crit edges 0.23 0.41
tree PRE 7.25 49.52
tree linearize phis 0.03 0.05
tree forward propagate 1.36 2.59
tree conservative DCE 2.77 4.39
tree aggressive DCE 1.12 1.86
tree DSE 2.71 4.22
tree copy headers 2.51 3.88
tree SSA to normal 3.70 5.52
tree rename SSA copies 1.13 1.77
dominance frontiers 0.46 0.55
control dependences 0.16 0.23
expand 20.20 37.50 expand 22.23 32.18
varconst 0.59 0.61 varconst 0.58 0.71
integration 22.24 45.65 integration 23.91 48.77
jump 2.10 4.93 jump 1.19 1.74
CSE 12.18 14.27 CSE 8.58 9.74
global CSE 14.66 7.33 global CSE 5.33 9.89
loop analysis 10.08 3.39 loop analysis 1.44 3.61
bypass jumps 1.27 1.81 bypass jumps 1.12 1.40
web 1.60 2.49 web 1.33 1.95
CSE 2 3.88 4.59 CSE 2 3.46 3.88
branch prediction 3.21 4.57 branch prediction 2.30 3.22
flow analysis 0.08 0.21 flow analysis 0.14 0.33
combiner 4.06 4.90 combiner 3.27 3.75
if-conversion 0.58 0.61 if-conversion 0.64 0.73
regmove 0.83 1.13 regmove 0.80 0.95
mode switching 0.00 0.01
local alloc 2.89 3.88 local alloc 2.54 2.97
global alloc 5.96 7.31 global alloc 6.10 7.14
reload CSE regs 2.65 3.38 reload CSE regs 2.91 3.27
flow 2 0.67 0.76 flow 2 0.66 0.69
if-conversion 2 0.27 0.35 if-conversion 2 0.35 0.44
peephole 2 0.62 0.78 peephole 2 0.58 0.67
rename registers 0.86 1.08 rename registers 1.03 1.08
scheduling 2 4.88 5.72 scheduling 2 4.58 5.28
machine dep reorg 0.75 1.10 machine dep reorg 0.90 1.22
reorder blocks 0.50 0.70 reorder blocks 0.60 0.66
shorten branches 1.34 1.53 shorten branches 0.96 1.02
reg stack 0.24 0.33 reg stack 0.10 0.21
final 1.46 2.00 final 1.56 1.84
symout 0.04 0.05 symout 0.04 0.04
rest of compilation 5.13 7.75 rest of compilation 2.53 3.20
TOTAL 168.98 266.71 TOTAL 226.71 377.43
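
To see which passes grow most when leafify is enabled, the rows above can be parsed mechanically. This is a rough sketch, not part of the measurements: the regular expression, the function names and the choice of excerpt are my own, and it assumes that the last pair of numbers on a row belongs to tree-ssa (true for the two-compiler rows and for the tree-ssa-only passes used here).

```python
import re

# A pass name (may contain spaces, digits, hyphens) followed by two timings.
ROW = re.compile(r"([A-Za-z][A-Za-z0-9 -]*?)\s+(\d+\.\d+)\s+(\d+\.\d+)")


def parse_row(line):
    """Split one table row into (name, without_leafify, with_leafify) entries."""
    return [(name.strip(), float(a), float(b)) for name, a, b in ROW.findall(line)]


# A few rows copied verbatim from the table above.
excerpt = """\
tree PHI insertion 2.79 15.85
dominator optimization 12.34 22.15
tree PRE 7.25 49.52
CSE 2 3.88 4.59 CSE 2 3.46 3.88
integration 22.24 45.65 integration 23.91 48.77
"""


def leafify_growth(text):
    """Map pass name to (time with leafify / time without), last entry per row."""
    growth = {}
    for line in text.splitlines():
        entries = parse_row(line)
        if not entries:
            continue
        name, without, with_leafify = entries[-1]  # assumed to be tree-ssa
        if without > 0:  # skip passes that took no measurable time
            growth[name] = with_leafify / without
    return growth


growth = leafify_growth(excerpt)
for name, factor in sorted(growth.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:24s} x{factor:.2f}")
```

On this excerpt, tree PRE and tree PHI insertion show by far the largest growth factors on the tree-ssa side.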
--
Richard Guenther <richard dot guenther at uni-tuebingen dot de>
WWW: http://www.tat.physik.uni-tuebingen.de/~rguenth/