[Bug tree-optimization/23955] Compile time regressions with tramp3d
rguenth at gcc dot gnu dot org
gcc-bugzilla@gcc.gnu.org
Sun Sep 18 18:16:00 GMT 2005
------- Additional Comments From rguenth at gcc dot gnu dot org 2005-09-18 18:16 -------
-ftime-report for the 4.1 + flatten compile:
Execution times (seconds)
garbage collection : 6.32 ( 4%) usr 0.07 ( 1%) sys 6.73 ( 4%) wall 0 kB ( 0%) ggc
callgraph construction: 0.42 ( 0%) usr 0.03 ( 0%) sys 0.42 ( 0%) wall 5274 kB ( 0%) ggc
callgraph optimization: 0.12 ( 0%) usr 0.00 ( 0%) sys 0.11 ( 0%) wall 1605 kB ( 0%) ggc
ipa reference : 0.49 ( 0%) usr 0.00 ( 0%) sys 0.49 ( 0%) wall 440 kB ( 0%) ggc
ipa pure const : 0.13 ( 0%) usr 0.01 ( 0%) sys 0.14 ( 0%) wall 0 kB ( 0%) ggc
ipa type escape : 4.88 ( 3%) usr 0.00 ( 0%) sys 4.88 ( 3%) wall 0 kB ( 0%) ggc
cfg construction : 0.06 ( 0%) usr 0.00 ( 0%) sys 0.07 ( 0%) wall 2530 kB ( 0%) ggc
cfg cleanup : 1.29 ( 1%) usr 0.01 ( 0%) sys 1.16 ( 1%) wall 2256 kB ( 0%) ggc
trivially dead code : 0.70 ( 0%) usr 0.01 ( 0%) sys 0.62 ( 0%) wall 0 kB ( 0%) ggc
life analysis : 3.52 ( 2%) usr 0.00 ( 0%) sys 3.64 ( 2%) wall 2601 kB ( 0%) ggc
life info update : 0.39 ( 0%) usr 0.00 ( 0%) sys 0.29 ( 0%) wall 590 kB ( 0%) ggc
alias analysis : 1.41 ( 1%) usr 0.00 ( 0%) sys 1.13 ( 1%) wall 12731 kB ( 1%) ggc
register scan : 0.69 ( 0%) usr 0.01 ( 0%) sys 0.87 ( 0%) wall 526 kB ( 0%) ggc
rebuild jump labels : 0.21 ( 0%) usr 0.00 ( 0%) sys 0.16 ( 0%) wall 0 kB ( 0%) ggc
preprocessing : 0.75 ( 0%) usr 0.37 ( 5%) sys 1.15 ( 1%) wall 686 kB ( 0%) ggc
parser : 4.32 ( 2%) usr 0.98 (13%) sys 5.24 ( 3%) wall 229494 kB (11%) ggc
name lookup : 2.08 ( 1%) usr 0.88 (12%) sys 2.88 ( 2%) wall 46108 kB ( 2%) ggc
inline heuristics : 1.02 ( 1%) usr 0.04 ( 1%) sys 1.06 ( 1%) wall 36310 kB ( 2%) ggc
integration : 12.02 ( 7%) usr 0.02 ( 0%) sys 11.77 ( 6%) wall 693907 kB (34%) ggc
tree gimplify : 0.65 ( 0%) usr 0.03 ( 0%) sys 0.83 ( 0%) wall 11198 kB ( 1%) ggc
tree eh : 0.06 ( 0%) usr 0.00 ( 0%) sys 0.03 ( 0%) wall 107 kB ( 0%) ggc
tree CFG construction : 0.04 ( 0%) usr 0.02 ( 0%) sys 0.10 ( 0%) wall 14527 kB ( 1%) ggc
tree CFG cleanup : 3.77 ( 2%) usr 0.11 ( 2%) sys 3.98 ( 2%) wall 16679 kB ( 1%) ggc
tree VRP : 3.57 ( 2%) usr 0.13 ( 2%) sys 3.69 ( 2%) wall 22691 kB ( 1%) ggc
tree copy propagation : 3.09 ( 2%) usr 0.04 ( 1%) sys 3.09 ( 2%) wall 3066 kB ( 0%) ggc
tree store copy prop : 0.59 ( 0%) usr 0.03 ( 0%) sys 0.42 ( 0%) wall 652 kB ( 0%) ggc
tree find ref. vars : 1.36 ( 1%) usr 0.05 ( 1%) sys 1.48 ( 1%) wall 86797 kB ( 4%) ggc
tree PTA : 12.39 ( 7%) usr 0.06 ( 1%) sys 12.36 ( 7%) wall 32031 kB ( 2%) ggc
tree alias analysis : 9.35 ( 5%) usr 0.84 (11%) sys 10.62 ( 6%) wall 68682 kB ( 3%) ggc
tree PHI insertion : 1.40 ( 1%) usr 0.01 ( 0%) sys 1.49 ( 1%) wall 21821 kB ( 1%) ggc
tree SSA rewrite : 4.88 ( 3%) usr 0.05 ( 1%) sys 4.67 ( 2%) wall 108845 kB ( 5%) ggc
tree SSA other : 1.19 ( 1%) usr 0.47 ( 6%) sys 1.72 ( 1%) wall 1481 kB ( 0%) ggc
tree SSA incremental : 12.44 ( 7%) usr 0.23 ( 3%) sys 12.44 ( 7%) wall 30571 kB ( 1%) ggc
tree operand scan : 9.20 ( 5%) usr 2.05 (28%) sys 11.56 ( 6%) wall 68307 kB ( 3%) ggc
dominator optimization: 9.49 ( 5%) usr 0.10 ( 1%) sys 9.60 ( 5%) wall 78640 kB ( 4%) ggc
tree SRA : 0.50 ( 0%) usr 0.02 ( 0%) sys 0.57 ( 0%) wall 11723 kB ( 1%) ggc
tree STORE-CCP : 0.62 ( 0%) usr 0.00 ( 0%) sys 0.68 ( 0%) wall 447 kB ( 0%) ggc
tree CCP : 1.38 ( 1%) usr 0.01 ( 0%) sys 1.30 ( 1%) wall 2024 kB ( 0%) ggc
tree split crit edges : 0.16 ( 0%) usr 0.01 ( 0%) sys 0.22 ( 0%) wall 18294 kB ( 1%) ggc
tree reassociation : 0.11 ( 0%) usr 0.00 ( 0%) sys 0.08 ( 0%) wall 4 kB ( 0%) ggc
tree PRE : 2.88 ( 2%) usr 0.03 ( 0%) sys 3.01 ( 2%) wall 27185 kB ( 1%) ggc
tree FRE : 4.40 ( 2%) usr 0.06 ( 1%) sys 4.43 ( 2%) wall 41584 kB ( 2%) ggc
tree code sinking : 0.36 ( 0%) usr 0.01 ( 0%) sys 0.49 ( 0%) wall 79 kB ( 0%) ggc
tree linearize phis : 0.06 ( 0%) usr 0.00 ( 0%) sys 0.02 ( 0%) wall 10 kB ( 0%) ggc
tree forward propagate: 1.00 ( 1%) usr 0.22 ( 3%) sys 1.19 ( 1%) wall 49760 kB ( 2%) ggc
tree conservative DCE : 2.32 ( 1%) usr 0.00 ( 0%) sys 2.21 ( 1%) wall 0 kB ( 0%) ggc
tree aggressive DCE : 0.48 ( 0%) usr 0.00 ( 0%) sys 0.50 ( 0%) wall 0 kB ( 0%) ggc
tree DSE : 0.46 ( 0%) usr 0.00 ( 0%) sys 0.37 ( 0%) wall 760 kB ( 0%) ggc
PHI merge : 0.02 ( 0%) usr 0.00 ( 0%) sys 0.03 ( 0%) wall 747 kB ( 0%) ggc
tree loop bounds : 0.71 ( 0%) usr 0.00 ( 0%) sys 0.73 ( 0%) wall 5718 kB ( 0%) ggc
loop invariant motion : 0.53 ( 0%) usr 0.00 ( 0%) sys 0.54 ( 0%) wall 185 kB ( 0%) ggc
tree canonical iv : 0.18 ( 0%) usr 0.00 ( 0%) sys 0.17 ( 0%) wall 4380 kB ( 0%) ggc
scev constant prop : 0.30 ( 0%) usr 0.00 ( 0%) sys 0.31 ( 0%) wall 4656 kB ( 0%) ggc
complete unrolling : 1.72 ( 1%) usr 0.07 ( 1%) sys 1.63 ( 1%) wall 32424 kB ( 2%) ggc
tree iv optimization : 1.43 ( 1%) usr 0.01 ( 0%) sys 1.33 ( 1%) wall 29199 kB ( 1%) ggc
tree loop init : 0.59 ( 0%) usr 0.02 ( 0%) sys 0.45 ( 0%) wall 12 kB ( 0%) ggc
tree loop fini : 0.01 ( 0%) usr 0.00 ( 0%) sys 0.00 ( 0%) wall 0 kB ( 0%) ggc
tree copy headers : 0.41 ( 0%) usr 0.01 ( 0%) sys 0.63 ( 0%) wall 19708 kB ( 1%) ggc
tree SSA uncprop : 0.09 ( 0%) usr 0.00 ( 0%) sys 0.09 ( 0%) wall 0 kB ( 0%) ggc
tree SSA to normal : 1.42 ( 1%) usr 0.00 ( 0%) sys 1.46 ( 1%) wall 15217 kB ( 1%) ggc
tree rename SSA copies: 0.56 ( 0%) usr 0.00 ( 0%) sys 0.64 ( 0%) wall 1 kB ( 0%) ggc
dominance frontiers : 0.61 ( 0%) usr 0.00 ( 0%) sys 0.77 ( 0%) wall 0 kB ( 0%) ggc
control dependences : 0.01 ( 0%) usr 0.00 ( 0%) sys 0.04 ( 0%) wall 0 kB ( 0%) ggc
expand : 7.56 ( 4%) usr 0.10 ( 1%) sys 7.63 ( 4%) wall 85819 kB ( 4%) ggc
varconst : 0.23 ( 0%) usr 0.00 ( 0%) sys 0.22 ( 0%) wall 615 kB ( 0%) ggc
jump : 0.14 ( 0%) usr 0.00 ( 0%) sys 0.13 ( 0%) wall 189 kB ( 0%) ggc
CSE : 6.53 ( 4%) usr 0.01 ( 0%) sys 6.65 ( 4%) wall 6438 kB ( 0%) ggc
loop analysis : 1.15 ( 1%) usr 0.00 ( 0%) sys 1.06 ( 1%) wall 6947 kB ( 0%) ggc
global CSE : 0.05 ( 0%) usr 0.00 ( 0%) sys 0.03 ( 0%) wall 0 kB ( 0%) ggc
CPROP 1 : 0.52 ( 0%) usr 0.00 ( 0%) sys 0.51 ( 0%) wall 4188 kB ( 0%) ggc
PRE : 1.57 ( 1%) usr 0.00 ( 0%) sys 1.56 ( 1%) wall 2273 kB ( 0%) ggc
CPROP 2 : 0.67 ( 0%) usr 0.00 ( 0%) sys 0.61 ( 0%) wall 1548 kB ( 0%) ggc
bypass jumps : 0.56 ( 0%) usr 0.00 ( 0%) sys 0.59 ( 0%) wall 1401 kB ( 0%) ggc
web : 0.43 ( 0%) usr 0.00 ( 0%) sys 0.46 ( 0%) wall 222 kB ( 0%) ggc
CSE 2 : 4.20 ( 2%) usr 0.00 ( 0%) sys 4.40 ( 2%) wall 3341 kB ( 0%) ggc
branch prediction : 0.93 ( 1%) usr 0.00 ( 0%) sys 0.89 ( 0%) wall 4372 kB ( 0%) ggc
flow analysis : 0.07 ( 0%) usr 0.00 ( 0%) sys 0.02 ( 0%) wall 7 kB ( 0%) ggc
combiner : 3.06 ( 2%) usr 0.01 ( 0%) sys 2.96 ( 2%) wall 10796 kB ( 1%) ggc
if-conversion : 0.13 ( 0%) usr 0.00 ( 0%) sys 0.23 ( 0%) wall 405 kB ( 0%) ggc
regmove : 0.67 ( 0%) usr 0.00 ( 0%) sys 0.62 ( 0%) wall 146 kB ( 0%) ggc
local alloc : 1.81 ( 1%) usr 0.00 ( 0%) sys 1.97 ( 1%) wall 3329 kB ( 0%) ggc
global alloc : 4.71 ( 3%) usr 0.00 ( 0%) sys 4.84 ( 3%) wall 26430 kB ( 1%) ggc
reload CSE regs : 2.71 ( 2%) usr 0.00 ( 0%) sys 2.73 ( 1%) wall 10832 kB ( 1%) ggc
flow 2 : 0.14 ( 0%) usr 0.00 ( 0%) sys 0.12 ( 0%) wall 2225 kB ( 0%) ggc
if-conversion 2 : 0.07 ( 0%) usr 0.00 ( 0%) sys 0.12 ( 0%) wall 11 kB ( 0%) ggc
peephole 2 : 0.30 ( 0%) usr 0.00 ( 0%) sys 0.30 ( 0%) wall 356 kB ( 0%) ggc
rename registers : 1.43 ( 1%) usr 0.00 ( 0%) sys 1.60 ( 1%) wall 2031 kB ( 0%) ggc
machine dep reorg : 0.56 ( 0%) usr 0.00 ( 0%) sys 0.68 ( 0%) wall 75 kB ( 0%) ggc
reorder blocks : 0.33 ( 0%) usr 0.00 ( 0%) sys 0.30 ( 0%) wall 2611 kB ( 0%) ggc
shorten branches : 0.13 ( 0%) usr 0.00 ( 0%) sys 0.19 ( 0%) wall 0 kB ( 0%) ggc
reg stack : 0.14 ( 0%) usr 0.00 ( 0%) sys 0.18 ( 0%) wall 1705 kB ( 0%) ggc
final : 1.11 ( 1%) usr 0.03 ( 0%) sys 1.12 ( 1%) wall 4552 kB ( 0%) ggc
TOTAL : 179.59 7.31 187.62 2049140 kB
and for 4.0 + leafify patch:
garbage collection : 5.01 ( 4%) usr 0.06 ( 1%) sys 5.91 ( 4%) wall
callgraph construction: 0.28 ( 0%) usr 0.00 ( 0%) sys 0.34 ( 0%) wall
callgraph optimization: 0.67 ( 0%) usr 0.07 ( 1%) sys 0.85 ( 1%) wall
cfg construction : 0.01 ( 0%) usr 0.01 ( 0%) sys 0.05 ( 0%) wall
cfg cleanup : 0.94 ( 1%) usr 0.00 ( 0%) sys 1.00 ( 1%) wall
trivially dead code : 0.58 ( 0%) usr 0.01 ( 0%) sys 0.62 ( 0%) wall
life analysis : 2.53 ( 2%) usr 0.03 ( 1%) sys 3.31 ( 2%) wall
life info update : 0.27 ( 0%) usr 0.00 ( 0%) sys 0.32 ( 0%) wall
alias analysis : 0.90 ( 1%) usr 0.00 ( 0%) sys 0.95 ( 1%) wall
register scan : 0.74 ( 1%) usr 0.02 ( 0%) sys 0.72 ( 0%) wall
rebuild jump labels : 0.23 ( 0%) usr 0.00 ( 0%) sys 0.18 ( 0%) wall
preprocessing : 0.29 ( 0%) usr 0.18 ( 4%) sys 0.51 ( 0%) wall
parser : 4.71 ( 3%) usr 0.65 (13%) sys 6.07 ( 4%) wall
name lookup : 2.39 ( 2%) usr 0.71 (14%) sys 3.84 ( 2%) wall
integration : 45.27 (33%) usr 0.30 ( 6%) sys 53.22 (32%) wall
tree gimplify : 0.65 ( 0%) usr 0.01 ( 0%) sys 0.84 ( 1%) wall
tree eh : 0.36 ( 0%) usr 0.01 ( 0%) sys 0.43 ( 0%) wall
tree CFG construction : 0.77 ( 1%) usr 0.01 ( 0%) sys 0.98 ( 1%) wall
tree CFG cleanup : 1.34 ( 1%) usr 0.01 ( 0%) sys 1.34 ( 1%) wall
tree find referenced vars: 0.99 ( 1%) usr 0.02 ( 0%) sys 1.12 ( 1%) wall
tree PTA : 1.53 ( 1%) usr 0.01 ( 0%) sys 2.07 ( 1%) wall
tree alias analysis : 5.58 ( 4%) usr 0.11 ( 2%) sys 6.57 ( 4%) wall
tree PHI insertion : 2.37 ( 2%) usr 0.02 ( 0%) sys 2.91 ( 2%) wall
tree SSA rewrite : 2.73 ( 2%) usr 0.03 ( 1%) sys 3.11 ( 2%) wall
tree SSA other : 4.80 ( 3%) usr 0.91 (18%) sys 6.47 ( 4%) wall
tree operand scan : 3.15 ( 2%) usr 0.93 (19%) sys 4.87 ( 3%) wall
dominator optimization: 7.91 ( 6%) usr 0.24 ( 5%) sys 9.26 ( 6%) wall
tree SRA : 0.40 ( 0%) usr 0.00 ( 0%) sys 0.52 ( 0%) wall
tree CCP : 0.58 ( 0%) usr 0.01 ( 0%) sys 0.64 ( 0%) wall
tree split crit edges : 0.14 ( 0%) usr 0.01 ( 0%) sys 0.18 ( 0%) wall
tree PRE : 1.95 ( 1%) usr 0.05 ( 1%) sys 2.23 ( 1%) wall
tree remove redundant PHIs: 1.37 ( 1%) usr 0.03 ( 1%) sys 1.70 ( 1%) wall
tree linearize phis : 0.01 ( 0%) usr 0.01 ( 0%) sys 0.03 ( 0%) wall
tree forward propagate: 0.54 ( 0%) usr 0.00 ( 0%) sys 0.67 ( 0%) wall
tree conservative DCE : 1.26 ( 1%) usr 0.00 ( 0%) sys 1.56 ( 1%) wall
tree aggressive DCE : 0.41 ( 0%) usr 0.00 ( 0%) sys 0.46 ( 0%) wall
tree DSE : 0.61 ( 0%) usr 0.00 ( 0%) sys 0.67 ( 0%) wall
PHI merge : 0.00 ( 0%) usr 0.00 ( 0%) sys 0.01 ( 0%) wall
tree loop optimization: 0.01 ( 0%) usr 0.00 ( 0%) sys 0.00 ( 0%) wall
tree record loop bounds: 0.30 ( 0%) usr 0.01 ( 0%) sys 0.37 ( 0%) wall
loop invariant motion : 0.79 ( 1%) usr 0.00 ( 0%) sys 0.86 ( 1%) wall
tree canonical iv creation: 0.33 ( 0%) usr 0.00 ( 0%) sys 0.35 ( 0%) wall
complete unrolling : 0.78 ( 1%) usr 0.05 ( 1%) sys 1.19 ( 1%) wall
tree iv optimization : 1.66 ( 1%) usr 0.05 ( 1%) sys 1.98 ( 1%) wall
tree loop init : 0.47 ( 0%) usr 0.01 ( 0%) sys 0.68 ( 0%) wall
tree loop fini : 0.01 ( 0%) usr 0.00 ( 0%) sys 0.01 ( 0%) wall
tree copy headers : 0.74 ( 1%) usr 0.03 ( 1%) sys 0.96 ( 1%) wall
tree SSA to normal : 1.31 ( 1%) usr 0.02 ( 0%) sys 1.72 ( 1%) wall
tree rename SSA copies: 0.52 ( 0%) usr 0.00 ( 0%) sys 0.63 ( 0%) wall
dominance frontiers : 0.10 ( 0%) usr 0.00 ( 0%) sys 0.15 ( 0%) wall
control dependences : 0.04 ( 0%) usr 0.00 ( 0%) sys 0.07 ( 0%) wall
expand : 5.74 ( 4%) usr 0.08 ( 2%) sys 6.98 ( 4%) wall
varconst : 0.21 ( 0%) usr 0.00 ( 0%) sys 0.22 ( 0%) wall
jump : 0.07 ( 0%) usr 0.00 ( 0%) sys 0.12 ( 0%) wall
CSE : 3.80 ( 3%) usr 0.02 ( 0%) sys 4.16 ( 2%) wall
loop analysis : 0.64 ( 0%) usr 0.03 ( 1%) sys 0.83 ( 0%) wall
global CSE : 0.05 ( 0%) usr 0.00 ( 0%) sys 0.10 ( 0%) wall
CPROP 1 : 0.28 ( 0%) usr 0.00 ( 0%) sys 0.49 ( 0%) wall
PRE : 1.01 ( 1%) usr 0.00 ( 0%) sys 1.15 ( 1%) wall
CPROP 2 : 0.35 ( 0%) usr 0.00 ( 0%) sys 0.52 ( 0%) wall
bypass jumps : 0.38 ( 0%) usr 0.00 ( 0%) sys 0.57 ( 0%) wall
CSE 2 : 1.86 ( 1%) usr 0.01 ( 0%) sys 2.13 ( 1%) wall
branch prediction : 0.93 ( 1%) usr 0.02 ( 0%) sys 1.01 ( 1%) wall
flow analysis : 0.02 ( 0%) usr 0.00 ( 0%) sys 0.07 ( 0%) wall
combiner : 1.94 ( 1%) usr 0.00 ( 0%) sys 2.21 ( 1%) wall
if-conversion : 0.24 ( 0%) usr 0.00 ( 0%) sys 0.29 ( 0%) wall
regmove : 0.57 ( 0%) usr 0.02 ( 0%) sys 0.54 ( 0%) wall
local alloc : 1.29 ( 1%) usr 0.02 ( 0%) sys 1.54 ( 1%) wall
global alloc : 3.15 ( 2%) usr 0.05 ( 1%) sys 3.76 ( 2%) wall
reload CSE regs : 1.68 ( 1%) usr 0.01 ( 0%) sys 2.03 ( 1%) wall
flow 2 : 0.21 ( 0%) usr 0.00 ( 0%) sys 0.25 ( 0%) wall
if-conversion 2 : 0.05 ( 0%) usr 0.00 ( 0%) sys 0.17 ( 0%) wall
peephole 2 : 0.31 ( 0%) usr 0.00 ( 0%) sys 0.30 ( 0%) wall
rename registers : 0.33 ( 0%) usr 0.00 ( 0%) sys 0.38 ( 0%) wall
machine dep reorg : 0.46 ( 0%) usr 0.01 ( 0%) sys 0.47 ( 0%) wall
reorder blocks : 0.19 ( 0%) usr 0.00 ( 0%) sys 0.29 ( 0%) wall
shorten branches : 0.40 ( 0%) usr 0.00 ( 0%) sys 0.49 ( 0%) wall
reg stack : 0.05 ( 0%) usr 0.00 ( 0%) sys 0.06 ( 0%) wall
final : 0.59 ( 0%) usr 0.04 ( 1%) sys 0.73 ( 0%) wall
symout : 0.01 ( 0%) usr 0.00 ( 0%) sys 0.02 ( 0%) wall
rest of compilation : 0.38 ( 0%) usr 0.01 ( 0%) sys 0.55 ( 0%) wall
TOTAL : 138.55 4.95 167.96
I think this is a fair comparison, because the two builds have equal runtime
performance and possibly similar inlining (the non-leafified parts may still
be inlined differently).
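To line the two reports up pass by pass, something like the following sketch could be used. It is not part of the original report: the helper names `parse_usr_times` and `biggest_deltas` are made up for illustration, and the regex assumes timing rows shaped like the ones pasted above (name, then usr/sys/wall figures). Only three rows from each report are copied in as sample input.

```python
import re

# One timing row as pasted above, e.g.
# "tree PTA : 12.39 ( 7%) usr 0.06 ( 1%) sys 12.36 ( 7%) wall"
# Anything after "wall" (such as a ggc column) is ignored.
ROW = re.compile(
    r"^(?P<name>.+?)\s*:\s*"
    r"(?P<usr>\d+\.\d+)\s*\(\s*\d+%\)\s*usr\s+"
    r"(?P<sys>\d+\.\d+)\s*\(\s*\d+%\)\s*sys\s+"
    r"(?P<wall>\d+\.\d+)\s*\(\s*\d+%\)\s*wall"
)


def parse_usr_times(report):
    """Map each pass name to its user time in seconds (hypothetical helper)."""
    times = {}
    for line in report.splitlines():
        m = ROW.match(line.strip())
        if m:  # ggc-only lines and the TOTAL line do not match and are skipped
            times[m.group("name").strip()] = float(m.group("usr"))
    return times


def biggest_deltas(new, old, n=5):
    """Passes present in both reports, sorted by |new - old| usr time."""
    common = set(new) & set(old)
    deltas = [(name, round(new[name] - old[name], 2)) for name in common]
    return sorted(deltas, key=lambda d: abs(d[1]), reverse=True)[:n]


# Sample rows copied from the two reports above.
REPORT_41 = """\
integration : 12.02 ( 7%) usr 0.02 ( 0%) sys 11.77 ( 6%) wall
693907 kB (34%) ggc
tree PTA : 12.39 ( 7%) usr 0.06 ( 1%) sys 12.36 ( 7%) wall
32031 kB ( 2%) ggc
tree operand scan : 9.20 ( 5%) usr 2.05 (28%) sys 11.56 ( 6%) wall
68307 kB ( 3%) ggc
TOTAL : 179.59 7.31 187.62
"""

REPORT_40 = """\
integration : 45.27 (33%) usr 0.30 ( 6%) sys 53.22 (32%) wall
tree PTA : 1.53 ( 1%) usr 0.01 ( 0%) sys 2.07 ( 1%) wall
tree operand scan : 3.15 ( 2%) usr 0.93 (19%) sys 4.87 ( 3%) wall
TOTAL : 138.55 4.95 167.96
"""

if __name__ == "__main__":
    new = parse_usr_times(REPORT_41)
    old = parse_usr_times(REPORT_40)
    for name, delta in biggest_deltas(new, old):
        print(f"{name:24s} {delta:+.2f} s usr (4.1 + flatten minus 4.0 + leafify)")
```

Passes that exist in only one report (for example the ipa passes, or "tree remove redundant PHIs") are dropped by the intersection, so they would need to be looked at separately.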
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23955