[Bug tree-optimization/23955] Compile time regressions with tramp3d

rguenth at gcc dot gnu dot org gcc-bugzilla@gcc.gnu.org
Sun Sep 18 18:16:00 GMT 2005


------- Additional Comments From rguenth at gcc dot gnu dot org  2005-09-18 18:16 -------
-ftime-report for the 4.1 + flatten compile:

Execution times (seconds)
 garbage collection    :   6.32 ( 4%) usr   0.07 ( 1%) sys   6.73 ( 4%) wall      0 kB ( 0%) ggc
 callgraph construction:   0.42 ( 0%) usr   0.03 ( 0%) sys   0.42 ( 0%) wall   5274 kB ( 0%) ggc
 callgraph optimization:   0.12 ( 0%) usr   0.00 ( 0%) sys   0.11 ( 0%) wall   1605 kB ( 0%) ggc
 ipa reference         :   0.49 ( 0%) usr   0.00 ( 0%) sys   0.49 ( 0%) wall    440 kB ( 0%) ggc
 ipa pure const        :   0.13 ( 0%) usr   0.01 ( 0%) sys   0.14 ( 0%) wall      0 kB ( 0%) ggc
 ipa type escape       :   4.88 ( 3%) usr   0.00 ( 0%) sys   4.88 ( 3%) wall      0 kB ( 0%) ggc
 cfg construction      :   0.06 ( 0%) usr   0.00 ( 0%) sys   0.07 ( 0%) wall   2530 kB ( 0%) ggc
 cfg cleanup           :   1.29 ( 1%) usr   0.01 ( 0%) sys   1.16 ( 1%) wall   2256 kB ( 0%) ggc
 trivially dead code   :   0.70 ( 0%) usr   0.01 ( 0%) sys   0.62 ( 0%) wall      0 kB ( 0%) ggc
 life analysis         :   3.52 ( 2%) usr   0.00 ( 0%) sys   3.64 ( 2%) wall   2601 kB ( 0%) ggc
 life info update      :   0.39 ( 0%) usr   0.00 ( 0%) sys   0.29 ( 0%) wall    590 kB ( 0%) ggc
 alias analysis        :   1.41 ( 1%) usr   0.00 ( 0%) sys   1.13 ( 1%) wall  12731 kB ( 1%) ggc
 register scan         :   0.69 ( 0%) usr   0.01 ( 0%) sys   0.87 ( 0%) wall    526 kB ( 0%) ggc
 rebuild jump labels   :   0.21 ( 0%) usr   0.00 ( 0%) sys   0.16 ( 0%) wall      0 kB ( 0%) ggc
 preprocessing         :   0.75 ( 0%) usr   0.37 ( 5%) sys   1.15 ( 1%) wall    686 kB ( 0%) ggc
 parser                :   4.32 ( 2%) usr   0.98 (13%) sys   5.24 ( 3%) wall 229494 kB (11%) ggc
 name lookup           :   2.08 ( 1%) usr   0.88 (12%) sys   2.88 ( 2%) wall  46108 kB ( 2%) ggc
 inline heuristics     :   1.02 ( 1%) usr   0.04 ( 1%) sys   1.06 ( 1%) wall  36310 kB ( 2%) ggc
 integration           :  12.02 ( 7%) usr   0.02 ( 0%) sys  11.77 ( 6%) wall 693907 kB (34%) ggc
 tree gimplify         :   0.65 ( 0%) usr   0.03 ( 0%) sys   0.83 ( 0%) wall  11198 kB ( 1%) ggc
 tree eh               :   0.06 ( 0%) usr   0.00 ( 0%) sys   0.03 ( 0%) wall    107 kB ( 0%) ggc
 tree CFG construction :   0.04 ( 0%) usr   0.02 ( 0%) sys   0.10 ( 0%) wall  14527 kB ( 1%) ggc
 tree CFG cleanup      :   3.77 ( 2%) usr   0.11 ( 2%) sys   3.98 ( 2%) wall  16679 kB ( 1%) ggc
 tree VRP              :   3.57 ( 2%) usr   0.13 ( 2%) sys   3.69 ( 2%) wall  22691 kB ( 1%) ggc
 tree copy propagation :   3.09 ( 2%) usr   0.04 ( 1%) sys   3.09 ( 2%) wall   3066 kB ( 0%) ggc
 tree store copy prop  :   0.59 ( 0%) usr   0.03 ( 0%) sys   0.42 ( 0%) wall    652 kB ( 0%) ggc
 tree find ref. vars   :   1.36 ( 1%) usr   0.05 ( 1%) sys   1.48 ( 1%) wall  86797 kB ( 4%) ggc
 tree PTA              :  12.39 ( 7%) usr   0.06 ( 1%) sys  12.36 ( 7%) wall  32031 kB ( 2%) ggc
 tree alias analysis   :   9.35 ( 5%) usr   0.84 (11%) sys  10.62 ( 6%) wall  68682 kB ( 3%) ggc
 tree PHI insertion    :   1.40 ( 1%) usr   0.01 ( 0%) sys   1.49 ( 1%) wall  21821 kB ( 1%) ggc
 tree SSA rewrite      :   4.88 ( 3%) usr   0.05 ( 1%) sys   4.67 ( 2%) wall 108845 kB ( 5%) ggc
 tree SSA other        :   1.19 ( 1%) usr   0.47 ( 6%) sys   1.72 ( 1%) wall   1481 kB ( 0%) ggc
 tree SSA incremental  :  12.44 ( 7%) usr   0.23 ( 3%) sys  12.44 ( 7%) wall  30571 kB ( 1%) ggc
 tree operand scan     :   9.20 ( 5%) usr   2.05 (28%) sys  11.56 ( 6%) wall  68307 kB ( 3%) ggc
 dominator optimization:   9.49 ( 5%) usr   0.10 ( 1%) sys   9.60 ( 5%) wall  78640 kB ( 4%) ggc
 tree SRA              :   0.50 ( 0%) usr   0.02 ( 0%) sys   0.57 ( 0%) wall  11723 kB ( 1%) ggc
 tree STORE-CCP        :   0.62 ( 0%) usr   0.00 ( 0%) sys   0.68 ( 0%) wall    447 kB ( 0%) ggc
 tree CCP              :   1.38 ( 1%) usr   0.01 ( 0%) sys   1.30 ( 1%) wall   2024 kB ( 0%) ggc
 tree split crit edges :   0.16 ( 0%) usr   0.01 ( 0%) sys   0.22 ( 0%) wall  18294 kB ( 1%) ggc
 tree reassociation    :   0.11 ( 0%) usr   0.00 ( 0%) sys   0.08 ( 0%) wall      4 kB ( 0%) ggc
 tree PRE              :   2.88 ( 2%) usr   0.03 ( 0%) sys   3.01 ( 2%) wall  27185 kB ( 1%) ggc
 tree FRE              :   4.40 ( 2%) usr   0.06 ( 1%) sys   4.43 ( 2%) wall  41584 kB ( 2%) ggc
 tree code sinking     :   0.36 ( 0%) usr   0.01 ( 0%) sys   0.49 ( 0%) wall     79 kB ( 0%) ggc
 tree linearize phis   :   0.06 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall     10 kB ( 0%) ggc
 tree forward propagate:   1.00 ( 1%) usr   0.22 ( 3%) sys   1.19 ( 1%) wall  49760 kB ( 2%) ggc
 tree conservative DCE :   2.32 ( 1%) usr   0.00 ( 0%) sys   2.21 ( 1%) wall      0 kB ( 0%) ggc
 tree aggressive DCE   :   0.48 ( 0%) usr   0.00 ( 0%) sys   0.50 ( 0%) wall      0 kB ( 0%) ggc
 tree DSE              :   0.46 ( 0%) usr   0.00 ( 0%) sys   0.37 ( 0%) wall    760 kB ( 0%) ggc
 PHI merge             :   0.02 ( 0%) usr   0.00 ( 0%) sys   0.03 ( 0%) wall    747 kB ( 0%) ggc
 tree loop bounds      :   0.71 ( 0%) usr   0.00 ( 0%) sys   0.73 ( 0%) wall   5718 kB ( 0%) ggc
 loop invariant motion :   0.53 ( 0%) usr   0.00 ( 0%) sys   0.54 ( 0%) wall    185 kB ( 0%) ggc
 tree canonical iv     :   0.18 ( 0%) usr   0.00 ( 0%) sys   0.17 ( 0%) wall   4380 kB ( 0%) ggc
 scev constant prop    :   0.30 ( 0%) usr   0.00 ( 0%) sys   0.31 ( 0%) wall   4656 kB ( 0%) ggc
 complete unrolling    :   1.72 ( 1%) usr   0.07 ( 1%) sys   1.63 ( 1%) wall  32424 kB ( 2%) ggc
 tree iv optimization  :   1.43 ( 1%) usr   0.01 ( 0%) sys   1.33 ( 1%) wall  29199 kB ( 1%) ggc
 tree loop init        :   0.59 ( 0%) usr   0.02 ( 0%) sys   0.45 ( 0%) wall     12 kB ( 0%) ggc
 tree loop fini        :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall      0 kB ( 0%) ggc
 tree copy headers     :   0.41 ( 0%) usr   0.01 ( 0%) sys   0.63 ( 0%) wall  19708 kB ( 1%) ggc
 tree SSA uncprop      :   0.09 ( 0%) usr   0.00 ( 0%) sys   0.09 ( 0%) wall      0 kB ( 0%) ggc
 tree SSA to normal    :   1.42 ( 1%) usr   0.00 ( 0%) sys   1.46 ( 1%) wall  15217 kB ( 1%) ggc
 tree rename SSA copies:   0.56 ( 0%) usr   0.00 ( 0%) sys   0.64 ( 0%) wall      1 kB ( 0%) ggc
 dominance frontiers   :   0.61 ( 0%) usr   0.00 ( 0%) sys   0.77 ( 0%) wall      0 kB ( 0%) ggc
 control dependences   :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.04 ( 0%) wall      0 kB ( 0%) ggc
 expand                :   7.56 ( 4%) usr   0.10 ( 1%) sys   7.63 ( 4%) wall  85819 kB ( 4%) ggc
 varconst              :   0.23 ( 0%) usr   0.00 ( 0%) sys   0.22 ( 0%) wall    615 kB ( 0%) ggc
 jump                  :   0.14 ( 0%) usr   0.00 ( 0%) sys   0.13 ( 0%) wall    189 kB ( 0%) ggc
 CSE                   :   6.53 ( 4%) usr   0.01 ( 0%) sys   6.65 ( 4%) wall   6438 kB ( 0%) ggc
 loop analysis         :   1.15 ( 1%) usr   0.00 ( 0%) sys   1.06 ( 1%) wall   6947 kB ( 0%) ggc
 global CSE            :   0.05 ( 0%) usr   0.00 ( 0%) sys   0.03 ( 0%) wall      0 kB ( 0%) ggc
 CPROP 1               :   0.52 ( 0%) usr   0.00 ( 0%) sys   0.51 ( 0%) wall   4188 kB ( 0%) ggc
 PRE                   :   1.57 ( 1%) usr   0.00 ( 0%) sys   1.56 ( 1%) wall   2273 kB ( 0%) ggc
 CPROP 2               :   0.67 ( 0%) usr   0.00 ( 0%) sys   0.61 ( 0%) wall   1548 kB ( 0%) ggc
 bypass jumps          :   0.56 ( 0%) usr   0.00 ( 0%) sys   0.59 ( 0%) wall   1401 kB ( 0%) ggc
 web                   :   0.43 ( 0%) usr   0.00 ( 0%) sys   0.46 ( 0%) wall    222 kB ( 0%) ggc
 CSE 2                 :   4.20 ( 2%) usr   0.00 ( 0%) sys   4.40 ( 2%) wall   3341 kB ( 0%) ggc
 branch prediction     :   0.93 ( 1%) usr   0.00 ( 0%) sys   0.89 ( 0%) wall   4372 kB ( 0%) ggc
 flow analysis         :   0.07 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall      7 kB ( 0%) ggc
 combiner              :   3.06 ( 2%) usr   0.01 ( 0%) sys   2.96 ( 2%) wall  10796 kB ( 1%) ggc
 if-conversion         :   0.13 ( 0%) usr   0.00 ( 0%) sys   0.23 ( 0%) wall    405 kB ( 0%) ggc
 regmove               :   0.67 ( 0%) usr   0.00 ( 0%) sys   0.62 ( 0%) wall    146 kB ( 0%) ggc
 local alloc           :   1.81 ( 1%) usr   0.00 ( 0%) sys   1.97 ( 1%) wall   3329 kB ( 0%) ggc
 global alloc          :   4.71 ( 3%) usr   0.00 ( 0%) sys   4.84 ( 3%) wall  26430 kB ( 1%) ggc
 reload CSE regs       :   2.71 ( 2%) usr   0.00 ( 0%) sys   2.73 ( 1%) wall  10832 kB ( 1%) ggc
 flow 2                :   0.14 ( 0%) usr   0.00 ( 0%) sys   0.12 ( 0%) wall   2225 kB ( 0%) ggc
 if-conversion 2       :   0.07 ( 0%) usr   0.00 ( 0%) sys   0.12 ( 0%) wall     11 kB ( 0%) ggc
 peephole 2            :   0.30 ( 0%) usr   0.00 ( 0%) sys   0.30 ( 0%) wall    356 kB ( 0%) ggc
 rename registers      :   1.43 ( 1%) usr   0.00 ( 0%) sys   1.60 ( 1%) wall   2031 kB ( 0%) ggc
 machine dep reorg     :   0.56 ( 0%) usr   0.00 ( 0%) sys   0.68 ( 0%) wall     75 kB ( 0%) ggc
 reorder blocks        :   0.33 ( 0%) usr   0.00 ( 0%) sys   0.30 ( 0%) wall   2611 kB ( 0%) ggc
 shorten branches      :   0.13 ( 0%) usr   0.00 ( 0%) sys   0.19 ( 0%) wall      0 kB ( 0%) ggc
 reg stack             :   0.14 ( 0%) usr   0.00 ( 0%) sys   0.18 ( 0%) wall   1705 kB ( 0%) ggc
 final                 :   1.11 ( 1%) usr   0.03 ( 0%) sys   1.12 ( 1%) wall   4552 kB ( 0%) ggc
 TOTAL                 : 179.59             7.31           187.62           2049140 kB

and for 4.0 + leafify patch:

 garbage collection    :   5.01 ( 4%) usr   0.06 ( 1%) sys   5.91 ( 4%) wall
 callgraph construction:   0.28 ( 0%) usr   0.00 ( 0%) sys   0.34 ( 0%) wall
 callgraph optimization:   0.67 ( 0%) usr   0.07 ( 1%) sys   0.85 ( 1%) wall
 cfg construction      :   0.01 ( 0%) usr   0.01 ( 0%) sys   0.05 ( 0%) wall
 cfg cleanup           :   0.94 ( 1%) usr   0.00 ( 0%) sys   1.00 ( 1%) wall
 trivially dead code   :   0.58 ( 0%) usr   0.01 ( 0%) sys   0.62 ( 0%) wall
 life analysis         :   2.53 ( 2%) usr   0.03 ( 1%) sys   3.31 ( 2%) wall
 life info update      :   0.27 ( 0%) usr   0.00 ( 0%) sys   0.32 ( 0%) wall
 alias analysis        :   0.90 ( 1%) usr   0.00 ( 0%) sys   0.95 ( 1%) wall
 register scan         :   0.74 ( 1%) usr   0.02 ( 0%) sys   0.72 ( 0%) wall
 rebuild jump labels   :   0.23 ( 0%) usr   0.00 ( 0%) sys   0.18 ( 0%) wall
 preprocessing         :   0.29 ( 0%) usr   0.18 ( 4%) sys   0.51 ( 0%) wall
 parser                :   4.71 ( 3%) usr   0.65 (13%) sys   6.07 ( 4%) wall
 name lookup           :   2.39 ( 2%) usr   0.71 (14%) sys   3.84 ( 2%) wall
 integration           :  45.27 (33%) usr   0.30 ( 6%) sys  53.22 (32%) wall
 tree gimplify         :   0.65 ( 0%) usr   0.01 ( 0%) sys   0.84 ( 1%) wall
 tree eh               :   0.36 ( 0%) usr   0.01 ( 0%) sys   0.43 ( 0%) wall
 tree CFG construction :   0.77 ( 1%) usr   0.01 ( 0%) sys   0.98 ( 1%) wall
 tree CFG cleanup      :   1.34 ( 1%) usr   0.01 ( 0%) sys   1.34 ( 1%) wall
 tree find referenced vars:   0.99 ( 1%) usr   0.02 ( 0%) sys   1.12 ( 1%) wall
 tree PTA              :   1.53 ( 1%) usr   0.01 ( 0%) sys   2.07 ( 1%) wall
 tree alias analysis   :   5.58 ( 4%) usr   0.11 ( 2%) sys   6.57 ( 4%) wall
 tree PHI insertion    :   2.37 ( 2%) usr   0.02 ( 0%) sys   2.91 ( 2%) wall
 tree SSA rewrite      :   2.73 ( 2%) usr   0.03 ( 1%) sys   3.11 ( 2%) wall
 tree SSA other        :   4.80 ( 3%) usr   0.91 (18%) sys   6.47 ( 4%) wall
 tree operand scan     :   3.15 ( 2%) usr   0.93 (19%) sys   4.87 ( 3%) wall
 dominator optimization:   7.91 ( 6%) usr   0.24 ( 5%) sys   9.26 ( 6%) wall
 tree SRA              :   0.40 ( 0%) usr   0.00 ( 0%) sys   0.52 ( 0%) wall
 tree CCP              :   0.58 ( 0%) usr   0.01 ( 0%) sys   0.64 ( 0%) wall
 tree split crit edges :   0.14 ( 0%) usr   0.01 ( 0%) sys   0.18 ( 0%) wall
 tree PRE              :   1.95 ( 1%) usr   0.05 ( 1%) sys   2.23 ( 1%) wall
 tree remove redundant PHIs:   1.37 ( 1%) usr   0.03 ( 1%) sys   1.70 ( 1%) wall
 tree linearize phis   :   0.01 ( 0%) usr   0.01 ( 0%) sys   0.03 ( 0%) wall
 tree forward propagate:   0.54 ( 0%) usr   0.00 ( 0%) sys   0.67 ( 0%) wall
 tree conservative DCE :   1.26 ( 1%) usr   0.00 ( 0%) sys   1.56 ( 1%) wall
 tree aggressive DCE   :   0.41 ( 0%) usr   0.00 ( 0%) sys   0.46 ( 0%) wall
 tree DSE              :   0.61 ( 0%) usr   0.00 ( 0%) sys   0.67 ( 0%) wall
 PHI merge             :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall
 tree loop optimization:   0.01 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall
 tree record loop bounds:   0.30 ( 0%) usr   0.01 ( 0%) sys   0.37 ( 0%) wall
 loop invariant motion :   0.79 ( 1%) usr   0.00 ( 0%) sys   0.86 ( 1%) wall
 tree canonical iv creation:   0.33 ( 0%) usr   0.00 ( 0%) sys   0.35 ( 0%) wall
 complete unrolling    :   0.78 ( 1%) usr   0.05 ( 1%) sys   1.19 ( 1%) wall
 tree iv optimization  :   1.66 ( 1%) usr   0.05 ( 1%) sys   1.98 ( 1%) wall
 tree loop init        :   0.47 ( 0%) usr   0.01 ( 0%) sys   0.68 ( 0%) wall
 tree loop fini        :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall
 tree copy headers     :   0.74 ( 1%) usr   0.03 ( 1%) sys   0.96 ( 1%) wall
 tree SSA to normal    :   1.31 ( 1%) usr   0.02 ( 0%) sys   1.72 ( 1%) wall
 tree rename SSA copies:   0.52 ( 0%) usr   0.00 ( 0%) sys   0.63 ( 0%) wall
 dominance frontiers   :   0.10 ( 0%) usr   0.00 ( 0%) sys   0.15 ( 0%) wall
 control dependences   :   0.04 ( 0%) usr   0.00 ( 0%) sys   0.07 ( 0%) wall
 expand                :   5.74 ( 4%) usr   0.08 ( 2%) sys   6.98 ( 4%) wall
 varconst              :   0.21 ( 0%) usr   0.00 ( 0%) sys   0.22 ( 0%) wall
 jump                  :   0.07 ( 0%) usr   0.00 ( 0%) sys   0.12 ( 0%) wall
 CSE                   :   3.80 ( 3%) usr   0.02 ( 0%) sys   4.16 ( 2%) wall
 loop analysis         :   0.64 ( 0%) usr   0.03 ( 1%) sys   0.83 ( 0%) wall
 global CSE            :   0.05 ( 0%) usr   0.00 ( 0%) sys   0.10 ( 0%) wall
 CPROP 1               :   0.28 ( 0%) usr   0.00 ( 0%) sys   0.49 ( 0%) wall
 PRE                   :   1.01 ( 1%) usr   0.00 ( 0%) sys   1.15 ( 1%) wall
 CPROP 2               :   0.35 ( 0%) usr   0.00 ( 0%) sys   0.52 ( 0%) wall
 bypass jumps          :   0.38 ( 0%) usr   0.00 ( 0%) sys   0.57 ( 0%) wall
 CSE 2                 :   1.86 ( 1%) usr   0.01 ( 0%) sys   2.13 ( 1%) wall
 branch prediction     :   0.93 ( 1%) usr   0.02 ( 0%) sys   1.01 ( 1%) wall
 flow analysis         :   0.02 ( 0%) usr   0.00 ( 0%) sys   0.07 ( 0%) wall
 combiner              :   1.94 ( 1%) usr   0.00 ( 0%) sys   2.21 ( 1%) wall
 if-conversion         :   0.24 ( 0%) usr   0.00 ( 0%) sys   0.29 ( 0%) wall
 regmove               :   0.57 ( 0%) usr   0.02 ( 0%) sys   0.54 ( 0%) wall
 local alloc           :   1.29 ( 1%) usr   0.02 ( 0%) sys   1.54 ( 1%) wall
 global alloc          :   3.15 ( 2%) usr   0.05 ( 1%) sys   3.76 ( 2%) wall
 reload CSE regs       :   1.68 ( 1%) usr   0.01 ( 0%) sys   2.03 ( 1%) wall
 flow 2                :   0.21 ( 0%) usr   0.00 ( 0%) sys   0.25 ( 0%) wall
 if-conversion 2       :   0.05 ( 0%) usr   0.00 ( 0%) sys   0.17 ( 0%) wall
 peephole 2            :   0.31 ( 0%) usr   0.00 ( 0%) sys   0.30 ( 0%) wall
 rename registers      :   0.33 ( 0%) usr   0.00 ( 0%) sys   0.38 ( 0%) wall
 machine dep reorg     :   0.46 ( 0%) usr   0.01 ( 0%) sys   0.47 ( 0%) wall
 reorder blocks        :   0.19 ( 0%) usr   0.00 ( 0%) sys   0.29 ( 0%) wall
 shorten branches      :   0.40 ( 0%) usr   0.00 ( 0%) sys   0.49 ( 0%) wall
 reg stack             :   0.05 ( 0%) usr   0.00 ( 0%) sys   0.06 ( 0%) wall
 final                 :   0.59 ( 0%) usr   0.04 ( 1%) sys   0.73 ( 0%) wall
 symout                :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall
 rest of compilation   :   0.38 ( 0%) usr   0.01 ( 0%) sys   0.55 ( 0%) wall
 TOTAL                 : 138.55             4.95           167.96


I think this is a fair comparison: the runtime performance of the generated
code is equal in both cases, and the inlining is probably similar (although
the non-leafified parts may still be inlined differently).
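
To see which passes account for the difference, the usr columns of the two
reports can be diffed mechanically. The following is only a rough sketch, not
a tool that exists in GCC: the helper names parse_usr_times and usr_deltas are
made up for illustration, the regular expression assumes the row layout shown
above, and the sample rows are a four-line excerpt copied from the two
reports (a pass missing from one report is counted as 0 seconds there).

```python
import re

# Matches the usr column of one -ftime-report row, e.g.
#   " tree PTA              :  12.39 ( 7%) usr ..."
# The TOTAL row has no "(NN%)" field, so it is skipped on purpose.
ROW = re.compile(
    r"^\s*(?P<name>[^:]+?)\s*:\s*(?P<usr>\d+\.\d+)\s*\(\s*\d+%\)\s*usr"
)


def parse_usr_times(report):
    """Return {pass name: usr seconds} for every row with a usr column."""
    times = {}
    for line in report.splitlines():
        match = ROW.match(line)
        if match:
            times[match.group("name")] = float(match.group("usr"))
    return times


def usr_deltas(new, old):
    """Return (pass, new - old) pairs, largest absolute change first.

    A pass that is absent from one report is treated as 0 seconds there.
    """
    names = set(new) | set(old)
    deltas = [
        (name, round(new.get(name, 0.0) - old.get(name, 0.0), 2))
        for name in names
    ]
    return sorted(deltas, key=lambda d: (-abs(d[1]), d[0]))


# Excerpt of the 4.1 + flatten report (rows copied from above).
REPORT_41 = """\
 integration           :  12.02 ( 7%) usr   0.02 ( 0%) sys  11.77 ( 6%) wall
 tree PTA              :  12.39 ( 7%) usr   0.06 ( 1%) sys  12.36 ( 7%) wall
 tree SSA incremental  :  12.44 ( 7%) usr   0.23 ( 3%) sys  12.44 ( 7%) wall
 tree operand scan     :   9.20 ( 5%) usr   2.05 (28%) sys  11.56 ( 6%) wall
"""

# Excerpt of the 4.0 + leafify report (rows copied from above).
REPORT_40 = """\
 integration           :  45.27 (33%) usr   0.30 ( 6%) sys  53.22 (32%) wall
 tree PTA              :   1.53 ( 1%) usr   0.01 ( 0%) sys   2.07 ( 1%) wall
 tree operand scan     :   3.15 ( 2%) usr   0.93 (19%) sys   4.87 ( 3%) wall
"""

deltas = usr_deltas(parse_usr_times(REPORT_41), parse_usr_times(REPORT_40))
for name, delta in deltas:
    print(f"{name:24s} {delta:+7.2f}")
```

On this excerpt, integration is the only pass that got cheaper in 4.1; tree
SSA incremental, tree PTA and tree operand scan all got more expensive.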

-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23955


