This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



Re: Compilation performance comparison of 3.5.0 and TreeSSA trees on MICO sources as requested in: [tree-ssa] Merge status 2004-05-03


Daniel Berlin wrote:

> > Is it possible not to bootstrap tree-ssa, but just compile
> > it with GCC 3.4.0/3.5.0 and see if the parser is faster? If so, how?
>
> Sure. Don't do "make bootstrap", just do "make". It'll compile gcc with your system compiler.

I ran that experiment to check my suspicion that tree-ssa is not good at optimizing itself. The results do _not_ confirm this theory:


tree-ssa built by tree-ssa compiling tramp3d-v3.cpp:
Execution times (seconds)
 garbage collection    :  12.90 ( 6%) usr   0.00 ( 0%) sys  13.04 ( 6%)
 callgraph construction:   1.50 ( 1%) usr   0.00 ( 0%) sys   1.52 ( 1%)
 callgraph optimization:   0.56 ( 0%) usr   0.06 ( 1%) sys   0.62 ( 0%)
 cfg construction      :   0.64 ( 0%) usr   0.02 ( 0%) sys   0.66 ( 0%)
 cfg cleanup           :   2.23 ( 1%) usr   0.01 ( 0%) sys   2.24 ( 1%)
 trivially dead code   :   2.31 ( 1%) usr   0.00 ( 0%) sys   2.31 ( 1%)
 life analysis         :   4.46 ( 2%) usr   0.00 ( 0%) sys   4.50 ( 2%)
 life info update      :   2.38 ( 1%) usr   0.01 ( 0%) sys   2.39 ( 1%)
 alias analysis        :   3.33 ( 2%) usr   0.00 ( 0%) sys   3.35 ( 2%)
 register scan         :   1.88 ( 1%) usr   0.00 ( 0%) sys   1.90 ( 1%)
 rebuild jump labels   :   0.56 ( 0%) usr   0.00 ( 0%) sys   0.56 ( 0%)
 preprocessing         :   0.70 ( 0%) usr   0.14 ( 3%) sys   0.88 ( 0%)
 parser                :  16.16 ( 8%) usr   1.15 (25%) sys  17.52 ( 8%)
 name lookup           :   4.98 ( 2%) usr   1.27 (28%) sys   6.32 ( 3%)
 integration           :  22.32 (11%) usr   0.15 ( 3%) sys  22.71 (10%)
 tree gimplify         :   4.01 ( 2%) usr   0.08 ( 2%) sys   4.19 ( 2%)
 tree eh               :   2.84 ( 1%) usr   0.01 ( 0%) sys   2.85 ( 1%)
 tree CFG construction :   1.66 ( 1%) usr   0.05 ( 1%) sys   1.71 ( 1%)
 tree CFG cleanup      :   2.61 ( 1%) usr   0.00 ( 0%) sys   2.63 ( 1%)
 tree PTA              :   0.71 ( 0%) usr   0.00 ( 0%) sys   0.73 ( 0%)
 tree alias analysis   :   0.88 ( 0%) usr   0.01 ( 0%) sys   0.89 ( 0%)
 tree PHI insertion    :   2.50 ( 1%) usr   0.01 ( 0%) sys   2.57 ( 1%)
 tree SSA rewrite      :   3.19 ( 2%) usr   0.00 ( 0%) sys   3.21 ( 1%)
 tree SSA other        :   5.27 ( 2%) usr   0.15 ( 3%) sys   5.44 ( 2%)
 tree operand scan     :   2.92 ( 1%) usr   0.29 ( 6%) sys   3.32 ( 2%)
 dominator optimization:  11.84 ( 6%) usr   0.18 ( 4%) sys  12.09 ( 6%)
 tree SRA              :   0.18 ( 0%) usr   0.00 ( 0%) sys   0.18 ( 0%)
 tree CCP              :   2.05 ( 1%) usr   0.01 ( 0%) sys   2.10 ( 1%)
 tree split crit edges :   0.23 ( 0%) usr   0.00 ( 0%) sys   0.23 ( 0%)
 tree PRE              :   6.72 ( 3%) usr   0.04 ( 1%) sys   6.76 ( 3%)
 tree linearize phis   :   0.02 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%)
 tree forward propagate:   1.25 ( 1%) usr   0.00 ( 0%) sys   1.29 ( 1%)
 tree conservative DCE :   2.66 ( 1%) usr   0.02 ( 0%) sys   2.68 ( 1%)
 tree aggressive DCE   :   1.23 ( 1%) usr   0.01 ( 0%) sys   1.24 ( 1%)
 tree DSE              :   2.51 ( 1%) usr   0.00 ( 0%) sys   2.57 ( 1%)
 tree copy headers     :   2.09 ( 1%) usr   0.03 ( 1%) sys   2.16 ( 1%)
 tree SSA to normal    :   3.48 ( 2%) usr   0.13 ( 3%) sys   3.66 ( 2%)
 tree rename SSA copies:   0.99 ( 0%) usr   0.01 ( 0%) sys   1.02 ( 0%)
 dominance frontiers   :   0.32 ( 0%) usr   0.00 ( 0%) sys   0.32 ( 0%)
 control dependences   :   0.16 ( 0%) usr   0.00 ( 0%) sys   0.16 ( 0%)
 expand                :  20.61 (10%) usr   0.09 ( 2%) sys  20.93 (10%)
 varconst              :   0.49 ( 0%) usr   0.00 ( 0%) sys   0.49 ( 0%)
 jump                  :   1.15 ( 1%) usr   0.06 ( 1%) sys   1.21 ( 1%)
 CSE                   :   7.82 ( 4%) usr   0.02 ( 0%) sys   7.90 ( 4%)
 global CSE            :   5.15 ( 2%) usr   0.01 ( 0%) sys   5.20 ( 2%)
 loop analysis         :   1.12 ( 1%) usr   0.00 ( 0%) sys   1.14 ( 1%)
 bypass jumps          :   1.05 ( 0%) usr   0.01 ( 0%) sys   1.06 ( 0%)
 web                   :   1.30 ( 1%) usr   0.02 ( 0%) sys   1.32 ( 1%)
 CSE 2                 :   3.02 ( 1%) usr   0.02 ( 0%) sys   3.09 ( 1%)
 branch prediction     :   2.05 ( 1%) usr   0.03 ( 1%) sys   2.12 ( 1%)
 flow analysis         :   0.18 ( 0%) usr   0.00 ( 0%) sys   0.18 ( 0%)
 combiner              :   2.94 ( 1%) usr   0.02 ( 0%) sys   3.00 ( 1%)
 if-conversion         :   0.59 ( 0%) usr   0.00 ( 0%) sys   0.63 ( 0%)
 regmove               :   0.86 ( 0%) usr   0.00 ( 0%) sys   0.86 ( 0%)
 mode switching        :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%)
 local alloc           :   2.49 ( 1%) usr   0.02 ( 0%) sys   2.55 ( 1%)
 global alloc          :   5.72 ( 3%) usr   0.10 ( 2%) sys   5.88 ( 3%)
 reload CSE regs       :   2.49 ( 1%) usr   0.02 ( 0%) sys   2.58 ( 1%)
 flow 2                :   0.69 ( 0%) usr   0.00 ( 0%) sys   0.71 ( 0%)
 if-conversion 2       :   0.35 ( 0%) usr   0.00 ( 0%) sys   0.35 ( 0%)
 peephole 2            :   0.51 ( 0%) usr   0.00 ( 0%) sys   0.51 ( 0%)
 rename registers      :   0.81 ( 0%) usr   0.01 ( 0%) sys   0.82 ( 0%)
 scheduling 2          :   4.50 ( 2%) usr   0.06 ( 1%) sys   4.65 ( 2%)
 machine dep reorg     :   0.80 ( 0%) usr   0.00 ( 0%) sys   0.82 ( 0%)
 reorder blocks        :   0.65 ( 0%) usr   0.00 ( 0%) sys   0.65 ( 0%)
 shorten branches      :   0.88 ( 0%) usr   0.00 ( 0%) sys   0.92 ( 0%)
 reg stack             :   0.13 ( 0%) usr   0.00 ( 0%) sys   0.13 ( 0%)
 final                 :   1.53 ( 1%) usr   0.14 ( 3%) sys   1.67 ( 1%)
 symout                :   0.02 ( 0%) usr   0.02 ( 0%) sys   0.04 ( 0%)
 rest of compilation   :   2.39 ( 1%) usr   0.02 ( 0%) sys   2.44 ( 1%)
 TOTAL                 : 211.55             4.52           218.47
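
As a reading aid for the report above: the parenthesised percentages appear to be each phase's share of the TOTAL in the same column, rounded to a whole number. A minimal Python sketch of that arithmetic, using values copied from the table; the helper name `share` is hypothetical and not part of GCC:

```python
# Column totals copied from the TOTAL line of the report above.
TOTAL_USR, TOTAL_SYS = 211.55, 4.52


def share(seconds, total):
    """Return a phase's share of the column total as a whole percentage."""
    return round(100.0 * seconds / total)


parser_usr = share(16.16, TOTAL_USR)       # parser, usr column
parser_sys = share(1.15, TOTAL_SYS)        # parser, sys column
integration_usr = share(22.32, TOTAL_USR)  # integration, usr column

print(parser_usr, parser_sys, integration_usr)
```

This also explains why the parser shows 25% in the sys column while taking only 8% of usr time: the sys total is small, so a phase that accounts for about one second of it dominates that column.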

tree-ssa built by mainline compiling tramp3d-v3.cpp:
Execution times (seconds)
 garbage collection    :  13.59 ( 6%) usr   0.01 ( 0%) sys  13.76 ( 6%)
 callgraph construction:   1.56 ( 1%) usr   0.01 ( 0%) sys   1.58 ( 1%)
 callgraph optimization:   0.60 ( 0%) usr   0.02 ( 0%) sys   0.62 ( 0%)
 cfg construction      :   0.58 ( 0%) usr   0.01 ( 0%) sys   0.59 ( 0%)
 cfg cleanup           :   2.04 ( 1%) usr   0.00 ( 0%) sys   2.06 ( 1%)
 trivially dead code   :   2.33 ( 1%) usr   0.01 ( 0%) sys   2.36 ( 1%)
 life analysis         :   4.83 ( 2%) usr   0.02 ( 0%) sys   4.91 ( 2%)
 life info update      :   2.48 ( 1%) usr   0.00 ( 0%) sys   2.52 ( 1%)
 alias analysis        :   3.09 ( 1%) usr   0.00 ( 0%) sys   3.09 ( 1%)
 register scan         :   2.14 ( 1%) usr   0.02 ( 0%) sys   2.16 ( 1%)
 rebuild jump labels   :   0.62 ( 0%) usr   0.00 ( 0%) sys   0.62 ( 0%)
 preprocessing         :   0.73 ( 0%) usr   0.08 ( 2%) sys   0.82 ( 0%)
 parser                :  16.75 ( 8%) usr   0.85 (19%) sys  17.88 ( 8%)
 name lookup           :   5.32 ( 2%) usr   1.27 (29%) sys   6.92 ( 3%)
 integration           :  22.29 (10%) usr   0.11 ( 3%) sys  22.61 (10%)
 tree gimplify         :   3.64 ( 2%) usr   0.04 ( 1%) sys   3.72 ( 2%)
 tree eh               :   2.73 ( 1%) usr   0.01 ( 0%) sys   2.77 ( 1%)
 tree CFG construction :   1.65 ( 1%) usr   0.07 ( 2%) sys   1.75 ( 1%)
 tree CFG cleanup      :   2.63 ( 1%) usr   0.01 ( 0%) sys   2.64 ( 1%)
 tree PTA              :   0.74 ( 0%) usr   0.00 ( 0%) sys   0.74 ( 0%)
 tree alias analysis   :   0.95 ( 0%) usr   0.02 ( 0%) sys   0.97 ( 0%)
 tree PHI insertion    :   2.78 ( 1%) usr   0.05 ( 1%) sys   2.89 ( 1%)
 tree SSA rewrite      :   2.88 ( 1%) usr   0.01 ( 0%) sys   2.93 ( 1%)
 tree SSA other        :   4.51 ( 2%) usr   0.16 ( 4%) sys   4.68 ( 2%)
 tree operand scan     :   3.07 ( 1%) usr   0.26 ( 6%) sys   3.35 ( 2%)
 dominator optimization:  11.99 ( 6%) usr   0.15 ( 3%) sys  12.26 ( 6%)
 tree SRA              :   0.23 ( 0%) usr   0.00 ( 0%) sys   0.25 ( 0%)
 tree CCP              :   1.89 ( 1%) usr   0.02 ( 0%) sys   1.91 ( 1%)
 tree split crit edges :   0.33 ( 0%) usr   0.02 ( 0%) sys   0.35 ( 0%)
 tree PRE              :   6.56 ( 3%) usr   0.02 ( 0%) sys   6.70 ( 3%)
 tree linearize phis   :   0.03 ( 0%) usr   0.00 ( 0%) sys   0.03 ( 0%)
 tree forward propagate:   1.40 ( 1%) usr   0.00 ( 0%) sys   1.40 ( 1%)
 tree conservative DCE :   2.53 ( 1%) usr   0.01 ( 0%) sys   2.54 ( 1%)
 tree aggressive DCE   :   1.24 ( 1%) usr   0.01 ( 0%) sys   1.28 ( 1%)
 tree DSE              :   2.57 ( 1%) usr   0.00 ( 0%) sys   2.59 ( 1%)
 tree copy headers     :   2.30 ( 1%) usr   0.05 ( 1%) sys   2.36 ( 1%)
 tree SSA to normal    :   3.45 ( 2%) usr   0.13 ( 3%) sys   3.63 ( 2%)
 tree NRV optimization :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%)
 tree rename SSA copies:   1.20 ( 1%) usr   0.06 ( 1%) sys   1.29 ( 1%)
 dominance frontiers   :   0.42 ( 0%) usr   0.00 ( 0%) sys   0.42 ( 0%)
 control dependences   :   0.12 ( 0%) usr   0.00 ( 0%) sys   0.12 ( 0%)
 expand                :  20.89 (10%) usr   0.13 ( 3%) sys  21.48 (10%)
 varconst              :   0.56 ( 0%) usr   0.01 ( 0%) sys   0.57 ( 0%)
 jump                  :   1.32 ( 1%) usr   0.09 ( 2%) sys   1.43 ( 1%)
 CSE                   :   7.91 ( 4%) usr   0.02 ( 0%) sys   7.98 ( 4%)
 global CSE            :   5.22 ( 2%) usr   0.07 ( 2%) sys   5.41 ( 2%)
 loop analysis         :   1.25 ( 1%) usr   0.01 ( 0%) sys   1.27 ( 1%)
 bypass jumps          :   1.06 ( 0%) usr   0.01 ( 0%) sys   1.07 ( 0%)
 web                   :   1.41 ( 1%) usr   0.04 ( 1%) sys   1.45 ( 1%)
 CSE 2                 :   3.30 ( 2%) usr   0.00 ( 0%) sys   3.40 ( 2%)
 branch prediction     :   2.16 ( 1%) usr   0.03 ( 1%) sys   2.26 ( 1%)
 flow analysis         :   0.15 ( 0%) usr   0.00 ( 0%) sys   0.15 ( 0%)
 combiner              :   2.84 ( 1%) usr   0.01 ( 0%) sys   2.88 ( 1%)
 if-conversion         :   0.56 ( 0%) usr   0.01 ( 0%) sys   0.58 ( 0%)
 regmove               :   0.86 ( 0%) usr   0.00 ( 0%) sys   0.88 ( 0%)
 local alloc           :   2.55 ( 1%) usr   0.02 ( 0%) sys   2.61 ( 1%)
 global alloc          :   5.68 ( 3%) usr   0.10 ( 2%) sys   5.88 ( 3%)
 reload CSE regs       :   2.94 ( 1%) usr   0.00 ( 0%) sys   2.94 ( 1%)
 flow 2                :   0.50 ( 0%) usr   0.00 ( 0%) sys   0.51 ( 0%)
 if-conversion 2       :   0.28 ( 0%) usr   0.00 ( 0%) sys   0.28 ( 0%)
 peephole 2            :   0.62 ( 0%) usr   0.00 ( 0%) sys   0.64 ( 0%)
 rename registers      :   0.96 ( 0%) usr   0.04 ( 1%) sys   1.01 ( 0%)
 scheduling 2          :   4.27 ( 2%) usr   0.08 ( 2%) sys   4.41 ( 2%)
 machine dep reorg     :   0.91 ( 0%) usr   0.00 ( 0%) sys   0.91 ( 0%)
 reorder blocks        :   0.49 ( 0%) usr   0.01 ( 0%) sys   0.52 ( 0%)
 shorten branches      :   0.78 ( 0%) usr   0.01 ( 0%) sys   0.79 ( 0%)
 reg stack             :   0.10 ( 0%) usr   0.00 ( 0%) sys   0.10 ( 0%)
 final                 :   1.60 ( 1%) usr   0.15 ( 3%) sys   1.79 ( 1%)
 symout                :   0.05 ( 0%) usr   0.01 ( 0%) sys   0.06 ( 0%)
 rest of compilation   :   2.20 ( 1%) usr   0.02 ( 0%) sys   2.24 ( 1%)
 TOTAL                 : 214.24             4.39           221.61

So tree-ssa is actually slightly better than mainline at optimizing tree-ssa for compiling the
tramp3d-v3.cpp testcase (211.55 vs. 214.24 seconds usr in total), or at least not worse. The opposite check (compiling mainline with mainline and with tree-ssa) still needs to be performed.
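
To compare the two reports phase by phase rather than by eye, something like the following Python sketch could be used. It assumes the line layout shown in the reports above (a phase name, a colon, then three decimal numbers); `parse_report` and the variable names are hypothetical, and only two lines of each report are reproduced here to keep the example short.

```python
import re


def parse_report(text):
    """Map phase name -> (usr, sys, third column), all in seconds."""
    phases = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        # Percentages such as "( 8%)" have no decimal point, so this
        # pattern picks up only the three timing values.
        numbers = re.findall(r"\d+\.\d+", rest)
        if sep and len(numbers) == 3:
            phases[name.strip()] = tuple(float(n) for n in numbers)
    return phases


# Excerpts copied from the two reports above.
SSA_BUILT = """\
 parser                :  16.16 ( 8%) usr   1.15 (25%) sys  17.52 ( 8%)
 TOTAL                 : 211.55             4.52           218.47
"""
MAINLINE_BUILT = """\
 parser                :  16.75 ( 8%) usr   0.85 (19%) sys  17.88 ( 8%)
 TOTAL                 : 214.24             4.39           221.61
"""

ssa = parse_report(SSA_BUILT)
mainline = parse_report(MAINLINE_BUILT)

# Positive delta: the mainline-built compiler spent more usr seconds.
deltas = {name: round(mainline[name][0] - ssa[name][0], 2) for name in ssa}
print(deltas)
```

On these excerpts the parser delta is 0.59 seconds and the total delta is 2.69 seconds, both in favour of the tree-ssa-built compiler. Differences this small come from a single run each, so they may well be within measurement noise.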


Stripped binary size of cc1plus is comparable:
-rwxr-x--- 1 rguenth tat 4480828 May 5 23:57 gcc-obj-3.5/gcc/cc1plus*
-rwxr-x--- 1 rguenth tat 4501168 May 5 23:57 gcc-obj-ssa/gcc/cc1plus*
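
For reference, the size difference works out to about 20 KB, or less than half a percent. A small Python sketch of the arithmetic, with the byte counts copied from the listing above (variable names are only illustrative):

```python
# Sizes in bytes, copied from the ls output above.
size_35 = 4480828   # gcc-obj-3.5/gcc/cc1plus
size_ssa = 4501168  # gcc-obj-ssa/gcc/cc1plus

diff_bytes = size_ssa - size_35
diff_percent = 100.0 * diff_bytes / size_35

print(diff_bytes, f"{diff_percent:.2f}%")
```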


Richard.

