This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



Compile time speed


Hi!

I wonder if some of the compile-time improvements made on the 3.3 branch
can be backported to the 3.2 branch ;) For a typical source file (heavy use
of expression templates) I get the following time reports for 3.2/3.3:

g++-3.2 -o tramp3d tramp3d.cpp -pipe -ftemplate-depth-80
-Drestrict=__restrict__ -Wno-deprecated -fno-exceptions -DNOPAssert
-DNOCTAssert -O2 -fomit-frame-pointer -march=athlon
-I/home/rguenth/ix86/pooma/tat-serial-debug/pooma/linux/lib/PoomaConfiguration-gcc3
-I/home/rguenth/ix86/pooma/tat-serial-debug/pooma/linux/src
-I/home/rguenth/ix86/pooma/tat-serial-debug/pooma/linux/lib
-I/home/rguenth/ix86/include
-L/home/rguenth/ix86/pooma/tat-serial-debug/pooma/linux/lib -lpooma-gcc3
-L/home/rguenth/ix86/lib -lhdf5 -lm -fno-exceptions  -lz delta.o
-ftime-report

Execution times (seconds)
 garbage collection    :  13.15 ( 4%) usr   0.03 ( 1%) sys  13.44 ( 4%) wall
 cfg construction      :   2.24 ( 1%) usr   0.06 ( 2%) sys   2.39 ( 1%) wall
 cfg cleanup           : 115.26 (32%) usr   0.21 ( 6%) sys 117.66 (31%) wall
 life analysis         :   4.78 ( 1%) usr   0.33 ( 9%) sys   5.30 ( 1%) wall
 life info update      :   1.07 ( 0%) usr   0.04 ( 1%) sys   1.22 ( 0%) wall
 preprocessing         :   0.65 ( 0%) usr   0.12 ( 3%) sys   0.73 ( 0%) wall
 lexical analysis      :   0.72 ( 0%) usr   0.18 ( 5%) sys   0.92 ( 0%) wall
 parser                :  13.44 ( 4%) usr   0.68 (18%) sys  15.11 ( 4%) wall
 expand                : 110.38 (30%) usr   0.58 (15%) sys 112.70 (30%) wall
 varconst              :   0.46 ( 0%) usr   0.00 ( 0%) sys   0.55 ( 0%) wall
 integration           :   8.09 ( 2%) usr   0.52 (14%) sys   9.31 ( 2%) wall
 jump                  :   2.68 ( 1%) usr   0.08 ( 2%) sys   2.95 ( 1%) wall
 CSE                   :  37.52 (10%) usr   0.10 ( 3%) sys  38.67 (10%) wall
 global CSE            :   6.95 ( 2%) usr   0.04 ( 1%) sys   6.97 ( 2%) wall
 loop analysis         :   4.86 ( 1%) usr   0.28 ( 7%) sys   5.27 ( 1%) wall
 CSE 2                 :  11.98 ( 3%) usr   0.02 ( 1%) sys  12.16 ( 3%) wall
 flow analysis         :   0.84 ( 0%) usr   0.02 ( 1%) sys   0.94 ( 0%) wall
 combiner              :   2.00 ( 1%) usr   0.03 ( 1%) sys   2.05 ( 1%) wall
 if-conversion         :   0.06 ( 0%) usr   0.01 ( 0%) sys   0.06 ( 0%) wall
 regmove               :   1.74 ( 0%) usr   0.00 ( 0%) sys   1.84 ( 0%) wall
 mode switching        :   0.22 ( 0%) usr   0.00 ( 0%) sys   0.17 ( 0%) wall
 local alloc           :   2.35 ( 1%) usr   0.09 ( 2%) sys   2.61 ( 1%) wall
 global alloc          :   5.47 ( 1%) usr   0.08 ( 2%) sys   5.55 ( 1%) wall
 reload CSE regs       :   5.47 ( 1%) usr   0.02 ( 1%) sys   5.62 ( 1%) wall
 flow 2                :   0.97 ( 0%) usr   0.00 ( 0%) sys   1.02 ( 0%) wall
 if-conversion 2       :   0.02 ( 0%) usr   0.00 ( 0%) sys   0.03 ( 0%) wall
 peephole 2            :   0.47 ( 0%) usr   0.04 ( 1%) sys   0.58 ( 0%) wall
 rename registers      :   1.27 ( 0%) usr   0.00 ( 0%) sys   1.30 ( 0%) wall
 scheduling 2          :   3.88 ( 1%) usr   0.15 ( 4%) sys   4.22 ( 1%) wall
 reorder blocks        :   0.31 ( 0%) usr   0.03 ( 1%) sys   0.33 ( 0%) wall
 shorten branches      :   0.44 ( 0%) usr   0.02 ( 1%) sys   0.48 ( 0%) wall
 reg stack             :   0.09 ( 0%) usr   0.01 ( 0%) sys   0.11 ( 0%) wall
 final                 :   1.08 ( 0%) usr   0.02 ( 1%) sys   2.36 ( 1%) wall
 symout                :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall
 rest of compilation   :   3.81 ( 1%) usr   0.01 ( 0%) sys   3.86 ( 1%) wall
 TOTAL                 : 364.75             3.80           378.52

cfg cleanup and expand look bad here: together they take about 62% of the
user time (115.26 s and 110.38 s out of 364.75 s).
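
To pick out the dominant passes without reading the table by eye, the
report can be parsed mechanically. The following is only a minimal sketch in
Python: the names ROW, parse_report and top_passes are made up for
illustration and are not part of GCC, and the row layout is assumed to match
the output shown above. The pattern does not require the trailing "wall"
label, so it should work whether or not that label was wrapped onto its own
line.

```python
import re

# Assumed row layout, taken from the report above, e.g.
# " cfg cleanup           : 115.26 (32%) usr   0.21 ( 6%) sys 117.66 (31%)"
ROW = re.compile(
    r"^\s*(?P<name>[^:]+?)\s*:\s*"
    r"(?P<usr>\d+\.\d+)\s*\(\s*\d+%\)\s*usr\s+"
    r"(?P<sys>\d+\.\d+)\s*\(\s*\d+%\)\s*sys\s+"
    r"(?P<wall>\d+\.\d+)\s*\(\s*\d+%\)"
)


def parse_report(text):
    """Return {pass name: (usr, sys, wall)} for every row that matches ROW.

    Lines that do not match (a lone "wall", the header, the TOTAL line
    without percentages) are skipped.
    """
    rows = {}
    for line in text.splitlines():
        m = ROW.match(line)
        if m:
            rows[m.group("name")] = (
                float(m.group("usr")),
                float(m.group("sys")),
                float(m.group("wall")),
            )
    return rows


def top_passes(rows, n=3):
    """Return the n pass names with the largest user time, largest first."""
    return sorted(rows, key=lambda name: rows[name][0], reverse=True)[:n]


# A few rows copied from the 3.2 report above.
SAMPLE = """\
Execution times (seconds)
 garbage collection    :  13.15 ( 4%) usr   0.03 ( 1%) sys  13.44 ( 4%) wall
 cfg cleanup           : 115.26 (32%) usr   0.21 ( 6%) sys 117.66 (31%) wall
 expand                : 110.38 (30%) usr   0.58 (15%) sys 112.70 (30%) wall
 CSE                   :  37.52 (10%) usr   0.10 ( 3%) sys  38.67 (10%) wall
 TOTAL                 : 364.75             3.80           378.52
"""

rows = parse_report(SAMPLE)
print(top_passes(rows))
```

Feeding the complete report text to parse_report instead of SAMPLE would
rank all passes the same way.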


bellatrix:~/src/C19/rhalk/tramp$ g++-3.3 -o tramp3d tramp3d.cpp -pipe
-ftemplate-depth-80 -Drestrict=__restrict__ -Wno-deprecated
-fno-exceptions -DNOPAssert -DNOCTAssert -O2 -fomit-frame-pointer
-march=athlon
-I/home/rguenth/ix86/pooma/tat-serial-debug/pooma/linux/lib/PoomaConfiguration-gcc3
-I/home/rguenth/ix86/pooma/tat-serial-debug/pooma/linux/src
-I/home/rguenth/ix86/pooma/tat-serial-debug/pooma/linux/lib
-I/home/rguenth/ix86/include
-L/home/rguenth/ix86/pooma/tat-serial-debug/pooma/linux/lib -lpooma-gcc3
-L/home/rguenth/ix86/lib -lhdf5 -lm -fno-exceptions  -lz delta.o
-ftime-report

Execution times (seconds)
 garbage collection    :  12.88 ( 9%) usr   0.01 ( 0%) sys  13.56 ( 8%) wall
 cfg construction      :   1.01 ( 1%) usr   0.01 ( 0%) sys   0.92 ( 1%) wall
 cfg cleanup           :   1.83 ( 1%) usr   0.03 ( 1%) sys   1.88 ( 1%) wall
 trivially dead code   :   2.84 ( 2%) usr   0.01 ( 0%) sys   3.16 ( 2%) wall
 life analysis         :   4.04 ( 3%) usr   0.00 ( 0%) sys   4.28 ( 3%) wall
 life info update      :   1.23 ( 1%) usr   0.00 ( 0%) sys   1.36 ( 1%) wall
 preprocessing         :   0.64 ( 0%) usr   0.27 ( 6%) sys   2.12 ( 1%) wall
 lexical analysis      :   0.52 ( 0%) usr   0.21 ( 5%) sys   0.72 ( 0%) wall
 parser                :  10.78 ( 7%) usr   0.86 (20%) sys  11.84 ( 7%) wall
 name lookup           :   5.73 ( 4%) usr   1.11 (26%) sys   6.78 ( 4%) wall
 expand                :  37.34 (25%) usr   0.29 ( 7%) sys  38.94 (24%) wall
 varconst              :   0.27 ( 0%) usr   0.00 ( 0%) sys   0.38 ( 0%) wall
 integration           :   7.97 ( 5%) usr   0.27 ( 6%) sys   8.72 ( 5%) wall
 jump                  :   2.20 ( 1%) usr   0.08 ( 2%) sys   2.36 ( 1%) wall
 CSE                   :  16.00 (11%) usr   0.11 ( 3%) sys  16.34 (10%) wall
 global CSE            :   5.80 ( 4%) usr   0.08 ( 2%) sys   5.89 ( 4%) wall
 loop analysis         :   4.20 ( 3%) usr   0.41 (10%) sys   4.75 ( 3%) wall
 CSE 2                 :   5.26 ( 4%) usr   0.01 ( 0%) sys   5.09 ( 3%) wall
 branch prediction     :   1.84 ( 1%) usr   0.00 ( 0%) sys   1.83 ( 1%) wall
 flow analysis         :   0.38 ( 0%) usr   0.02 ( 0%) sys   0.50 ( 0%) wall
 combiner              :   2.37 ( 2%) usr   0.01 ( 0%) sys   2.38 ( 1%) wall
 if-conversion         :   0.21 ( 0%) usr   0.00 ( 0%) sys   0.22 ( 0%) wall
 regmove               :   0.77 ( 1%) usr   0.00 ( 0%) sys   0.83 ( 1%) wall
 mode switching        :   0.18 ( 0%) usr   0.00 ( 0%) sys   0.19 ( 0%) wall
 local alloc           :   4.02 ( 3%) usr   0.11 ( 3%) sys   4.33 ( 3%) wall
 global alloc          :   5.12 ( 3%) usr   0.06 ( 1%) sys   5.11 ( 3%) wall
 reload CSE regs       :   3.44 ( 2%) usr   0.04 ( 1%) sys   3.52 ( 2%) wall
 flow 2                :   0.77 ( 1%) usr   0.00 ( 0%) sys   0.92 ( 1%) wall
 if-conversion 2       :   0.03 ( 0%) usr   0.00 ( 0%) sys   0.05 ( 0%) wall
 peephole 2            :   0.26 ( 0%) usr   0.00 ( 0%) sys   0.30 ( 0%) wall
 rename registers      :   1.48 ( 1%) usr   0.01 ( 0%) sys   1.41 ( 1%) wall
 scheduling 2          :   3.58 ( 2%) usr   0.19 ( 4%) sys   3.84 ( 2%) wall
 reorder blocks        :   0.04 ( 0%) usr   0.00 ( 0%) sys   0.03 ( 0%) wall
 shorten branches      :   0.41 ( 0%) usr   0.03 ( 1%) sys   0.44 ( 0%) wall
 reg stack             :   0.10 ( 0%) usr   0.00 ( 0%) sys   0.11 ( 0%) wall
 final                 :   1.27 ( 1%) usr   0.02 ( 0%) sys   2.50 ( 2%) wall
 symout                :   0.01 ( 0%) usr   0.01 ( 0%) sys   0.05 ( 0%) wall
 rest of compilation   :   3.26 ( 2%) usr   0.05 ( 1%) sys   3.50 ( 2%) wall
 TOTAL                 : 150.09             4.31           161.61


which is roughly a 59% reduction in total user time for 3.3 compared to 3.2
(364.75 s down to 150.09 s)! (3.3 as of 20030303)
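
For reference, the arithmetic behind that comparison can be sketched as
follows. This is just an illustrative Python snippet: the helper names
reduction and speedup are invented here, and the only inputs are the TOTAL
lines and three per-pass user times copied from the two reports above.

```python
def reduction(before, after):
    """Percentage of time saved going from `before` to `after`."""
    return (before - after) / before * 100.0


def speedup(before, after):
    """How many times faster `after` is than `before`."""
    return before / after


# (3.2 seconds, 3.3 seconds), from the TOTAL lines of the two reports.
TOTALS = {"usr": (364.75, 150.09), "wall": (378.52, 161.61)}

# User times of the three passes that changed the most, same order.
PASSES = {
    "cfg cleanup": (115.26, 1.83),
    "expand": (110.38, 37.34),
    "CSE": (37.52, 16.00),
}

for kind, (old, new) in TOTALS.items():
    print(f"{kind}: {reduction(old, new):.1f}% less time, "
          f"{speedup(old, new):.2f}x faster")

# Seconds of user time saved per pass.
saved = {name: old - new for name, (old, new) in PASSES.items()}
for name, seconds in saved.items():
    print(f"{name}: {seconds:.2f} s saved")
```

With these inputs the three listed passes account for most of the difference
between the two totals, and cfg cleanup alone for more than half of it.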

Richard.

--
Richard Guenther <richard dot guenther at uni-tuebingen dot de>
WWW: http://www.tat.physik.uni-tuebingen.de/~rguenth/

