This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.


Performance problems with flow analysis?


With the input 

http://www.math.purdue.edu/~lucier/all.i.gz

compiled with

banach-12% /pkgs/gcc-2.96/bin/gcc -v
Reading specs from /pkgs/gcc-2.96/lib/gcc-lib/sparc-sun-solaris2.8/3.1/specs
Configured with: ../configure --prefix=/pkgs/gcc-2.96 --enable-checking=no
Thread model: posix
gcc version 3.1 20010808 (experimental)

I get timings of

cc1 -fPIC -O1 -fschedule-insns2 -fno-math-errno -fno-strict-aliasing -mcpu=supersparc -mtune=ultrasparc -Wall -W -Wno-unused all.i
 ___H__20_all {GC 72513k -> 24052k} {GC 32111k -> 25325k} {GC 33960k -> 25286k} {GC 40662k -> 24398k} {GC 37460k -> 27415k} {GC 50784k -> 30011k} ___init_proc ____20_all
Execution times (seconds)
 garbage collection    :   3.68 (-0%) usr   0.01 ( 0%) sys   3.70 (-1%) wall
 cfg construction      : 143.85 (-19%) usr  16.60 (59%) sys 160.45 (-22%) wall
 cfg cleanup           : 303.65 (-41%) usr   0.00 ( 0%) sys 303.66 (-42%) wall
 preprocessing         :   0.74 (-0%) usr   2.66 ( 9%) sys   3.35 (-0%) wall
 lexical analysis      :   1.12 (-0%) usr   4.15 (15%) sys   6.57 (-1%) wall
 parser                :   6.34 (-1%) usr   3.21 (11%) sys   9.35 (-1%) wall
 varconst              :   0.35 (-0%) usr   0.01 ( 0%) sys   0.37 (-0%) wall
 jump                  :  13.03 (-2%) usr   0.01 ( 0%) sys  13.04 (-2%) wall
 CSE                   :   5.56 (-1%) usr   0.00 ( 0%) sys   5.56 (-1%) wall
 loop analysis         :   0.06 (-0%) usr   0.00 ( 0%) sys   0.06 (-0%) wall
 CSE 2                 :   0.00 (-0%) usr   0.00 ( 0%) sys   0.00 (-0%) wall
 flow analysis         : 966.15 (-129%) usr   1.09 ( 4%) sys 969.47 (-135%) wall
 combiner              :   5.00 (-1%) usr   0.00 ( 0%) sys   5.00 (-1%) wall
 if-conversion         :   9.89 (-1%) usr   0.00 ( 0%) sys   9.89 (-1%) wall
 scheduling            :   0.00 (-0%) usr   0.00 ( 0%) sys   0.00 (-0%) wall
 local alloc           :   3.94 (-1%) usr   0.00 ( 0%) sys   3.94 (-1%) wall
 global alloc          :  20.82 (-3%) usr   0.59 ( 2%) sys  21.41 (-3%) wall
 reload CSE regs       :  86.58 (-12%) usr   0.00 ( 0%) sys  86.58 (-12%) wall
 flow 2                :1916.16 (-256%) usr   0.00 ( 0%) sys1916.29 (-268%) wall
 if-conversion 2       :  10.17 (-1%) usr   0.00 ( 0%) sys  10.17 (-1%) wall
 scheduling 2          :  27.15 (-4%) usr   0.00 ( 0%) sys  27.15 (-4%) wall
 delay branch sched    :   4.20 (-1%) usr   0.00 ( 0%) sys   4.20 (-1%) wall
 reorder blocks        :   0.00 (-0%) usr   0.00 ( 0%) sys   0.00 (-0%) wall
 shorten branches      :   0.33 (-0%) usr   0.00 ( 0%) sys   0.33 (-0%) wall
 final                 :  15.58 (-2%) usr   0.02 ( 0%) sys  15.65 (-2%) wall
 symout                :   0.00 (-0%) usr   0.00 ( 0%) sys   0.00 (-0%) wall
 rest of compilation   :   2.22 (-0%) usr   0.00 ( 0%) sys   2.22 (-0%) wall
 TOTAL                 :-748.-39            28.37          -716.-28

(I doubt the negative TOTAL, since the compile took over 2 1/2 hours of
CPU time), whereas with 3.0 one gets:

banach-11% /pkgs/gcc-3.0/lib/gcc-lib/sparc-sun-solaris2.8/3.0/cc1 -fPIC -O1 -fschedule-insns2 -fno-math-errno -fno-strict-aliasing -mcpu=supersparc -mtune=ultrasparc -Wall -W -Wno-unused all.i
 ___H__20_all {GC 47058k -> 14474k} {GC 19055k -> 15561k} {GC 20907k -> 15533k} {GC 24345k -> 16449k} {GC 25411k -> 18532k} {GC 32856k -> 21109k} ___init_proc ____20_all
Execution times (seconds)
 garbage collection    :   3.60 ( 1%) usr   0.00 ( 0%) sys   3.60 ( 1%) wall
 preprocessing         :   0.76 ( 0%) usr   2.50 (10%) sys   3.12 ( 1%) wall
 lexical analysis      :   0.84 ( 0%) usr   4.21 (17%) sys   5.36 ( 1%) wall
 parser                :   6.21 ( 1%) usr   2.87 (12%) sys   8.97 ( 2%) wall
 varconst              :   0.28 ( 0%) usr   0.02 ( 0%) sys   0.30 ( 0%) wall
 jump                  :  54.27 (13%) usr  12.94 (53%) sys  67.21 (15%) wall
 CSE                   :   4.28 ( 1%) usr   0.00 ( 0%) sys   4.28 ( 1%) wall
 loop analysis         :   0.02 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall
 CSE 2                 :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall
 flow analysis         : 137.30 (32%) usr   0.05 ( 0%) sys 137.35 (30%) wall
 combiner              :   4.14 ( 1%) usr   0.00 ( 0%) sys   4.14 ( 1%) wall
 if-conversion         :  10.65 ( 2%) usr   0.11 ( 0%) sys  10.76 ( 2%) wall
 local alloc           :   3.07 ( 1%) usr   0.00 ( 0%) sys   3.07 ( 1%) wall
 global alloc          :  17.99 ( 4%) usr   1.67 ( 7%) sys  19.66 ( 4%) wall
 reload CSE regs       :  42.31 (10%) usr   0.00 ( 0%) sys  42.31 ( 9%) wall
 flow 2                : 101.38 (23%) usr   0.00 ( 0%) sys 101.40 (22%) wall
 if-conversion 2       :  10.09 ( 2%) usr   0.00 ( 0%) sys  10.10 ( 2%) wall
 scheduling 2          :  13.25 ( 3%) usr   0.00 ( 0%) sys  13.25 ( 3%) wall
 delay branch sched    :   4.58 ( 1%) usr   0.00 ( 0%) sys   4.58 ( 1%) wall
 shorten branches      :   0.20 ( 0%) usr   0.00 ( 0%) sys   0.20 ( 0%) wall
 final                 :  15.71 ( 4%) usr   0.02 ( 0%) sys  15.80 ( 3%) wall
 symout                :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall
 rest of compilation   :   1.75 ( 0%) usr   0.00 ( 0%) sys   1.75 ( 0%) wall
 TOTAL                 : 432.68            24.40           457.40

which, in my book, is reasonable (or at least acceptable).
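
Since the TOTAL line of the 3.1 report is clearly bogus, one way to
cross-check it is to sum the per-pass rows directly. The following is
only a rough sketch, not anything that ships with GCC: the function name
sum_pass_times, the PASS_LINE regex, and the SAMPLE string are all made
up here for illustration, and the regex assumes rows shaped like the
ones pasted above.

```python
import re

# Assumed row shape, copied from the report above, e.g.
#  " flow 2                :1916.16 (-256%) usr   0.00 ( 0%) sys1916.29 (-268%) wall"
# Note that wide numbers can touch the preceding ':' or 'sys', so every
# whitespace gap is optional.
PASS_LINE = re.compile(
    r"^\s*(?P<name>[^:]+?)\s*:"
    r"\s*(?P<usr>\d+\.\d+)\s*\([^)]*\)\s*usr"
    r"\s*(?P<sys>\d+\.\d+)\s*\([^)]*\)\s*sys"
    r"\s*(?P<wall>\d+\.\d+)\s*\([^)]*\)\s*wall\s*$"
)

def sum_pass_times(report):
    """Add up the usr/sys/wall seconds of every per-pass row in a report."""
    totals = {"usr": 0.0, "sys": 0.0, "wall": 0.0}
    for line in report.splitlines():
        m = PASS_LINE.match(line)
        if m is None:
            continue  # TOTAL row, GC chatter, and headers do not match
        for key in totals:
            totals[key] += float(m.group(key))
    return totals

# Three rows taken verbatim from the 3.1 report, plus its TOTAL row.
SAMPLE = """\
 cfg cleanup           : 303.65 (-41%) usr   0.00 ( 0%) sys 303.66 (-42%) wall
 flow analysis         : 966.15 (-129%) usr   1.09 ( 4%) sys 969.47 (-135%) wall
 flow 2                :1916.16 (-256%) usr   0.00 ( 0%) sys1916.29 (-268%) wall
 TOTAL                 :-748.-39            28.37          -716.-28
"""

if __name__ == "__main__":
    for key, seconds in sum_pass_times(SAMPLE).items():
        print(f"{key:4s} {seconds:10.2f}")
```

Fed the whole 3.1 report rather than the three sample rows, the usr
column should come to about 3546.57 seconds by my arithmetic, which can
then be compared against the 432.68 seconds that 3.0 reports.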

I'll build a profiled version (and test it on a smaller file) to
see precisely where the problem is, but perhaps this information can
set someone on the right track for now.

Brad Lucier
