This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.
Performance problems with flow analysis?
- To: gcc at gcc dot gnu dot org
- Subject: Performance problems with flow analysis?
- From: Brad Lucier <lucier at math dot purdue dot edu>
- Date: Thu, 9 Aug 2001 21:36:18 -0500 (EST)
- Cc: lucier at math dot purdue dot edu (Brad Lucier), feeley at iro dot umontreal dot ca
With the input
http://www.math.purdue.edu/~lucier/all.i.gz
compiled with
banach-12% /pkgs/gcc-2.96/bin/gcc -v
Reading specs from /pkgs/gcc-2.96/lib/gcc-lib/sparc-sun-solaris2.8/3.1/specs
Configured with: ../configure --prefix=/pkgs/gcc-2.96 --enable-checking=no
Thread model: posix
gcc version 3.1 20010808 (experimental)
I get timings of
cc1 -fPIC -O1 -fschedule-insns2 -fno-math-errno -fno-strict-aliasing -mcpu=supersparc -mtune=ultrasparc -Wall -W -Wno-unused all.i
___H__20_all {GC 72513k -> 24052k} {GC 32111k -> 25325k} {GC 33960k -> 25286k} {GC 40662k -> 24398k} {GC 37460k -> 27415k} {GC 50784k -> 30011k} ___init_proc ____20_all
Execution times (seconds)
garbage collection : 3.68 (-0%) usr 0.01 ( 0%) sys 3.70 (-1%) wall
cfg construction : 143.85 (-19%) usr 16.60 (59%) sys 160.45 (-22%) wall
cfg cleanup : 303.65 (-41%) usr 0.00 ( 0%) sys 303.66 (-42%) wall
preprocessing : 0.74 (-0%) usr 2.66 ( 9%) sys 3.35 (-0%) wall
lexical analysis : 1.12 (-0%) usr 4.15 (15%) sys 6.57 (-1%) wall
parser : 6.34 (-1%) usr 3.21 (11%) sys 9.35 (-1%) wall
varconst : 0.35 (-0%) usr 0.01 ( 0%) sys 0.37 (-0%) wall
jump : 13.03 (-2%) usr 0.01 ( 0%) sys 13.04 (-2%) wall
CSE : 5.56 (-1%) usr 0.00 ( 0%) sys 5.56 (-1%) wall
loop analysis : 0.06 (-0%) usr 0.00 ( 0%) sys 0.06 (-0%) wall
CSE 2 : 0.00 (-0%) usr 0.00 ( 0%) sys 0.00 (-0%) wall
flow analysis : 966.15 (-129%) usr 1.09 ( 4%) sys 969.47 (-135%) wall
combiner : 5.00 (-1%) usr 0.00 ( 0%) sys 5.00 (-1%) wall
if-conversion : 9.89 (-1%) usr 0.00 ( 0%) sys 9.89 (-1%) wall
scheduling : 0.00 (-0%) usr 0.00 ( 0%) sys 0.00 (-0%) wall
local alloc : 3.94 (-1%) usr 0.00 ( 0%) sys 3.94 (-1%) wall
global alloc : 20.82 (-3%) usr 0.59 ( 2%) sys 21.41 (-3%) wall
reload CSE regs : 86.58 (-12%) usr 0.00 ( 0%) sys 86.58 (-12%) wall
flow 2 : 1916.16 (-256%) usr 0.00 ( 0%) sys 1916.29 (-268%) wall
if-conversion 2 : 10.17 (-1%) usr 0.00 ( 0%) sys 10.17 (-1%) wall
scheduling 2 : 27.15 (-4%) usr 0.00 ( 0%) sys 27.15 (-4%) wall
delay branch sched : 4.20 (-1%) usr 0.00 ( 0%) sys 4.20 (-1%) wall
reorder blocks : 0.00 (-0%) usr 0.00 ( 0%) sys 0.00 (-0%) wall
shorten branches : 0.33 (-0%) usr 0.00 ( 0%) sys 0.33 (-0%) wall
final : 15.58 (-2%) usr 0.02 ( 0%) sys 15.65 (-2%) wall
symout : 0.00 (-0%) usr 0.00 ( 0%) sys 0.00 (-0%) wall
rest of compilation : 2.22 (-0%) usr 0.00 ( 0%) sys 2.22 (-0%) wall
TOTAL :-748.-39 28.37 -716.-28
(a total I doubt, since the run took over 2 1/2 hours of CPU time), whereas
with 3.0 one gets:
banach-11% /pkgs/gcc-3.0/lib/gcc-lib/sparc-sun-solaris2.8/3.0/cc1 -fPIC -O1 -fschedule-insns2 -fno-math-errno -fno-strict-aliasing -mcpu=supersparc -mtune=ultrasparc -Wall -W -Wno-unused all.i
___H__20_all {GC 47058k -> 14474k} {GC 19055k -> 15561k} {GC 20907k -> 15533k} {GC 24345k -> 16449k} {GC 25411k -> 18532k} {GC 32856k -> 21109k} ___init_proc ____20_all
Execution times (seconds)
garbage collection : 3.60 ( 1%) usr 0.00 ( 0%) sys 3.60 ( 1%) wall
preprocessing : 0.76 ( 0%) usr 2.50 (10%) sys 3.12 ( 1%) wall
lexical analysis : 0.84 ( 0%) usr 4.21 (17%) sys 5.36 ( 1%) wall
parser : 6.21 ( 1%) usr 2.87 (12%) sys 8.97 ( 2%) wall
varconst : 0.28 ( 0%) usr 0.02 ( 0%) sys 0.30 ( 0%) wall
jump : 54.27 (13%) usr 12.94 (53%) sys 67.21 (15%) wall
CSE : 4.28 ( 1%) usr 0.00 ( 0%) sys 4.28 ( 1%) wall
loop analysis : 0.02 ( 0%) usr 0.00 ( 0%) sys 0.02 ( 0%) wall
CSE 2 : 0.00 ( 0%) usr 0.00 ( 0%) sys 0.00 ( 0%) wall
flow analysis : 137.30 (32%) usr 0.05 ( 0%) sys 137.35 (30%) wall
combiner : 4.14 ( 1%) usr 0.00 ( 0%) sys 4.14 ( 1%) wall
if-conversion : 10.65 ( 2%) usr 0.11 ( 0%) sys 10.76 ( 2%) wall
local alloc : 3.07 ( 1%) usr 0.00 ( 0%) sys 3.07 ( 1%) wall
global alloc : 17.99 ( 4%) usr 1.67 ( 7%) sys 19.66 ( 4%) wall
reload CSE regs : 42.31 (10%) usr 0.00 ( 0%) sys 42.31 ( 9%) wall
flow 2 : 101.38 (23%) usr 0.00 ( 0%) sys 101.40 (22%) wall
if-conversion 2 : 10.09 ( 2%) usr 0.00 ( 0%) sys 10.10 ( 2%) wall
scheduling 2 : 13.25 ( 3%) usr 0.00 ( 0%) sys 13.25 ( 3%) wall
delay branch sched : 4.58 ( 1%) usr 0.00 ( 0%) sys 4.58 ( 1%) wall
shorten branches : 0.20 ( 0%) usr 0.00 ( 0%) sys 0.20 ( 0%) wall
final : 15.71 ( 4%) usr 0.02 ( 0%) sys 15.80 ( 3%) wall
symout : 0.00 ( 0%) usr 0.00 ( 0%) sys 0.00 ( 0%) wall
rest of compilation : 1.75 ( 0%) usr 0.00 ( 0%) sys 1.75 ( 0%) wall
TOTAL : 432.68 24.40 457.40
which, in my book, is reasonable (or at least acceptable).
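One possible reading of the negative TOTAL and the negative percentages in the 3.1 report is that the running total wrapped around. The sketch below checks that guess against the numbers above. It rests on two assumptions that the report itself does not confirm: that "-748.-39" stands for -748.39 seconds, and that the total is kept as a signed 32-bit count of microseconds. The names USER_SECONDS_3_1, REPORTED_TOTAL and unwrap are made up for this illustration.

```python
# User-time column (seconds) copied from the 3.1 (20010808) report,
# in the order the phases are listed, excluding the TOTAL line.
USER_SECONDS_3_1 = [
    3.68, 143.85, 303.65, 0.74, 1.12, 6.34, 0.35, 13.03, 5.56,
    0.06, 0.00, 966.15, 5.00, 9.89, 0.00, 3.94, 20.82, 86.58,
    1916.16, 10.17, 27.15, 4.20, 0.00, 0.33, 15.58, 0.00, 2.22,
]

# Assumption: "-748.-39" in the TOTAL line means -748.39 seconds.
REPORTED_TOTAL = -748.39

# Assumption: the counter is a signed 32-bit number of microseconds,
# so one full wrap corresponds to 2**32 microseconds.
WRAP_SECONDS = 2**32 / 1_000_000


def unwrap(reported_seconds, wrap=WRAP_SECONDS):
    """Undo a single wrap-around of a negative reported value."""
    if reported_seconds < 0:
        return reported_seconds + wrap
    return reported_seconds


phase_sum = sum(USER_SECONDS_3_1)
unwrapped_total = unwrap(REPORTED_TOTAL)

print(f"sum of listed phases : {phase_sum:.2f} s")
print(f"unwrapped TOTAL      : {unwrapped_total:.2f} s")
print(f"difference           : {abs(phase_sum - unwrapped_total):.2f} s")

# The percentages look like each phase divided by the (negative) total.
flow_pct = 100 * 966.15 / REPORTED_TOTAL
print(f"flow analysis share  : {flow_pct:.0f}%")
```

If the two figures agree to within rounding, the per-phase times are probably trustworthy and only the TOTAL line and the percentages derived from it are off. This does not settle how the listed phases relate to the 2 1/2 hours of CPU time mentioned above.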
I'll build a profiled version (and test it on a smaller file) to
see precisely where the problem is, but perhaps this information can
set someone on the right track for now.
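To show where the extra time goes, the user times of the phases that appear in both reports can be compared directly. The sketch below is only arithmetic on the figures quoted above; the names USER_SECONDS and compare are made up for this illustration. Phases that appear only in the 3.1 report (cfg construction, cfg cleanup, scheduling, reorder blocks) are left out, so the jump row is not a like-for-like comparison if that work moved between phases.

```python
# phase name -> (user seconds with 3.1 20010808, user seconds with 3.0),
# copied from the two reports in this message.
USER_SECONDS = {
    "garbage collection": (3.68, 3.60),
    "preprocessing": (0.74, 0.76),
    "lexical analysis": (1.12, 0.84),
    "parser": (6.34, 6.21),
    "varconst": (0.35, 0.28),
    "jump": (13.03, 54.27),
    "CSE": (5.56, 4.28),
    "loop analysis": (0.06, 0.02),
    "CSE 2": (0.00, 0.00),
    "flow analysis": (966.15, 137.30),
    "combiner": (5.00, 4.14),
    "if-conversion": (9.89, 10.65),
    "local alloc": (3.94, 3.07),
    "global alloc": (20.82, 17.99),
    "reload CSE regs": (86.58, 42.31),
    "flow 2": (1916.16, 101.38),
    "if-conversion 2": (10.17, 10.09),
    "scheduling 2": (27.15, 13.25),
    "delay branch sched": (4.20, 4.58),
    "shorten branches": (0.33, 0.20),
    "final": (15.58, 15.71),
    "symout": (0.00, 0.00),
    "rest of compilation": (2.22, 1.75),
}


def compare(times):
    """Return (phase, increase in seconds, ratio or None), largest increase first."""
    rows = []
    for phase, (new, old) in times.items():
        ratio = new / old if old > 0 else None  # avoid dividing by zero
        rows.append((phase, new - old, ratio))
    return sorted(rows, key=lambda row: row[1], reverse=True)


rows = compare(USER_SECONDS)
for phase, delta, ratio in rows[:4]:
    ratio_text = f"{ratio:.2f}x" if ratio is not None else "n/a"
    print(f"{phase:20s} {delta:+9.2f} s  {ratio_text}")
```

Sorting by the absolute increase instead of the ratio keeps very short phases, such as loop analysis, from dominating the list.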
Brad Lucier