This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: Timing information for CFG manipulations
- To: jh at suse dot cz
- Subject: Re: Timing information for CFG manipulations
- From: Brad Lucier <lucier at math dot purdue dot edu>
- Date: Wed, 17 Oct 2001 15:47:36 -0500 (EST)
- Cc: lucier at math dot purdue dot edu (Brad Lucier), gcc-patches at gcc dot gnu dot org
Here are gcc 3.0.1 compile times again on this file:
http://www.math.purdue.edu/~lucier/_num.i.gz
dino01% /soft/parallelisme/linux/gcc-3.0.1/lib/gcc-lib/i686-pc-linux-gnu/3.0.1/cc1 -fpic -fomit-frame-pointer -O1 -fno-math-errno -fno-strict-aliasing -mcpu=athlon -march=athlon _num.i
__sgn __sgnf __sgnl atan2 atan2f atan2l __atan2l fmod fmodf fmodl sqrt sqrtf sqrtl __sqrtl fabs fabsf fabsl __fabsl atan atanf atanl __sgn1l floor floorf floorl ceil ceilf ceill ldexp log1p log1pf log1pl asinh asinhf asinhl acosh acoshf acoshl atanh atanhf atanhl hypot hypotf hypotl logb logbf logbl drem dremf dreml __finite ___H__20___num {GC 23738k -> 7627k} {GC 11539k -> 7534k} {GC 9859k -> 7972k} {GC 11631k -> 8916k} {GC 14125k -> 9023k} ___init_proc ____20___num
Execution times (seconds)
garbage collection : 0.61 ( 1%) usr 0.00 ( 0%) sys 0.64 ( 1%) wall
preprocessing : 0.14 ( 0%) usr 0.13 (14%) sys 0.27 ( 0%) wall
lexical analysis : 0.34 ( 1%) usr 0.23 (24%) sys 0.56 ( 1%) wall
parser : 1.10 ( 2%) usr 0.15 (16%) sys 1.27 ( 2%) wall
varconst : 0.06 ( 0%) usr 0.00 ( 0%) sys 0.06 ( 0%) wall
integration : 0.01 ( 0%) usr 0.00 ( 0%) sys 0.01 ( 0%) wall
jump : 4.04 ( 6%) usr 0.17 (18%) sys 4.36 ( 6%) wall
CSE : 0.75 ( 1%) usr 0.00 ( 0%) sys 0.77 ( 1%) wall
loop analysis : 0.01 ( 0%) usr 0.00 ( 0%) sys 0.01 ( 0%) wall
CSE 2 : 0.00 ( 0%) usr 0.00 ( 0%) sys 0.00 ( 0%) wall
flow analysis : 11.21 (17%) usr 0.08 ( 8%) sys 12.47 (16%) wall
combiner : 1.07 ( 2%) usr 0.01 ( 1%) sys 1.12 ( 1%) wall
if-conversion : 1.12 ( 2%) usr 0.03 ( 3%) sys 1.22 ( 2%) wall
local alloc : 0.41 ( 1%) usr 0.03 ( 3%) sys 0.56 ( 1%) wall
global alloc : 2.58 ( 4%) usr 0.03 ( 3%) sys 3.14 ( 4%) wall
reload CSE regs : 9.85 (15%) usr 0.01 ( 1%) sys 12.76 (16%) wall
flow 2 : 13.72 (21%) usr 0.01 ( 1%) sys 19.63 (25%) wall
if-conversion 2 : 0.96 ( 1%) usr 0.01 ( 1%) sys 1.19 ( 2%) wall
shorten branches : 0.15 ( 0%) usr 0.00 ( 0%) sys 0.16 ( 0%) wall
reg stack : 15.94 (24%) usr 0.05 ( 5%) sys 17.05 (22%) wall
final : 0.97 ( 1%) usr 0.00 ( 0%) sys 0.97 ( 1%) wall
symout : 0.00 ( 0%) usr 0.00 ( 0%) sys 0.00 ( 0%) wall
rest of compilation : 0.45 ( 1%) usr 0.00 ( 0%) sys 0.46 ( 1%) wall
TOTAL : 65.49 0.96 78.82
Here are the 3.1 times, without the profiling code and with this patch applied:
http://gcc.gnu.org/ml/gcc-patches/2001-10/msg00792.html
dino01% /u/lucier/local/gcc-3.1/lib/gcc-lib/i686-pc-linux-gnu/3.1/cc1 -fpic -fomit-frame-pointer -O1 -fno-math-errno -fno-strict-aliasing -mcpu=athlon -march=athlon _num.i
__sgn __sgnf __sgnl atan2 atan2f atan2l __atan2l fmod fmodf fmodl sqrt sqrtf sqrtl __sqrtl fabs fabsf fabsl __fabsl atan atanf atanl __sgn1l floor floorf floorl ceil ceilf ceill ldexp log1p log1pf log1pl asinh asinhf asinhl acosh acoshf acoshl atanh atanhf atanhl hypot hypotf hypotl logb logbf logbl drem dremf dreml __finite ___H__20___num {GC 25431k -> 7824k} {GC 10944k -> 7883k} {GC 10373k -> 7769k} {GC 13951k -> 8584k} {GC 14266k -> 9195k} ___init_proc {GC 12104k -> 9290k} ____20___num
Execution times (seconds)
garbage collection : 0.67 ( 0%) usr 0.00 ( 0%) sys 0.72 ( 0%) wall
cfg construction : 6.37 ( 4%) usr 0.26 ( 6%) sys 6.66 ( 4%) wall
cfg cleanup : 33.59 (21%) usr 0.01 ( 0%) sys 33.69 (20%) wall
preprocessing : 0.17 ( 0%) usr 0.11 ( 2%) sys 0.44 ( 0%) wall
lexical analysis : 0.29 ( 0%) usr 0.22 ( 5%) sys 0.53 ( 0%) wall
parser : 1.19 ( 1%) usr 0.15 ( 3%) sys 1.28 ( 1%) wall
varconst : 0.07 ( 0%) usr 0.00 ( 0%) sys 0.03 ( 0%) wall
jump : 0.63 ( 0%) usr 0.00 ( 0%) sys 0.66 ( 0%) wall
CSE : 0.72 ( 0%) usr 0.00 ( 0%) sys 0.69 ( 0%) wall
global CSE : 32.87 (21%) usr 0.57 (13%) sys 33.53 (20%) wall
loop analysis : 0.02 ( 0%) usr 0.00 ( 0%) sys 0.00 ( 0%) wall
flow analysis : 19.15 (12%) usr 0.14 ( 3%) sys 19.25 (12%) wall
combiner : 1.05 ( 1%) usr 0.00 ( 0%) sys 1.03 ( 1%) wall
if-conversion : 1.17 ( 1%) usr 0.03 ( 1%) sys 1.19 ( 1%) wall
local alloc : 0.40 ( 0%) usr 0.02 ( 0%) sys 0.41 ( 0%) wall
global alloc : 2.97 ( 2%) usr 0.04 ( 1%) sys 3.03 ( 2%) wall
reload CSE regs : 8.74 ( 5%) usr 0.00 ( 0%) sys 8.75 ( 5%) wall
flow 2 : 29.26 (18%) usr 0.08 ( 2%) sys 29.31 (18%) wall
if-conversion 2 : 0.98 ( 1%) usr 0.03 ( 1%) sys 1.00 ( 1%) wall
shorten branches : 0.13 ( 0%) usr 0.00 ( 0%) sys 0.12 ( 0%) wall
reg stack : 18.45 (12%) usr 2.88 (63%) sys 21.31 (13%) wall
final : 0.51 ( 0%) usr 0.00 ( 0%) sys 0.53 ( 0%) wall
rest of compilation : 0.80 ( 0%) usr 0.00 ( 0%) sys 0.84 ( 1%) wall
TOTAL : 160.20 4.54 165.09
So with -O1, 3.1 still takes roughly 2.4 times as long as 3.0.1 in user time (160.20 s vs. 65.49 s), and about 2.1 times as long in wall-clock time (165.09 s vs. 78.82 s).
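
For anyone who wants to recheck that ratio, here is a minimal Python sketch. It only divides the TOTAL rows of the two -O1 reports above; the names TOTAL_3_0_1, TOTAL_3_1 and slowdown are made up for this illustration and are not part of GCC or of the original measurements.

```python
# TOTAL rows copied from the two -O1 reports above (seconds).
# The dictionary and function names are hypothetical, chosen for this sketch.
TOTAL_3_0_1 = {"usr": 65.49, "sys": 0.96, "wall": 78.82}
TOTAL_3_1 = {"usr": 160.20, "sys": 4.54, "wall": 165.09}


def slowdown(old, new):
    """Return new/old for each timing column, rounded to two decimals."""
    return {col: round(new[col] / old[col], 2) for col in old}


ratios = slowdown(TOTAL_3_0_1, TOTAL_3_1)
for col, ratio in ratios.items():
    print(f"{col:>4}: {ratio:.2f}x")
```

The wall-clock ratio comes out lower than the user-time ratio because the 3.0.1 run shows a larger gap between user and wall time (65.49 s vs. 78.82 s) than the 3.1 run does.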
The times are closer for -O2, where 3.1 takes about 1.8 times as long as 3.0.1. Here is 3.0.1 first, then 3.1:
dino01% /soft/parallelisme/linux/gcc-3.0.1/lib/gcc-lib/i686-pc-linux-gnu/3.0.1/cc1 -fpic -fomit-frame-pointer -O2 -fno-math-errno -fno-strict-aliasing -mcpu=athlon -march=athlon -Womitted-optimizations _num.i
__sgn __sgnf __sgnl atan2 atan2f atan2l __atan2l fmod fmodf fmodl sqrt sqrtf sqrtl __sqrtl fabs fabsf fabsl __fabsl atan atanf atanl __sgn1l floor floorf floorl ceil ceilf ceill ldexp log1p log1pf log1pl asinh asinhf asinhl acosh acoshf acoshl atanh atanhf atanhl hypot hypotf hypotl logb logbf logbl drem dremf dreml __finite ___H__20___num {GC 23945k -> 7739k} {GC 10264k -> 7082k} {GC 10049k -> 7614k} {GC 10129k -> 8167k} {GC 14667k -> 9158k} {GC 14858k -> 10744k} {GC 16174k -> 11278k} ___init_proc ____20___num
Execution times (seconds)
garbage collection : 0.88 ( 1%) usr 0.00 ( 0%) sys 0.88 ( 1%) wall
preprocessing : 0.27 ( 0%) usr 0.13 (12%) sys 0.40 ( 0%) wall
lexical analysis : 0.27 ( 0%) usr 0.14 (13%) sys 0.41 ( 0%) wall
parser : 1.07 ( 1%) usr 0.13 (12%) sys 1.20 ( 1%) wall
varconst : 0.02 ( 0%) usr 0.02 ( 2%) sys 0.04 ( 0%) wall
integration : 0.01 ( 0%) usr 0.00 ( 0%) sys 0.01 ( 0%) wall
jump : 16.78 (16%) usr 0.37 (36%) sys 17.15 (16%) wall
CSE : 1.02 ( 1%) usr 0.00 ( 0%) sys 1.02 ( 1%) wall
global CSE : 3.19 ( 3%) usr 0.03 ( 3%) sys 3.22 ( 3%) wall
loop analysis : 0.10 ( 0%) usr 0.00 ( 0%) sys 0.10 ( 0%) wall
CSE 2 : 0.86 ( 1%) usr 0.00 ( 0%) sys 0.86 ( 1%) wall
flow analysis : 10.90 (11%) usr 0.04 ( 4%) sys 10.94 (11%) wall
combiner : 1.12 ( 1%) usr 0.01 ( 1%) sys 1.13 ( 1%) wall
if-conversion : 4.34 ( 4%) usr 0.03 ( 3%) sys 4.38 ( 4%) wall
regmove : 1.74 ( 2%) usr 0.00 ( 0%) sys 1.75 ( 2%) wall
local alloc : 0.64 ( 1%) usr 0.00 ( 0%) sys 0.64 ( 1%) wall
global alloc : 2.20 ( 2%) usr 0.00 ( 0%) sys 2.20 ( 2%) wall
reload CSE regs : 20.41 (20%) usr 0.03 ( 3%) sys 20.44 (20%) wall
flow 2 : 13.31 (13%) usr 0.05 ( 5%) sys 13.36 (13%) wall
if-conversion 2 : 0.93 ( 1%) usr 0.03 ( 3%) sys 1.01 ( 1%) wall
peephole 2 : 0.12 ( 0%) usr 0.00 ( 0%) sys 0.12 ( 0%) wall
scheduling 2 : 2.64 ( 3%) usr 0.00 ( 0%) sys 2.64 ( 3%) wall
reorder blocks : 0.79 ( 1%) usr 0.00 ( 0%) sys 0.79 ( 1%) wall
shorten branches : 0.17 ( 0%) usr 0.00 ( 0%) sys 0.17 ( 0%) wall
reg stack : 17.57 (17%) usr 0.02 ( 2%) sys 17.59 (17%) wall
final : 1.02 ( 1%) usr 0.01 ( 1%) sys 1.03 ( 1%) wall
symout : 0.00 ( 0%) usr 0.00 ( 0%) sys 0.00 ( 0%) wall
rest of compilation : 0.42 ( 0%) usr 0.00 ( 0%) sys 0.43 ( 0%) wall
TOTAL : 102.79 1.04 104.05
dino01% /u/lucier/local/gcc-3.1/lib/gcc-lib/i686-pc-linux-gnu/3.1/cc1 -fpic -fomit-frame-pointer -O2 -fno-math-errno -fno-strict-aliasing -mcpu=athlon -march=athlon -Womitted-optimizations _num.i
__sgn __sgnf __sgnl atan2 atan2f atan2l __atan2l fmod fmodf fmodl sqrt sqrtf sqrtl __sqrtl fabs fabsf fabsl __fabsl atan atanf atanl __sgn1l floor floorf floorl ceil ceilf ceill ldexp log1p log1pf log1pl asinh asinhf asinhl acosh acoshf acoshl atanh atanhf atanhl hypot hypotf hypotl logb logbf logbl drem dremf dreml __finite ___H__20___num {GC 25693k -> 7931k} {GC 10823k -> 7239k} {GC 10639k -> 7761k} {GC 10253k -> 8148k} {GC 14580k -> 8574k} {GC 13223k -> 9777k} {GC 15700k -> 10351k} ___init_proc {GC 13534k -> 10431k} ____20___num
Execution times (seconds)
garbage collection : 0.95 ( 1%) usr 0.00 ( 0%) sys 1.00 ( 1%) wall
cfg construction : 12.61 ( 7%) usr 0.45 (12%) sys 13.03 ( 7%) wall
cfg cleanup : 38.49 (20%) usr 0.00 ( 0%) sys 38.47 (20%) wall
preprocessing : 0.29 ( 0%) usr 0.14 ( 4%) sys 0.44 ( 0%) wall
lexical analysis : 0.23 ( 0%) usr 0.15 ( 4%) sys 0.44 ( 0%) wall
parser : 1.16 ( 1%) usr 0.15 ( 4%) sys 1.25 ( 1%) wall
varconst : 0.07 ( 0%) usr 0.00 ( 0%) sys 0.06 ( 0%) wall
jump : 1.51 ( 1%) usr 0.01 ( 0%) sys 1.59 ( 1%) wall
CSE : 1.59 ( 1%) usr 0.00 ( 0%) sys 1.56 ( 1%) wall
global CSE : 33.38 (18%) usr 0.50 (13%) sys 33.88 (18%) wall
loop analysis : 0.12 ( 0%) usr 0.00 ( 0%) sys 0.12 ( 0%) wall
CSE 2 : 0.88 ( 0%) usr 0.00 ( 0%) sys 0.84 ( 0%) wall
flow analysis : 18.87 (10%) usr 0.17 ( 4%) sys 19.06 (10%) wall
combiner : 1.14 ( 1%) usr 0.01 ( 0%) sys 1.12 ( 1%) wall
if-conversion : 1.18 ( 1%) usr 0.03 ( 1%) sys 1.22 ( 1%) wall
regmove : 1.65 ( 1%) usr 0.00 ( 0%) sys 1.66 ( 1%) wall
local alloc : 0.59 ( 0%) usr 0.00 ( 0%) sys 0.59 ( 0%) wall
global alloc : 2.53 ( 1%) usr 0.07 ( 2%) sys 2.59 ( 1%) wall
reload CSE regs : 17.11 ( 9%) usr 0.04 ( 1%) sys 17.16 ( 9%) wall
flow 2 : 30.48 (16%) usr 0.09 ( 2%) sys 30.56 (16%) wall
if-conversion 2 : 0.96 ( 1%) usr 0.03 ( 1%) sys 1.00 ( 1%) wall
peephole 2 : 0.10 ( 0%) usr 0.00 ( 0%) sys 0.09 ( 0%) wall
scheduling 2 : 2.63 ( 1%) usr 0.01 ( 0%) sys 2.62 ( 1%) wall
reorder blocks : 0.84 ( 0%) usr 0.00 ( 0%) sys 0.84 ( 0%) wall
shorten branches : 0.12 ( 0%) usr 0.01 ( 0%) sys 0.12 ( 0%) wall
reg stack : 17.70 ( 9%) usr 1.95 (51%) sys 19.66 (10%) wall
final : 0.50 ( 0%) usr 0.01 ( 0%) sys 0.53 ( 0%) wall
rest of compilation : 0.85 ( 0%) usr 0.00 ( 0%) sys 0.84 ( 0%) wall
TOTAL : 188.53 3.82 192.41
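
To see which passes account for the difference, the per-pass rows of two reports can be compared mechanically. The following is a rough Python sketch, not a tool that ships with GCC: the regular expression assumes the row layout exactly as pasted in this message, and the names ROW, parse_report and user_time_deltas are hypothetical. The sample rows are a subset copied from the two -O2 reports above.

```python
import re

# Matches one per-pass row as it appears in this message, e.g.
# "flow 2 : 13.31 (13%) usr 0.05 ( 5%) sys 13.36 (13%) wall".
# The TOTAL row has no percentages and is deliberately not matched.
ROW = re.compile(
    r"^\s*(?P<name>.+?)\s*:\s*(?P<usr>\d+\.\d+)\s*\(\s*\d+%\)\s*usr"
    r"\s+(?P<sys>\d+\.\d+)\s*\(\s*\d+%\)\s*sys"
    r"\s+(?P<wall>\d+\.\d+)\s*\(\s*\d+%\)\s*wall\s*$"
)


def parse_report(text):
    """Return {pass name: user seconds} for every per-pass row in text."""
    times = {}
    for line in text.splitlines():
        m = ROW.match(line)
        if m:
            times[m.group("name")] = float(m.group("usr"))
    return times


def user_time_deltas(old, new):
    """Return (name, new - old) pairs, largest increase first.

    A pass that is missing from one report is counted as 0.0 seconds there.
    """
    names = set(old) | set(new)
    deltas = [(n, round(new.get(n, 0.0) - old.get(n, 0.0), 2)) for n in names]
    return sorted(deltas, key=lambda pair: (-pair[1], pair[0]))


# A subset of rows copied from the -O2 reports above.
REPORT_3_0_1 = """\
jump : 16.78 (16%) usr 0.37 (36%) sys 17.15 (16%) wall
global CSE : 3.19 ( 3%) usr 0.03 ( 3%) sys 3.22 ( 3%) wall
flow analysis : 10.90 (11%) usr 0.04 ( 4%) sys 10.94 (11%) wall
flow 2 : 13.31 (13%) usr 0.05 ( 5%) sys 13.36 (13%) wall
TOTAL : 102.79 1.04 104.05
"""
REPORT_3_1 = """\
cfg construction : 12.61 ( 7%) usr 0.45 (12%) sys 13.03 ( 7%) wall
cfg cleanup : 38.49 (20%) usr 0.00 ( 0%) sys 38.47 (20%) wall
jump : 1.51 ( 1%) usr 0.01 ( 0%) sys 1.59 ( 1%) wall
global CSE : 33.38 (18%) usr 0.50 (13%) sys 33.88 (18%) wall
flow analysis : 18.87 (10%) usr 0.17 ( 4%) sys 19.06 (10%) wall
flow 2 : 30.48 (16%) usr 0.09 ( 2%) sys 30.56 (16%) wall
TOTAL : 188.53 3.82 192.41
"""

deltas = user_time_deltas(parse_report(REPORT_3_0_1), parse_report(REPORT_3_1))
for name, delta in deltas:
    print(f"{name:<18}{delta:+.2f}")
```

On the rows quoted here, the largest user-time increases at -O2 are in cfg cleanup, global CSE and flow 2, while jump is much cheaper in 3.1. Since cfg construction and cfg cleanup are reported as separate entries only in the 3.1 reports, comparisons of individual passes across the two versions should be read with that in mind.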
Brad