This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.
Re: Compilation time has more than doubled on some Polyhedron tests
- From: dominiq at lps dot ens dot fr (Dominique Dhumieres)
- To: Tobias dot Schlueter at Physik dot Uni-Muenchen dot DE, dominiq at lps dot ens dot fr
- Cc: gcc at gcc dot gnu dot org, fortran at gcc dot gnu dot org
- Date: Sun, 15 Jan 2006 20:38:42 +0100
- Subject: Re: Compilation time has more than doubled on some Polyhedron tests
- References: <43CA92B8.mailA0Y1CCN4J@tournesol.lps.ens.fr> <1137350174.43ca961e9415d@www.cip.physik.uni-muenchen.de>
> Are you building with --enable-checking (the default)?
On AMD I am using François-Xavier's builds. On my G5 I use a patched
version of Fink's info file; the answer is probably in
ConfigureParams: --prefix=%p/lib/gcc4 --enable-languages=c,c++,fortran,objc,java --infodir='${prefix}/share/info' --with-gmp=%p --with-included-gettext --host=%m-apple-darwin`uname -r|cut -f1 -d.` `if test ! -f /usr/lib/libSystemStubs.a ; then echo -n "--with-as=%p/lib/odcctools/bin/as --with-ld=%p/lib/odcctools/bin/ld" ; fi`
and as far as I can tell it is "no".
> Can you try compiling some of the most-affected files with -ftime-report, ...?
As a quick answer, with -ftime-report on induct.f90 I get:
[karma] lin/source% time gfortran -ftime-report -O3 -ffast-math -funroll-loops induct.f90
Execution times (seconds)
garbage collection : 0.58 ( 1%) usr 0.11 ( 2%) sys 0.71 ( 1%) wall 0 kB ( 0%) ggc
callgraph construction: 0.15 ( 0%) usr 0.02 ( 0%) sys 0.16 ( 0%) wall 645 kB ( 0%) ggc
callgraph optimization: 0.01 ( 0%) usr 0.00 ( 0%) sys 0.01 ( 0%) wall 194 kB ( 0%) ggc
ipa reference : 0.03 ( 0%) usr 0.00 ( 0%) sys 0.02 ( 0%) wall 1 kB ( 0%) ggc
ipa pure const : 0.00 ( 0%) usr 0.00 ( 0%) sys 0.01 ( 0%) wall 0 kB ( 0%) ggc
ipa type escape : 0.04 ( 0%) usr 0.00 ( 0%) sys 0.04 ( 0%) wall 0 kB ( 0%) ggc
cfg construction : 0.04 ( 0%) usr 0.00 ( 0%) sys 0.00 ( 0%) wall 77 kB ( 0%) ggc
cfg cleanup : 0.08 ( 0%) usr 0.00 ( 0%) sys 0.10 ( 0%) wall 94 kB ( 0%) ggc
CFG verifier : 0.45 ( 1%) usr 0.09 ( 2%) sys 0.50 ( 1%) wall 0 kB ( 0%) ggc
trivially dead code : 0.12 ( 0%) usr 0.01 ( 0%) sys 0.28 ( 0%) wall 0 kB ( 0%) ggc
life analysis : 0.53 ( 1%) usr 0.02 ( 0%) sys 0.47 ( 1%) wall 505 kB ( 0%) ggc
life info update : 0.11 ( 0%) usr 0.01 ( 0%) sys 0.27 ( 0%) wall 102 kB ( 0%) ggc
alias analysis : 0.32 ( 1%) usr 0.02 ( 0%) sys 0.51 ( 1%) wall 2161 kB ( 2%) ggc
register scan : 0.11 ( 0%) usr 0.00 ( 0%) sys 0.09 ( 0%) wall 7 kB ( 0%) ggc
rebuild jump labels : 0.05 ( 0%) usr 0.00 ( 0%) sys 0.06 ( 0%) wall 0 kB ( 0%) ggc
parser : 0.39 ( 1%) usr 0.04 ( 1%) sys 0.98 ( 2%) wall 3728 kB ( 3%) ggc
integration : 0.01 ( 0%) usr 0.01 ( 0%) sys 0.00 ( 0%) wall 0 kB ( 0%) ggc
tree gimplify : 0.17 ( 0%) usr 0.01 ( 0%) sys 0.15 ( 0%) wall 977 kB ( 1%) ggc
tree CFG construction : 0.01 ( 0%) usr 0.01 ( 0%) sys 0.02 ( 0%) wall 1699 kB ( 1%) ggc
tree CFG cleanup : 0.13 ( 0%) usr 0.04 ( 1%) sys 0.16 ( 0%) wall 327 kB ( 0%) ggc
tree VRP : 0.25 ( 0%) usr 0.08 ( 2%) sys 0.36 ( 1%) wall 2304 kB ( 2%) ggc
tree copy propagation : 1.18 ( 2%) usr 0.31 ( 6%) sys 1.54 ( 2%) wall 542 kB ( 0%) ggc
tree store copy prop : 0.22 ( 0%) usr 0.05 ( 1%) sys 0.28 ( 0%) wall 93 kB ( 0%) ggc
tree find ref. vars : 0.01 ( 0%) usr 0.00 ( 0%) sys 0.01 ( 0%) wall 347 kB ( 0%) ggc
tree PTA : 0.58 ( 1%) usr 0.01 ( 0%) sys 0.61 ( 1%) wall 185 kB ( 0%) ggc
tree alias analysis : 0.70 ( 1%) usr 0.40 ( 8%) sys 1.24 ( 2%) wall 1747 kB ( 1%) ggc
tree PHI insertion : 0.02 ( 0%) usr 0.00 ( 0%) sys 0.04 ( 0%) wall 301 kB ( 0%) ggc
tree SSA rewrite : 1.39 ( 2%) usr 0.38 ( 8%) sys 1.81 ( 3%) wall 45605 kB (35%) ggc
tree SSA other : 0.06 ( 0%) usr 0.04 ( 1%) sys 0.10 ( 0%) wall 0 kB ( 0%) ggc
tree SSA incremental : 3.70 ( 6%) usr 0.13 ( 3%) sys 3.82 ( 6%) wall 7379 kB ( 6%) ggc
tree operand scan : 1.20 ( 2%) usr 0.63 (13%) sys 1.93 ( 3%) wall 19607 kB (15%) ggc
dominator optimization: 0.96 ( 2%) usr 0.03 ( 1%) sys 0.96 ( 2%) wall 3806 kB ( 3%) ggc
tree SRA : 0.02 ( 0%) usr 0.00 ( 0%) sys 0.01 ( 0%) wall 0 kB ( 0%) ggc
tree STORE-CCP : 0.16 ( 0%) usr 0.05 ( 1%) sys 0.16 ( 0%) wall 39 kB ( 0%) ggc
tree CCP : 0.17 ( 0%) usr 0.03 ( 1%) sys 0.18 ( 0%) wall 17 kB ( 0%) ggc
tree split crit edges : 0.03 ( 0%) usr 0.02 ( 0%) sys 0.05 ( 0%) wall 1842 kB ( 1%) ggc
tree reassociation : 0.05 ( 0%) usr 0.03 ( 1%) sys 0.05 ( 0%) wall 42 kB ( 0%) ggc
tree PRE : 0.56 ( 1%) usr 0.04 ( 1%) sys 0.58 ( 1%) wall 1264 kB ( 1%) ggc
tree FRE : 0.17 ( 0%) usr 0.01 ( 0%) sys 0.21 ( 0%) wall 1014 kB ( 1%) ggc
tree code sinking : 0.02 ( 0%) usr 0.00 ( 0%) sys 0.04 ( 0%) wall 9 kB ( 0%) ggc
tree forward propagate: 0.02 ( 0%) usr 0.00 ( 0%) sys 0.01 ( 0%) wall 4 kB ( 0%) ggc
tree conservative DCE : 0.45 ( 1%) usr 0.00 ( 0%) sys 0.43 ( 1%) wall 0 kB ( 0%) ggc
tree aggressive DCE : 0.13 ( 0%) usr 0.00 ( 0%) sys 0.15 ( 0%) wall 0 kB ( 0%) ggc
tree DSE : 0.06 ( 0%) usr 0.00 ( 0%) sys 0.08 ( 0%) wall 87 kB ( 0%) ggc
PHI merge : 0.06 ( 0%) usr 0.00 ( 0%) sys 0.06 ( 0%) wall 1368 kB ( 1%) ggc
tree loop bounds : 0.03 ( 0%) usr 0.00 ( 0%) sys 0.06 ( 0%) wall 210 kB ( 0%) ggc
loop invariant motion : 0.02 ( 0%) usr 0.00 ( 0%) sys 0.03 ( 0%) wall 12 kB ( 0%) ggc
tree canonical iv : 0.01 ( 0%) usr 0.00 ( 0%) sys 0.00 ( 0%) wall 109 kB ( 0%) ggc
scev constant prop : 0.00 ( 0%) usr 0.00 ( 0%) sys 0.02 ( 0%) wall 36 kB ( 0%) ggc
tree loop unswitching : 0.01 ( 0%) usr 0.00 ( 0%) sys 0.00 ( 0%) wall 0 kB ( 0%) ggc
complete unrolling : 0.82 ( 1%) usr 0.04 ( 1%) sys 1.07 ( 2%) wall 737 kB ( 1%) ggc
tree iv optimization : 0.10 ( 0%) usr 0.00 ( 0%) sys 0.09 ( 0%) wall 916 kB ( 1%) ggc
tree loop init : 0.07 ( 0%) usr 0.02 ( 0%) sys 0.13 ( 0%) wall 0 kB ( 0%) ggc
tree copy headers : 0.02 ( 0%) usr 0.01 ( 0%) sys 0.04 ( 0%) wall 2030 kB ( 2%) ggc
tree SSA uncprop : 0.03 ( 0%) usr 0.00 ( 0%) sys 0.02 ( 0%) wall 0 kB ( 0%) ggc
tree SSA to normal : 0.20 ( 0%) usr 0.13 ( 3%) sys 0.33 ( 1%) wall 1386 kB ( 1%) ggc
tree rename SSA copies: 0.06 ( 0%) usr 0.12 ( 2%) sys 0.15 ( 0%) wall 0 kB ( 0%) ggc
tree SSA verifier : 28.82 (50%) usr 1.12 (23%) sys 30.10 (48%) wall 19 kB ( 0%) ggc
tree STMT verifier : 4.78 ( 8%) usr 0.18 ( 4%) sys 4.92 ( 8%) wall 0 kB ( 0%) ggc
callgraph verifier : 0.02 ( 0%) usr 0.02 ( 0%) sys 0.05 ( 0%) wall 0 kB ( 0%) ggc
dominance frontiers : 0.03 ( 0%) usr 0.00 ( 0%) sys 0.02 ( 0%) wall 0 kB ( 0%) ggc
expand : 1.29 ( 2%) usr 0.09 ( 2%) sys 1.43 ( 2%) wall 9358 kB ( 7%) ggc
jump : 0.02 ( 0%) usr 0.01 ( 0%) sys 0.00 ( 0%) wall 18 kB ( 0%) ggc
CSE : 0.63 ( 1%) usr 0.03 ( 1%) sys 0.63 ( 1%) wall 443 kB ( 0%) ggc
loop analysis : 0.26 ( 0%) usr 0.10 ( 2%) sys 0.34 ( 1%) wall 1635 kB ( 1%) ggc
global CSE : 0.01 ( 0%) usr 0.00 ( 0%) sys 0.03 ( 0%) wall 0 kB ( 0%) ggc
CPROP 1 : 0.08 ( 0%) usr 0.01 ( 0%) sys 0.07 ( 0%) wall 567 kB ( 0%) ggc
PRE : 0.06 ( 0%) usr 0.02 ( 0%) sys 0.09 ( 0%) wall 355 kB ( 0%) ggc
CPROP 2 : 0.11 ( 0%) usr 0.00 ( 0%) sys 0.08 ( 0%) wall 269 kB ( 0%) ggc
bypass jumps : 0.09 ( 0%) usr 0.01 ( 0%) sys 0.10 ( 0%) wall 238 kB ( 0%) ggc
web : 0.09 ( 0%) usr 0.02 ( 0%) sys 0.12 ( 0%) wall 203 kB ( 0%) ggc
CSE 2 : 0.41 ( 1%) usr 0.01 ( 0%) sys 0.46 ( 1%) wall 271 kB ( 0%) ggc
branch prediction : 0.04 ( 0%) usr 0.01 ( 0%) sys 0.06 ( 0%) wall 143 kB ( 0%) ggc
flow analysis : 0.00 ( 0%) usr 0.01 ( 0%) sys 0.03 ( 0%) wall 0 kB ( 0%) ggc
combiner : 0.33 ( 1%) usr 0.00 ( 0%) sys 0.34 ( 1%) wall 1379 kB ( 1%) ggc
if-conversion : 0.00 ( 0%) usr 0.01 ( 0%) sys 0.02 ( 0%) wall 19 kB ( 0%) ggc
regmove : 0.10 ( 0%) usr 0.01 ( 0%) sys 0.09 ( 0%) wall 3 kB ( 0%) ggc
scheduling : 0.44 ( 1%) usr 0.09 ( 2%) sys 0.45 ( 1%) wall 2689 kB ( 2%) ggc
local alloc : 0.30 ( 1%) usr 0.03 ( 1%) sys 0.32 ( 1%) wall 596 kB ( 0%) ggc
global alloc : 0.92 ( 2%) usr 0.03 ( 1%) sys 0.90 ( 1%) wall 2497 kB ( 2%) ggc
reload CSE regs : 0.33 ( 1%) usr 0.01 ( 0%) sys 0.32 ( 1%) wall 1198 kB ( 1%) ggc
load CSE after reload : 0.03 ( 0%) usr 0.00 ( 0%) sys 0.02 ( 0%) wall 13 kB ( 0%) ggc
flow 2 : 0.01 ( 0%) usr 0.00 ( 0%) sys 0.01 ( 0%) wall 200 kB ( 0%) ggc
if-conversion 2 : 0.00 ( 0%) usr 0.01 ( 0%) sys 0.01 ( 0%) wall 4 kB ( 0%) ggc
peephole 2 : 0.04 ( 0%) usr 0.00 ( 0%) sys 0.04 ( 0%) wall 0 kB ( 0%) ggc
rename registers : 0.48 ( 1%) usr 0.02 ( 0%) sys 0.50 ( 1%) wall 610 kB ( 0%) ggc
scheduling 2 : 0.40 ( 1%) usr 0.02 ( 0%) sys 0.36 ( 1%) wall 2552 kB ( 2%) ggc
reorder blocks : 0.03 ( 0%) usr 0.00 ( 0%) sys 0.03 ( 0%) wall 159 kB ( 0%) ggc
shorten branches : 0.01 ( 0%) usr 0.00 ( 0%) sys 0.02 ( 0%) wall 0 kB ( 0%) ggc
final : 0.16 ( 0%) usr 0.00 ( 0%) sys 0.16 ( 0%) wall 350 kB ( 0%) ggc
TOTAL : 57.21 4.84 63.35 129840 kB
Extra diagnostic checks enabled; compiler may run slowly.
Configure with --disable-checking to disable checks.
57.300u 4.940s 1:03.76 97.6% 0+0k 8+23io 0pf+0w
where the tree SSA verifier takes half the time. I'll do some checks on AMD, where I can
more easily choose the version of gfortran I am using.
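For anyone repeating this on other files, here is a minimal sketch of how the share of time spent in the checking passes could be pulled out of a saved -ftime-report listing. It assumes the row layout shown above (name, then usr/sys/wall figures with percentages); the function names and the rule "any pass whose name contains 'verifier'" are my own choices for illustration, not anything GCC provides.

```python
import re

# One per-pass row of the report, e.g.
# " tree SSA verifier : 28.82 (50%) usr 1.12 (23%) sys 30.10 (48%) wall 19 kB ( 0%) ggc"
# The TOTAL row has no percentages, so it deliberately does not match.
ROW = re.compile(
    r"^\s*(?P<name>.+?)\s*:\s*"
    r"(?P<usr>\d+\.\d+)\s*\(\s*\d+%\)\s*usr\s+"
    r"(?P<sys>\d+\.\d+)\s*\(\s*\d+%\)\s*sys\s+"
    r"(?P<wall>\d+\.\d+)\s*\(\s*\d+%\)\s*wall"
)


def parse_time_report(text):
    """Return {pass name: (usr, sys, wall)} for every per-pass row in text."""
    rows = {}
    for line in text.splitlines():
        m = ROW.match(line)
        if m:
            rows[m.group("name")] = (
                float(m.group("usr")),
                float(m.group("sys")),
                float(m.group("wall")),
            )
    return rows


def verifier_share(rows):
    """Fraction of summed user time spent in passes named '...verifier'.

    The denominator is the sum of the parsed rows, which can differ
    slightly from the report's own TOTAL line because of rounding.
    """
    total = sum(usr for usr, _, _ in rows.values())
    checking = sum(
        usr for name, (usr, _, _) in rows.items() if "verifier" in name.lower()
    )
    return checking / total if total else 0.0


# Three rows copied from the listing above, plus the TOTAL line.
SAMPLE = """\
 tree SSA verifier : 28.82 (50%) usr 1.12 (23%) sys 30.10 (48%) wall 19 kB ( 0%) ggc
 tree STMT verifier : 4.78 ( 8%) usr 0.18 ( 4%) sys 4.92 ( 8%) wall 0 kB ( 0%) ggc
 expand : 1.29 ( 2%) usr 0.09 ( 2%) sys 1.43 ( 2%) wall 9358 kB ( 7%) ggc
 TOTAL : 57.21 4.84 63.35 129840 kB
"""

rows = parse_time_report(SAMPLE)
print(sorted(rows))
print(f"verifier share of user time in sample: {verifier_share(rows):.1%}")
```

Done by hand on the full listing, the four verifier rows (CFG, tree SSA, tree STMT and callgraph) add up to 34.07 s of the 57.21 s total user time, so roughly 60% of the compile time goes to checking passes.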
Cheers
Dominique