This is the mail archive of the
gcc-bugs@gcc.gnu.org
mailing list for the GCC project.
[Bug middle-end/37448] [4.3/4.4/4.5 Regression] gcc 4.3.1 cannot compile big function
- From: "rguenth at gcc dot gnu dot org" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: 26 Jan 2010 11:35:29 -0000
- Subject: [Bug middle-end/37448] [4.3/4.4/4.5 Regression] gcc 4.3.1 cannot compile big function
- References: <bug-37448-16683@http.gcc.gnu.org/bugzilla/>
- Reply-to: gcc-bugzilla at gcc dot gnu dot org
------- Comment #31 from rguenth at gcc dot gnu dot org 2010-01-26 11:35 -------
Updated timings and memory:
> ~/bin/maxmem2.sh /usr/bin/time gcc-4.5 -S -o /dev/null lgwam.c
32.62user 1.48system 0:34.41elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+333822minor)pagefaults 0swaps
total: 1333745 kB
> ~/bin/maxmem2.sh /usr/bin/time /space/rguenther/install/gcc-4.4.3/bin/gcc -S -o /dev/null lgwam.c
35.01user 1.54system 0:36.89elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+329139minor)pagefaults 0swaps
total: 1306898 kB
> ~/bin/maxmem2.sh /usr/bin/time /space/rguenther/install/gcc-4.3.4/bin/gcc -S -o /dev/null lgwam.c
27.42user 1.83system 0:29.61elapsed 98%CPU (0avgtext+0avgdata 0maxresident)k
16inputs+0outputs (0major+374338minor)pagefaults 0swaps
total: 1341721 kB
> ~/bin/maxmem2.sh /usr/bin/time /space/rguenther/install/gcc-4.2.3/bin/gcc -S -o /dev/null lgwam.c
15.33user 0.80system 0:16.31elapsed 98%CPU (0avgtext+0avgdata 0maxresident)k
256inputs+0outputs (2major+197427minor)pagefaults 0swaps
total: 598733 kB
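To compare the runs above at a glance, the figures can be reduced to ratios against 4.2.3. This is only a sketch: the numbers are copied from the output above, while the dictionary layout and the names `runs` and `ratios` are illustrative and not part of maxmem2.sh or any other tool.

```python
# User time (seconds) and the "total:" memory figure (kB), copied from the runs above.
runs = {
    "4.5": (32.62, 1333745),
    "4.4.3": (35.01, 1306898),
    "4.3.4": (27.42, 1341721),
    "4.2.3": (15.33, 598733),
}

base_time, base_mem = runs["4.2.3"]

# Ratio of each release to the 4.2.3 baseline: (time ratio, memory ratio).
ratios = {
    version: (user / base_time, mem / base_mem)
    for version, (user, mem) in runs.items()
}

for version, (time_ratio, mem_ratio) in ratios.items():
    print(f"{version:6s} time x{time_ratio:.2f}  memory x{mem_ratio:.2f}")
```

On these numbers every release after 4.2.3 needs more than twice the memory of 4.2.3.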
time-report for 4.5 trunk (only interesting parts):
expand : 6.72 (20%) usr 0.72 (26%) sys 7.45 (20%) wall
140609 kB (35%) ggc
integrated RA : 8.56 (25%) usr 0.12 ( 4%) sys 8.76 (24%) wall
5151 kB ( 1%) ggc
reload : 7.72 (23%) usr 0.22 ( 8%) sys 7.94 (21%) wall
TOTAL : 34.17 2.79 37.15
402913 kB
Memory usage is still high compared to 4.2: about 1.3 GB for 4.3, 4.4 and 4.5, against roughly 0.6 GB for 4.2.3.
At -O2 the picture is similar (memory peaks at 2.5GB):
Execution times (seconds)
garbage collection : 0.64 ( 0%) usr 0.00 ( 0%) sys 0.64 ( 0%) wall
0 kB ( 0%) ggc
callgraph construction: 0.31 ( 0%) usr 0.08 ( 1%) sys 0.39 ( 0%) wall
20525 kB ( 3%) ggc
callgraph optimization: 1.40 ( 1%) usr 0.03 ( 0%) sys 1.46 ( 1%) wall
639 kB ( 0%) ggc
ipa cp : 0.20 ( 0%) usr 0.00 ( 0%) sys 0.20 ( 0%) wall
9607 kB ( 1%) ggc
ipa reference : 0.14 ( 0%) usr 0.00 ( 0%) sys 0.14 ( 0%) wall
0 kB ( 0%) ggc
ipa pure const : 0.33 ( 0%) usr 0.00 ( 0%) sys 0.32 ( 0%) wall
199 kB ( 0%) ggc
cfg cleanup : 0.92 ( 0%) usr 0.00 ( 0%) sys 0.89 ( 0%) wall
1168 kB ( 0%) ggc
trivially dead code : 0.92 ( 0%) usr 0.00 ( 0%) sys 0.89 ( 0%) wall
0 kB ( 0%) ggc
df multiple defs : 0.86 ( 0%) usr 0.01 ( 0%) sys 0.88 ( 0%) wall
0 kB ( 0%) ggc
df reaching defs : 2.69 ( 1%) usr 0.96 (12%) sys 3.94 ( 2%) wall
0 kB ( 0%) ggc
df live regs : 7.15 ( 3%) usr 0.07 ( 1%) sys 7.13 ( 3%) wall
0 kB ( 0%) ggc
df live&initialized regs: 3.33 ( 2%) usr 0.06 ( 1%) sys 3.36 ( 1%) wall
0 kB ( 0%) ggc
df use-def / def-use chains: 0.85 ( 0%) usr 0.03 ( 0%) sys 0.88 ( 0%) wall
0 kB ( 0%) ggc
df reg dead/unused notes: 6.31 ( 3%) usr 0.05 ( 1%) sys 12.84 ( 6%) wall
18222 kB ( 2%) ggc
register information : 3.42 ( 2%) usr 0.00 ( 0%) sys 3.46 ( 2%) wall
0 kB ( 0%) ggc
alias analysis : 1.14 ( 1%) usr 0.01 ( 0%) sys 1.17 ( 1%) wall
10447 kB ( 1%) ggc
alias stmt walking : 4.39 ( 2%) usr 0.26 ( 3%) sys 4.53 ( 2%) wall
0 kB ( 0%) ggc
register scan : 0.20 ( 0%) usr 0.00 ( 0%) sys 0.20 ( 0%) wall
7 kB ( 0%) ggc
rebuild jump labels : 0.53 ( 0%) usr 0.00 ( 0%) sys 0.51 ( 0%) wall
0 kB ( 0%) ggc
preprocessing : 0.64 ( 0%) usr 0.35 ( 4%) sys 0.93 ( 0%) wall
23140 kB ( 3%) ggc
lexical analysis : 0.30 ( 0%) usr 0.54 ( 7%) sys 0.88 ( 0%) wall
0 kB ( 0%) ggc
parser : 0.69 ( 0%) usr 0.38 ( 5%) sys 1.09 ( 0%) wall
38129 kB ( 5%) ggc
inline heuristics : 1.18 ( 1%) usr 0.01 ( 0%) sys 1.15 ( 1%) wall
29832 kB ( 4%) ggc
integration : 3.42 ( 2%) usr 0.45 ( 6%) sys 3.66 ( 2%) wall
175322 kB (22%) ggc
tree gimplify : 1.08 ( 1%) usr 0.09 ( 1%) sys 1.17 ( 1%) wall
104718 kB (13%) ggc
tree eh : 0.04 ( 0%) usr 0.00 ( 0%) sys 0.04 ( 0%) wall
0 kB ( 0%) ggc
tree CFG construction : 0.14 ( 0%) usr 0.01 ( 0%) sys 0.14 ( 0%) wall
11817 kB ( 1%) ggc
tree CFG cleanup : 1.23 ( 1%) usr 0.00 ( 0%) sys 1.21 ( 1%) wall
60 kB ( 0%) ggc
tree VRP : 3.08 ( 1%) usr 0.01 ( 0%) sys 3.09 ( 1%) wall
6719 kB ( 1%) ggc
tree copy propagation : 1.22 ( 1%) usr 0.02 ( 0%) sys 1.14 ( 0%) wall
585 kB ( 0%) ggc
tree find ref. vars : 0.11 ( 0%) usr 0.01 ( 0%) sys 0.12 ( 0%) wall
6045 kB ( 1%) ggc
tree PTA : 0.62 ( 0%) usr 0.00 ( 0%) sys 0.68 ( 0%) wall
2178 kB ( 0%) ggc
tree PHI insertion : 0.04 ( 0%) usr 0.00 ( 0%) sys 0.04 ( 0%) wall
185 kB ( 0%) ggc
tree SSA rewrite : 2.92 ( 1%) usr 0.06 ( 1%) sys 3.04 ( 1%) wall
54148 kB ( 7%) ggc
tree SSA other : 0.24 ( 0%) usr 0.08 ( 1%) sys 0.34 ( 0%) wall
589 kB ( 0%) ggc
tree SSA incremental : 3.46 ( 2%) usr 0.03 ( 0%) sys 3.50 ( 2%) wall
187 kB ( 0%) ggc
tree operand scan : 1.81 ( 1%) usr 0.31 ( 4%) sys 2.40 ( 1%) wall
53424 kB ( 7%) ggc
dominator optimization: 0.73 ( 0%) usr 0.00 ( 0%) sys 0.73 ( 0%) wall
5666 kB ( 1%) ggc
tree SRA : 0.01 ( 0%) usr 0.01 ( 0%) sys 0.02 ( 0%) wall
2 kB ( 0%) ggc
tree CCP : 0.95 ( 0%) usr 0.00 ( 0%) sys 0.93 ( 0%) wall
617 kB ( 0%) ggc
tree PHI const/copy prop: 0.02 ( 0%) usr 0.00 ( 0%) sys 0.02 ( 0%) wall
8 kB ( 0%) ggc
tree reassociation : 0.20 ( 0%) usr 0.00 ( 0%) sys 0.20 ( 0%) wall
280 kB ( 0%) ggc
tree PRE : 2.02 ( 1%) usr 1.01 (13%) sys 3.02 ( 1%) wall
2092 kB ( 0%) ggc
tree FRE : 2.62 ( 1%) usr 1.12 (14%) sys 3.75 ( 2%) wall
1674 kB ( 0%) ggc
tree code sinking : 0.15 ( 0%) usr 0.00 ( 0%) sys 0.17 ( 0%) wall
1126 kB ( 0%) ggc
tree linearize phis : 0.07 ( 0%) usr 0.01 ( 0%) sys 0.09 ( 0%) wall
4 kB ( 0%) ggc
tree forward propagate: 0.19 ( 0%) usr 0.00 ( 0%) sys 0.17 ( 0%) wall
75 kB ( 0%) ggc
tree phiprop : 0.01 ( 0%) usr 0.00 ( 0%) sys 0.01 ( 0%) wall
0 kB ( 0%) ggc
tree conservative DCE : 0.61 ( 0%) usr 0.20 ( 2%) sys 0.91 ( 0%) wall
5 kB ( 0%) ggc
tree aggressive DCE : 0.65 ( 0%) usr 0.09 ( 1%) sys 0.80 ( 0%) wall
1486 kB ( 0%) ggc
tree buildin call DCE : 0.02 ( 0%) usr 0.00 ( 0%) sys 0.03 ( 0%) wall
0 kB ( 0%) ggc
tree DSE : 0.13 ( 0%) usr 0.00 ( 0%) sys 0.15 ( 0%) wall
34 kB ( 0%) ggc
PHI merge : 0.01 ( 0%) usr 0.00 ( 0%) sys 0.00 ( 0%) wall
26 kB ( 0%) ggc
tree loop bounds : 0.01 ( 0%) usr 0.00 ( 0%) sys 0.00 ( 0%) wall
17 kB ( 0%) ggc
tree loop invariant motion: 0.04 ( 0%) usr 0.00 ( 0%) sys 0.04 ( 0%) wall
4 kB ( 0%) ggc
scev constant prop : 0.00 ( 0%) usr 0.00 ( 0%) sys 0.02 ( 0%) wall
112 kB ( 0%) ggc
complete unrolling : 0.06 ( 0%) usr 0.00 ( 0%) sys 0.07 ( 0%) wall
379 kB ( 0%) ggc
tree iv optimization : 0.01 ( 0%) usr 0.00 ( 0%) sys 0.00 ( 0%) wall
473 kB ( 0%) ggc
tree loop init : 0.13 ( 0%) usr 0.00 ( 0%) sys 0.13 ( 0%) wall
390 kB ( 0%) ggc
tree copy headers : 0.02 ( 0%) usr 0.00 ( 0%) sys 0.01 ( 0%) wall
330 kB ( 0%) ggc
tree SSA uncprop : 0.03 ( 0%) usr 0.00 ( 0%) sys 0.04 ( 0%) wall
0 kB ( 0%) ggc
tree rename SSA copies: 0.31 ( 0%) usr 0.00 ( 0%) sys 0.32 ( 0%) wall
0 kB ( 0%) ggc
dominance frontiers : 0.07 ( 0%) usr 0.00 ( 0%) sys 0.09 ( 0%) wall
0 kB ( 0%) ggc
dominance computation : 1.26 ( 1%) usr 0.04 ( 0%) sys 1.31 ( 1%) wall
0 kB ( 0%) ggc
control dependences : 0.03 ( 0%) usr 0.00 ( 0%) sys 0.02 ( 0%) wall
0 kB ( 0%) ggc
expand : 10.92 ( 5%) usr 0.70 ( 9%) sys 11.74 ( 5%) wall
124907 kB (16%) ggc
forward prop : 2.08 ( 1%) usr 0.04 ( 0%) sys 2.06 ( 1%) wall
3631 kB ( 0%) ggc
CSE : 3.19 ( 1%) usr 0.00 ( 0%) sys 3.14 ( 1%) wall
512 kB ( 0%) ggc
dead code elimination : 1.24 ( 1%) usr 0.01 ( 0%) sys 1.25 ( 1%) wall
0 kB ( 0%) ggc
dead store elim1 : 1.26 ( 1%) usr 0.04 ( 0%) sys 1.33 ( 1%) wall
965 kB ( 0%) ggc
dead store elim2 : 1.51 ( 1%) usr 0.02 ( 0%) sys 1.54 ( 1%) wall
7466 kB ( 1%) ggc
loop analysis : 0.03 ( 0%) usr 0.00 ( 0%) sys 0.05 ( 0%) wall
229 kB ( 0%) ggc
loop invariant motion : 0.10 ( 0%) usr 0.03 ( 0%) sys 0.13 ( 0%) wall
2 kB ( 0%) ggc
CPROP : 0.18 ( 0%) usr 0.00 ( 0%) sys 0.20 ( 0%) wall
452 kB ( 0%) ggc
PRE : 0.06 ( 0%) usr 0.00 ( 0%) sys 0.07 ( 0%) wall
15 kB ( 0%) ggc
CSE 2 : 3.20 ( 1%) usr 0.00 ( 0%) sys 3.22 ( 1%) wall
494 kB ( 0%) ggc
branch prediction : 0.28 ( 0%) usr 0.00 ( 0%) sys 0.30 ( 0%) wall
2539 kB ( 0%) ggc
combiner : 1.42 ( 1%) usr 0.02 ( 0%) sys 1.42 ( 1%) wall
11673 kB ( 1%) ggc
if-conversion : 0.19 ( 0%) usr 0.00 ( 0%) sys 0.20 ( 0%) wall
338 kB ( 0%) ggc
regmove : 0.39 ( 0%) usr 0.00 ( 0%) sys 0.35 ( 0%) wall
0 kB ( 0%) ggc
integrated RA : 78.17 (37%) usr 0.46 ( 6%) sys 78.94 (34%) wall
4788 kB ( 1%) ggc
reload : 25.79 (12%) usr 0.07 ( 1%) sys 26.09 (11%) wall
26499 kB ( 3%) ggc
reload CSE regs : 3.00 ( 1%) usr 0.05 ( 1%) sys 3.08 ( 1%) wall
15550 kB ( 2%) ggc
thread pro- & epilogue: 1.39 ( 1%) usr 0.01 ( 0%) sys 1.48 ( 1%) wall
1006 kB ( 0%) ggc
if-conversion 2 : 0.06 ( 0%) usr 0.00 ( 0%) sys 0.08 ( 0%) wall
138 kB ( 0%) ggc
combine stack adjustments: 0.08 ( 0%) usr 0.00 ( 0%) sys 0.09 ( 0%) wall
0 kB ( 0%) ggc
peephole 2 : 0.67 ( 0%) usr 0.01 ( 0%) sys 0.69 ( 0%) wall
3275 kB ( 0%) ggc
hard reg cprop : 1.33 ( 1%) usr 0.02 ( 0%) sys 1.32 ( 1%) wall
979 kB ( 0%) ggc
scheduling 2 : 6.59 ( 3%) usr 0.05 ( 1%) sys 6.63 ( 3%) wall
294 kB ( 0%) ggc
machine dep reorg : 0.69 ( 0%) usr 0.01 ( 0%) sys 0.68 ( 0%) wall
10 kB ( 0%) ggc
reorder blocks : 0.34 ( 0%) usr 0.00 ( 0%) sys 0.39 ( 0%) wall
2878 kB ( 0%) ggc
final : 1.63 ( 1%) usr 0.04 ( 0%) sys 1.64 ( 1%) wall
201 kB ( 0%) ggc
tree if-combine : 0.01 ( 0%) usr 0.01 ( 0%) sys 0.03 ( 0%) wall
1 kB ( 0%) ggc
plugin execution : 0.00 ( 0%) usr 0.03 ( 0%) sys 0.04 ( 0%) wall
0 kB ( 0%) ggc
TOTAL : 214.12 8.02 229.70
793292 kB
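To see which passes dominate a report like the one above, the entries can be sorted by user time. The following is a rough sketch, assuming the two-line-per-pass layout shown here; the regular expression and the helper name `user_times` are illustrative, and other GCC versions may print the report differently.

```python
import re

# A few entries copied from the -O2 report above (two lines per pass).
SAMPLE = """\
expand : 10.92 ( 5%) usr 0.70 ( 9%) sys 11.74 ( 5%) wall
124907 kB (16%) ggc
integrated RA : 78.17 (37%) usr 0.46 ( 6%) sys 78.94 (34%) wall
4788 kB ( 1%) ggc
reload : 25.79 (12%) usr 0.07 ( 1%) sys 26.09 (11%) wall
26499 kB ( 3%) ggc
scheduling 2 : 6.59 ( 3%) usr 0.05 ( 1%) sys 6.63 ( 3%) wall
294 kB ( 0%) ggc
"""

# Pass name up to the colon, then the user-time figure that precedes "usr".
ENTRY = re.compile(r"^(?P<name>.+?)\s*:\s*(?P<usr>\d+\.\d+)\s*\(\s*\d+%\)\s*usr")


def user_times(report):
    """Return (pass name, user seconds) pairs, slowest first."""
    rows = []
    for line in report.splitlines():
        match = ENTRY.match(line)
        if match:  # the "kB ... ggc" continuation lines do not match
            rows.append((match.group("name"), float(match.group("usr"))))
    return sorted(rows, key=lambda row: row[1], reverse=True)


top = user_times(SAMPLE)
for name, seconds in top:
    print(f"{name:15s} {seconds:7.2f}s")
```

In the full report, integrated RA (37%) and reload (12%) together account for about half of the user time.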
At -O1 we never seem to finish early inlining into testsuite, though...
(The loop over all callees calls cgraph_check_inline_limits, which again
loops over all callees, so it looks quadratic, at least because of how
the duplicates in the caller are handled? We seem to be looping over
edges but check limits for !one_only. Why do we not consider edges
individually and avoid calling cgraph_check_inline_limits with
!one_only at all?)
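The suspected quadratic behaviour can be illustrated with a toy model. This is not GCC code: `check_limits_all_callees` and `check_limits_one_edge` are hypothetical stand-ins for calling cgraph_check_inline_limits with !one_only and with one_only, and the counter only tallies how many edges get visited.

```python
def check_limits_all_callees(callees, counter):
    """Hypothetical stand-in for a limit check that walks every callee edge."""
    for _ in callees:
        counter[0] += 1
    return True


def check_limits_one_edge(edge, counter):
    """Hypothetical stand-in for a limit check that looks at one edge only."""
    counter[0] += 1
    return True


def visits_nested(n_edges):
    """Outer loop over all edges, each check walking all edges again."""
    counter = [0]
    callees = range(n_edges)
    for _ in callees:
        check_limits_all_callees(callees, counter)
    return counter[0]


def visits_per_edge(n_edges):
    """Outer loop over all edges, each check touching a single edge."""
    counter = [0]
    for edge in range(n_edges):
        check_limits_one_edge(edge, counter)
    return counter[0]


for n in (10, 100, 1000):
    print(n, visits_nested(n), visits_per_edge(n))
```

In this model the nested variant visits n * n edges for a caller with n callee edges, while the per-edge variant visits n, which is why a function with very many call sites would make the difference noticeable.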
--
rguenth at gcc dot gnu dot org changed:
What |Removed |Added
----------------------------------------------------------------------------
Summary|[4.3/4.5 Regression] gcc |[4.3/4.4/4.5 Regression] gcc
|4.3.1 cannot compile big |4.3.1 cannot compile big
|function |function
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37448