[Bug rtl-optimization/28071] [4.1/4.2 regression] A file that can not be compiled in reasonable time/space
Jan Hubicka
hubicka@ucw.cz
Sat Jul 22 20:51:00 GMT 2006
Hi,
with the attached patch that saves roughly 10 minutes of tree-into-ssa
pass, I can compile with -O3 -fno-tree-fre -fno-tree-pre. Only without
checking-enabled since we do incredibly deep dominator walks running out
of stack space that can be considered as bug too.
TER still manages to enfore few thousdand temporaries with overlapping
liveranges.
THe out-of-ssa pass spends most of time in calculate_live_on_exit
and calculate_live_on_entry that looks rather symmetric to problem cured
by the attached patch, but I don't see directly how to avoid the
quadratic behaviour there.
Honza
garbage collection : 1.22 ( 0%) usr 0.10 ( 1%) sys 8.40 ( 1%) wall 0 kB ( 0%) ggc
callgraph construction: 0.14 ( 0%) usr 0.03 ( 0%) sys 0.18 ( 0%) wall 1147 kB ( 0%) ggc
callgraph optimization: 0.07 ( 0%) usr 0.01 ( 0%) sys 0.45 ( 0%) wall 533 kB ( 0%) ggc
ipa reference : 0.05 ( 0%) usr 0.00 ( 0%) sys 0.06 ( 0%) wall 0 kB ( 0%) ggc
ipa pure const : 0.01 ( 0%) usr 0.00 ( 0%) sys 0.01 ( 0%) wall 0 kB ( 0%) ggc
ipa type escape : 0.04 ( 0%) usr 0.00 ( 0%) sys 0.04 ( 0%) wall 0 kB ( 0%) ggc
cfg cleanup : 3.89 ( 1%) usr 0.01 ( 0%) sys 4.11 ( 0%) wall 1576 kB ( 1%) ggc
trivially dead code : 0.46 ( 0%) usr 0.00 ( 0%) sys 0.53 ( 0%) wall 0 kB ( 0%) ggc
life analysis : 51.34 ( 9%) usr 2.65 (21%) sys 73.91 ( 5%) wall 2653 kB ( 1%) ggc
life info update : 48.97 ( 9%) usr 0.14 ( 1%) sys 50.68 ( 4%) wall 641 kB ( 0%) ggc
alias analysis : 0.69 ( 0%) usr 0.00 ( 0%) sys 1.05 ( 0%) wall 4139 kB ( 1%) ggc
register scan : 0.41 ( 0%) usr 0.00 ( 0%) sys 0.40 ( 0%) wall 0 kB ( 0%) ggc
rebuild jump labels : 0.14 ( 0%) usr 0.00 ( 0%) sys 0.12 ( 0%) wall 0 kB ( 0%) ggc
preprocessing : 0.37 ( 0%) usr 0.06 ( 0%) sys 0.34 ( 0%) wall 471 kB ( 0%) ggc
lexical analysis : 0.01 ( 0%) usr 0.05 ( 0%) sys 0.07 ( 0%) wall 0 kB ( 0%) ggc
parser : 0.09 ( 0%) usr 0.02 ( 0%) sys 0.18 ( 0%) wall 3207 kB ( 1%) ggc
inline heuristics : 14.79 ( 3%) usr 0.02 ( 0%) sys 14.86 ( 1%) wall 1118 kB ( 0%) ggc
integration : 17.07 ( 3%) usr 0.22 ( 2%) sys 17.36 ( 1%) wall 79483 kB (27%) ggc
tree gimplify : 0.15 ( 0%) usr 0.01 ( 0%) sys 0.17 ( 0%) wall 3341 kB ( 1%) ggc
tree eh : 0.00 ( 0%) usr 0.00 ( 0%) sys 0.01 ( 0%) wall 0 kB ( 0%) ggc
tree CFG construction : 0.01 ( 0%) usr 0.00 ( 0%) sys 0.00 ( 0%) wall 1338 kB ( 0%) ggc
tree CFG cleanup : 4.27 ( 1%) usr 0.00 ( 0%) sys 4.27 ( 0%) wall 20 kB ( 0%) ggc
tree VRP : 1.26 ( 0%) usr 0.03 ( 0%) sys 1.33 ( 0%) wall 14 kB ( 0%) ggc
tree copy propagation : 0.85 ( 0%) usr 0.05 ( 0%) sys 0.94 ( 0%) wall 313 kB ( 0%) ggc
tree store copy prop : 0.27 ( 0%) usr 0.01 ( 0%) sys 0.28 ( 0%) wall 5 kB ( 0%) ggc
tree find ref. vars : 0.16 ( 0%) usr 0.03 ( 0%) sys 0.18 ( 0%) wall 12044 kB ( 4%) ggc
tree PTA : 1.55 ( 0%) usr 0.06 ( 0%) sys 1.63 ( 0%) wall 57 kB ( 0%) ggc
tree alias analysis : 2.81 ( 0%) usr 0.29 ( 2%) sys 3.10 ( 0%) wall 0 kB ( 0%) ggc
tree PHI insertion : 0.57 ( 0%) usr 0.92 ( 7%) sys 1.52 ( 0%) wall 3137 kB ( 1%) ggc
tree SSA rewrite : 2.33 ( 0%) usr 0.06 ( 0%) sys 5.02 ( 0%) wall 21592 kB ( 7%) ggc
tree SSA other : 0.41 ( 0%) usr 0.16 ( 1%) sys 0.65 ( 0%) wall 0 kB ( 0%) ggc
tree SSA incremental : 4.18 ( 1%) usr 0.45 ( 4%) sys 4.72 ( 0%) wall 520 kB ( 0%) ggc
tree operand scan : 1.79 ( 0%) usr 0.69 ( 5%) sys 39.97 ( 3%) wall 18374 kB ( 6%) ggc
dominator optimization: 2.91 ( 1%) usr 0.05 ( 0%) sys 2.99 ( 0%) wall 11155 kB ( 4%) ggc
tree SRA : 4.24 ( 1%) usr 0.15 ( 1%) sys 4.51 ( 0%) wall 25568 kB ( 9%) ggc
tree STORE-CCP : 0.29 ( 0%) usr 0.01 ( 0%) sys 0.31 ( 0%) wall 18 kB ( 0%) ggc
tree CCP : 0.87 ( 0%) usr 0.01 ( 0%) sys 2.39 ( 0%) wall 154 kB ( 0%) ggc
tree split crit edges : 0.11 ( 0%) usr 0.02 ( 0%) sys 0.14 ( 0%) wall 9284 kB ( 3%) ggc
tree reassociation : 0.34 ( 0%) usr 0.00 ( 0%) sys 0.33 ( 0%) wall 0 kB ( 0%) ggc
tree code sinking : 0.32 ( 0%) usr 0.00 ( 0%) sys 0.32 ( 0%) wall 0 kB ( 0%) ggc
tree linearize phis : 0.12 ( 0%) usr 0.00 ( 0%) sys 0.12 ( 0%) wall 0 kB ( 0%) ggc
tree forward propagate: 0.10 ( 0%) usr 0.00 ( 0%) sys 0.09 ( 0%) wall 0 kB ( 0%) ggc
tree conservative DCE : 1.13 ( 0%) usr 0.00 ( 0%) sys 1.11 ( 0%) wall 0 kB ( 0%) ggc
tree aggressive DCE : 0.28 ( 0%) usr 0.00 ( 0%) sys 0.28 ( 0%) wall 0 kB ( 0%) ggc
tree DSE : 0.25 ( 0%) usr 0.00 ( 0%) sys 0.22 ( 0%) wall 1 kB ( 0%) ggc
PHI merge : 0.01 ( 0%) usr 0.00 ( 0%) sys 0.00 ( 0%) wall 0 kB ( 0%) ggc
complete unrolling : 0.01 ( 0%) usr 0.00 ( 0%) sys 0.00 ( 0%) wall 0 kB ( 0%) ggc
tree loop init : 0.14 ( 0%) usr 0.00 ( 0%) sys 0.15 ( 0%) wall 0 kB ( 0%) ggc
tree copy headers : 0.10 ( 0%) usr 0.00 ( 0%) sys 0.10 ( 0%) wall 0 kB ( 0%) ggc
tree SSA uncprop : 0.09 ( 0%) usr 0.00 ( 0%) sys 0.09 ( 0%) wall 0 kB ( 0%) ggc
tree SSA to normal : 228.94 (40%) usr 0.64 ( 5%) sys 337.06 (25%) wall 10323 kB ( 4%) ggc
tree rename SSA copies: 0.49 ( 0%) usr 0.03 ( 0%) sys 0.51 ( 0%) wall 0 kB ( 0%) ggc
dominance frontiers : 0.23 ( 0%) usr 0.00 ( 0%) sys 0.26 ( 0%) wall 0 kB ( 0%) ggc
dominance computation : 2.63 ( 0%) usr 0.09 ( 1%) sys 2.85 ( 0%) wall 0 kB ( 0%) ggc
control dependences : 0.04 ( 0%) usr 0.00 ( 0%) sys 0.04 ( 0%) wall 0 kB ( 0%) ggc
expand : 6.10 ( 1%) usr 1.13 ( 9%) sys 192.49 (14%) wall 35008 kB (12%) ggc
jump : 0.09 ( 0%) usr 0.00 ( 0%) sys 0.08 ( 0%) wall 0 kB ( 0%) ggc
CSE : 0.89 ( 0%) usr 0.01 ( 0%) sys 0.89 ( 0%) wall 53 kB ( 0%) ggc
loop analysis : 0.29 ( 0%) usr 0.00 ( 0%) sys 0.28 ( 0%) wall 930 kB ( 0%) ggc
CPROP 1 : 0.01 ( 0%) usr 0.00 ( 0%) sys 0.00 ( 0%) wall 0 kB ( 0%) ggc
CSE 2 : 0.46 ( 0%) usr 0.00 ( 0%) sys 0.46 ( 0%) wall 29 kB ( 0%) ggc
branch prediction : 0.55 ( 0%) usr 0.00 ( 0%) sys 0.56 ( 0%) wall 0 kB ( 0%) ggc
flow analysis : 37.33 ( 6%) usr 0.10 ( 1%) sys 53.59 ( 4%) wall 0 kB ( 0%) ggc
combiner : 1.02 ( 0%) usr 0.02 ( 0%) sys 1.37 ( 0%) wall 2685 kB ( 1%) ggc
if-conversion : 5.21 ( 1%) usr 0.00 ( 0%) sys 5.36 ( 0%) wall 1614 kB ( 1%) ggc
regmove : 0.72 ( 0%) usr 0.01 ( 0%) sys 0.83 ( 0%) wall 4 kB ( 0%) ggc
mode switching : 0.01 ( 0%) usr 0.00 ( 0%) sys 0.02 ( 0%) wall 0 kB ( 0%) ggc
local alloc : 1.06 ( 0%) usr 0.02 ( 0%) sys 1.46 ( 0%) wall 1045 kB ( 0%) ggc
global alloc : 86.33 (15%) usr 4.12 (32%) sys 452.97 (34%) wall 8488 kB ( 3%) ggc
reload CSE regs : 24.86 ( 4%) usr 0.07 ( 1%) sys 28.13 ( 2%) wall 3370 kB ( 1%) ggc
load CSE after reload : 0.08 ( 0%) usr 0.00 ( 0%) sys 0.11 ( 0%) wall 0 kB ( 0%) ggc
flow 2 : 0.36 ( 0%) usr 0.01 ( 0%) sys 1.19 ( 0%) wall 5064 kB ( 2%) ggc
if-conversion 2 : 0.22 ( 0%) usr 0.00 ( 0%) sys 0.24 ( 0%) wall 0 kB ( 0%) ggc
peephole 2 : 0.22 ( 0%) usr 0.00 ( 0%) sys 0.24 ( 0%) wall 0 kB ( 0%) ggc
rename registers : 0.38 ( 0%) usr 0.05 ( 0%) sys 0.50 ( 0%) wall 1 kB ( 0%) ggc
scheduling 2 : 2.10 ( 0%) usr 0.07 ( 1%) sys 2.40 ( 0%) wall 4347 kB ( 1%) ggc
machine dep reorg : 0.31 ( 0%) usr 0.00 ( 0%) sys 0.31 ( 0%) wall 79 kB ( 0%) ggc
reorder blocks : 0.63 ( 0%) usr 0.01 ( 0%) sys 1.06 ( 0%) wall 2738 kB ( 1%) ggc
reg stack : 1.07 ( 0%) usr 0.02 ( 0%) sys 1.53 ( 0%) wall 11030 kB ( 4%) ggc
final : 1.06 ( 0%) usr 0.04 ( 0%) sys 1.18 ( 0%) wall 2182 kB ( 1%) ggc
symout : 0.00 ( 0%) usr 0.00 ( 0%) sys 0.11 ( 0%) wall 0 kB ( 0%) ggc
TOTAL : 575.62 12.78 1351.48 291955 kB
-------------- next part --------------
Index: tree-into-ssa.c
===================================================================
*** tree-into-ssa.c (revision 115645)
--- tree-into-ssa.c (working copy)
*************** compute_global_livein (bitmap livein, bi
*** 369,377 ****
--- 369,386 ----
tos = worklist
= (basic_block *) xmalloc (sizeof (basic_block) * (last_basic_block + 1));
+ /* This function tends to be CPU hog for large CFGs. Avoid unnecesary
+ bitmap queries by caching the in aux blocks. */
EXECUTE_IF_SET_IN_BITMAP (livein, 0, i, bi)
{
*tos++ = BASIC_BLOCK (i);
+ gcc_assert (!BASIC_BLOCK (i)->aux);
+ BASIC_BLOCK (i)->aux = (void *)1;
+ }
+ EXECUTE_IF_SET_IN_BITMAP (def_blocks, 0, i, bi)
+ {
+ gcc_assert (!BASIC_BLOCK (i)->aux);
+ BASIC_BLOCK (i)->aux = (void *)1;
}
/* Iterate until the worklist is empty. */
*************** compute_global_livein (bitmap livein, bi
*** 390,404 ****
int pred_index = pred->index;
/* None of this is necessary for the entry block. */
! if (pred != ENTRY_BLOCK_PTR
! && ! bitmap_bit_p (livein, pred_index)
! && ! bitmap_bit_p (def_blocks, pred_index))
{
*tos++ = pred;
bitmap_set_bit (livein, pred_index);
}
}
}
free (worklist);
}
--- 399,416 ----
int pred_index = pred->index;
/* None of this is necessary for the entry block. */
! if (pred != ENTRY_BLOCK_PTR && !bb->aux)
{
*tos++ = pred;
bitmap_set_bit (livein, pred_index);
+ bb->aux = (void *)1;
}
}
}
+ EXECUTE_IF_SET_IN_BITMAP (livein, 0, i, bi)
+ BASIC_BLOCK (i)->aux = NULL;
+ EXECUTE_IF_SET_IN_BITMAP (def_blocks, 0, i, bi)
+ BASIC_BLOCK (i)->aux = NULL;
free (worklist);
}
More information about the Gcc-bugs
mailing list