Optimize df_worklist_dataflow
Jan Hubicka
hubicka@ucw.cz
Tue Jun 22 16:37:00 GMT 2010
>
> If Steven doesn't complain, go ahead.
Hi,
thanks, I've committed it now. I can address Steven's comments incrementally, if any.
I would like to experiment with optimizing the dataflow worklist implementation now
(i.e. stealing a bit from age to make it cheap to test whether the BB is already in
the queue, and perhaps switching from a bitmap to an array as the worklist
implementation).
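To make the idea concrete, here is a minimal sketch of what such a worklist could look like. It assumes a per-block age counter whose low bit is reused as an "already queued" flag, and a ring buffer of block indices in place of the bitmap. All names (struct worklist, wl_push, wl_pop and so on) are hypothetical and are not taken from df-core.c.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical sketch, not the df-core.c implementation.
   Bit 0 of each per-block word is the "in queue" flag; the
   remaining bits hold the age.  */
#define IN_QUEUE_BIT 1u

struct worklist
{
  unsigned *age_and_flag; /* per block: (age << 1) | in-queue bit */
  int *queue;             /* ring buffer of block indices */
  int n_blocks;           /* capacity; must be greater than zero */
  int head;               /* index of the oldest queued entry */
  int count;              /* number of queued entries */
};

/* Allocate storage for N_BLOCKS blocks.  Returns nonzero on success.  */
static int
wl_init (struct worklist *wl, int n_blocks)
{
  assert (n_blocks > 0);
  wl->age_and_flag = calloc ((size_t) n_blocks, sizeof *wl->age_and_flag);
  wl->queue = malloc ((size_t) n_blocks * sizeof *wl->queue);
  wl->n_blocks = n_blocks;
  wl->head = 0;
  wl->count = 0;
  return wl->age_and_flag != NULL && wl->queue != NULL;
}

static void
wl_free (struct worklist *wl)
{
  free (wl->age_and_flag);
  free (wl->queue);
}

/* Queue block BB unless it is already queued.  The membership test is
   a single bit test.  Returns 1 if BB was added, 0 otherwise.  Each
   block is queued at most once, so N_BLOCKS slots never overflow.  */
static int
wl_push (struct worklist *wl, int bb)
{
  assert (bb >= 0 && bb < wl->n_blocks);
  if (wl->age_and_flag[bb] & IN_QUEUE_BIT)
    return 0;
  wl->age_and_flag[bb] |= IN_QUEUE_BIT;
  wl->queue[(wl->head + wl->count) % wl->n_blocks] = bb;
  wl->count++;
  return 1;
}

/* Remove and return the oldest queued block, or -1 if empty.  */
static int
wl_pop (struct worklist *wl)
{
  int bb;
  if (wl->count == 0)
    return -1;
  bb = wl->queue[wl->head];
  wl->head = (wl->head + 1) % wl->n_blocks;
  wl->count--;
  wl->age_and_flag[bb] &= ~IN_QUEUE_BIT;
  return bb;
}

/* Read the age of BB, ignoring the flag bit.  */
static unsigned
wl_age (const struct worklist *wl, int bb)
{
  return wl->age_and_flag[bb] >> 1;
}

/* Set the age of BB, preserving the flag bit.  */
static void
wl_set_age (struct worklist *wl, int bb, unsigned age)
{
  wl->age_and_flag[bb] = (age << 1) | (wl->age_and_flag[bb] & IN_QUEUE_BIT);
}
```

One caveat with this sketch: a FIFO array visits blocks in insertion order, which may differ from the order in which a bitmap walk would visit them, so any speedup from the cheaper membership test would have to be weighed against a possible change in iteration count.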
The WHOPR build time is now down to 9m36s; 10s of the reduction is due to Jakub's
genattrtab change. That is a 16% improvement since the first WHOPR builds.
With bitset/test inlined I can get to 9m32s, but I would first like to figure
out whether some of the most abusive bitmap users can't be better switched to
something else.
The following passes each take over 1% of the time in the CC1 LTO link:
garbage collection : 11.32 ( 2%) usr 0.24 ( 3%) sys 11.59 ( 2%) wall 0 kB ( 0%) ggc
ipa lto gimple in : 7.23 ( 1%) usr 0.77 ( 9%) sys 8.68 ( 2%) wall 878440 kB (29%) ggc
ipa lto decl in : 4.77 ( 1%) usr 0.21 ( 3%) sys 4.98 ( 1%) wall 248056 kB ( 8%) ggc
cfg cleanup : 9.17 ( 2%) usr 0.02 ( 0%) sys 9.42 ( 2%) wall 30639 kB ( 1%) ggc
trivially dead code : 3.19 ( 1%) usr 0.00 ( 0%) sys 2.96 ( 1%) wall 0 kB ( 0%) ggc
df reaching defs : 4.64 ( 1%) usr 0.03 ( 0%) sys 4.75 ( 1%) wall 0 kB ( 0%) ggc
df live regs : 24.17 ( 5%) usr 0.02 ( 0%) sys 24.22 ( 5%) wall 0 kB ( 0%) ggc
df live&initialized regs: 13.53 ( 3%) usr 0.03 ( 0%) sys 13.73 ( 3%) wall 0 kB ( 0%) ggc
df use-def / def-use chains: 2.70 ( 1%) usr 0.02 ( 0%) sys 2.50 ( 0%) wall 0 kB ( 0%) ggc
df reg dead/unused notes: 9.41 ( 2%) usr 0.05 ( 1%) sys 9.44 ( 2%) wall 68255 kB ( 2%) ggc
register information : 3.12 ( 1%) usr 0.01 ( 0%) sys 3.04 ( 1%) wall 0 kB ( 0%) ggc
alias analysis : 9.25 ( 2%) usr 0.03 ( 0%) sys 9.24 ( 2%) wall 197016 kB ( 7%) ggc
alias stmt walking : 6.75 ( 1%) usr 0.79 (10%) sys 7.36 ( 1%) wall 18244 kB ( 1%) ggc
integration : 7.86 ( 2%) usr 0.55 ( 7%) sys 8.41 ( 2%) wall 722881 kB (24%) ggc
tree CFG cleanup : 6.69 ( 1%) usr 0.05 ( 1%) sys 6.71 ( 1%) wall 17282 kB ( 1%) ggc
tree VRP : 10.64 ( 2%) usr 0.32 ( 4%) sys 10.85 ( 2%) wall 295619 kB (10%) ggc
tree PTA : 4.34 ( 1%) usr 0.01 ( 0%) sys 4.84 ( 1%) wall 40842 kB ( 1%) ggc
tree SSA rewrite : 2.94 ( 1%) usr 0.04 ( 0%) sys 2.95 ( 1%) wall 52184 kB ( 2%) ggc
tree SSA incremental : 6.18 ( 1%) usr 0.30 ( 4%) sys 6.08 ( 1%) wall 53161 kB ( 2%) ggc
tree operand scan : 2.97 ( 1%) usr 1.33 (16%) sys 4.11 ( 1%) wall 442136 kB (15%) ggc
dominator optimization: 5.28 ( 1%) usr 0.06 ( 1%) sys 5.43 ( 1%) wall 116917 kB ( 4%) ggc
tree PRE : 26.72 ( 5%) usr 0.32 ( 4%) sys 26.66 ( 5%) wall 211068 kB ( 7%) ggc
tree FRE : 5.33 ( 1%) usr 0.25 ( 3%) sys 6.19 ( 1%) wall 20183 kB ( 1%) ggc
tree slp vectorization: 3.44 ( 1%) usr 0.06 ( 1%) sys 3.83 ( 1%) wall 277483 kB ( 9%) ggc
dominance computation : 5.51 ( 1%) usr 0.04 ( 0%) sys 5.75 ( 1%) wall 0 kB ( 0%) ggc
expand : 42.65 ( 9%) usr 0.37 ( 5%) sys 43.32 ( 9%) wall 915669 kB (31%) ggc
forward prop : 4.34 ( 1%) usr 0.03 ( 0%) sys 4.95 ( 1%) wall 64467 kB ( 2%) ggc
CSE : 11.46 ( 2%) usr 0.02 ( 0%) sys 11.53 ( 2%) wall 18427 kB ( 1%) ggc
dead store elim1 : 3.67 ( 1%) usr 0.04 ( 0%) sys 4.05 ( 1%) wall 41282 kB ( 1%) ggc
dead store elim2 : 3.68 ( 1%) usr 0.03 ( 0%) sys 3.82 ( 1%) wall 48489 kB ( 2%) ggc
CPROP : 9.85 ( 2%) usr 0.03 ( 0%) sys 9.98 ( 2%) wall 90860 kB ( 3%) ggc
PRE : 11.78 ( 2%) usr 0.04 ( 0%) sys 11.37 ( 2%) wall 14087 kB ( 0%) ggc
CSE 2 : 6.38 ( 1%) usr 0.00 ( 0%) sys 6.25 ( 1%) wall 11012 kB ( 0%) ggc
combiner : 14.21 ( 3%) usr 0.03 ( 0%) sys 14.32 ( 3%) wall 234301 kB ( 8%) ggc
if-conversion : 3.39 ( 1%) usr 0.01 ( 0%) sys 3.13 ( 1%) wall 29919 kB ( 1%) ggc
integrated RA : 27.45 ( 6%) usr 0.02 ( 0%) sys 27.35 ( 5%) wall 119575 kB ( 4%) ggc
reload : 12.18 ( 2%) usr 0.02 ( 0%) sys 12.24 ( 2%) wall 39450 kB ( 1%) ggc
reload CSE regs : 8.64 ( 2%) usr 0.05 ( 1%) sys 8.59 ( 2%) wall 106855 kB ( 4%) ggc
hard reg cprop : 3.14 ( 1%) usr 0.01 ( 0%) sys 3.17 ( 1%) wall 2226 kB ( 0%) ggc
scheduling 2 : 14.65 ( 3%) usr 0.02 ( 0%) sys 13.78 ( 3%) wall 5666 kB ( 0%) ggc
final : 9.01 ( 2%) usr 0.34 ( 4%) sys 11.21 ( 2%) wall 140923 kB ( 5%) ggc
symout : 6.34 ( 1%) usr 0.34 ( 4%) sys 6.52 ( 1%) wall 390733 kB (13%) ggc
variable tracking : 51.66 (10%) usr 0.05 ( 1%) sys 52.09 (10%) wall 360699 kB (12%) ggc
TOTAL : 494.52 8.22 505.99 2984350 kB
It seems that df is still one of the lowest-hanging fruits. The liveness-related
stuff alone is about 10% of compile time.
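As a rough check of that figure, the sketch below sums the usr seconds of three rows from the table and divides by the usr TOTAL. Which rows count as "liveness related" is an assumption here (df live regs, df live&initialized regs and df reg dead/unused notes), and the helper name percent_of_total is hypothetical.

```c
#include <assert.h>

/* usr seconds copied from the table above.  Treating these three
   rows as the "liveness related" ones is an assumption.  */
static const double liveness_usr[] = {
  24.17, /* df live regs */
  13.53, /* df live&initialized regs */
  9.41   /* df reg dead/unused notes */
};

/* usr column of the TOTAL row.  */
static const double total_usr = 494.52;

/* Return the sum of SECS[0..N-1] as a percentage of TOTAL.  */
static double
percent_of_total (const double *secs, int n, double total)
{
  double sum = 0.0;
  int i;
  assert (total > 0.0);
  for (i = 0; i < n; i++)
    sum += secs[i];
  return 100.0 * sum / total;
}
```

Under that assumption the three rows add up to 47.11s of 494.52s usr, i.e. roughly 9.5%.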
Honza