Optimize df_worklist_dataflow
Jan Hubicka
hubicka@ucw.cz
Sat Jun 12 18:06:00 GMT 2010
>> I am tracking the age of each basic block. One age is the last time the BB info
>> changed and the other is the last time it was re-scanned. When scanning we need to
>> compute confluences only of those basic blocks that changed since the last
>> check.
>
> Steven is the one who experimented with all the dataflow solvers so I'll
> gladly defer this review to him.
OK :)
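
For readers without the patch at hand, the age tracking quoted above can be sketched roughly as follows. This is only a toy model under stated assumptions, not the code from the patch: `struct block`, `solve_forward` and the `use_ages` switch are hypothetical names, the sets are plain `unsigned` bit masks instead of GCC bitmaps, and it sweeps all blocks round-robin instead of keeping a real worklist. It also assumes a monotone union problem, so `in` can be accumulated across visits.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_BLOCKS 16
#define MAX_PREDS 4

/* Hypothetical toy block, not GCC's basic_block or df structures.  */
struct block {
  int n_preds;
  int preds[MAX_PREDS];     /* indices of predecessor blocks */
  unsigned gen, kill;       /* transfer: out = gen | (in & ~kill) */
  unsigned in, out;
  int last_change_age;      /* age at which `out` last changed */
  int last_visit_age;       /* age at which the block was last scanned */
};

/* Solve a forward union problem by repeated sweeps over all blocks.
   Returns the number of confluence (merge) operations performed.  */
int solve_forward(struct block *bb, int n, bool use_ages)
{
  int age = 0, merges = 0;
  bool any_change = true;

  assert(n >= 0 && n <= MAX_BLOCKS);
  for (int i = 0; i < n; i++) {
    bb[i].in = 0;
    bb[i].out = bb[i].gen;
    bb[i].last_change_age = 0;
    bb[i].last_visit_age = 0;
  }

  while (any_change) {
    any_change = false;
    for (int i = 0; i < n; i++) {
      struct block *b = &bb[i];
      for (int k = 0; k < b->n_preds; k++) {
        const struct block *p = &bb[b->preds[k]];
        /* Skip a predecessor whose `out` has not changed since b was
           last scanned; its contribution is already part of b->in.  */
        if (use_ages && b->last_visit_age > p->last_change_age)
          continue;
        b->in |= p->out;
        merges++;
      }
      unsigned new_out = b->gen | (b->in & ~b->kill);
      b->last_visit_age = ++age;      /* every scan gets a fresh age */
      if (new_out != b->out) {
        b->out = new_out;
        b->last_change_age = age;
        any_change = true;
      }
    }
  }
  return merges;
}
```

Calling it with `use_ages` set to false merges every predecessor on every scan. With `use_ages` set to true it should reach the same solution while performing no more merges, since a predecessor is merged only if it changed at or after the block's last scan.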
Note that the profile has changed today (I believe due to Eric's patch).
The magical slowness of fast_dce that I intended to analyze is gone. I guess the problem
was/is that the conditionals for removing pure/const calls do not match across the
SSA, du and fast DCEs, which might result in a scenario where the fast_dce executed
after reload needs many iterations because it is the first instance to remove some
pure calls. If the problem returns, I will double-check the theory, but
it seems to be the only plausible explanation.
The current profile is:
51198  3.8267  lto1  htab_find_slot_with_hash
19428  1.4521  lto1  df_worklist_dataflow
17117  1.2794  lto1  bitmap_set_bit_1
16122  1.2050  lto1  df_note_compute
14601  1.0913  lto1  htab_traverse_noresize
12909  0.9649  lto1  record_reg_classes.constprop.1
12652  0.9457  lto1  htab_find_with_hash
11987  0.8960  lto1  ggc_internal_alloc_stat
11802  0.8821  lto1  bitmap_clear_bit
10892  0.8141  lto1  walk_tree_1
10197  0.7622  lto1  bitmap_copy
 8886  0.6642  lto1  bitmap_bit_p_1
 8658  0.6471  lto1  et_splay
I am wondering whether there are any obvious improvements for df_note_compute.
One thing I noticed is that it might be reorganized to walk superblocks in reverse to
avoid copying the live bitmaps all the time (it is also one of the busiest callers
of bitmap_copy). But I guess the benefits of such a trick won't be that grand,
and it is a bit difficult to reorganize since it is executed on the all_blocks bitmap
(which I believe is always the whole CFG), and anything that tries to walk superblocks
would result in random access to it.
Honza