This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: Compile-time improvement for if conversion.
On Wed, Oct 12, 2016 at 3:37 PM, Yuri Rumyantsev <ysrumyan@gmail.com> wrote:
> Richard,
>
> Here is updated patch . I avoided creation of new entry/exit blocks
> but instead add check to border cases - do not consider also blocks
> which are out of region.
>
> Any comments will be appreciated.
I mostly like it now. Can you split out all the common parts from the
dom_info constructor
to a helper method please and just compute m_n_basic_blocks and a max_index and
do all the memory allocation in that common function?
@@ -276,9 +334,10 @@ dom_info::calc_dfs_tree_nonrec (basic_block bb)
bn = e->src;
/* If the next node BN is either already visited or a border
- block the current edge is useless, and simply overwritten
- with the next edge out of the current node. */
- if (bn == m_end_block || m_dfs_order[bn->index])
+ block or out of region the current edge is useless, and simply
+ overwritten with the next edge out of the current node. */
+ if (bn == m_end_block || bn->dom[d_i] == NULL
clever idea ;) Just thinking if this means we support single entry,
multiple exit
regions for CDI_DOMINATORs and multiple entry, single exit regions for
CDI_POST_DOMINATORs. I think so. Please update the function comments
accordingly.
+ m_dfsnum = 1;
+ m_nodes = 0;
+ m_fake_exit_edge = NULL; /* Assume that region is SCC. */
you mean SESE rather than SCC?
@@ -511,7 +573,7 @@ dom_info::calc_idoms ()
: ei_start (bb->preds);
edge_iterator einext;
- if (m_reverse)
+ if (m_reverse && !in_region)
{
/* If this block has a fake edge to exit, process that first. */
if (bitmap_bit_p (m_fake_exit_edge, bb->index))
I think it's better to simply change the if (m_reverse) test to a if
(m_fake_exit_edge) test.
I noticed you do not initialize n_bbs_in_dom_tree (just set it to
zero), it's not really used
anywhere (but in an assert).
free_dominance_info_for_region needs a function level comment (per
coding guidelines).
Please make free_dominance_info_for_region take a struct function * pointer.
@@ -1318,11 +1350,13 @@ if_convertible_loop_p_1 (struct loop *loop,
vec<data_reference_p> *refs)
{
unsigned int i;
basic_block exit_bb = NULL;
+ vec<basic_block> region;
if (find_data_references_in_loop (loop, refs) == chrec_dont_know)
return false;
- calculate_dominance_info (CDI_DOMINATORS);
+ if (dom_info_state (CDI_DOMINATORS) != DOM_OK)
+ calculate_dominance_info (CDI_DOMINATORS);
This is a premature (unwanted) change.
Other than that the only O(function-size) part left is the zeroing in
+ /* Determine max basic block index in region. */
+ int max_index = region[0]->index;
+ for (size_t i = 1; i < num; i++)
+ if (region[i]->index > max_index)
+ max_index = region[i]->index;
+ max_index += 1;
+ m_dfs_order = new_zero_array <TBB> (max_index + 1);
+ m_dfs_last = &m_dfs_order[max_index];
I suppose we can try addressing this as followup.
Thanks a lot for doing this.
Richard.
> 2016-10-11 16:48 GMT+03:00 Richard Biener <richard.guenther@gmail.com>:
>> On Tue, Oct 11, 2016 at 3:23 PM, Yuri Rumyantsev <ysrumyan@gmail.com> wrote:
>>> Richard,
>>>
>>> I implemented this by passing callback function in_region which
>>> returns true if block belongs to region.
>>> I am testing it now
>>>
>>> I attach modified patch for your quick review.
>>
>> + FOR_EACH_VEC_ELT (region, i, bb)
>> + {
>> + bb->dom[dir_index] = et_new_tree (bb);
>> + }
>> + /* Check that region is SESE region. */
>> + entry = region[0];
>> + if ( EDGE_COUNT (entry->succs) > 1)
>> + {
>> + /* Create new entry block with the only successor. */
>> + basic_block succ = NULL;
>> + bool found = false;
>> + FOR_EACH_EDGE (e, ei, entry->succs)
>> + if (in_region (region, e->dest))
>> + {
>> + gcc_assert (!found);
>>
>> is that check equal to e->dest->dom[dir_index] != NULL? I think we
>> shouldn't need the callback.
>>
>> + new_entry = create_empty_bb (entry);
>> + unchecked_make_edge (new_entry, succ, 0);
>>
>> I still think this is somewhat gross and we should try to avoid it
>> somehow - at least it's well-hidden now ;)
>>
>> + /* Put it to region as entry node. */
>> + region[0] = new_entry;
>>
>> so region[0] is overwritten now?
>>
>> As far as I understand calc_dfs_tree it should be possible to compute
>> the region on-the-fly
>> from calling calc_dfs_tree_nonrec on the entry/exit (also maybe
>> avoiding some of the still
>> "large" complexity of zeroing arrays in the constructor).
>>
>> And if we use that DFS walk directly we should be able to avoid
>> creating those fake entry/exit
>> blocks by using entry/exit edges instead... (somehow).
>>
>> Richard.
>>
>>
>>
>>> Thanks.
>>>
>>> 2016-10-11 13:33 GMT+03:00 Richard Biener <richard.guenther@gmail.com>:
>>>> On Mon, Oct 10, 2016 at 4:17 PM, Yuri Rumyantsev <ysrumyan@gmail.com> wrote:
>>>>> Richard,
>>>>>
>>>>> If "fake" exit or entry block is created in dominance how we can
>>>>> determine what is its the only predecessor or successor without using
>>>>> a notion of loop?
>>>>
>>>> The caller passes in an entry and exit edge instead of a block or loop.
>>>>
>>>> Richard.
>>>>
>>>>> 2016-10-10 15:00 GMT+03:00 Richard Biener <richard.guenther@gmail.com>:
>>>>>> On Mon, Oct 10, 2016 at 1:42 PM, Yuri Rumyantsev <ysrumyan@gmail.com> wrote:
>>>>>>> Thanks Richard for your comments.
>>>>>>> I'd like to answer on your last comment regarding use split_edge()
>>>>>>> instead of creating fake post-header. I started with this splitting
>>>>>>> but it requires to fix-up closed ssa form by creating additional phi
>>>>>>> nodes, so I decided to use only cfg change without updating ssa form.
>>>>>>> Other changes look reasonable and will fix them.
>>>>>>
>>>>>> Ah. In this case can you investigate what it takes to make the entry/exit
>>>>>> edges rather than BBs? That is, introduce those "fakes" only internally
>>>>>> in dominance.c?
>>>>>>
>>>>>>> 2016-10-10 12:52 GMT+03:00 Richard Biener <richard.guenther@gmail.com>:
>>>>>>>> On Wed, Oct 5, 2016 at 3:22 PM, Yuri Rumyantsev <ysrumyan@gmail.com> wrote:
>>>>>>>>> Hi All,
>>>>>>>>>
>>>>>>>>> Here is implementation of Richard proposal:
>>>>>>>>>
>>>>>>>>> < For general infrastructure it would be nice to expose a (post-)dominator
>>>>>>>>> < compute for MESE (post-dominators) / SEME (dominators) regions. I believe
>>>>>>>>> < what makes if-conversion expensive is the post-dom compute which happens
>>>>>>>>> < for each loop for the whole function. It shouldn't be very difficult
>>>>>>>>> < to write this,
>>>>>>>>> < sharing as much as possible code with the current DOM code might need
>>>>>>>>> < quite some refactoring though.
>>>>>>>>>
>>>>>>>>> I implemented this proposal by adding calculation of dominance info
>>>>>>>>> for SESE regions and incorporate this change to if conversion pass.
>>>>>>>>> SESE region is built by adding loop pre-header and possibly fake
>>>>>>>>> post-header blocks to loop body. Fake post-header is deleted after
>>>>>>>>> predication completion.
>>>>>>>>>
>>>>>>>>> Bootstrapping and regression testing did not show any new failures.
>>>>>>>>>
>>>>>>>>> Is it OK for trunk?
>>>>>>>>
>>>>>>>> It's mostly reasonable but I have a few comments. First, re-using
>>>>>>>> bb->dom[] for the dominator info is somewhat fragile but indeed
>>>>>>>> a requirement to make the patch reasonably small. Please,
>>>>>>>> in calculate_dominance_info_for_region, make sure that
>>>>>>>> !dom_info_available_p (dir).
>>>>>>>>
>>>>>>>> You pass loop * everywhere but require ->aux to be set up as
>>>>>>>> an array of BBs forming the region with special BBs at array ends.
>>>>>>>>
>>>>>>>> Please instead pass in a vec<basic_block> which avoids using ->aux
>>>>>>>> and also allows other non-loop-based SESE regions to be used
>>>>>>>> (I couldn't spot anything that relies on this being a loop).
>>>>>>>>
>>>>>>>> Adding a convenience wrapper for loop * would be of course nice,
>>>>>>>> to cover the special pre/post-header code in tree-if-conv.c.
>>>>>>>>
>>>>>>>> In theory a SESE region is fully specified by its entry end exit _edge_,
>>>>>>>> so you might want to see if it's possible to use such a pair of edges
>>>>>>>> to guard the dfs/idom walks to avoid the need to create fake blocks.
>>>>>>>>
>>>>>>>> Btw, instead of using create_empty_bb, unchecked_make_edge, etc.
>>>>>>>> please use split_edge() of the entry/exit edges.
>>>>>>>>
>>>>>>>> Richard.
>>>>>>>>
>>>>>>>>> ChangeLog:
>>>>>>>>> 2016-10-05 Yuri Rumyantsev <ysrumyan@gmail.com>
>>>>>>>>>
>>>>>>>>> * dominance.c : Include cfgloop.h for loop recognition.
>>>>>>>>> (dom_info): Add new functions and add boolean argument to recognize
>>>>>>>>> computation for loop region.
>>>>>>>>> (dom_info::dom_info): New function.
>>>>>>>>> (dom_info::calc_dfs_tree): Add boolean argument IN_REGION to not
>>>>>>>>> handle unvisited blocks.
>>>>>>>>> (dom_info::calc_idoms): Likewise.
>>>>>>>>> (compute_dom_fast_query_in_region): New function.
>>>>>>>>> (calculate_dominance_info): Invoke calc_dfs_tree and calc_idoms with
>>>>>>>>> false argument.
>>>>>>>>> (calculate_dominance_info_for_region): New function.
>>>>>>>>> (free_dominance_info_for_region): Likewise.
>>>>>>>>> (verify_dominators): Invoke calc_dfs_tree and calc_idoms with false
>>>>>>>>> argument.
>>>>>>>>> * dominance.h: Add prototype for introduced functions
>>>>>>>>> calculate_dominance_info_for_region and
>>>>>>>>> free_dominance_info_for_region.
>>>>>>>>> tree-if-conv.c: Add to local variables ifc_sese_bbs & fake_postheader.
>>>>>>>>> (build_sese_region): New function.
>>>>>>>>> (if_convertible_loop_p_1): Invoke local version of post-dominators
>>>>>>>>> calculation, free it after basic block predication and delete created
>>>>>>>>> fake post-header block if any.
>>>>>>>>> (tree_if_conversion): Delete call of free_dominance_info for
>>>>>>>>> post-dominators, free ifc_sese_bbs which represents SESE region.
>>>>>>>>> (pass_if_conversion::execute): Delete detection of infinite loops
>>>>>>>>> and fake edges to exit block since post-dominator calculation is
>>>>>>>>> performed per if-converted loop only.