This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH,1/2] Extended if-conversion for loops marked with pragma omp simd.
- From: Richard Biener <richard dot guenther at gmail dot com>
- To: Yuri Rumyantsev <ysrumyan at gmail dot com>
- Cc: gcc-patches <gcc-patches at gcc dot gnu dot org>, Igor Zamyatin <izamyatin at gmail dot com>
- Date: Tue, 21 Oct 2014 15:38:37 +0200
- Subject: Re: [PATCH,1/2] Extended if-conversion for loops marked with pragma omp simd.
- Authentication-results: sourceware.org; auth=none
- References: <CAEoMCqRM0B1TzUmSehsV3S2c7Pg9W7tnr6Hb78wbxGpeQ+OMxA at mail dot gmail dot com> <CAFiYyc0bL-GSp0c3WZaSgyvTHvjpfkXkzQHKvxMQhibkb+GgUA at mail dot gmail dot com> <CAEoMCqQnaKq2qu+v48wJJsHUF51e1V0Qf=0EneQk7Dz0Zjcqzg at mail dot gmail dot com> <CAFiYyc1MzzAqb-+EJj6FR6yke2KHc=BCH0wyU2wpcTAiE9W6xg at mail dot gmail dot com> <CAEoMCqTK9KQzRwG09RRk0xbu=4+xL9NrN4FKRi6TunAYKngQ4Q at mail dot gmail dot com> <CAFiYyc1jxMQuqmzSMJnHWKt1ZHArK1q2oy4_PdauEJ6rm=+oDQ at mail dot gmail dot com> <CAEoMCqSgvpbCpOOh1SvLssso9O8m3-qOGOpawQLu-Af3uJWVWQ at mail dot gmail dot com> <CAEoMCqRn5n7QmRoN0b0eSNUCT2JermeRqiKD2j+e+o_2vV2QJQ at mail dot gmail dot com> <CAFiYyc3cF_Ds=Dsp63EprVR3DLi0OjFhSXRaqmAmRW+S-hakJA at mail dot gmail dot com> <CAEoMCqSTnic_GCuy48jPz6obf-XH8LMqkx4uh369uBi-_wi1ww at mail dot gmail dot com>
On Tue, Oct 21, 2014 at 3:20 PM, Yuri Rumyantsev <ysrumyan@gmail.com> wrote:
> Richard,
>
> Yes, This patch does not make sense since phi node predication for bb
> with critical incoming edges only performs another function which is
> absent (predicate_extended_scalar_phi).
>
> BTW I see that commit_edge_insertions() is used for rtx instructions
> only but you propose to use it for tree also.
> Did I miss something?
Ah, it's gsi_commit_edge_inserts () (or gsi_commit_one_edge_insert
if you want easy access to the newly created basic block to push
the predicate to - see gsi_commit_edge_inserts implementation).
Richard.
> Thanks ahead.
>
>
> 2014-10-21 16:44 GMT+04:00 Richard Biener <richard.guenther@gmail.com>:
>> On Tue, Oct 21, 2014 at 2:25 PM, Yuri Rumyantsev <ysrumyan@gmail.com> wrote:
>>> Richard,
>>>
>>> I did some changes in patch and ChangeLog to mark that support for
>>> if-convert of blocks with only critical incoming edges will be added
>>> in the future (more precise in patch.4).
>>
>> But the same reasoning applies to this version of the patch when
>> flag_force_vectorize is true!? (insertion point and invalid SSA form)
>>
>> Which means the patch doesn't make sense in isolation?
>>
>> Btw, I think for the case you should simply do gsi_insert_on_edge ()
>> and commit_edge_insertions () before the call to combine_blocks
>> (pushing the edge predicate to the newly created block).
>>
>> Richard.
>>
>>> Could you please review it.
>>>
>>> Thanks.
>>>
>>> ChangeLog:
>>>
>>> 2014-10-21 Yuri Rumyantsev <ysrumyan@gmail.com>
>>>
>>> (flag_force_vectorize): New variable.
>>> (edge_predicate): New function.
>>> (set_edge_predicate): New function.
>>> (add_to_dst_predicate_list): Conditionally invoke add_to_predicate_list
>>> if destination block of edge is not always executed. Set-up predicate
>>> for critical edge.
>>> (if_convertible_phi_p): Accept phi nodes with more than two args
>>> if FLAG_FORCE_VECTORIZE was set-up.
>>> (ifcvt_can_use_mask_load_store): Use FLAG_FORCE_VECTORIZE.
>>> (if_convertible_stmt_p): Fix up pre-function comments.
>>> (all_preds_critical_p): New function.
>>> (if_convertible_bb_p): Use call of all_preds_critical_p
>>> to reject temporarily block if-conversion with incoming critical edges
>>> if FLAG_FORCE_VECTORIZE was not set-up. This restriction will be deleted
>>> after adding support for extended predication.
>>> (predicate_bbs): Skip loop exit block also.Invoke build2_loc
>>> to compute predicate instead of fold_build2_loc.
>>> Add zeroing of edge 'aux' field.
>>> (find_phi_replacement_condition): Extend function interface:
>>> it returns NULL if given phi node must be handled by means of
>>> extended phi node predication. If number of predecessors of phi-block
>>> is equal 2 and at least one incoming edge is not critical original
>>> algorithm is used.
>>> (tree_if_conversion): Temporarily set-up FLAG_FORCE_VECTORIZE to false.
>>> Nullify 'aux' field of edges for blocks with two successors.
>>>
>>> 2014-10-20 17:55 GMT+04:00 Yuri Rumyantsev <ysrumyan@gmail.com>:
>>>> Richard,
>>>>
>>>> Thanks for your answer!
>>>>
>>>> In current implementation phi node conversion assume that one of
>>>> incoming edge to bb containing given phi has at least one non-critical
>>>> edge and choose it to insert predicated code. But if we choose
>>>> critical edge we need to determine insert point and insertion
>>>> direction (before/after) since in other case we can get invalid ssa
>>>> form (use before def). This is done by my new function which is not in
>>>> current patch ( I will present this patch later). SO I assume that we
>>>> need to leave this patch as it is to not introduce new bugs.
>>>>
>>>> Thanks.
>>>> Yuri.
>>>>
>>>> 2014-10-20 12:00 GMT+04:00 Richard Biener <richard.guenther@gmail.com>:
>>>>> On Fri, Oct 17, 2014 at 4:09 PM, Yuri Rumyantsev <ysrumyan@gmail.com> wrote:
>>>>>> Richard,
>>>>>>
>>>>>> I reworked the patch as you proposed, but I didn't understand what
>>>>>> did you mean by:
>>>>>>
>>>>>>>So please rework the patch so critical edges are always handled
>>>>>>>correctly.
>>>>>>
>>>>>> In current patch flag_force_vectorize is used (1) to reject phi nodes
>>>>>> with more than 2 arguments; (2) to reject basic blocks with only
>>>>>> critical incoming edges since support for extended predication of phi
>>>>>> nodes will be in next patch.
>>>>>
>>>>> I mean that (2) should not be rejected dependent on flag_force_vectorize.
>>>>> It was rejected because if-cvt couldn't handle it correctly before but with
>>>>> this patch this is fixed. I see no reason to still reject this then even
>>>>> for !flag_force_vectorize.
>>>>>
>>>>> Rejecting PHIs with more than two arguments with flag_force_vectorize
>>>>> is ok.
>>>>>
>>>>> Richard.
>>>>>
>>>>>> Could you please clarify your statement.
>>>>>>
>>>>>> I attached modified patch.
>>>>>>
>>>>>> ChangeLog:
>>>>>>
>>>>>> 2014-10-17 Yuri Rumyantsev <ysrumyan@gmail.com>
>>>>>>
>>>>>> (flag_force_vectorize): New variable.
>>>>>> (edge_predicate): New function.
>>>>>> (set_edge_predicate): New function.
>>>>>> (add_to_dst_predicate_list): Conditionally invoke add_to_predicate_list
>>>>>> if destination block of edge is not always executed. Set-up predicate
>>>>>> for critical edge.
>>>>>> (if_convertible_phi_p): Accept phi nodes with more than two args
>>>>>> if FLAG_FORCE_VECTORIZE was set-up.
>>>>>> (ifcvt_can_use_mask_load_store): Use FLAG_FORCE_VECTORIZE.
>>>>>> (if_convertible_stmt_p): Fix up pre-function comments.
>>>>>> (all_edges_are_critical): New function.
>>>>>> (if_convertible_bb_p): Use call of all_preds_critical_p
>>>>>> to reject block if-conversion with incoming critical edges only if
>>>>>> FLAG_FORCE_VECTORIZE was not set-up.
>>>>>> (predicate_bbs): Skip loop exit block also.Invoke build2_loc
>>>>>> to compute predicate instead of fold_build2_loc.
>>>>>> Add zeroing of edge 'aux' field.
>>>>>> (find_phi_replacement_condition): Extend function interface:
>>>>>> it returns NULL if given phi node must be handled by means of
>>>>>> extended phi node predication. If number of predecessors of phi-block
>>>>>> is equal 2 and atleast one incoming edge is not critical original
>>>>>> algorithm is used.
>>>>>> (tree_if_conversion): Temporary set-up FLAG_FORCE_VECTORIZE to false.
>>>>>> Nullify 'aux' field of edges for blocks with two successors.
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> 2014-10-17 13:09 GMT+04:00 Richard Biener <richard.guenther@gmail.com>:
>>>>>>> On Thu, Oct 16, 2014 at 5:42 PM, Yuri Rumyantsev <ysrumyan@gmail.com> wrote:
>>>>>>>> Richard,
>>>>>>>>
>>>>>>>> Here is reduced patch as you requested. All your remarks have been fixed.
>>>>>>>> Could you please look at it ( I have already sent the patch with
>>>>>>>> changes in add_to_predicate_list for review).
>>>>>>>
>>>>>>> + if (dump_file && (dump_flags & TDF_DETAILS))
>>>>>>> + fprintf (dump_file, "More than two phi node args.\n");
>>>>>>> + return false;
>>>>>>> + }
>>>>>>> +
>>>>>>> + }
>>>>>>>
>>>>>>> Excess vertical space.
>>>>>>>
>>>>>>>
>>>>>>> +/* Assumes that BB has more than 2 predecessors.
>>>>>>>
>>>>>>> More than 1 predecessor?
>>>>>>>
>>>>>>> + Returns false if at least one successor is not on critical edge
>>>>>>> + and true otherwise. */
>>>>>>> +
>>>>>>> +static inline bool
>>>>>>> +all_edges_are_critical (basic_block bb)
>>>>>>> +{
>>>>>>>
>>>>>>> "all_preds_critical_p" would be a better name
>>>>>>>
>>>>>>> + if (EDGE_COUNT (bb->preds) > 2)
>>>>>>> + {
>>>>>>> + if (!flag_force_vectorize)
>>>>>>> + return false;
>>>>>>> + }
>>>>>>>
>>>>>>> as I said in the last review I don't think we should restrict edge
>>>>>>> predicates to flag_force_vectorize. At least I can't see how
>>>>>>> if-conversion is magically more expensive for that case?
>>>>>>>
>>>>>>> So please rework the patch so critical edges are always handled
>>>>>>> correctly.
>>>>>>>
>>>>>>> Ok with that and the above suggested changes.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Richard.
>>>>>>>
>>>>>>>
>>>>>>>> Thanks.
>>>>>>>> Yuri.
>>>>>>>> ChangeLog
>>>>>>>> 2014-10-16 Yuri Rumyantsev <ysrumyan@gmail.com>
>>>>>>>>
>>>>>>>> (flag_force_vectorize): New variable.
>>>>>>>> (edge_predicate): New function.
>>>>>>>> (set_edge_predicate): New function.
>>>>>>>> (add_to_dst_predicate_list): Conditionally invoke add_to_predicate_list
>>>>>>>> if destination block of edge is not always executed. Set-up predicate
>>>>>>>> for critical edge.
>>>>>>>> (if_convertible_phi_p): Accept phi nodes with more than two args
>>>>>>>> if FLAG_FORCE_VECTORIZE was set-up.
>>>>>>>> (ifcvt_can_use_mask_load_store): Use FLAG_FORCE_VECTORIZE.
>>>>>>>> (if_convertible_stmt_p): Fix up pre-function comments.
>>>>>>>> (all_edges_are_critical): New function.
>>>>>>>> (if_convertible_bb_p): Allow bb has more than two predecessors if
>>>>>>>> FLAG_FORCE_VECTORIZE was set-up. Use call of all_edges_are_critical
>>>>>>>> to reject block if-conversion with incoming critical edges only if
>>>>>>>> FLAG_FORCE_VECTORIZE was not set-up.
>>>>>>>> (predicate_bbs): Skip loop exit block also.Invoke build2_loc
>>>>>>>> to compute predicate instead of fold_build2_loc.
>>>>>>>> Add zeroing of edge 'aux' field.
>>>>>>>> (find_phi_replacement_condition): Extend function interface:
>>>>>>>> it returns NULL if given phi node must be handled by means of
>>>>>>>> extended phi node predication. If number of predecessors of phi-block
>>>>>>>> is equal 2 and atleast one incoming edge is not critical original
>>>>>>>> algorithm is used.
>>>>>>>> (tree_if_conversion): Temporary set-up FLAG_FORCE_VECTORIZE to false.
>>>>>>>> Nullify 'aux' field of edges for blocks with two successors.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> 2014-10-15 13:50 GMT+04:00 Richard Biener <richard.guenther@gmail.com>:
>>>>>>>>> On Mon, Oct 13, 2014 at 11:38 AM, Yuri Rumyantsev <ysrumyan@gmail.com> wrote:
>>>>>>>>>> Richard,
>>>>>>>>>>
>>>>>>>>>> Here is updated patch (part1) for extended if conversion.
>>>>>>>>>>
>>>>>>>>>> Second part of patch will be sent later.
>>>>>>>>>
>>>>>>>>> Ok, I'm starting to look at this. I'd still like you to split things up
>>>>>>>>> more.
>>>>>>>>>
>>>>>>>>> static inline void
>>>>>>>>> add_to_predicate_list (struct loop *loop, basic_block bb, tree nc)
>>>>>>>>> {
>>>>>>>>> ...
>>>>>>>>>
>>>>>>>>> + /* We use notion of cd equivalence to get simplier predicate for
>>>>>>>>> + join block, e.g. if join block has 2 predecessors with predicates
>>>>>>>>> + p1 & p2 and p1 & !p2, we'd like to get p1 for it instead of
>>>>>>>>> + p1 & p2 | p1 & !p2. */
>>>>>>>>> + if (dom_bb != loop->header
>>>>>>>>> + && get_immediate_dominator (CDI_POST_DOMINATORS, dom_bb) == bb)
>>>>>>>>> + {
>>>>>>>>> + gcc_assert (flow_bb_inside_loop_p (loop, dom_bb));
>>>>>>>>> + bc = bb_predicate (dom_bb);
>>>>>>>>> + gcc_assert (!is_true_predicate (bc));
>>>>>>>>>
>>>>>>>>> these changes look worthwhile even for !flag_force_vectorize. So please
>>>>>>>>> split the change to add_to_predicate_list out and compute post-dominators
>>>>>>>>> unconditionally. Note that you should call free_dominance_info
>>>>>>>>> (CDI_POST_DOMINATORS) at the end of if-conversion.
>>>>>>>>>
>>>>>>>>> + if (!dominated_by_p (CDI_DOMINATORS, loop->latch, e->dest))
>>>>>>>>> + add_to_predicate_list (loop, e->dest, cond);
>>>>>>>>> +
>>>>>>>>> + /* If edge E is critical save predicate on it. */
>>>>>>>>> + if (EDGE_COUNT (e->dest->preds) >= 2)
>>>>>>>>> + set_edge_predicate (e, cond);
>>>>>>>>>
>>>>>>>>> how do we know the edge is critical by this simple check? Why not
>>>>>>>>> simply always save edge predicates (well, you kind of do but omit
>>>>>>>>> the case where e->src dominates e->dest).
>>>>>>>>>
>>>>>>>>> Btw, you can rely on edge->aux being NULL at the start of the
>>>>>>>>> pass but need to clear it at the end (best use clear_aux_for_edges ()
>>>>>>>>> for that). So stuff like
>>>>>>>>>
>>>>>>>>> + extract_true_false_edges_from_block (bb, &true_edge, &false_edge);
>>>>>>>>> + if (flag_force_vectorize)
>>>>>>>>> + true_edge->aux = false_edge->aux = NULL;
>>>>>>>>>
>>>>>>>>> shouldn't be necessary.
>>>>>>>>>
>>>>>>>>> I think the edge predicate handling should also be unconditionally
>>>>>>>>> and not depend on flag_force_vectorize.
>>>>>>>>>
>>>>>>>>> + /* The loop latch and loop exit block are always executed and
>>>>>>>>> + have no extra conditions to be processed: skip them. */
>>>>>>>>> + if (bb == loop->latch
>>>>>>>>> + || bb_with_exit_edge_p (loop, bb))
>>>>>>>>>
>>>>>>>>> I don't think the edge stuff is true - given you still only reset the
>>>>>>>>> loop->latch bb predicate the change looks broken.
>>>>>>>>>
>>>>>>>>> + /* Fold_build2 can produce bool conversion which is not
>>>>>>>>> + supported by vectorizer, so re-build it without folding.
>>>>>>>>> + For example, such conversion is generated for sequence:
>>>>>>>>> + _Bool _7, _8, _9;
>>>>>>>>> + _7 = _6 != 13; _8 = _6 != 0; _9 = _8 & _9;
>>>>>>>>> + if (_9 != 0) --> (bool)_9. */
>>>>>>>>> +
>>>>>>>>> + if (CONVERT_EXPR_P (c)
>>>>>>>>> + && TREE_CODE_CLASS (code) == tcc_comparison)
>>>>>>>>>
>>>>>>>>> I think you should simply use canonicalize_cond_expr_cond on the
>>>>>>>>> folding result. Or rather _not_ fold at all - we are taking the
>>>>>>>>> operands from the GIMPLE condition unmodified after all.
>>>>>>>>>
>>>>>>>>> - add_to_dst_predicate_list (loop, false_edge,
>>>>>>>>> - unshare_expr (cond), c2);
>>>>>>>>> + add_to_dst_predicate_list (loop, false_edge, unshare_expr (cond),
>>>>>>>>> + unshare_expr (c2));
>>>>>>>>>
>>>>>>>>> why is it necessary to unshare c2?
>>>>>>>>>
>>>>>>>>> Please split out the PHI-with-multi-arg handling (I have not looked at
>>>>>>>>> that in detail).
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Richard.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> Changelog.
>>>>>>>>>>
>>>>>>>>>> 2014-10-13 Yuri Rumyantsev <ysrumyan@gmail.com>
>>>>>>>>>>
>>>>>>>>>> * tree-if-conv.c (cgraph.h): Add include file to detect function clone.
>>>>>>>>>> (flag_force_vectorize): New variable.
>>>>>>>>>> (edge_predicate): New function.
>>>>>>>>>> (set_edge_predicate): New function.
>>>>>>>>>> (add_to_predicate_list): Check unconditionally that bb is always
>>>>>>>>>> executed to early exit. Use predicate of cd-equivalent block
>>>>>>>>>> for join blocks if it exists.
>>>>>>>>>> (add_to_dst_predicate_list): Invoke add_to_predicate_list if
>>>>>>>>>> destination block of edge is not always executed. Set-up predicate
>>>>>>>>>> for critical edge.
>>>>>>>>>> (if_convertible_phi_p): Accept phi nodes with more than two args
>>>>>>>>>> if FLAG_FORCE_VECTORIZE was set-up.
>>>>>>>>>> (ifcvt_can_use_mask_load_store): Use FLAG_FORCE_VECTORIZE.
>>>>>>>>>> (if_convertible_stmt_p): Fix up pre-function comments.
>>>>>>>>>> (all_edges_are_critical): New function.
>>>>>>>>>> (if_convertible_bb_p): Allow bb has more than two predecessors if
>>>>>>>>>> FLAG_FORCE_VECTORIZE was set-up. Use call of all_edges_are_critical
>>>>>>>>>> to reject block if-conversion with incoming critical edges only if
>>>>>>>>>> FLAG_FORCE_VECTORIZE was not set-up.
>>>>>>>>>> (predicate_bbs): Skip loop exit block also. Add check that if
>>>>>>>>>> fold_build2 produces bool conversion, recompute predicate using
>>>>>>>>>> build2_loc. Add zeroing of edge 'aux' field under FLAG_FORCE_VECTORIZE.
>>>>>>>>>> (if_convertible_loop_p_1): Recompute POST_DOMINATOR tree if
>>>>>>>>>> FLAG_FORCE_VECTORIZE was set-up to calculate cd equivalent bb's.
>>>>>>>>>> (find_phi_replacement_condition): Extend function interface:
>>>>>>>>>> it returns NULL if given phi node must be handled by means of
>>>>>>>>>> extended phi node predication. If number of predecessors of phi-block
>>>>>>>>>> is equal 2 and atleast one incoming edge is not critical original
>>>>>>>>>> algorithm is used.
>>>>>>>>>> (get_predicate_for_edge): New function.
>>>>>>>>>> (find_insertion_point): New function.
>>>>>>>>>> (predicate_arbitrary_scalar_phi): New function.
>>>>>>>>>> (predicate_all_scalar_phis): Introduce new variable BEFORE.
>>>>>>>>>> Invoke find_insertion_point to initialize gsi and
>>>>>>>>>> predicate_arbitrary_scalar_phi if TRUE_BB is NULL - it signals
>>>>>>>>>> that extended predication must be applied).
>>>>>>>>>> (insert_gimplified_predicates): Add test for non-predicated basic
>>>>>>>>>> blocks that there are no gimplified statements to insert. Insert
>>>>>>>>>> predicates at the block begining for extended if-conversion.
>>>>>>>>>> (tree_if_conversion): Initialize flag_force_vectorize from current
>>>>>>>>>> loop or outer loop (to support pragma omp declare).Do loop versioning
>>>>>>>>>> for innermost loop marked with pragma omp simd and
>>>>>>>>>> FLAG_TREE_LOOP_IF_CONVERT was not sett-up. Nullify 'aux' field of edges
>>>>>>>>>> for blocks with two successors.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> 2014-09-22 12:28 GMT+04:00 Yuri Rumyantsev <ysrumyan@gmail.com>:
>>>>>>>>>>> Richard,
>>>>>>>>>>>
>>>>>>>>>>> here is reduced patch (part.1) which was reduced almost twice.
>>>>>>>>>>> Let's me also answer on your comments.
>>>>>>>>>>>
>>>>>>>>>>> 1. I really use edge field 'aux' to keep predicate for critical edges.
>>>>>>>>>>> My previous code was not correct and now it looks like:
>>>>>>>>>>>
>>>>>>>>>>> if (EDGE_COUNT (b->succs) == 1 || EDGE_COUNT (e->dest->preds) == 1)
>>>>>>>>>>> /* Edge E is not critical, use predicate of edge source bb. */
>>>>>>>>>>> c = bb_predicate (b);
>>>>>>>>>>> else
>>>>>>>>>>> /* Edge E is critical and its aux field contains predicate. */
>>>>>>>>>>> c = edge_predicate (e);
>>>>>>>>>>>
>>>>>>>>>>> 2. I completely delete all code related to creation of conditional
>>>>>>>>>>> expressions and completely rely on bool pattern recognition in
>>>>>>>>>>> vectorizer. But we need to delete all dead predicate computations
>>>>>>>>>>> which are not used since they prevent vectorization. I will add this
>>>>>>>>>>> local-dce function in next patch.
>>>>>>>>>>> 3. I also did not include in this patch recognition of general
>>>>>>>>>>> phi-nodes with two arguments only for which conversion of conditional
>>>>>>>>>>> scalar reduction can be applied also.
>>>>>>>>>>> Note that all these changes are applied for loop marked with pragma
>>>>>>>>>>> omp simd only.
>>>>>>>>>>>
>>>>>>>>>>> 2014-09-22 Yuri Rumyantsev <ysrumyan@gmail.com>
>>>>>>>>>>>
>>>>>>>>>>> * tree-if-conv.c (cgraph.h): Add include file to detect function clone.
>>>>>>>>>>> (flag_force_vectorize): New variable.
>>>>>>>>>>> (edge_predicate): New function.
>>>>>>>>>>> (set_edge_predicate): New function.
>>>>>>>>>>> (convert_name_to_cmp): New function.
>>>>>>>>>>> (add_to_predicate_list): Check unconditionally that bb is always
>>>>>>>>>>> executed to early exit. Use predicate of cd-equivalent block
>>>>>>>>>>> for join blocks if it exists.
>>>>>>>>>>> (add_to_dst_predicate_list): Invoke add_to_predicate_list if
>>>>>>>>>>> destination block of edge is not always executed. Set-up predicate
>>>>>>>>>>> for critical edge.
>>>>>>>>>>> (if_convertible_phi_p): Accept phi nodes with more than two args
>>>>>>>>>>> if FLAG_FORCE_VECTORIZE was set-up.
>>>>>>>>>>> (ifcvt_can_use_mask_load_store): Use FLAG_FORCE_VECTORIZE.
>>>>>>>>>>> (if_convertible_stmt_p): Fix up pre-function comments.
>>>>>>>>>>> (all_edges_are_critical): New function.
>>>>>>>>>>> (if_convertible_bb_p): Allow bb has more than two predecessors if
>>>>>>>>>>> FLAG_FORCE_VECTORIZE was set-up. Use call of all_edges_are_critical
>>>>>>>>>>> to reject block if-conversion with incoming critical edges only if
>>>>>>>>>>> FLAG_FORCE_VECTORIZE was not set-up.
>>>>>>>>>>> (predicate_bbs): Skip loop exit block also. Add check that if
>>>>>>>>>>> fold_build2 produces bool conversion, recompute predicate using
>>>>>>>>>>> build2_loc. Add zeroing of edge 'aux' field under FLAG_FORCE_VECTORIZE.
>>>>>>>>>>> (if_convertible_loop_p_1): Recompute POST_DOMINATOR tree if
>>>>>>>>>>> FLAG_FORCE_VECTORIZE was set-up to calculate cd equivalent bb's.
>>>>>>>>>>> (find_phi_replacement_condition): Extend function interface:
>>>>>>>>>>> it returns NULL if given phi node must be handled by means of
>>>>>>>>>>> extended phi node predication. If number of predecessors of phi-block
>>>>>>>>>>> is equal 2 and atleast one incoming edge is not critical original
>>>>>>>>>>> algorithm is used.
>>>>>>>>>>> (get_predicate_for_edge): New function.
>>>>>>>>>>> (find_insertion_point): New function.
>>>>>>>>>>> (predicate_arbitrary_scalar_phi): New function.
>>>>>>>>>>> (predicate_all_scalar_phis): Introduce new variable BEFORE.
>>>>>>>>>>> Invoke find_insertion_point to initialize gsi and
>>>>>>>>>>> predicate_arbitrary_scalar_phi if TRUE_BB is NULL - it signals
>>>>>>>>>>> that extended predication must be applied).
>>>>>>>>>>> (insert_gimplified_predicates): Add test for non-predicated basic
>>>>>>>>>>> blocks that there are no gimplified statements to insert. Insert
>>>>>>>>>>> predicates at the block begining for extended if-conversion.
>>>>>>>>>>> (tree_if_conversion): Initialize flag_force_vectorize from current
>>>>>>>>>>> loop or outer loop (to support pragma omp declare).Do loop versioning
>>>>>>>>>>> for innermost loop marked with pragma omp simd and
>>>>>>>>>>> FLAG_TREE_LOOP_IF_CONVERT was not sett-up. Nullify 'aux' field of edges
>>>>>>>>>>> for blocks with two successors.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> 2014-09-08 17:10 GMT+04:00 Richard Biener <richard.guenther@gmail.com>:
>>>>>>>>>>>> On Fri, Aug 15, 2014 at 2:02 PM, Yuri Rumyantsev <ysrumyan@gmail.com> wrote:
>>>>>>>>>>>>> Richard!
>>>>>>>>>>>>> Here is updated patch with the following changes:
>>>>>>>>>>>>>
>>>>>>>>>>>>> 1. Any restrictions on phi-function were eliminated for extended conversion.
>>>>>>>>>>>>> 2. Put predicate for critical edges to 'aux' field of edge, i.e.
>>>>>>>>>>>>> negate_predicate was deleted.
>>>>>>>>>>>>> 3. Deleted splitting of critical edges, i.e. both outgoing edges can
>>>>>>>>>>>>> be critical.
>>>>>>>>>>>>> 4. Use notion of cd-equivalence to set-up predicate for join basic
>>>>>>>>>>>>> blocks to simplify it.
>>>>>>>>>>>>> 5. I decided to not design pre-pass since it will lead generating
>>>>>>>>>>>>> chain of cond expressions for phi-node if conversion, whereas for phi
>>>>>>>>>>>>> of kind
>>>>>>>>>>>>> x = PHI <1(2), 1(3), 2(4)>
>>>>>>>>>>>>> only one cond expression is required and this is considered as simple
>>>>>>>>>>>>> optimization for arbitrary phi-function. More precise,
>>>>>>>>>>>>> if phi-function have only two different arguments and one of them has
>>>>>>>>>>>>> single occurrence, if- conversion is performed as if phi have only 2
>>>>>>>>>>>>> arguments.
>>>>>>>>>>>>> For arbitrary phi function a chain of cond expressions is produced.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Updated patch is attached.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Any comments will be appreciated.
>>>>>>>>>>>>
>>>>>>>>>>>> The patch is still very big and does multiple things at once which makes
>>>>>>>>>>>> it hard to review.
>>>>>>>>>>>>
>>>>>>>>>>>> In addition to that it changes function singatures without updating
>>>>>>>>>>>> the function comments. For example what is the convert_bool
>>>>>>>>>>>> argument doing to add_to_dst_predicate_list? Why do we need
>>>>>>>>>>>> all this added logic.
>>>>>>>>>>>>
>>>>>>>>>>>> You duplicate operand_equal_for_phi_arg_p.
>>>>>>>>>>>>
>>>>>>>>>>>> I think the code handling PHIs with more than two operands but
>>>>>>>>>>>> only two unequal operands is useful generally, so that's an obvious
>>>>>>>>>>>> candidate for splitting out into a separate patch.
>>>>>>>>>>>>
>>>>>>>>>>>> + CONVERT_BOOL argument was added to convert bool predicate computations
>>>>>>>>>>>> + which is not supported by vectorizer to int type through creating of
>>>>>>>>>>>> + conditional expressions. */
>>>>>>>>>>>>
>>>>>>>>>>>> Example? The vectorizer has patterns for bool predicate computations.
>>>>>>>>>>>> This seems to be another feature that needs splitting out.
>>>>>>>>>>>>
>>>>>>>>>>>> The way you get around the critical edge parts looks awkward to me.
>>>>>>>>>>>> Please either do _all_ predicates as edge predicates or simply
>>>>>>>>>>>> split critical edges (of the respective loop body).
>>>>>>>>>>>>
>>>>>>>>>>>> I still think that an utility doing same PHI arg merging by introducing
>>>>>>>>>>>> forwarder blocks would be nicer to have.
>>>>>>>>>>>>
>>>>>>>>>>>> I'd restructure the main tree_if_conversion function to apply these
>>>>>>>>>>>> CFG pre-transforms when we are going to version the loop
>>>>>>>>>>>> for if conversion (eventually transitioning to always doing that).
>>>>>>>>>>>>
>>>>>>>>>>>> So - please split up the patch. It's way too big.
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> Richard.
>>>>>>>>>>>>
>>>>>>>>>>>>> 2014-08-15 Yuri Rumyantsev <ysrumyan@gmail.com>
>>>>>>>>>>>>>
>>>>>>>>>>>>> * tree-if-conv.c (cgraph.h): Add include file to detect function clone.
>>>>>>>>>>>>> (flag_force_vectorize): New variable.
>>>>>>>>>>>>> (edge_predicate): New function.
>>>>>>>>>>>>> (set_edge_predicate): New function.
>>>>>>>>>>>>> (add_stmt_to_bb_predicate_gimplified_stmts): New function.
>>>>>>>>>>>>> (init_bb_predicate): Add initialization of negate_predicate field.
>>>>>>>>>>>>> (reset_bb_predicate): Reset negate_predicate to NULL_TREE.
>>>>>>>>>>>>> (convert_name_to_cmp): New function.
>>>>>>>>>>>>> (get_type_for_cond): New function.
>>>>>>>>>>>>> (convert_bool_predicate): New function.
>>>>>>>>>>>>> (predicate_disjunction): New function.
>>>>>>>>>>>>> (predicate_conjunction): New function.
>>>>>>>>>>>>> (add_to_predicate_list): Add convert_bool argument.
>>>>>>>>>>>>> Use predicate of cd-equivalent block if convert_bool is true and
>>>>>>>>>>>>> such bb exists; save it in static variable for further possible use.
>>>>>>>>>>>>> Add call of predicate_disjunction if convert_bool argument is true.
>>>>>>>>>>>>> (add_to_dst_predicate_list): Add convert_bool argument.
>>>>>>>>>>>>> Add early function exit if edge target block is always executed.
>>>>>>>>>>>>> Add call of predicate_conjunction if convert_bool argument is true.
>>>>>>>>>>>>> Pass convert_bool argument for add_to_predicate_list.
>>>>>>>>>>>>> Set-up predicate for crritical edge if convert_bool is true.
>>>>>>>>>>>>> (equal_phi_args): New function.
>>>>>>>>>>>>> (phi_has_two_different_args): New function.
>>>>>>>>>>>>> (if_convertible_phi_p): Accept phi nodes with more than two args
>>>>>>>>>>>>> if flag_force_vectorize wa set-up.
>>>>>>>>>>>>> (ifcvt_can_use_mask_load_store): Add test on flag_force_vectorize.
>>>>>>>>>>>>> (if_convertible_stmt_p): Allow calls of function clones if
>>>>>>>>>>>>> flag_force_vectorize was set-up.
>>>>>>>>>>>>> (all_edges_are_critical): New function.
>>>>>>>>>>>>> (if_convertible_bb_p): Allow bb has more than two predecessors if
>>>>>>>>>>>>> flag_force_vectorize was set-up. Use call of all_edges_are_critical
>>>>>>>>>>>>> to reject block if-conversion with imcoming critical edges only if
>>>>>>>>>>>>> flag_force_vectorize was not set-up.
>>>>>>>>>>>>> (walk_cond_tree): New function.
>>>>>>>>>>>>> (vect_bool_pattern_is_applicable): New function.
>>>>>>>>>>>>> (predicate_bbs): Add convert_bool argument which is used to transform
>>>>>>>>>>>>> comparison expressions of boolean type into conditional expressions
>>>>>>>>>>>>> with integral operands. If convert_bool argument was set-up and
>>>>>>>>>>>>> vect bool pattern can be appied perform the following transformation:
>>>>>>>>>>>>> (bool) x != 0 --> y = (int) x; x != 0;
>>>>>>>>>>>>> Add check that if fold_build2 produces bool conversion if convert_bool
>>>>>>>>>>>>> was set-up, recompute predicate using build2_loc. Additional argument
>>>>>>>>>>>>> 'convert_bool" is passed to add_to_dst_predicate_list and
>>>>>>>>>>>>> add_to_predicate_list.
>>>>>>>>>>>>> (if_convertible_loop_p_1): Recompute POST_DOMINATOR tree if
>>>>>>>>>>>>> flag_force_vectorize was set-up to calculate cd equivalent bb's.
>>>>>>>>>>>>> Call predicate_bbs with additional argument equal to false.
>>>>>>>>>>>>> (find_phi_replacement_condition): Extend function interface:
>>>>>>>>>>>>> it returns NULL if given phi node must be handled by means of
>>>>>>>>>>>>> extended phi node predication. If number of predecessors of phi-block
>>>>>>>>>>>>> is equal 2 and atleast one incoming edge is not critical original
>>>>>>>>>>>>> algorithm is used.
>>>>>>>>>>>>> (is_cond_scalar_reduction): Add 'extended' argument which signals that
>>>>>>>>>>>>> phi arguments must be evaluated through phi_has_two_different_args.
>>>>>>>>>>>>> (predicate_scalar_phi): Add invoÑation of convert_name_to_cmp if cond
>>>>>>>>>>>>> is SSA_NAME. Add 'false' argument to call of is_cond_scalar_reduction.
>>>>>>>>>>>>> (get_predicate_for_edge): New function.
>>>>>>>>>>>>> (find_insertion_point): New function.
>>>>>>>>>>>>> (predicate_arbitrary_phi): New function.
>>>>>>>>>>>>> (predicate_extended_scalar_phi): New function.
>>>>>>>>>>>>> (predicate_all_scalar_phis): Add code to set-up gimple statement
>>>>>>>>>>>>> iterator for predication of extended scalar phi's for insertion.
>>>>>>>>>>>>> (insert_gimplified_predicates): Add test for non-predicated basic
>>>>>>>>>>>>> blocks that there are no gimplified statements to insert. Insert
>>>>>>>>>>>>> predicates at the block begining for extended if-conversion.
>>>>>>>>>>>>> (predicate_mem_writes): Invoke convert_name_to_cmp for extended
>>>>>>>>>>>>> predication to build mask.
>>>>>>>>>>>>> (combine_blocks): Pass flag_force_vectorize to predicate_bbs.
>>>>>>>>>>>>> (tree_if_conversion): Initialize flag_force_vectorize from current
>>>>>>>>>>>>> loop or outer loop (to support pragma omp declare).Do loop versioning
>>>>>>>>>>>>> for innermost loop marked with pragma omp simd.
>>>>>>>>>>>>>
>>>>>>>>>>>>> 2014-08-01 13:40 GMT+04:00 Richard Biener <richard.guenther@gmail.com>:
>>>>>>>>>>>>>> On Wed, Jun 25, 2014 at 4:06 PM, Yuri Rumyantsev <ysrumyan@gmail.com> wrote:
>>>>>>>>>>>>>>> Hi All,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> We implemented additional support for pragma omp simd in part of
>>>>>>>>>>>>>>> extended if-conversion loops with such pragma. These extensions
>>>>>>>>>>>>>>> include:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> 1. All extensions are performed only if considered loop or its outer
>>>>>>>>>>>>>>> loop was marked with pragma omp simd (force_vectorize); For ordinary
>>>>>>>>>>>>>>> loops behavior was not changed.
>>>>>>>>>>>>>>> 2. Took off cfg restriction on basic block which can have more than 2
>>>>>>>>>>>>>>> predecessors.
>>>>>>>>>>>>>>> 3. Put additional restriction on phi nodes which was missed in current design:
>>>>>>>>>>>>>>> all phi nodes must be in non-predicated basic block to conform
>>>>>>>>>>>>>>> semantic of COND_EXPR which is used for transformation.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> How is that so? If the PHI is predicated then its result will be used
>>>>>>>>>>>>>> in a PHI node again and thus we'd create a sequence of COND_EXPRs.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> No?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> 4. Extend predication of phi nodes: phi may have more than 2 arguments
>>>>>>>>>>>>>>> with some limitations:
>>>>>>>>>>>>>>> - for phi nodes which have more than 2 arguments, but only two
>>>>>>>>>>>>>>> arguments are different and one of them has the only occurence,
>>>>>>>>>>>>>>> transformation to single COND_EXPR can be done.
>>>>>>>>>>>>>>> - if phi node has more different arguments and all edge predicates
>>>>>>>>>>>>>>> correspondent to phi-arguments are disjoint, a chain of COND_EXPR
>>>>>>>>>>>>>>> will be generated for it. In current design very simple check is used:
>>>>>>>>>>>>>>> check starting from end that two edges correspondent to neighbor
>>>>>>>>>>>>>>> arguments have common predecessor which is used for further check
>>>>>>>>>>>>>>> with next edge.
>>>>>>>>>>>>>>> These guarantee that phi predication will produce the correct result.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Btw, you can think of these extensions as unfactoring a PHI node by
>>>>>>>>>>>>>> inserting forwarder blocks. Thus
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> x = PHI <1(2), 1(3), 2(4)>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> becomes
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> bb 5: <forwarder-from(2)-and(3)>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> x = PHI <1(5), 2(4)>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> and
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> x = PHI <1(2), 2(3), 3(4)>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> becomes
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> bb 5:
>>>>>>>>>>>>>> x' = PHI <1(2), 2(3)>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> b = PHI<x'(5), 3(4)>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> which means that 3) has to work. Note that we want this kind of
>>>>>>>>>>>>>> PHI transforms for out-of-SSA as well to reduce the number of
>>>>>>>>>>>>>> copies we need to insert on edges.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thus it would be nice if you implemented 4) in terms of a pre-pass
>>>>>>>>>>>>>> over the force_vect loops PHI nodes, applying that CFG transform.
>>>>>>>>>>>>>> And make 3) work properly if it doesn't already.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> It looks like you introduce a "negate predicate" to work around the
>>>>>>>>>>>>>> critical edge limitation? Please instead change if-conversion to
>>>>>>>>>>>>>> work with edge predicates (as opposed to BB predicates).
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>> Richard.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Here is example of such extended predication (compile with -march=core-avx2):
>>>>>>>>>>>>>>> #pragma omp simd safelen(8)
>>>>>>>>>>>>>>> for (i=0; i<512; i++)
>>>>>>>>>>>>>>> {
>>>>>>>>>>>>>>> float t = a[i];
>>>>>>>>>>>>>>> if (t > 0 & t < 1.0e+17f)
>>>>>>>>>>>>>>> if (c[i] != 0)
>>>>>>>>>>>>>>> res += 1;
>>>>>>>>>>>>>>> }
>>>>>>>>>>>>>>> <bb 4>:
>>>>>>>>>>>>>>> # res_15 = PHI <res_1(5), 0(3)>
>>>>>>>>>>>>>>> # i_16 = PHI <i_11(5), 0(3)>
>>>>>>>>>>>>>>> # ivtmp_17 = PHI <ivtmp_14(5), 512(3)>
>>>>>>>>>>>>>>> t_5 = a[i_16];
>>>>>>>>>>>>>>> _6 = t_5 > 0.0;
>>>>>>>>>>>>>>> _7 = t_5 < 9.9999998430674944e+16;
>>>>>>>>>>>>>>> _8 = _7 & _6;
>>>>>>>>>>>>>>> _ifc__28 = (unsigned int) _8;
>>>>>>>>>>>>>>> _10 = &c[i_16];
>>>>>>>>>>>>>>> _ifc__36 = _ifc__28 != 0 ? 4294967295 : 0;
>>>>>>>>>>>>>>> _9 = MASK_LOAD (_10, 0B, _ifc__36);
>>>>>>>>>>>>>>> _ifc__29 = _ifc__28 != 0 ? 1 : 0;
>>>>>>>>>>>>>>> _ifc__30 = (int) _ifc__29;
>>>>>>>>>>>>>>> _ifc__31 = _9 != 0 ? _ifc__30 : 0;
>>>>>>>>>>>>>>> _ifc__32 = _ifc__28 != 0 ? 1 : 0;
>>>>>>>>>>>>>>> _ifc__33 = (int) _ifc__32;
>>>>>>>>>>>>>>> _ifc__34 = _9 == 0 ? _ifc__33 : 0;
>>>>>>>>>>>>>>> _ifc__35 = _ifc__31 != 0 ? 1 : 0;
>>>>>>>>>>>>>>> res_1 = res_15 + _ifc__35;
>>>>>>>>>>>>>>> i_11 = i_16 + 1;
>>>>>>>>>>>>>>> ivtmp_14 = ivtmp_17 - 1;
>>>>>>>>>>>>>>> if (ivtmp_14 != 0)
>>>>>>>>>>>>>>> goto <bb 4>;
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Bootstrap and regression testing did not show any new failures.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> gcc/ChageLog
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> 2014-06-25 Yuri Rumyantsev <ysrumyan@gmail.com>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> * tree-if-conv.c (flag_force_vectorize): New variable.
>>>>>>>>>>>>>>> (struct bb_predicate_s): Add negate_predicate field.
>>>>>>>>>>>>>>> (bb_negate_predicate): New function.
>>>>>>>>>>>>>>> (set_bb_negate_predicate): New function.
>>>>>>>>>>>>>>> (bb_copy_predicate): New function.
>>>>>>>>>>>>>>> (add_stmt_to_bb_predicate_gimplified_stmts): New function.
>>>>>>>>>>>>>>> (init_bb_predicate): Add initialization of negate_predicate field.
>>>>>>>>>>>>>>> (reset_bb_predicate): Reset negate_predicate to NULL_TREE.
>>>>>>>>>>>>>>> (convert_name_to_cmp): New function.
>>>>>>>>>>>>>>> (get_type_for_cond): New function.
>>>>>>>>>>>>>>> (convert_bool_predicate): New function.
>>>>>>>>>>>>>>> (predicate_disjunction): New function.
>>>>>>>>>>>>>>> (predicate_conjunction): New function.
>>>>>>>>>>>>>>> (add_to_predicate_list): Add convert_bool argument.
>>>>>>>>>>>>>>> Add call of predicate_disjunction if convert_bool argument is true.
>>>>>>>>>>>>>>> (add_to_dst_predicate_list): Add convert_bool argument.
>>>>>>>>>>>>>>> Add early function exit if edge target block is always executed.
>>>>>>>>>>>>>>> Add call of predicate_conjunction if convert_bool argument is true.
>>>>>>>>>>>>>>> Pass convert_bool argument for add_to_predicate_list.
>>>>>>>>>>>>>>> (equal_phi_args): New function.
>>>>>>>>>>>>>>> (phi_has_two_different_args): New function.
>>>>>>>>>>>>>>> (phi_args_disjoint): New function.
>>>>>>>>>>>>>>> (if_convertible_phi_p): Accept phi nodes with more than two args
>>>>>>>>>>>>>>> for loops marked with pragma omp simd. Add check that phi nodes are
>>>>>>>>>>>>>>> in non-predicated basic blocks.
>>>>>>>>>>>>>>> (ifcvt_can_use_mask_load_store): Use flag_force_vectorize.
>>>>>>>>>>>>>>> (all_edges_are_critical): New function.
>>>>>>>>>>>>>>> (if_convertible_bb_p): Allow bb has more than two predecessors if
>>>>>>>>>>>>>>> flag_force_vectorize was setup. Use call of all_edges_are_critical
>>>>>>>>>>>>>>> to reject block if-conversion with imcoming critical edges only if
>>>>>>>>>>>>>>> flag_force_vectorize was not setup.
>>>>>>>>>>>>>>> (walk_cond_tree): New function.
>>>>>>>>>>>>>>> (vect_bool_pattern_is_applicable): New function.
>>>>>>>>>>>>>>> (predicate_bbs): Add convert_bool argument that is used to transform
>>>>>>>>>>>>>>> comparison expressions of boolean type into conditional expressions
>>>>>>>>>>>>>>> with integral operands. If bool_conv argument is false or both
>>>>>>>>>>>>>>> outgoing edges are not critical old algorithm of predicate assignments
>>>>>>>>>>>>>>> is used, otherwise the following code was added: check on applicable
>>>>>>>>>>>>>>> of vect-bool-pattern recognition and trnasformation of
>>>>>>>>>>>>>>> (bool) x != 0 --> y = (int) x; x != 0;
>>>>>>>>>>>>>>> compute predicates for both outgoing edges one of which is critical
>>>>>>>>>>>>>>> one using 'normal' edge, i.e. compute true and false predicates using
>>>>>>>>>>>>>>> normal outgoing edge only; evaluated predicates are stored in
>>>>>>>>>>>>>>> predicate and negate_predicate fields of struct bb_predicate_s and
>>>>>>>>>>>>>>> negate_predicate of normal edge conatins predicate of critical edge,
>>>>>>>>>>>>>>> but generated gimplified statements are stored in their destination
>>>>>>>>>>>>>>> block fields. Additional argument 'convert_bool" is passed to
>>>>>>>>>>>>>>> add_to_dst_predicate_list and add_to_predicate_list.
>>>>>>>>>>>>>>> (if_convertible_loop_p_1): Call predicate_bbs with additional argument
>>>>>>>>>>>>>>> equal to false.
>>>>>>>>>>>>>>> (find_phi_replacement_condition): Extend function interface:
>>>>>>>>>>>>>>> it returns NULL if given phi node must be handled by means of
>>>>>>>>>>>>>>> extended phi node predication. If number of predecessors of phi-block
>>>>>>>>>>>>>>> is equal 2 and atleast one incoming edge is not critical original
>>>>>>>>>>>>>>> algorithm is used.
>>>>>>>>>>>>>>> (is_cond_scalar_reduction): Add 'extended' argument which signals that
>>>>>>>>>>>>>>> both phi arguments must be evaluated through phi_has_two_different_args.
>>>>>>>>>>>>>>> (predicate_scalar_phi): Add invoÑation of convert_name_to_cmp if cond
>>>>>>>>>>>>>>> is SSA_NAME. Add 'false' argument to call of is_cond_scalar_reduction.
>>>>>>>>>>>>>>> (get_predicate_for_edge): New function.
>>>>>>>>>>>>>>> (find_insertion_point): New function.
>>>>>>>>>>>>>>> (predicate_phi_disjoint_args): New function.
>>>>>>>>>>>>>>> (predicate_extended_scalar_phi): New function.
>>>>>>>>>>>>>>> (predicate_all_scalar_phis): Add code to set-up gimple statement
>>>>>>>>>>>>>>> iterator for predication of extended scalar phi's for insertion.
>>>>>>>>>>>>>>> (insert_gimplified_predicates): Add test for non-predicated basic
>>>>>>>>>>>>>>> blocks that there are no gimplified statements to insert. Insert
>>>>>>>>>>>>>>> predicates at the block begining for extended if-conversion.
>>>>>>>>>>>>>>> (predicate_mem_writes): Invoke convert_name_to_cmp for extended
>>>>>>>>>>>>>>> predication to build mask.
>>>>>>>>>>>>>>> (combine_blocks): Pass flag_force_vectorize to predicate_bbs.
>>>>>>>>>>>>>>> (split_crit_edge): New function.
>>>>>>>>>>>>>>> (tree_if_conversion): Initialize flag_force_vectorize from current
>>>>>>>>>>>>>>> loop or outer loop (to support pragma omp declare). Invoke
>>>>>>>>>>>>>>> split_crit_edge for extended predication. Do loop versioning for
>>>>>>>>>>>>>>> innermost loop marked with pragma omp simd.
- References:
- Re: [PATCH,1/2] Extended if-conversion for loops marked with pragma omp simd.
- Re: [PATCH,1/2] Extended if-conversion for loops marked with pragma omp simd.
- Re: [PATCH,1/2] Extended if-conversion for loops marked with pragma omp simd.
- Re: [PATCH,1/2] Extended if-conversion for loops marked with pragma omp simd.
- Re: [PATCH,1/2] Extended if-conversion for loops marked with pragma omp simd.
- Re: [PATCH,1/2] Extended if-conversion for loops marked with pragma omp simd.
- Re: [PATCH,1/2] Extended if-conversion for loops marked with pragma omp simd.
- Re: [PATCH,1/2] Extended if-conversion for loops marked with pragma omp simd.
- Re: [PATCH,1/2] Extended if-conversion for loops marked with pragma omp simd.
- Re: [PATCH,1/2] Extended if-conversion for loops marked with pragma omp simd.