This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH,1/2] Extended if-conversion for loops marked with pragma omp simd.
- From: Richard Biener <richard dot guenther at gmail dot com>
- To: Yuri Rumyantsev <ysrumyan at gmail dot com>
- Cc: gcc-patches <gcc-patches at gcc dot gnu dot org>, Igor Zamyatin <izamyatin at gmail dot com>
- Date: Wed, 15 Oct 2014 11:50:22 +0200
- Subject: Re: [PATCH,1/2] Extended if-conversion for loops marked with pragma omp simd.
- Authentication-results: sourceware.org; auth=none
- References: <CAEoMCqRM0B1TzUmSehsV3S2c7Pg9W7tnr6Hb78wbxGpeQ+OMxA at mail dot gmail dot com>
On Mon, Oct 13, 2014 at 11:38 AM, Yuri Rumyantsev <ysrumyan@gmail.com> wrote:
> Richard,
>
> Here is updated patch (part1) for extended if conversion.
>
> Second part of patch will be sent later.
Ok, I'm starting to look at this. I'd still like you to split things up
more.
static inline void
add_to_predicate_list (struct loop *loop, basic_block bb, tree nc)
{
...
+ /* We use notion of cd equivalence to get simplier predicate for
+ join block, e.g. if join block has 2 predecessors with predicates
+ p1 & p2 and p1 & !p2, we'd like to get p1 for it instead of
+ p1 & p2 | p1 & !p2. */
+ if (dom_bb != loop->header
+ && get_immediate_dominator (CDI_POST_DOMINATORS, dom_bb) == bb)
+ {
+ gcc_assert (flow_bb_inside_loop_p (loop, dom_bb));
+ bc = bb_predicate (dom_bb);
+ gcc_assert (!is_true_predicate (bc));
these changes look worthwhile even for !flag_force_vectorize. So please
split the change to add_to_predicate_list out and compute post-dominators
unconditionally. Note that you should call free_dominance_info
(CDI_POST_DOMINATORS) at the end of if-conversion.
+ if (!dominated_by_p (CDI_DOMINATORS, loop->latch, e->dest))
+ add_to_predicate_list (loop, e->dest, cond);
+
+ /* If edge E is critical save predicate on it. */
+ if (EDGE_COUNT (e->dest->preds) >= 2)
+ set_edge_predicate (e, cond);
how do we know the edge is critical by this simple check? Why not
simply always save edge predicates (well, you kind of do but omit
the case where e->src dominates e->dest).
Btw, you can rely on edge->aux being NULL at the start of the
pass but need to clear it at the end (best use clear_aux_for_edges ()
for that). So stuff like
+ extract_true_false_edges_from_block (bb, &true_edge, &false_edge);
+ if (flag_force_vectorize)
+ true_edge->aux = false_edge->aux = NULL;
shouldn't be necessary.
I think the edge predicate handling should also be unconditionally
and not depend on flag_force_vectorize.
+ /* The loop latch and loop exit block are always executed and
+ have no extra conditions to be processed: skip them. */
+ if (bb == loop->latch
+ || bb_with_exit_edge_p (loop, bb))
I don't think the edge stuff is true - given you still only reset the
loop->latch bb predicate the change looks broken.
+ /* Fold_build2 can produce bool conversion which is not
+ supported by vectorizer, so re-build it without folding.
+ For example, such conversion is generated for sequence:
+ _Bool _7, _8, _9;
+ _7 = _6 != 13; _8 = _6 != 0; _9 = _8 & _9;
+ if (_9 != 0) --> (bool)_9. */
+
+ if (CONVERT_EXPR_P (c)
+ && TREE_CODE_CLASS (code) == tcc_comparison)
I think you should simply use canonicalize_cond_expr_cond on the
folding result. Or rather _not_ fold at all - we are taking the
operands from the GIMPLE condition unmodified after all.
- add_to_dst_predicate_list (loop, false_edge,
- unshare_expr (cond), c2);
+ add_to_dst_predicate_list (loop, false_edge, unshare_expr (cond),
+ unshare_expr (c2));
why is it necessary to unshare c2?
Please split out the PHI-with-multi-arg handling (I have not looked at
that in detail).
Thanks,
Richard.
> Changelog.
>
> 2014-10-13 Yuri Rumyantsev <ysrumyan@gmail.com>
>
> * tree-if-conv.c (cgraph.h): Add include file to detect function clone.
> (flag_force_vectorize): New variable.
> (edge_predicate): New function.
> (set_edge_predicate): New function.
> (add_to_predicate_list): Check unconditionally that bb is always
> executed to early exit. Use predicate of cd-equivalent block
> for join blocks if it exists.
> (add_to_dst_predicate_list): Invoke add_to_predicate_list if
> destination block of edge is not always executed. Set-up predicate
> for critical edge.
> (if_convertible_phi_p): Accept phi nodes with more than two args
> if FLAG_FORCE_VECTORIZE was set-up.
> (ifcvt_can_use_mask_load_store): Use FLAG_FORCE_VECTORIZE.
> (if_convertible_stmt_p): Fix up pre-function comments.
> (all_edges_are_critical): New function.
> (if_convertible_bb_p): Allow bb has more than two predecessors if
> FLAG_FORCE_VECTORIZE was set-up. Use call of all_edges_are_critical
> to reject block if-conversion with incoming critical edges only if
> FLAG_FORCE_VECTORIZE was not set-up.
> (predicate_bbs): Skip loop exit block also. Add check that if
> fold_build2 produces bool conversion, recompute predicate using
> build2_loc. Add zeroing of edge 'aux' field under FLAG_FORCE_VECTORIZE.
> (if_convertible_loop_p_1): Recompute POST_DOMINATOR tree if
> FLAG_FORCE_VECTORIZE was set-up to calculate cd equivalent bb's.
> (find_phi_replacement_condition): Extend function interface:
> it returns NULL if given phi node must be handled by means of
> extended phi node predication. If number of predecessors of phi-block
> is equal 2 and atleast one incoming edge is not critical original
> algorithm is used.
> (get_predicate_for_edge): New function.
> (find_insertion_point): New function.
> (predicate_arbitrary_scalar_phi): New function.
> (predicate_all_scalar_phis): Introduce new variable BEFORE.
> Invoke find_insertion_point to initialize gsi and
> predicate_arbitrary_scalar_phi if TRUE_BB is NULL - it signals
> that extended predication must be applied).
> (insert_gimplified_predicates): Add test for non-predicated basic
> blocks that there are no gimplified statements to insert. Insert
> predicates at the block begining for extended if-conversion.
> (tree_if_conversion): Initialize flag_force_vectorize from current
> loop or outer loop (to support pragma omp declare).Do loop versioning
> for innermost loop marked with pragma omp simd and
> FLAG_TREE_LOOP_IF_CONVERT was not sett-up. Nullify 'aux' field of edges
> for blocks with two successors.
>
>
>
>
> 2014-09-22 12:28 GMT+04:00 Yuri Rumyantsev <ysrumyan@gmail.com>:
>> Richard,
>>
>> here is reduced patch (part.1) which was reduced almost twice.
>> Let's me also answer on your comments.
>>
>> 1. I really use edge field 'aux' to keep predicate for critical edges.
>> My previous code was not correct and now it looks like:
>>
>> if (EDGE_COUNT (b->succs) == 1 || EDGE_COUNT (e->dest->preds) == 1)
>> /* Edge E is not critical, use predicate of edge source bb. */
>> c = bb_predicate (b);
>> else
>> /* Edge E is critical and its aux field contains predicate. */
>> c = edge_predicate (e);
>>
>> 2. I completely delete all code related to creation of conditional
>> expressions and completely rely on bool pattern recognition in
>> vectorizer. But we need to delete all dead predicate computations
>> which are not used since they prevent vectorization. I will add this
>> local-dce function in next patch.
>> 3. I also did not include in this patch recognition of general
>> phi-nodes with two arguments only for which conversion of conditional
>> scalar reduction can be applied also.
>> Note that all these changes are applied for loop marked with pragma
>> omp simd only.
>>
>> 2014-09-22 Yuri Rumyantsev <ysrumyan@gmail.com>
>>
>> * tree-if-conv.c (cgraph.h): Add include file to detect function clone.
>> (flag_force_vectorize): New variable.
>> (edge_predicate): New function.
>> (set_edge_predicate): New function.
>> (convert_name_to_cmp): New function.
>> (add_to_predicate_list): Check unconditionally that bb is always
>> executed to early exit. Use predicate of cd-equivalent block
>> for join blocks if it exists.
>> (add_to_dst_predicate_list): Invoke add_to_predicate_list if
>> destination block of edge is not always executed. Set-up predicate
>> for critical edge.
>> (if_convertible_phi_p): Accept phi nodes with more than two args
>> if FLAG_FORCE_VECTORIZE was set-up.
>> (ifcvt_can_use_mask_load_store): Use FLAG_FORCE_VECTORIZE.
>> (if_convertible_stmt_p): Fix up pre-function comments.
>> (all_edges_are_critical): New function.
>> (if_convertible_bb_p): Allow bb has more than two predecessors if
>> FLAG_FORCE_VECTORIZE was set-up. Use call of all_edges_are_critical
>> to reject block if-conversion with incoming critical edges only if
>> FLAG_FORCE_VECTORIZE was not set-up.
>> (predicate_bbs): Skip loop exit block also. Add check that if
>> fold_build2 produces bool conversion, recompute predicate using
>> build2_loc. Add zeroing of edge 'aux' field under FLAG_FORCE_VECTORIZE.
>> (if_convertible_loop_p_1): Recompute POST_DOMINATOR tree if
>> FLAG_FORCE_VECTORIZE was set-up to calculate cd equivalent bb's.
>> (find_phi_replacement_condition): Extend function interface:
>> it returns NULL if given phi node must be handled by means of
>> extended phi node predication. If number of predecessors of phi-block
>> is equal 2 and atleast one incoming edge is not critical original
>> algorithm is used.
>> (get_predicate_for_edge): New function.
>> (find_insertion_point): New function.
>> (predicate_arbitrary_scalar_phi): New function.
>> (predicate_all_scalar_phis): Introduce new variable BEFORE.
>> Invoke find_insertion_point to initialize gsi and
>> predicate_arbitrary_scalar_phi if TRUE_BB is NULL - it signals
>> that extended predication must be applied).
>> (insert_gimplified_predicates): Add test for non-predicated basic
>> blocks that there are no gimplified statements to insert. Insert
>> predicates at the block begining for extended if-conversion.
>> (tree_if_conversion): Initialize flag_force_vectorize from current
>> loop or outer loop (to support pragma omp declare).Do loop versioning
>> for innermost loop marked with pragma omp simd and
>> FLAG_TREE_LOOP_IF_CONVERT was not sett-up. Nullify 'aux' field of edges
>> for blocks with two successors.
>>
>>
>>
>>
>> 2014-09-08 17:10 GMT+04:00 Richard Biener <richard.guenther@gmail.com>:
>>> On Fri, Aug 15, 2014 at 2:02 PM, Yuri Rumyantsev <ysrumyan@gmail.com> wrote:
>>>> Richard!
>>>> Here is updated patch with the following changes:
>>>>
>>>> 1. Any restrictions on phi-function were eliminated for extended conversion.
>>>> 2. Put predicate for critical edges to 'aux' field of edge, i.e.
>>>> negate_predicate was deleted.
>>>> 3. Deleted splitting of critical edges, i.e. both outgoing edges can
>>>> be critical.
>>>> 4. Use notion of cd-equivalence to set-up predicate for join basic
>>>> blocks to simplify it.
>>>> 5. I decided to not design pre-pass since it will lead generating
>>>> chain of cond expressions for phi-node if conversion, whereas for phi
>>>> of kind
>>>> x = PHI <1(2), 1(3), 2(4)>
>>>> only one cond expression is required and this is considered as simple
>>>> optimization for arbitrary phi-function. More precise,
>>>> if phi-function have only two different arguments and one of them has
>>>> single occurrence, if- conversion is performed as if phi have only 2
>>>> arguments.
>>>> For arbitrary phi function a chain of cond expressions is produced.
>>>>
>>>> Updated patch is attached.
>>>>
>>>> Any comments will be appreciated.
>>>
>>> The patch is still very big and does multiple things at once which makes
>>> it hard to review.
>>>
>>> In addition to that it changes function singatures without updating
>>> the function comments. For example what is the convert_bool
>>> argument doing to add_to_dst_predicate_list? Why do we need
>>> all this added logic.
>>>
>>> You duplicate operand_equal_for_phi_arg_p.
>>>
>>> I think the code handling PHIs with more than two operands but
>>> only two unequal operands is useful generally, so that's an obvious
>>> candidate for splitting out into a separate patch.
>>>
>>> + CONVERT_BOOL argument was added to convert bool predicate computations
>>> + which is not supported by vectorizer to int type through creating of
>>> + conditional expressions. */
>>>
>>> Example? The vectorizer has patterns for bool predicate computations.
>>> This seems to be another feature that needs splitting out.
>>>
>>> The way you get around the critical edge parts looks awkward to me.
>>> Please either do _all_ predicates as edge predicates or simply
>>> split critical edges (of the respective loop body).
>>>
>>> I still think that an utility doing same PHI arg merging by introducing
>>> forwarder blocks would be nicer to have.
>>>
>>> I'd restructure the main tree_if_conversion function to apply these
>>> CFG pre-transforms when we are going to version the loop
>>> for if conversion (eventually transitioning to always doing that).
>>>
>>> So - please split up the patch. It's way too big.
>>>
>>> Thanks,
>>> Richard.
>>>
>>>> 2014-08-15 Yuri Rumyantsev <ysrumyan@gmail.com>
>>>>
>>>> * tree-if-conv.c (cgraph.h): Add include file to detect function clone.
>>>> (flag_force_vectorize): New variable.
>>>> (edge_predicate): New function.
>>>> (set_edge_predicate): New function.
>>>> (add_stmt_to_bb_predicate_gimplified_stmts): New function.
>>>> (init_bb_predicate): Add initialization of negate_predicate field.
>>>> (reset_bb_predicate): Reset negate_predicate to NULL_TREE.
>>>> (convert_name_to_cmp): New function.
>>>> (get_type_for_cond): New function.
>>>> (convert_bool_predicate): New function.
>>>> (predicate_disjunction): New function.
>>>> (predicate_conjunction): New function.
>>>> (add_to_predicate_list): Add convert_bool argument.
>>>> Use predicate of cd-equivalent block if convert_bool is true and
>>>> such bb exists; save it in static variable for further possible use.
>>>> Add call of predicate_disjunction if convert_bool argument is true.
>>>> (add_to_dst_predicate_list): Add convert_bool argument.
>>>> Add early function exit if edge target block is always executed.
>>>> Add call of predicate_conjunction if convert_bool argument is true.
>>>> Pass convert_bool argument for add_to_predicate_list.
>>>> Set-up predicate for crritical edge if convert_bool is true.
>>>> (equal_phi_args): New function.
>>>> (phi_has_two_different_args): New function.
>>>> (if_convertible_phi_p): Accept phi nodes with more than two args
>>>> if flag_force_vectorize wa set-up.
>>>> (ifcvt_can_use_mask_load_store): Add test on flag_force_vectorize.
>>>> (if_convertible_stmt_p): Allow calls of function clones if
>>>> flag_force_vectorize was set-up.
>>>> (all_edges_are_critical): New function.
>>>> (if_convertible_bb_p): Allow bb has more than two predecessors if
>>>> flag_force_vectorize was set-up. Use call of all_edges_are_critical
>>>> to reject block if-conversion with imcoming critical edges only if
>>>> flag_force_vectorize was not set-up.
>>>> (walk_cond_tree): New function.
>>>> (vect_bool_pattern_is_applicable): New function.
>>>> (predicate_bbs): Add convert_bool argument which is used to transform
>>>> comparison expressions of boolean type into conditional expressions
>>>> with integral operands. If convert_bool argument was set-up and
>>>> vect bool pattern can be appied perform the following transformation:
>>>> (bool) x != 0 --> y = (int) x; x != 0;
>>>> Add check that if fold_build2 produces bool conversion if convert_bool
>>>> was set-up, recompute predicate using build2_loc. Additional argument
>>>> 'convert_bool" is passed to add_to_dst_predicate_list and
>>>> add_to_predicate_list.
>>>> (if_convertible_loop_p_1): Recompute POST_DOMINATOR tree if
>>>> flag_force_vectorize was set-up to calculate cd equivalent bb's.
>>>> Call predicate_bbs with additional argument equal to false.
>>>> (find_phi_replacement_condition): Extend function interface:
>>>> it returns NULL if given phi node must be handled by means of
>>>> extended phi node predication. If number of predecessors of phi-block
>>>> is equal 2 and atleast one incoming edge is not critical original
>>>> algorithm is used.
>>>> (is_cond_scalar_reduction): Add 'extended' argument which signals that
>>>> phi arguments must be evaluated through phi_has_two_different_args.
>>>> (predicate_scalar_phi): Add invoÑation of convert_name_to_cmp if cond
>>>> is SSA_NAME. Add 'false' argument to call of is_cond_scalar_reduction.
>>>> (get_predicate_for_edge): New function.
>>>> (find_insertion_point): New function.
>>>> (predicate_arbitrary_phi): New function.
>>>> (predicate_extended_scalar_phi): New function.
>>>> (predicate_all_scalar_phis): Add code to set-up gimple statement
>>>> iterator for predication of extended scalar phi's for insertion.
>>>> (insert_gimplified_predicates): Add test for non-predicated basic
>>>> blocks that there are no gimplified statements to insert. Insert
>>>> predicates at the block begining for extended if-conversion.
>>>> (predicate_mem_writes): Invoke convert_name_to_cmp for extended
>>>> predication to build mask.
>>>> (combine_blocks): Pass flag_force_vectorize to predicate_bbs.
>>>> (tree_if_conversion): Initialize flag_force_vectorize from current
>>>> loop or outer loop (to support pragma omp declare).Do loop versioning
>>>> for innermost loop marked with pragma omp simd.
>>>>
>>>> 2014-08-01 13:40 GMT+04:00 Richard Biener <richard.guenther@gmail.com>:
>>>>> On Wed, Jun 25, 2014 at 4:06 PM, Yuri Rumyantsev <ysrumyan@gmail.com> wrote:
>>>>>> Hi All,
>>>>>>
>>>>>> We implemented additional support for pragma omp simd in part of
>>>>>> extended if-conversion loops with such pragma. These extensions
>>>>>> include:
>>>>>>
>>>>>> 1. All extensions are performed only if considered loop or its outer
>>>>>> loop was marked with pragma omp simd (force_vectorize); For ordinary
>>>>>> loops behavior was not changed.
>>>>>> 2. Took off cfg restriction on basic block which can have more than 2
>>>>>> predecessors.
>>>>>> 3. Put additional restriction on phi nodes which was missed in current design:
>>>>>> all phi nodes must be in non-predicated basic block to conform
>>>>>> semantic of COND_EXPR which is used for transformation.
>>>>>
>>>>> How is that so? If the PHI is predicated then its result will be used
>>>>> in a PHI node again and thus we'd create a sequence of COND_EXPRs.
>>>>>
>>>>> No?
>>>>>
>>>>>> 4. Extend predication of phi nodes: phi may have more than 2 arguments
>>>>>> with some limitations:
>>>>>> - for phi nodes which have more than 2 arguments, but only two
>>>>>> arguments are different and one of them has the only occurence,
>>>>>> transformation to single COND_EXPR can be done.
>>>>>> - if phi node has more different arguments and all edge predicates
>>>>>> correspondent to phi-arguments are disjoint, a chain of COND_EXPR
>>>>>> will be generated for it. In current design very simple check is used:
>>>>>> check starting from end that two edges correspondent to neighbor
>>>>>> arguments have common predecessor which is used for further check
>>>>>> with next edge.
>>>>>> These guarantee that phi predication will produce the correct result.
>>>>>
>>>>> Btw, you can think of these extensions as unfactoring a PHI node by
>>>>> inserting forwarder blocks. Thus
>>>>>
>>>>> x = PHI <1(2), 1(3), 2(4)>
>>>>>
>>>>> becomes
>>>>>
>>>>> bb 5: <forwarder-from(2)-and(3)>
>>>>>
>>>>> x = PHI <1(5), 2(4)>
>>>>>
>>>>> and
>>>>>
>>>>> x = PHI <1(2), 2(3), 3(4)>
>>>>>
>>>>> becomes
>>>>>
>>>>> bb 5:
>>>>> x' = PHI <1(2), 2(3)>
>>>>>
>>>>> b = PHI<x'(5), 3(4)>
>>>>>
>>>>> which means that 3) has to work. Note that we want this kind of
>>>>> PHI transforms for out-of-SSA as well to reduce the number of
>>>>> copies we need to insert on edges.
>>>>>
>>>>> Thus it would be nice if you implemented 4) in terms of a pre-pass
>>>>> over the force_vect loops PHI nodes, applying that CFG transform.
>>>>> And make 3) work properly if it doesn't already.
>>>>>
>>>>> It looks like you introduce a "negate predicate" to work around the
>>>>> critical edge limitation? Please instead change if-conversion to
>>>>> work with edge predicates (as opposed to BB predicates).
>>>>>
>>>>> Thanks,
>>>>> Richard.
>>>>>
>>>>>>
>>>>>> Here is example of such extended predication (compile with -march=core-avx2):
>>>>>> #pragma omp simd safelen(8)
>>>>>> for (i=0; i<512; i++)
>>>>>> {
>>>>>> float t = a[i];
>>>>>> if (t > 0 & t < 1.0e+17f)
>>>>>> if (c[i] != 0)
>>>>>> res += 1;
>>>>>> }
>>>>>> <bb 4>:
>>>>>> # res_15 = PHI <res_1(5), 0(3)>
>>>>>> # i_16 = PHI <i_11(5), 0(3)>
>>>>>> # ivtmp_17 = PHI <ivtmp_14(5), 512(3)>
>>>>>> t_5 = a[i_16];
>>>>>> _6 = t_5 > 0.0;
>>>>>> _7 = t_5 < 9.9999998430674944e+16;
>>>>>> _8 = _7 & _6;
>>>>>> _ifc__28 = (unsigned int) _8;
>>>>>> _10 = &c[i_16];
>>>>>> _ifc__36 = _ifc__28 != 0 ? 4294967295 : 0;
>>>>>> _9 = MASK_LOAD (_10, 0B, _ifc__36);
>>>>>> _ifc__29 = _ifc__28 != 0 ? 1 : 0;
>>>>>> _ifc__30 = (int) _ifc__29;
>>>>>> _ifc__31 = _9 != 0 ? _ifc__30 : 0;
>>>>>> _ifc__32 = _ifc__28 != 0 ? 1 : 0;
>>>>>> _ifc__33 = (int) _ifc__32;
>>>>>> _ifc__34 = _9 == 0 ? _ifc__33 : 0;
>>>>>> _ifc__35 = _ifc__31 != 0 ? 1 : 0;
>>>>>> res_1 = res_15 + _ifc__35;
>>>>>> i_11 = i_16 + 1;
>>>>>> ivtmp_14 = ivtmp_17 - 1;
>>>>>> if (ivtmp_14 != 0)
>>>>>> goto <bb 4>;
>>>>>>
>>>>>> Bootstrap and regression testing did not show any new failures.
>>>>>>
>>>>>> gcc/ChageLog
>>>>>>
>>>>>> 2014-06-25 Yuri Rumyantsev <ysrumyan@gmail.com>
>>>>>>
>>>>>> * tree-if-conv.c (flag_force_vectorize): New variable.
>>>>>> (struct bb_predicate_s): Add negate_predicate field.
>>>>>> (bb_negate_predicate): New function.
>>>>>> (set_bb_negate_predicate): New function.
>>>>>> (bb_copy_predicate): New function.
>>>>>> (add_stmt_to_bb_predicate_gimplified_stmts): New function.
>>>>>> (init_bb_predicate): Add initialization of negate_predicate field.
>>>>>> (reset_bb_predicate): Reset negate_predicate to NULL_TREE.
>>>>>> (convert_name_to_cmp): New function.
>>>>>> (get_type_for_cond): New function.
>>>>>> (convert_bool_predicate): New function.
>>>>>> (predicate_disjunction): New function.
>>>>>> (predicate_conjunction): New function.
>>>>>> (add_to_predicate_list): Add convert_bool argument.
>>>>>> Add call of predicate_disjunction if convert_bool argument is true.
>>>>>> (add_to_dst_predicate_list): Add convert_bool argument.
>>>>>> Add early function exit if edge target block is always executed.
>>>>>> Add call of predicate_conjunction if convert_bool argument is true.
>>>>>> Pass convert_bool argument for add_to_predicate_list.
>>>>>> (equal_phi_args): New function.
>>>>>> (phi_has_two_different_args): New function.
>>>>>> (phi_args_disjoint): New function.
>>>>>> (if_convertible_phi_p): Accept phi nodes with more than two args
>>>>>> for loops marked with pragma omp simd. Add check that phi nodes are
>>>>>> in non-predicated basic blocks.
>>>>>> (ifcvt_can_use_mask_load_store): Use flag_force_vectorize.
>>>>>> (all_edges_are_critical): New function.
>>>>>> (if_convertible_bb_p): Allow bb has more than two predecessors if
>>>>>> flag_force_vectorize was setup. Use call of all_edges_are_critical
>>>>>> to reject block if-conversion with imcoming critical edges only if
>>>>>> flag_force_vectorize was not setup.
>>>>>> (walk_cond_tree): New function.
>>>>>> (vect_bool_pattern_is_applicable): New function.
>>>>>> (predicate_bbs): Add convert_bool argument that is used to transform
>>>>>> comparison expressions of boolean type into conditional expressions
>>>>>> with integral operands. If bool_conv argument is false or both
>>>>>> outgoing edges are not critical old algorithm of predicate assignments
>>>>>> is used, otherwise the following code was added: check on applicable
>>>>>> of vect-bool-pattern recognition and trnasformation of
>>>>>> (bool) x != 0 --> y = (int) x; x != 0;
>>>>>> compute predicates for both outgoing edges one of which is critical
>>>>>> one using 'normal' edge, i.e. compute true and false predicates using
>>>>>> normal outgoing edge only; evaluated predicates are stored in
>>>>>> predicate and negate_predicate fields of struct bb_predicate_s and
>>>>>> negate_predicate of normal edge conatins predicate of critical edge,
>>>>>> but generated gimplified statements are stored in their destination
>>>>>> block fields. Additional argument 'convert_bool" is passed to
>>>>>> add_to_dst_predicate_list and add_to_predicate_list.
>>>>>> (if_convertible_loop_p_1): Call predicate_bbs with additional argument
>>>>>> equal to false.
>>>>>> (find_phi_replacement_condition): Extend function interface:
>>>>>> it returns NULL if given phi node must be handled by means of
>>>>>> extended phi node predication. If number of predecessors of phi-block
>>>>>> is equal 2 and atleast one incoming edge is not critical original
>>>>>> algorithm is used.
>>>>>> (is_cond_scalar_reduction): Add 'extended' argument which signals that
>>>>>> both phi arguments must be evaluated through phi_has_two_different_args.
>>>>>> (predicate_scalar_phi): Add invoÑation of convert_name_to_cmp if cond
>>>>>> is SSA_NAME. Add 'false' argument to call of is_cond_scalar_reduction.
>>>>>> (get_predicate_for_edge): New function.
>>>>>> (find_insertion_point): New function.
>>>>>> (predicate_phi_disjoint_args): New function.
>>>>>> (predicate_extended_scalar_phi): New function.
>>>>>> (predicate_all_scalar_phis): Add code to set-up gimple statement
>>>>>> iterator for predication of extended scalar phi's for insertion.
>>>>>> (insert_gimplified_predicates): Add test for non-predicated basic
>>>>>> blocks that there are no gimplified statements to insert. Insert
>>>>>> predicates at the block begining for extended if-conversion.
>>>>>> (predicate_mem_writes): Invoke convert_name_to_cmp for extended
>>>>>> predication to build mask.
>>>>>> (combine_blocks): Pass flag_force_vectorize to predicate_bbs.
>>>>>> (split_crit_edge): New function.
>>>>>> (tree_if_conversion): Initialize flag_force_vectorize from current
>>>>>> loop or outer loop (to support pragma omp declare). Invoke
>>>>>> split_crit_edge for extended predication. Do loop versioning for
>>>>>> innermost loop marked with pragma omp simd.