This is the mail archive of the
mailing list for the GCC project.
Re: [PATCH] Extended if-conversion for loops marked with pragma omp simd.
- From: Yuri Rumyantsev <ysrumyan at gmail dot com>
- To: Richard Biener <richard dot guenther at gmail dot com>
- Cc: gcc-patches <gcc-patches at gcc dot gnu dot org>, Igor Zamyatin <izamyatin at gmail dot com>
- Date: Mon, 28 Jul 2014 15:17:32 +0400
- Subject: Re: [PATCH] Extended if-conversion for loops marked with pragma omp simd.
- Authentication-results: sourceware.org; auth=none
- References: <CAEoMCqQCGRPa6Td5mzGgSy8U2izn7Rak9r32frabtvRXv5bwEw at mail dot gmail dot com> <CAEoMCqT1V=Q5ip=4NVv_ph9sQ1iPBitmpF4CQc58Rp5RJuvpVA at mail dot gmail dot com> <CAFiYyc2w+-Vy3ySBhOTt1YV4pynrcE_xxx9YWKrQQLYPtfyQAA at mail dot gmail dot com>
2014-07-14 16:16 GMT+04:00 Richard Biener <firstname.lastname@example.org>:
> On Mon, Jul 14, 2014 at 12:16 PM, Yuri Rumyantsev <email@example.com> wrote:
> It's in my queue (pretty large patch for a drive-by review - maybe there is
> an opportunity to split the patch up?).
> Won't get to it before the Cauldron though.
>> 2014-06-25 18:06 GMT+04:00 Yuri Rumyantsev <firstname.lastname@example.org>:
>>> Hi All,
>>> We implemented additional support for pragma omp simd in part of
>>> extended if-conversion loops with such pragma. These extensions
>>> 1. All extensions are performed only if considered loop or its outer
>>> loop was marked with pragma omp simd (force_vectorize); For ordinary
>>> loops behavior was not changed.
>>> 2. Took off cfg restriction on basic block which can have more than 2
>>> 3. Put additional restriction on phi nodes which was missed in current design:
>>> all phi nodes must be in non-predicated basic block to conform
>>> semantic of COND_EXPR which is used for transformation.
>>> 4. Extend predication of phi nodes: phi may have more than 2 arguments
>>> with some limitations:
>>> - for phi nodes which have more than 2 arguments, but only two
>>> arguments are different and one of them has the only occurence,
>>> transformation to single COND_EXPR can be done.
>>> - if phi node has more different arguments and all edge predicates
>>> correspondent to phi-arguments are disjoint, a chain of COND_EXPR
>>> will be generated for it. In current design very simple check is used:
>>> check starting from end that two edges correspondent to neighbor
>>> arguments have common predecessor which is used for further check
>>> with next edge.
>>> These guarantee that phi predication will produce the correct result.
>>> Here is example of such extended predication (compile with -march=core-avx2):
>>> #pragma omp simd safelen(8)
>>> for (i=0; i<512; i++)
>>> float t = a[i];
>>> if (t > 0 & t < 1.0e+17f)
>>> if (c[i] != 0)
>>> res += 1;
>>> <bb 4>:
>>> # res_15 = PHI <res_1(5), 0(3)>
>>> # i_16 = PHI <i_11(5), 0(3)>
>>> # ivtmp_17 = PHI <ivtmp_14(5), 512(3)>
>>> t_5 = a[i_16];
>>> _6 = t_5 > 0.0;
>>> _7 = t_5 < 9.9999998430674944e+16;
>>> _8 = _7 & _6;
>>> _ifc__28 = (unsigned int) _8;
>>> _10 = &c[i_16];
>>> _ifc__36 = _ifc__28 != 0 ? 4294967295 : 0;
>>> _9 = MASK_LOAD (_10, 0B, _ifc__36);
>>> _ifc__29 = _ifc__28 != 0 ? 1 : 0;
>>> _ifc__30 = (int) _ifc__29;
>>> _ifc__31 = _9 != 0 ? _ifc__30 : 0;
>>> _ifc__32 = _ifc__28 != 0 ? 1 : 0;
>>> _ifc__33 = (int) _ifc__32;
>>> _ifc__34 = _9 == 0 ? _ifc__33 : 0;
>>> _ifc__35 = _ifc__31 != 0 ? 1 : 0;
>>> res_1 = res_15 + _ifc__35;
>>> i_11 = i_16 + 1;
>>> ivtmp_14 = ivtmp_17 - 1;
>>> if (ivtmp_14 != 0)
>>> goto <bb 4>;
>>> Bootstrap and regression testing did not show any new failures.
>>> 2014-06-25 Yuri Rumyantsev <email@example.com>
>>> * tree-if-conv.c (flag_force_vectorize): New variable.
>>> (struct bb_predicate_s): Add negate_predicate field.
>>> (bb_negate_predicate): New function.
>>> (set_bb_negate_predicate): New function.
>>> (bb_copy_predicate): New function.
>>> (add_stmt_to_bb_predicate_gimplified_stmts): New function.
>>> (init_bb_predicate): Add initialization of negate_predicate field.
>>> (reset_bb_predicate): Reset negate_predicate to NULL_TREE.
>>> (convert_name_to_cmp): New function.
>>> (get_type_for_cond): New function.
>>> (convert_bool_predicate): New function.
>>> (predicate_disjunction): New function.
>>> (predicate_conjunction): New function.
>>> (add_to_predicate_list): Add convert_bool argument.
>>> Add call of predicate_disjunction if convert_bool argument is true.
>>> (add_to_dst_predicate_list): Add convert_bool argument.
>>> Add early function exit if edge target block is always executed.
>>> Add call of predicate_conjunction if convert_bool argument is true.
>>> Pass convert_bool argument for add_to_predicate_list.
>>> (equal_phi_args): New function.
>>> (phi_has_two_different_args): New function.
>>> (phi_args_disjoint): New function.
>>> (if_convertible_phi_p): Accept phi nodes with more than two args
>>> for loops marked with pragma omp simd. Add check that phi nodes are
>>> in non-predicated basic blocks.
>>> (ifcvt_can_use_mask_load_store): Use flag_force_vectorize.
>>> (all_edges_are_critical): New function.
>>> (if_convertible_bb_p): Allow bb has more than two predecessors if
>>> flag_force_vectorize was setup. Use call of all_edges_are_critical
>>> to reject block if-conversion with imcoming critical edges only if
>>> flag_force_vectorize was not setup.
>>> (walk_cond_tree): New function.
>>> (vect_bool_pattern_is_applicable): New function.
>>> (predicate_bbs): Add convert_bool argument that is used to transform
>>> comparison expressions of boolean type into conditional expressions
>>> with integral operands. If bool_conv argument is false or both
>>> outgoing edges are not critical old algorithm of predicate assignments
>>> is used, otherwise the following code was added: check on applicable
>>> of vect-bool-pattern recognition and trnasformation of
>>> (bool) x != 0 --> y = (int) x; x != 0;
>>> compute predicates for both outgoing edges one of which is critical
>>> one using 'normal' edge, i.e. compute true and false predicates using
>>> normal outgoing edge only; evaluated predicates are stored in
>>> predicate and negate_predicate fields of struct bb_predicate_s and
>>> negate_predicate of normal edge conatins predicate of critical edge,
>>> but generated gimplified statements are stored in their destination
>>> block fields. Additional argument 'convert_bool" is passed to
>>> add_to_dst_predicate_list and add_to_predicate_list.
>>> (if_convertible_loop_p_1): Call predicate_bbs with additional argument
>>> equal to false.
>>> (find_phi_replacement_condition): Extend function interface:
>>> it returns NULL if given phi node must be handled by means of
>>> extended phi node predication. If number of predecessors of phi-block
>>> is equal 2 and atleast one incoming edge is not critical original
>>> algorithm is used.
>>> (is_cond_scalar_reduction): Add 'extended' argument which signals that
>>> both phi arguments must be evaluated through phi_has_two_different_args.
>>> (predicate_scalar_phi): Add invoÑation of convert_name_to_cmp if cond
>>> is SSA_NAME. Add 'false' argument to call of is_cond_scalar_reduction.
>>> (get_predicate_for_edge): New function.
>>> (find_insertion_point): New function.
>>> (predicate_phi_disjoint_args): New function.
>>> (predicate_extended_scalar_phi): New function.
>>> (predicate_all_scalar_phis): Add code to set-up gimple statement
>>> iterator for predication of extended scalar phi's for insertion.
>>> (insert_gimplified_predicates): Add test for non-predicated basic
>>> blocks that there are no gimplified statements to insert. Insert
>>> predicates at the block begining for extended if-conversion.
>>> (predicate_mem_writes): Invoke convert_name_to_cmp for extended
>>> predication to build mask.
>>> (combine_blocks): Pass flag_force_vectorize to predicate_bbs.
>>> (split_crit_edge): New function.
>>> (tree_if_conversion): Initialize flag_force_vectorize from current
>>> loop or outer loop (to support pragma omp declare). Invoke
>>> split_crit_edge for extended predication. Do loop versioning for
>>> innermost loop marked with pragma omp simd.