[SVE] PR91272
Prathamesh Kulkarni
prathamesh.kulkarni@linaro.org
Mon Oct 21 20:16:00 GMT 2019
On Fri, 18 Oct 2019 at 14:36, Richard Sandiford
<richard.sandiford@arm.com> wrote:
>
> Prathamesh Kulkarni <prathamesh.kulkarni@linaro.org> writes:
> > Hi,
> > The attached patch tries to fix PR91272.
> > Does it look OK ?
> >
> > With patch, I see following failures for aarch64-sve.exp:
> > FAIL: gcc.target/aarch64/sve/clastb_1.c -march=armv8.2-a+sve
> > scan-assembler \\tclastb\\tw[0-9]+, p[0-7], w[0-9]+, z[0-9]+\\.s
> > FAIL: gcc.target/aarch64/sve/clastb_2.c -march=armv8.2-a+sve
> > scan-assembler \\tclastb\\tw[0-9]+, p[0-7]+, w[0-9]+, z[0-9]+\\.s
> > FAIL: gcc.target/aarch64/sve/clastb_3.c -march=armv8.2-a+sve
> > scan-assembler \\tclastb\\tw[0-9]+, p[0-7]+, w[0-9]+, z[0-9]+\\.b
> > FAIL: gcc.target/aarch64/sve/clastb_5.c -march=armv8.2-a+sve
> > scan-assembler \\tclastb\\tx[0-9]+, p[0-7], x[0-9]+, z[0-9]+\\.d
> >
> > For instance, in clastb_1.c, it now emits:
> > clastb s1, p1, s1, z0.s
> > while using a fully predicated loop.
> > Should I adjust the tests ?
>
> Yeah, that's an improvement really.
>
> > diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c
> > index acdd90784dc..2cad2cb94c8 100644
> > --- a/gcc/tree-vect-stmts.c
> > +++ b/gcc/tree-vect-stmts.c
> > @@ -10016,7 +10016,8 @@ vectorizable_condition (stmt_vec_info stmt_info, gimple_stmt_iterator *gsi,
> > /* See whether another part of the vectorized code applies a loop
> > mask to the condition, or to its inverse. */
> >
> > - if (loop_vinfo && LOOP_VINFO_FULLY_MASKED_P (loop_vinfo))
> > + if (loop_vinfo && LOOP_VINFO_FULLY_MASKED_P (loop_vinfo)
> > + && reduction_type != EXTRACT_LAST_REDUCTION)
> > {
> > scalar_cond_masked_key cond (cond_expr, ncopies);
> > if (loop_vinfo->scalar_cond_masked_set.contains (cond))
>
> The context here is:
>
> if (loop_vinfo->scalar_cond_masked_set.contains (cond))
> {
> vec_loop_masks *masks = &LOOP_VINFO_MASKS (loop_vinfo);
> loop_mask = vect_get_loop_mask (gsi, masks, ncopies, vectype, j);
> }
> else
> {
> bool honor_nans = HONOR_NANS (TREE_TYPE (cond.op0));
> cond.code = invert_tree_comparison (cond.code, honor_nans);
> if (loop_vinfo->scalar_cond_masked_set.contains (cond))
> {
> vec_loop_masks *masks = &LOOP_VINFO_MASKS (loop_vinfo);
> loop_mask = vect_get_loop_mask (gsi, masks, ncopies,
> vectype, j);
> cond_code = cond.code;
> swap_cond_operands = true;
> }
> }
>
> Rather than have another instance of vect_get_loop_mask, I think
> it would cleaner to use a nonnull "masks" to decide whether to apply
> the loop mask:
>
> vec_loop_masks *masks = NULL;
> if (loop_vinfo && LOOP_VINFO_FULLY_MASKED_P (loop_vinfo)
> {
> if (reduction_type == EXTRACT_LAST_REDUCTION
> || loop_vinfo->scalar_cond_masked_set.contains (cond))
> masks = &LOOP_VINFO_MASKS (loop_vinfo);
> else
> {
> ...
> }
>
> Then:
>
> > @@ -10116,6 +10117,15 @@ vectorizable_condition (stmt_vec_info stmt_info, gimple_stmt_iterator *gsi,
> > vec_then_clause = vec_oprnds2[i];
> > vec_else_clause = vec_oprnds3[i];
> >
> > + if (loop_vinfo && LOOP_VINFO_FULLY_MASKED_P (loop_vinfo)
> > + && reduction_type == EXTRACT_LAST_REDUCTION)
> > + {
> > + vec_loop_masks *masks = &LOOP_VINFO_MASKS (loop_vinfo);
> > + unsigned vec_num = vec_oprnds0.length ();
> > + loop_mask = vect_get_loop_mask (gsi, masks, vec_num * ncopies,
> > + vectype, vec_num * j + i);
> > + }
> > +
>
> ...do this vect_get_loop_mask under the condition of "if (masks)".
>
> > if (swap_cond_operands)
> > std::swap (vec_then_clause, vec_else_clause);
> >
> > @@ -10180,7 +10190,7 @@ vectorizable_condition (stmt_vec_info stmt_info, gimple_stmt_iterator *gsi,
> > vec != { 0, ... } (masked in the MASK_LOAD,
> > unmasked in the VEC_COND_EXPR). */
> >
> > - if (loop_mask)
> > + if (loop_mask && reduction_type != EXTRACT_LAST_REDUCTION)
> > {
> > if (COMPARISON_CLASS_P (vec_compare))
> > {
> > @@ -10220,6 +10230,16 @@ vectorizable_condition (stmt_vec_info stmt_info, gimple_stmt_iterator *gsi,
> > vect_finish_stmt_generation (stmt_info, new_stmt, gsi);
> > vec_compare = vec_compare_name;
> > }
>
> The code above here:
>
> if (!is_gimple_val (vec_compare))
> {
> tree vec_compare_name = make_ssa_name (vec_cmp_type);
> gassign *new_stmt = gimple_build_assign (vec_compare_name,
> vec_compare);
> vect_finish_stmt_generation (stmt_info, new_stmt, gsi);
> vec_compare = vec_compare_name;
> }
>
> is doing something similar to the new COND_EXPR handling:
>
> if (COMPARISON_CLASS_P (vec_compare))
> {
> tree tmp = make_ssa_name (vec_cmp_type);
> tree op0 = TREE_OPERAND (vec_compare, 0);
> tree op1 = TREE_OPERAND (vec_compare, 1);
> gassign *g = gimple_build_assign (tmp,
> TREE_CODE (vec_compare),
> op0, op1);
> vect_finish_stmt_generation (stmt_info, g, gsi);
> vec_compare = tmp;
> }
>
> There's then an added case:
>
> if (must_invert_cmp_result)
> {
> tree vec_compare_name = make_ssa_name (vec_cmp_type);
> gassign *new_stmt = gimple_build_assign (vec_compare_name,
> BIT_NOT_EXPR,
> vec_compare);
> vect_finish_stmt_generation (stmt_info, new_stmt, gsi);
> vec_compare = vec_compare_name;
> }
>
> that would be a safe no-op for the normal COND_EXPR path (because
> must_invert_cmp_result is always false then). Then this:
>
> > +
> > + if (loop_mask)
> > + {
> > + tree tmp = make_ssa_name (vec_cmp_type);
> > + gassign *g = gimple_build_assign (tmp, BIT_AND_EXPR,
> > + vec_compare, loop_mask);
> > + vect_finish_stmt_generation (stmt_info, g, gsi);
> > + vec_compare = tmp;
> > + }
> > +
>
> is the equivalent of the above:
>
> tree tmp2 = make_ssa_name (vec_cmp_type);
> gassign *g = gimple_build_assign (tmp2, BIT_AND_EXPR,
> vec_compare, loop_mask);
> vect_finish_stmt_generation (stmt_info, g, gsi);
> vec_compare = tmp2;
>
> So with this patch, I think the EXTRACT_LAST_REDUCTION and the normal
> COND_EXPR paths should be able to share the mask generation code.
Hi Richard,
Thanks for the suggestions, does the attached patch look OK ?
A quick comparison of aarch64-sve.exp shows no regressions,
bootstrap+test in progress.
Thanks,
Prathamesh
>
> Thanks,
> Richard
-------------- next part --------------
A non-text attachment was scrubbed...
Name: pr91272-2.diff
Type: text/x-patch
Size: 10436 bytes
Desc: not available
URL: <http://gcc.gnu.org/pipermail/gcc-patches/attachments/20191021/f4c2e270/attachment.bin>
More information about the Gcc-patches
mailing list