[PATCH 1/2][vect]PR 88915: Vectorize epilogues when versioning loops

Andre Vieira (lists) andre.simoesdiasvieira@arm.com
Mon Oct 28 18:37:00 GMT 2019


Hi,

Reworked according to your comments, see inline for clarification.

Is this OK for trunk?

gcc/ChangeLog:
2019-10-28  Andre Vieira  <andre.simoesdiasvieira@arm.com>

     PR 88915
     * tree-ssa-loop-niter.h (simplify_replace_tree): Change declaration.
     * tree-ssa-loop-niter.c (simplify_replace_tree): Add context parameter
     and make the valueize function pointer also take a void pointer.
     * gcc/tree-ssa-sccvn.c (vn_valueize_wrapper): New function to wrap
     around vn_valueize, to call it without a context.
     (process_bb): Use vn_valueize_wrapper instead of vn_valueize.
     * tree-vect-loop.c (_loop_vec_info): Initialize epilogue_vinfos.
     (~_loop_vec_info): Release epilogue_vinfos.
     (vect_analyze_loop_costing): Use knowledge of main VF to estimate
     number of iterations of epilogue.
     (vect_analyze_loop_2): Adapt to analyse main loop for all supported
     vector sizes when vect-epilogues-nomask=1.  Also keep track of lowest
     versioning threshold needed for main loop.
     (vect_analyze_loop): Likewise.
     (find_in_mapping): New helper function.
     (update_epilogue_loop_vinfo): New function.
     (vect_transform_loop): When vectorizing epilogues re-use analysis done
     on main loop and call update_epilogue_loop_vinfo to update it.
     * tree-vect-loop-manip.c (vect_update_inits_of_drs): No longer insert
     stmts on loop preheader edge.
     (vect_do_peeling): Enable skip-vectors when doing loop versioning if
     we decided to vectorize epilogues.  Update epilogues NITERS and
     construct ADVANCE to update epilogues data references where needed.
     * tree-vectorizer.h (_loop_vec_info): Add epilogue_vinfos.
     (vect_do_peeling, vect_update_inits_of_drs,
      determine_peel_for_niter, vect_analyze_loop): Add or update 
declarations.
     * tree-vectorizer.c (try_vectorize_loop_1): Make sure to use already
     created loop_vec_info's for epilogues when available.  Otherwise 
analyse
     epilogue separately.



Cheers,
Andre

On 28/10/2019 14:16, Richard Biener wrote:
> On Fri, 25 Oct 2019, Andre Vieira (lists) wrote:
> 
>> Hi,
>>
>> This is the reworked patch after your comments.
>>
>> I have moved the epilogue check into the analysis form disguised under
>> '!epilogue_vinfos.is_empty ()'.  This because I realized that I am doing the
>> "lowest threshold" check there.
>>
>> The only place where we may reject an epilogue_vinfo is when we know the
>> number of scalar iterations and we realize the number of iterations left after
>> the main loop are not enough to enter the vectorized epilogue so we optimize
>> away that code-gen.  The only way we know this to be true is if the number of
>> scalar iterations are known and the peeling for alignment is known. So we know
>> we will enter the main loop regardless, so whether the threshold we use is for
>> a lower VF or not it shouldn't matter as much, I would even like to think that
>> check isn't done, but I am not sure... Might be worth checking as an
>> optimization.
>>
>>
>> Is this OK for trunk?
> 
> +      for (epilogue_phi_gsi = gsi_start_phis (epilogue_bbs[i]);
> +          !gsi_end_p (epilogue_phi_gsi); gsi_next (&epilogue_phi_gsi))
> +       {
> ..
> +         if (STMT_VINFO_PATTERN_DEF_SEQ (stmt_vinfo))
> +           pattern_worklist.safe_push (stmt_vinfo);
> +
> +         related_vinfo = STMT_VINFO_RELATED_STMT (stmt_vinfo);
> +         while (related_vinfo && related_vinfo != stmt_vinfo)
> +           {
> 
> I think PHIs cannot have patterns.  You can assert
> that STMT_VINFO_RELATED_STMT is NULL I think.

Done.
> 
> +         related_vinfo = STMT_VINFO_RELATED_STMT (stmt_vinfo);
> +         while (related_vinfo && related_vinfo != stmt_vinfo)
> +           {
> +             related_worklist.safe_push (related_vinfo);
> +             /* Set BB such that the assert in
> +               'get_initial_def_for_reduction' is able to determine that
> +               the BB of the related stmt is inside this loop.  */
> +             gimple_set_bb (STMT_VINFO_STMT (related_vinfo),
> +                            gimple_bb (new_stmt));
> +             related_vinfo = STMT_VINFO_RELATED_STMT (related_vinfo);
> +           }
> 
> do we really keep references to "nested" patterns?  Thus, do you
> need this loop?

Changed and added asserts.  They didn't trigger so I suppose you are 
right, I didn't know at the time whether it was possible, so I just 
operated on the side of caution.  Can remove the asserts and so on if 
you want.
> 
> +  /* The PATTERN_DEF_SEQs in the epilogue were constructed using the
> +     original main loop and thus need to be updated to refer to the
> cloned
> +     variables used in the epilogue.  */
> +  for (unsigned i = 0; i < pattern_worklist.length (); ++i)
> +    {
> ...
> +                 op = simplify_replace_tree (op, NULL_TREE, NULL_TREE,
> +                                        &find_in_mapping, &mapping);
> +                 gimple_set_op (seq, j, op);
> 
> you do this for the pattern-def seq but not for the related one.
> I guess you ran into this for COND_EXPR conditions.  I wondered
> to use a shared worklist for both the def-seq and the main pattern
> stmt or at least to split out the replacement so you can share it.

I think that was it yeah, reworked it now to use the same list. Less 
code, thanks!
> 
> +      /* Data references for gather loads and scatter stores do not use
> the
> +        updated offset we set using ADVANCE.  Instead we have to make
> sure the
> +        reference in the data references point to the corresponding copy
> of
> +        the original in the epilogue.  */
> +      if (STMT_VINFO_GATHER_SCATTER_P (stmt_vinfo))
> +       {
> +         int j;
> +         if (TREE_CODE (DR_REF (dr)) == MEM_REF)
> +           j = 0;
> +         else if (TREE_CODE (DR_REF (dr)) == ARRAY_REF)
> +           j = 1;
> +         else
> +           gcc_unreachable ();
> +
> +         if (tree *new_op = mapping.get (TREE_OPERAND (DR_REF (dr), j)))
> +           {
> +             DR_REF (dr) = unshare_expr (DR_REF (dr));
> +             TREE_OPERAND (DR_REF (dr), j) = *new_op;
> +           }
> 
> huh, do you really only ever see MEM_REF or ARRAY_REF here?
> I would guess using simplify_replace_tree is safer.
> There's also DR_BASE_ADDRESS - we seem to leave the DRs partially
> updated, is that correct?

Yeah can use simplify_replace_tree indeed.  And I have changed it so it 
updates DR_BASE_ADDRESS.  I think DR_BASE_ADDRESS never actually changed 
in the way we use data_references... Either way, replacing them if they 
do change is cleaner and more future proof.
> 
> Otherwise looks OK to me.
> 
> Thanks,
> Richard.
> 
> 
>> gcc/ChangeLog:
>> 2019-10-25  Andre Vieira  <andre.simoesdiasvieira@arm.com>
>>
>>      PR 88915
>>      * tree-ssa-loop-niter.h (simplify_replace_tree): Change declaration.
>>      * tree-ssa-loop-niter.c (simplify_replace_tree): Add context parameter
>>      and make the valueize function pointer also take a void pointer.
>>      * gcc/tree-ssa-sccvn.c (vn_valueize_wrapper): New function to wrap
>>      around vn_valueize, to call it without a context.
>>      (process_bb): Use vn_valueize_wrapper instead of vn_valueize.
>>      * tree-vect-loop.c (_loop_vec_info): Initialize epilogue_vinfos.
>>      (~_loop_vec_info): Release epilogue_vinfos.
>>      (vect_analyze_loop_costing): Use knowledge of main VF to estimate
>>      number of iterations of epilogue.
>>      (vect_analyze_loop_2): Adapt to analyse main loop for all supported
>>      vector sizes when vect-epilogues-nomask=1.  Also keep track of lowest
>>      versioning threshold needed for main loop.
>>      (vect_analyze_loop): Likewise.
>>      (find_in_mapping): New helper function.
>>      (update_epilogue_loop_vinfo): New function.
>>      (vect_transform_loop): When vectorizing epilogues re-use analysis done
>>      on main loop and call update_epilogue_loop_vinfo to update it.
>>      * tree-vect-loop-manip.c (vect_update_inits_of_drs): No longer insert
>>      stmts on loop preheader edge.
>>      (vect_do_peeling): Enable skip-vectors when doing loop versioning if
>>      we decided to vectorize epilogues.  Update epilogues NITERS and
>>      construct ADVANCE to update epilogues data references where needed.
>>      * tree-vectorizer.h (_loop_vec_info): Add epilogue_vinfos.
>>      (vect_do_peeling, vect_update_inits_of_drs,
>>       determine_peel_for_niter, vect_analyze_loop): Add or update declarations.
>>      * tree-vectorizer.c (try_vectorize_loop_1): Make sure to use already
>>      created loop_vec_info's for epilogues when available.  Otherwise
>> analyse
>>      epilogue separately.
>>
>>
>>
>> Cheers,
>> Andre
>>
> 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: epilogues_1_7.patch
Type: text/x-patch
Size: 37977 bytes
Desc: not available
URL: <http://gcc.gnu.org/pipermail/gcc-patches/attachments/20191028/0751a32b/attachment.bin>


More information about the Gcc-patches mailing list