This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH][gimple-interchange] Add reduction validity check


On Mon, Dec 4, 2017 at 1:11 PM, Richard Biener <rguenther@suse.de> wrote:
>
> I've noticed we perform FP reduction association without the required
> checks for associative math.  I've added
> gcc.dg/tree-ssa/loop-interchange-1b.c to cover this.
>
> I also noticed we happily interchange a loop with a reduction like
>
>  sum = a[i] - sum;
>
> where a change in order of elements isn't ok.  Unfortunately bwaves
> is exactly a case where single_use != next_def (tried to simply remove
> that case for now), because reassoc didn't have a chance to fix the
> operand order.  Thus this patch exports the relevant handling from
> the vectorizer (for stage1 having a separate infrastructure gathering /
> analyzing of reduction/induction infrastructure would be nice...)
> and uses it from interchange.  We then don't handle
> gcc.dg/tree-ssa/loop-interchange-4.c anymore (similar vectorizer
> missed-opt is PR65930).  I didn't bother to split up the vectorizer
> code further to implement relaxed validity checking but simply XFAILed
> this testcase.
>
> Earlier I simplified allocation stuff in the main loop which is why
> this part is included in this patch.
>
> Bootstrap running on x86_64-unknown-linux-gnu.
>
> I'll see to craft a testcase with the sum = a[i] - sum; mis-handling.
>
> Ok?
>
> Thanks,
> Richard.
>
> 2017-12-04  Richard Biener  <rguenther@suse.de>
>
>         * tree-vectorizer.h (check_reduction_path): Declare.
>         * tree-vect-loop.c (check_reduction_path): New function, split out
>         from ...
>         (vect_is_simple_reduction): ... here.
>         * gimple-loop-interchange.cc: Include tree-vectorizer.h.
>         (loop_cand::analyze_iloop_reduction_var): Use single_imm_use.
>         Properly check for a supported reduction operation and a
>         valid expression if the reduction covers multiple stmts.
>         (prepare_perfect_loop_nest): Simpify allocation.
>         (pass_linterchange::execute): Likewise.
>
>         * gcc.dg/tree-ssa/loop-interchange-1.c: Add fast-math flags.
>         * gcc.dg/tree-ssa/loop-interchange-1b.c: New test variant.
>         * gcc.dg/tree-ssa/loop-interchange-4.c: XFAIL.
>
>
> Index: gcc/gimple-loop-interchange.cc
> ===================================================================
> --- gcc/gimple-loop-interchange.cc      (revision 255375)
> +++ gcc/gimple-loop-interchange.cc      (working copy)
> @@ -41,6 +41,7 @@ along with GCC; see the file COPYING3.
>  #include "tree-ssa-loop-ivopts.h"
>  #include "tree-ssa-dce.h"
>  #include "tree-data-ref.h"
> +#include "tree-vectorizer.h"
>
>  /* This pass performs loop interchange: for example, the loop nest
>
> @@ -551,23 +552,29 @@ loop_cand::analyze_iloop_reduction_var (
>       in a way that reduction operation is seen as black box.  In general,
>       we can ignore reassociation of reduction operator; we can handle fake
>       reductions in which VAR is not even used to compute NEXT.  */
> -  FOR_EACH_IMM_USE_FAST (use_p, iterator, var)
> -    {
> -      stmt = USE_STMT (use_p);
> -      if (is_gimple_debug (stmt))
> -       continue;
> -
> -      if (!flow_bb_inside_loop_p (m_loop, gimple_bb (stmt)))
> -       return false;
> -
> -      if (single_use != NULL)
> -       return false;
> +  if (! single_imm_use (var, &use_p, &single_use)
> +      || ! flow_bb_inside_loop_p (m_loop, gimple_bb (single_use)))
> +    return false;
>
> -      single_use = stmt;
> -    }
> +  /* Check the reduction operation.  We require a commutative or
> +     left-associative operation.  For FP math we also need to be allowed
> +     to associate operations.  */
> +  if (! is_gimple_assign (single_use)
> +      || ! (commutative_tree_code (gimple_assign_rhs_code (single_use))
> +           || (commutative_ternary_tree_code
> +                 (gimple_assign_rhs_code (single_use))
> +               && (use_p->use == gimple_assign_rhs1_ptr (single_use)
> +                   || use_p->use == gimple_assign_rhs2_ptr (single_use)))
> +           || (gimple_assign_rhs_code (single_use) == MINUS_EXPR
> +               && use_p->use == gimple_assign_rhs1_ptr (single_use)))
> +      || (FLOAT_TYPE_P (TREE_TYPE (var))
> +         && ! flag_associative_math))
> +    return false;
>
> +  /* Handle and verify a series of stmts feeding the reduction op.  */
>    if (single_use != next_def
> -      && !stmt_dominates_stmt_p (single_use, next_def))
> +      && !check_reduction_path (UNKNOWN_LOCATION, m_loop, phi, next,
> +                               gimple_assign_rhs_code (single_use)))
>      return false;
>
>    /* Only support cases in which INIT is used in inner loop.  */
> @@ -1964,7 +1971,7 @@ prepare_perfect_loop_nest (struct loop *
>                            vec<data_reference_p> *datarefs, vec<ddr_p> *ddrs)
>  {
>    struct loop *start_loop = NULL, *innermost = loop;
> -  struct loop *outermost = superloop_at_depth (loop, 0);
> +  struct loop *outermost = loops_for_fn (cfun)->tree_root;
>
>    /* Find loop nest from the innermost loop.  The outermost is the innermost
>       outer*/
> @@ -1985,21 +1992,26 @@ prepare_perfect_loop_nest (struct loop *
>    if (!start_loop || !start_loop->inner)
>      return false;
>
> +  /* Prepare the data reference vector for the loop nest, pruning outer
> +     loops we cannot handle.  */
>    datarefs->create (20);
> -  if ((start_loop = prepare_data_references (start_loop, datarefs)) == NULL
> +  start_loop = prepare_data_references (start_loop, datarefs);
> +  if (!start_loop
>        /* Check if there is no data reference.  */
>        || datarefs->is_empty ()
>        /* Check if there are too many of data references.  */
> -      || (int) datarefs->length () > MAX_DATAREFS
> -      /* Compute access strides for all data references.  */
> -      || ((start_loop = compute_access_strides (start_loop, innermost,
> -                                               *datarefs)) == NULL)
> -      /* Check if loop nest should be interchanged.  */
> -      || !should_interchange_loop_nest (start_loop, innermost, *datarefs))
> -    {
> -      free_data_refs_with_aux (*datarefs);
> -      return false;
> -    }
> +      || (int) datarefs->length () > MAX_DATAREFS)
> +    return false;
> +
> +  /* Compute access strides for all data references, pruning outer
> +     loops we cannot analyze refs in.  */
> +  start_loop = compute_access_strides (start_loop, innermost, *datarefs);
> +  if (!start_loop)
> +    return false;
> +
> +  /* Check if any interchange is profitable in the loop nest.  */
> +  if (!should_interchange_loop_nest (start_loop, innermost, *datarefs))
> +    return false;
>
>    /* Check if data dependences can be computed for loop nest starting from
>       start_loop.  */
> @@ -2029,14 +2041,15 @@ prepare_perfect_loop_nest (struct loop *
>         return true;
>        }
>
> +    /* ???  We should be able to adjust DDRs we already computed by
> +       truncating distance vectors.  */
Note it could be hard enough to adjust DDRs for stripped outer loop.
Dependence can become non-dep, even worse, we may result in wrong
"invalid dep" in case of:
   DDR<a,b>:
     (<, <, =)
     (<, >, =)
because after simply stripping, it becomes:
   DDR<a,b>:
     (<, =)
     (>, =)
While the correct result is "no-dep".

Thanks,
bin


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]