[PATCH][GRAPHITE] Fix PR82451 (and PR82355 in a different way)

Bin.Cheng amker.cheng@gmail.com
Thu Oct 12 10:40:00 GMT 2017


On Wed, Oct 11, 2017 at 3:43 PM, Richard Biener <rguenther@suse.de> wrote:
>
> For PR82355 I introduced a fake dimension to ISL to allow CHRECs
> having an evolution in a loop that isn't fully part of the SESE
> region we are processing.  That was easier than fending off those
> CHRECs (without simply giving up on SESE regions with those).
>
> But it didn't fully solve the issue as PR82451 shows where we run
> into the issue that we eventually have to code-gen those
> evolutions and thus in theory need a canonical IV of that containing loop.
>
> So I decided (after Micha pressuring me a bit...) to revisit the
> original issue and make SCEV analysis "properly" handle SE regions.
> It turns out that it is mostly instantiate_scev lacking proper support
> plus the necessary interfacing change (really just cosmetic in some sense)
> from a instantiate_before basic-block to a instantiate_before edge.
>
> data-ref interfaces have been similarly adjusted, here changing
> the "loop nest" loop parameter to the entry edge for the SE region
> and passing that down accordingly.
>
> I've for now tried to keep other high-level loop-based interfaces the
> same by simply using the loop preheader edge as entry where appropriate
> (needing loop_preheader_edge cope with the loop root tree for simplicity).
>
> In the process I ran into issues with us too overly aggressive
> instantiating random trees and thus I cut those down.  That part
> doesn't successfully test separately (when I remove the strange
> ARRAY_REF instantiation), so it's part of this patch.  I've also
> run into an SSA verification fail (the id-27.f90 testcase) which
> shows we _do_ need to clear the SCEV cache after introducing
> the versioned CFG (and added a comment before it).
>
> On the previously failing testcases I've verified we produce
> sensible instantiations for those pesky refs residing in "no" loop
> in the SCOP and that we get away with the result in terms of
> optimizing.
>
> SPEC 2k6 testing shows
>
> loop nest optimized: 311
> loop nest not optimized, code generation error: 0
> loop nest not optimized, optimized schedule is identical to original
> schedule: 173
> loop nest not optimized, optimization timed out: 59
> loop nest not optimized, ISL signalled an error: 9
> loop nest: 552
>
> for SPEC 2k6 and -floop-nest-optimize while adding -fgraphite-identity
> still reveals some codegen errors:
>
> loop nest optimized: 437
> loop nest not optimized, code generation error: 25
> loop nest not optimized, optimized schedule is identical to original
> schedule: 169
> loop nest not optimized, optimization timed out: 60
> loop nest not optimized, ISL signalled an error: 9
> loop nest: 700
>
> Bootstrap and regtest in progress on x86_64-unknown-linux-gnu
> (with and without -fgraphite-identity -floop-nest-optimize).
>
> Ok?
>
> Thanks,
> Richard.
>

> Index: gcc/tree-scalar-evolution.c
> ===================================================================
> --- gcc/tree-scalar-evolution.c (revision 253645)
> +++ gcc/tree-scalar-evolution.c (working copy)
> @@ -2344,7 +2348,7 @@ static tree instantiate_scev_r (basic_bl
>     instantiated, and to stop if it exceeds some limit.  */
>
>  static tree
> -instantiate_scev_name (basic_block instantiate_below,
> +instantiate_scev_name (edge instantiate_below,
>                        struct loop *evolution_loop, struct loop *inner_loop,
>                        tree chrec,
>                        bool *fold_conversions,
> @@ -2358,7 +2362,7 @@ instantiate_scev_name (basic_block insta
>       evolutions in outer loops), nothing to do.  */
>    if (!def_bb
>        || loop_depth (def_bb->loop_father) == 0
> -      || dominated_by_p (CDI_DOMINATORS, instantiate_below, def_bb))
> +      || ! dominated_by_p (CDI_DOMINATORS, def_bb, instantiate_below->dest))
>      return chrec;
>
>    /* We cache the value of instantiated variable to avoid exponential
> @@ -2380,6 +2384,51 @@ instantiate_scev_name (basic_block insta
>
>    def_loop = find_common_loop (evolution_loop, def_bb->loop_father);
>
> +  if (! dominated_by_p (CDI_DOMINATORS,
> +                       def_loop->header, instantiate_below->dest))
> +    {
> +      gimple *def = SSA_NAME_DEF_STMT (chrec);
> +      if (gassign *ass = dyn_cast <gassign *> (def))
> +       {
> +         switch (gimple_assign_rhs_class (ass))
> +           {
> +           case GIMPLE_UNARY_RHS:
> +             {
> +               tree op0 = instantiate_scev_r (instantiate_below, evolution_loop,
> +                                              inner_loop, gimple_assign_rhs1 (ass),
> +                                              fold_conversions, size_expr);
> +               if (op0 == chrec_dont_know)
> +                 return chrec_dont_know;
> +               res = fold_build1 (gimple_assign_rhs_code (ass),
> +                                  TREE_TYPE (chrec), op0);
> +               break;
> +             }
> +           case GIMPLE_BINARY_RHS:
> +             {
> +               tree op0 = instantiate_scev_r (instantiate_below, evolution_loop,
> +                                              inner_loop, gimple_assign_rhs1 (ass),
> +                                              fold_conversions, size_expr);
> +               if (op0 == chrec_dont_know)
> +                 return chrec_dont_know;
> +               tree op1 = instantiate_scev_r (instantiate_below, evolution_loop,
> +                                              inner_loop, gimple_assign_rhs2 (ass),
> +                                              fold_conversions, size_expr);
> +               if (op1 == chrec_dont_know)
> +                 return chrec_dont_know;
> +               res = fold_build2 (gimple_assign_rhs_code (ass),
> +                                  TREE_TYPE (chrec), op0, op1);
> +               break;
> +             }
> +           default:
> +             res = chrec_dont_know;
> +           }
> +       }
> +      else
> +       res = chrec_dont_know;
> +      global_cache->set (si, res);
> +      return res;
> +    }
> +
>    /* If the analysis yields a parametric chrec, instantiate the
>       result again.  */
>    res = analyze_scalar_evolution (def_loop, chrec);

IIUC, after changing instantiate_scev_r from loop based to region
based, there are
two cases.  In one case, def_loop is dominated by instantiate_edge,
which we'd like
to analyze/instantiate scev wrto the new def_loop; In the other case,
def_loop is not
fully part of sese region, which we'd like to expand ssa_name wrto the
basic block
instantiate_edge->dest.  It's simply ssa expanding and no loop is involved.
So how about factor out above big if-statement into a function with name like
expand_scev_name (Other better names?).  The code like:

  /* Some comment explaining the two cases in region based instantiation.  */
  if (dominated_by_p (CDI_DOMINATORS, def_loop->header,
instantiate_below->dest))
    res = analyze_scalar_evolution (def_loop, chrec);
  else
    res = expand_scev_name (instantiate_below, chrec);

could be easier to read?

Thanks,
bin



More information about the Gcc-patches mailing list