[PATCH] Set bound/cmp/control for until wrap loop.

Richard Biener rguenther@suse.de
Tue Aug 31 13:37:39 GMT 2021


On Tue, 31 Aug 2021, guojiufu wrote:

> On 2021-08-30 20:02, Richard Biener wrote:
> > On Mon, 30 Aug 2021, guojiufu wrote:
> > 
> >> On 2021-08-30 14:15, Jiufu Guo wrote:
> >> > Hi,
> >> >
> >> > In patch r12-3136, niter->control, niter->bound and niter->cmp are
> >> > derived from number_of_iterations_lt.  For the 'until wrap' condition,
> >> > however, the calculation in number_of_iterations_lt does not match the
> >> > definitions of these fields or the requirements of
> >> > determine_exit_conditions.
> >> >
> >> > This patch calculates niter->control, niter->bound and niter->cmp in
> >> > number_of_iterations_until_wrap.
> >> >
> >> > The ICEs in the PR no longer occur with this patch.
> >> > Bootstrap and reg-tests pass on ppc64/ppc64le and x86.
> >> > Is this ok for trunk?
> >> >
> >> > BR.
> >> > Jiufu Guo
> >> >
> >> Add ChangeLog:
> >> gcc/ChangeLog:
> >> 
> >> 2021-08-30  Jiufu Guo  <guojiufu@linux.ibm.com>
> >> 
> >>         PR tree-optimization/102087
> >>         * tree-ssa-loop-niter.c (number_of_iterations_until_wrap):
> >>         Set bound/cmp/control for niter.
> >> 
> >> gcc/testsuite/ChangeLog:
> >> 
> >> 2021-08-30  Jiufu Guo  <guojiufu@linux.ibm.com>
> >> 
> >>         PR tree-optimization/102087
> >>         * gcc.dg/vect/pr101145_3.c: Update tests.
> >>         * gcc.dg/pr102087.c: New test.
> >> 
> >> > ---
> >> >  gcc/tree-ssa-loop-niter.c              | 14 +++++++++++++-
> >> >  gcc/testsuite/gcc.dg/pr102087.c        | 25 +++++++++++++++++++++++++
> >> >  gcc/testsuite/gcc.dg/vect/pr101145_3.c |  4 +++-
> >> >  3 files changed, 41 insertions(+), 2 deletions(-)
> >> >  create mode 100644 gcc/testsuite/gcc.dg/pr102087.c
> >> >
> >> > diff --git a/gcc/tree-ssa-loop-niter.c b/gcc/tree-ssa-loop-niter.c
> >> > index 7af92d1c893..747f04d3ce0 100644
> >> > --- a/gcc/tree-ssa-loop-niter.c
> >> > +++ b/gcc/tree-ssa-loop-niter.c
> >> > @@ -1482,7 +1482,7 @@ number_of_iterations_until_wrap (class loop *, tree type, affine_iv *iv0,
> >> >  				 affine_iv *iv1, class tree_niter_desc *niter)
> >> >  {
> >> >    tree niter_type = unsigned_type_for (type);
> >> > -  tree step, num, assumptions, may_be_zero;
> >> > +  tree step, num, assumptions, may_be_zero, span;
> >> >    wide_int high, low, max, min;
> >> >
> >> >    may_be_zero = fold_build2 (LE_EXPR, boolean_type_node, iv1->base, iv0->base);
> >> > @@ -1513,6 +1513,8 @@ number_of_iterations_until_wrap (class loop *, tree type, affine_iv *iv0,
> >> >   low = wi::to_wide (iv0->base);
> >> >          else
> >> > 	low = min;
> >> > +
> >> > +      niter->control = *iv1;
> >> >      }
> >> >    /* {base, -C} < n.  */
> >> >    else if (tree_int_cst_sign_bit (iv0->step) && integer_zerop (iv1->step))
> >> > @@ -1533,6 +1535,8 @@ number_of_iterations_until_wrap (class loop *, tree type, affine_iv *iv0,
> >> >   high = wi::to_wide (iv1->base);
> >> >          else
> >> > 	high = max;
> >> > +
> >> > +      niter->control = *iv0;
> >> >      }
> >> >    else
> >> >      return false;
> > 
> > it looks like the above two should already be in effect from the
> > > caller (guarding with integer_nonzerop)?
> 
> > I added them only so that all of these fields are set in one function.
> > Yes, they are already set in the caller, so I can remove them here.
> 
> > 
> >> > @@ -1556,6 +1560,14 @@ number_of_iterations_until_wrap (class loop *, tree type, affine_iv *iv0,
> >> >            niter->assumptions, assumptions);
> >> >
> >> >    niter->control.no_overflow = false;
> >> > +  niter->control.base = fold_build2 (MINUS_EXPR, niter_type,
> >> > +				     niter->control.base, niter->control.step);
> > 
> > how do we know IVn - STEP doesn't already wrap?
> 
> > The last IV value just crosses the max/min value of the type
> > at the last iteration, so IVn - STEP is the value nearest
> > to max (or min) and does not wrap.
> 
> > A comment might be
> > good to explain you're turning the simplified exit condition into
> > 
> >    { IVbase - STEP, +, STEP } != niter * STEP + (IVbase - STEP)
> > 
> > > which, looked at mathematically, makes me wonder why there's
> > > the seemingly redundant '- STEP' term?  Also is NE_EXPR really
> > > correct since STEP might not be 1?  Only for non-equality compares
> > > should the '- STEP' matter?
> 
> > I need to add comments for this.  This is a little tricky.
> > The last value of the original IV crosses max/min by at most one STEP;
> > at that point wrapping has already happened.
> > Using "{IVbase, +, STEP} != niter * STEP + IVbase" is not wrong
> > as an exit condition.
> 
> > But this would not work well with existing code such as
> > determine_exit_conditions, which converts NE_EXPR to
> > LT_EXPR/GT_EXPR.  So the '- STEP' is added to adjust
> > IV.base and the bound; with '- STEP' the bound is the last value
> > just before the wrap.

Hmm.  The control IV is documented as

  /* The simplified shape of the exit condition.  The loop exits if
     CONTROL CMP BOUND is false, where CMP is one of NE_EXPR,
     LT_EXPR, or GT_EXPR, and step of CONTROL is positive if CMP is
     LE_EXPR and negative if CMP is GE_EXPR.  This information is used
     by loop unrolling.  */
  affine_iv control;

but determine_exit_conditions seems to assume the IV does not wrap?
In fact determine_exit_conditions seems to just build ->base CMP bound
where bound is the IV bound biased by #unroll * step - step.  So how
does biasing by step * 1 help?

Does the control IV wrap in our case?
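
To make the question concrete, here is a toy 8-bit model of the two
exit-condition shapes discussed above.  It is only a sketch and not the
GCC code: model_until_wrap and struct wrap_shape are made-up names,
uint8_t stands in for the IV type, and the IV is assumed to be
{base, +, step} with a nonzero step that runs until it first wraps.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical toy model; none of these names exist in GCC.  */
struct wrap_shape
{
  unsigned niter;        /* steps taken until the IV first wraps */
  uint8_t bound_plain;   /* niter * step + base (first wrapped value) */
  uint8_t bound_biased;  /* niter * step + (base - step) */
  uint8_t start_biased;  /* base - step, which may itself wrap */
};

static struct wrap_shape
model_until_wrap (uint8_t base, uint8_t step)
{
  struct wrap_shape s;
  uint8_t iv = base;

  assert (step != 0);
  s.niter = 0;

  /* Advance until the next value is smaller than the current one,
     which for a nonzero 8-bit step means the addition wrapped.  */
  for (;;)
    {
      uint8_t next = (uint8_t) (iv + step);
      s.niter++;
      if (next < iv)
        break;
      iv = next;
    }

  /* All arithmetic below is reduced modulo 256 by the casts.  */
  s.bound_plain = (uint8_t) (s.niter * step + base);
  s.bound_biased = (uint8_t) (s.niter * step + base - step);
  s.start_biased = (uint8_t) (base - step);
  return s;
}
```

For base = 250 and step = 3 the model gives niter = 2, a plain bound of 0
(the first wrapped value) and a biased bound of 253 (the last value before
the wrap).  For base = 1 and step = 3 the biased start is 254, so in this
model the biased control IV starts out wrapped whenever base < step.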

Richard.

> Thanks again for your review!
> 
> BR.
> Jiufu
> 
> > 
> > Richard.
> > 
> >> > +  span = fold_build2 (MULT_EXPR, niter_type, niter->niter,
> >> > +		      fold_convert (niter_type, niter->control.step));
> >> > +  niter->bound = fold_build2 (PLUS_EXPR, niter_type, span,
> >> > +			      fold_convert (niter_type, niter->control.base));
> >> > +  niter->bound = fold_convert (type, niter->bound);
> >> > +  niter->cmp = NE_EXPR;
> >> >
> >> >    return true;
> >> > }
> >> > diff --git a/gcc/testsuite/gcc.dg/pr102087.c b/gcc/testsuite/gcc.dg/pr102087.c
> >> > new file mode 100644
> >> > index 00000000000..ef1f9f5cba9
> >> > --- /dev/null
> >> > +++ b/gcc/testsuite/gcc.dg/pr102087.c
> >> > @@ -0,0 +1,25 @@
> >> > +/* { dg-do compile } */
> >> > +/* { dg-options "-O3" } */
> >> > +
> >> > +unsigned __attribute__ ((noinline))
> >> > +foo (int *__restrict__ a, int *__restrict__ b, unsigned l, unsigned n)
> >> > +{
> >> > +  while (n < ++l)
> >> > +    *a++ = *b++ + 1;
> >> > +  return l;
> >> > +}
> >> > +
> >> > +volatile int a[1];
> >> > +unsigned b;
> >> > +int c;
> >> > +
> >> > +int
> >> > +check ()
> >> > +{
> >> > +  int d;
> >> > +  for (; b > 1; b++)
> >> > +    for (c = 0; c < 2; c++)
> >> > +      for (d = 0; d < 2; d++)
> >> > +	a[0];
> >> > +  return 0;
> >> > +}
> >> > diff --git a/gcc/testsuite/gcc.dg/vect/pr101145_3.c b/gcc/testsuite/gcc.dg/vect/pr101145_3.c
> >> > index 99289afec0b..40cb0240aaa 100644
> >> > --- a/gcc/testsuite/gcc.dg/vect/pr101145_3.c
> >> > +++ b/gcc/testsuite/gcc.dg/vect/pr101145_3.c
> >> > @@ -1,5 +1,6 @@
> >> >  /* { dg-require-effective-target vect_int } */
> >> >  /* { dg-options "-O3 -fdump-tree-vect-details" } */
> >> > +
> >> >  #define TYPE int *
> >> >  #define MIN ((TYPE)0)
> >> >  #define MAX ((TYPE)((long long)-1))
> >> > @@ -10,4 +11,5 @@
> >> >
> >> >  #include "pr101145.inc"
> >> >
> >> > -/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 2 "vect" } } */
> >> > +/* pointer size may not be vectorized, checking niter is ok. */
> >> > +/* { dg-final { scan-tree-dump "Symbolic number of iterations is" "vect" } } */
> >> 
> 
> 

-- 
Richard Biener <rguenther@suse.de>
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)