This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH] Fix PR60505


On Fri, 28 Mar 2014, Cong Hou wrote:

> Ping?

Ok.

Thanks,
Richard.

> 
> thanks,
> Cong
> 
> 
> On Wed, Mar 19, 2014 at 11:39 AM, Cong Hou <congh@google.com> wrote:
> > On Tue, Mar 18, 2014 at 4:43 AM, Richard Biener <rguenther@suse.de> wrote:
> >>
> >> On Mon, 17 Mar 2014, Cong Hou wrote:
> >>
> >> > On Mon, Mar 17, 2014 at 6:44 AM, Richard Biener <rguenther@suse.de> wrote:
> >> > > On Fri, 14 Mar 2014, Cong Hou wrote:
> >> > >
> >> > >> On Fri, Mar 14, 2014 at 12:58 AM, Richard Biener <rguenther@suse.de> wrote:
> >> > >> > On Fri, 14 Mar 2014, Jakub Jelinek wrote:
> >> > >> >
> >> > >> >> On Fri, Mar 14, 2014 at 08:52:07AM +0100, Richard Biener wrote:
> >> > >> >> > > Consider this fact and if there are alias checks, we can safely remove
> >> > >> >> > > the epilogue if the maximum trip count of the loop is less than or
> >> > >> >> > > equal to the calculated threshold.
> >> > >> >> >
> >> > >> >> > You have to consider n % vf != 0, so an argument on only maximum
> >> > >> >> > trip count or threshold cannot work.
> >> > >> >>
> >> > >> >> Well, if you only check if maximum trip count is <= vf and you know
> >> > >> >> that for n < vf the vectorized loop + it's epilogue path will not be taken,
> >> > >> >> then perhaps you could, but it is a very special case.
> >> > >> >> Now, the question is when we are guaranteed we enter the scalar versioned
> >> > >> >> loop instead for n < vf, is that in case of versioning for alias or
> >> > >> >> versioning for alignment?
> >> > >> >
> >> > >> > I think neither - I have plans to do the cost model check together
> >> > >> > with the versioning condition but didn't get around to implement that.
> >> > >> > That would allow stronger max bounds for the epilogue loop.
> >> > >>
> >> > >> In vect_transform_loop(), check_profitability will be set to true if
> >> > >> th >= VF-1 and the number of iteration is unknown (we only consider
> >> > >> unknown trip count here), where th is calculated based on the
> >> > >> parameter PARAM_MIN_VECT_LOOP_BOUND and cost model, with the minimum
> >> > >> value VF-1. If the loop needs to be versioned, then
> >> > >> check_profitability with true value will be passed to
> >> > >> vect_loop_versioning(), in which an enhanced loop bound check
> >> > >> (considering cost) will be built. So I think if the loop is versioned
> >> > >> and n < VF, then we must enter the scalar version, and in this case
> >> > >> removing epilogue should be safe when the maximum trip count <= th+1.
> >> > >
> >> > > You mean exactly in the case where the profitability check ensures
> >> > > that n % vf == 0?  Thus effectively if n == maximum trip count?
> >> > > That's quite a special case, no?
> >> >
> >> >
> >> > Yes, it is a special case. But it is in this special case that those
> >> > warnings are thrown out. Also, I think declaring an array with VF*N as
> >> > length is not unusual.
> >>
> >> Ok, but then for the patch compute the cost model threshold once
> >> in vect_analyze_loop_2 and store it in a new
> >> LOOP_VINFO_COST_MODEL_THRESHOLD.
> >
> >
> > Done.
> >
> >
> >> Also you have to check
> >> the return value from max_stmt_executions_int as that may return
> >> -1 if the number cannot be computed (or isn't representable in
> >> a HOST_WIDE_INT).
> >
> >
> > It will be converted to unsigned type so that -1 means infinity.
> >
> >
> >> You also should check for
> >> LOOP_REQUIRES_VERSIONING_FOR_ALIGNMENT which should have the
> >> same effect on the cost model check.
> >
> >
> > Done.
> >
> >
> >>
> >>
> >> The existing condition is already complicated enough - adding new
> >> stuff warrants comments before the (sub-)checks.
> >
> >
> > OK. Comments added.
> >
> > Below is the revised patch. Bootstrapped and tested on a x86-64 machine.
> >
> >
> > Cong
> >
> >
> >
> > diff --git a/gcc/ChangeLog b/gcc/ChangeLog
> > index e1d8666..eceefb3 100644
> > --- a/gcc/ChangeLog
> > +++ b/gcc/ChangeLog
> > @@ -1,3 +1,18 @@
> > +2014-03-11  Cong Hou  <congh@google.com>
> > +
> > + PR tree-optimization/60505
> > + * tree-vectorizer.h (struct _stmt_vec_info): Add th field as the
> > + threshold of number of iterations below which no vectorization will be
> > + done.
> > + * tree-vect-loop.c (new_loop_vec_info):
> > + Initialize LOOP_VINFO_COST_MODEL_THRESHOLD.
> > + * tree-vect-loop.c (vect_analyze_loop_operations):
> > + Set LOOP_VINFO_COST_MODEL_THRESHOLD.
> > + * tree-vect-loop.c (vect_transform_loop):
> > + Use LOOP_VINFO_COST_MODEL_THRESHOLD.
> > + * tree-vect-loop.c (vect_analyze_loop_2): Check the maximum number
> > + of iterations of the loop and see if we should build the epilogue.
> > +
> >  2014-03-10  Jakub Jelinek  <jakub@redhat.com>
> >
> >   PR ipa/60457
> > diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
> > index 41b6875..09ec1c0 100644
> > --- a/gcc/testsuite/ChangeLog
> > +++ b/gcc/testsuite/ChangeLog
> > @@ -1,3 +1,8 @@
> > +2014-03-11  Cong Hou  <congh@google.com>
> > +
> > + PR tree-optimization/60505
> > + * gcc.dg/vect/pr60505.c: New test.
> > +
> >  2014-03-10  Jakub Jelinek  <jakub@redhat.com>
> >
> >   PR ipa/60457
> > diff --git a/gcc/testsuite/gcc.dg/vect/pr60505.c
> > b/gcc/testsuite/gcc.dg/vect/pr60505.c
> > new file mode 100644
> > index 0000000..6940513
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.dg/vect/pr60505.c
> > @@ -0,0 +1,12 @@
> > +/* { dg-do compile } */
> > +/* { dg-additional-options "-Wall -Werror" } */
> > +
> > +void foo(char *in, char *out, int num)
> > +{
> > +  int i;
> > +  char ovec[16] = {0};
> > +
> > +  for(i = 0; i < num ; ++i)
> > +    out[i] = (ovec[i] = in[i]);
> > +  out[num] = ovec[num/2];
> > +}
> > diff --git a/gcc/tree-vect-loop.c b/gcc/tree-vect-loop.c
> > index df6ab6f..1c78e11 100644
> > --- a/gcc/tree-vect-loop.c
> > +++ b/gcc/tree-vect-loop.c
> > @@ -933,6 +933,7 @@ new_loop_vec_info (struct loop *loop)
> >    LOOP_VINFO_NITERS (res) = NULL;
> >    LOOP_VINFO_NITERS_UNCHANGED (res) = NULL;
> >    LOOP_VINFO_COST_MODEL_MIN_ITERS (res) = 0;
> > +  LOOP_VINFO_COST_MODEL_THRESHOLD (res) = 0;
> >    LOOP_VINFO_VECTORIZABLE_P (res) = 0;
> >    LOOP_VINFO_PEELING_FOR_ALIGNMENT (res) = 0;
> >    LOOP_VINFO_VECT_FACTOR (res) = 0;
> > @@ -1579,6 +1580,8 @@ vect_analyze_loop_operations (loop_vec_info
> > loop_vinfo, bool slp)
> >            || min_profitable_iters > min_scalar_loop_bound))
> >      th = (unsigned) min_profitable_iters;
> >
> > +  LOOP_VINFO_COST_MODEL_THRESHOLD (loop_vinfo) = th;
> > +
> >    if (LOOP_VINFO_NITERS_KNOWN_P (loop_vinfo)
> >        && LOOP_VINFO_INT_NITERS (loop_vinfo) <= th)
> >      {
> > @@ -1625,6 +1628,7 @@ vect_analyze_loop_2 (loop_vec_info loop_vinfo)
> >    bool ok, slp = false;
> >    int max_vf = MAX_VECTORIZATION_FACTOR;
> >    int min_vf = 2;
> > +  unsigned int th;
> >
> >    /* Find all data references in the loop (which correspond to vdefs/vuses)
> >       and analyze their evolution in the loop.  Also adjust the minimal
> > @@ -1769,6 +1773,10 @@ vect_analyze_loop_2 (loop_vec_info loop_vinfo)
> >
> >    /* Decide whether we need to create an epilogue loop to handle
> >       remaining scalar iterations.  */
> > +  th = ((LOOP_VINFO_COST_MODEL_THRESHOLD (loop_vinfo) + 1)
> > +        / LOOP_VINFO_VECT_FACTOR (loop_vinfo))
> > +       * LOOP_VINFO_VECT_FACTOR (loop_vinfo);
> > +
> >    if (LOOP_VINFO_NITERS_KNOWN_P (loop_vinfo)
> >        && LOOP_VINFO_PEELING_FOR_ALIGNMENT (loop_vinfo) > 0)
> >      {
> > @@ -1779,7 +1787,14 @@ vect_analyze_loop_2 (loop_vec_info loop_vinfo)
> >      }
> >    else if (LOOP_VINFO_PEELING_FOR_ALIGNMENT (loop_vinfo)
> >     || (tree_ctz (LOOP_VINFO_NITERS (loop_vinfo))
> > -       < (unsigned)exact_log2 (LOOP_VINFO_VECT_FACTOR (loop_vinfo))))
> > +       < (unsigned)exact_log2 (LOOP_VINFO_VECT_FACTOR (loop_vinfo))
> > +               /* In case of versioning, check if the maximum number of
> > +                  iterations is greater than th.  If they are identical,
> > +                  the epilogue is unnecessary.  */
> > +       && ((!LOOP_REQUIRES_VERSIONING_FOR_ALIAS (loop_vinfo)
> > +            && !LOOP_REQUIRES_VERSIONING_FOR_ALIGNMENT (loop_vinfo))
> > +                   || (unsigned HOST_WIDE_INT)max_stmt_executions_int
> > +        (LOOP_VINFO_LOOP (loop_vinfo)) > th)))
> >      LOOP_VINFO_PEELING_FOR_NITER (loop_vinfo) = true;
> >
> >    /* If an epilogue loop is required make sure we can create one.  */
> > @@ -5775,9 +5790,7 @@ vect_transform_loop (loop_vec_info loop_vinfo)
> >       by our caller.  If the threshold makes all loops profitable that
> >       run at least the vectorization factor number of times checking
> >       is pointless, too.  */
> > -  th = ((PARAM_VALUE (PARAM_MIN_VECT_LOOP_BOUND)
> > - * LOOP_VINFO_VECT_FACTOR (loop_vinfo)) - 1);
> > -  th = MAX (th, LOOP_VINFO_COST_MODEL_MIN_ITERS (loop_vinfo));
> > +  th = LOOP_VINFO_COST_MODEL_THRESHOLD (loop_vinfo);
> >    if (th >= LOOP_VINFO_VECT_FACTOR (loop_vinfo) - 1
> >        && !LOOP_VINFO_NITERS_KNOWN_P (loop_vinfo))
> >      {
> > diff --git a/gcc/tree-vectorizer.h b/gcc/tree-vectorizer.h
> > index f8efe47..f2087e2 100644
> > --- a/gcc/tree-vectorizer.h
> > +++ b/gcc/tree-vectorizer.h
> > @@ -264,6 +264,11 @@ typedef struct _loop_vec_info {
> >       values unknown at compile time.  */
> >    int min_profitable_iters;
> >
> > +  /* Threshold of number of iterations below which vectorzation will not be
> > +     performed. It is calculated from MIN_PROFITABLE_ITERS and
> > +     PARAM_MIN_VECT_LOOP_BOUND.  */
> > +  unsigned int th;
> > +
> >    /* Is the loop vectorizable? */
> >    bool vectorizable;
> >
> > @@ -382,6 +387,7 @@ typedef struct _loop_vec_info {
> >     cost model.  */
> >  #define LOOP_VINFO_NITERS_UNCHANGED(L)     (L)->num_iters_unchanged
> >  #define LOOP_VINFO_COST_MODEL_MIN_ITERS(L) (L)->min_profitable_iters
> > +#define LOOP_VINFO_COST_MODEL_THRESHOLD(L) (L)->th
> >  #define LOOP_VINFO_VECTORIZABLE_P(L)       (L)->vectorizable
> >  #define LOOP_VINFO_VECT_FACTOR(L)          (L)->vectorization_factor
> >  #define LOOP_VINFO_PTR_MASK(L)             (L)->ptr_mask
> >
> >
> >
> >
> >
> >>
> >>
> >> Richard.
> 
> 

-- 
Richard Biener <rguenther@suse.de>
SUSE / SUSE Labs
SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
GF: Jeff Hawn, Jennifer Guild, Felix Imend"orffer


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]