This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH PR78005]Fix miscompare issue by computing correct guard condition for vectorized loop


On Mon, 12 Jun 2017, Bin.Cheng wrote:

> On Mon, Jun 12, 2017 at 9:19 AM, Richard Sandiford
> <richard.sandiford@linaro.org> wrote:
> > "Bin.Cheng" <amker.cheng@gmail.com> writes:
> >> On Sat, Jun 10, 2017 at 10:40 AM, Richard Sandiford
> >> <richard.sandiford@linaro.org> wrote:
> >>> Sorry to return this old patch, but:
> >>>
> >>> Bin Cheng <Bin.Cheng@arm.com> writes:
> >>>> -/* Calculate the number of iterations under which scalar loop will be
> >>>> -   preferred than vectorized loop.  NITERS_PROLOG is the number of
> >>>> -   iterations of prolog loop.  If it's integer const, the integer
> >>>> -   number is also passed by INT_NITERS_PROLOG.  VF is vector factor;
> >>>> -   TH is the threshold for vectorized loop if CHECK_PROFITABILITY is
> >>>> -   true.  This function also store upper bound of the result in BOUND.  */
> >>>> +/* Calculate the number of iterations above which vectorized loop will be
> >>>> +   preferred than scalar loop.  NITERS_PROLOG is the number of iterations
> >>>> +   of prolog loop.  If it's integer const, the integer number is also passed
> >>>> +   in INT_NITERS_PROLOG.  BOUND_PROLOG is the upper bound (included) of
> >>>> +   number of iterations of prolog loop.  VFM1 is vector factor minus one.
> >>>> +   If CHECK_PROFITABILITY is true, TH is the threshold below which scalar
> >>>> +   (rather than vectorized) loop will be executed.  This function stores
> >>>> +   upper bound (included) of the result in BOUND_SCALAR.  */
> >>>>
> >>>>  static tree
> >>>>  vect_gen_scalar_loop_niters (tree niters_prolog, int int_niters_prolog,
> >>>> -                          int bound_prolog, int vf, int th, int *bound,
> >>>> -                          bool check_profitability)
> >>>> +                          int bound_prolog, int vfm1, int th,
> >>>> +                          int *bound_scalar, bool check_profitability)
> >>>>  {
> >>>>    tree type = TREE_TYPE (niters_prolog);
> >>>>    tree niters = fold_build2 (PLUS_EXPR, type, niters_prolog,
> >>>> -                          build_int_cst (type, vf));
> >>>> +                          build_int_cst (type, vfm1));
> >>>>
> >>>> -  *bound = vf + bound_prolog;
> >>>> +  *bound_scalar = vfm1 + bound_prolog;
> >>>>    if (check_profitability)
> >>>>      {
> >>>> -      th++;
> >>>> +      /* TH indicates the minimum niters of vectorized loop, while we
> >>>> +      compute the maximum niters of scalar loop.  */
> >>>> +      th--;
> >>>
> >>> Are you sure about this last change?  It looks like it should be dropping
> >> Hi Richard,
> >> Thanks for spotting this.  I vaguely remember I got this from the way
> >> how niter/th was checked in previous peeling code, but did't double
> >> check it now.  I tend to believe there is inconsistence about th,
> >> especially with comment like:
> >>
> >>   /* Threshold of number of iterations below which vectorzation will not be
> >>      performed. It is calculated from MIN_PROFITABLE_ITERS and
> >>      PARAM_MIN_VECT_LOOP_BOUND. */
> >>   unsigned int th;
> >>
> >> I also tend to believe the inconsistence was introduced partly because
> >> niters in vectorizer stands for latch_niters + 1, while latch_niters
> >> in rest of the compiler.
> >>
> >> and...,
> >>
> >>> the increment rather than replacing it with a decrement.
> >>>
> >>> It looks like the threshold is already the maximum niters for the scalar
> >>> loop.  It's set by:
> >>>
> >>>   min_scalar_loop_bound = ((PARAM_VALUE (PARAM_MIN_VECT_LOOP_BOUND)
> >>>                             * vectorization_factor) - 1);
> >>>
> >>>   /* Use the cost model only if it is more conservative than user specified
> >>>      threshold.  */
> >>>   th = (unsigned) min_scalar_loop_bound;
> >>>   if (min_profitable_iters
> >>>       && (!min_scalar_loop_bound
> >>>           || min_profitable_iters > min_scalar_loop_bound))
> >>>     th = (unsigned) min_profitable_iters;
> >>>
> >>>   LOOP_VINFO_COST_MODEL_THRESHOLD (loop_vinfo) = th;
> >>>
> >>> (Note the "- 1" in the min_scalar_loop_bound.  The multiplication result
> >>> is the minimum niters for the vector loop.)
> >> To be honest, min_scalar_loop_bound is more likely for something's
> >> lower bound which is the niters for the vector loop.  If it refers to
> >> the niters scalar loop, it is in actuality the "max" value we should
> >> use.  I am not quite sure here, partly because I am not a native
> >> speaker.
> >>
> >>>
> >>> min_profitable_iters sounds like it _ought_ to be the minimum niters for
> >>> which the vector loop is used, but vect_estimate_min_profitable_iters
> >>> instead returns the largest niters for which the scalar loop should be
> >>> preferred:
> >>>
> >>>   /* Cost model disabled.  */
> >>>   if (unlimited_cost_model (LOOP_VINFO_LOOP (loop_vinfo)))
> >>>     {
> >>>       dump_printf_loc (MSG_NOTE, vect_location, "cost model disabled.\n");
> >>>       *ret_min_profitable_niters = 0;
> >>>       *ret_min_profitable_estimate = 0;
> >>>       return;
> >>>     }
> >>>   [...]
> >>>   /* Because the condition we create is:
> >>>      if (niters <= min_profitable_iters)
> >>>        then skip the vectorized loop.  */
> >>>   min_profitable_iters--;
> >>>   [...]
> >>>   min_profitable_estimate --;
> >>>
> >>> Also, when deciding whether the threshold needs to be used, we have:
> >>>
> >>>   th = LOOP_VINFO_COST_MODEL_THRESHOLD (loop_vinfo);
> >>>   if (th >= LOOP_VINFO_VECT_FACTOR (loop_vinfo) - 1
> >>>       && !LOOP_VINFO_NITERS_KNOWN_P (loop_vinfo))
> >>>     {
> >>>       if (dump_enabled_p ())
> >>>         dump_printf_loc (MSG_NOTE, vect_location,
> >>>                          "Profitability threshold is %d loop iterations.\n",
> >>>                          th);
> >>>       check_profitability = true;
> >>>     }
> >> Yes, this indeed indicates that th refers to the maximum niters of the
> >> scalar loop.  Anyway, we should resolve the inconsistence and make it
> >> more clear in variable names etc..
> >
> > Yeah, agree the variable names are really confusing.  Given the choice
> > between changing the names to match the current meaning and changing
> > the meaning to match the current names, the latter looks like it would
> > give cleaner code.  I'm happy to do that it that sounds like the way
> > to go.
> I am all for this direction, but Richard B may have some to say here.
> BTW, will you also fix the other inconsistence you found?

I'm all to making the code clearer but I guess it will remain a
tricky area.  Each time I need to change sth there I need half an
hour to recap how all the stuff is tied together...

So please go forward.  Many bonus points if you manage to
create testcases covering the mistakes you fix.

Richard.

> Thanks,
> bin
> 
> >
> > FWIW, I came across this while trying make sense of the "+ 1" in:
> >
> >   /* Decide whether we need to create an epilogue loop to handle
> >      remaining scalar iterations.  */
> >   th = ((LOOP_VINFO_COST_MODEL_THRESHOLD (loop_vinfo) + 1)
> >         / LOOP_VINFO_VECT_FACTOR (loop_vinfo))
> >        * LOOP_VINFO_VECT_FACTOR (loop_vinfo);
> >
> > Thanks,
> > Richard
> 
> 

-- 
Richard Biener <rguenther@suse.de>
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]