This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH PR85804]Fix wrong code by correcting bump step computation in vector(1) load of single-element group access


On Wed, May 23, 2018 at 1:10 PM Bin.Cheng <amker.cheng@gmail.com> wrote:

> On Wed, May 23, 2018 at 12:01 PM, Richard Sandiford
> <richard.sandiford@linaro.org> wrote:
> > "Bin.Cheng" <amker.cheng@gmail.com> writes:
> >> On Wed, May 23, 2018 at 11:19 AM, Richard Biener
> >> <richard.guenther@gmail.com> wrote:
> >>> On Tue, May 22, 2018 at 2:11 PM Richard Sandiford <
> >>> richard.sandiford@linaro.org> wrote:
> >>>
> >>>> Richard Biener <richard.guenther@gmail.com> writes:
> >>>> > On Mon, May 21, 2018 at 3:14 PM Bin Cheng <Bin.Cheng@arm.com>
wrote:
> >>>> >
> >>>> >> Hi,
> >>>> >> As reported in PR85804, bump step is wrongly computed for
vector(1)
> >>> load
> >>>> > of
> >>>> >> single-element group access.  This patch fixes the issue by
correcting
> >>>> > bump
> >>>> >> step computation for the specific VMAT_CONTIGUOUS case.
> >>>> >
> >>>> >> Bootstrap and test on x86_64 and AArch64 ongoing, is it OK?
> >>>> >
> >>>> > To me it looks like the classification as VMAT_CONTIGUOUS is bogus.
> >>>> > We'd fall into the grouped_load case otherwise which should handle
> >>>> > the situation correctly?
> >>>> >
> >>>> > Richard?
> >>>
> >>>> Yeah, I agree.  I mentioned to Bin privately that that was probably
> >>>> a misstep and that we should instead continue to treat them as
> >>>> VMAT_CONTIGUOUS_PERMUTE, but simply select the required vector
> >>>> from the array of loaded vectors, instead of doing an actual permute.
> >>>
> >>> I'd classify them as VMAT_ELEMENTWISE instead.  CONTIGUOUS
> >>> should be only for no-gap vectors.  How do we classify single-element
> >>> interleaving?  That would be another classification choice.
> >> Yes, I suggested this offline too, but Richard may have more to say
about this.
> >> One thing worth noting is classifying it as VMAT_ELEMENTWISE would
> >> disable vectorization in this case because of cost model issue as
> >> commented at the end of get_load_store_type.
> >
> > Yeah, that's the problem.  Using VMAT_ELEMENTWISE also means that we
> > use a scalar load and then insert it into a vector, whereas all we want
> > (and all we currently generate) is a single vector load.
> >
> > So if we classify them as VMAT_ELEMENTWISE, they'll be a special case
> > for both costing (vector load rather than scalar load and vector
> > construct) and code-generation.
> Looks to me it will be a special case for VMAT_ELEMENTWISE or
> VMAT_CONTIGUOUS* anyway, probably VMAT_CONTIGUOUS is the easiest one?

The question is why we do

   if (memory_access_type == VMAT_GATHER_SCATTER
       || (!slp && memory_access_type == VMAT_CONTIGUOUS))
     grouped_load = false;

I think for the particular testcase removing this would fix the issue as
well?

Richard.

> Thanks,
> bin
> >
> > Thanks,
> > Richard


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]