This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH][vect] PR 92351: When peeling for alignment make alignment of epilogues unknown
On Fri, 8 Nov 2019, Richard Sandiford wrote:
> Richard Biener <rguenther@suse.de> writes:
> > On Thu, 7 Nov 2019, Andre Vieira (lists) wrote:
> >> On 07/11/2019 14:00, Richard Biener wrote:
> >> > On Thu, 7 Nov 2019, Andre Vieira (lists) wrote:
> >> >
> >> >> Hi,
> >> >>
> >> >> PR92351 reports a bug in which a wrongly aligned load is generated for an
> >> >> epilogue of a main loop for which we peeled for alignment. There is no way
> >> >> to
> >> >> guarantee that epilogue data accesses are aligned when the main loop is
> >> >> peeling for alignment.
> >> >>
> >> >> I also had to split vect-peel-2.c as there were scans there for the number
> >> >> of
> >> >> unaligned accesses that were vectorized, thanks to this change that now
> >> >> depends on whether we are vectorizing the epilogue, which will also contain
> >> >> unaligned accesses. Since not all targets need to be able to vectorize the
> >> >> epilogue I decided to disable epilogue vectorization for the version in
> >> >> which
> >> >> we scan the dumps and add a version that attempts epilogue vectorization
> >> >> but
> >> >> does not scan the dumps.
> >> >>
> >> >> Bootstrapped and regression tested on x86_64 and aarch64.
> >> >>
> >> >> Is this OK for trunk?
> >> >
> >> > @@ -938,6 +938,18 @@ vect_compute_data_ref_alignment (dr_vec_info
> >> > *dr_info)
> >> > = exact_div (vect_calculate_target_alignment (dr_info),
> >> > BITS_PER_UNIT);
> >> > DR_TARGET_ALIGNMENT (dr_info) = vector_alignment;
> >> > + /* If the main loop has peeled for alignment we have no way of knowing
> >> > + whether the data accesses in the epilogues are aligned. We can't at
> >> > + compile time answer the question whether we have entered the main
> >> > loop
> >> > or
> >> > + not. Fixes PR 92351. */
> >> > + if (loop_vinfo)
> >> > + {
> >> > + loop_vec_info orig_loop_vinfo = LOOP_VINFO_ORIG_LOOP_INFO
> >> > (loop_vinfo);
> >> > + if (orig_loop_vinfo
> >> > + && LOOP_VINFO_PEELING_FOR_ALIGNMENT (orig_loop_vinfo) != 0)
> >> > + return;
> >> > + }
> >> >
> >> > so I'm not sure this is the correct place to do the fixup. Isn't the
> >> > above done when analyzing the loops with different vector size/mode?
> >> > So we don't yet know whether we analyze the loop as epilogue or
> >> > not epilogue? Looks like we at the moment always choose the
> >> > very first loop we analyze successfully as "main" loop?
> >> >
> >> > So, can we do this instead in update_epilogue_loop_vinfo? There
> >> > we should also know whether we created the jump-around the
> >> > main vect loop.
> >> >
> >>
> >> So we do know we are analyzing it as an epilogue, that is the only case
> >> orig_loop_vinfo is set.
> >>
> >> The reason why we shouldn't do it in update_epilogue_loop_vinfo is that the
> >> target might not know how to vectorize memory accesses for unaligned memory
> >> for the given VF. Or maybe it does but is too expensive don't know if we
> >> currently check that though. I do not have an example but this is why I
> >> believe it would be better to do it during analysis. I thought it had been you
> >> who alerted me to this, but maybe it was Sandiford, or maybe I dreamt it up ;)
> >
> > It was probably me, yes. But don't we have a catch-22 now? If we
> > have multiple vector sizes and as Richard, want to first compute
> > the "cheapest" to use as the main vectorized body don't we then have
> > to re-analyze the smaller vector sizes for epilogue use?
>
> It was a nice hack that we could vectorise as an epilogue even when
> choosing main loops, and optionally "promote" them later, but it's
> probably going to have to yield at some point anyway. E.g. from what
> Andre said on IRC yesterday, he might have to take peeling for gaps
> into account too.
>
> > So how do we handle this situation at the moment?
> >
> > I think during alignment peeling analysis we look whether a DR
> > absolutely needs to be aligned, that is, we use
> > vect_supportable_dr_alignment (*, true). If that returns
> > dr_unaligned_unsupported we should probably simply disable
> > epilogue vectorization if we didn't version for alignment
> > (or we know the vectorized loop was entered).
>
> I guess doing this based on the main loop would hard-code an assumption
> that the shorter vectors have the same sensitivity to alignment as
> longer vectors. Which is probably fine in practice, but it would
> be good to avoid if possible.
>
> > So, during analysis reject epilogues that have DRs with
> > dr_unaligned_unsupported but allow them as "main" loops still
> > (so disable epilogue vectorization for a main loop with such DRs).
> >
> > Then at update_epilogue_loop_vinfo time simply make alignment
> > unknown.
> >
> > Would that work?
>
> Agree it sounds like it would work. But at the moment we don't yet have
> a dr_unaligned_unsupported target that wants the "best loop" behaviour.
> Given that we might have to do what the vect_analyze_loop comment in
>
> https://gcc.gnu.org/ml/gcc-patches/2019-11/msg00296.html
>
> explains away anyway, it might not be worth the effort to support
> that case.
Hmm, OK.
So I'm fine with the original patch then. I guess we'll need to
"improve" things anyway at some point.
Thanks,
Richard.