[Bug target/106022] [12/13 Regression] Enable vectorizer generates extra load
rguenth at gcc dot gnu.org
Thu Jun 23 06:19:54 GMT 2022
--- Comment #11 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to H.J. Lu from comment #9)
> (In reply to Richard Biener from comment #8)
> > (In reply to H.J. Lu from comment #6)
> > > Created attachment 53169 [details]
> > > A patch
> > >
> > > This patch multiplies the vector store cost by the number of scalar elements
> > > in a word to properly compare scalar store cost against vector store cost.
> > But that's not "properly" but "wrong" ...
> > Note we already cost the vector load from the constant pool so the vector
> > side costing is correct.
> > What's eventually imprecise is the scalar cost where you could anticipate
> > store merging, but adjusting the vector cost side is just wrong.
> I tried to adjust the scalar cost. When the scalar cost of storing a byte
> is 6, dividing it by 8 (the number of scalar elements in a word) becomes 0.
> Will it work?
No, I think you would need to pattern match an actual store sequence,
for example by looking at

  if (STMT_VINFO_GROUPED_ACCESS (stmt_info)
      && pow2p_hwi (DR_GROUP_STORE_COUNT (stmt_info)))
    /* cost a possibly merged store only once (but with larger mode?) */
    if (DR_GROUP_FIRST_ELEMENT (stmt_info) == stmt_info)

So the idea is to cost the whole sequence of scalar stores a single time.
Store-merging also handles non-QImode stores, btw.
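
For illustration only, here is a minimal standalone sketch of the two
approaches discussed above. It is not GCC code: struct store_group,
is_pow2, scaled_elt_cost, store_cost and group_cost are hypothetical names
invented for this example. It also assumes the merged store is charged the
plain scalar store cost, which leaves the "larger mode?" question open.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a group of adjacent scalar stores.
   This is not a GCC data structure.  */
struct store_group
{
  unsigned count;     /* number of scalar stores in the group */
  unsigned elt_cost;  /* cost of a single scalar store */
};

/* True if X has exactly one bit set.  */
static bool
is_pow2 (unsigned x)
{
  return x != 0 && (x & (x - 1)) == 0;
}

/* Per-element scaling: integer division truncates, so a small cost
   divided by the number of elements per word can collapse to zero.  */
static unsigned
scaled_elt_cost (unsigned elt_cost, unsigned elts_per_word)
{
  assert (elts_per_word > 0);
  return elt_cost / elts_per_word;
}

/* Cost contributed by the store at position IDX of group G.  A group
   with a power-of-two number of stores is assumed to be merged, so only
   its first element is charged; all other groups pay per store.  */
static unsigned
store_cost (const struct store_group *g, unsigned idx)
{
  assert (idx < g->count);
  if (is_pow2 (g->count))
    return idx == 0 ? g->elt_cost : 0;
  return g->elt_cost;
}

/* Total scalar cost of group G, summed store by store.  */
static unsigned
group_cost (const struct store_group *g)
{
  unsigned total = 0;
  for (unsigned i = 0; i < g->count; i++)
    total += store_cost (g, i);
  return total;
}
```

With a byte store cost of 6, scaled_elt_cost (6, 8) truncates to 0, whereas
group_cost on a group of eight such stores charges the cost once at the
group leader and never drops to zero. A group of three stores is not a
power of two, so it is still charged per store.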