[Bug target/103771] [12 Regression] Missed vectorization under -mavx512f -mavx512vl after r12-5489

Thu Jan 13 13:42:32 GMT 2022

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103771

--- Comment #16 from rguenther at suse dot de <rguenther at suse dot de> ---
On Thu, 13 Jan 2022, crazylht at gmail dot com wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103771
> 
> --- Comment #15 from Hongtao.liu <crazylht at gmail dot com> ---
> (In reply to rguenther@suse.de from comment #12)
> > On Thu, 13 Jan 2022, crazylht at gmail dot com wrote:
> > 
> > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103771
> > > 
> > > --- Comment #10 from Hongtao.liu <crazylht at gmail dot com> ---
> > > with
> > > @@ -12120,7 +12120,8 @@ supportable_narrowing_operation (enum tree_code code,
> > >        c1 = VEC_PACK_TRUNC_EXPR;
> > >        if (VECTOR_BOOLEAN_TYPE_P (narrow_vectype)
> > >           && VECTOR_BOOLEAN_TYPE_P (vectype)
> > > -         && TYPE_MODE (narrow_vectype) == TYPE_MODE (vectype)
> > > +         && (TYPE_MODE (narrow_vectype) == TYPE_MODE (vectype)
> > > +             || known_lt (TYPE_VECTOR_SUBPARTS (vectype), BITS_PER_UNIT))
> > >           && SCALAR_INT_MODE_P (TYPE_MODE (vectype)))
> > 
> > I think we instead simply want
> > 
> >          if (VECTOR_BOOLEAN_TYPE_P (narrow_vectype)
> >              && TYPE_PRECISION (TREE_TYPE (narrow_vectype)) == 1
> >              && VECTOR_BOOLEAN_TYPE_P (vectype)
> >              && TYPE_PRECISION (TREE_TYPE (vectype)) == 1)
> > 
> > note the docs of vec_pack_sbool_trunc say
> > 
> > This instruction pattern is used when all the vector input and output
> > operands have the same scalar mode @var{m} and thus using
> > @code{vec_pack_trunc_@var{m}} would be ambiguous.
> > 
> > It also says "_Narrow_ and merge the elements of two vectors."  I think
> > "narrow" is misleading here, and so is _trunc in the optab name.  So
> > with the above it suggests we could have used vec_pack_trunc_hi here?
> > 
> > To avoid breaking things for the VnBImode using targets we probably
> > want to retain the SCALAR_INT_MODE_P (prev_mode) check.  And we
> > probably want to adjust the documentation a bit.
> > 
> > Is this all with my pasted pattern patch, or is it still with the
> > weird inserted conversion?
> 
> It's w/o your patch. I'm trying to handle the weird conversion in multiple
> steps (first pack QI:4 -> QI:8 through vec_pack_sbool_trunc_qi, then pack QI:8
> -> HI:16 through vec_pack_sbool_trunc_hi). But on the other hand, the weird
> inserted conversion shouldn't exist.
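
For reference, the two-step pack described above can be modelled on plain
integers.  This is only a sketch under the assumption that packing two
scalar-mode boolean masks amounts to concatenating their bits; the helper
name pack_masks is hypothetical and this is not the code GCC emits for
vec_pack_sbool_trunc_qi or vec_pack_sbool_trunc_hi.

```c
#include <stdint.h>

/* Hypothetical model of a mask pack: concatenate two masks that each
   hold n valid bits (n <= 8).  'lo' ends up in bits [0, n) and 'hi' in
   bits [n, 2n).  Bits above n in either input are ignored.  */
static uint16_t
pack_masks (uint16_t lo, uint16_t hi, unsigned n)
{
  uint16_t keep = (uint16_t) ((1u << n) - 1u);
  return (uint16_t) ((lo & keep) | ((hi & keep) << n));
}
```

Under that assumption, four 4-bit masks a, b, c, d would reach one 16-bit
mask via pack_masks (pack_masks (a, b, 4), pack_masks (c, d, 4), 8), which
mirrors the QI:4 -> QI:8 -> HI:16 sequence.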

But the weird conversion suggests packing { 0, 1, 0, 1 } as
{ 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, ... }
thus expanding each bit to 8 bits.  So it's rather an unpacking :/
As said, the scalar conversion does not make any sense...
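
To make the bit pattern concrete, here is a minimal sketch of what such a
conversion would compute if each mask bit were widened to a full byte.  The
helper name expand_mask_bits and the limit of four elements are assumptions
made for illustration only; nothing here is taken from the vectorizer.

```c
#include <stdint.h>

/* Hypothetical model: widen each of the low n (at most 4) mask bits
   into a whole byte, so bit i of 'mask' becomes byte i of the result
   (0x00 or 0xFF).  */
static uint32_t
expand_mask_bits (uint8_t mask, unsigned n)
{
  uint32_t out = 0;
  for (unsigned i = 0; i < n && i < 4; i++)
    if (mask & (1u << i))
      out |= (uint32_t) 0xFFu << (8 * i);
  return out;
}
```

With elements { 0, 1, 0, 1 } stored as mask 0xA (element 0 in bit 0), the
result is 0xFF00FF00: 32 bits produced from 4, which is why this looks like
an unpacking rather than a pack.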

But maybe I'm missing something very obvious?