[SLP] SLP vectorization: vectorize vector constructors
Richard Biener
rguenther@suse.de
Mon Nov 4 16:27:00 GMT 2019
On November 4, 2019 4:30:57 PM GMT+01:00, Joel Hutton <Joel.Hutton@arm.com> wrote:
>First copy bounced from mailing list.
> > > On 30/10/2019 13:49, Richard Biener wrote:
> > > >>
>> > >> * expr.c (store_constructor): Add case for constructor
>of
> > > vectors.
> > > > Why do you need this? The vectorizer already creates such
>CTORs. Any
> > > > testcase that you can show?
> > > >
> > > > typedef long v2di __attribute__((vector_size(16)));
> > > > v2di v;
> > > > void
> > > > foo()
> > > > {
> > > > v = (v2di){v[1], v[0]};
> > > > }
>> > There is a special case for single element vectors, where the
>vector
> > > mode and
> > > the element mode are the same.
>> > I've changed this slightly, as your solution of using VECTOR_MODE_P
>
>didn't
> > > work for my case where:
> > > emode = E_DImode
> > > eltmode = E_DImode
> > > On aarch64. As E_DImode is not a vector mode.
>> > Based on some checks I found in verify_gimple, I set the type of
>the
> > > replacement
>> > constructor to the same as the original constructor, as
>verify_gimple
> > > seems to
>> > expect flattened types. i.e. a vector of vectors of ints should
>have
> > > type vector
> > > of ints.
> >
> > Huh. On aarch64 for the above testcase I see mode V2DImode and
> > emode = eltmode = DImode. That's just a regular CTOR with
> > non-vector elements so not sure why you need any adjustment at all
> > here?
> >
> > It looks like your vectorization of the CTOR introduces a
> > V2DImode CTOR of vector(1) long elements which incidentially
> > have DImode. That's because somehow V2DI vectorization isn't
> > profitable but V1DI one is. In the end it's a noop transform
> > but the testcase shows that for V2DI vectorization we fail
> > to cost the CTOR in the scalar cost computation (you have to
> > include the pattern root there I guess).
> >
>Yes, sorry, I was unclear about this, it's the new constructor that the
>
>change is needed for.
>
> > Technically we feed valid GIMPLE to the expander so we indeed
> > shouldn't ICE. So I'd like to have the earlier keying on
> > vec_vec_init_p match the later assert we run into, thus
> > sth like
> >
> > Index: gcc/expr.c
> > ===================================================================
> > --- gcc/expr.c (revision 277765)
> > +++ gcc/expr.c (working copy)
> > @@ -6809,6 +6809,7 @@ store_constructor (tree exp, rtx target,
> > && n_elts.is_constant (&const_n_elts))
> > {
> > machine_mode emode = eltmode;
> > + bool vector_typed_elts_p = false;
> >
> > if (CONSTRUCTOR_NELTS (exp)
> > && (TREE_CODE (TREE_TYPE (CONSTRUCTOR_ELT (exp,
> > 0)->value))
> > @@ -6819,13 +6820,14 @@ store_constructor (tree exp, rtx target,
>> * TYPE_VECTOR_SUBPARTS (etype),
> > n_elts));
> > emode = TYPE_MODE (etype);
> > + vector_typed_elts_p = true;
> > }
>> icode = convert_optab_handler (vec_init_optab, mode,
>emode);
> > if (icode != CODE_FOR_nothing)
> > {
> > unsigned int n = const_n_elts;
> >
> > - if (emode != eltmode)
> > + if (vector_typed_elts_p)
> > {
> > n = CONSTRUCTOR_NELTS (exp);
> > vec_vec_init_p = true;
> >
> >
>I've used this, thank you.
>
> > > > @@ -4061,15 +4201,42 @@ vect_schedule_slp (vec_info *vinfo)
> > > > if (is_a <loop_vec_info> (vinfo))
> > > >
>> > > the ChangeLog doesn't mention this. I guess this isn't
>necessary
> > > > unless you fail vectorizing late which you shouldn't.
> > > >
> > > It's necessary to avoid:
> > >
> > > removing stmts twice for constructors of the form
>{_1,_1,_1,_1}
>> > removing stmts that define ssa names used elsewhere (which
>> > previously wasn't a consideration because the scalar_stmts were
>stores
> > > to memory, not assignments to ssa names)
> >
> > OK, but the code you are patching is just supposed to remove stores
> > from the final scalar stmt seeds - it doesn't expect any loads there
> > which is probably what happens. I'd instead add a
> >
> > /* Do not CSE the stmts feeding a CTOR, they might have uses
> > outside of the vectorized stmts. */
> > if (SLP_INSTANCE_ROOT_STMT (instance))
> > continue;
> >
> > before the loop over SLP_TREE_SCALAR_STMTS.
>Done.
>
> > + if (!subparts.is_constant () || !(subparts.to_constant ()
>> + == CONSTRUCTOR_NELTS
>(rhs)))
> > + continue;
> >
> > can be better written as if (maybe_ne (subparts, CONSTRUCTOR_NELTS
>(rhs)))
>Done.
>
> > +/* Vectorize the instance root. Return success or failure. */
> > +
> > +void
> >
> > since it's now void the comment has to be adjusted.
>Done.
>
> > + vect_schedule_slp_instance (node,
> > instance, bst_map);
> >
> > this now fits in one line.
>Done.
>
> > OK with those changes.
>Tested and bootstrapped on aarch64.
>OK?
Ok.
Thanks,
Richard.
>2019-11-04 Joel Hutton <Joel.Hutton@arm.com>
>
> * expr.c (store_constructor): Modify to handle single element
>vectors.
> * tree-vect-slp.c (vect_analyze_slp_instance): Add case for
>vector constructors.
> (vect_slp_check_for_constructors): New function.
> (vect_slp_analyze_bb_1): Call new function to check for vector
>constructors.
> (vectorize_slp_instance_root_stmt): New function.
> (vect_schedule_slp): Call new function to vectorize root stmt
>of vector constructors.
> * tree-vectorizer.h (SLP_INSTANCE_ROOT_STMT): New field.
>
>gcc/testsuite/ChangeLog:
>
>2019-11-04 Joel Hutton <Joel.Hutton@arm.com>
>
> * gcc.dg/vect/bb-slp-40.c: New test.
> * gcc.dg/vect/bb-slp-41.c: New test.
More information about the Gcc-patches
mailing list