[PATCH] Enable GCC support for AVX512_VP2INTERSECT.
Uros Bizjak
ubizjak@gmail.com
Thu Jun 20 11:37:00 GMT 2019
On Thu, Jun 20, 2019 at 12:54 PM Hongtao Liu <crazylht@gmail.com> wrote:
>
> On Thu, Jun 20, 2019 at 2:13 PM Uros Bizjak <ubizjak@gmail.com> wrote:
> >
> > On Thu, Jun 20, 2019 at 7:36 AM Hongtao Liu <crazylht@gmail.com> wrote:
> > >
> > > On Sat, Jun 8, 2019 at 4:12 AM Uros Bizjak <ubizjak@gmail.com> wrote:
> > > >
> > > > On 6/7/19, H.J. Lu <hjl.tools@gmail.com> wrote:
> > > >
> > > > >> > > +/* Register pair. */
> > > > >> > > +VECTOR_MODES_WITH_PREFIX (P, INT, 2); /* P2QI */
> > > > >> > > +VECTOR_MODES_WITH_PREFIX (P, INT, 4); /* P2HI P4QI */
> > > > >> > >
> > > > >> > > I think
> > > > >> > >
> > > > >> > > INT_MODE (P2QI, 16);
> > > > >> > > INT_MODE (P2HI, 32);
> > > > >> > >
> > > > >> > > with the above subreg approach should work.
> Yes, it works.
>
> But I couldn't figure out how pass_reload correctly handles such a subreg.
> Do you have suggestions such as "which function I can dig into first" or
> "which piece of code handles subregs"?
I'm really not an expert in this part of the compiler, so I'll leave
the answer for someone else.
> > > > >> > >
> > > > >> >
> > > > >> > I don't think subreg works on pseudo registers with non-zero
> > > > >> > offset. validate_subreg has
> > > > >> >
> > > > >> > if (maybe_lt (osize, regsize)
> > > > >> > && ! (lra_in_progress && (FLOAT_MODE_P (imode) || FLOAT_MODE_P
> > > > >> > (omode))))
> > > > >> > {
> > > > >> > /* It is invalid for the target to pick a register size for a
> > > > >> > mode
> > > > >> > that isn't ordered wrt to the size of that mode. */
> > > > >> > poly_uint64 block_size = ordered_min (isize, regsize);
> > > > >> > unsigned int start_reg;
> > > > >> > poly_uint64 offset_within_reg;
> > > > >> > if (!can_div_trunc_p (offset, block_size, &start_reg,
> > > > >> > &offset_within_reg)
> > > > >> > || (BYTES_BIG_ENDIAN
> > > > >> > ? maybe_ne (offset_within_reg, block_size - osize)
> > > > >> > : maybe_ne (offset_within_reg, 0U)))
> > > > >> > return false;
> > > > >>
> > > > >> It works with SImode subregs of DImode values on 32bit targets. Please
> > > > >> look for calls to gen_highpart, one concrete example is in
> > > > >> atomic_compare_and_swap<mode>.
> > > > >>
> > > > >
> > > > > It works because of
> > > > >
> > > > > #define REGMODE_NATURAL_SIZE(MODE) UNITS_PER_WORD
> > > > >
> > > > > > and only works for the SImode high part of a DImode value.
> > > > >
> > > > > P2QI and P2HI are 2 special modes of mask register pair for
> > > > > 2 instructions. Do we want to make them more generic?
> > > >
> > > > If enhancing the referred define means that we don't need two
> > > > artificial instructions and leave all heavy lifting to the existing
> > > Do you mean that we take P2HI and P2QI as normal vector modes,
> > > and reuse ix86_expand_vector_* things?
> > > But still two artificial instructions can't be avoided.
> > > > generic functionality, then this is the way to go.
> >
> > No, declare them as integer modes and use subregs to access high and
> > low register. This should work in the same way as SImode hard
> > registers are accessed in DImode pair for 32bit targets.
> >
> > Uros.
>
> Update patch.
Does using gen_lowpart/gen_highpart instead of simplify_gen_subreg work?
These two are just handy wrappers around simplify_gen_subreg. Other than
that, the patch LGTM.
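
For reference, here is a minimal standalone sketch of the offset arithmetic
discussed in the quoted text. It is not GCC code: it models only the quoted
fragment of validate_subreg with plain integers instead of poly_uint64,
ignores the lra_in_progress/FLOAT_MODE_P exception and every other check
that function performs, and assumes word and byte endianness agree. The
toy_* names and the sizes used in toy_examples are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of the quoted validate_subreg fragment.  All sizes and
   offsets are in bytes and are assumed to be non-zero constants.  */
bool
toy_subreg_offset_ok (unsigned isize, unsigned osize, unsigned regsize,
                      unsigned offset, bool big_endian)
{
  if (osize < regsize)
    {
      /* Stand-in for ordered_min (isize, regsize).  */
      unsigned block_size = isize < regsize ? isize : regsize;
      /* Stand-in for the remainder computed by can_div_trunc_p.  */
      unsigned offset_within_reg = offset % block_size;
      unsigned expected = big_endian ? block_size - osize : 0;
      if (offset_within_reg != expected)
        return false;
    }
  return true;
}

/* Byte offset of the low part of an isize-byte value when accessed in an
   osize-byte mode (simplified: word and byte endianness agree).  */
unsigned
toy_lowpart_offset (unsigned isize, unsigned osize, bool big_endian)
{
  return big_endian ? isize - osize : 0;
}

/* Byte offset of the corresponding high part, same simplification.  */
unsigned
toy_highpart_offset (unsigned isize, unsigned osize, bool big_endian)
{
  return big_endian ? 0 : isize - osize;
}

/* A few hypothetical little-endian cases.  */
void
toy_examples (void)
{
  /* 4-byte high part of an 8-byte value with 4-byte registers: the outer
     size is not smaller than the register size, so the check is skipped.  */
  assert (toy_highpart_offset (8, 4, false) == 4);
  assert (toy_subreg_offset_ok (8, 4, 4, 4, false));

  /* 2-byte piece at offset 2 of a 4-byte value with 8-byte registers:
     non-zero offset within the register, so it is rejected.  */
  assert (!toy_subreg_offset_ok (4, 2, 8, 2, false));

  /* 1-byte piece at offset 8 of a 16-byte value with 8-byte registers:
     the offset lands on a register boundary, so it is accepted.  */
  assert (toy_subreg_offset_ok (16, 1, 8, 8, false));
}
```

In this model a narrow subreg is accepted only when its offset falls on a
register-sized block boundary, which matches the observation above that
the SImode-in-DImode case relies on REGMODE_NATURAL_SIZE being
UNITS_PER_WORD.
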
Uros.