This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
Re: [PATCH] Improve AVX512 sse movcc (PR target/88547)
- From: Jakub Jelinek <jakub at redhat dot com>
- To: Uros Bizjak <ubizjak at gmail dot com>
- Cc: "gcc-patches at gcc dot gnu dot org" <gcc-patches at gcc dot gnu dot org>
- Date: Thu, 20 Dec 2018 08:49:40 +0100
- Subject: Re: [PATCH] Improve AVX512 sse movcc (PR target/88547)
- References: <20181219232007.GL23305@tucnak> <CAFULd4af=Vpat===Qp1GaoRj4patAO=umuH+UjUQpg70RkTdQw@mail.gmail.com>
- Reply-to: Jakub Jelinek <jakub at redhat dot com>
On Thu, Dec 20, 2018 at 08:42:05AM +0100, Uros Bizjak wrote:
> > If one vcond argument is an all-ones (non-bool) vector and the other is all
> > zeros, we can use the vpmovm2? insns on AVX512{DQ,BW} (sometimes + VL).
> > When op_true is all ones and op_false all zeros, we emit large code that
> > the combiner often optimizes to that vpmovm2?; with the arguments swapped,
> > we emit vpxor + vpternlog + masked move (blend), while we could just
> > invert the mask with knot* and use vpmovm2?.
> >
> > Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for
> > trunk? The patch is large, but it is mostly reindentation, in the
> > attachment there is diff -ubpd variant of the i386.c changes to make it more
> > readable.
> >
> > 2018-12-19 Jakub Jelinek <jakub@redhat.com>
> >
> > PR target/88547
> > * config/i386/i386.c (ix86_expand_sse_movcc): For maskcmp, try to
> > emit vpmovm2? instruction perhaps after knot?. Reorganize code
> > so that it doesn't have to test !maskcmp in almost every conditional.
> >
> > * gcc.target/i386/pr88547-1.c: New test.
>
> LGTM, under assumption that interunit moves from mask reg to xmm regs are fast.
In a simple benchmark (calling these functions in a tight loop on an i9-7960X)
the performance is the same, just with shorter sequences.
Jakub