[PATCH] Improve *avx_vperm_broadcast_*

Jakub Jelinek jakub@redhat.com
Tue May 31 15:00:00 GMT 2016


On Tue, May 31, 2016 at 06:54:14AM -0700, H.J. Lu wrote:
> On Mon, May 23, 2016 at 10:15 AM, Jakub Jelinek <jakub@redhat.com> wrote:
> > Hi!
> >
> > The vbroadcastss and vpermilps insns are already in AVX512F & AVX512VL,
> > so they can be used with the v constraint instead of x.  The splitter
> > case, where for AVX we emit vpermilps plus vperm2f128, is more
> > problematic, because the latter insn has no EVEX encoding.  But we can
> > get the same effect with vshuff32x4 when both source operands are the
> > same.
> > Alternatively, I think we could replace the vpermilps and vshuff32x4
> > insns with the AVX512VL arbitrary permutations; the question is which
> > is faster, because we'd need to load the mask from memory.
> >
> > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
> >
> > 2016-05-23  Jakub Jelinek  <jakub@redhat.com>
> >
> >         * config/i386/sse.md
> >         (<mask_codefor>avx512vl_shuf_<shuffletype>32x4_1<mask_name>): Rename
> >         to ...
> >         (avx512vl_shuf_<shuffletype>32x4_1<mask_name>): ... this.
> >         (*avx_vperm_broadcast_v4sf): Use v constraint instead of x.  Use
> >         maybe_evex prefix instead of vex.
> >         (*avx_vperm_broadcast_<mode>): Use v constraint instead of x.  Handle
> >         EXT_REX_SSE_REG_P (op0) case in the splitter.
> >
> >         * gcc.target/i386/avx512vl-vbroadcast-3.c: New test.
> >
> 
> The new test fails on x32 due to 32-bit register in address.  This
> patch fixes it.  Tested on x86-64.  OK for trunk?

Ok, thanks.
> 2016-05-31  H.J. Lu  <hongjiu.lu@intel.com>
> 
> * gcc.target/i386/avx512vl-vbroadcast-3.c: Scan %\[re\]di
> instead of %rdi.
> * gcc.target/i386/avx512vl-vcvtps2ph-3.c: Likewise.

	Jakub


