This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Re: [PATCH] Improve *avx_vperm_broadcast_*


On Mon, May 23, 2016 at 10:15 AM, Jakub Jelinek <jakub@redhat.com> wrote:
> Hi!
>
> The vbroadcastss and vpermilps insns are already in AVX512F & AVX512VL,
> so they can be used with v instead of x.  The splitter case, where for AVX
> we emit vpermilps plus vperm2f128, is more problematic, because the latter
> insn isn't available in EVEX.  But we can get the same effect with
> vshuff32x4 when both source operands are the same.
> Alternatively, I think we could replace the vpermilps and vshuff32x4 insns
> with the AVX512VL arbitrary permutations; the question is
> which is faster, because we'd need to load the mask from memory.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
>
> 2016-05-23  Jakub Jelinek  <jakub@redhat.com>
>
>         * config/i386/sse.md
>         (<mask_codefor>avx512vl_shuf_<shuffletype>32x4_1<mask_name>): Rename
>         to ...
>         (avx512vl_shuf_<shuffletype>32x4_1<mask_name>): ... this.
>         (*avx_vperm_broadcast_v4sf): Use v constraint instead of x.  Use
>         maybe_evex prefix instead of vex.
>         (*avx_vperm_broadcast_<mode>): Use v constraint instead of x.  Handle
>         EXT_REX_SSE_REG_P (op0) case in the splitter.
>
>         * gcc.target/i386/avx512vl-vbroadcast-3.c: New test.
>

The new test fails on x32 because the address there uses a 32-bit
register (%edi rather than %rdi).  This patch fixes it by making the
scans accept either register.  Tested on x86-64.  OK for trunk?
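
As a rough illustration of why the relaxed scan passes on both ABIs, here
is a small Python sketch.  It assumes that Tcl turns "\[" into a literal
"[" before the regexp is compiled, so the pattern contains the character
class [re].  The two assembly lines are written by hand for illustration
and are not actual compiler output; count_matches is a hypothetical
helper that only approximates what scan-assembler-times counts.

```python
import re

# Patterns as the regexp engine would see them after Tcl backslash
# substitution (assumption: "\[" becomes "[", "\n" and "\r" stay escapes).
OLD_PATTERN = re.compile(r"vbroadcastss[^\n\r]*%rdi[^\n\r]*%xmm16")
PATTERN = re.compile(r"vbroadcastss[^\n\r]*%[re]di[^\n\r]*%xmm16")

# Hypothetical assembly lines, hand-written for this sketch.
LP64_LINE = "\tvbroadcastss\t(%rdi), %xmm16"  # 64-bit address register
X32_LINE = "\tvbroadcastss\t(%edi), %xmm16"   # 32-bit address register


def count_matches(pattern, asm):
    """Count non-overlapping matches of pattern in asm."""
    return len(pattern.findall(asm))


for name, line in (("lp64", LP64_LINE), ("x32", X32_LINE)):
    print(name, count_matches(OLD_PATTERN, line), count_matches(PATTERN, line))
```

With these sample lines the old pattern only matches the %rdi form, while
the new one matches both, which is the behaviour the patch relies on.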

Thanks.

H.J.
----
2016-05-31  H.J. Lu  <hongjiu.lu@intel.com>

	* gcc.target/i386/avx512vl-vbroadcast-3.c: Scan %\[re\]di
	instead of %rdi.
	* gcc.target/i386/avx512vl-vcvtps2ph-3.c: Likewise.

diff --git a/gcc/testsuite/gcc.target/i386/avx512vl-vbroadcast-3.c b/gcc/testsuite/gcc.target/i386/avx512vl-vbroadcast-3.c
index d981fe4..7233398 100644
--- a/gcc/testsuite/gcc.target/i386/avx512vl-vbroadcast-3.c
+++ b/gcc/testsuite/gcc.target/i386/avx512vl-vbroadcast-3.c
@@ -150,9 +150,9 @@ f16 (V2 *x)
   asm volatile ("" : "+v" (a));
 }

-/* { dg-final { scan-assembler-times "vbroadcastss\[^\n\r]*%rdi\[^\n\r]*%xmm16" 4 } } */
+/* { dg-final { scan-assembler-times "vbroadcastss\[^\n\r]*%\[re\]di\[^\n\r]*%xmm16" 4 } } */
 /* { dg-final { scan-assembler-times "vbroadcastss\[^\n\r]*%xmm16\[^\n\r]*%ymm16" 3 } } */
-/* { dg-final { scan-assembler-times "vbroadcastss\[^\n\r]*%rdi\[^\n\r]*%ymm16" 3 } } */
+/* { dg-final { scan-assembler-times "vbroadcastss\[^\n\r]*%\[re\]di\[^\n\r]*%ymm16" 3 } } */
 /* { dg-final { scan-assembler-times "vpermilps\[^\n\r]*\\\$0\[^\n\r]*%xmm16\[^\n\r]*%xmm16" 1 } } */
 /* { dg-final { scan-assembler-times "vpermilps\[^\n\r]*\\\$85\[^\n\r]*%xmm16\[^\n\r]*%xmm16" 1 } } */
 /* { dg-final { scan-assembler-times "vpermilps\[^\n\r]*\\\$170\[^\n\r]*%xmm16\[^\n\r]*%xmm16" 1 } } */
diff --git a/gcc/testsuite/gcc.target/i386/avx512vl-vcvtps2ph-3.c b/gcc/testsuite/gcc.target/i386/avx512vl-vcvtps2ph-3.c
index 2fd2215..c2e3f01 100644
--- a/gcc/testsuite/gcc.target/i386/avx512vl-vcvtps2ph-3.c
+++ b/gcc/testsuite/gcc.target/i386/avx512vl-vcvtps2ph-3.c
@@ -38,4 +38,4 @@ f3 (__m256 x, __v8hi *y)
   *y = (__v8hi) _mm256_cvtps_ph (a, 1);
 }

-/* { dg-final { scan-assembler "vcvtps2ph\[^\n\r]*\\\$1\[^\n\r]*%ymm16\[^\n\r]*%rdi" } } */
+/* { dg-final { scan-assembler "vcvtps2ph\[^\n\r]*\\\$1\[^\n\r]*%ymm16\[^\n\r]*%\[re\]di" } } */

