This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
[PATCH 00/10] i386: Properly encode xmm16-xmm31/ymm16-ymm31 for vector move
- From: "H.J. Lu" <hjl dot tools at gmail dot com>
- To: gcc-patches at gcc dot gnu dot org
- Cc: Jakub Jelinek <jakub at redhat dot com>, Jeffrey Law <law at redhat dot com>, Jan Hubicka <hubicka at ucw dot cz>, Uros Bizjak <ubizjak at gmail dot com>
- Date: Sat, 15 Feb 2020 07:26:18 -0800
- Subject: [PATCH 00/10] i386: Properly encode xmm16-xmm31/ymm16-ymm31 for vector move
This patch set was originally submitted in Feb 2019:
https://gcc.gnu.org/ml/gcc-patches/2019-02/msg01841.html
I broke it into 10 smaller patches for easier review.
On x86, when AVX and AVX512 are enabled, vector move instructions can
be encoded with either a 2-byte/3-byte VEX prefix (AVX) or a 4-byte EVEX
prefix (AVX512):
   0:	c5 f9 6f d1          	vmovdqa %xmm1,%xmm2
   4:	62 f1 fd 08 6f d1    	vmovdqa64 %xmm1,%xmm2
We prefer VEX encoding over EVEX since VEX is shorter. AVX512F alone
only supports 512-bit vector moves; AVX512F + AVX512VL also supports
128-bit and 256-bit vector moves. Mode attributes on x86 vector move
patterns indicate the target's preferred vector move encoding. For
register-to-register vector moves, we can use 512-bit vector move
instructions to move 128-bit/256-bit vectors if AVX512VL isn't available.
With AVX512F and AVX512VL, we should use VEX encoding for 128-bit/256-bit
vector moves if the upper 16 vector registers aren't used. This patch set
adds a function, ix86_output_ssemov, to generate vector moves:
1. If zmm registers are used, use EVEX encoding.
2. If xmm16-xmm31/ymm16-ymm31 registers aren't used, SSE or VEX encoding
   will be generated.
3. If xmm16-xmm31/ymm16-ymm31 registers are used:
   a. With AVX512VL, AVX512VL vector moves will be generated.
   b. Without AVX512VL, xmm16-xmm31/ymm16-ymm31 register to register
      move will be done with zmm register move.
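The three rules above can be sketched as a small standalone C function.
This is only an illustration of the decision order, not the code of
ix86_output_ssemov from the patch: the enum, the struct and the name
choose_encoding are hypothetical, and the sketch models only
register-to-register moves.

```c
#include <stdbool.h>

/* Hypothetical illustration only; none of these names come from GCC.  */
enum encoding
{
  ENC_SSE_OR_VEX,   /* rule 2: legacy SSE or VEX encoding */
  ENC_EVEX_VL,      /* rule 3a: 128-bit/256-bit AVX512VL move */
  ENC_EVEX_512      /* rules 1 and 3b: 512-bit zmm move */
};

struct move_info
{
  bool uses_zmm;       /* some operand is a zmm register */
  bool uses_upper16;   /* some operand is xmm16-xmm31 or ymm16-ymm31 */
  bool have_avx512vl;  /* target supports AVX512VL */
};

/* Pick an encoding for a register-to-register vector move, checking
   the rules in the same order as the list above.  */
static enum encoding
choose_encoding (struct move_info m)
{
  /* Rule 1: zmm operands always need EVEX.  */
  if (m.uses_zmm)
    return ENC_EVEX_512;

  /* Rule 2: only the lower 16 registers, so the shorter encoding works.  */
  if (!m.uses_upper16)
    return ENC_SSE_OR_VEX;

  /* Rule 3a: upper 16 registers with AVX512VL available.  */
  if (m.have_avx512vl)
    return ENC_EVEX_VL;

  /* Rule 3b: upper 16 registers without AVX512VL; widen to a zmm move.  */
  return ENC_EVEX_512;
}
```

In this sketch, a move between xmm17 and xmm18 on a target without
AVX512VL maps to ENC_EVEX_512, i.e. it would be emitted as a zmm
register move, while the same move between xmm1 and xmm2 maps to
ENC_SSE_OR_VEX regardless of AVX512VL.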
Tested on AVX2 and AVX512 with and without --with-arch=native.
H.J. Lu (10):
  i386: Properly encode vector registers in vector move
  i386: Use ix86_output_ssemov for XImode TYPE_SSEMOV
  i386: Use ix86_output_ssemov for OImode TYPE_SSEMOV
  i386: Use ix86_output_ssemov for TImode TYPE_SSEMOV
  i386: Use ix86_output_ssemov for DImode TYPE_SSEMOV
  i386: Use ix86_output_ssemov for SImode TYPE_SSEMOV
  i386: Use ix86_output_ssemov for TFmode TYPE_SSEMOV
  i386: Use ix86_output_ssemov for DFmode TYPE_SSEMOV
  i386: Use ix86_output_ssemov for SFmode TYPE_SSEMOV
  i386: Use ix86_output_ssemov for MMX TYPE_SSEMOV
gcc/config/i386/i386-protos.h | 2 +
gcc/config/i386/i386.c | 274 ++++++++++++++++++
gcc/config/i386/i386.md | 212 +-------------
gcc/config/i386/mmx.md | 29 +-
gcc/config/i386/predicates.md | 5 -
gcc/config/i386/sse.md | 98 +------
.../gcc.target/i386/avx512vl-vmovdqa64-1.c | 7 +-
gcc/testsuite/gcc.target/i386/pr89229-2a.c | 15 +
gcc/testsuite/gcc.target/i386/pr89229-2b.c | 13 +
gcc/testsuite/gcc.target/i386/pr89229-2c.c | 6 +
gcc/testsuite/gcc.target/i386/pr89229-3a.c | 17 ++
gcc/testsuite/gcc.target/i386/pr89229-3b.c | 6 +
gcc/testsuite/gcc.target/i386/pr89229-3c.c | 7 +
gcc/testsuite/gcc.target/i386/pr89229-4a.c | 17 ++
gcc/testsuite/gcc.target/i386/pr89229-4b.c | 6 +
gcc/testsuite/gcc.target/i386/pr89229-4c.c | 7 +
gcc/testsuite/gcc.target/i386/pr89229-5a.c | 16 +
gcc/testsuite/gcc.target/i386/pr89229-5b.c | 12 +
gcc/testsuite/gcc.target/i386/pr89229-5c.c | 6 +
gcc/testsuite/gcc.target/i386/pr89229-6a.c | 16 +
gcc/testsuite/gcc.target/i386/pr89229-6b.c | 7 +
gcc/testsuite/gcc.target/i386/pr89229-6c.c | 6 +
gcc/testsuite/gcc.target/i386/pr89229-7a.c | 16 +
gcc/testsuite/gcc.target/i386/pr89229-7b.c | 6 +
gcc/testsuite/gcc.target/i386/pr89229-7c.c | 6 +
gcc/testsuite/gcc.target/i386/pr89346.c | 15 +
26 files changed, 497 insertions(+), 330 deletions(-)
create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-2a.c
create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-2b.c
create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-2c.c
create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-3a.c
create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-3b.c
create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-3c.c
create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-4a.c
create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-4b.c
create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-4c.c
create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-5a.c
create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-5b.c
create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-5c.c
create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-6a.c
create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-6b.c
create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-6c.c
create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-7a.c
create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-7b.c
create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-7c.c
create mode 100644 gcc/testsuite/gcc.target/i386/pr89346.c
--
2.24.1