This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
Re: [PATCH] Improve AVX512 vector shift patterns (PR target/82370)
- From: Kirill Yukhin <kirill dot yukhin at gmail dot com>
- To: Jakub Jelinek <jakub at redhat dot com>
- Cc: Uros Bizjak <ubizjak at gmail dot com>, gcc-patches at gcc dot gnu dot org
- Date: Fri, 20 Oct 2017 08:48:22 +0300
- References: <20171004192942.GA18588@tucnak>
Hello Jakub,
On 04 Oct 21:29, Jakub Jelinek wrote:
> Hi!
>
> EVEX encoded vector shifts by immediate allow a memory operand as input.
> We handle this correctly for the sra patterns by having 3 distinct
> define_insns: one TARGET_AVX512VL pattern with masking (where the
> non-masked insn names start with *) that has (=v,v,v) and (=v,vm,N)
> alternatives and nonimmediate_operand for the middle operand, then an SSE2
> pattern for the same modes with just the noavx and avx alternatives, and
> finally a 512-bit vector pattern with masking that likewise uses
> nonimmediate_operand. For the logical shifts we have 3 define_insns too,
> but with a very different split that makes this impossible.
>
> The following patch reworks the logical vector shifts so that they are
> similar to the arithmetic right vector shifts, except for the needed V?DI
> differences.
>
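For readers who want to see the source-level case the description is about, here is a minimal sketch modelled on the functions in the patch's avx-pr82370.c test. The typedefs mirror the ones in that test; the function names `srl5` and `sra3` are hypothetical and chosen only for this illustration.

```c
/* Sketch only: vector shifts by an immediate whose input comes from memory.
   Uses GCC's generic vector extensions, as the tests in the patch do.  */
typedef int v4si __attribute__((vector_size (16)));
typedef unsigned int v4usi __attribute__((vector_size (16)));

/* Unsigned element type: a logical right shift (the vpsrld family).  */
v4usi
srl5 (const v4usi *x)
{
  return *x >> 5;
}

/* Signed element type: an arithmetic right shift (the vpsrad family).  */
v4si
sra3 (const v4si *x)
{
  return *x >> 3;
}
```

With `-O2 -mavx512vl -masm=att`, the avx512vl-pr82370.c test in the patch expects the load to be folded into the shift for functions like these, i.e. one instruction of the form `vpsrld $5, (%reg), %xmmN` rather than a separate load followed by a register-to-register shift. The exact registers and surrounding code depend on the compiler version and options.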
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
The patch is OK for trunk.
--
Thanks, K
>
> 2017-10-04 Jakub Jelinek <jakub@redhat.com>
>
> PR target/82370
> * config/i386/sse.md (VI248_AVX2, VI248_AVX512BW, VI248_AVX512BW_2):
> New mode iterators.
> (<shift_insn><mode>3<mask_name>): Change the last of the 3
> define_insns for logical vector shifts to use VI248_AVX512BW
> iterator instead of VI48_AVX512, remove <mask_mode512bit_condition>
> condition, useless isa and prefix attributes. Change the first
> 2 of these define_insns to ...
> (<mask_codefor><shift_insn><mode>3<mask_name>): ... this, new
> define_insn for avx512vl.
> (<shift_insn><mode>3): ... and this, new define_insn without
> masking for non-avx512vl.
>
> * gcc.target/i386/avx-pr82370.c: New test.
> * gcc.target/i386/avx2-pr82370.c: New test.
> * gcc.target/i386/avx512f-pr82370.c: New test.
> * gcc.target/i386/avx512bw-pr82370.c: New test.
> * gcc.target/i386/avx512vl-pr82370.c: New test.
> * gcc.target/i386/avx512vlbw-pr82370.c: New test.
>
> --- gcc/config/i386/sse.md.jj 2017-10-04 09:45:55.000000000 +0200
> +++ gcc/config/i386/sse.md 2017-10-04 12:18:19.163858188 +0200
> @@ -403,11 +403,19 @@ (define_mode_iterator VI48_AVX2
> [(V8SI "TARGET_AVX2") V4SI
> (V4DI "TARGET_AVX2") V2DI])
>
> +(define_mode_iterator VI248_AVX2
> + [(V16HI "TARGET_AVX2") V8HI
> + (V8SI "TARGET_AVX2") V4SI
> + (V4DI "TARGET_AVX2") V2DI])
> +
> (define_mode_iterator VI248_AVX2_8_AVX512F_24_AVX512BW
> [(V32HI "TARGET_AVX512BW") (V16HI "TARGET_AVX2") V8HI
> (V16SI "TARGET_AVX512BW") (V8SI "TARGET_AVX2") V4SI
> (V8DI "TARGET_AVX512F") (V4DI "TARGET_AVX2") V2DI])
>
> +(define_mode_iterator VI248_AVX512BW
> + [(V32HI "TARGET_AVX512BW") V16SI V8DI])
> +
> (define_mode_iterator VI248_AVX512BW_AVX512VL
> [(V32HI "TARGET_AVX512BW")
> (V4DI "TARGET_AVX512VL") V16SI V8DI])
> @@ -418,6 +426,11 @@ (define_mode_iterator VI248_AVX512BW_1
> V8SI V4SI
> V2DI])
>
> +(define_mode_iterator VI248_AVX512BW_2
> + [(V16HI "TARGET_AVX512BW") (V8HI "TARGET_AVX512BW")
> + V8SI V4SI
> + V4DI V2DI])
> +
> (define_mode_iterator VI48_AVX512F
> [(V16SI "TARGET_AVX512F") V8SI V4SI
> (V8DI "TARGET_AVX512F") V4DI V2DI])
> @@ -10731,59 +10744,51 @@ (define_insn "ashr<mode>3<mask_name>"
> (const_string "0")))
> (set_attr "mode" "<sseinsnmode>")])
>
> -(define_insn "<shift_insn><mode>3<mask_name>"
> - [(set (match_operand:VI2_AVX2_AVX512BW 0 "register_operand" "=x,v")
> - (any_lshift:VI2_AVX2_AVX512BW
> - (match_operand:VI2_AVX2_AVX512BW 1 "register_operand" "0,v")
> - (match_operand:DI 2 "nonmemory_operand" "xN,vN")))]
> - "TARGET_SSE2 && <mask_mode512bit_condition> && <mask_avx512bw_condition>"
> - "@
> - p<vshift><ssemodesuffix>\t{%2, %0|%0, %2}
> - vp<vshift><ssemodesuffix>\t{%2, %1, %0<mask_operand3>|%0<mask_operand3>, %1, %2}"
> - [(set_attr "isa" "noavx,avx")
> - (set_attr "type" "sseishft")
> +(define_insn "<mask_codefor><shift_insn><mode>3<mask_name>"
> + [(set (match_operand:VI248_AVX512BW_2 0 "register_operand" "=v,v")
> + (any_lshift:VI248_AVX512BW_2
> + (match_operand:VI248_AVX512BW_2 1 "nonimmediate_operand" "v,vm")
> + (match_operand:DI 2 "nonmemory_operand" "v,N")))]
> + "TARGET_AVX512VL"
> + "vp<vshift><ssemodesuffix>\t{%2, %1, %0<mask_operand3>|%0<mask_operand3>, %1, %2}"
> + [(set_attr "type" "sseishft")
> (set (attr "length_immediate")
> (if_then_else (match_operand 2 "const_int_operand")
> (const_string "1")
> (const_string "0")))
> - (set_attr "prefix_data16" "1,*")
> - (set_attr "prefix" "orig,vex")
> (set_attr "mode" "<sseinsnmode>")])
>
> -(define_insn "<shift_insn><mode>3<mask_name>"
> - [(set (match_operand:VI48_AVX2 0 "register_operand" "=x,x,v")
> - (any_lshift:VI48_AVX2
> - (match_operand:VI48_AVX2 1 "register_operand" "0,x,v")
> - (match_operand:DI 2 "nonmemory_operand" "xN,xN,vN")))]
> - "TARGET_SSE2 && <mask_mode512bit_condition>"
> +(define_insn "<shift_insn><mode>3"
> + [(set (match_operand:VI248_AVX2 0 "register_operand" "=x,x")
> + (any_lshift:VI248_AVX2
> + (match_operand:VI248_AVX2 1 "register_operand" "0,x")
> + (match_operand:DI 2 "nonmemory_operand" "xN,xN")))]
> + "TARGET_SSE2"
> "@
> p<vshift><ssemodesuffix>\t{%2, %0|%0, %2}
> - vp<vshift><ssemodesuffix>\t{%2, %1, %0<mask_operand3>|%0<mask_operand3>, %1, %2}
> - vp<vshift><ssemodesuffix>\t{%2, %1, %0<mask_operand3>|%0<mask_operand3>, %1, %2}"
> - [(set_attr "isa" "noavx,avx,avx512bw")
> + vp<vshift><ssemodesuffix>\t{%2, %1, %0|%0, %1, %2}"
> + [(set_attr "isa" "noavx,avx")
> (set_attr "type" "sseishft")
> (set (attr "length_immediate")
> (if_then_else (match_operand 2 "const_int_operand")
> (const_string "1")
> (const_string "0")))
> - (set_attr "prefix_data16" "1,*,*")
> - (set_attr "prefix" "orig,vex,evex")
> + (set_attr "prefix_data16" "1,*")
> + (set_attr "prefix" "orig,vex")
> (set_attr "mode" "<sseinsnmode>")])
>
> (define_insn "<shift_insn><mode>3<mask_name>"
> - [(set (match_operand:VI48_512 0 "register_operand" "=v,v")
> - (any_lshift:VI48_512
> - (match_operand:VI48_512 1 "nonimmediate_operand" "v,m")
> + [(set (match_operand:VI248_AVX512BW 0 "register_operand" "=v,v")
> + (any_lshift:VI248_AVX512BW
> + (match_operand:VI248_AVX512BW 1 "nonimmediate_operand" "v,m")
> (match_operand:DI 2 "nonmemory_operand" "vN,N")))]
> - "TARGET_AVX512F && <mask_mode512bit_condition>"
> + "TARGET_AVX512F"
> "vp<vshift><ssemodesuffix>\t{%2, %1, %0<mask_operand3>|%0<mask_operand3>, %1, %2}"
> - [(set_attr "isa" "avx512f")
> - (set_attr "type" "sseishft")
> + [(set_attr "type" "sseishft")
> (set (attr "length_immediate")
> (if_then_else (match_operand 2 "const_int_operand")
> (const_string "1")
> (const_string "0")))
> - (set_attr "prefix" "evex")
> (set_attr "mode" "<sseinsnmode>")])
>
>
> --- gcc/testsuite/gcc.target/i386/avx-pr82370.c.jj 2017-10-04 13:00:37.272449155 +0200
> +++ gcc/testsuite/gcc.target/i386/avx-pr82370.c 2017-10-04 13:06:01.198536379 +0200
> @@ -0,0 +1,65 @@
> +/* PR target/82370 */
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -mavx -mno-avx2 -masm=att" } */
> +/* { dg-final { scan-assembler-times "vpslld\[ \t]\+\\\$7, %xmm\[0-9]\+, %xmm\[0-9]\+" 3 } } */
> +/* { dg-final { scan-assembler-times "vpsllq\[ \t]\+\\\$7, %xmm\[0-9]\+, %xmm\[0-9]\+" 3 } } */
> +/* { dg-final { scan-assembler-times "vpsllw\[ \t]\+\\\$7, %xmm\[0-9]\+, %xmm\[0-9]\+" 3 } } */
> +/* { dg-final { scan-assembler-times "vpsrad\[ \t]\+\\\$3, %xmm\[0-9]\+, %xmm\[0-9]\+" 3 } } */
> +/* { dg-final { scan-assembler-times "vpsraq\[ \t]\+\\\$3, %xmm\[0-9]\+, %xmm\[0-9]\+" 0 } } */
> +/* { dg-final { scan-assembler-times "vpsraw\[ \t]\+\\\$3, %xmm\[0-9]\+, %xmm\[0-9]\+" 3 } } */
> +/* { dg-final { scan-assembler-times "vpsrld\[ \t]\+\\\$5, %xmm\[0-9]\+, %xmm\[0-9]\+" 3 } } */
> +/* { dg-final { scan-assembler-times "vpsrlq\[ \t]\+\\\$5, %xmm\[0-9]\+, %xmm\[0-9]\+" 3 } } */
> +/* { dg-final { scan-assembler-times "vpsrlw\[ \t]\+\\\$5, %xmm\[0-9]\+, %xmm\[0-9]\+" 3 } } */
> +
> +typedef short int v32hi __attribute__((vector_size (64)));
> +typedef short int v16hi __attribute__((vector_size (32)));
> +typedef short int v8hi __attribute__((vector_size (16)));
> +typedef int v16si __attribute__((vector_size (64)));
> +typedef int v8si __attribute__((vector_size (32)));
> +typedef int v4si __attribute__((vector_size (16)));
> +typedef long long int v8di __attribute__((vector_size (64)));
> +typedef long long int v4di __attribute__((vector_size (32)));
> +typedef long long int v2di __attribute__((vector_size (16)));
> +typedef unsigned short int v32uhi __attribute__((vector_size (64)));
> +typedef unsigned short int v16uhi __attribute__((vector_size (32)));
> +typedef unsigned short int v8uhi __attribute__((vector_size (16)));
> +typedef unsigned int v16usi __attribute__((vector_size (64)));
> +typedef unsigned int v8usi __attribute__((vector_size (32)));
> +typedef unsigned int v4usi __attribute__((vector_size (16)));
> +typedef unsigned long long int v8udi __attribute__((vector_size (64)));
> +typedef unsigned long long int v4udi __attribute__((vector_size (32)));
> +typedef unsigned long long int v2udi __attribute__((vector_size (16)));
> +
> +#ifdef __AVX512F__
> +v32hi f1 (v32hi *x) { return *x >> 3; }
> +v32uhi f2 (v32uhi *x) { return *x >> 5; }
> +v32uhi f3 (v32uhi *x) { return *x << 7; }
> +#endif
> +v16hi f4 (v16hi *x) { return *x >> 3; }
> +v16uhi f5 (v16uhi *x) { return *x >> 5; }
> +v16uhi f6 (v16uhi *x) { return *x << 7; }
> +v8hi f7 (v8hi *x) { return *x >> 3; }
> +v8uhi f8 (v8uhi *x) { return *x >> 5; }
> +v8uhi f9 (v8uhi *x) { return *x << 7; }
> +#ifdef __AVX512F__
> +v16si f10 (v16si *x) { return *x >> 3; }
> +v16usi f11 (v16usi *x) { return *x >> 5; }
> +v16usi f12 (v16usi *x) { return *x << 7; }
> +#endif
> +v8si f13 (v8si *x) { return *x >> 3; }
> +v8usi f14 (v8usi *x) { return *x >> 5; }
> +v8usi f15 (v8usi *x) { return *x << 7; }
> +v4si f16 (v4si *x) { return *x >> 3; }
> +v4usi f17 (v4usi *x) { return *x >> 5; }
> +v4usi f18 (v4usi *x) { return *x << 7; }
> +#ifdef __AVX512F__
> +v8di f19 (v8di *x) { return *x >> 3; }
> +v8udi f20 (v8udi *x) { return *x >> 5; }
> +v8udi f21 (v8udi *x) { return *x << 7; }
> +#endif
> +v4di f22 (v4di *x) { return *x >> 3; }
> +v4udi f23 (v4udi *x) { return *x >> 5; }
> +v4udi f24 (v4udi *x) { return *x << 7; }
> +v2di f25 (v2di *x) { return *x >> 3; }
> +v2udi f26 (v2udi *x) { return *x >> 5; }
> +v2udi f27 (v2udi *x) { return *x << 7; }
> --- gcc/testsuite/gcc.target/i386/avx2-pr82370.c.jj 2017-10-04 13:08:46.067544889 +0200
> +++ gcc/testsuite/gcc.target/i386/avx2-pr82370.c 2017-10-04 13:09:39.388900808 +0200
> @@ -0,0 +1,23 @@
> +/* PR target/82370 */
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -mavx2 -mno-avx512f -masm=att" } */
> +/* { dg-final { scan-assembler-times "vpslld\[ \t]\+\\\$7, %xmm\[0-9]\+, %xmm\[0-9]\+" 1 } } */
> +/* { dg-final { scan-assembler-times "vpsllq\[ \t]\+\\\$7, %xmm\[0-9]\+, %xmm\[0-9]\+" 1 } } */
> +/* { dg-final { scan-assembler-times "vpsllw\[ \t]\+\\\$7, %xmm\[0-9]\+, %xmm\[0-9]\+" 1 } } */
> +/* { dg-final { scan-assembler-times "vpsrad\[ \t]\+\\\$3, %xmm\[0-9]\+, %xmm\[0-9]\+" 1 } } */
> +/* { dg-final { scan-assembler-times "vpsraq\[ \t]\+\\\$3, %xmm\[0-9]\+, %xmm\[0-9]\+" 0 } } */
> +/* { dg-final { scan-assembler-times "vpsraw\[ \t]\+\\\$3, %xmm\[0-9]\+, %xmm\[0-9]\+" 1 } } */
> +/* { dg-final { scan-assembler-times "vpsrld\[ \t]\+\\\$5, %xmm\[0-9]\+, %xmm\[0-9]\+" 1 } } */
> +/* { dg-final { scan-assembler-times "vpsrlq\[ \t]\+\\\$5, %xmm\[0-9]\+, %xmm\[0-9]\+" 1 } } */
> +/* { dg-final { scan-assembler-times "vpsrlw\[ \t]\+\\\$5, %xmm\[0-9]\+, %xmm\[0-9]\+" 1 } } */
> +/* { dg-final { scan-assembler-times "vpslld\[ \t]\+\\\$7, %ymm\[0-9]\+, %ymm\[0-9]\+" 1 } } */
> +/* { dg-final { scan-assembler-times "vpsllq\[ \t]\+\\\$7, %ymm\[0-9]\+, %ymm\[0-9]\+" 1 } } */
> +/* { dg-final { scan-assembler-times "vpsllw\[ \t]\+\\\$7, %ymm\[0-9]\+, %ymm\[0-9]\+" 1 } } */
> +/* { dg-final { scan-assembler-times "vpsrad\[ \t]\+\\\$3, %ymm\[0-9]\+, %ymm\[0-9]\+" 1 } } */
> +/* { dg-final { scan-assembler-times "vpsraq\[ \t]\+\\\$3, %ymm\[0-9]\+, %ymm\[0-9]\+" 0 } } */
> +/* { dg-final { scan-assembler-times "vpsraw\[ \t]\+\\\$3, %ymm\[0-9]\+, %ymm\[0-9]\+" 1 } } */
> +/* { dg-final { scan-assembler-times "vpsrld\[ \t]\+\\\$5, %ymm\[0-9]\+, %ymm\[0-9]\+" 1 } } */
> +/* { dg-final { scan-assembler-times "vpsrlq\[ \t]\+\\\$5, %ymm\[0-9]\+, %ymm\[0-9]\+" 1 } } */
> +/* { dg-final { scan-assembler-times "vpsrlw\[ \t]\+\\\$5, %ymm\[0-9]\+, %ymm\[0-9]\+" 1 } } */
> +
> +#include "avx-pr82370.c"
> --- gcc/testsuite/gcc.target/i386/avx512f-pr82370.c.jj 2017-10-04 13:40:17.579740975 +0200
> +++ gcc/testsuite/gcc.target/i386/avx512f-pr82370.c 2017-10-04 13:50:11.523595551 +0200
> @@ -0,0 +1,33 @@
> +/* PR target/82370 */
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -mavx512f -mno-avx512bw -mno-avx512vl -masm=att" } */
> +/* { dg-final { scan-assembler-times "vpslld\[ \t]\+\\\$7, %xmm\[0-9]\+, %xmm\[0-9]\+" 1 } } */
> +/* { dg-final { scan-assembler-times "vpsllq\[ \t]\+\\\$7, %xmm\[0-9]\+, %xmm\[0-9]\+" 1 } } */
> +/* { dg-final { scan-assembler-times "vpsllw\[ \t]\+\\\$7, %xmm\[0-9]\+, %xmm\[0-9]\+" 1 } } */
> +/* { dg-final { scan-assembler-times "vpsrad\[ \t]\+\\\$3, %xmm\[0-9]\+, %xmm\[0-9]\+" 1 } } */
> +/* { dg-final { scan-assembler-times "vpsraq\[ \t]\+\\\$3, %xmm\[0-9]\+, %xmm\[0-9]\+" 0 } } */
> +/* { dg-final { scan-assembler-times "vpsraw\[ \t]\+\\\$3, %xmm\[0-9]\+, %xmm\[0-9]\+" 1 } } */
> +/* { dg-final { scan-assembler-times "vpsrld\[ \t]\+\\\$5, %xmm\[0-9]\+, %xmm\[0-9]\+" 1 } } */
> +/* { dg-final { scan-assembler-times "vpsrlq\[ \t]\+\\\$5, %xmm\[0-9]\+, %xmm\[0-9]\+" 1 } } */
> +/* { dg-final { scan-assembler-times "vpsrlw\[ \t]\+\\\$5, %xmm\[0-9]\+, %xmm\[0-9]\+" 1 } } */
> +/* { dg-final { scan-assembler-times "vpslld\[ \t]\+\\\$7, %ymm\[0-9]\+, %ymm\[0-9]\+" 1 } } */
> +/* { dg-final { scan-assembler-times "vpsllq\[ \t]\+\\\$7, %ymm\[0-9]\+, %ymm\[0-9]\+" 1 } } */
> +/* { dg-final { scan-assembler-times "vpsllw\[ \t]\+\\\$7, %ymm\[0-9]\+, %ymm\[0-9]\+" 3 } } */
> +/* { dg-final { scan-assembler-times "vpsrad\[ \t]\+\\\$3, %ymm\[0-9]\+, %ymm\[0-9]\+" 1 } } */
> +/* { dg-final { scan-assembler-times "vpsraq\[ \t]\+\\\$3, %ymm\[0-9]\+, %ymm\[0-9]\+" 0 } } */
> +/* { dg-final { scan-assembler-times "vpsraw\[ \t]\+\\\$3, %ymm\[0-9]\+, %ymm\[0-9]\+" 3 } } */
> +/* { dg-final { scan-assembler-times "vpsrld\[ \t]\+\\\$5, %ymm\[0-9]\+, %ymm\[0-9]\+" 1 } } */
> +/* { dg-final { scan-assembler-times "vpsrlq\[ \t]\+\\\$5, %ymm\[0-9]\+, %ymm\[0-9]\+" 1 } } */
> +/* { dg-final { scan-assembler-times "vpsrlw\[ \t]\+\\\$5, %ymm\[0-9]\+, %ymm\[0-9]\+" 3 } } */
> +/* { dg-final { scan-assembler-times "vps\[lr]\[la]\[dwq]\[ \t]\+\\\$\[357], %zmm\[0-9]\+, %zmm\[0-9]\+" 0 } } */
> +/* { dg-final { scan-assembler-times "vpslld\[ \t]\+\\\$7, \\(%\[a-z0-9,]*\\), %zmm\[0-9]\+" 1 } } */
> +/* { dg-final { scan-assembler-times "vpsllq\[ \t]\+\\\$7, \\(%\[a-z0-9,]*\\), %zmm\[0-9]\+" 1 } } */
> +/* { dg-final { scan-assembler-times "vpsllw\[ \t]\+\\\$7, \\(%\[a-z0-9,]*\\), %zmm\[0-9]\+" 0 } } */
> +/* { dg-final { scan-assembler-times "vpsrad\[ \t]\+\\\$3, \\(%\[a-z0-9,]*\\), %zmm\[0-9]\+" 1 } } */
> +/* { dg-final { scan-assembler-times "vpsraq\[ \t]\+\\\$3, \\(%\[a-z0-9,]*\\), %zmm\[0-9]\+" 1 } } */
> +/* { dg-final { scan-assembler-times "vpsraw\[ \t]\+\\\$3, \\(%\[a-z0-9,]*\\), %zmm\[0-9]\+" 0 } } */
> +/* { dg-final { scan-assembler-times "vpsrld\[ \t]\+\\\$5, \\(%\[a-z0-9,]*\\), %zmm\[0-9]\+" 1 } } */
> +/* { dg-final { scan-assembler-times "vpsrlq\[ \t]\+\\\$5, \\(%\[a-z0-9,]*\\), %zmm\[0-9]\+" 1 } } */
> +/* { dg-final { scan-assembler-times "vpsrlw\[ \t]\+\\\$5, \\(%\[a-z0-9,]*\\), %zmm\[0-9]\+" 0 } } */
> +
> +#include "avx-pr82370.c"
> --- gcc/testsuite/gcc.target/i386/avx512bw-pr82370.c.jj 2017-10-04 13:59:17.233028171 +0200
> +++ gcc/testsuite/gcc.target/i386/avx512bw-pr82370.c 2017-10-04 14:00:49.235920951 +0200
> @@ -0,0 +1,33 @@
> +/* PR target/82370 */
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -mavx512bw -mno-avx512vl -masm=att" } */
> +/* { dg-final { scan-assembler-times "vpslld\[ \t]\+\\\$7, %xmm\[0-9]\+, %xmm\[0-9]\+" 1 } } */
> +/* { dg-final { scan-assembler-times "vpsllq\[ \t]\+\\\$7, %xmm\[0-9]\+, %xmm\[0-9]\+" 1 } } */
> +/* { dg-final { scan-assembler-times "vpsllw\[ \t]\+\\\$7, %xmm\[0-9]\+, %xmm\[0-9]\+" 1 } } */
> +/* { dg-final { scan-assembler-times "vpsrad\[ \t]\+\\\$3, %xmm\[0-9]\+, %xmm\[0-9]\+" 1 } } */
> +/* { dg-final { scan-assembler-times "vpsraq\[ \t]\+\\\$3, %xmm\[0-9]\+, %xmm\[0-9]\+" 0 } } */
> +/* { dg-final { scan-assembler-times "vpsraw\[ \t]\+\\\$3, %xmm\[0-9]\+, %xmm\[0-9]\+" 1 } } */
> +/* { dg-final { scan-assembler-times "vpsrld\[ \t]\+\\\$5, %xmm\[0-9]\+, %xmm\[0-9]\+" 1 } } */
> +/* { dg-final { scan-assembler-times "vpsrlq\[ \t]\+\\\$5, %xmm\[0-9]\+, %xmm\[0-9]\+" 1 } } */
> +/* { dg-final { scan-assembler-times "vpsrlw\[ \t]\+\\\$5, %xmm\[0-9]\+, %xmm\[0-9]\+" 1 } } */
> +/* { dg-final { scan-assembler-times "vpslld\[ \t]\+\\\$7, %ymm\[0-9]\+, %ymm\[0-9]\+" 1 } } */
> +/* { dg-final { scan-assembler-times "vpsllq\[ \t]\+\\\$7, %ymm\[0-9]\+, %ymm\[0-9]\+" 1 } } */
> +/* { dg-final { scan-assembler-times "vpsllw\[ \t]\+\\\$7, %ymm\[0-9]\+, %ymm\[0-9]\+" 1 } } */
> +/* { dg-final { scan-assembler-times "vpsrad\[ \t]\+\\\$3, %ymm\[0-9]\+, %ymm\[0-9]\+" 1 } } */
> +/* { dg-final { scan-assembler-times "vpsraq\[ \t]\+\\\$3, %ymm\[0-9]\+, %ymm\[0-9]\+" 0 } } */
> +/* { dg-final { scan-assembler-times "vpsraw\[ \t]\+\\\$3, %ymm\[0-9]\+, %ymm\[0-9]\+" 1 } } */
> +/* { dg-final { scan-assembler-times "vpsrld\[ \t]\+\\\$5, %ymm\[0-9]\+, %ymm\[0-9]\+" 1 } } */
> +/* { dg-final { scan-assembler-times "vpsrlq\[ \t]\+\\\$5, %ymm\[0-9]\+, %ymm\[0-9]\+" 1 } } */
> +/* { dg-final { scan-assembler-times "vpsrlw\[ \t]\+\\\$5, %ymm\[0-9]\+, %ymm\[0-9]\+" 1 } } */
> +/* { dg-final { scan-assembler-times "vps\[lr]\[la]\[dwq]\[ \t]\+\\\$\[357], %zmm\[0-9]\+, %zmm\[0-9]\+" 0 } } */
> +/* { dg-final { scan-assembler-times "vpslld\[ \t]\+\\\$7, \\(%\[a-z0-9,]*\\), %zmm\[0-9]\+" 1 } } */
> +/* { dg-final { scan-assembler-times "vpsllq\[ \t]\+\\\$7, \\(%\[a-z0-9,]*\\), %zmm\[0-9]\+" 1 } } */
> +/* { dg-final { scan-assembler-times "vpsllw\[ \t]\+\\\$7, \\(%\[a-z0-9,]*\\), %zmm\[0-9]\+" 1 } } */
> +/* { dg-final { scan-assembler-times "vpsrad\[ \t]\+\\\$3, \\(%\[a-z0-9,]*\\), %zmm\[0-9]\+" 1 } } */
> +/* { dg-final { scan-assembler-times "vpsraq\[ \t]\+\\\$3, \\(%\[a-z0-9,]*\\), %zmm\[0-9]\+" 1 } } */
> +/* { dg-final { scan-assembler-times "vpsraw\[ \t]\+\\\$3, \\(%\[a-z0-9,]*\\), %zmm\[0-9]\+" 1 } } */
> +/* { dg-final { scan-assembler-times "vpsrld\[ \t]\+\\\$5, \\(%\[a-z0-9,]*\\), %zmm\[0-9]\+" 1 } } */
> +/* { dg-final { scan-assembler-times "vpsrlq\[ \t]\+\\\$5, \\(%\[a-z0-9,]*\\), %zmm\[0-9]\+" 1 } } */
> +/* { dg-final { scan-assembler-times "vpsrlw\[ \t]\+\\\$5, \\(%\[a-z0-9,]*\\), %zmm\[0-9]\+" 1 } } */
> +
> +#include "avx-pr82370.c"
> --- gcc/testsuite/gcc.target/i386/avx512vl-pr82370.c.jj 2017-10-04 14:03:32.299958537 +0200
> +++ gcc/testsuite/gcc.target/i386/avx512vl-pr82370.c 2017-10-04 14:07:30.369093465 +0200
> @@ -0,0 +1,31 @@
> +/* PR target/82370 */
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -mavx512vl -mno-avx512bw -masm=att" } */
> +/* { dg-final { scan-assembler-times "vpsllw\[ \t]\+\\\$7, %xmm\[0-9]\+, %xmm\[0-9]\+" 1 } } */
> +/* { dg-final { scan-assembler-times "vpsraw\[ \t]\+\\\$3, %xmm\[0-9]\+, %xmm\[0-9]\+" 1 } } */
> +/* { dg-final { scan-assembler-times "vpsrlw\[ \t]\+\\\$5, %xmm\[0-9]\+, %xmm\[0-9]\+" 1 } } */
> +/* { dg-final { scan-assembler-times "vpsllw\[ \t]\+\\\$7, %ymm\[0-9]\+, %ymm\[0-9]\+" 3 } } */
> +/* { dg-final { scan-assembler-times "vpsraw\[ \t]\+\\\$3, %ymm\[0-9]\+, %ymm\[0-9]\+" 3 } } */
> +/* { dg-final { scan-assembler-times "vpsrlw\[ \t]\+\\\$5, %ymm\[0-9]\+, %ymm\[0-9]\+" 3 } } */
> +/* { dg-final { scan-assembler-times "vps\[lr]\[la]\[dq]\[ \t]\+\\\$\[357], %\[xyz]mm\[0-9]\+, %\[xyz]mm\[0-9]\+" 0 } } */
> +/* { dg-final { scan-assembler-times "vpslld\[ \t]\+\\\$7, \\(%\[a-z0-9,]*\\), %xmm\[0-9]\+" 1 } } */
> +/* { dg-final { scan-assembler-times "vpsllq\[ \t]\+\\\$7, \\(%\[a-z0-9,]*\\), %xmm\[0-9]\+" 1 } } */
> +/* { dg-final { scan-assembler-times "vps\[lr]\[la]w\[ \t]\+\\\$\[357], \\(%\[a-z0-9,]*\\), %\[xyz]mm\[0-9]\+" 0 } } */
> +/* { dg-final { scan-assembler-times "vpsrad\[ \t]\+\\\$3, \\(%\[a-z0-9,]*\\), %xmm\[0-9]\+" 1 } } */
> +/* { dg-final { scan-assembler-times "vpsraq\[ \t]\+\\\$3, \\(%\[a-z0-9,]*\\), %xmm\[0-9]\+" 1 } } */
> +/* { dg-final { scan-assembler-times "vpsrld\[ \t]\+\\\$5, \\(%\[a-z0-9,]*\\), %xmm\[0-9]\+" 1 } } */
> +/* { dg-final { scan-assembler-times "vpsrlq\[ \t]\+\\\$5, \\(%\[a-z0-9,]*\\), %xmm\[0-9]\+" 1 } } */
> +/* { dg-final { scan-assembler-times "vpslld\[ \t]\+\\\$7, \\(%\[a-z0-9,]*\\), %ymm\[0-9]\+" 1 } } */
> +/* { dg-final { scan-assembler-times "vpsllq\[ \t]\+\\\$7, \\(%\[a-z0-9,]*\\), %ymm\[0-9]\+" 1 } } */
> +/* { dg-final { scan-assembler-times "vpsrad\[ \t]\+\\\$3, \\(%\[a-z0-9,]*\\), %ymm\[0-9]\+" 1 } } */
> +/* { dg-final { scan-assembler-times "vpsraq\[ \t]\+\\\$3, \\(%\[a-z0-9,]*\\), %ymm\[0-9]\+" 1 } } */
> +/* { dg-final { scan-assembler-times "vpsrld\[ \t]\+\\\$5, \\(%\[a-z0-9,]*\\), %ymm\[0-9]\+" 1 } } */
> +/* { dg-final { scan-assembler-times "vpsrlq\[ \t]\+\\\$5, \\(%\[a-z0-9,]*\\), %ymm\[0-9]\+" 1 } } */
> +/* { dg-final { scan-assembler-times "vpslld\[ \t]\+\\\$7, \\(%\[a-z0-9,]*\\), %zmm\[0-9]\+" 1 } } */
> +/* { dg-final { scan-assembler-times "vpsllq\[ \t]\+\\\$7, \\(%\[a-z0-9,]*\\), %zmm\[0-9]\+" 1 } } */
> +/* { dg-final { scan-assembler-times "vpsrad\[ \t]\+\\\$3, \\(%\[a-z0-9,]*\\), %zmm\[0-9]\+" 1 } } */
> +/* { dg-final { scan-assembler-times "vpsraq\[ \t]\+\\\$3, \\(%\[a-z0-9,]*\\), %zmm\[0-9]\+" 1 } } */
> +/* { dg-final { scan-assembler-times "vpsrld\[ \t]\+\\\$5, \\(%\[a-z0-9,]*\\), %zmm\[0-9]\+" 1 } } */
> +/* { dg-final { scan-assembler-times "vpsrlq\[ \t]\+\\\$5, \\(%\[a-z0-9,]*\\), %zmm\[0-9]\+" 1 } } */
> +
> +#include "avx-pr82370.c"
> --- gcc/testsuite/gcc.target/i386/avx512vlbw-pr82370.c.jj 2017-10-04 13:47:53.581255062 +0200
> +++ gcc/testsuite/gcc.target/i386/avx512vlbw-pr82370.c 2017-10-04 14:02:48.102490437 +0200
> @@ -0,0 +1,33 @@
> +/* PR target/82370 */
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -mavx512vl -mavx512bw -masm=att" } */
> +/* { dg-final { scan-assembler-times "vps\[lr]\[la]\[dwq]\[ \t]\+\\\$\[357], %\[xyz]mm\[0-9]\+, %\[xyz]mm\[0-9]\+" 0 } } */
> +/* { dg-final { scan-assembler-times "vpslld\[ \t]\+\\\$7, \\(%\[a-z0-9,]*\\), %xmm\[0-9]\+" 1 } } */
> +/* { dg-final { scan-assembler-times "vpsllq\[ \t]\+\\\$7, \\(%\[a-z0-9,]*\\), %xmm\[0-9]\+" 1 } } */
> +/* { dg-final { scan-assembler-times "vpsllw\[ \t]\+\\\$7, \\(%\[a-z0-9,]*\\), %xmm\[0-9]\+" 1 } } */
> +/* { dg-final { scan-assembler-times "vpsrad\[ \t]\+\\\$3, \\(%\[a-z0-9,]*\\), %xmm\[0-9]\+" 1 } } */
> +/* { dg-final { scan-assembler-times "vpsraq\[ \t]\+\\\$3, \\(%\[a-z0-9,]*\\), %xmm\[0-9]\+" 1 } } */
> +/* { dg-final { scan-assembler-times "vpsraw\[ \t]\+\\\$3, \\(%\[a-z0-9,]*\\), %xmm\[0-9]\+" 1 } } */
> +/* { dg-final { scan-assembler-times "vpsrld\[ \t]\+\\\$5, \\(%\[a-z0-9,]*\\), %xmm\[0-9]\+" 1 } } */
> +/* { dg-final { scan-assembler-times "vpsrlq\[ \t]\+\\\$5, \\(%\[a-z0-9,]*\\), %xmm\[0-9]\+" 1 } } */
> +/* { dg-final { scan-assembler-times "vpsrlw\[ \t]\+\\\$5, \\(%\[a-z0-9,]*\\), %xmm\[0-9]\+" 1 } } */
> +/* { dg-final { scan-assembler-times "vpslld\[ \t]\+\\\$7, \\(%\[a-z0-9,]*\\), %ymm\[0-9]\+" 1 } } */
> +/* { dg-final { scan-assembler-times "vpsllq\[ \t]\+\\\$7, \\(%\[a-z0-9,]*\\), %ymm\[0-9]\+" 1 } } */
> +/* { dg-final { scan-assembler-times "vpsllw\[ \t]\+\\\$7, \\(%\[a-z0-9,]*\\), %ymm\[0-9]\+" 1 } } */
> +/* { dg-final { scan-assembler-times "vpsrad\[ \t]\+\\\$3, \\(%\[a-z0-9,]*\\), %ymm\[0-9]\+" 1 } } */
> +/* { dg-final { scan-assembler-times "vpsraq\[ \t]\+\\\$3, \\(%\[a-z0-9,]*\\), %ymm\[0-9]\+" 1 } } */
> +/* { dg-final { scan-assembler-times "vpsraw\[ \t]\+\\\$3, \\(%\[a-z0-9,]*\\), %ymm\[0-9]\+" 1 } } */
> +/* { dg-final { scan-assembler-times "vpsrld\[ \t]\+\\\$5, \\(%\[a-z0-9,]*\\), %ymm\[0-9]\+" 1 } } */
> +/* { dg-final { scan-assembler-times "vpsrlq\[ \t]\+\\\$5, \\(%\[a-z0-9,]*\\), %ymm\[0-9]\+" 1 } } */
> +/* { dg-final { scan-assembler-times "vpsrlw\[ \t]\+\\\$5, \\(%\[a-z0-9,]*\\), %ymm\[0-9]\+" 1 } } */
> +/* { dg-final { scan-assembler-times "vpslld\[ \t]\+\\\$7, \\(%\[a-z0-9,]*\\), %zmm\[0-9]\+" 1 } } */
> +/* { dg-final { scan-assembler-times "vpsllq\[ \t]\+\\\$7, \\(%\[a-z0-9,]*\\), %zmm\[0-9]\+" 1 } } */
> +/* { dg-final { scan-assembler-times "vpsllw\[ \t]\+\\\$7, \\(%\[a-z0-9,]*\\), %zmm\[0-9]\+" 1 } } */
> +/* { dg-final { scan-assembler-times "vpsrad\[ \t]\+\\\$3, \\(%\[a-z0-9,]*\\), %zmm\[0-9]\+" 1 } } */
> +/* { dg-final { scan-assembler-times "vpsraq\[ \t]\+\\\$3, \\(%\[a-z0-9,]*\\), %zmm\[0-9]\+" 1 } } */
> +/* { dg-final { scan-assembler-times "vpsraw\[ \t]\+\\\$3, \\(%\[a-z0-9,]*\\), %zmm\[0-9]\+" 1 } } */
> +/* { dg-final { scan-assembler-times "vpsrld\[ \t]\+\\\$5, \\(%\[a-z0-9,]*\\), %zmm\[0-9]\+" 1 } } */
> +/* { dg-final { scan-assembler-times "vpsrlq\[ \t]\+\\\$5, \\(%\[a-z0-9,]*\\), %zmm\[0-9]\+" 1 } } */
> +/* { dg-final { scan-assembler-times "vpsrlw\[ \t]\+\\\$5, \\(%\[a-z0-9,]*\\), %zmm\[0-9]\+" 1 } } */
> +
> +#include "avx-pr82370.c"
>
> Jakub