[Bug target/87317] Missed optimisation: merging VMOVQ with operations that only use the low 8 bytes
hjl at gcc dot gnu.org
gcc-bugzilla@gcc.gnu.org
Wed Nov 21 13:19:00 GMT 2018
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87317
--- Comment #6 from hjl at gcc dot gnu.org <hjl at gcc dot gnu.org> ---
Author: hjl
Date: Wed Nov 21 13:18:54 2018
New Revision: 266342
URL: https://gcc.gnu.org/viewcvs?rev=266342&root=gcc&view=rev
Log:
x86: Add pmovzx/pmovsx patterns with memory operands
Many x86 pmovzx/pmovsx instructions with memory operands are modeled in
a wrong way. For example:
(define_insn "sse4_1_<code>v8qiv8hi2<mask_name>"
[(set (match_operand:V8HI 0 "register_operand" "=Yr,*x,v")
(any_extend:V8HI
(vec_select:V8QI
(match_operand:V16QI 1 "nonimmediate_operand" "Yrm,*xm,vm")
(parallel [(const_int 0) (const_int 1)
(const_int 2) (const_int 3)
(const_int 4) (const_int 5)
(const_int 6) (const_int 7)]))))]
should be defind for memory operands as:
(define_insn "sse4_1_<code>v8qiv8hi2<mask_name>"
[(set (match_operand:V8HI 0 "register_operand" "=Yr,*x,v")
(any_extend:V8HI
(match_operand:V8QI "memory_operand" "m,m,m")))]
This patch updates them to
(define_insn "sse4_1_<code>v8qiv8hi2<mask_name>"
[(set (match_operand:V8HI 0 "register_operand" "=Yr,*x,v")
(any_extend:V8HI
(vec_select:V8QI
(match_operand:V16QI 1 "register_operand" "Yr,*x,v")
(parallel [(const_int 0) (const_int 1)
(const_int 2) (const_int 3)
(const_int 4) (const_int 5)
(const_int 6) (const_int 7)]))))]
(define_insn "*sse4_1_<code>v8qiv8hi2<mask_name>_1"
[(set (match_operand:V8HI 0 "register_operand" "=Yr,*x,v")
(any_extend:V8HI
(match_operand:V8QI "subreg_memory_operand" "m,m,m")))]
with a splitter:
(define_insn_and_split "*sse4_1_<code>v8qiv8hi2<mask_name>_2"
[(set (match_operand:V8HI 0 "register_operand")
(any_extend:V8HI
(vec_select:V8QI
(subreg:V16QI
(vec_concat:V2DI
(match_operand:DI 1 "memory_operand")
(const_int 0)) 0)
(parallel [(const_int 0) (const_int 1)
(const_int 2) (const_int 3)
(const_int 4) (const_int 5)
(const_int 6) (const_int 7)]))))]
"TARGET_SSE4_1
&& <mask_avx512bw_condition>
&& <mask_avx512vl_condition>
"&& can_create_pseudo_p ()"
"#"
"&& 1"
[(set (match_dup 0)
(any_extend:V8HI (match_dup 1)))]
"operands[1] = adjust_address_nv (operands[1], V8QImode, 0);")
This patch requires updating apply_subst_iterator to handle
define_insn_and_split.
gcc/
PR target/87317
* config/i386/sse.md (sse4_1_<code>v8qiv8hi2<mask_name>): Replace
nonimmediate_operand with register_operand.
(avx2_<code>v8qiv8si2<mask_name>): Likewise.
(sse4_1_<code>v4qiv4si2<mask_name>): Likewise.
(sse4_1_<code>v4hiv4si2<mask_name>): Likewise.
(sse4_1_<code>v2qiv2di2<mask_name>): Likewise.
(avx512f_<code>v8qiv8di2<mask_name>): Likewise.
(avx2_<code>v4qiv4di2<mask_name>): Likewise.
(avx2_<code>v4hiv4di2<mask_name>): Likewise.
(sse4_1_<code>v2hiv2di2<mask_name>): Likewise.
(sse4_1_<code>v2siv2di2<mask_name>): Likewise.
(*sse4_1_<code>v8qiv8hi2<mask_name>_1): New pattern.
(*sse4_1_<code>v8qiv8hi2<mask_name>_2): Likewise.
(*avx2_<code>v8qiv8si2<mask_name>_1): Likewise.
(*avx2_<code>v8qiv8si2<mask_name>_2): Likewise.
(*sse4_1_<code>v4qiv4si2<mask_name>_1): Likewise.
(*sse4_1_<code>v4qiv4si2<mask_name>_2): Likewise.
(*sse4_1_<code>v4hiv4si2<mask_name>_1): Likewise.
(*sse4_1_<code>v4hiv4si2<mask_name>_2): Likewise.
(*avx512f_<code>v8qiv8di2<mask_name>_1): Likewise.
(*avx512f_<code>v8qiv8di2<mask_name>_2): Likewise.
(*avx2_<code>v4qiv4di2<mask_name>_1): Likewise.
(*avx2_<code>v4qiv4di2<mask_name>_2): Likewise.
(*avx2_<code>v4hiv4di2<mask_name>_1): Likewise.
(*avx2_<code>v4hiv4di2<mask_name>_2): Likewise.
(*sse4_1_<code>v2hiv2di2<mask_name>_1): Likewise.
(*sse4_1_<code>v2hiv2di2<mask_name>_2): Likewise.
(*sse4_1_<code>v2siv2di2<mask_name>_1): Likewise.
(*sse4_1_<code>v2siv2di2<mask_name>_2): Likewise.
gcc/testsuite/
PR target/87317
* gcc.target/i386/pr87317-1.c: New file.
* gcc.target/i386/pr87317-2.c: Likewise.
* gcc.target/i386/pr87317-3.c: Likewise.
* gcc.target/i386/pr87317-4.c: Likewise.
* gcc.target/i386/pr87317-5.c: Likewise.
* gcc.target/i386/pr87317-6.c: Likewise.
* gcc.target/i386/pr87317-7.c: Likewise.
* gcc.target/i386/pr87317-8.c: Likewise.
* gcc.target/i386/pr87317-9.c: Likewise.
* gcc.target/i386/pr87317-10.c: Likewise.
* gcc.target/i386/pr87317-11.c: Likewise.
* gcc.target/i386/pr87317-12.c: Likewise.
* gcc.target/i386/pr87317-13.c: Likewise.
Added:
trunk/gcc/testsuite/gcc.target/i386/pr87317-1.c
trunk/gcc/testsuite/gcc.target/i386/pr87317-10.c
trunk/gcc/testsuite/gcc.target/i386/pr87317-11.c
trunk/gcc/testsuite/gcc.target/i386/pr87317-12.c
trunk/gcc/testsuite/gcc.target/i386/pr87317-13.c
trunk/gcc/testsuite/gcc.target/i386/pr87317-2.c
trunk/gcc/testsuite/gcc.target/i386/pr87317-3.c
trunk/gcc/testsuite/gcc.target/i386/pr87317-4.c
trunk/gcc/testsuite/gcc.target/i386/pr87317-5.c
trunk/gcc/testsuite/gcc.target/i386/pr87317-6.c
trunk/gcc/testsuite/gcc.target/i386/pr87317-7.c
trunk/gcc/testsuite/gcc.target/i386/pr87317-8.c
trunk/gcc/testsuite/gcc.target/i386/pr87317-9.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/i386/sse.md
trunk/gcc/testsuite/ChangeLog
More information about the Gcc-bugs
mailing list