V4 [PATCH] x86: Add pmovzx/pmovsx patterns with memory operands
H.J. Lu
hjl.tools@gmail.com
Fri Oct 26 07:39:00 GMT 2018
On 10/25/18, Uros Bizjak <ubizjak@gmail.com> wrote:
> On Fri, Oct 26, 2018 at 8:07 AM H.J. Lu <hjl.tools@gmail.com> wrote:
>>
>> Many x86 pmovzx/pmovsx instructions with memory operands are modeled
>> incorrectly. For example:
>>
>> (define_insn "sse4_1_<code>v8qiv8hi2<mask_name>"
>>   [(set (match_operand:V8HI 0 "register_operand" "=Yr,*x,v")
>>         (any_extend:V8HI
>>           (vec_select:V8QI
>>             (match_operand:V16QI 1 "nonimmediate_operand" "Yrm,*xm,vm")
>>             (parallel [(const_int 0) (const_int 1)
>>                        (const_int 2) (const_int 3)
>>                        (const_int 4) (const_int 5)
>>                        (const_int 6) (const_int 7)]))))]
>>
>> should be defined for memory operands as:
>>
>> (define_insn "sse4_1_<code>v8qiv8hi2<mask_name>"
>>   [(set (match_operand:V8HI 0 "register_operand" "=Yr,*x,v")
>>         (any_extend:V8HI
>>           (match_operand:V8QI 1 "memory_operand" "m,m,m")))]
>>
>> This set of patches updates them to
>>
>> (define_insn "sse4_1_<code>v8qiv8hi2<mask_name>"
>>   [(set (match_operand:V8HI 0 "register_operand" "=Yr,*x,v")
>>         (any_extend:V8HI
>>           (vec_select:V8QI
>>             (match_operand:V16QI 1 "nonimmediate_operand" "Yr,*x,v")
>>             (parallel [(const_int 0) (const_int 1)
>>                        (const_int 2) (const_int 3)
>>                        (const_int 4) (const_int 5)
>>                        (const_int 6) (const_int 7)]))))]
>>
>> (define_insn "*sse4_1_<code>v8qiv8hi2<mask_name>_1"
>>   [(set (match_operand:V8HI 0 "register_operand" "=Yr,*x,v")
>>         (any_extend:V8HI
>>           (match_operand:V8QI 1 "subreg_memory_operand" "m,m,m")))]
>>
>> with a splitter:
>>
>> (define_insn_and_split "*sse4_1_<code>v8qiv8hi2<mask_name>_2"
>>   [(set (match_operand:V8HI 0 "register_operand" "=Yr,*x,v")
>
> No constraints needed for pre-reload splitter.
>
>>         (any_extend:V8HI
>>           (vec_select:V8QI
>>             (subreg:V16QI
>>               (vec_concat:V2DI
>>                 (match_operand:DI 1 "memory_operand" "m,*m,m")
>>                 (const_int 0)) 0)
>>             (parallel [(const_int 0) (const_int 1)
>>                        (const_int 2) (const_int 3)
>>                        (const_int 4) (const_int 5)
>>                        (const_int 6) (const_int 7)]))))]
>>   "TARGET_SSE4_1 && <mask_avx512bw_condition> && <mask_avx512vl_condition>"
>>   "#"
>>   "&& can_create_pseudo_p ()"
>>   [(set (match_dup 0) (match_dup 1))]
>
>   [(set (match_dup 0)
>         (any_extend:V8HI (match_dup 1)))]
>
>> {
>>   operands[1] = gen_rtx_<CODE> (V8HImode,
>>                                 gen_rtx_SUBREG (V8QImode,
>>                                                 operands[1], 0));
>> })
>
> Don't create subregs of memory. Use adjust_address_nv.
Here is the updated patch.
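
For readers following along, the point about memory operands can be
illustrated with a small scalar model in C. This is only a sketch, not
GCC code: the function names model_pmovzxbw_mem and model_pmovsxbw_mem
are made up for illustration. It assumes the memory form of
pmovzxbw/pmovsxbw reads just the 8 source bytes (an m64 operand), which
is why a V8QI memory operand describes it more accurately than a
vec_select from a 16-byte V16QI memory operand.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not part of GCC: scalar model of pmovzxbw with a
   memory source.  Only the 8 bytes at SRC are read.  */
void
model_pmovzxbw_mem (uint16_t dst[8], const void *src)
{
  uint8_t bytes[8];

  assert (dst != NULL && src != NULL);
  memcpy (bytes, src, sizeof bytes);  /* 8-byte load: V8QI, not V16QI.  */
  for (size_t i = 0; i < 8; i++)
    dst[i] = bytes[i];                /* zero_extend each QI to HI.  */
}

/* Hypothetical helper, not part of GCC: the same model for pmovsxbw.  */
void
model_pmovsxbw_mem (int16_t dst[8], const void *src)
{
  int8_t bytes[8];

  assert (dst != NULL && src != NULL);
  memcpy (bytes, src, sizeof bytes);  /* 8-byte load: V8QI, not V16QI.  */
  for (size_t i = 0; i < 8; i++)
    dst[i] = bytes[i];                /* sign_extend each QI to HI.  */
}
```

Calling either helper on an 8-byte buffer is well defined, whereas a
model that loaded all 16 bytes of a V16QI operand would read past the
end of that buffer.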
--
H.J.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-x86-Add-pmovzx-pmovsx-patterns-with-memory-operands.patch
Type: text/x-patch
Size: 31311 bytes
Desc: not available
URL: <http://gcc.gnu.org/pipermail/gcc-patches/attachments/20181026/0f7e328d/attachment.bin>