V4 [PATCH] x86: Add pmovzx/pmovsx patterns with memory operands

H.J. Lu hjl.tools@gmail.com
Fri Oct 26 07:39:00 GMT 2018


On 10/25/18, Uros Bizjak <ubizjak@gmail.com> wrote:
> On Fri, Oct 26, 2018 at 8:07 AM H.J. Lu <hjl.tools@gmail.com> wrote:
>>
>> Many x86 pmovzx/pmovsx instructions with memory operands are modeled
>> incorrectly.  For example:
>>
>> (define_insn "sse4_1_<code>v8qiv8hi2<mask_name>"
>>   [(set (match_operand:V8HI 0 "register_operand" "=Yr,*x,v")
>>     (any_extend:V8HI
>>       (vec_select:V8QI
>>         (match_operand:V16QI 1 "nonimmediate_operand" "Yrm,*xm,vm")
>>         (parallel [(const_int 0) (const_int 1)
>>                (const_int 2) (const_int 3)
>>                (const_int 4) (const_int 5)
>>                (const_int 6) (const_int 7)]))))]
>>
>> should be defined for memory operands as:
>>
>> (define_insn "sse4_1_<code>v8qiv8hi2<mask_name>"
>>   [(set (match_operand:V8HI 0 "register_operand" "=Yr,*x,v")
>>     (any_extend:V8HI
>>       (match_operand:V8QI 1 "memory_operand" "m,m,m")))]
>>
>> This set of patches updates them to
>>
>> (define_insn "sse4_1_<code>v8qiv8hi2<mask_name>"
>>   [(set (match_operand:V8HI 0 "register_operand" "=Yr,*x,v")
>>     (any_extend:V8HI
>>       (vec_select:V8QI
>>         (match_operand:V16QI 1 "nonimmediate_operand" "Yr,*x,v")
>>         (parallel [(const_int 0) (const_int 1)
>>                (const_int 2) (const_int 3)
>>                (const_int 4) (const_int 5)
>>                (const_int 6) (const_int 7)]))))]
>>
>> (define_insn "*sse4_1_<code>v8qiv8hi2<mask_name>_1"
>>   [(set (match_operand:V8HI 0 "register_operand" "=Yr,*x,v")
>>     (any_extend:V8HI
>>       (match_operand:V8QI 1 "subreg_memory_operand" "m,m,m")))]
>>
>> with a splitter:
>>
>> (define_insn_and_split "*sse4_1_<code>v8qiv8hi2<mask_name>_2"
>>   [(set (match_operand:V8HI 0 "register_operand" "=Yr,*x,v")
>
> No constraints needed for pre-reload splitter.
>
>>         (any_extend:V8HI
>>           (vec_select:V8QI
>>             (subreg:V16QI
>>               (vec_concat:V2DI
>>                 (match_operand:DI 1 "memory_operand" "m,*m,m")
>>                 (const_int 0)) 0)
>>             (parallel [(const_int 0) (const_int 1)
>>                        (const_int 2) (const_int 3)
>>                        (const_int 4) (const_int 5)
>>                        (const_int 6) (const_int 7)]))))]
>>   "TARGET_SSE4_1 && <mask_avx512bw_condition> && <mask_avx512vl_condition>"
>>   "#"
>>   "&& can_create_pseudo_p ()"
>>   [(set (match_dup 0) (match_dup 1))]
>
>  [(set (match_dup 0)
>       (any_extend:V8HI (match_dup 1)))]
>
>> {
>>   operands[1] = gen_rtx_<CODE> (V8HImode,
>>                                 gen_rtx_SUBREG (V8QImode,
>>                                                 operands[1], 0));
>> })
>
> Don't create subregs of memory. Use adjust_address_nv.
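
With both comments applied, the tail of the splitter would look roughly
like the sketch below.  This is only an illustration of the requested
change; the attached patch is authoritative and may differ in detail.

```
  "#"
  "&& can_create_pseudo_p ()"
  [(set (match_dup 0)
	(any_extend:V8HI (match_dup 1)))]
{
  /* Re-type the DImode memory reference as V8QImode at offset 0
     instead of wrapping the MEM in a SUBREG.  */
  operands[1] = adjust_address_nv (operands[1], V8QImode, 0);
})
```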

Here is the updated patch.
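
For reference, the semantics the memory form has to model can be
sketched in plain C.  The function names below are hypothetical and are
not part of the patch; they only illustrate that the memory source is
8 bytes wide (V8QI), not 16 (V16QI).

```c
#include <stdint.h>

/* Hypothetical scalar model of pmovzxbw with a memory source:
   reads exactly 8 bytes and zero-extends each one to 16 bits.  */
static void
model_pmovzxbw_mem (uint16_t dst[8], const uint8_t *src)
{
  for (int i = 0; i < 8; i++)
    dst[i] = (uint16_t) src[i];
}

/* Hypothetical scalar model of pmovsxbw with a memory source:
   reads exactly 8 bytes and sign-extends each one to 16 bits.  */
static void
model_pmovsxbw_mem (int16_t dst[8], const int8_t *src)
{
  for (int i = 0; i < 8; i++)
    dst[i] = (int16_t) src[i];
}
```

Neither function touches src[8..15], which is why a V16QI memory
operand overstates what the instruction reads.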

-- 
H.J.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-x86-Add-pmovzx-pmovsx-patterns-with-memory-operands.patch
Type: text/x-patch
Size: 31311 bytes
Desc: not available
URL: <http://gcc.gnu.org/pipermail/gcc-patches/attachments/20181026/0f7e328d/attachment.bin>

