This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH 0/2] Update apply_subst_iterator and fix x86 pmovzx/pmovsx patterns


On Fri, Oct 26, 2018 at 8:07 AM H.J. Lu <hjl.tools@gmail.com> wrote:
>
> Many x86 pmovzx/pmovsx instructions with memory operands are modeled in
> a wrong way.  For example:
>
> (define_insn "sse4_1_<code>v8qiv8hi2<mask_name>"
>   [(set (match_operand:V8HI 0 "register_operand" "=Yr,*x,v")
>     (any_extend:V8HI
>       (vec_select:V8QI
>         (match_operand:V16QI 1 "nonimmediate_operand" "Yrm,*xm,vm")
>         (parallel [(const_int 0) (const_int 1)
>                (const_int 2) (const_int 3)
>                (const_int 4) (const_int 5)
>                (const_int 6) (const_int 7)]))))]
>
> should be defind for memory operands as:
>
> (define_insn "sse4_1_<code>v8qiv8hi2<mask_name>"
>   [(set (match_operand:V8HI 0 "register_operand" "=Yr,*x,v")
>     (any_extend:V8HI
>       (match_operand:V8QI "memory_operand" "m,m,m")))]
>
> This set of patches updates them to
>
> (define_insn "sse4_1_<code>v8qiv8hi2<mask_name>"
>   [(set (match_operand:V8HI 0 "register_operand" "=Yr,*x,v")
>     (any_extend:V8HI
>       (vec_select:V8QI
>         (match_operand:V16QI 1 "nonimmediate_operand" "Yr,*x,v")
>         (parallel [(const_int 0) (const_int 1)
>                (const_int 2) (const_int 3)
>                (const_int 4) (const_int 5)
>                (const_int 6) (const_int 7)]))))]
>
> (define_insn "*sse4_1_<code>v8qiv8hi2<mask_name>_1"
>   [(set (match_operand:V8HI 0 "register_operand" "=Yr,*x,v")
>     (any_extend:V8HI
>       (match_operand:V8QI "subreg_memory_operand" "m,m,m")))]
>
> with a splitter:
>
> (define_insn_and_split "*sse4_1_<code>v8qiv8hi2<mask_name>_2"
>   [(set (match_operand:V8HI 0 "register_operand" "=Yr,*x,v")

No constraints needed for pre-reload splitter.

>         (any_extend:V8HI
>           (vec_select:V8QI
>             (subreg:V16QI
>               (vec_concat:V2DI
>                 (match_operand:DI 1 "memory_operand" "m,*m,m")
>                 (const_int 0)) 0)
>             (parallel [(const_int 0) (const_int 1)
>                        (const_int 2) (const_int 3)
>                        (const_int 4) (const_int 5)
>                        (const_int 6) (const_int 7)]))))]
>   "TARGET_SSE4_1 && <mask_avx512bw_condition> && <mask_avx512vl_condition>"
>   "#"
>   "&& can_create_pseudo_p ()"
>   [(set (match_dup 0) (match_dup 1))]

 [(set (match_dup 0)
      (any_extend:V8HI (match_dup 1)))]

> {
>   operands[1] = gen_rtx_<CODE> (V8HImode,
>                                 gen_rtx_SUBREG (V8QImode,
>                                                 operands[1], 0));
> })

Don't create subregs of memory. Use adjust_address_nv.

Uros.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]