This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
Re: [PATCH 0/2] Update apply_subst_iterator and fix x86 pmovzx/pmovsx patterns
- From: Uros Bizjak <ubizjak at gmail dot com>
- To: "H. J. Lu" <hjl dot tools at gmail dot com>
- Cc: "gcc-patches at gcc dot gnu dot org" <gcc-patches at gcc dot gnu dot org>, Eric Botcazou <ebotcazou at libertysurf dot fr>
- Date: Fri, 26 Oct 2018 08:36:08 +0200
- Subject: Re: [PATCH 0/2] Update apply_subst_iterator and fix x86 pmovzx/pmovsx patterns
- References: <20181026060319.28506-1-hjl.tools@gmail.com>
On Fri, Oct 26, 2018 at 8:07 AM H.J. Lu <hjl.tools@gmail.com> wrote:
>
> Many x86 pmovzx/pmovsx instructions with memory operands are modeled in
> a wrong way. For example:
>
> (define_insn "sse4_1_<code>v8qiv8hi2<mask_name>"
> [(set (match_operand:V8HI 0 "register_operand" "=Yr,*x,v")
> (any_extend:V8HI
> (vec_select:V8QI
> (match_operand:V16QI 1 "nonimmediate_operand" "Yrm,*xm,vm")
> (parallel [(const_int 0) (const_int 1)
> (const_int 2) (const_int 3)
> (const_int 4) (const_int 5)
> (const_int 6) (const_int 7)]))))]
>
> should be defined for memory operands as:
>
> (define_insn "sse4_1_<code>v8qiv8hi2<mask_name>"
> [(set (match_operand:V8HI 0 "register_operand" "=Yr,*x,v")
> (any_extend:V8HI
> (match_operand:V8QI 1 "memory_operand" "m,m,m")))]
>
> This set of patches updates them to
>
> (define_insn "sse4_1_<code>v8qiv8hi2<mask_name>"
> [(set (match_operand:V8HI 0 "register_operand" "=Yr,*x,v")
> (any_extend:V8HI
> (vec_select:V8QI
> (match_operand:V16QI 1 "nonimmediate_operand" "Yr,*x,v")
> (parallel [(const_int 0) (const_int 1)
> (const_int 2) (const_int 3)
> (const_int 4) (const_int 5)
> (const_int 6) (const_int 7)]))))]
>
> (define_insn "*sse4_1_<code>v8qiv8hi2<mask_name>_1"
> [(set (match_operand:V8HI 0 "register_operand" "=Yr,*x,v")
> (any_extend:V8HI
> (match_operand:V8QI 1 "subreg_memory_operand" "m,m,m")))]
>
> with a splitter:
>
> (define_insn_and_split "*sse4_1_<code>v8qiv8hi2<mask_name>_2"
> [(set (match_operand:V8HI 0 "register_operand" "=Yr,*x,v")
No constraints are needed for a pre-reload splitter.
> (any_extend:V8HI
> (vec_select:V8QI
> (subreg:V16QI
> (vec_concat:V2DI
> (match_operand:DI 1 "memory_operand" "m,*m,m")
> (const_int 0)) 0)
> (parallel [(const_int 0) (const_int 1)
> (const_int 2) (const_int 3)
> (const_int 4) (const_int 5)
> (const_int 6) (const_int 7)]))))]
> "TARGET_SSE4_1 && <mask_avx512bw_condition> && <mask_avx512vl_condition>"
> "#"
> "&& can_create_pseudo_p ()"
> [(set (match_dup 0) (match_dup 1))]
[(set (match_dup 0)
(any_extend:V8HI (match_dup 1)))]
> {
> operands[1] = gen_rtx_<CODE> (V8HImode,
> gen_rtx_SUBREG (V8QImode,
> operands[1], 0));
> })
Don't create subregs of memory. Use adjust_address_nv.
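For illustration, folding the three comments above into the quoted splitter gives roughly the following. This is an untested sketch, not a finished pattern: the name, insn condition and split condition are copied unchanged from the quoted patch, the constraints are simply dropped from both match_operands, and the call adjust_address_nv (operands[1], V8QImode, 0) is an assumption about how the DImode memory reference would be re-expressed in V8QImode at offset 0.

```
(define_insn_and_split "*sse4_1_<code>v8qiv8hi2<mask_name>_2"
  [(set (match_operand:V8HI 0 "register_operand")
        (any_extend:V8HI
          (vec_select:V8QI
            (subreg:V16QI
              (vec_concat:V2DI
                (match_operand:DI 1 "memory_operand")
                (const_int 0)) 0)
            (parallel [(const_int 0) (const_int 1)
                       (const_int 2) (const_int 3)
                       (const_int 4) (const_int 5)
                       (const_int 6) (const_int 7)]))))]
  "TARGET_SSE4_1 && <mask_avx512bw_condition> && <mask_avx512vl_condition>"
  "#"
  "&& can_create_pseudo_p ()"
  [(set (match_dup 0)
        (any_extend:V8HI (match_dup 1)))]
  ;; Re-address the same memory in V8QImode instead of wrapping it
  ;; in a SUBREG (assumed form of the adjust_address_nv call).
  "operands[1] = adjust_address_nv (operands[1], V8QImode, 0);")
```

With the extension code moved into the split template, the preparation statement only has to change the mode of the memory reference.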
Uros.