[Bug target/87317] Missed optimisation: merging VMOVQ with operations that only use the low 8 bytes

hjl at gcc dot gnu.org gcc-bugzilla@gcc.gnu.org
Wed Nov 21 13:19:00 GMT 2018


https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87317

--- Comment #6 from hjl at gcc dot gnu.org <hjl at gcc dot gnu.org> ---
Author: hjl
Date: Wed Nov 21 13:18:54 2018
New Revision: 266342

URL: https://gcc.gnu.org/viewcvs?rev=266342&root=gcc&view=rev
Log:
x86: Add pmovzx/pmovsx patterns with memory operands

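Many x86 pmovzx/pmovsx patterns model their memory operands
incorrectly: the instruction reads only the low elements from memory,
but the pattern describes a full-width vector operand.  For example: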

(define_insn "sse4_1_<code>v8qiv8hi2<mask_name>"
  [(set (match_operand:V8HI 0 "register_operand" "=Yr,*x,v")
    (any_extend:V8HI
      (vec_select:V8QI
        (match_operand:V16QI 1 "nonimmediate_operand" "Yrm,*xm,vm")
        (parallel [(const_int 0) (const_int 1)
               (const_int 2) (const_int 3)
               (const_int 4) (const_int 5)
               (const_int 6) (const_int 7)]))))]

should be defined for memory operands as:

(define_insn "sse4_1_<code>v8qiv8hi2<mask_name>"
  [(set (match_operand:V8HI 0 "register_operand" "=Yr,*x,v")
    (any_extend:V8HI
      (match_operand:V8QI 1 "memory_operand" "m,m,m")))]
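
For context, the source-level shape behind this PR is an 8-byte load
feeding a widening operation.  The sketch below is a hypothetical
illustration and is not taken from the pr87317-*.c tests; the name
widen_u8_to_u16 is an assumption.  With the memory operand modeled as
V8QI, the load can in principle be folded into a single pmovzxbw with a
memory operand, though the instructions actually emitted depend on the
compiler version and flags.

```c
#include <assert.h>
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
#include <immintrin.h>

/* Zero-extend the 8 bytes at P into eight 16-bit lanes.
   Hypothetical example: an 8-byte load (movq) followed by pmovzxbw.  */
__attribute__((target("sse4.1")))
static void widen_u8_to_u16(const uint8_t *p, uint16_t out[8])
{
    __m128i lo = _mm_loadl_epi64((const __m128i *)p); /* loads 8 bytes only */
    __m128i w  = _mm_cvtepu8_epi16(lo);               /* low 8 bytes -> 8 words */
    _mm_storeu_si128((__m128i *)out, w);
}
#else
/* Portable fallback with the same semantics for non-x86 hosts.  */
static void widen_u8_to_u16(const uint8_t *p, uint16_t out[8])
{
    for (int i = 0; i < 8; i++)
        out[i] = p[i];
}
#endif

/* Small self-check: each output lane equals the zero-extended input byte.  */
static void widen_self_check(void)
{
    const uint8_t src[8] = {0, 1, 2, 127, 128, 200, 254, 255};
    uint16_t out[8];
    widen_u8_to_u16(src, out);
    for (int i = 0; i < 8; i++)
        assert(out[i] == src[i]);
}
```

Only 8 bytes of memory are ever read here, which is why describing the
memory form with a V16QI operand is misleading.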

This patch updates them to:

(define_insn "sse4_1_<code>v8qiv8hi2<mask_name>"
  [(set (match_operand:V8HI 0 "register_operand" "=Yr,*x,v")
        (any_extend:V8HI
          (vec_select:V8QI
            (match_operand:V16QI 1 "register_operand" "Yr,*x,v")
            (parallel [(const_int 0) (const_int 1)
                       (const_int 2) (const_int 3)
                       (const_int 4) (const_int 5)
                       (const_int 6) (const_int 7)]))))]

(define_insn "*sse4_1_<code>v8qiv8hi2<mask_name>_1"
  [(set (match_operand:V8HI 0 "register_operand" "=Yr,*x,v")
        (any_extend:V8HI
          (match_operand:V8QI 1 "subreg_memory_operand" "m,m,m")))]

with a splitter:

(define_insn_and_split "*sse4_1_<code>v8qiv8hi2<mask_name>_2"
  [(set (match_operand:V8HI 0 "register_operand")
        (any_extend:V8HI
          (vec_select:V8QI
            (subreg:V16QI
              (vec_concat:V2DI
                (match_operand:DI 1 "memory_operand")
                (const_int 0)) 0)
            (parallel [(const_int 0) (const_int 1)
                       (const_int 2) (const_int 3)
                       (const_int 4) (const_int 5)
                       (const_int 6) (const_int 7)]))))]
  "TARGET_SSE4_1
   && <mask_avx512bw_condition>
   && <mask_avx512vl_condition>
   && can_create_pseudo_p ()"
  "#"
  "&& 1"
  [(set (match_dup 0)
        (any_extend:V8HI (match_dup 1)))]
  "operands[1] = adjust_address_nv (operands[1], V8QImode, 0);")

This patch requires updating apply_subst_iterator to handle
define_insn_and_split.
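
The rewrite done by the splitter above can be checked with a small
scalar model.  This is only a sketch in portable C, not GCC code: the
function names are hypothetical, and only the zero-extending case is
modeled (the sign-extending case is analogous).  It shows that
selecting lanes 0..7 of an 8-byte load concatenated with zero gives the
same result as extending the 8 bytes at the same address directly,
which is what adjust_address_nv (operands[1], V8QImode, 0) expresses.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Model of the matched form:
   (zero_extend (vec_select:V8QI
                  (subreg:V16QI (vec_concat:V2DI mem:DI 0) 0)
                  [0..7])).  */
static void extend_via_concat(const uint8_t *mem, uint16_t out[8])
{
    uint64_t v2di[2] = {0, 0};           /* element 1 is (const_int 0) */
    uint8_t v16qi[16];

    memcpy(&v2di[0], mem, 8);            /* element 0 is the DImode load */
    memcpy(v16qi, v2di, sizeof v16qi);   /* subreg: same bits viewed as bytes */
    for (int i = 0; i < 8; i++)
        out[i] = v16qi[i];               /* select lanes 0..7, zero-extend */
}

/* Model of the replacement: (zero_extend:V8HI mem:V8QI), same address.  */
static void extend_direct(const uint8_t *mem, uint16_t out[8])
{
    for (int i = 0; i < 8; i++)
        out[i] = mem[i];
}

/* Both forms must agree for any 8 input bytes.  */
static void check_split_equivalence(const uint8_t *mem)
{
    uint16_t a[8], b[8];
    extend_via_concat(mem, a);
    extend_direct(mem, b);
    assert(memcmp(a, b, sizeof a) == 0);
}
```

The upper half of the concatenated vector never reaches the result, so
the split form needs only the 8-byte memory reference.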

gcc/

        PR target/87317
        * config/i386/sse.md (sse4_1_<code>v8qiv8hi2<mask_name>): Replace
        nonimmediate_operand with register_operand.
        (avx2_<code>v8qiv8si2<mask_name>): Likewise.
        (sse4_1_<code>v4qiv4si2<mask_name>): Likewise.
        (sse4_1_<code>v4hiv4si2<mask_name>): Likewise.
        (sse4_1_<code>v2qiv2di2<mask_name>): Likewise.
        (avx512f_<code>v8qiv8di2<mask_name>): Likewise.
        (avx2_<code>v4qiv4di2<mask_name>): Likewise.
        (avx2_<code>v4hiv4di2<mask_name>): Likewise.
        (sse4_1_<code>v2hiv2di2<mask_name>): Likewise.
        (sse4_1_<code>v2siv2di2<mask_name>): Likewise.
        (*sse4_1_<code>v8qiv8hi2<mask_name>_1): New pattern.
        (*sse4_1_<code>v8qiv8hi2<mask_name>_2): Likewise.
        (*avx2_<code>v8qiv8si2<mask_name>_1): Likewise.
        (*avx2_<code>v8qiv8si2<mask_name>_2): Likewise.
        (*sse4_1_<code>v4qiv4si2<mask_name>_1): Likewise.
        (*sse4_1_<code>v4qiv4si2<mask_name>_2): Likewise.
        (*sse4_1_<code>v4hiv4si2<mask_name>_1): Likewise.
        (*sse4_1_<code>v4hiv4si2<mask_name>_2): Likewise.
        (*avx512f_<code>v8qiv8di2<mask_name>_1): Likewise.
        (*avx512f_<code>v8qiv8di2<mask_name>_2): Likewise.
        (*avx2_<code>v4qiv4di2<mask_name>_1): Likewise.
        (*avx2_<code>v4qiv4di2<mask_name>_2): Likewise.
        (*avx2_<code>v4hiv4di2<mask_name>_1): Likewise.
        (*avx2_<code>v4hiv4di2<mask_name>_2): Likewise.
        (*sse4_1_<code>v2hiv2di2<mask_name>_1): Likewise.
        (*sse4_1_<code>v2hiv2di2<mask_name>_2): Likewise.
        (*sse4_1_<code>v2siv2di2<mask_name>_1): Likewise.
        (*sse4_1_<code>v2siv2di2<mask_name>_2): Likewise.

gcc/testsuite/

        PR target/87317
        * gcc.target/i386/pr87317-1.c: New file.
        * gcc.target/i386/pr87317-2.c: Likewise.
        * gcc.target/i386/pr87317-3.c: Likewise.
        * gcc.target/i386/pr87317-4.c: Likewise.
        * gcc.target/i386/pr87317-5.c: Likewise.
        * gcc.target/i386/pr87317-6.c: Likewise.
        * gcc.target/i386/pr87317-7.c: Likewise.
        * gcc.target/i386/pr87317-8.c: Likewise.
        * gcc.target/i386/pr87317-9.c: Likewise.
        * gcc.target/i386/pr87317-10.c: Likewise.
        * gcc.target/i386/pr87317-11.c: Likewise.
        * gcc.target/i386/pr87317-12.c: Likewise.
        * gcc.target/i386/pr87317-13.c: Likewise.

Added:
    trunk/gcc/testsuite/gcc.target/i386/pr87317-1.c
    trunk/gcc/testsuite/gcc.target/i386/pr87317-10.c
    trunk/gcc/testsuite/gcc.target/i386/pr87317-11.c
    trunk/gcc/testsuite/gcc.target/i386/pr87317-12.c
    trunk/gcc/testsuite/gcc.target/i386/pr87317-13.c
    trunk/gcc/testsuite/gcc.target/i386/pr87317-2.c
    trunk/gcc/testsuite/gcc.target/i386/pr87317-3.c
    trunk/gcc/testsuite/gcc.target/i386/pr87317-4.c
    trunk/gcc/testsuite/gcc.target/i386/pr87317-5.c
    trunk/gcc/testsuite/gcc.target/i386/pr87317-6.c
    trunk/gcc/testsuite/gcc.target/i386/pr87317-7.c
    trunk/gcc/testsuite/gcc.target/i386/pr87317-8.c
    trunk/gcc/testsuite/gcc.target/i386/pr87317-9.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/i386/sse.md
    trunk/gcc/testsuite/ChangeLog

