[Bug target/96201] x86 movsd/movsq string instructions and alignment inference

crazylht at gmail dot com gcc-bugzilla@gcc.gnu.org
Wed Jul 15 05:42:07 GMT 2020


https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96201

Hongtao.liu <crazylht at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |crazylht at gmail dot com

--- Comment #1 from Hongtao.liu <crazylht at gmail dot com> ---
The issue is caused by pass_ivopt: ivopts selects only one IV for f3(dn), which
seems suboptimal, and selects two IVs for f4(sn,dn), which seems optimal.
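
To make the difference concrete, here is a rough C sketch of the two loop
shapes the dumps below suggest. It is not the pr96201.c test case (which is
not quoted in this comment); the function names, the `off` parameter, and the
use of an end pointer are assumptions made for illustration only.

```c
#include <stddef.h>

/* Hypothetical one-IV shape (as in f3): only dn advances. The source
   address is recomputed each iteration as dn plus a loop-invariant
   element offset, so the load and the store use different base forms. */
void copy_one_iv(int *dn, ptrdiff_t off, size_t n)
{
    int *end = dn + n;
    while (dn != end) {
        *dn = dn[off];   /* load from dn + off, store to dn */
        dn++;            /* the single induction variable, step 4 bytes */
    }
}

/* Hypothetical two-IV shape (as in f4): sn and dn are separate
   induction variables, each stepping by 4 bytes per iteration. */
void copy_two_iv(const int *sn, int *dn, size_t n)
{
    int *end = dn + n;
    while (dn != end)
        *dn++ = *sn++;
}
```

Both functions copy the same data when sn == dn + off; the point is only that
the second form keeps two pointers that are each bumped by the element size.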
---
loop in f3:

Selected IV set for loop 1 at pr96201.c:25, 10 avg niters, 1 IVs:
Candidate 8:
  Var befor: dn_24
  Var after: dn_18
  Incr POS: orig biv
  IV struct:
    Type:       int *
    Base:       (int *) _3
    Step:       4
    Biv:        N
    Overflowness wrto loop niter:       Overflow

loop in f4:

Selected IV set for loop 1 at pr96201.c:34, 10 avg niters, 2 IVs:
Candidate 6:
  Var befor: sn_26
  Var after: sn_20
  Incr POS: orig biv
  IV struct:
    Type:       int *
    Base:       sn_14
    Step:       4
    Object:     (void *) sn_14
    Biv:        N
    Overflowness wrto loop niter:       Overflow
Candidate 8:
  Var befor: dn_27
  Var after: dn_21
  Incr POS: orig biv
  IV struct:
    Type:       int *
    Base:       dn_16
    Step:       4
    Object:     (void *) dn_16
    Biv:        N
    Overflowness wrto loop niter:       Overflow

---

As a result, more instructions are generated for f3, and pass_combine fails to
combine them.

---
loop in f3:

Trying 19 -> 22:
   19: r83:DI=r92:DI
   22: [r83:DI]=r89:SI
      REG_DEAD r89:SI
      REG_DEAD r83:DI
Can't combine i2 into i3

Trying 21 -> 22:
   21: r89:SI=[r93:DI]
      REG_DEAD r93:DI
   22: [r83:DI]=r89:SI
      REG_DEAD r89:SI
      REG_DEAD r83:DI
Failed to match this instruction:
(set (mem:SI (reg/v/f:DI 83 [ dn ]) [1 *dn_2+0 S4 A32])
    (mem:SI (reg/f:DI 93 [ _20 ]) [1 *_20+0 S4 A32]))

Trying 18, 21 -> 22:
   18: {r93:DI=r92:DI+r102:DI;clobber flags:CC;}
      REG_UNUSED flags:CC
   21: r89:SI=[r93:DI]
      REG_DEAD r93:DI
   22: [r83:DI]=r89:SI
      REG_DEAD r89:SI
      REG_DEAD r83:DI
Can't combine i1 into i3

Trying 21, 19 -> 22:
   21: r89:SI=[r93:DI]
      REG_DEAD r93:DI
   19: r83:DI=r92:DI
   22: [r83:DI]=r89:SI
      REG_DEAD r89:SI
      REG_DEAD r83:DI
Can't combine i1 into i3

(insn 18 16 19 4 (parallel [
            (set (reg/f:DI 93 [ _20 ])
                (plus:DI (reg/v/f:DI 92 [ dn ])
                    (reg:DI 102)))
            (clobber (reg:CC 17 flags))
        ]) 210 {*adddi_1}
     (expr_list:REG_UNUSED (reg:CC 17 flags)
        (nil)))
(insn 19 18 20 4 (set (reg/v/f:DI 83 [ dn ])
        (reg/v/f:DI 92 [ dn ])) 74 {*movdi_internal}
     (nil))
(insn 20 19 21 4 (parallel [
            (set (reg/v/f:DI 92 [ dn ])
                (plus:DI (reg/v/f:DI 92 [ dn ])
                    (const_int 4 [0x4])))
            (clobber (reg:CC 17 flags))
        ]) "pr96201.c":25:24 210 {*adddi_1}
     (expr_list:REG_UNUSED (reg:CC 17 flags)
        (nil)))
(insn 21 20 22 4 (set (reg:SI 89 [ _9 ])
        (mem:SI (reg/f:DI 93 [ _20 ]) [1 *_20+0 S4 A32])) "pr96201.c":25:29 75 {*movsi_internal}
     (expr_list:REG_DEAD (reg/f:DI 93 [ _20 ])
        (nil)))
(insn 22 21 24 4 (set (mem:SI (reg/v/f:DI 83 [ dn ]) [1 *dn_2+0 S4 A32])
        (reg:SI 89 [ _9 ])) "pr96201.c":25:27 75 {*movsi_internal}
     (expr_list:REG_DEAD (reg:SI 89 [ _9 ])
        (expr_list:REG_DEAD (reg/v/f:DI 83 [ dn ])
            (nil))))



loop in f4:

Trying 16, 18, 17 -> 19:
   16: {r89:DI=r89:DI+0x4;clobber flags:CC;}
      REG_UNUSED flags:CC
   18: r88:SI=[r89:DI-0x4]
   17: {r90:DI=r90:DI+0x4;clobber flags:CC;}
      REG_UNUSED flags:CC
   19: [r90:DI-0x4]=r88:SI
      REG_DEAD r88:SI
Successfully matched this instruction:
(parallel [
        (set (mem:SI (reg/v/f:DI 90 [ dn ]) [1 MEM[base: dn_21, offset: -4B]+0 S4 A32])
            (mem:SI (reg/v/f:DI 89 [ sn ]) [1 MEM[base: sn_20, offset: -4B]+0 S4 A32]))
        (set (reg/v/f:DI 90 [ dn ])
            (plus:DI (reg/v/f:DI 90 [ dn ])
                (const_int 4 [0x4])))
        (set (reg/v/f:DI 89 [ sn ])
            (plus:DI (reg/v/f:DI 89 [ sn ])
                (const_int 4 [0x4])))
    ])

(insn 16 15 17 3 (parallel [
            (set (reg/v/f:DI 89 [ sn ])
                (plus:DI (reg/v/f:DI 89 [ sn ])
                    (const_int 4 [0x4])))
            (clobber (reg:CC 17 flags))
        ]) "pr96201.c":34:32 210 {*adddi_1}
     (expr_list:REG_UNUSED (reg:CC 17 flags)
        (nil)))
(insn 17 16 18 3 (parallel [
            (set (reg/v/f:DI 90 [ dn ])
                (plus:DI (reg/v/f:DI 90 [ dn ])
                    (const_int 4 [0x4])))
            (clobber (reg:CC 17 flags))
        ]) "pr96201.c":34:24 210 {*adddi_1}
     (expr_list:REG_UNUSED (reg:CC 17 flags)
        (nil)))
(insn 18 17 19 3 (set (reg:SI 88 [ _9 ])
        (mem:SI (plus:DI (reg/v/f:DI 89 [ sn ])
                (const_int -4 [0xfffffffffffffffc])) [1 MEM[base: sn_20, offset: -4B]+0 S4 A32])) "pr96201.c":34:29 75 {*movsi_internal}
     (nil))
(insn 19 18 21 3 (set (mem:SI (plus:DI (reg/v/f:DI 90 [ dn ])
                (const_int -4 [0xfffffffffffffffc])) [1 MEM[base: dn_21, offset: -4B]+0 S4 A32])
        (reg:SI 88 [ _9 ])) "pr96201.c":34:27 75 {*movsi_internal}
     (expr_list:REG_DEAD (reg:SI 88 [ _9 ])
        (nil)))


---
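
For readers less used to RTL, the parallel that combine matched for f4 does
three things at once: it copies one 4-byte element from sn to dn and advances
both pointers by 4. That is the same effect an x86 movsd string instruction
has with the direction flag clear. A minimal C model of that single step might
look like the following; the function name is hypothetical and the code only
illustrates the semantics, not what GCC emits.

```c
/* Hypothetical model of the combined pattern: one element is copied
   and both pointers are post-incremented, all as a single step. */
void copy_step(const int **sn, int **dn)
{
    **dn = **sn;   /* (set (mem dn) (mem sn)) */
    (*dn)++;       /* (set dn (plus dn 4))    */
    (*sn)++;       /* (set sn (plus sn 4))    */
}
```

In f3 there is no second pointer being incremented alongside dn, so the
instructions do not have this shape and the attempts above are rejected.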

