This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.



[Bug target/82074] [aarch64] vmlsq_f32 compiled into 2 instructions


https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82074

ktkachov at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Target|                            |aarch64
             Status|UNCONFIRMED                 |NEW
           Keywords|TREE                        |missed-optimization
   Last reconfirmed|                            |2017-09-01
          Component|tree-optimization           |target
                 CC|                            |ktkachov at gcc dot gnu.org
     Ever confirmed|0                           |1
      Known to fail|                            |4.9.4, 5.4.1, 6.4.1, 7.2.1,
                   |                            |8.0

--- Comment #1 from ktkachov at gcc dot gnu.org ---
Confirmed on all releases that I have access to.
Interestingly, things go wrong in the combine pass. Before combine, the correct
RTL is formed and the expected fnmav4sf4 insn is matched:

(insn 8 5 13 2 (set (reg:V4SF 78)
        (fma:V4SF (reg/v:V4SF 76 [ b ])
            (neg:V4SF (reg/v:V4SF 77 [ c ]))
            (reg/v:V4SF 75 [ a ]))) "vmls.c":25 1562 {fnmav4sf4}
     (expr_list:REG_DEAD (reg/v:V4SF 77 [ c ])
        (expr_list:REG_DEAD (reg/v:V4SF 76 [ b ])
            (expr_list:REG_DEAD (reg/v:V4SF 75 [ a ])
                (nil)))))


But after combine we end up with a separate neg followed by a plain fma:
(insn 4 3 5 2 (set (reg/v:V4SF 77 [ c ])
        (neg:V4SF (reg:V4SF 34 v2 [ c ]))) "vmls.c":24 1532 {negv4sf2}
     (expr_list:REG_DEAD (reg:V4SF 34 v2 [ c ])
        (nil)))

(insn 13 8 14 2 (set (reg/i:V4SF 32 v0)
        (fma:V4SF (reg/v:V4SF 77 [ c ])
            (reg:V4SF 33 v1 [ b ])
            (reg:V4SF 32 v0 [ a ]))) "vmls.c":26 1542 {fmav4sf4}
     (expr_list:REG_DEAD (reg/v:V4SF 77 [ c ])
        (expr_list:REG_DEAD (reg:V4SF 33 v1 [ b ])
            (nil))))


Combine tries and fails to match:
Trying 2 -> 8:
Failed to match this instruction:
(set (reg:V4SF 78)
    (fma:V4SF (neg:V4SF (reg/v:V4SF 77 [ c ]))
        (reg/v:V4SF 76 [ b ])
        (reg:V4SF 32 v0 [ a ])))


What I think is going on is that the target pattern for fnmav4sf4 specifies
non-canonical RTL because in:
        (fma:V4SF (reg/v:V4SF 76 [ b ])
            (neg:V4SF (reg/v:V4SF 77 [ c ]))
            (reg/v:V4SF 75 [ a ])))

The first two operands of an fma are the multiplication operands, which are
commutative, so by the RTL canonicalization rules the more complex expression
must go into the first operand. Here that is the neg.
Combine/simplify-rtx therefore canonicalizes the expression, tries to match it,
and splits it back up when no pattern matches.
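
To illustrate the ordering rule described above, here is a minimal toy model in
C. It is only a sketch: the types and function names (struct operand,
precedence, canonicalize_fma) are hypothetical and are not GCC's real data
structures or API, and GCC's actual precedence rules are more detailed. It
shows only that a neg in the second multiplication slot gets moved to the first.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy operand kinds; NOT GCC's real rtx codes. */
enum kind { KIND_REG, KIND_NEG };

struct operand {
    enum kind kind;
    char name;              /* register label, e.g. 'a', 'b', 'c' */
};

/* Models (fma mul1 mul2 addend). */
struct fma_expr {
    struct operand mul1, mul2, addend;
};

/* Hypothetical precedence: a complex expression (neg) ranks above a
   plain register. */
static int precedence(struct operand op)
{
    return op.kind == KIND_NEG ? 1 : 0;
}

/* Put the more complex multiplication operand first.
   Returns true if the two operands were swapped. */
static bool canonicalize_fma(struct fma_expr *e)
{
    assert(e != NULL);
    if (precedence(e->mul2) > precedence(e->mul1)) {
        struct operand tmp = e->mul1;
        e->mul1 = e->mul2;
        e->mul2 = tmp;
        return true;
    }
    return false;
}
```

Applied to the shape matched before combine, (fma b (neg c) a), this model
swaps the operands and yields (fma (neg c) b a), which is the shape combine
then fails to match against the existing pattern.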

I believe the solution here is to fix the RTL pattern of the fnma<mode>4 insn
in aarch64-simd.md so that the neg is the first multiplication operand.
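
Reordering the multiplication operands should not change what the pattern
computes. The following scalar C sketch, a lane-by-lane model and not the
actual machine description, compares the two operand orders. The function names
are made up for illustration, and the model uses a separate multiply and add
rather than a fused operation.

```c
#include <assert.h>
#include <stddef.h>

#define LANES 4  /* V4SF: four single-precision lanes */

/* Operand order as in the insn matched before combine: b * (-c) + a. */
static void fnma_neg_second(const float *a, const float *b, const float *c,
                            float *out)
{
    for (size_t i = 0; i < LANES; i++)
        out[i] = b[i] * (-c[i]) + a[i];
}

/* Canonical operand order, neg first: (-c) * b + a. */
static void fnma_neg_first(const float *a, const float *b, const float *c,
                           float *out)
{
    for (size_t i = 0; i < LANES; i++)
        out[i] = (-c[i]) * b[i] + a[i];
}

/* Asserts that both operand orders agree in every lane. */
static void check_orders_agree(const float *a, const float *b, const float *c)
{
    float x[LANES], y[LANES];
    fnma_neg_second(a, b, c, x);
    fnma_neg_first(a, b, c, y);
    for (size_t i = 0; i < LANES; i++)
        assert(x[i] == y[i]);
}
```

Since multiplication is commutative, both functions produce identical lanes, so
a pattern written with the neg first would describe the same operation.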
