This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.
[Bug target/82074] [aarch64] vmlsq_f32 compiled into 2 instructions
- From: "ktkachov at gcc dot gnu.org" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: Fri, 01 Sep 2017 14:49:57 +0000
- Subject: [Bug target/82074] [aarch64] vmlsq_f32 compiled into 2 instructions
- Auto-submitted: auto-generated
- References: <bug-82074-4@http.gcc.gnu.org/bugzilla/>
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82074
ktkachov at gcc dot gnu.org changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Target|                            |aarch64
             Status|UNCONFIRMED                 |NEW
           Keywords|TREE                        |missed-optimization
   Last reconfirmed|                            |2017-09-01
          Component|tree-optimization           |target
                 CC|                            |ktkachov at gcc dot gnu.org
     Ever confirmed|0                           |1
      Known to fail|                            |4.9.4, 5.4.1, 6.4.1, 7.2.1, 8.0
--- Comment #1 from ktkachov at gcc dot gnu.org ---
Confirmed on all releases that I have access to.
Interestingly, things go wrong in combine. Before combine, the correct RTL is
formed and the expected fnmav4sf4 insn is matched:
(insn 8 5 13 2 (set (reg:V4SF 78)
        (fma:V4SF (reg/v:V4SF 76 [ b ])
            (neg:V4SF (reg/v:V4SF 77 [ c ]))
            (reg/v:V4SF 75 [ a ]))) "vmls.c":25 1562 {fnmav4sf4}
     (expr_list:REG_DEAD (reg/v:V4SF 77 [ c ])
        (expr_list:REG_DEAD (reg/v:V4SF 76 [ b ])
            (expr_list:REG_DEAD (reg/v:V4SF 75 [ a ])
                (nil)))))
but after combine we end up with:
(insn 4 3 5 2 (set (reg/v:V4SF 77 [ c ])
        (neg:V4SF (reg:V4SF 34 v2 [ c ]))) "vmls.c":24 1532 {negv4sf2}
     (expr_list:REG_DEAD (reg:V4SF 34 v2 [ c ])
        (nil)))
(insn 13 8 14 2 (set (reg/i:V4SF 32 v0)
        (fma:V4SF (reg/v:V4SF 77 [ c ])
            (reg:V4SF 33 v1 [ b ])
            (reg:V4SF 32 v0 [ a ]))) "vmls.c":26 1542 {fmav4sf4}
     (expr_list:REG_DEAD (reg/v:V4SF 77 [ c ])
        (expr_list:REG_DEAD (reg:V4SF 33 v1 [ b ])
            (nil))))
Combine tries and fails to match:
Trying 2 -> 8:
Failed to match this instruction:
(set (reg:V4SF 78)
    (fma:V4SF (neg:V4SF (reg/v:V4SF 77 [ c ]))
        (reg/v:V4SF 76 [ b ])
        (reg:V4SF 32 v0 [ a ])))
What I think is going on is that the target pattern for fnmav4sf4 specifies
non-canonical RTL. In:
(fma:V4SF (reg/v:V4SF 76 [ b ])
    (neg:V4SF (reg/v:V4SF 77 [ c ]))
    (reg/v:V4SF 75 [ a ]))
the first two operands of the fma are the multiplication operands, which are
commutative. By the RTL canonicalization rules, the more complex expression
must go into the first operand, and here that is the neg.
So combine/simplify-rtx canonicalizes the expression, tries to match it, and
then breaks it up when it doesn't match.
I believe the solution here is to fix the RTL pattern of the fnma<mode>4 insn
in aarch64-simd.md so that it uses the canonical operand order.
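
The mismatch described above can be modelled with a small toy in C. This is
only a sketch and not GCC code: the toy_* types and functions are hypothetical,
operands are reduced to "plain register" or "negated register", and "more
complex" is assumed to mean simply that a neg outranks a reg.

```c
#include <stdbool.h>

/* Hypothetical, heavily simplified operand kinds (real RTL has many more). */
enum toy_code { TOY_REG, TOY_NEG };

/* Only the two multiplication operands of an fma are modelled here. */
struct toy_fma {
    enum toy_code op0;  /* first multiplication operand  */
    enum toy_code op1;  /* second multiplication operand */
};

/* Assumed ranking: a NEG wraps a REG, so it counts as more complex. */
int toy_precedence(enum toy_code code)
{
    return code == TOY_NEG ? 1 : 0;
}

/* Put the more complex multiplication operand first.
   Returns true if the operands were swapped. */
bool toy_canonicalize(struct toy_fma *x)
{
    if (toy_precedence(x->op1) > toy_precedence(x->op0)) {
        enum toy_code tmp = x->op0;
        x->op0 = x->op1;
        x->op1 = tmp;
        return true;
    }
    return false;
}

/* A pattern matches only if the operand kinds agree position by position. */
bool toy_matches(const struct toy_fma *pattern, const struct toy_fma *x)
{
    return pattern->op0 == x->op0 && pattern->op1 == x->op1;
}
```

In this model a pattern written as { TOY_REG, TOY_NEG } matches the expression
as originally formed, but stops matching once toy_canonicalize has moved the
neg to the front. A pattern written as { TOY_NEG, TOY_REG } matches the
canonicalized form, which mirrors the fix suggested above.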