This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.
[AARCH64] fnma<mode>4: scalar vs vector and placement of neg.
- From: Andrew Pinski <pinskia at gmail dot com>
- To: GCC Mailing List <gcc at gcc dot gnu dot org>
- Date: Tue, 27 Jun 2017 19:51:58 -0700
Hi,
I was looking into why we don't produce fmls with a scalar register
as the last argument, and I found a difference in how fnma<mode>4 is
described in RTL, which I think is causing the missed optimization.
Look at the scalar version:
(define_insn "fnma<mode>4"
  [(set (match_operand:GPF_F16 0 "register_operand" "=w")
        (fma:GPF_F16
          (neg:GPF_F16 (match_operand:GPF_F16 1 "register_operand" "w"))
          (match_operand:GPF_F16 2 "register_operand" "w")
          (match_operand:GPF_F16 3 "register_operand" "w")))]
  "TARGET_FLOAT"
  "fmsub\\t%<s>0, %<s>1, %<s>2, %<s>3"
  [(set_attr "type" "fmac<stype>")]
)
vs the vector version:
(define_insn "fnma<mode>4"
  [(set (match_operand:VHSDF 0 "register_operand" "=w")
        (fma:VHSDF
          (match_operand:VHSDF 1 "register_operand" "w")
          (neg:VHSDF
            (match_operand:VHSDF 2 "register_operand" "w"))
          (match_operand:VHSDF 3 "register_operand" "0")))]
  "TARGET_SIMD"
  "fmls\\t%0.<Vtype>, %1.<Vtype>, %2.<Vtype>"
  [(set_attr "type" "neon_fp_mla_<stype><q>")]
)
Notice how the neg is in a different location in each pattern: the
scalar version negates operand 1, while the vector version negates
operand 2. What is the reason for that?
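
To make the two operand placements concrete, here is a minimal C sketch of
the two RTL shapes. It is only an illustration: model_fma, fnma_neg_op1 and
fnma_neg_op2 are hypothetical names, not anything in GCC, and model_fma is a
plain unfused multiply-add standing in for the RTL fma expression, so its
rounding can differ from a real fused operation. The point is just that
negating either multiplicand yields the same value, so the two patterns
differ in form rather than in the result they describe.

```c
/* Stand-in for the RTL expression (fma a b c), i.e. a * b + c.
   ASSUMPTION: unfused arithmetic is used here purely for illustration. */
double model_fma(double a, double b, double c)
{
    return a * b + c;
}

/* Shape of the scalar pattern: (fma (neg op1) op2 op3). */
double fnma_neg_op1(double op1, double op2, double op3)
{
    return model_fma(-op1, op2, op3);
}

/* Shape of the vector pattern: (fma op1 (neg op2) op3). */
double fnma_neg_op2(double op1, double op2, double op3)
{
    return model_fma(op1, -op2, op3);
}
```

For example, with op1 = 2.0, op2 = 3.0 and op3 = 10.0 both helpers compute
10.0 - 6.0 = 4.0.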
Thanks,
Andrew