This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
Re: [PATCH] Fix up _mm_f{,n}m{add,sub}_s{s,d} (PR target/54564)
On 09/13/2012 11:49 AM, Jakub Jelinek wrote:
> On Thu, Sep 13, 2012 at 11:25:42AM -0700, Richard Henderson wrote:
>> (1) Negating the second argument is arguably non-canonical rtl.
>
> That is why I've put in the *fmai_fnm{add,sub}_<mode> patterns
> operands 2 with the neg as first operand of the FMA rtl. That way it is
> canonical (otherwise it didn't match in combine). The FMA rtl operand
> order doesn't need to imply the order of instruction operands.
Sorry, I didn't read the unidiff properly.
>   (fma:VF_128
>     (match_operand:VF_128 1 "nonimmediate_operand" " 0, 0")
>     (match_operand:VF_128 2 "nonimmediate_operand" "xm, x")
>     (match_operand:VF_128 3 "nonimmediate_operand" " x,xm"))
>   (match_operand:VF_128 4 "nonimmediate_operand" " 0, 0")
...
> which was apparently too much for reload (supposedly the two "0" constraint
> operands, even when the expander used (match_dup 1)).
Yes. We'd have to have two different patterns to "properly" support fma4.
Though I suppose, now that I think about it, this is extremely similar to
the vfmadd231 case: in order to generate

	vfmaddss %xmm3, %xmm2, %xmm1, %xmm0

given the semantics of the builtin, we'd have had to emit a copy of %xmm1
or %xmm2 into %xmm0 anyway. So we might as well not support this and just do
(define_insn "*fmai_fmadd_<mode>"
  [(set (match_operand:VF_128 0 "register_operand" "=x,x,x,x")
        (vec_merge:VF_128
          (fma:VF_128
            (match_operand:VF_128 1 "nonimmediate_operand" "%0, 0, 0,0")
            (match_operand:VF_128 2 "nonimmediate_operand" "xm, x, x,m")
            (match_operand:VF_128 3 "nonimmediate_operand" " x,xm,xm,x"))
          (match_dup 0)
          (const_int 1)))]
  "TARGET_FMA || TARGET_FMA4"
  "@
   vfmadd132<ssescalarmodesuffix>\t{%2, %3, %0|%0, %3, %2}
   vfmadd213<ssescalarmodesuffix>\t{%3, %2, %0|%0, %2, %3}
   vfmadd<ssescalarmodesuffix>\t{%3, %2, %1, %0|%0, %1, %2, %3}
   vfmadd<ssescalarmodesuffix>\t{%3, %2, %1, %0|%0, %1, %2, %3}"
  [(set_attr "isa" "fma,fma,fma4,fma4")
   (set_attr "type" "ssemuladd")
   (set_attr "mode" "<MODE>")])
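
For anyone following along, here is a rough C model of what the pattern
above is meant to compute, and of why the first two alternatives pick
vfmadd132 and vfmadd213 with those operand orders. It is only a sketch:
the v4sf struct and every function name are made up for illustration,
and plain a*b+c stands in for the fused operation (the real instruction
rounds once, this model rounds twice). The instruction models follow the
Intel operand order (dst, src2, src3).

```c
/* Hypothetical stand-in for a 4 x float XMM register. */
typedef struct { float e[4]; } v4sf;

/* RTL vec_merge: if bit i of mask is set, element i comes from a,
   otherwise from b. */
static v4sf vec_merge(v4sf a, v4sf b, unsigned mask)
{
    v4sf r;
    for (int i = 0; i < 4; i++)
        r.e[i] = ((mask >> i) & 1u) ? a.e[i] : b.e[i];
    return r;
}

/* Element-wise a*b+c. Assumption: unfused arithmetic is good enough
   for a model; the hardware FMA rounds only once. */
static v4sf fma_all(v4sf a, v4sf b, v4sf c)
{
    v4sf r;
    for (int i = 0; i < 4; i++)
        r.e[i] = a.e[i] * b.e[i] + c.e[i];
    return r;
}

/* The pattern: (vec_merge (fma op1 op2 op3) op0 (const_int 1)),
   where op0 is tied to op1 by the "0" constraint. Element 0 is the
   FMA result, elements 1..3 are passed through from op1. */
static v4sf fmai_fmadd(v4sf op1, v4sf op2, v4sf op3)
{
    return vec_merge(fma_all(op1, op2, op3), op1, 1u);
}

/* Scalar FMA3 forms. Only element 0 of dst changes; the upper
   elements of dst are preserved. */
static v4sf vfmadd132ss(v4sf dst, v4sf src2, v4sf src3)
{
    dst.e[0] = dst.e[0] * src3.e[0] + src2.e[0];
    return dst;
}

static v4sf vfmadd213ss(v4sf dst, v4sf src2, v4sf src3)
{
    dst.e[0] = src2.e[0] * dst.e[0] + src3.e[0];
    return dst;
}

static v4sf vfmadd231ss(v4sf dst, v4sf src2, v4sf src3)
{
    dst.e[0] = src2.e[0] * src3.e[0] + dst.e[0];
    return dst;
}
```

With op0 tied to op1, vfmadd132ss(op1, op3, op2) and
vfmadd213ss(op1, op2, op3) both agree with fmai_fmadd(op1, op2, op3),
which matches the Intel-syntax halves of the first two templates
("%0, %3, %2" and "%0, %2, %3"). The 231 form keeps the upper elements
of the addend rather than of operand 1, so in this model it cannot give
the builtin's result without first copying something into the
destination.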
r~