This is the mail archive of the
gcc@gcc.gnu.org
mailing list for the GCC project.
Re: unfused fma question
- From: Jeff Law <law at redhat dot com>
- To: sellcey at imgtec dot com, Matthew Fortune <Matthew dot Fortune at imgtec dot com>
- Cc: "gcc at gcc dot gnu dot org" <gcc at gcc dot gnu dot org>
- Date: Mon, 23 Feb 2015 11:06:56 -0700
- Subject: Re: unfused fma question
- References: <02d0fee7-2c86-4291-8405-ae250d3210d9 at BAMAIL02 dot ba dot imgtec dot org> <6D39441BF12EF246A7ABCE6654B0235320FDD81D at LEMAIL01 dot le dot imgtec dot org> <1424713330 dot 27855 dot 172 dot camel at ubuntu-sellcey>
On 02/23/15 10:42, Steve Ellcey wrote:
No, I am thinking about the case where there are only non-fused multiply
add instructions available. To make sure I am using the right
terminology, I am using a non-fused multiply-add to mean a single fma
instruction that does '(a + (b * c))' but which rounds the result of '(b
* c)' before adding it to 'a' so that there is no difference in the
results between using this instruction and using individual add and mult
instructions. My understanding is that this is how the mips32r2 madd
instruction works.
Ahhh, never mind, nothing I said was relevant then. I misunderstood
completely :-)
In this case there seem to be two ways to have GCC generate the fma
instruction. One is the current method using combine_instructions with
an instruction defined as:
(define_insn "*madd"
  (set (0) (plus (mult (1) (2)) (3)))
  "madd.<fmt>\t%0,%3,%1,%2")
The other way would be to extend the convert_mult_to_fma so that instead
of:
if (FLOAT_TYPE_P (type)
    && flag_fp_contract_mode == FP_CONTRACT_OFF)
  return false;
it has something like:
if (FLOAT_TYPE_P (type)
    && flag_fp_contract_mode == FP_CONTRACT_OFF
    && !targetm.fma_does_rounding)
  return false;
And then define an instruction like:
(define_insn "fma"
  (set (0) (fma (1) (2) (3)))
  "madd.<fmt>\t%0,%3,%1,%2")
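The effect of the proposed extra condition can be modeled outside GCC as a small truth table. The sketch below is only an illustration: may_convert_to_fma, enum contract_mode and check_gate are hypothetical names, the target_fma_rounds_product flag stands in for the suggested targetm.fma_does_rounding hook, and the many other checks that the real convert_mult_to_fma performs are left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-in for the compiler's contraction setting. */
enum contract_mode { CONTRACT_OFF, CONTRACT_FAST };

/* Hypothetical model of the proposed gate: returns true when a
   floating-point multiply+add may be rewritten as an FMA. */
static bool may_convert_to_fma(bool is_float_type,
                               enum contract_mode mode,
                               bool target_fma_rounds_product)
{
    /* Contraction is off and the target instruction is truly fused:
       the rewrite could change results, so refuse. */
    if (is_float_type && mode == CONTRACT_OFF && !target_fma_rounds_product)
        return false;
    return true;
}

/* The four interesting cases of the gate. */
static void check_gate(void)
{
    assert(!may_convert_to_fma(true, CONTRACT_OFF, false));  /* fused, contraction off */
    assert(may_convert_to_fma(true, CONTRACT_OFF, true));    /* unfused madd is safe */
    assert(may_convert_to_fma(true, CONTRACT_FAST, false));  /* contraction allowed */
    assert(may_convert_to_fma(false, CONTRACT_OFF, false));  /* non-float types unaffected */
}
```

The only case that changes relative to the existing check is the second one: with contraction off, a target whose instruction rounds the product would still be allowed to form the fma.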
The question I have is whether one or the other of these two approaches
would be better at creating fma instructions (vs leaving mult/add
combinations) or might be preferable for some other reason.
The combiner pattern is useful in cases where we can't see the FMA at
gimple->rtl expansion time. But there may be cases where exposing the
FMA earlier is helpful as well.
So I think an argument could be easily made that we want to support both.
Jeff