This is the mail archive of the
`gcc-patches@gcc.gnu.org`
mailing list for the GCC project.

- *From*: lucier at math dot purdue dot edu
- *To*: gcc-patches at gcc dot gnu dot org
- *Cc*: lucier at math dot purdue dot edu
- *Date*: Thu, 6 Dec 2001 15:46:39 -0500 (EST)
- *Subject*: IEEE 754 and fused-madd (was: Re: patch: rs6000 specific)

I've just finished going over this month's IEEE floating-point discussion and the IEEE 754 standard.

First, -ffast-math can and does apply many horrible, ugly transformations to fp arithmetic, and it would be a shame if all this baggage were invoked when someone just wants fused multiply-add. I wouldn't mind so much if there were an fma intrinsic (or is there already one?), but having to use -ffast-math to get fma (which is the effect if -ffused-madd implies the fast-math transformations) adds too much uncertainty to fp arithmetic.

There is actually much flexibility in the IEEE spec, and I would argue that fma adheres to the spec. Here's why.

What the PowerPC fma instruction does can be modelled as follows. First, it multiplies two single- or double-precision numbers and puts the result into an (internal) register (a "destination" in IEEE 754 terminology) that has the following characteristics:

- precision: infinite
- Emax: infinity
- Emin: -infinity
- Exponent bias: unspecified
- Exponent width in bits: infinite
- Format width in bits: infinite

What I mean here by "infinite" is that the destination register has enough precision and exponent range to represent exactly any product of single- or double-precision fp arguments. So one could make up some finite numbers, but I don't want to work at that. At any rate, this internal register has a valid IEEE format that satisfies the requirements of the double extended format, which are:

- precision: >= 64
- Emax: >= 16383
- Emin: <= -16382
- Exponent bias: unspecified
- Exponent width in bits: >= 15
- Format width in bits: >= 79

Because the format is so big, all results in this internal destination register are computed exactly, and so, by definition, with one rounding. The instruction then adds a single- (double-) precision register to this internal register (or subtracts it, or whatever) and gives a single- (double-) precision result. It does this in a single rounding.
In both cases, the result is rounded to the precision of its destination: first to the precision of the internal register, then to the precision of the single (or double) result register.

OK, the IEEE 754 spec says:

<standard>

4. Rounding

...

4.3 Rounding Precision. Normally a result is rounded to the precision of its destination. However, some systems deliver results only to double or extended destinations. On such a system the user, which may be a high-level language compiler, shall be able to specify that a result be rounded instead to single precision, though it may be stored in the double or extended format with its wider exponent range.[4] Similarly, a system that delivers results only to double extended destinations shall permit the user to specify rounding to single or double precision. Note that to meet the specifications in 4.1, the result cannot suffer more than one rounding error.

Footnote [4]: Control of rounding precision is intended to allow systems whose destinations are always double or extended to mimic, in the absence of over/underflow, the precisions of systems with single and double destinations. An implementation should not provide operations that combine double or extended operands to produce a single result, nor operations that combine double extended operands to produce a double result, with only one rounding.

</standard>

Now, the requirement that "On such a system ... the user ... shall be able to specify that a result be rounded instead to single precision, though it may be stored in the double or extended format with its wider exponent range" can be met in gcc with the -fno-fused-madd option, which simply disallows the use of this special internal register.

As for the part of the footnote that says "An implementation should not provide operations ... that combine double extended operands to produce a double result, with only one rounding", this is only a "should", so even though the fma instruction combines an extended-precision value and a double-precision value to produce a double-precision value with only one rounding, it doesn't violate the standard. Secondly, one can argue that what fma does in this respect is OK, since one of its arguments is a double.

So I think that fm[as], as implemented in the PowerPC, does not violate IEEE 754 arithmetic. And using it should not invoke all the other stuff that -ffast-math implies.

Brad Lucier
