This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]
Other format:	[Raw text]

Re: [PATCH] RFC for patch to add C99 fma/fmaf/fmal builtins

From: Jakub Jelinek <jakub at redhat dot com>
To: Geert Bosch <bosch at adacore dot com>
Cc: "Joseph S. Myers" <joseph at codesourcery dot com>, Richard Henderson <rth at redhat dot com>, Michael Meissner <meissner at linux dot vnet dot ibm dot com>, gcc-patches at gcc dot gnu dot org, dje dot gcc at gmail dot com, libc-alpha at sourceware dot org
Date: Wed, 6 Oct 2010 23:06:53 +0200
Subject: Re: [PATCH] RFC for patch to add C99 fma/fmaf/fmal builtins
References: <20101004204000.GA24432@hungry-tiger.westford.ibm.com> <4CAA422B.6010506@redhat.com> <Pine.LNX.4.64.1010042332500.15516@digraph.polyomino.org.uk> <702ACC86-92E7-43E9-9DA4-FAD34025D034@adacore.com>
Reply-to: Jakub Jelinek <jakub at redhat dot com>

On Tue, Oct 05, 2010 at 10:49:59PM -0400, Geert Bosch wrote:
> On Oct 4, 2010, at 19:39, Joseph S. Myers wrote:
> >> I seem to recall that double-rounding is ok as long as the intermediate
> >> step contains > 2N bits.  Which implies that for IEEE single (23 bits)
> >> we can generically implement it with IEEE double (53 bits > 46):
> >> 
> >>  result = (float)((double)op0 * (double)op1) + (double)op2);
> > 
> > No, that's not safe.  Say (double)op0 * (double)op1 has a value exactly 
> > half way between two representable float values, and (double)op2 is much 
> > smaller in magnitude, so that in round-to-nearest the above becomes 
> > (float)((double)op0 * (double)op1) - but the correct float result depends 
> > on the sign of the tiny op2.
> 
> An algorithm for the correctly rounded FMA using just regular floating
> point instructions has been published  in the IEEE TRANSACTIONS ON 
> COMPUTERS, VOL. 57, NO. 4, APRIL 2008. There is no real short-cut 
> for the correctly rounded sum of three floating point numbers.
> I think it might be possible to devise a slightly simpler
> algorithm for single precision, but as you illustrated, it doesn't
> become trivial.

Looking at that http://www.lri.fr/~melquion/doc/08-tc.pdf,
I believe for float we could though:

#include <fenv.h>
#include <ieee754.h>

float
fmaf (float op0, float op1, float op2)
{
  fenv_t env;
  /* Multiplication is always exact.  */
  double temp = (double) op0 * (double) op1;
  union ieee754_double u;
  feholdexcept (&env);
  fesetround (FE_TOWARDZERO);
  /* Perform addition with round to odd.  */
  u.d = temp + (double) op2;
  /* XXX check for NaNs and Infs?  */
  if ((u.ieee.mantissa1 & 1) == 0)
    u.ieee.mantissa1 |= fetestexcept (FE_INEXACT) != 0;
  fesetenv (&env);
  /* And finally truncation with round to nearest.  */
  return (float) u.d;
}

(and similarly for fma when long double is IEEE quad).  For fma
if long double is less precise than that or for fmal we'd need
to use the slower exact addition (either Dekker's or Knuth's) and
multiplication (Dekker's).

	Jakub

References:
- [PATCH] RFC for patch to add C99 fma/fmaf/fmal builtins
  - From: Michael Meissner
- Re: [PATCH] RFC for patch to add C99 fma/fmaf/fmal builtins
  - From: Richard Henderson
- Re: [PATCH] RFC for patch to add C99 fma/fmaf/fmal builtins
  - From: Joseph S. Myers
- Re: [PATCH] RFC for patch to add C99 fma/fmaf/fmal builtins
  - From: Geert Bosch

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]