This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Darwin long double bug fixes, switch on by default


This adds to the Darwin long double support:

- Some correctness (Radar number 3549987)
- A specification, so you can tell what it means to be correct
- A testcase, so it stays that way
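
As a rough illustration of the representation that the specification in
the patch describes, a long double can be modelled as a pair of doubles.
This is only a sketch: the names `dd`, `dd_from_double`, `dd_to_double`,
`dd_negate` and `dd_compare` are hypothetical and appear nowhere in the
patch, and NaN operands are ignored.  It covers just the conversion,
negation and comparison rules from the format description.

```c
/* Hypothetical model of the two-double layout; not code from the patch.  */
typedef struct
{
  double hi;  /* Most significant part: the value rounded to double.  */
  double lo;  /* Least significant part: what is left over.  */
} dd;

/* A double converts to a long double by adding a zero low part.  */
dd
dd_from_double (double x)
{
  dd r = { x, 0.0 };
  return r;
}

/* A long double converts to a double by removing the low part.  */
double
dd_to_double (dd x)
{
  return x.hi;
}

/* Unary negate flips the sign of both parts.  */
dd
dd_negate (dd x)
{
  dd r = { -x.hi, -x.lo };
  return r;
}

/* Compare the high parts; only if those are equal, compare the low
   parts.  Returns -1, 0 or 1.  NaNs are not handled in this sketch.  */
int
dd_compare (dd x, dd y)
{
  if (x.hi != y.hi)
    return x.hi < y.hi ? -1 : 1;
  if (x.lo != y.lo)
    return x.lo < y.lo ? -1 : 1;
  return 0;
}
```

Because +0.0 and -0.0 compare equal as doubles, this comparison treats
(1.0, +0.0) and (1.0, -0.0) as the same value, which is what the format
description asks for.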

It also fixes a problem where there was no way to load a long double
into GPRs from a static variable (I think), and incidentally fixes a
number of other problems involving loading things into multiple GPRs
from a non-offsettable address.

The problem was that movdi_internal32 says that you can move between
'm' and 'r', not just 'o<>' and 'r', so non-offsettable memory operands
can reach the splitter.  It looked like it'd generate better code to
implement that case directly than to get reload to fix it up.

It also switches 128-bit long double on by default, now that it
works.  Yay!
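
One way to check which format a given compiler ends up with is to look
at `LDBL_MANT_DIG` from `<float.h>`.  The sketch below is only an
illustration; the function names are made up for this example.  The
assumption is that a 64-bit long double reports 53 significand bits
(the same as double) and that the two-double format reports 106, but
the patch itself does not state those values.

```c
#include <float.h>
#include <limits.h>

/* Hypothetical helpers for inspecting the long double in effect.  */

/* Number of significand bits the compiler reports for long double.  */
int
long_double_mantissa_bits (void)
{
  return LDBL_MANT_DIG;
}

/* Storage size of long double in bits.  With this patch the rs6000
   default is 64, and Darwin overrides it to 128.  */
int
long_double_storage_bits (void)
{
  return (int) (sizeof (long double) * CHAR_BIT);
}
```

Printing both values from a small main() before and after applying the
patch should show whether the default actually changed on a given
target.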

Bootstrapped & tested on powerpc-darwin.

-- 
- Geoffrey Keating <geoffk@apple.com>

===File ~/patches/gcc-longdoubledefault.patch===============
2004-07-30  Geoffrey Keating  <geoffk@apple.com>

	* config/rs6000/rs6000.c (legitimate_lo_sum_address_p): Permit
	non-offsettable addresses even for DImode.
	(rs6000_split_multireg_move): Cope with non-offsettable addresses
	being moved into multiple GPRs.

	* config/rs6000/rs6000.c (RS6000_DEFAULT_LONG_DOUBLE_SIZE): Default
	to 64.
	(rs6000_override_options): Use RS6000_DEFAULT_LONG_DOUBLE_SIZE.
	* config/rs6000/darwin.h (RS6000_DEFAULT_LONG_DOUBLE_SIZE): Define
	to 128.
	* config/rs6000/darwin-ldouble.c (isless): New macro.
	(inf): New macro.
	(nonfinite): New macro.
	(FPKINF): Delete.
	(_xlqadd): Completely rewrite.
	(_xlqmul): Correct overflow handling.
	(_xlqdiv): Correct overflow handling.
	* config/rs6000/darwin-ldouble-format: New file.

Index: testsuite/ChangeLog
2004-07-30  Geoffrey Keating  <geoffk@apple.com>

	* gcc.dg/darwin-longdouble.c: New file.

Index: config/rs6000/darwin-ldouble-format
===================================================================
RCS file: config/rs6000/darwin-ldouble-format
diff -N config/rs6000/darwin-ldouble-format
--- /dev/null	1 Jan 1970 00:00:00 -0000
+++ config/rs6000/darwin-ldouble-format	31 Jul 2004 01:27:04 -0000
@@ -0,0 +1,84 @@
+Long double format
+==================
+
+  Each long double is made up of two IEEE doubles.  The value of the
+long double is the sum of the values of the two parts (except for
+-0.0).  The most significant part is required to be the value of the
+long double rounded to the nearest double, as specified by IEEE.  For
+Inf values, the least significant part is required to be one of +0.0
+or -0.0.  No other requirements are made; so, for example, 1.0 may be
+represented as (1.0, +0.0) or (1.0, -0.0), and the low part of a NaN
+is don't-care.
+
+Classification
+--------------
+
+A long double can represent any value of the form
+  s * 2^e * sum(k=0...105: f_k * 2^(-k))
+where 's' is +1 or -1, 'e' is between 1022 and -968 inclusive, f_0 is
+1, and f_k for k>0 is 0 or 1.  These are the 'normal' long doubles.
+
+A long double can also represent any value of the form
+  s * 2^-968 * sum(k=0...105: f_k * 2^(-k))
+where 's' is +1 or -1, f_0 is 0, and f_k for k>0 is 0 or 1.  These are
+the 'subnormal' long doubles.
+
+There are four long doubles that represent zero, two that represent
++0.0 and two that represent -0.0.  The sign of the high part is the
+sign of the long double, and the sign of the low part is ignored.
+
+Likewise, there are four long doubles that represent infinities, two
+for +Inf and two for -Inf.
+
+Each NaN, quiet or signalling, that can be represented as a 'double'
+can be represented as a 'long double'.  In fact, there are 2^64
+equivalent representations for each one.
+
+There are certain other valid long doubles where both parts are
+nonzero but the low part represents a value which has a bit set below
+2^(e-105).  These, together with the subnormal long doubles, make up
+the denormal long doubles.
+
+Many possible long double bit patterns are not valid long doubles.
+These do not represent any value.
+
+Limits
+------
+
+The maximum representable long double is 2^1024-2^918.  The smallest
+*normal* positive long double is 2^-968.  The smallest denormalised
+positive long double is 2^-1074 (this is the same as for 'double').
+
+Conversions
+-----------
+
+A double can be converted to a long double by adding a zero low part.
+
+A long double can be converted to a double by removing the low part.
+
+Comparisons
+-----------
+
+Two long doubles can be compared by comparing the high parts, and if
+those compare equal, comparing the low parts.
+
+Arithmetic
+----------
+
+The unary negate operation operates by negating the low and high parts.
+
+An absolute or absolute-negate operation must be done by comparing
+against zero and negating if necessary.
+
+Addition and subtraction are performed using library routines.  They
+are not at present performed perfectly accurately; the result produced
+will be within 1ulp of the range generated by adding or subtracting
+1ulp from the input values, where a 'ulp' is 2^(e-106) given the
+exponent 'e'.  In the presence of cancellation, this may be
+arbitrarily inaccurate.  Subtraction is done by negation and addition.
+
+Multiplication is also performed using a library routine.  Its result
+will be within 2ulp of the correct result.
+
+Division is also performed using a library routine.  Its result will
+be within 3ulp of the correct result.
Index: config/rs6000/darwin-ldouble.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/rs6000/darwin-ldouble.c,v
retrieving revision 1.7
diff -u -p -u -p -r1.7 darwin-ldouble.c
--- config/rs6000/darwin-ldouble.c	8 Jul 2004 21:16:17 -0000	1.7
+++ config/rs6000/darwin-ldouble.c	31 Jul 2004 01:27:04 -0000
@@ -51,9 +51,13 @@ Software Foundation, 59 Temple Place - S
 #if !_SOFT_FLOAT && (defined (__MACH__) || defined (__powerpc64__))
 
 #define fabs(x) __builtin_fabs(x)
+#define isless(x, y) __builtin_isless (x, y)
+#define inf() __builtin_inf()
 
 #define unlikely(x) __builtin_expect ((x), 0)
 
+#define nonfinite(a) unlikely (! isless (fabs (a), inf ()))
+
 /* All these routines actually take two long doubles as parameters,
    but GCC currently generates poor code when a union is used to turn
    a long double into a pair of doubles.  */
@@ -69,66 +73,40 @@ typedef union
   double dval[2];
 } longDblUnion;
 
-static const double FPKINF = 1.0/0.0;
-
 /* Add two 'long double' values and return the result.	*/
 long double
-_xlqadd (double a, double b, double c, double d)
+_xlqadd (double a, double aa, double c, double cc)
 {
-  longDblUnion z;
-  double t, tau, u, FPR_zero, FPR_PosInf;
-
-  FPR_zero = 0.0;
-  FPR_PosInf = FPKINF;
-
-  if (unlikely (a != a) || unlikely (c != c)) 
-    return a + c;  /* NaN result.  */
+  longDblUnion x;
+  double z, q, zz, xh;
 
-  /* Ordered operands are arranged in order of their magnitudes.  */
+  z = a + c;
 
-  /* Switch inputs if |(c,d)| > |(a,b)|. */
-  if (fabs (c) > fabs (a))
+  if (nonfinite (z))
     {
-      t = a;
-      tau = b;
-      a = c;
-      b = d;
-      c = t;
-      d = tau;
+      z = cc + aa + c + a;
+      if (nonfinite (z))
+	return z;
+      x.dval[0] = z;  /* Will always be DBL_MAX.  */
+      zz = aa + cc;
+      if (fabs(a) > fabs(c))
+	x.dval[1] = a - z + c + zz;
+      else
+	x.dval[1] = c - z + a + zz;
     }
-
-  /* b <- second largest magnitude double.  */
-  if (fabs (c) > fabs (b))
+  else
     {
-      t = b;
-      b = c;
-      c = t;
-    }
+      q = a - z;
+      zz = q + c + (a - (q + z)) + aa + cc;
+      xh = z + zz;
 
-  /* Thanks to commutativity, sum is invariant w.r.t. the next
-     conditional exchange.  */
-  tau = d + c;
+      if (nonfinite (xh))
+	return xh;
 
-  /* Order the smallest magnitude doubles.  */
-  if (fabs (d) > fabs (c))
-    {
-      t = c;
-      c = d;
-      d = t;
+      x.dval[0] = xh;
+      x.dval[1] = z - xh + zz;
     }
-
-  t = (tau + b) + a;	     /* Sum values in ascending magnitude order.  */
-
-  /* Infinite or zero result.  */
-  if (unlikely (t == FPR_zero) || unlikely (fabs (t) == FPR_PosInf))
-    return t;
-
-  /* Usual case.  */
-  tau = (((a-t) + b) + c) + d;
-  u = t + tau;
-  z.dval[0] = u;	       /* Final fixup for long double result.  */
-  z.dval[1] = (t - u) + tau;
-  return z.ldval;
+  return x.ldval;
 }
 
 long double
@@ -141,21 +119,17 @@ long double
 _xlqmul (double a, double b, double c, double d)
 {
   longDblUnion z;
-  double t, tau, u, v, w, FPR_zero, FPR_PosInf;
+  double t, tau, u, v, w;
   
-  FPR_zero = 0.0;
-  FPR_PosInf = FPKINF;
-
   t = a * c;			/* Highest order double term.  */
 
-  if (unlikely (t != t) || unlikely (t == FPR_zero) 
-      || unlikely (fabs (t) == FPR_PosInf))
+  if (unlikely (t == 0)		/* Preserve -0.  */
+      || nonfinite (t))
     return t;
 
-  /* Finite nonzero result requires summing of terms of two highest
-     orders.	*/
+  /* Sum terms of two highest orders. */
   
-  /* Use fused multiply-add to get low part of a * c.	 */
+  /* Use fused multiply-add to get low part of a * c.  */
   asm ("fmsub %0,%1,%2,%3" : "=f"(tau) : "f"(a), "f"(c), "f"(t));
   v = a*d;
   w = b*c;
@@ -163,6 +137,8 @@ _xlqmul (double a, double b, double c, d
   u = t + tau;
 
   /* Construct long double result.  */
+  if (nonfinite (u))
+    return u;
   z.dval[0] = u;
   z.dval[1] = (t - u) + tau;
   return z.ldval;
@@ -172,15 +148,12 @@ long double
 _xlqdiv (double a, double b, double c, double d)
 {
   longDblUnion z;
-  double s, sigma, t, tau, u, v, w, FPR_zero, FPR_PosInf;
-  
-  FPR_zero = 0.0;
-  FPR_PosInf = FPKINF;
+  double s, sigma, t, tau, u, v, w;
   
   t = a / c;                    /* highest order double term */
   
-  if (unlikely (t != t) || unlikely (t == FPR_zero) 
-      || unlikely (fabs (t) == FPR_PosInf))
+  if (unlikely (t == 0)		/* Preserve -0.  */
+      || nonfinite (t))
     return t;
 
   /* Finite nonzero result requires corrections to the highest order term.  */
@@ -197,6 +170,8 @@ _xlqdiv (double a, double b, double c, d
   u = t + tau;
 
   /* Construct long double result.  */
+  if (nonfinite (u))
+    return u;
   z.dval[0] = u;
   z.dval[1] = (t - u) + tau;
   return z.ldval;
Index: config/rs6000/darwin.h
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/rs6000/darwin.h,v
retrieving revision 1.55
diff -u -p -u -p -r1.55 darwin.h
--- config/rs6000/darwin.h	28 Jul 2004 23:57:26 -0000	1.55
+++ config/rs6000/darwin.h	31 Jul 2004 01:27:04 -0000
@@ -68,7 +68,7 @@
 /* The Darwin ABI always includes AltiVec, can't be (validly) turned
    off.  */
 
-#define SUBTARGET_OVERRIDE_OPTIONS				  	\
+#define SUBTARGET_OVERRIDE_OPTIONS					\
 do {									\
   rs6000_altivec_abi = 1;						\
   rs6000_altivec_vrsave = 1;						\
@@ -87,12 +87,19 @@ do {									\
         flag_pic = 2;							\
       }									\
   }									\
-}while(0)
+} while(0)
+
+/* Darwin has 128-bit long double support in libc in 10.4 and later.
+   Default to 128-bit long doubles even on earlier platforms for ABI
+   consistency; arithmetic will work even if libc and libm support is
+   not available.  */
+
+#define RS6000_DEFAULT_LONG_DOUBLE_SIZE 128
+
 
 /* We want -fPIC by default, unless we're using -static to compile for
    the kernel or some such.  */
 
-
 #define CC1_SPEC "\
 %{gused: -g -feliminate-unused-debug-symbols %<gused }\
 %{gfull: -g -fno-eliminate-unused-debug-symbols %<gfull }\
Index: config/rs6000/rs6000.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/rs6000/rs6000.c,v
retrieving revision 1.673
diff -u -p -u -p -r1.673 rs6000.c
--- config/rs6000/rs6000.c	28 Jul 2004 12:13:13 -0000	1.673
+++ config/rs6000/rs6000.c	31 Jul 2004 01:27:04 -0000
@@ -1014,6 +1014,13 @@ rs6000_init_hard_regno_mode_ok (void)
 	rs6000_hard_regno_mode_ok_p[m][r] = true;
 }
 
+/* If not otherwise specified by a target, make 'long double' equivalent to
+   'double'.  */
+
+#ifndef RS6000_DEFAULT_LONG_DOUBLE_SIZE
+#define RS6000_DEFAULT_LONG_DOUBLE_SIZE 64
+#endif
+
 /* Override command line options.  Mostly we process the processor
    type and sometimes adjust other TARGET_ options.  */
 
@@ -1220,7 +1227,7 @@ rs6000_override_options (const char *def
     }
 
   /* Set size of long double */
-  rs6000_long_double_type_size = 64;
+  rs6000_long_double_type_size = RS6000_DEFAULT_LONG_DOUBLE_SIZE;
   if (rs6000_long_double_size_string)
     {
       char *tail;
@@ -1293,7 +1300,7 @@ rs6000_override_options (const char *def
       if (rs6000_isel_string == 0)
 	rs6000_isel = 0;
       if (rs6000_long_double_size_string == 0)
-	rs6000_long_double_type_size = 64;
+	rs6000_long_double_type_size = RS6000_DEFAULT_LONG_DOUBLE_SIZE;
     }
 
   rs6000_always_hint = (rs6000_cpu != PROCESSOR_POWER4
@@ -3161,8 +3168,7 @@ legitimate_lo_sum_address_p (enum machin
 	return false;
       if (GET_MODE_NUNITS (mode) != 1)
 	return false;
-      if (GET_MODE_BITSIZE (mode) > 32
-	  && !(TARGET_HARD_FLOAT && TARGET_FPRS && mode == DFmode))
+      if (GET_MODE_BITSIZE (mode) > 64)
 	return false;
 
       return CONSTANT_P (x);
@@ -11054,7 +11060,7 @@ rs6000_split_multireg_move (rtx dst, rtx
       int j = -1;
       bool used_update = false;
 
-      if (GET_CODE (src) == MEM && INT_REGNO_P (reg))
+      if (MEM_P (src) && INT_REGNO_P (reg))
         {
           rtx breg;
 
@@ -11071,6 +11077,15 @@ rs6000_split_multireg_move (rtx dst, rtx
 			 : gen_adddi3 (breg, breg, delta_rtx));
 	      src = gen_rtx_MEM (mode, breg);
 	    }
+	  else if (! offsettable_memref_p (src))
+	    {
+	      rtx newsrc, basereg;
+	      basereg = gen_rtx_REG (Pmode, reg);
+	      emit_insn (gen_rtx_SET (VOIDmode, basereg, XEXP (src, 0)));
+	      newsrc = gen_rtx_MEM (GET_MODE (src), basereg);
+	      MEM_COPY_ATTRIBUTES (newsrc, src);
+	      src = newsrc;
+	    }
 
 	  /* We have now address involving an base register only.
 	     If we use one of the registers to address memory, 
@@ -11118,6 +11133,15 @@ rs6000_split_multireg_move (rtx dst, rtx
 			   : gen_adddi3 (breg, breg, delta_rtx));
 	      dst = gen_rtx_MEM (mode, breg);
 	    }
+	  else if (! offsettable_memref_p (dst))
+	    {
+	      rtx newdst, basereg;
+	      basereg = gen_rtx_REG (Pmode, reg);
+	      emit_insn (gen_rtx_SET (VOIDmode, basereg, XEXP (dst, 0)));
+	      newdst = gen_rtx_MEM (GET_MODE (dst), basereg);
+	      MEM_COPY_ATTRIBUTES (newdst, dst);
+	      dst = newdst;
+	    }
 	}
 
       for (i = 0; i < nregs; i++)
Index: testsuite/gcc.dg/darwin-longdouble.c
===================================================================
RCS file: testsuite/gcc.dg/darwin-longdouble.c
diff -N testsuite/gcc.dg/darwin-longdouble.c
--- /dev/null	1 Jan 1970 00:00:00 -0000
+++ testsuite/gcc.dg/darwin-longdouble.c	31 Jul 2004 01:27:12 -0000
@@ -0,0 +1,119 @@
+/* { dg-do run { target powerpc*-*-darwin* } } */
+/* { dg-options "" } */
+/* No options so 'long long' can be used.  */
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+
+typedef unsigned long long uint64_t;
+typedef uint64_t ldbits[2];
+
+union ldu 
+{
+  ldbits lb;
+  long double ld;
+};
+
+static const struct {
+  ldbits a;
+  ldbits b;
+  ldbits result;
+} single_tests[] = {
+  /* Test of values that add to near +Inf.  */
+  { { 0x7FEFFFFFFFFFFFFFLL, 0xFC88000000000000LL },
+    { 0x7C94000000000000LL, 0x0000000000000000LL },
+    { 0x7FEFFFFFFFFFFFFFLL, 0x7C80000000000000LL } },
+  { { 0x7FEFFFFFFFFFFFFFLL, 0x7C8FFFFFFFFFFFFFLL },
+    { 0x792FFFFFFFFFFFFFLL, 0x0000000000000000LL },
+    { 0x7FEFFFFFFFFFFFFFLL, 0x7C8FFFFFFFFFFFFFLL } },
+  { { 0x7FEFFFFFFFFFFFFFLL, 0x7C8FFFFFFFFFFFFFLL },
+    { 0x7930000000000000LL, 0xF5DFFFFFFFFFFFFFLL },
+    /* correct result is: { 0x7FEFFFFFFFFFFFFFLL, 0x7C8FFFFFFFFFFFFFLL } */
+    { 0x7FF0000000000000LL, 0x0000000000000000LL } },
+  /* Test of values that add to +Inf.  */
+  { { 0x7FEFFFFFFFFFFFFFLL, 0x7C8FFFFFFFFFFFFFLL },
+    { 0x7930000000000000LL, 0x0000000000000000LL },
+    { 0x7FF0000000000000LL, 0x0000000000000000LL } },
+  /* Tests of Inf addition.  */
+  { { 0x7FF0000000000000LL, 0x0000000000000000LL },
+    { 0x0000000000000000LL, 0x0000000000000000LL },
+    { 0x7FF0000000000000LL, 0x0000000000000000LL } },
+  { { 0x7FF0000000000000LL, 0x0000000000000000LL },
+    { 0x7FF0000000000000LL, 0x0000000000000000LL },
+    { 0x7FF0000000000000LL, 0x0000000000000000LL } },
+  /* Test of Inf addition producing NaN.  */
+  { { 0x7FF0000000000000LL, 0x0000000000000000LL },
+    { 0xFFF0000000000000LL, 0x0000000000000000LL },
+    { 0x7FF8000000000000LL, 0x0000000000000000LL } },
+  /* Tests of NaN addition.  */
+  { { 0x7FF8000000000000LL, 0x0000000000000000LL },
+    { 0x0000000000000000LL, 0x0000000000000000LL },
+    { 0x7FF8000000000000LL, 0x7FF8000000000000LL } },
+  { { 0x7FF8000000000000LL, 0x0000000000000000LL },
+    { 0x7FF0000000000000LL, 0x0000000000000000LL },
+    { 0x7FF8000000000000LL, 0x7FF8000000000000LL } },
+  /* Addition of positive integers, with interesting rounding properties.  */
+  { { 0x4690000000000000LL, 0x4330000000000000LL },
+    { 0x4650000000000009LL, 0xC2FFFFFFFFFFFFF2LL },
+    /* correct result is: { 0x4691000000000001LL, 0xC32C000000000000LL } */
+    { 0x4691000000000001LL, 0xc32bfffffffffffeLL } },
+  { { 0x4690000000000000LL, 0x4330000000000000LL },
+    { 0x4650000000000008LL, 0x42F0000000000010LL },
+    { 0x4691000000000001LL, 0xC32E000000000000LL } },
+  { { 0x469FFFFFFFFFFFFFLL, 0x433FFFFFFFFFFFFFLL },
+    { 0x4340000000000000LL, 0x3FF0000000000000LL },
+    { 0x46A0000000000000LL, 0x0000000000000000LL } },
+  { { 0x469FFFFFFFFFFFFFLL, 0x433FFFFFFFFFFFFFLL },
+    { 0x4340000000000000LL, 0x0000000000000000LL },
+    { 0x46A0000000000000LL, 0xBFF0000000000000LL } },
+  /* Subtraction of integers, with cancellation.  */
+  { { 0x4690000000000000LL, 0x4330000000000000LL },
+    { 0xC690000000000000LL, 0xC330000000000000LL },
+    { 0x0000000000000000LL, 0x0000000000000000LL } },
+  { { 0x4690000000000000LL, 0x4330000000000000LL },
+    { 0xC330000000000000LL, 0x0000000000000000LL },
+    { 0x4690000000000000LL, 0x0000000000000000LL } },
+  { { 0x4690000000000000LL, 0x4330000000000000LL },
+    { 0xC330000000000000LL, 0x3FA0000000000000LL },
+    { 0x4690000000000000LL, 0x3FA0000000000000LL } },
+  { { 0x4690000000000000LL, 0x4330000000000000LL },
+    { 0xC690000000000000LL, 0x3FA0000000000000LL },
+    /* correct result is: { 0x4330000000000000LL, 0x3FA0000000000000LL } */
+    { 0x4330000000000000LL, 0x0000000000000000LL } }
+};
+    
+static int fail = 0;
+
+static void
+run_single_tests (void)
+{
+  size_t i;
+  for (i = 0; i < sizeof (single_tests) / sizeof (single_tests[0]); i++)
+    {
+      union ldu a, b, result, expected;
+      memcpy (a.lb, single_tests[i].a, sizeof (ldbits));
+      memcpy (b.lb, single_tests[i].b, sizeof (ldbits));
+      memcpy (expected.lb, single_tests[i].result, sizeof (ldbits));
+      result.ld = a.ld + b.ld;
+      if (memcmp (result.lb, expected.lb,
+		  result.ld == result.ld ? sizeof (ldbits) : sizeof (double))
+	  != 0)
+	{
+	  printf ("FAIL: %016llx %016llx + %016llx %016llx\n",
+		  a.lb[0], a.lb[1], b.lb[0], b.lb[1]);
+	  printf (" = %016llx %016llx not %016llx %016llx\n",
+		  result.lb[0], result.lb[1], expected.lb[0], expected.lb[1]);
+	  fail = 1;
+	}
+    }
+}
+
+int main(void)
+{
+  run_single_tests();
+  if (fail)
+    abort ();
+  else
+    exit (0);
+}
============================================================
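
The finite path of the rewritten _xlqadd can be tried out on any host by
modelling the long double as two doubles.  This is a sketch under
stated assumptions, not the library routine: `dd`, `dd_add` and
`double_from_bits` are hypothetical names, the recovery branch for
sums near DBL_MAX is left out, and it relies on the same GCC builtins
that the patch uses.  The order of operations in the expressions is
copied from the patch.

```c
#include <stdint.h>
#include <string.h>

/* Same test as the patch's nonfinite macro, written out with the GCC
   builtins directly: true for Inf and NaN.  */
#define nonfinite(a) \
  (! __builtin_isless (__builtin_fabs (a), __builtin_inf ()))

/* Hypothetical stand-in for the two-double long double.  */
typedef struct
{
  double hi;
  double lo;
} dd;

/* Reinterpret a 64-bit pattern, as listed in the testcase table, as a
   double.  */
double
double_from_bits (uint64_t bits)
{
  double d;
  memcpy (&d, &bits, sizeof d);
  return d;
}

/* Add (a, aa) and (c, cc), following the finite path of _xlqadd.
   Simplification: when the sum of the high parts is not finite, this
   sketch returns it directly and does not attempt the recovery that
   the patch performs for results near DBL_MAX.  */
dd
dd_add (double a, double aa, double c, double cc)
{
  dd x;
  double z, q, zz, xh;

  z = a + c;
  if (nonfinite (z))
    {
      x.hi = z;
      x.lo = 0.0;
      return x;
    }

  q = a - z;
  zz = q + c + (a - (q + z)) + aa + cc;
  xh = z + zz;
  if (nonfinite (xh))
    {
      x.hi = xh;
      x.lo = 0.0;
      return x;
    }

  x.hi = xh;
  x.lo = z - xh + zz;
  return x;
}
```

For instance, feeding in the testcase row that adds
(0x4690000000000000, 0x4330000000000000) and
(0xC330000000000000, 0x3FA0000000000000) should give
(0x4690000000000000, 0x3FA0000000000000), the expected value listed in
the table.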

