Bug 21718

Summary:	real.c rounding not perfect
Product:	gcc	Reporter:	Neil Booth <neil>
Component:	middle-end	Assignee:	Not yet assigned to anyone <unassigned>
Status:	RESOLVED FIXED
Severity:	enhancement	CC:	areg.melikadamyan, exploringbinary, gcc-bugs, ghazi, hjl.tools, vincent-gcc, zimmerma+gcc
Priority:	P3
Version:	4.1.0
Target Milestone:	4.9.0
Host:		Target:
Build:		Known to work:
Known to fail:		Last reconfirmed:	2006-01-28 00:49:49
Bug Depends on:	30789
Bug Blocks:	16989, 55145

Description Neil Booth 2005-05-23 13:03:12 UTC

When compiled with -Wall,

int foo (void) { if (9007199254740991.4999999999999999999999999999999995 ==
9007199254740991.5) return 1;}

does not complain about a missing return statement, showing that GCC is folding
the comparison to true.  However, the LHS should round down.

With one less 9 GCC gets it correct.

I'm not convinced it's worthwhile to get perfect rounding, but Joseph wanted me
to file a report.

Comment 1 Joseph S. Myers 2005-05-23 14:50:11 UTC

We probably don't even get it right for all cases with DECIMAL_DIG digits for
all long double formats (required by Annex F).

Comment 2 Joseph S. Myers 2005-05-23 23:55:20 UTC

Here is the example requested on IRC showing that we don't get it right with
DECIMAL_DIG digits and IEEE quad long double.  Tested with current mainline on
sparc-sun-solaris2.10 as an example target where long double is IEEE quad. 
Different examples may be needed for 64-bit hosts where we use more than 160
bits (but still not enough).  DECIMAL_DIG is 36 for IEEE quad.

int
f (void)
{
  if (0.000000000000000000000000000000000691986517009071585211899336165210746L
      == 0x0.000000000000000000000000000397cecf54a352d2cf251cf4c2d40ap+0L)
    return 1;
}
/* Above decimal is
   0x3.97cecf54a352d2cf251cf4c2d40900000000000000000000000000000006cfd...p-112,
   which to within 0.5ulp (113-bit mantissa) should round up to above
   hex value.  Thus, there should be no warning with -O2 -Wall.  */

My reading of F.5#2 is that conversions of all numbers with DECIMAL_DIG
significant figures are meant to be correct, not just of those which arise from
converting a representable binary number to decimal (which we probably do get
right).  I don't think you'll get all conversions with DECIMAL_DIG digits right
without at least 226 bits internal precision, and proving that any specific
figure is enough would be hard.

Comment 3 Andrew Pinski 2005-10-29 20:50:41 UTC

Confirmed.

Comment 4 Joseph S. Myers 2007-02-05 19:03:15 UTC

Now we always require MPFR, it should be easy to fix this by using MPFR's facilities for parsing numbers from strings.

Comment 5 Neil Booth 2007-10-11 03:45:42 UTC

(In reply to comment #1)
> We probably don't even get it right for all cases with DECIMAL_DIG digits for
> all long double formats (required by Annex F).

(In reply to comment #2)
> My reading of F.5#2 is that conversions of all numbers with DECIMAL_DIG
> significant figures are meant to be correct, not just of those which arise from
> converting a representable binary number to decimal (which we probably do get
> right).  I don't think you'll get all conversions with DECIMAL_DIG digits right
> without at least 226 bits internal precision, and proving that any specific
> figure is enough would be hard.

I believe more than 160 bits are required to get even single-precision numbers right with DECIMAL_DIG digits precision and an exponent.  I'm going to try and prove this by finding an example.  It could be hard as I believe 160 is only just below the required number.

I think to do DECIMAL_DIG digits correctly for long-double requires over 11,500 bits of precision.

I'm assuming above you're permitted to attach an exponent to your DECIMAL_DIG digits; that's how I read the definition of DECIMAL_DIG anyway.  The exponents push up the required precision enormously.

Comment 6 Neil Booth 2007-10-18 15:24:18 UTC

(In reply to comment #5)

> I believe more than 160 bits are required to get even single-precision numbers
> right with DECIMAL_DIG digits precision and an exponent.  I'm going to try and
> prove this by finding an example.  It could be hard as I believe 160 is only
> just below the required number.

Here's an example to prove this assertion.  When compiled with GCC 4.1.2 or 4.1.3 with -std=c99, assuming a correctly-rounding libc (e.g. NetBSD's; glibc also seems to get this correct) you get the following output:

0x1.8p-147 0x1.4p-147 8.40779078594890243e-45

So not only is it rounded incorrectly, but the number it is rounded to, when converted back to decimal, does not even match the input number in the first digit.

#include <stdio.h>
#include <stdlib.h>

int main (void)
{
  float f1 = 7.7071415537864938e-45;
  float f2 = strtof ("7.7071415537864938e-45", NULL);
  
  printf( "%a %a %0.18g\n", f1, f2, f1);

  return 0;
}

Comment 7 Kaveh Ghazi 2009-04-20 17:01:57 UTC

How about chucking real.c and using mpfr_t as the internal representation for real numbers inside GCC?  I guess we still need something to encode/decode numbers to the target format, but IMHO we'll keep finding these corner cases until we convert everything over to MPFR underneath the hood.  At that point, what's the use of having the extra layer of real.c?

Maybe we use mpft_t at the tree level and convert to REAL_VALUE_TYPE when converting over to rtl.

Another option would be to put an mpfr_t inside struct real_value instead of all those bitfields.

Comment 8 jsm-csl@polyomino.org.uk 2009-04-20 17:30:47 UTC

Subject: Re:  real.c rounding not perfect

On Mon, 20 Apr 2009, ghazi at gcc dot gnu dot org wrote:

> How about chucking real.c and using mpfr_t as the internal representation for
> real numbers inside GCC?  I guess we still need something to encode/decode
> numbers to the target format, but IMHO we'll keep finding these corner cases
> until we convert everything over to MPFR underneath the hood.  At that point,
> what's the use of having the extra layer of real.c?

I sort of imagine that real.c should provide a wrapper layer that checks 
whether the real number is decimal (in which case it calls the dfp.c code 
which uses libdecnumber) or binary (in which case it uses MPFR, taking 
care to use the correct precision, range, subnormal handling etc. for the 
type in question) or split-binary (IBM long double) in which case it 
should use more complicated GCC-local code that ultimately uses MPFR 
underneath (bug 19779, bug 26374) - certainly the implementation should 
avoid causing problems for future split-binary folding implementation.  
(A base class from which implementations for decimal, binary and 
split-binary derive, if you wish.)  I don't see any need for GCC to have 
its own implementation of the binary arithmetic - but it will need its own 
encoding/decoding support for all the many supported formats.

I also wonder if the code for handling target formats and arithmetic on 
them should in principle be a library shared with GDB (which presently 
uses host floating-point arithmetic) but wouldn't like to try to persuade 
GDB to require MPFR.

> Another option would be to put an mpfr_t inside struct real_value instead of
> all those bitfields.

You'd need to deal with the memory allocation/deallocation requirements, 
the MPFR allocation model is rather different from the fixed-size 
structures presently used.

Comment 9 Rick Regan 2010-06-04 01:31:32 UTC

I discovered two other examples of incorrect rounding in gcc: 

1) 0.500000000000000166533453693773481063544750213623046875
2) 1.50000000000000011102230246251565404236316680908203125

These are both halfway cases. Example 1 has bit 53 equal to 1, so it should round up; gcc rounds down. Example 2 has bit 53 equal to 0, so it should round down; gcc rounds up.

I've written an article describing these examples in more detail: http://www.exploringbinary.com/incorrectly-rounded-conversions-in-gcc-and-glibc/ .

BTW, why doesn't gcc use David Gay's dtoa.c (http://www.netlib.org/fp/dtoa.c) for correct rounding?

Comment 10 jsm-csl@polyomino.org.uk 2010-06-05 11:57:06 UTC

Subject: Re:  real.c rounding not perfect

On Fri, 4 Jun 2010, exploringbinary at gmail dot com wrote:

> BTW, why doesn't gcc use David Gay's dtoa.c (http://www.netlib.org/fp/dtoa.c)
> for correct rounding?

Anything using the host's "double" or other floating-point types is 
fundamentally unsuitable for GCC floating-point handling, since GCC needs 
to handle many different floating-point formats for different targets on a 
single host, including target formats with no corresponding host format.  
The correct solution for GCC is well-established: use GNU MPFR for 
decimal-to-binary conversions (and for other arithmetic), which also makes 
it much more straightforward to handle the discontiguous mantissa cases of 
IBM long double.

Comment 11 H.J. Lu 2012-11-04 23:06:27 UTC

From:

http://www.sourceware.org/bugzilla/show_bug.cgi?id=14803#c1

---
Really I'd consider this just a variant on bug 21718 (real.c rounding not 
perfect).  That would ideally be fixed by using MPFR for this in GCC ... 
except that for any MPFR versions before 3.1.1p2, the bug I found with the 
ternary value from mpfr_strtofr could cause problems for subnormal 
results.
---

Comment 12 Vincent Lefèvre 2012-11-05 00:16:32 UTC

(In reply to comment #11)
> Really I'd consider this just a variant on bug 21718 (real.c rounding not 
> perfect).  That would ideally be fixed by using MPFR for this in GCC ... 
> except that for any MPFR versions before 3.1.1p2, the bug I found with the 
> ternary value from mpfr_strtofr could cause problems for subnormal 
> results.

Alternatively, you can write code based on MPFR without using the ternary value. The algorithm would be:
1. Round to the target precision.
2. If the result is in the subnormal range (this can be detected by looking at the exponent of the result), then deduce the "real" precision from the exponent, and recompute the result in this precision directly.

Comment 13 H.J. Lu 2012-11-05 01:38:14 UTC

Given that the correct MPFR isn't widely available, is that
possible to fix rounding in real.c?

Comment 14 Vincent Lefèvre 2012-11-05 08:12:08 UTC

Otherwise, how about taking code from the glibc implementation of strtof/strtod/strtold? Code in strtod was recently fixed. I don't know about strtold...

Comment 15 jsm-csl@polyomino.org.uk 2012-11-05 20:49:24 UTC

The glibc code is pretty complicated (using glibc's copies of mpn_* 
low-level GMP functions for multiple-precision arithmetic) and entangled 
with other bits of glibc (it needs to handle things such as locales / 
thousands grouping characters, which are not relevant to GCC).  And of 
course there is no guarantee that the host has any floating-point type 
corresponding to the required type on the target.  Even working around the 
absence of a reliable ternary value in some supported MPFR versions, using 
MPFR for this would be much simpler than adapting the glibc code for use 
in GCC - it's the natural thing to do, given the use of MPFR for built-in 
function evaluation in GCC.  (MPFR should also be used to replace 
real_sqrt - real.c doesn't use enough bits internally to get a correctly 
rounded sqrt result in all cases.  fold_builtin_sqrt already does use 
mpfr_sqrt.)

Comment 16 Rick Regan 2013-09-25 14:48:30 UTC

I can no longer find any conversions that gcc (I'm using 4.6.3) performs incorrectly, including the examples cited above. It doesn't look like there has been any related code changes in real.c though. Is this an architecture-dependent bug perhaps? When I first tested this three years ago I was on a 32-bit Intel machine; now I'm on a 64-bit machine. Thanks.

Comment 17 Vincent Lefèvre 2013-09-25 15:45:51 UTC

I confirm that it is an architecture-dependent bug. I can't reproduce any error with your test program on http://www.exploringbinary.com/incorrectly-rounded-conversions-in-gcc-and-glibc/ even with old gcc versions, from an x86_64 machine.

With GCC 4.3.5 (Debian 4.3.5-4) on Debian 6.0.7 (squeeze):

0.500000000000000166533453693773481063544750213623046875
 Correct = 0x1.0000000000002p-1
 gcc =     0x1.0000000000002p-1
 strtod =  0x1.0000000000001p-1

3.518437208883201171875e13
 Correct = 0x1.0000000000002p+45
 gcc =     0x1.0000000000002p+45
 strtod =  0x1.0000000000001p+45

62.5364939768271845828
 Correct = 0x1.f44abd5aa7ca4p+5
 gcc =     0x1.f44abd5aa7ca4p+5
 strtod =  0x1.f44abd5aa7ca3p+5

8.10109172351e-10
 Correct = 0x1.bd5cbaef0fd0cp-31
 gcc =     0x1.bd5cbaef0fd0cp-31
 strtod =  0x1.bd5cbaef0fd0cp-31

1.50000000000000011102230246251565404236316680908203125
 Correct = 0x1.8p+0
 gcc =     0x1.8p+0
 strtod =  0x1.8p+0

9007199254740991.4999999999999999999999999999999995
 Correct = 0x1.fffffffffffffp+52
 gcc =     0x1.fffffffffffffp+52
 strtod =  0x1.fffffffffffffp+52

If I add the -m32 option, I get the same output except:

8.10109172351e-10
 Correct = 0x1.bd5cbaef0fd0cp-31
 gcc =     0x1.bd5cbaef0fd0cp-31
 strtod =  0x1.bd5cbaef0fd0dp-31

and the -mfpmath=387 option doesn't change anything.

On a Debian 6.0.7 (squeeze) 32-bit machine, with GCC 4.3.5 (Debian 4.3.5-4) and GCC 4.4.5 (Debian 4.4.5-8):

0.500000000000000166533453693773481063544750213623046875
 Correct = 0x1.0000000000002p-1
 gcc =     0x1.0000000000001p-1
 strtod =  0x1.0000000000001p-1

3.518437208883201171875e13
 Correct = 0x1.0000000000002p+45
 gcc =     0x1.0000000000002p+45
 strtod =  0x1.0000000000001p+45

62.5364939768271845828
 Correct = 0x1.f44abd5aa7ca4p+5
 gcc =     0x1.f44abd5aa7ca4p+5
 strtod =  0x1.f44abd5aa7ca3p+5

8.10109172351e-10
 Correct = 0x1.bd5cbaef0fd0cp-31
 gcc =     0x1.bd5cbaef0fd0cp-31
 strtod =  0x1.bd5cbaef0fd0dp-31

1.50000000000000011102230246251565404236316680908203125
 Correct = 0x1.8p+0
 gcc =     0x1.8000000000001p+0
 strtod =  0x1.8p+0

9007199254740991.4999999999999999999999999999999995
 Correct = 0x1.fffffffffffffp+52
 gcc =     0x1p+53
 strtod =  0x1.fffffffffffffp+52

Since the computations are done at compile time, it is the architecture of the GCC code that matters, not the one of the generated code (so that using -m32 on a 64-bit machine doesn't have any effect on the GCC side).

Comment 18 Rick Regan 2013-09-25 18:00:16 UTC

(In reply to Vincent Lefèvre from comment #17)
> I confirm that it is an architecture-dependent bug. I can't reproduce any
> error with your test program on
> http://www.exploringbinary.com/incorrectly-rounded-conversions-in-gcc-and-
> glibc/ even with old gcc versions, from an x86_64 machine.

Thanks for the quick and thorough response!

Besides those testcases, I wrote an automated testcase to test gcc conversions; I have not found any incorrect ones. (In a loop, the program generates a random decimal string and puts it in a small C program that it compiles and executes on the fly. The generated C program then checks the result against David Gay's strtod(), the result of which I also coded into the program.)

As an aside, I see that you must have an older version of glibc; I have been testing strtod() in glibc 2.18 and I no longer get incorrect conversions. According to this, http://sourceware.org/bugzilla/show_bug.cgi?id=3479, it was fixed in 2.17.

(I will be writing this up on my blog soon to reflect the new status of these problems.)

Comment 19 Rick Regan 2013-10-09 18:58:42 UTC

I've looked into this and found that real.c/real.h use a precision of SIGNIFICAND_BITS, which is dependent on an architecture-dependent value called HOST_BITS_PER_LONG. In addition, SIGNIFICAND_BITS limits the precision to 192 bits on a 64-bit system, and I was able to find an example of an incorrect conversion there: 5.0216813883093451685872615018317116712748411717802652598273e58.

(Please see my article http://www.exploringbinary.com/gcc-conversions-are-incorrect-architecture-or-otherwise/ for details.)

Comment 20 Vincent Lefèvre 2013-10-10 01:12:38 UTC

I suppose that for any 54-bit[*] odd integer multiplied by a power of two with a large exponent (in absolute value), some decimal numbers close to this value will be affected.

[*] 54-bit for double, 25-bit for float, 65-bit for long double if x87 extended precision, 114-bit for long double if quadruple precision.

Comment 21 Joseph S. Myers 2013-11-20 14:34:51 UTC

Author: jsm28
Date: Wed Nov 20 14:34:49 2013
New Revision: 205119

URL: http://gcc.gnu.org/viewcvs?rev=205119&root=gcc&view=rev
Log:
	PR middle-end/21718
	* real.c: Remove comment about decimal string conversion and
	rounding errors.
	(real_from_string): Use MPFR to convert nonzero decimal constant
	to REAL_VALUE_TYPE.

testsuite:
	* gcc.dg/float-exact-1.c: New test.

Added:
    trunk/gcc/testsuite/gcc.dg/float-exact-1.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/real.c
    trunk/gcc/testsuite/ChangeLog

Comment 22 Joseph S. Myers 2013-11-20 14:39:44 UTC

Fixed for 4.9, at least if MPFR 3.1.1p2 or later is used (I don't know if the bug with mpfr_strtofr ternary value in older versions would affect the way mpfr_strtofr is now used in GCC, rounding towards zero with a higher precision and using the ternary value to set a sticky bit if the value was inexact, so that GCC's final rounding to the destination format is correct).

Comment 23 Joseph S. Myers 2013-11-20 14:41:05 UTC

*** Bug 55145 has been marked as a duplicate of this bug. ***

Comment 24 Rick Regan 2013-11-20 15:06:32 UTC

I don't understand -- won't "mpfr_init2 (m, SIGNIFICAND_BITS);" have the same problem? Don't we need to change the computation of SIGNIFICAND_BITS in real.h?

Comment 25 Joseph S. Myers 2013-11-20 16:34:00 UTC

Rounding to zero and setting a sticky bit based on inexactness works as long as the internal precision has at least two more bits than the final precision for which correctly rounded results are required.  (This is what Boldo and Melquiond call rounding to odd in their fma algorithm, but the basic idea is much older than that.)

Comment 26 Rick Regan 2013-11-21 19:34:30 UTC

So the problem was never lack of precision -- just lack of stickiness. We were seeing higher precision making more conversions work (e.g., more worked with 192 bits than 160), but ultimately, stickiness was required to augment the limited precision. This will fix the architecture-dependent aspect of the bug as well.

I suppose you could have fixed the existing algorithm, although admittedly using MPFR is a simple and elegant solution. BTW, why was this fixed now, and not when an MPFR based solution was discussed four/five/six years ago? You certainly could have done it a few weeks ago, before I wrote all those articles :).

Comment 27 Joseph S. Myers 2013-11-21 21:24:55 UTC

No, it was lack of precision.  MPFR may need to use many more than SIGNIFICAND_BITS internally in order to compute a result that is correctly rounded towards zero to SIGNIFICAND_BITS (plus the ternary value), but GCC doesn't need to know or care how many bits MPFR uses; all it needs to know is that SIGNIFICAND_BITS is at least 2 more than the largest number in any supported binary floating-point format.

Having got the C11 features I wanted into 4.9 I then looked for bugs on my list of conformance issues that would be more suited to development stage 1 (ends today) than to subsequent stabilization stages.  This was one.

Comment 28 Rick Regan 2013-11-21 23:04:14 UTC

Yes, that makes sense. I originally (mistakingly) thought that SIGNIFICAND_BITS was the intermediate precision for mpfr_strtofr(), like it was for gcc's original algorithm. Then I talked myself out of the "needs more precision" argument based on my interpretation of your response.

So then is the round to zero/stickiness just to avoid double rounding (as opposed to using round to nearest/no stickiness)?

BTW, I'm testing out the code. I tried a test I found in float-exact-1.c: it's the literal assigned to the double named d1c. When I run it (on a 64-bit system) with the fix I get 0x0.0000000000001p-1022; strtod() (David Gay's and glibc's) gives me 0. Also, before the fix, I get "warning: floating constant truncated to zero" and the conversion is correct; after the fix, no message, and an incorrect conversion.

Comment 29 Joseph S. Myers 2013-11-21 23:20:38 UTC

GCC supports lots of different floating-point formats, not all IEEE, including variants such as "no subnormals" or "no infinities".  Using round to zero/stickiness allows the existing GCC code that knows about the peculiarities of these formats to do the final rounding for them, instead of needing to replicate that logic specially for string conversions in order to use mpfr_subnormalize.  Simple use of round-to-nearest without setting exponent limits and mpfr_subnormalize would indeed result in problems with double rounding.

d1c is slightly above half the least subnormal (d1a is slightly below, d1b is exactly half the least subnormal), so should give the least subnormal as result, which is what I get with current glibc.  Note that if GCC is running with an MPFR version before 3.1.1p2, it's possible MPFR bugs will cause incorrect results (I don't know if the bug in question affects the way GCC uses MPFR).  (On 32-bit i686-pc-linux-gnu you'd have issues with excess precision meaning constants are first interpreted as long double then later rounded to their semantic type, hence the FLT_EVAL_METHOD conditionals.)

Comment 30 Rick Regan 2013-11-22 05:22:31 UTC

I was running with the unfixed glibc, so that mislead me into thinking that, since it matched David Gay's output, it was right. But the fixed gcc and fixed glibc both get it right (0x0.0000000000001p-1022), and David Gay's strtod() gets it wrong (0)!

(That's one nasty input -- bit position 1075 is a 1, and the next 1 after that is at bit position 3582.)

Thanks for enlightening me. Now it's time to bug Dave Gay...

Comment 31 Rick Regan 2013-11-26 02:50:26 UTC

Gay's strtod() bug was reported and fixed: http://www.exploringbinary.com/gays-strtod-returns-zero-for-inputs-just-above-2-1075/