When compiled with -Wall,

int foo (void)
{
  if (9007199254740991.4999999999999999999999999999999995 == 9007199254740991.5)
    return 1;
}

does not produce a warning about a missing return statement, showing that GCC is folding the comparison to true. However, the LHS should round down to 9007199254740991, while the RHS (an exact halfway case) rounds to even, i.e. up to 9007199254740992, so the comparison should be false. With one less 9 GCC gets it correct. I'm not convinced it's worthwhile to get perfect rounding, but Joseph wanted me to file a report.

We probably don't even get it right for all cases with DECIMAL_DIG digits for all long double formats (required by Annex F).

Here is the example requested on IRC showing that we don't get it right with DECIMAL_DIG digits and IEEE quad long double. Tested with current mainline on sparc-sun-solaris2.10 as an example target where long double is IEEE quad. Different examples may be needed for 64-bit hosts where we use more than 160 bits (but still not enough). DECIMAL_DIG is 36 for IEEE quad.

int f (void)
{
  if (0.000000000000000000000000000000000691986517009071585211899336165210746L
      == 0x0.000000000000000000000000000397cecf54a352d2cf251cf4c2d40ap+0L)
    return 1;
}

/* Above decimal is 0x3.97cecf54a352d2cf251cf4c2d40900000000000000000000000000000006cfd...p-112,
   which to within 0.5ulp (113-bit mantissa) should round up to above hex
   value.  Thus, there should be no warning with -O2 -Wall.  */

My reading of F.5#2 is that conversions of all numbers with DECIMAL_DIG significant figures are meant to be correct, not just of those which arise from converting a representable binary number to decimal (which we probably do get right). I don't think you'll get all conversions with DECIMAL_DIG digits right without at least 226 bits internal precision, and proving that any specific figure is enough would be hard.

Confirmed.

Now that we always require MPFR, it should be easy to fix this by using MPFR's facilities for parsing numbers from strings.

(In reply to comment #1)
> We probably don't even get it right for all cases with DECIMAL_DIG digits for
> all long double formats (required by Annex F).

(In reply to comment #2)
> My reading of F.5#2 is that conversions of all numbers with DECIMAL_DIG
> significant figures are meant to be correct, not just of those which arise from
> converting a representable binary number to decimal (which we probably do get
> right). I don't think you'll get all conversions with DECIMAL_DIG digits right
> without at least 226 bits internal precision, and proving that any specific
> figure is enough would be hard.

I believe more than 160 bits are required to get even single-precision numbers right with DECIMAL_DIG digits precision and an exponent. I'm going to try and prove this by finding an example. It could be hard as I believe 160 is only just below the required number. I think to do DECIMAL_DIG digits correctly for long double requires over 11,500 bits of precision.

I'm assuming above you're permitted to attach an exponent to your DECIMAL_DIG digits; that's how I read the definition of DECIMAL_DIG anyway. The exponents push up the required precision enormously.

(In reply to comment #5)
> I believe more than 160 bits are required to get even single-precision numbers
> right with DECIMAL_DIG digits precision and an exponent. I'm going to try and
> prove this by finding an example. It could be hard as I believe 160 is only
> just below the required number.

Here's an example to prove this assertion. When compiled with GCC 4.1.2 or 4.1.3 with -std=c99, assuming a correctly-rounding libc (e.g. NetBSD's; glibc also seems to get this correct), you get the following output:

0x1.8p-147 0x1.4p-147 8.40779078594890243e-45

So not only is it rounded incorrectly, but the number it is rounded to, when converted back to decimal, does not even match the input number in the first digit.

#include <stdio.h>
#include <stdlib.h>

int main (void)
{
  float f1 = 7.7071415537864938e-45;
  float f2 = strtof ("7.7071415537864938e-45", NULL);
  printf ("%a %a %0.18g\n", f1, f2, f1);
  return 0;
}

How about chucking real.c and using mpfr_t as the internal representation for real numbers inside GCC? I guess we still need something to encode/decode numbers to the target format, but IMHO we'll keep finding these corner cases until we convert everything over to MPFR underneath the hood. At that point, what's the use of having the extra layer of real.c? Maybe we use mpfr_t at the tree level and convert to REAL_VALUE_TYPE when converting over to rtl. Another option would be to put an mpfr_t inside struct real_value instead of all those bitfields.

Subject: Re: real.c rounding not perfect

On Mon, 20 Apr 2009, ghazi at gcc dot gnu dot org wrote:

> How about chucking real.c and using mpfr_t as the internal representation for
> real numbers inside GCC? I guess we still need something to encode/decode
> numbers to the target format, but IMHO we'll keep finding these corner cases
> until we convert everything over to MPFR underneath the hood. At that point,
> what's the use of having the extra layer of real.c?

I sort of imagine that real.c should provide a wrapper layer that checks whether the real number is decimal (in which case it calls the dfp.c code which uses libdecnumber), binary (in which case it uses MPFR, taking care to use the correct precision, range, subnormal handling etc. for the type in question), or split-binary (IBM long double), in which case it should use more complicated GCC-local code that ultimately uses MPFR underneath (bug 19779, bug 26374) - certainly the implementation should avoid causing problems for a future split-binary folding implementation. (A base class from which implementations for decimal, binary and split-binary derive, if you wish.)

I don't see any need for GCC to have its own implementation of the binary arithmetic - but it will need its own encoding/decoding support for all the many supported formats. I also wonder if the code for handling target formats and arithmetic on them should in principle be a library shared with GDB (which presently uses host floating-point arithmetic), but I wouldn't like to try to persuade GDB to require MPFR.

> Another option would be to put an mpfr_t inside struct real_value instead of
> all those bitfields.

You'd need to deal with the memory allocation/deallocation requirements; the MPFR allocation model is rather different from the fixed-size structures presently used.

I discovered two other examples of incorrect rounding in gcc:

1) 0.500000000000000166533453693773481063544750213623046875
2) 1.50000000000000011102230246251565404236316680908203125

These are both halfway cases. Example 1 has bit 53 equal to 1, so it should round up; gcc rounds down. Example 2 has bit 53 equal to 0, so it should round down; gcc rounds up. I've written an article describing these examples in more detail: http://www.exploringbinary.com/incorrectly-rounded-conversions-in-gcc-and-glibc/ . BTW, why doesn't gcc use David Gay's dtoa.c (http://www.netlib.org/fp/dtoa.c) for correct rounding?

Subject: Re: real.c rounding not perfect

On Fri, 4 Jun 2010, exploringbinary at gmail dot com wrote:

> BTW, why doesn't gcc use David Gay's dtoa.c (http://www.netlib.org/fp/dtoa.c)
> for correct rounding?

Anything using the host's "double" or other floating-point types is fundamentally unsuitable for GCC floating-point handling, since GCC needs to handle many different floating-point formats for different targets on a single host, including target formats with no corresponding host format. The correct solution for GCC is well-established: use GNU MPFR for decimal-to-binary conversions (and for other arithmetic), which also makes it much more straightforward to handle the discontiguous mantissa cases of IBM long double.

From: http://www.sourceware.org/bugzilla/show_bug.cgi?id=14803#c1

---
Really I'd consider this just a variant on bug 21718 (real.c rounding not perfect). That would ideally be fixed by using MPFR for this in GCC ... except that for any MPFR versions before 3.1.1p2, the bug I found with the ternary value from mpfr_strtofr could cause problems for subnormal results.
---

(In reply to comment #11)
> Really I'd consider this just a variant on bug 21718 (real.c rounding not
> perfect). That would ideally be fixed by using MPFR for this in GCC ...
> except that for any MPFR versions before 3.1.1p2, the bug I found with the
> ternary value from mpfr_strtofr could cause problems for subnormal
> results.

Alternatively, you can write code based on MPFR without using the ternary value. The algorithm would be:

1. Round to the target precision.
2. If the result is in the subnormal range (this can be detected by looking at the exponent of the result), then deduce the "real" precision from the exponent, and recompute the result in this precision directly.

Given that a fixed MPFR isn't widely available yet, is it possible to fix the rounding in real.c?

Otherwise, how about taking code from the glibc implementation of strtof/strtod/strtold? The strtod code was recently fixed; I don't know about strtold...

The glibc code is pretty complicated (using glibc's copies of the mpn_* low-level GMP functions for multiple-precision arithmetic) and entangled with other bits of glibc (it needs to handle things such as locales / thousands grouping characters, which are not relevant to GCC). And of course there is no guarantee that the host has any floating-point type corresponding to the required type on the target. Even working around the absence of a reliable ternary value in some supported MPFR versions, using MPFR for this would be much simpler than adapting the glibc code for use in GCC - it's the natural thing to do, given the use of MPFR for built-in function evaluation in GCC.

(MPFR should also be used to replace real_sqrt - real.c doesn't use enough bits internally to get a correctly rounded sqrt result in all cases. fold_builtin_sqrt already does use mpfr_sqrt.)

I can no longer find any conversions that gcc (I'm using 4.6.3) performs incorrectly, including the examples cited above. It doesn't look like there have been any related code changes in real.c, though. Is this an architecture-dependent bug, perhaps? When I first tested this three years ago I was on a 32-bit Intel machine; now I'm on a 64-bit machine. Thanks.

I confirm that it is an architecture-dependent bug. I can't reproduce any error with your test program on http://www.exploringbinary.com/incorrectly-rounded-conversions-in-gcc-and-glibc/ even with old gcc versions, from an x86_64 machine.

With GCC 4.3.5 (Debian 4.3.5-4) on Debian 6.0.7 (squeeze):

0.500000000000000166533453693773481063544750213623046875
Correct = 0x1.0000000000002p-1
gcc     = 0x1.0000000000002p-1
strtod  = 0x1.0000000000001p-1

3.518437208883201171875e13
Correct = 0x1.0000000000002p+45
gcc     = 0x1.0000000000002p+45
strtod  = 0x1.0000000000001p+45

62.5364939768271845828
Correct = 0x1.f44abd5aa7ca4p+5
gcc     = 0x1.f44abd5aa7ca4p+5
strtod  = 0x1.f44abd5aa7ca3p+5

8.10109172351e-10
Correct = 0x1.bd5cbaef0fd0cp-31
gcc     = 0x1.bd5cbaef0fd0cp-31
strtod  = 0x1.bd5cbaef0fd0cp-31

1.50000000000000011102230246251565404236316680908203125
Correct = 0x1.8p+0
gcc     = 0x1.8p+0
strtod  = 0x1.8p+0

9007199254740991.4999999999999999999999999999999995
Correct = 0x1.fffffffffffffp+52
gcc     = 0x1.fffffffffffffp+52
strtod  = 0x1.fffffffffffffp+52

If I add the -m32 option, I get the same output except:

8.10109172351e-10
Correct = 0x1.bd5cbaef0fd0cp-31
gcc     = 0x1.bd5cbaef0fd0cp-31
strtod  = 0x1.bd5cbaef0fd0dp-31

and the -mfpmath=387 option doesn't change anything.

On a Debian 6.0.7 (squeeze) 32-bit machine, with GCC 4.3.5 (Debian 4.3.5-4) and GCC 4.4.5 (Debian 4.4.5-8):

0.500000000000000166533453693773481063544750213623046875
Correct = 0x1.0000000000002p-1
gcc     = 0x1.0000000000001p-1
strtod  = 0x1.0000000000001p-1

3.518437208883201171875e13
Correct = 0x1.0000000000002p+45
gcc     = 0x1.0000000000002p+45
strtod  = 0x1.0000000000001p+45

62.5364939768271845828
Correct = 0x1.f44abd5aa7ca4p+5
gcc     = 0x1.f44abd5aa7ca4p+5
strtod  = 0x1.f44abd5aa7ca3p+5

8.10109172351e-10
Correct = 0x1.bd5cbaef0fd0cp-31
gcc     = 0x1.bd5cbaef0fd0cp-31
strtod  = 0x1.bd5cbaef0fd0dp-31

1.50000000000000011102230246251565404236316680908203125
Correct = 0x1.8p+0
gcc     = 0x1.8000000000001p+0
strtod  = 0x1.8p+0

9007199254740991.4999999999999999999999999999999995
Correct = 0x1.fffffffffffffp+52
gcc     = 0x1p+53
strtod  = 0x1.fffffffffffffp+52

Since the computations are done at compile time, it is the architecture of the GCC code that matters, not the one of the generated code (so that using -m32 on a 64-bit machine doesn't have any effect on the GCC side).

(In reply to Vincent Lefèvre from comment #17)
> I confirm that it is an architecture-dependent bug. I can't reproduce any
> error with your test program on
> http://www.exploringbinary.com/incorrectly-rounded-conversions-in-gcc-and-
> glibc/ even with old gcc versions, from an x86_64 machine.

Thanks for the quick and thorough response! Besides those testcases, I wrote an automated testcase to test gcc conversions; I have not found any incorrect ones. (In a loop, the program generates a random decimal string and puts it in a small C program that it compiles and executes on the fly. The generated C program then checks the result against David Gay's strtod(), the result of which I also coded into the program.)

As an aside, I see that you must have an older version of glibc; I have been testing strtod() in glibc 2.18 and I no longer get incorrect conversions. According to this, http://sourceware.org/bugzilla/show_bug.cgi?id=3479, it was fixed in 2.17. (I will be writing this up on my blog soon to reflect the new status of these problems.)

I've looked into this and found that real.c/real.h use a precision of SIGNIFICAND_BITS, which depends on the architecture-dependent value HOST_BITS_PER_LONG. Even on a 64-bit system, SIGNIFICAND_BITS limits the precision to 192 bits, and I was able to find an example of an incorrect conversion there: 5.0216813883093451685872615018317116712748411717802652598273e58. (Please see my article http://www.exploringbinary.com/gcc-conversions-are-incorrect-architecture-or-otherwise/ for details.)

I suppose that for any 54-bit[*] odd integer multiplied by a power of two with a large exponent (in absolute value), some decimal numbers close to this value will be affected. [*] 54-bit for double, 25-bit for float, 65-bit for long double if x87 extended precision, 114-bit for long double if quadruple precision.

Author: jsm28
Date: Wed Nov 20 14:34:49 2013
New Revision: 205119

URL: http://gcc.gnu.org/viewcvs?rev=205119&root=gcc&view=rev
Log:
	PR middle-end/21718
	* real.c: Remove comment about decimal string conversion and
	rounding errors.
	(real_from_string): Use MPFR to convert nonzero decimal constant
	to REAL_VALUE_TYPE.

testsuite:
	* gcc.dg/float-exact-1.c: New test.

Added:
    trunk/gcc/testsuite/gcc.dg/float-exact-1.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/real.c
    trunk/gcc/testsuite/ChangeLog

Fixed for 4.9, at least if MPFR 3.1.1p2 or later is used. (I don't know whether the mpfr_strtofr ternary-value bug in older versions would affect the way mpfr_strtofr is now used in GCC: rounding towards zero with a higher precision and using the ternary value to set a sticky bit if the value was inexact, so that GCC's final rounding to the destination format is correct.)

*** Bug 55145 has been marked as a duplicate of this bug. ***

I don't understand -- won't "mpfr_init2 (m, SIGNIFICAND_BITS);" have the same problem? Don't we need to change the computation of SIGNIFICAND_BITS in real.h?

Rounding to zero and setting a sticky bit based on inexactness works as long as the internal precision has at least two more bits than the final precision for which correctly rounded results are required. (This is what Boldo and Melquiond call rounding to odd in their fma algorithm, but the basic idea is much older than that.)

So the problem was never lack of precision -- just lack of stickiness. We were seeing higher precision making more conversions work (e.g., more worked with 192 bits than 160), but ultimately, stickiness was required to augment the limited precision. This will fix the architecture-dependent aspect of the bug as well. I suppose you could have fixed the existing algorithm, although admittedly using MPFR is a simple and elegant solution.

BTW, why was this fixed now, and not when an MPFR-based solution was discussed four/five/six years ago? You certainly could have done it a few weeks ago, before I wrote all those articles :).

No, it was lack of precision. MPFR may need to use many more than SIGNIFICAND_BITS internally in order to compute a result that is correctly rounded towards zero to SIGNIFICAND_BITS (plus the ternary value), but GCC doesn't need to know or care how many bits MPFR uses; all it needs to know is that SIGNIFICAND_BITS is at least 2 more than the largest number in any supported binary floating-point format. Having got the C11 features I wanted into 4.9 I then looked for bugs on my list of conformance issues that would be more suited to development stage 1 (ends today) than to subsequent stabilization stages. This was one.

Yes, that makes sense. I originally (mistakenly) thought that SIGNIFICAND_BITS was the intermediate precision for mpfr_strtofr(), as it was for GCC's original algorithm. Then I talked myself out of the "needs more precision" argument based on my interpretation of your response. So then, is the round-to-zero/stickiness just to avoid double rounding (as opposed to using round-to-nearest with no stickiness)?

BTW, I'm testing out the code. I tried a test I found in float-exact-1.c: it's the literal assigned to the double named d1c. When I run it (on a 64-bit system) with the fix I get 0x0.0000000000001p-1022; strtod() (David Gay's and glibc's) gives me 0. Also, before the fix, I get "warning: floating constant truncated to zero" and the conversion is correct; after the fix, no message, and an incorrect conversion.

GCC supports lots of different floating-point formats, not all IEEE, including variants such as "no subnormals" or "no infinities". Using round to zero/stickiness allows the existing GCC code that knows about the peculiarities of these formats to do the final rounding for them, instead of needing to replicate that logic specially for string conversions in order to use mpfr_subnormalize. Simple use of round-to-nearest without setting exponent limits and mpfr_subnormalize would indeed result in problems with double rounding.

d1c is slightly above half the least subnormal (d1a is slightly below, d1b is exactly half the least subnormal), so should give the least subnormal as result, which is what I get with current glibc.

Note that if GCC is running with an MPFR version before 3.1.1p2, it's possible MPFR bugs will cause incorrect results (I don't know if the bug in question affects the way GCC uses MPFR). (On 32-bit i686-pc-linux-gnu you'd have issues with excess precision meaning constants are first interpreted as long double then later rounded to their semantic type, hence the FLT_EVAL_METHOD conditionals.)

I was running with the unfixed glibc, so that misled me into thinking that, since it matched David Gay's output, it was right. But the fixed gcc and fixed glibc both get it right (0x0.0000000000001p-1022), and David Gay's strtod() gets it wrong (0)! (That's one nasty input -- bit position 1075 is a 1, and the next 1 after that is at bit position 3582.) Thanks for enlightening me. Now it's time to bug Dave Gay...

Gay's strtod() bug was reported and fixed: http://www.exploringbinary.com/gays-strtod-returns-zero-for-inputs-just-above-2-1075/