Hi. I am testing a floating-point division algorithm. I have tested several thousand division cases. In most cases the result was the same as the result computed by the FPU. However, I found an error case for double-precision floating-point division. The case is as follows:

n = 1.600228...; d = 1.312790...; q = n/d;

Error in q: the last bit of the output is incorrect with the 32-bit GCC compiler. (FPU rc = round to nearest even, the default rounding mode.) (The actual numbers are written below as 64-bit hexadecimal values.)

The following tools & environments output the wrong answer:
- i386-redhat-linux gcc 3.4.6 (Intel P4; AMD Opteron was also tested)
- Windows XP / Cygwin / gcc 3.4.4 (Intel Core 2 Duo)

The following tools & environments output the right answer:
- Solaris 10 gcc 3.2.3 (SPARC)
- x86_64-redhat-linux gcc 3.4.6 (AMD Opteron)
- Visual C++ 6.0 (Intel Core 2 Duo)

Only gcc for the 32-bit environment outputs an incorrect result. I think there may be a problem in the FPU rc (rounding control) in GCC for the 32-bit environment. I don't think it is a problem of the CPU hardware design, because other compilers output the correct result. Although it is a very small error in a floating-point calculation, gcc might be regarded as an unreliable compiler in my field, computer arithmetic. I hope this problem will be fixed as soon as possible.
Regards, Inwook Kong

------------ source code ------------
#include <stdio.h>

typedef unsigned long long int UINT64;

int main(void)
{
    double n, d, q;

    // test case
    *((UINT64*)&n) = 0x3ff99a89160f4c0ell;
    *((UINT64*)&d) = 0x3ff5012fcd164611ll;
    q = n / d;
    printf("q=%llx\n", *((UINT64*)&q));
    return 0;
}

-------------- output --------------
[output: i386-redhat-linux gcc 3.4.6]
q=3ff380d464d7da48  <- incorrect value

[correct output: Solaris 10 gcc 3.2.3, x86_64-redhat-linux gcc 3.4.6]
q=3ff380d464d7da47  <- right value per the IEEE-754 standard

-------------- environment --------------
Reading specs from /usr/lib/gcc/i386-redhat-linux/3.4.6/specs
Configured with: ../configure --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --enable-shared --enable-threads=posix --disable-checking --with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions --enable-java-awt=gtk --host=i386-redhat-linux
Thread model: posix
gcc version 3.4.6 20060404 (Red Hat 3.4.6-9)

[GCC options]
CFLAGS = -c -save-temps
LFLAGS = -lm
*** This bug has been marked as a duplicate of 323 ***
It's not a simple floating-point error! I fully understand that a decimal number can't always be converted to an exact floating-point number, so a result may differ from our expectation. This problem has no relation to the classic mistake of expecting (0.3+0.6) == 0.9. For a single BASIC arithmetic operation, the output should be the same across compilers if the compilers comply with the IEEE-754 standard and the inputs are exactly the same. If this bug is not corrected, 32-bit GCC is NOT compliant with the IEEE-754 standard.
This problem is not a duplicate of #323.
The i?86 FPU computes values in 80-bit extended precision by default. Starting with GCC 4.3 you can use -mpc64 to reduce its precision to the 64-bit double format, thereby obtaining the valid IEEE-754 result. Or you can use -mfpmath=sse to do the math in SSE registers (whose double operations are rounded directly to the 64-bit double format). *** This bug has been marked as a duplicate of 323 ***
The division is done in 80-bit extended precision and then the result is rounded again to the 64-bit double format when it is stored. This is not really a bug; it is a misunderstanding of how the x87 FPU works.

fldl    -24(%ebp)
fldl    -32(%ebp)
fdivrp  %st, %st(1)
fstpl   -40(%ebp)

Either use -mfpmath=sse or don't use x86.
One more comment about this code: you are violating the C/C++ aliasing rules.
Although I knew GCC uses an 80-bit format internally, I thought the result should be the same in the 80-bit format. Thanks to the very kind explanation of my problem, I now understand that the result can change because the rounding operation occurs twice. The cause of the problem was 'double rounding'. Ironically, the 80-bit internal format makes a LARGER ERROR in 64-bit floating-point calculation. For 64-bit output, a 64-bit internal format will generate the best result, because IEEE-754 intends a result to be nearest to the true value computed at infinite precision. If GCC adopts the 80-bit internal format, the final result in 64-bit floating point will not be the best because of the unavoidable double rounding. So I think it was a very good decision to support a 64-bit internal format in GCC 4.x. If a user uses 64-bit floating-point numbers, the user will get a more correct result with this option. Also, I really appreciate that Andrew Pinski let me know about my aliasing-rule violation in C/C++. Thanks to Andrew's kind advice, I learned how much such a violation can affect program correctness and performance; it was not just a code-portability issue. Thank you again. I really appreciate you all.
Actually, the 80-bit internal format will be better for converting a decimal number into a floating-point number. In this respect, the 80-bit internal format may be useful.
Subject: An incorrect result in a simple division, only in 32-bit gcc.

> Although I knew GCC uses an 80-bit format internally, I thought the result
> should be the same in the 80-bit format.

No, it's not that gcc uses 80 bits or 64 bits "internally". It's that the i387 does computations for 64-bit double types in 80-bit extended mode by default. So it's not gcc that determines this, or that gcc changed anything and suddenly decided "to support a 64-bit internal format in GCC 4.x". gcc's double-precision type has always been and always will be 64 bits on 32-bit i386. That has not changed in 4.x. It's a matter of how the i387 hardware is configured. The -mpc64 switch changes nothing about gcc's internal representation; it simply emits extra code that sets the FPU control word to tell the hardware not to use extended precision. This is not a change in gcc per se.
Thanks to Brian's kind comment, I now understand the exact mechanism. In my understanding, the conclusion is as follows:
1. GCC 3.x doesn't generate any code for 80-bit precision; the FPU hardware just uses the 80-bit internal format by default.
2. The double rounding (80-bit rounding during the calculation, then 64-bit rounding when executing the FST instruction) inside the FPU (not in gcc-generated code) causes this issue.
3. GCC 4.x can add initialization code to change the default precision of the FPU.
I really appreciate that I have learned many things about this issue from many great experts around the world.
FTR, this PR is linked to from here ( https://lemire.me/blog/2020/06/26/gcc-not-nearest/ ).