#include <stdio.h> #include <math.h> int main() { double a= 10.0; double b= 1e+308; printf("%d %d %d\n", isinf(a*b), __builtin_isinf(a*b), __isinf(a*b)); return 0; } mtaylor@drizzle-dev:~$ gcc -o test test.c mtaylor@drizzle-dev:~$ ./test 0 0 1 mtaylor@drizzle-dev:~$ gcc -o test test.c -std=c99 mtaylor@drizzle-dev:~$ ./test 1 0 1 mtaylor@drizzle-dev:~$ gcc -o test test.c -mfpmath=sse -march=pentium4 mtaylor@drizzle-dev:~$ ./test 1 1 1 mtaylor@drizzle-dev:~$ g++ -o test test.c mtaylor@drizzle-dev:~$ ./test 1 0 1 Originally I found the simple isinf() case to be different on x86 than x86-64, ppc32 and sparc (32 and 64). After more research, I found that x86-64 uses the sse instructions to do it (and using sse is the only way for __builtin_isinf() to produce correct results). For the g++ built version, it calls __isinf() instead of inlining (and as can be seen, the __isinf() version is always correct). Specifically, it's because the optimised 387 code is doing the math in double extended precision inside the FPU. 10.0*1e308 fits in 80bits but not in 64bit. Any code that forces it to be stored and loaded gets the correct result too. e.g. mtaylor@drizzle-dev:~$ cat test-simple.c #include <stdio.h> #include <math.h> int main() { double a= 10.0; double b= 1e+308; volatile double c= a*b; printf("%d\n", isinf(c)); return 0; } mtaylor@drizzle-dev:~$ gcc -o test-simple test-simple.c mtaylor@drizzle-dev:~$ ./test-simple 1 With this code you can easily see the load and store: 8048407: dc 0d 18 85 04 08 fmull 0x8048518 804840d: dd 5d f0 fstpl -0x10(%ebp) 8048410: dd 45 f0 fldl -0x10(%ebp) 8048413: d9 e5 fxam While if you remove volatile, the load and store doesn't happen (at least on -O3, on -O0 it hasn't been optimised away): 8048407: dc 0d 18 85 04 08 fmull 0x8048518 804840d: c7 44 24 04 10 85 04 movl $0x8048510,0x4(%esp) 8048414: 08 8048415: c7 04 24 01 00 00 00 movl $0x1,(%esp) 804841c: d9 e5 fxam This is also a regression from 4.2.4 as it just calls isinf() and doesn't expand the 387 code inline. My guess is the 387 optimisation was added in 4.3. Recommended fix: store and load in the 387 version so to operate on same precision as elsewhere.
Looking into it.
Created attachment 17323 [details] patch Patch currently in testing.
To implement a work around for us, I'm proposing the patch below. - the tmp2 being volatile was for who knows what reason (old code) - The check should (on c99 systems or those with the right compile options enabled, which we do have) detect that we calculate at 80bits but store at 64. - the volatile temp should ensure a write to memory (and subsequent read) - where the check is false, the whole code path should be optimised out. Perhaps it should also be surrounded by an ifdef for the specific gcc version (3.4)? Thoughts are very welcome. === modified file 'drizzled/function/math/round.cc' --- drizzled/function/math/round.cc 2008-12-16 06:21:46 +0000 +++ drizzled/function/math/round.cc 2009-02-18 09:59:40 +0000 @@ -111,14 +111,22 @@ tmp2 is here to avoid return the value with 80 bit precision This will fix that the test round(0.1,1) = round(0.1,1) is true */ - volatile double tmp2; + double tmp2; tmp=(abs_dec < array_elements(log_10) ? log_10[abs_dec] : pow(10.0,(double) abs_dec)); + double value_times_tmp= value * tmp; + + if(sizeof(double) < sizeof(double_t)) + { + volatile double t= value_times_tmp; + value_times_tmp= t; + } + if (dec_negative && isinf(tmp)) tmp2= 0; - else if (!dec_negative && isinf(value * tmp)) + else if (!dec_negative && isinf(value_times_tmp)) tmp2= value; else if (truncate) {
Patch at http://gcc.gnu.org/ml/gcc-patches/2009-02/msg00825.html
Subject: Bug 39228 Author: uros Date: Thu Feb 19 10:51:04 2009 New Revision: 144293 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=144293 Log: PR target/39228 * config/i386/i386.md (isinfxf2): Split from isinf<mode>2. (UNSPEC_FXAM_MEM): New unspec. (fxam<mode>2_i387_with_temp): New insn and split pattern. (isinf<mode>2): Use MODEF mode iterator. Force operand[1] through memory using fxam<mode>2_i387_with_temp to remove excess precision. testsuite/ChangeLog: PR target/39228 * gcc.c-torture/execute/pr39228.c: New test. Added: trunk/gcc/testsuite/gcc.c-torture/execute/pr39228.c Modified: trunk/gcc/ChangeLog trunk/gcc/config/i386/i386.md trunk/gcc/testsuite/ChangeLog
Subject: Bug 39228 Author: uros Date: Thu Feb 19 12:44:40 2009 New Revision: 144295 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=144295 Log: PR target/39228 * config/i386/i386.md (isinfxf2): Split from isinf<mode>2. (UNSPEC_FXAM_MEM): New unspec. (fxam<mode>2_i387_with_temp): New insn and split pattern. (isinf<mode>2): Use MODEF mode iterator. Force operand[1] through memory using fxam<mode>2_i387_with_temp to remove excess precision. testsuite/ChangeLog: PR target/39228 * gcc.c-torture/execute/pr39228.c: New test. Added: branches/gcc-4_3-branch/gcc/testsuite/gcc.c-torture/execute/pr39228.c Modified: branches/gcc-4_3-branch/gcc/ChangeLog branches/gcc-4_3-branch/gcc/config/i386/i386.md branches/gcc-4_3-branch/gcc/testsuite/ChangeLog
Fixed.
> Added: > trunk/gcc/testsuite/gcc.c-torture/execute/pr39228.c SH requires -mieee option for the newly added pr39228.c test. I guess that alpha is on the same boat. Also it'll fail on the targets like SPU which have no inf/nan supports. We need gcc.c-torture/execute/pr39228.x if { [istarget "alpha*-*-*"] || [istarget "sh*-*-*"] } { # alpha and SH require -mieee for this test. set additional_flags "-mieee" } if [istarget "spu-*-*"] { # No Inf/NaN support on SPU. return 1 } return 0 or something like that, doesn't we?