This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.
[Bug c/32180] New: Paranoia UCB GSL TestFloat libm tests fail - accuracy of recent gcc math poor
- From: "rob1weld at aol dot com" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: 1 Jun 2007 15:56:37 -0000
- Subject: [Bug c/32180] New: Paranoia UCB GSL TestFloat libm tests fail - accuracy of recent gcc math poor
- Reply-to: gcc-bugzilla at gcc dot gnu dot org
GCC 4.3.0 compiled on Linux does NOT pass as many tests as GCC 3.4.4 for
Cygwin.
How seriously will people take _newer_ versions of gcc if they can't pass the
same tests that older versions did? Something has slipped over the years.
Now that I have a great compiler I decided to run some tests to see how well it
works. I've done the usual "make -i check" tests and submitted the results.
I decided to try some math library tests available on the internet.
These tests are designed to catch flaws, not to check trivial math operations.
They were written by people whose life's work is mathematics.
My Cygwin gcc 3.4.4 passed "almost" all these tests; my Linux compilers did
not. I did not build _all_ the Linux versions of gcc myself; I only built
the 4.2.0 and 4.3.0 versions. Please obtain and build these tests yourselves.
I did not run every single test on every possible version, but I did run the
shortest test on every gcc I have. I am satisfied that gcc 4.3.0 does not pass
all the tests that I tried and that Cygwin gcc 3.4.4 passed almost
every test I tried.
Here are some of my notes:
Platform           GCC Version                    Output File Name
i686-pc-cygwin     gcc 3.4.4 release              paranoia_3.4.4_release-cygwin.txt
i686-pc-linux-gnu  gcc 4.2.0 20070501             paranoia_4.2.0_20070501-linux.txt
i686-pc-linux-gnu  gcc 4.3.0 20070529             paranoia_4.3.0_20070529-linux.txt
i486-linux         gcc 3.3.5 (Debian 1:3.3.5-13)  paranoia_3.3.5_Debian-1:3.3.5-13-linux.txt
i486-linux-gnu     gcc 3.4.6 (Debian 3.4.6-5)     paranoia_3.4.6_Debian-3.4.6-5-linux.txt
i486-linux-gnu     gcc 4.1.2 (Debian 4.1.1-21)    paranoia_4.1.2_Debian-4.1.1-21-linux.txt
All these diffs produce no output; the Linux results are all identical.
# diff -Naur paranoia_4.2.0_20070501-linux.txt paranoia_4.3.0_20070529-linux.txt
# diff -Naur paranoia_4.2.0_20070501-linux.txt paranoia_3.3.5_Debian-1:3.3.5-13-linux.txt
# diff -Naur paranoia_4.2.0_20070501-linux.txt paranoia_3.4.6_Debian-3.4.6-5-linux.txt
# diff -Naur paranoia_4.2.0_20070501-linux.txt paranoia_4.1.2_Debian-4.1.1-21-linux.txt
#
Here is the diff for Cygwin vs. Linux:
# diff -Naur paranoia_3.4.4_release-cygwin.txt paranoia_4.3.0_20070529-linux.txt
--- paranoia_3.4.4_release-cygwin.txt 2007-05-31 11:02:18.000000000 -0700
+++ paranoia_4.3.0_20070529-linux.txt 2007-05-31 11:04:38.000000000 -0700
@@ -127,7 +127,8 @@
Test for sqrt monotonicity.
sqrt has passed a test for Monotonicity.
Testing whether sqrt is rounded or chopped.
-Square root appears to be correctly rounded.
+Square root is neither chopped nor correctly rounded.
+Observed errors run from -5.0000000e-01 to 5.0000000e-01 ulps.
To continue, press RETURN
Diagnosis resumes after milestone Number 90 Page: 7
@@ -152,7 +153,11 @@
This computed value is O.K.
Testing X^((X + 1) / (X - 1)) vs. exp(2) = 7.38905609893065218e+00 as X -> 1.
-Accuracy seems adequate.
+DEFECT: Calculated 7.38905609548934539e+00 for
+ (1 + (-1.11022302462515654e-16) ^ (-1.80143985094819840e+16);
+ differs from correct value by -3.44130679508225512e-09 .
+ This much error may spoil financial
+ calculations involving tiny interest rates.
Testing powers Z^Q at four nearly extreme values.
... no discrepancies found.
@@ -188,7 +193,9 @@
Diagnosis resumes after milestone Number 220 Page: 10
+The number of DEFECTs discovered = 1.
The number of FLAWs discovered = 1.
-The arithmetic diagnosed seems Satisfactory though flawed.
+The arithmetic diagnosed may be Acceptable
+despite inconvenient Defects.
END OF TEST.
Both Cygwin's gcc 3.4.4 and Linux gcc 3.3.5, 3.4.6, 4.1.2, 4.2.0 and 4.3.0
show the sticky bit used incorrectly or not at all, which the test counts as a
"flaw"; the overall result may still be "Satisfactory".
Checking rounding on multiply, divide and add/subtract.
* is neither chopped nor correctly rounded.
/ is neither chopped nor correctly rounded.
Addition/Subtraction neither rounds nor chops.
Sticky bit used incorrectly or not at all.
FLAW: lack(s) of guard digits or failure(s) to correctly round or chop
(noted above) count as one flaw in the final tally below.
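One thing that can make basic operations look "neither chopped nor correctly
rounded" on x86 is evaluation in a wider intermediate format followed by a
second rounding to double. Whether that is what happens here is an assumption;
the Paranoia output alone does not prove it. A minimal C99 sketch that reports
the compiler's evaluation method and probes addition for double rounding
(eval_method and add_rounds_once are made-up names for illustration):

```c
#include <float.h>

/* Evaluation method advertised by the compiler (C99 <float.h>):
   0 = evaluate in the declared type, 1 = float evaluated as double,
   2 = everything evaluated as long double, -1 = indeterminable.
   Returns -2 if the macro is not defined at all. */
int eval_method(void)
{
#ifdef FLT_EVAL_METHOD
    return FLT_EVAL_METHOD;
#else
    return -2;
#endif
}

/* Probe for double rounding in a + b (round-to-nearest assumed).
   b is slightly more than half an ulp of 1.0, so a single rounding to
   double must give 1.0 + DBL_EPSILON. If the sum is first rounded to a
   64-bit significand and only then to double, the result is 1.0.
   Returns 1 for a single rounding, 0 otherwise. */
int add_rounds_once(void)
{
    volatile double a = 1.0;
    volatile double b = 0x1p-53 + 0x1p-78;   /* exactly representable */
    volatile double sum = a + b;             /* the store forces a double */
    return sum == 1.0 + DBL_EPSILON;
}
```

A result of 0 from add_rounds_once() would point at extended-precision
intermediates; a result of 1 rules out this particular effect for addition.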
Only the Linux gcc builds, and not Cygwin's gcc 3.4.4 (release), are off by
-3.44130679508225512e-09 on one of the calculations. That is not much to some
people; for others it is a lot. This led me to more testing.
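The failing Paranoia check can be tried in isolation. The sketch below assumes
IEEE double precision and C99 hex float literals; limit_exponent, limit_error
and the pow_fn type are hypothetical names, and the reference value for exp(2)
is the one Paranoia prints in the diff above. The power function is passed in
as a pointer so that different implementations can be compared.

```c
/* Signature shared by pow() and any replacement to be compared. */
typedef double (*pow_fn)(double base, double exponent);

/* exp(2) as printed by Paranoia in the report above. */
#define EXP2_REFERENCE 7.38905609893065218e+00

/* Exponent (x + 1) / (x - 1); each step is stored to a volatile double
   so that it is rounded to double precision before the next step. */
double limit_exponent(double x)
{
    volatile double num = x + 1.0;
    volatile double den = x - 1.0;
    volatile double q = num / den;
    return q;
}

/* Error of f(x, (x + 1) / (x - 1)) against exp(2) for x close to 1. */
double limit_error(pow_fn f, double x)
{
    volatile double y = f(x, limit_exponent(x));
    return y - EXP2_REFERENCE;
}
```

Calling limit_error(pow, 1.0 - 0x1p-53), with pow from <math.h> and the program
linked with -lm, uses the same operands as the DEFECT message: the base is
1 + (-1.11022302462515654e-16) and the exponent is -1.80143985094819840e+16.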
Here is another math test - http://www.netlib.org/fp/ucbtest.tgz :
# ucbREADME/linux.sh
Total 60 tests: pass 59, flags err 0, value err 1, acosd
Total 352 tests: pass 352, flags err 0, value err 0, addd
Total 77 tests: pass 77, flags err 0, value err 0, asind
Total 104 tests: pass 104, flags err 0, value err 0, atan2d
Total 57 tests: pass 57, flags err 0, value err 0, atand
Total 126 tests: pass 126, flags err 0, value err 0, cabsd
Total 99 tests: pass 99, flags err 0, value err 0, ceild
Total 53 tests: pass 53, flags err 0, value err 0, cosd
Total 68 tests: pass 56, flags err 0, value err 12, coshd
Total 383 tests: pass 383, flags err 0, value err 0, divd
Total 97 tests: pass 86, flags err 0, value err 11, expd
Total 37 tests: pass 37, flags err 0, value err 0, fabsd
Total 103 tests: pass 103, flags err 0, value err 0, floord
Total 352 tests: pass 352, flags err 0, value err 0, fmodd
Total 126 tests: pass 126, flags err 0, value err 0, hypotd
Total 89 tests: pass 89, flags err 0, value err 0, log10d
Total 83 tests: pass 83, flags err 0, value err 0, logd
Total 340 tests: pass 340, flags err 0, value err 0, muld
Total 1543 tests: pass 1505, flags err 0, value err 38, powd
Total 52 tests: pass 52, flags err 0, value err 0, sind
Total 72 tests: pass 66, flags err 0, value err 6, sinhd
Total 102 tests: pass 102, flags err 0, value err 0, sqrtd
Total 321 tests: pass 321, flags err 0, value err 0, subd
Total 54 tests: pass 54, flags err 0, value err 0, tand
Total 72 tests: pass 68, flags err 0, value err 4, tanhd
That doesn't happen with Cygwin's gcc 3.4.4.
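A "value err" count does not say how far off a result is. A minimal sketch,
assuming 64-bit IEEE doubles, that measures the distance between a computed
and an expected value in units in the last place (ulps). ulp_distance is a
hypothetical helper, not part of ucbtest, and it does not handle NaN.

```c
#include <stdint.h>
#include <string.h>

/* Map a double onto an integer scale on which adjacent doubles differ
   by 1 and both +0.0 and -0.0 map to 0. */
static int64_t ordered_bits(double x)
{
    int64_t i;
    memcpy(&i, &x, sizeof i);          /* reinterpret the bit pattern */
    return i < 0 ? INT64_MIN - i : i;  /* mirror the negative half */
}

/* Number of representable doubles between computed and expected. */
uint64_t ulp_distance(double computed, double expected)
{
    int64_t a = ordered_bits(computed);
    int64_t b = ordered_bits(expected);
    /* Subtract as unsigned so opposite-sign inputs cannot overflow. */
    return a > b ? (uint64_t)a - (uint64_t)b
                 : (uint64_t)b - (uint64_t)a;
}
```

Printing ulp_distance(result, expected) for each failing case would show
whether the errors are last-bit differences or something much larger.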
Here are the TestFloat / SoftFloat tests:
TestFloat is a program for testing whether a computer's floating-point conforms
to the IEC/IEEE Standard for Binary Floating-point Arithmetic. TestFloat works
by comparing the behavior of the machine's floating-point with that of the
SoftFloat software implementation of floating-point. Any differences found are
reported as probable errors in the machine's floating-point.
http://www.jhauser.us/arithmetic/TestFloat.html
TestFloat and SoftFloat source files
compress'ed tar archive, TestFloat-2a.tar.Z [150 kB].
http://www.jhauser.us/arithmetic/TestFloat-2a.tar.Z
compress'ed tar archive, SoftFloat-2b.tar.Z [165 kB].
http://www.jhauser.us/arithmetic/SoftFloat-2b.tar.Z
Here are the results of only two of the tests:
# ./testsoftfloat -level 1 -errors 2000000 int32_to_float32 float32_add > /dev/null
Testing float32_add, rounding nearest_even.
46464 tests total.
46464 tests performed; 39069 errors found.
Testing float32_add, rounding to_zero.
46464 tests total.
46464 tests performed; 39099 errors found.
Testing float32_add, rounding down.
46464 tests total.
46464 tests performed; 39213 errors found.
Testing float32_add, rounding up.
46464 tests total.
46464 tests performed; 39150 errors found.
# ./testsoftfloat -level 1 -errors 2000000 int32_to_float32 float32_eq > /dev/null
Testing float32_eq.
46464 tests total.
46464 tests performed; 1321 errors found.
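The idea described above, comparing the machine's result with a software
reference for the same bit patterns, can be sketched for the float32_eq case.
This assumes 32-bit IEEE floats; soft_eq, hard_eq and count_eq_mismatches are
illustrative names and are not taken from the TestFloat or SoftFloat sources.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Reinterpret a 32-bit pattern as a float without aliasing problems. */
static float float_from_bits(uint32_t bits)
{
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* NaN: exponent field all ones and a nonzero fraction. */
static int is_nan_bits(uint32_t bits)
{
    return (bits & 0x7f800000u) == 0x7f800000u && (bits & 0x007fffffu) != 0;
}

/* Software reference for IEEE equality on raw bit patterns:
   NaN compares unequal to everything, +0 and -0 compare equal. */
int soft_eq(uint32_t a, uint32_t b)
{
    if (is_nan_bits(a) || is_nan_bits(b))
        return 0;
    return a == b || (uint32_t)((a | b) << 1) == 0;
}

/* The same comparison done by the compiler and hardware. */
int hard_eq(uint32_t a, uint32_t b)
{
    volatile float x = float_from_bits(a);
    volatile float y = float_from_bits(b);
    return x == y;
}

/* Count the pairs on which the two implementations disagree. */
size_t count_eq_mismatches(const uint32_t *patterns, size_t n)
{
    size_t mismatches = 0;
    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j < n; j++)
            if (soft_eq(patterns[i], patterns[j]) !=
                hard_eq(patterns[i], patterns[j]))
                mismatches++;
    return mismatches;
}
```

Feeding it patterns such as both zeros, a subnormal, 1.0, infinity and a NaN
exercises the special cases of the comparison.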
Finally I tested GSL - GNU Scientific Library
http://www.gnu.org/software/gsl
The GNU Scientific Library (GSL) is a numerical library for C and C++
programmers. The current version is GSL-1.9. It was released on 21 February
2007. This is a stable release.
ftp://ftp.gnu.org/gnu/gsl/
02/20/2007 03:36PM 2,574,939 gsl-1.9.tar.gz
ftp://ftp.gnu.org/gnu/gsl/gsl-1.9.tar.gz
When compiled and checked, GSL works flawlessly under Cygwin gcc 3.4.4 but
fails both the interpolation and sort tests under gcc 4.3.0 20070529.
WinXP - i686-pc-cygwin:
$ cd /cygdrive/c/gsl-1.9
$ make -i -k check 2>&1 | tee check_1_log.txt
$ grep -B 2 -A 2 fail check_1_log.txt
$ (Prints NOTHING)
Debian - i686-pc-linux-gnu:
# cd /root/downloads/gsl-1.9
# make -i -k check 2>&1 | tee check_1_log.txt
# grep -B 2 -A 2 fail check_1_log.txt
FAIL: test
===================
1 of 1 tests failed
===================
make[2]: [check-TESTS] Error 1 (ignored)
--
FAIL: test
===================
1 of 1 tests failed
===================
make[2]: [check-TESTS] Error 1 (ignored)
Cygwin:
make[2]: Entering directory `/cygdrive/c/gsl-1.9/interpolation'
Completed [1100/1100]
PASS: test.exe
==================
All 1 tests passed
==================
make[2]: Leaving directory `/cygdrive/c/gsl-1.9/interpolation'
Cygwin:
make[2]: Entering directory `/cygdrive/c/gsl-1.9/sort'
Completed [21600/21600]
PASS: test.exe
==================
All 1 tests passed
==================
make[2]: Leaving directory `/cygdrive/c/gsl-1.9/sort'
Linux:
make[2]: Entering directory `/root/downloads/gsl-1.9/interpolation'
FAIL: gsl_interp_eval_e linear [7]
FAIL: gsl_interp_eval_deriv_e linear [8]
FAIL: linear deriv 0 (0 observed vs 5.30544087554984718e-315 expected) [test uses subnormal value] [11]
FAIL: linear integ 0 (0 observed vs 2.19361877441406383 expected) [12]
(Over 900 lines of FAIL:)
FAIL: cspline-periodic 60 (4.99961591105934785e-270 observed vs 6.8941659943411544 expected) [1072]
FAIL: cspline-periodic deriv 60 (0 observed vs 7.16157787531728253e-313 expected) [test uses subnormal value] [1073]
FAIL: cspline-periodic integ 60 (0 observed vs 6.89057922363291553 expected) [1074]
FAIL: cspline periodic 3pt interpolation [1075]
FAIL: test
===================
1 of 1 tests failed
===================
make[2]: [check-TESTS] Error 1 (ignored)
make[2]: Leaving directory `/root/downloads/gsl-1.9/interpolation'
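Some of the failing checks above are annotated "[test uses subnormal value]".
A quick probe, assuming IEEE doubles, of whether the floating-point
environment keeps subnormal results or flushes them to zero.
subnormals_preserved is a hypothetical helper, and it says nothing about the
failures that do not involve subnormals.

```c
#include <float.h>

/* Returns 1 if halving the smallest normal double yields a nonzero
   subnormal that doubles back to DBL_MIN, 0 if it is flushed to zero
   or otherwise lost. The volatile stores keep the compiler from
   folding the arithmetic at compile time. */
int subnormals_preserved(void)
{
    volatile double smallest_normal = DBL_MIN;
    volatile double half = smallest_normal / 2.0;  /* subnormal if kept */
    volatile double back = half * 2.0;
    return half > 0.0 && half < DBL_MIN && back == DBL_MIN;
}
```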
Linux:
make[2]: Entering directory `/root/downloads/gsl-1.9/sort'
FAIL: indexing gsl_vector_char, n = 128, stride = 1, ordered [19999]
FAIL: sorting, gsl_vector_char, n = 128, stride = 1, ordered [20000]
FAIL: smallest, gsl_vector_char, n = 128, stride = 1, ordered [20001]
FAIL: largest, gsl_vector_char, n = 128, stride = 1, ordered [20002]
(Over 120 lines of FAIL:)
FAIL: sorting, gsl_vector_char, n = 512, stride = 3, randomized [21596]
FAIL: smallest, gsl_vector_char, n = 512, stride = 3, randomized [21597]
FAIL: largest, gsl_vector_char, n = 512, stride = 3, randomized [21598]
FAIL: test
===================
1 of 1 tests failed
===================
make[2]: [check-TESTS] Error 1 (ignored)
make[2]: Leaving directory `/root/downloads/gsl-1.9/sort'
GCC needs someone with a Ph.D. in math to give it a once-over. If an old
version of gcc can pass these tests, then some error is preventing newer
versions from passing.
DRAFT Standard for Floating-Point Arithmetic P754 - Draft 1.3.0
Modified at 17:15 GMT on February 23, 2007
http://www.validlab.com/754R/drafts/archive/2007-02-23.pdf
--
Summary: Paranoia UCB GSL TestFloat libm tests fail - accuracy of recent gcc math poor
Product: gcc
Version: 4.3.0
Status: UNCONFIRMED
Severity: major
Priority: P3
Component: c
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: rob1weld at aol dot com
GCC build triplet: i686-pc-linux-gnu
GCC host triplet: i686-pc-linux-gnu
GCC target triplet: i686-pc-linux-gnu
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32180