Bug 47692 - Numeric inaccuracy reported in testing lapack-3.3.0 BLAS module
Status: RESOLVED INVALID
Alias: None
Product: gcc
Classification: Unclassified
Component: fortran
Version: 4.6.0
Importance: P3 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
Reported: 2011-02-11 00:26 UTC by John T
Modified: 2012-06-26 16:16 UTC

Description John T 2011-02-11 00:26:54 UTC
A number of BLAS testing results were not clean. Some results were reported to be suspect and others were reported to be fatal errors. Here's a paste of one such result:


 ******* FATAL ERROR - COMPUTED RESULT IS LESS THAN HALF ACCURATE *******
                       EXPECTED RESULT                    COMPUTED RESULT
       1  (   0.551243    ,  -0.533049E-01)  (   0.551243    ,  -0.533049E-01)
       2  (  -0.816325E-01,   0.389502    )  (  -0.816325E-01,   0.389502    )
 ******* CGEMV  FAILED ON CALL NUMBER:
     10: CGEMV ('N',  2,  1,( 0.7,-0.9), A,  3, X, 1,( 0.0, 0.0), Y, 1)         .


I don't know why the BLAS routines didn't test cleanly, but it appears that the most severe results were in Complex Level I BLAS. There are some REAL and DOUBLE problems too. This is a well-established numeric library that, as I recall, tested cleanly with gfortran 4.4.5.

The results from testing BLAS and Lapack are in two text files that I can make available, though independent verification is of course needed for this.
Comment 1 Andrew Pinski 2011-02-11 00:28:28 UTC
What target are you running on?
Comment 2 John T 2011-02-11 00:33:39 UTC
I should have included that this bug applies to a Mandriva 2008.1 Duron x86 system with kernel 2.6.24, libc 2.7.
Comment 3 John T 2011-02-11 00:42:13 UTC
I must be tired. Gotta work tonight. The GCC 4.6 is the 20110205 snapshot.
Comment 4 kargls 2011-02-11 01:43:34 UTC
Did you build blas or download a pre-compiled version?
What were your compiler options?  Note, one should not
build smalach.f and dmalach.f with any optimization
(I may have misremembered the file names).
Comment 5 Harald Anlauf 2011-02-11 19:44:12 UTC
(In reply to comment #4)
> Did you build blas or download a pre-compiled version?
> What were your compiler options?  Note, one should not
> build smalach.f and dmalach.f with any optimization
> (I may have misremembered the file names).

With LAPACK-3.3.0 dlamch has been completely rewritten and
should return the same results at any optimization level.
It now uses the Fortran numeric inquiry intrinsics.
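
As a rough illustration of what a numeric inquiry buys here: the precision is asked of the implementation instead of being estimated at run time, so optimization flags and register width cannot change the answer. Fortran spells this EPSILON(X); the sketch below uses the C analogue from <float.h>. The function names are hypothetical and this is not the LAPACK code itself.

```c
#include <float.h>

/* Single-precision machine epsilon as reported by the implementation.
   C counterpart of Fortran's EPSILON(1.0); no run-time arithmetic involved. */
float single_epsilon(void)
{
    return FLT_EPSILON;
}

/* Double-precision machine epsilon, counterpart of EPSILON(1.0D0). */
double double_epsilon(void)
{
    return DBL_EPSILON;
}
```

On an IEEE 754 system these are 2^-23 and 2^-52, whatever the optimization level.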
Comment 6 John T 2011-02-11 19:56:02 UTC
I built the reference BLAS included with LAPACK from source. I just got the results from blas_testing using gcc-4.4.5 and the results are good again. I don't know where to find the raw results from the LAPACK and BLAS testing. Should there be an IEEE flag in the compiler settings? Are there any flags that control rounding?

My flags were:
#
FORTRAN  = gfortran -fimplicit-none -g
OPTS     = -O3
DRVOPTS  = $(OPTS)
NOOPT    = -g -O0
LOADER   = gfortran -g
LOADOPTS =
Comment 7 Jerry DeLisle 2011-02-11 20:03:25 UTC
Interesting, I just built this a few days ago using trunk and ran make testing without any errors, but I had no optimization turned on. When I get back to my machine at home I will redo this and grep for Fails.
Comment 8 Steve Kargl 2011-02-11 20:23:14 UTC
On Fri, Feb 11, 2011 at 07:56:05PM +0000, jrt at worldlinc dot net wrote:
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47692
> 
> --- Comment #6 from John T <jrt at worldlinc dot net> 2011-02-11 19:56:02 UTC ---
> I built the reference BLAS included with LAPACK from source. I just got the
> results from blas_testing using gcc-4.4.5 and the results are good again. I
> don't know where to find the raw results from the LAPACK and BLAS testing.
> Should there be an IEEE flag in the compiler settings? Are there any flags
> that control rounding?
> 
> My flags were:
> #
> FORTRAN  = gfortran -fimplicit-none -g
> OPTS     = -O3
> DRVOPTS  = $(OPTS)
> NOOPT    = -g -O0
> LOADER   = gfortran -g
> LOADOPTS =
> 

I just built the BLAS included with lapack-3.3.0 with
-O3 on x86_64-*-freebsd with 4.5.3 and 4.6.0 (a somewhat
old version).  There were no errors.  Can you rebuild
with -O and see if you have problems?  If you have
problems with -O, can you then use -O0 -ffloat-store?
Comment 9 Harald Anlauf 2011-02-11 20:24:27 UTC
(In reply to comment #7)
> Interesting, I just built this a few days ago using trunk and ran make testing
> without any errors, but I had no optimization turned on. When I get back to my
> machine at home I will redo this and grep for Fails.

I just found that some testcases still have the old problem
of using the wrong threshold.  cblat3.f tries to compute EPS
and finds (see cblat3.out)

RELATIVE MACHINE PRECISION IS TAKEN TO BE  1.1E-19

when using the 387 fpu, while with -mfpmath=sse I get:

RELATIVE MACHINE PRECISION IS TAKEN TO BE  1.2E-07

So my suggestion is to add -march=native -mfpmath=sse
to the compiler flags.

This is not a gfortran problem, but a BLAS testsuite bug.
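
A minimal sketch of the failure mode, in C, not the cblat3.f source: it assumes the test driver estimates EPS with a halving loop of roughly this shape. If the sum 1 + EPS/2 stays in an 80-bit x87 register, the comparison sees extra bits and the loop runs too long. Forcing the sum through a float-sized memory slot (here with volatile, which is roughly what -ffloat-store or SSE arithmetic achieves) gives the expected single-precision value. The function name is hypothetical.

```c
#include <float.h>

/* Estimate single-precision epsilon by repeated halving.
   The volatile store rounds the sum to float before the comparison,
   so excess register precision cannot leak into the test. */
float loop_epsilon(void)
{
    float eps = 1.0f;
    for (;;) {
        volatile float sum = 1.0f + eps * 0.5f;
        if (sum == 1.0f)
            break;          /* halving again would be invisible in float */
        eps *= 0.5f;
    }
    return eps;
}
```

With the store in place the loop stops at FLT_EPSILON (about 1.2E-07) on IEEE 754 hardware; without it, the result depends on how the compiler keeps the intermediate value.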
Comment 10 Harald Anlauf 2011-02-11 20:55:32 UTC
(In reply to comment #9)
> I just found that some testcases still have the old problem
> of using the wrong threshold.  cblat3.f tries to compute EPS
> and finds (see cblat3.out)
> 
> RELATIVE MACHINE PRECISION IS TAKEN TO BE  1.1E-19

Replying to myself: I just tried -Ofast and found in cblat3.out:

 RELATIVE MACHINE PRECISION IS TAKEN TO BE  1.4E-45

Somebody please report this to the LAPACK team so that the
tests will be fixed.
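
For orientation, the three values reported in comments 9 and 10 appear to correspond to well-known constants: 2^-63 (the spacing of an x87 extended-precision significand), 2^-23 (single-precision epsilon) and 2^-149 (the smallest positive single-precision subnormal). That reading is an interpretation, not something stated in the test output. The C sketch below, with hypothetical names, formats the three candidates to two significant digits for comparison.

```c
#include <float.h>
#include <stdio.h>

/* Candidate explanations for the reported EPS values (an assumption). */
static const double X87_EXTENDED_EPS     = 0x1p-63;     /* 64-bit significand */
static const double SINGLE_EPS           = FLT_EPSILON; /* 2^-23 */
static const double SINGLE_MIN_SUBNORMAL = 0x1p-149;    /* smallest float > 0 */

/* Write value into buf with two significant digits, e.g. "1.2E-07". */
void format_eps(double value, char *buf, size_t len)
{
    snprintf(buf, len, "%.1E", value);
}
```

Passing each constant to format_eps with a 16-byte buffer is enough to compare against the lines in cblat3.out.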
Comment 11 John T 2011-02-16 01:19:36 UTC
(In reply to comment #8)
> I just built the BLAS included with lapack-3.3.0 with
> -O3 on x86_64-*-freebsd with 4.5.3 and 4.6.0 (a somewhat
> old version).  There were no errors.  Can you rebuild
> with -O and see if you have problems?  If you have
> problems with -O, can you then use -O0 -ffloat-store?

I haven't been able to try these suggestions because I'm finding a different problem, linking. The GCC programs didn't respond well to an attempt to reconfigure the existing build so I rebuilt for /usr/local and used a colorgcc trick to switch between 4.4.5 and the test version. But the build for /usr/local tried to link with /usr/lib/libgfortran.so.3 and the first set of test programs (in lapack-3.3.0/INSTALL) wouldn't run. I don't see anything in the makefiles that would confuse a linker.
Comment 12 Harald Anlauf 2011-02-16 18:01:16 UTC
(In reply to comment #11)
> I haven't been able to try these suggestions because I'm finding a different
> problem, linking. The GCC programs didn't respond well to an attempt to
> reconfigure the existing build so I rebuilt for /usr/local and used a colorgcc
> trick to switch between 4.4.5 and the test version. But the build for
> /usr/local tried to link with /usr/lib/libgfortran.so.3 and the first set of
> test programs (in lapack-3.3.0/INSTALL) wouldn't run. I don't see anything in
> the makefiles that would confuse a linker.

Can you be a little bit more specific?  What are the precise
error messages?

In case the gfortran you build for /usr/local is incompatible with
the system version in /usr, you might try adding the following flags
for linking:

LOADOPTS = -static-libgfortran -static-libgcc

or you need to set LD_LIBRARY_PATH appropriately.
Comment 13 Jerry DeLisle 2011-02-17 01:12:44 UTC
Always set LD_LIBRARY_PATH. Alternatively, link with -static to make sure the correct runtime functions get invoked.
Comment 14 John T 2011-02-18 16:58:45 UTC
Sorry for not responding sooner; I had a health issue.

Here's the error message with the -static flag:
gfortran -g  -o testieee tstiee.o
gfortran -fimplicit-none -g -static -O -c ilaver.f -o ilaver.o
gfortran -fimplicit-none -g -static -O -c LAPACK_version.f -o LAPACK_version.o
gfortran -g  -o testversion ilaver.o LAPACK_version.o
make[1]: Leaving directory `/home/dilbert/Download/linear/lapack-3.3.0/INSTALL'
./testlsame: /usr/lib/libgfortran.so.3: version `GFORTRAN_1.4' not found (required by ./testlsame)
./testslamch: /usr/lib/libgfortran.so.3: version `GFORTRAN_1.4' not found (required by ./testslamch)
./testdlamch: /usr/lib/libgfortran.so.3: version `GFORTRAN_1.4' not found (required by ./testdlamch)
./testsecond: /usr/lib/libgfortran.so.3: version `GFORTRAN_1.4' not found (required by ./testsecond)
./testdsecnd: /usr/lib/libgfortran.so.3: version `GFORTRAN_1.4' not found (required by ./testdsecnd)
./testversion: /usr/lib/libgfortran.so.3: version `GFORTRAN_1.4' not found (required by ./testversion)
make: *** [lapack_install] Error 1

The LD_LIBRARY_PATH specification worked.
Comment 15 Cezary Sliwa 2012-03-19 14:28:43 UTC
I just received a message saying:

"The problem with the BLAS testing programs sometimes not computing the machine epsilon correctly has been fixed in revision r1224, available in the LAPACK SVN repository (https://icl.cs.utk.edu/svn/lapack-dev/lapack/trunk)."
Comment 16 Harald Anlauf 2012-06-25 20:26:27 UTC
(In reply to comment #15)
> I just received a message saying:
> 
> "The problem with the BLAS testing programs sometimes not computing the machine
> epsilon correctly has been fixed in revision r1224, available in the LAPACK SVN
> repository (https://icl.cs.utk.edu/svn/lapack-dev/lapack/trunk)."

It appears that these fixes are in LAPACK-3.4.1, released in April 2012.
The failures I've seen in the BLAS testing have disappeared.

John, can you test that the new LAPACK version works for you?
Comment 17 John T 2012-06-26 15:07:48 UTC
Thank you for reminding me to submit a follow-up. Yes, blas and lapack test cleanly with gcc and gfortran version 4.6.3.

I have since encountered a difficulty with Octave involving BLAS. A section of code in Octave that I think compiles the documentation fails to recognize the values returned by calls to dlamch (?) as valid IEEE 754 values. I've tried a couple of optimization settings without success. If I can't set flags for dlamch and slamch to produce standard IEEE 754 values, this too might be worth a bug report. Any suggested flags?
Comment 18 kargls 2012-06-26 16:16:54 UTC
(In reply to comment #17)
> Thank you for reminding me to submit a follow-up. Yes, blas and lapack test
> cleanly with gcc and gfortran version 4.6.3.
> 
> I have since encountered a difficulty with Octave involving BLAS. A section
> of code in Octave that I think compiles the documentation fails to recognize
> the values returned by calls to dlamch (?) as valid IEEE 754 values. I've
> tried a couple of optimization settings without success. If I can't set
> flags for dlamch and slamch to produce standard IEEE 754 values, this too
> might be worth a bug report. Any suggested flags?

Try -O -ffloat-store.