Bug 35194 - floating point truncation error on intel platform
Summary: floating point truncation error on intel platform
Status: RESOLVED WONTFIX
Alias: None
Product: gcc
Classification: Unclassified
Component: ada
Version: 4.1.2
Importance: P3 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
Reported: 2008-02-14 13:36 UTC by Jérôme Duquennoy
Modified: 2010-09-18 09:06 UTC

Last reconfirmed: 2008-11-30 09:39:30


Attachments
Simple source code to reproduce the bug (244 bytes, text/plain)
2008-02-14 13:38 UTC, Jérôme Duquennoy
floating point tests with different FPU configuration (1.36 KB, application/x-gzip)
2008-02-15 10:11 UTC, Jérôme Duquennoy

Description Jérôme Duquennoy 2008-02-14 13:36:23 UTC
In some situations (after a call to a function from the Oracle C library, or when running the program under valgrind), floats behave strangely: a given hard-coded float value is displayed incorrectly.
For example, a float value of 1.2001 would print as "1.20831E+0".

Our analysis:
We reproduced the problem either by calling functions from the Oracle C library or by running the software under valgrind.
The tests were done on SUSE Linux 10.0 and CentOS 5.1, using a 4.1.2 compiler and a 4.2.3 compiler, running on Intel-based machines (Core 2 Duo and Pentium D).
We COULD NOT reproduce the problem on a Solaris 9 UltraSPARC III based machine.

Using a customized img_real package, we found that the misbehaving code is the Long_Long_Float'Truncation attribute, called in the Convert_Integer procedure of the Set_Image_Real procedure, in package img_real.

This function seems to be platform-dependent, which is consistent with the difference in behaviour between a SPARC-based machine and a Pentium-based one.

The problem can be reproduced easily by compiling this simple code and running the executable under valgrind:


with
   Ada.Text_IO,
   Ada.Strings.Fixed,
   System.Img_Real;

use
   Ada.Strings.Fixed,
   Ada.Text_IO,
   System.Img_Real;

procedure test_fio is
   value            : Long_Long_Float;
   resulting_string : String (1 .. 15);
   nat              : Natural;
begin
   value := 1.2001;
   resulting_string := (others => ' ');
   nat := 0;

   Set_Image_Real (value, resulting_string, nat, 2, 4, 2);

   Put_Line ("resulting_string : '" & resulting_string & "'");
end test_fio;
Comment 1 Jérôme Duquennoy 2008-02-14 13:38:17 UTC
Created attachment 15149
Simple source code to reproduce the bug

Compile this code and run the resulting binary on top of valgrind to reproduce the bug.
Comment 2 Jérôme Duquennoy 2008-02-14 13:44:14 UTC
A clarification on when the bug appears:
The bug first appears with the Oracle 10 C library (libclntsh), and can also be reproduced with the Oracle 11 library.
The bug does not appear with Oracle 9.
Comment 3 Richard Biener 2008-02-14 21:47:37 UTC
Does the Oracle library by any chance mess with the floating point precision
registers of the CPU?
Comment 4 Jérôme Duquennoy 2008-02-15 10:11:28 UTC
Created attachment 15156
floating point tests with different FPU configuration

This archive contains code to test floating-point behavior under different configurations of the i387 FPU.
It shows that the Float type in Ada only works well when using a 64-bit mantissa.

When using a 53-bit mantissa, the results are the same as those obtained using test_fio under valgrind or using Oracle's library:
 
----------------------------------------
CW origine:  1.20001E+00
SW =  895
{Infinity Off; NEAREST_OR_EVEN; BITS_64}
----------------------------------------
Bits_53:  1.20831E+00
SW =  639
{Infinity Off; NEAREST_OR_EVEN; BITS_53}
----------------------------------------
Bits_24:  0.E+00
SW =  127
{Infinity Off; NEAREST_OR_EVEN; BITS_24}
----------------------------------------

The output under valgrind is weird: the Float I/O routines behave as if the FPU were in 53-bit mantissa mode, but the configuration is reported to be 64 bits.
I guess this has to do with the way valgrind works.

When calling an Oracle function, we can see that the FPU is in 64-bit mantissa mode before the call and in 53-bit mode after it, hence the problems we have.
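
The three numbers printed as "SW" in the output above appear to be x87 control-word values rather than status words, because they decode to exactly the settings printed beside them. A minimal sketch of that decoding follows, written in Python purely for illustration (the attached test itself is Ada). The name decode_control_word is hypothetical; the bit layout assumed is the documented x87 control word layout (exception masks in bits 0-5, precision control in bits 8-9, rounding control in bits 10-11).

```python
# Illustrative decoder for an x87 control word; not part of the attached test.
# Precision-control encoding 0b01 is reserved, hence None.
PRECISION = {0b00: 24, 0b01: None, 0b10: 53, 0b11: 64}
ROUNDING = {0b00: "nearest-or-even", 0b01: "down", 0b10: "up", 0b11: "toward-zero"}


def decode_control_word(cw):
    """Split a control word into the fields discussed in this report."""
    return {
        "mantissa_bits": PRECISION[(cw >> 8) & 0b11],
        "rounding": ROUNDING[(cw >> 10) & 0b11],
        "exceptions_masked": (cw & 0x3F) == 0x3F,
    }


# The three values from the test output above.
for cw in (895, 639, 127):
    print(cw, hex(cw), decode_control_word(cw))
```

Decoded this way, 895 (0x37f) is the 64-bit mode, 639 (0x27f) the 53-bit mode, and 127 (0x7f) the 24-bit mode, all with round-to-nearest and every exception masked.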
Comment 5 Richard Biener 2008-02-15 10:25:22 UTC
So this is a problem of the Oracle library not resetting FPU state.  The
general Ada problem, if it exists, is probably a dup of PR323 if it expects
double to behave consistently as if it had a mantissa of 53 bits.  Note
that starting with GCC 4.3 there is a global flag to control FPU precision
at program start (-mpc{32,64,80}), but that affects all types and so doesn't
help here (but with -mpc64 the behavior should be consistent at least wrt
the Oracle library).
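
To make the three precision settings concrete: -mpc32, -mpc64 and -mpc80 correspond to 24-, 53- and 64-bit significands. The sketch below is not GCC or GNAT code. It uses exact rational arithmetic in Python to round the literal 1.2001 from the report to each width (round_to_bits is a hypothetical helper), which shows that the same literal has a different nearest representable value in each mode. It does not attempt to reproduce the 1.20831E+00 output.

```python
from fractions import Fraction


def round_to_bits(x: Fraction, bits: int) -> Fraction:
    """Round a positive rational to `bits` significant binary digits (ties to even)."""
    if x <= 0:
        raise ValueError("this sketch handles positive values only")
    exp = 0
    # Pick exp so that 2**(bits-1) <= x / 2**exp < 2**bits.
    while x / Fraction(2) ** exp >= 2 ** bits:
        exp += 1
    while x / Fraction(2) ** exp < 2 ** (bits - 1):
        exp -= 1
    significand = round(x / Fraction(2) ** exp)  # Fraction rounds half to even
    return significand * Fraction(2) ** exp


literal = Fraction(12001, 10000)  # the 1.2001 from the report, held exactly
for flag, bits in (("-mpc32", 24), ("-mpc64", 53), ("-mpc80", 64)):
    rounded = round_to_bits(literal, bits)
    print(f"{flag}: {bits}-bit significand, error = {float(rounded - literal):.3e}")
```

The 53-bit result is the same value as Python's own float 1.2001, since Python floats are IEEE doubles; the 64-bit result is at least as close to the exact literal.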
Comment 6 Jérôme Duquennoy 2008-02-15 11:07:07 UTC
After testing it, the -mpc option does not help.

I agree this seems to be a problem of Oracle not resetting the FPU state. Nevertheless, this makes Ada unusable (or unreliable, which is not really better) with Oracle 10, Oracle 11, and potentially much other software under Linux ... quite a sad situation.

At the beginning of the Set_Image_Real procedure, a Reset procedure is called.
Comments indicate that it is used to ensure the FPU is properly configured.

In g-flocon.ads, the comments on this procedure state that:
"For example under Windows NT some system DLL calls change the default FPU arithmetic to 64 bit precision mode. [...] The call to Reset simply has no effect if the target environment does not give rise to such concerns"

Well, maybe it should have an effect under Linux too now :-).
Comment 7 Uroš Bizjak 2008-02-15 11:12:49 UTC
(In reply to comment #6)
> After testing it, the -mpc option does not help.
> 
> I agree this seems to be a problem of Oracle not resetting the FPU state.
> Nevertheless, this makes Ada unusable (or unreliable, which is not really
> better) with Oracle 10, Oracle 11, and potentially much other software under
> Linux ... quite a sad situation.

What if you add -msse2 -mfpmath=sse? Then you use SSE for float and double calculations.
Comment 8 Andrew Pinski 2008-02-15 11:13:07 UTC
Is there a reason why this is a GCC bug? From the comments, I see you agree it is a bug in Oracle and valgrind.
Comment 9 Jérôme Duquennoy 2008-02-15 11:55:01 UTC
I believe the problem is exposed by Oracle, but needs to be addressed in the compiler:

The Ada runtime library (adalib) requires a specific FPU configuration to behave normally.
Therefore, it should ensure this requirement is fulfilled.

Under Windows, Interix, EMX, LynxOS, NetBSD and FreeBSD on an i386 CPU, the Reset procedure (an alias for __gnat_init_float) issues a 'finit' to reset the FPU.

I believe Linux should be added to the list.

This is defined in gcc/ada/init.c, on lines 1910 to 1927 in the sources of GCC 4.2.3.

I will try to compile a patched version for testing purposes.
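
For reference, finit loads the x87 control word with 0x037F, which selects the 64-bit mantissa again. The sketch below only models the bit arithmetic, in Python and for illustration; with_precision, FINIT_DEFAULT and the other names are hypothetical and are not part of the GNAT runtime. It rewrites only the precision-control field of a control word and checks that doing so on the 53-bit control word 639 from the test output in comment 4 gives back the finit default.

```python
FINIT_DEFAULT = 0x037F        # control word loaded by finit/fninit
PC_SHIFT, PC_MASK = 8, 0b11   # precision-control field: bits 8-9
PC_ENCODING = {24: 0b00, 53: 0b10, 64: 0b11}


def with_precision(cw: int, mantissa_bits: int) -> int:
    """Return cw with only its precision-control field replaced."""
    cleared = cw & ~(PC_MASK << PC_SHIFT)
    return cleared | (PC_ENCODING[mantissa_bits] << PC_SHIFT)


bits_53_word = 639  # 0x27F, the Bits_53 value from the test output in comment 4
restored = with_precision(bits_53_word, 64)
print(hex(restored), restored == FINIT_DEFAULT)  # prints: 0x37f True
```

In a real runtime this would have to be done with the fnstcw/fldcw instructions (or with finit, which also resets the rest of the FPU state); the sketch does not touch the hardware.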

> Uros Bizjak:
> What if you add -msse2 -mfpmath=sse? Then you use SSE for float and double
> calculations

Well ... still no difference :-(.
Comment 10 Eric Botcazou 2008-02-15 12:23:58 UTC
> The adalib requires a specific FPU configuration to behave normally.
> Therefore, it should ensure this requirement is fulfilled.
> 
> Under Windows, Interix, EMX, LynxOS, NetBSD and FreeBSD on an i386 CPU, the
> Reset procedure (alias to __gnat_init_float) issues a 'finit' to reset the FPU.
> 
> I believe Linux should be added to the list.

It is fulfilled under Linux, unless someone else screws things up, like Oracle.
Comment 11 Keisuke Tsubota 2009-10-19 06:35:13 UTC
I wonder if the bug is caused by errata in the Core 2 Duo processor,
because the bug was only reproduced on Core 2 Duo and Pentium D
(not on UltraSPARC III).

The Intel Core 2 Duo processor has errata in its FPU:
Errata AI20 and AI38 (*1).
Workarounds are available for these errata.
If these workarounds were not implemented in this GCC version,
the bug might be caused by these errata.

The Intel Celeron M processor also has similar errata:
Errata W34 and W56 (*2).

---------------------------------------------------------
(*1)
Core2duo processor Specification Update
-> http://download.intel.com/design/processor/specupdt/313279.pdf
-> Errata AI20,AI38

---------------------------------------------------------
(*2)
Celeron M processor Specification Update
-> http://download.intel.com/design/mobile/SPECUPDT/300303.pdf
-> Errata W34,W56 

---------------------------------------------------------
Comment 12 Eric Botcazou 2010-09-18 09:06:52 UTC
Not a GCC bug.