In some situations (after a call to an Oracle C library function, or when running the program under valgrind), floats behave strangely: a given hard-coded float value is displayed incorrectly. For example, a float value of 1.2001 would print as "1.20831E+0".

Our analysis:

We reproduced the problem either by calling functions from the Oracle C library or by running the software under valgrind. The tests were done on SUSE Linux 10.0 and CentOS 5.1, using a 4.1.2 compiler and a 4.2.3 compiler, on Intel-based machines (Core 2 Duo and Pentium D). We COULD NOT reproduce the problem on a Solaris 9 UltraSPARC III based machine.

Using a customized img_real package, we found that the misbehaving code is the Long_Long_Float'Truncation attribute, called in the Convert_Integer procedure of the Set_Image_Real procedure, in package img_real. This attribute seems to be platform-dependent, which is consistent with the difference in behaviour between a SPARC-based machine and a Pentium-based one.

The problem can be reproduced easily by compiling this simple code and running the executable under valgrind:

with ada.text_io, Ada.Strings.Fixed, System.img_real;
use Ada.Strings.Fixed, ada.text_io, System.img_real;

procedure test_fio is
   value            : Long_Long_Float;
   resulting_string : string(1..15);
   nat              : natural;
begin
   value := 1.2001;
   resulting_string := (others => ' ');
   nat := 0;
   set_image_real(value, resulting_string, nat, 2, 4, 2);
   put_line ("resulting_string : '" & resulting_string & "'");
end test_fio;
Created attachment 15149
Simple source code to reproduce the bug

Compile this code and run the resulting binary under valgrind to reproduce the bug.
A clarification about when the bug appears: it first appears when using the Oracle 10 C library (libclntsh), and can also be reproduced using the Oracle 11 library. The bug does not appear when using Oracle 9.
Does the oracle library by any chance mess with the floating point precision registers of the CPU?
Created attachment 15156
floating point tests with different FPU configuration

This archive contains code to test the behavior of floating point values using different configurations of the i387 FPU. It shows that the Float type in Ada only works correctly when using a 64-bit mantissa. When using a 53-bit mantissa, the results are the same as those obtained using test_fio under valgrind or using Oracle's library:

----------------------------------------
CW origine: 1.20001E+00
SW = 895 {Infinity Off; NEAREST_OR_EVEN; BITS_64}
----------------------------------------
Bits_53: 1.20831E+00
SW = 639 {Infinity Off; NEAREST_OR_EVEN; BITS_53}
----------------------------------------
Bits_24: 0.E+00
SW = 127 {Infinity Off; NEAREST_OR_EVEN; BITS_24}
----------------------------------------

The output under valgrind is weird: the Float I/O routines behave as if the FPU were in 53-bit mantissa mode, but the configuration is reported to be 64 bits. I guess this has to do with the way valgrind works.

When calling an Oracle function, we can see that the FPU is in 64-bit mantissa mode before the call and in 53-bit mode after it, hence the problems we have.
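For readers who want to check the numbers in the output above, here is a minimal C sketch that decodes them as x87 control word values (the output labels them "SW", but 895, 639 and 127 are 0x37F, 0x27F and 0x07F, which match the usual control word layout). The function names are hypothetical and are not taken from the attachment; the sketch only does bit arithmetic and does not touch the FPU.

```c
#include <stdint.h>

/* Precision-control (PC) field: bits 8-9 of the x87 control word.
   Returns the mantissa width in bits, or -1 for the reserved encoding. */
int x87_mantissa_bits(uint16_t cw)
{
    switch ((cw >> 8) & 0x3) {
    case 0x0: return 24;   /* single precision */
    case 0x2: return 53;   /* double precision */
    case 0x3: return 64;   /* double extended precision */
    default:  return -1;   /* 0x1 is reserved */
    }
}

/* Rounding-control (RC) field: bits 10-11 of the x87 control word. */
const char *x87_rounding_mode(uint16_t cw)
{
    static const char *const names[] = {
        "nearest-or-even", "down", "up", "toward-zero"
    };
    return names[(cw >> 10) & 0x3];
}
```

With these helpers, 895 decodes to a 64-bit mantissa, 639 to 53 bits and 127 to 24 bits, all with round-to-nearest, which agrees with the BITS_64 / BITS_53 / BITS_24 labels printed by the test program.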
So this is a problem of the oracle library not re-setting FPU state. The general ada problem, if it exists, is probably a dup of PR323 if it expects double to behave consistently as if it had a mantissa of 53 bits. Note that starting with gcc 4.3 there is a global flag to control FPU precision at program start (-mpc{32,64,80}), but that affects all types and so doesn't help here (but with -mpc64 the behavior should be consistent at least wrt the oracle library).
After testing it, the -mpc option does not help.

I agree this seems to be a problem of Oracle not resetting the FPU state. Nevertheless, this makes Ada unusable (or unreliable, which is not really better) with Oracle 10, Oracle 11, and potentially much other software under Linux ... a rather sad situation.

At the beginning of the set_image_real procedure, a call to a reset function is made. Comments indicate that this function is used to ensure the FPU is properly configured. In g-flocon.ads, the comments on this procedure state that:

"For example under Windows NT some system DLL calls change the default FPU arithmetic to 64 bit precision mode. [...] The call to Reset simply has no effect if the target environment does not give rise to such concerns"

Well, maybe it should have an effect under Linux too now :-).
(In reply to comment #6)
> After testing it, the -mpc option does not help.
>
> I agree this seems to be a problem of oracle not re-setting the FPU state.
> Nevertheless, this makes ada unusable (or unreliable, which is not really
> better) with oracle 10, oracle 11, and potentially many other software under
> Linux ... a quite sad situation.

What if you add -msse2 -mfpmath=sse? Then you use SSE for float and double calculations.
Is there a reason why this is a GCC bug? From the comments, I see you agree it is a bug in Oracle and valgrind.
I believe the problem is exposed by Oracle, but needs to be addressed in the compiler:

The adalib requires a specific FPU configuration to behave normally. Therefore, it should ensure this requirement is fulfilled.

Under Windows, Interix, EMX, LynxOS, NetBSD and FreeBSD on an i386 CPU, the Reset procedure (an alias for __gnat_init_float) issues a 'finit' to reset the FPU.

I believe Linux should be added to the list.

This is defined in gcc/ada/init.c, on lines 1910 to 1927 in the sources of gcc 4.2.3. I will try to compile a patched version for testing purposes.

> Uros Bizjak :
> What if you add -msse2 -mfpmath=sse? Then you use SSE for float and double
> calculations

Well ... still no difference :-(.
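Independently of whether the runtime is changed, the FPU state issue discussed here can be illustrated from the application side. The following C sketch saves the x87 control word before a foreign call and restores it afterwards. Note that this is a different technique from the 'finit' issued by __gnat_init_float: it restores the previous control word rather than reinitializing the FPU. All names are hypothetical, the "library call" is a stand-in rather than a real Oracle function, and the inline assembly assumes GCC-style syntax on x86; on other targets the sketch falls back to a plain variable so that it still compiles.

```c
#include <stdint.h>

#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
/* Read the x87 control word. */
uint16_t x87_get_cw(void)
{
    uint16_t cw;
    __asm__ volatile ("fnstcw %0" : "=m" (cw));
    return cw;
}

/* Load a new x87 control word. */
void x87_set_cw(uint16_t cw)
{
    __asm__ volatile ("fldcw %0" : : "m" (cw));
}
#else
/* Assumption: no x87 on this target, so emulate the register. */
static uint16_t emulated_cw = 0x037F;
uint16_t x87_get_cw(void) { return emulated_cw; }
void x87_set_cw(uint16_t cw) { emulated_cw = cw; }
#endif

#define X87_PC_MASK 0x0300u  /* precision-control field, bits 8-9 */
#define X87_PC_53   0x0200u  /* 53-bit mantissa (double precision) */

/* Hypothetical stand-in for a foreign function that switches the FPU
   to 53-bit precision and does not switch it back. */
void library_call_that_clobbers_precision(void)
{
    uint16_t cw = x87_get_cw();
    x87_set_cw((uint16_t)((cw & ~X87_PC_MASK) | X87_PC_53));
}

/* Call fn, then put the control word back to what it was before. */
void call_preserving_x87_cw(void (*fn)(void))
{
    uint16_t saved = x87_get_cw();
    fn();
    x87_set_cw(saved);
}
```

A caller would wrap each suspect call, for example call_preserving_x87_cw(library_call_that_clobbers_precision). This only helps for calls the application makes itself, and it would not explain the valgrind behaviour reported above, where the control word already reads as 64 bits.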
> The adalib requires a specific FPU configuration to behave normally.
> Therefore, it should ensure this requirement is fulfilled.
>
> Under windows, Interix, EMX, lynx, netBSD and freeBSD on an i386 CPU, the
> Reset procedure (alias to __gnat_init_float) issues a 'finit' to reset the FPU.
>
> I believe Linux should be added to the list.

It is fulfilled under Linux, unless someone else screws things up, like Oracle.
I wonder if the bug is caused by errata of the Core 2 Duo processor, because the bug was only reproduced on Core 2 Duo and Pentium D (not on UltraSPARC III).

The Intel Core 2 Duo processor has errata in its FPU: Errata AI20 and AI38 (*1). The specification update lists workarounds for these errata. If these workarounds were not implemented in the GCC version used, the bug might be caused by these errata.

The Intel Celeron M processor also has similar errata: Errata W34 and W56 (*2).

---------------------------------------------------------
(*1) Core 2 Duo processor Specification Update
-> http://download.intel.com/design/processor/specupdt/313279.pdf
-> Errata AI20, AI38
---------------------------------------------------------
(*2) Celeron M processor Specification Update
-> http://download.intel.com/design/mobile/SPECUPDT/300303.pdf
-> Errata W34, W56
---------------------------------------------------------
Not a GCC bug.