# arithmetic problem (it IS a bug)

Christoph Stoeck d023243@hw1496.wdf.sap-ag.de
Thu Dec 16 03:31:00 GMT 1999

```Hi,

just to add some more facts to the discussion.

The suspicious lines of code are:

***********************************
int i = 40000000;
i = (double)i * (double)0.000001;
***********************************

The results are:

i = 40 ( Linux on Intel in opt mode     )
i = 39 ( Linux on Intel in dbg mode     ) <<<<<
i = 40 ( NT    on Intel in opt/dbg mode )

Some remarks:

On Intel platforms we are dealing with an 80bit floating point arithmetic
processor.
The intermediate result of the operation
" (double)40000000 * (double)0.000001 "
following the IEEE Standard 754 is the following
( decimal representation only ):

80bit floating point   : approx. 39.99999999999999999...
64bit floating point   : exactly 40 !!!

So the reason for the result of 39 in this case is that the 80bit floating
point result (represented by a floating point register) is directly
converted to an integral value ( i of type int). This is done by
truncation as defined.

The direct truncation of the 80bit floating point to an int in this case
is a bug.

Standard C allows an implementation to use different arithmetic
accuracies only if the result is the same
(ANSI C Standard. ISO 9899, Section 5.1.2.3 "Program Execution").
But at least at sequence points, all implementations should deliver a well
defined value with a well defined type.
In this case, we have a double arithmetic with 64bit representation, so
the intermediate result at the sequence point "=" has to be a double with
64bit representation.
But Linux delivers an 80bit floating point value and continues with the
next operation, the truncation. This is wrong.

In other words: Linux in dbg-mode should first round the intermediate
result from 80bit to 64bit accuracy (double) and THEN do the truncation.

By the way, the most surprising thing is that the following code gives
different results with Linux (on Intel in dbg mode) in one and the same
execution unit:

int i = 40000000;
i = (double)i         * (double)0.000001; /* i == 39 */
i = (double)40000000  * (double)0.000001; /* i == 40 */

Even if you don't like standards, how would you explain this?

Christoph Stoeck
Christoph.Stoeck@sap.com

```