14384 – Invalid use of extra precision floating-point with -O0 optimization

Bug 14384 - Invalid use of extra precision floating-point with -O0 optimization

Summary: Invalid use of extra precision floating-point with -O0 optimization

Status:	RESOLVED DUPLICATE of bug 323

Alias:	None

Product:	gcc
Classification:	Unclassified
Component:	target (show other bugs)
Version:	3.3.2

Importance:	P2 normal
Target Milestone:	---
Assignee:	Not yet assigned to anyone

URL:
Keywords:

Depends on:
Blocks:

Reported:	2004-03-02 15:00 UTC by Alain NOULLEZ
Modified:	2005-07-23 22:49 UTC (History)
CC List:	1 user (show)

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed:

Attachments
Source code showing the bug (159 bytes, text/plain) 2004-03-02 15:21 UTC, Alain NOULLEZ	Details
Plain text description of the bug (781 bytes, text/plain) 2004-03-02 15:22 UTC, Alain NOULLEZ	Details
Source code showing that float arithmetic is done in double (341 bytes, text/plain) 2004-03-03 15:40 UTC, Alain NOULLEZ	Details
Source code showing that extra precision is kept also from functions (379 bytes, text/plain) 2004-03-03 15:45 UTC, Alain NOULLEZ	Details
ANSWERS: this is not a duplicate of bug 323 and it is not fixed (1.26 KB, text/plain) 2004-03-05 10:01 UTC, Alain NOULLEZ	Details
An even simpler source file demonstrating the problem (408 bytes, text/plain) 2004-03-10 18:23 UTC, Alain NOULLEZ	Details
View All Add an attachment (proposed patch, testcase, etc.)

Note You need to log in before you can comment on or make changes to this bug.

Description Alain NOULLEZ 2004-03-02 15:00:03 UTC

Arithmetic involving a float and a "lesser" type should be done in float in C89,
at least when optimization is turned off.
In some cases (shown in the attached program), the arithmetic is done in double,
carrying extra precision, which is *not* wanted, and makes the program results
different from those on other machines.

Comment 1 Alain NOULLEZ 2004-03-02 15:21:05 UTC

Created attachment 5842 [details]
Source code showing the bug

Compile with -O0 with or without -DVF to show the bug,
producing the (incorrect) result 12345.123456
The correct result seems to be obtained when compiling
with -O3 and without -DVF, producing 12345.123047

Comment 2 Alain NOULLEZ 2004-03-02 15:22:07 UTC

Created attachment 5843 [details]
Plain text description of the bug

Full description of the bug and the possible test cases

Comment 3 Alain NOULLEZ 2004-03-03 15:40:24 UTC

Created attachment 5853 [details]
Source code showing that float arithmetic is done in double

The result of this code shows that gcc does float arithmetic in double, which
it is not allowed to do when compiling with -O0 optimization.
Both SGI and DEC C compilers produce the correct result, i.e. that addition of
two floats and of two doubles should produce different results ... (whatever
the optimization levels for them, by the way)

Comment 4 Alain NOULLEZ 2004-03-03 15:45:59 UTC

Created attachment 5854 [details]
Source code showing that extra precision is kept also from functions

This source code shows that gcc also keeps extra double precision out of
functions declared as float, not even honoring the explicit cast (which is
not needed here by the way).
The code prints twice 12345.123456 whereas 12345.123047 should be printed for
a machine running with 32 bits IEEE floats.
Once again, even if it can be discussed whether this is a bug or not when
optimizing the code, it is *not* allowed at -O0 or the compiler doesn't conform

(and SGI and DEC compilers produce the right result whatever the optimization
level)

Comment 5 Andrew Pinski 2004-03-03 17:03:54 UTC

I do not think this is a dup though.  Where in the C standard says that this must work this way?  Also 
what target are you are testing on?

Comment 6 Jim Wilson 2004-03-03 18:52:39 UTC

Subject: Re:  Invalid use of extra precision floating-point with
 -O0 optimization

anz at obs-nice dot fr wrote:
> This source code shows that gcc also keeps extra double precision out of
> functions declared as float, not even honoring the explicit cast (which is
> not needed here by the way).

The gccbug.txt attachment states that this is an x86 target.  So this is 
the classic x86 excess precision problem which has been known for over a 
decade.

This is partly the fault of the x86 FPU, and partly the fault of the gcc 
x86 backend.

The classic x86 FPU has no float or double arithmetic.  It has only long 
double arithmetic.  This means the only way to get the result you want 
is to emit extra instructions to convert results to float or double, but 
this makes the code run slower, and introduces double-rounding errors. 
So there is no way to win here.  Gcc by design emits fast 
excess-precision results.

This is compounded by problems in the gcc backend where it sometimes 
accidently loses the excess precision, resulting in unpredictable 
rounding errors.

There are some things that can be done about this.
1) Use -ffloat-store.  This forces user variables to be rounded but not 
intermediates, so it solves some but not all problems.  This is the 
simplest solution.
2) Set the processor rounding more to float or double instead of the 
default long double.  However, once you set the processor rounding mode, 
you can no longer use larger precisions, so this is useful only if your 
program uses only one precision.  Changing this may break glibc math 
routines.
3) Do arithmetic in the SSE registers instead of the classic FP register 
stack.  The SSE registers have float and double arithmetic instructions, 
and hence do not have the excess precision problem.  This requires 
recent gcc versions and recent x86 processors.  This is probably the 
best solution.
4) Don't use x86 processors for FP work.  This is probably impractical 
for most people though.
5) Fix the gcc backend so that slow double-rounded code is an option. 
This is hard.

Comment 7 Jim Wilson 2004-03-03 18:57:02 UTC

Subject: Re:  New: Invalid use of extra precision floating-point
 with -O0 optimization

anz at obs-nice dot fr wrote:
> Arithmetic involving a float and a "lesser" type should be done in float in C89,
> at least when optimization is turned off.

This is false.  K&R C required float operations to be performed in 
double.  C89 allows float operations to be performed as float, but does 
not require it.

In my ANSI C 89 manual, this is section 3.2.1.5 Usual Arithmetic 
Conversions, last sentence
"The values of floating operands and of the results of floating 
expressions may be represented in greater precision and range than that 
required by the type; the types are not changed thereby."

This gives the compiler the option of using float/double/long double for 
operations on float types.  This is unfortunately necessary, because 
some hardware, the classic x86 FPU in particular, does not have a full 
set of operations.  It only has one, which for linux, is long double by 
default.

Comment 8 Andrew Pinski 2004-03-03 19:18:34 UTC

This is a dup of bug 323 then, thanks Jim for the explanation.

*** This bug has been marked as a duplicate of 323 ***

Comment 9 Alain NOULLEZ 2004-03-05 10:01:04 UTC

Created attachment 5867 [details]
ANSWERS: this is not a duplicate of bug 323 and it is not fixed

Answer to the answers I got:
Even if arithmetic involving floats *can* be done with higher accuracy,
casts and assignments have to be obeyed and cannot be optimized away.

Comment 10 Alain NOULLEZ 2004-03-10 09:36:29 UTC

(In reply to comment #8)
> This is a dup of bug 323 then, thanks Jim for the explanation.
> 
> *** This bug has been marked as a duplicate of 323 ***

Comment 11 Andrew Pinski 2004-03-10 14:54:22 UTC

I do not see any problems with this:
tin:~/src/gnu/gcctest>gcc pr14384.c
tin:~/src/gnu/gcctest>./a.out 
12345.123456
tin:~/src/gnu/gcctest>gcc pr14384.c -O
tin:~/src/gnu/gcctest>./a.out 
12345.123047
tin:~/src/gnu/gcctest>gcc pr14384.c -O3
tin:~/src/gnu/gcctest>./a.out 
12345.123047
tin:~/src/gnu/gcctest>gcc pr14384.c -mfpmath=sse
pr14384.c:1: warning: SSE instruction set disabled, using 387 arithmetics
^[[Atin:~/src/gnu/gcctest>gcc pr14384.c -mfpmath=sse -msse
tin:~/src/gnu/gcctest>./a.out 
12345.123047

Comment 12 Alain NOULLEZ 2004-03-10 15:36:27 UTC

(In reply to comment #11)
The problem doesn't lie with the source gccbug.c, which I suppose is the one you
renamed to pr14384.c
It lies with the source code gccbug1c.c, in the fourth attachment.
This program, when run on my machine (see the 2nd attachment for the version of
gcc and system info), prints
12345.123456
12345.123456
showing that the result of the function has *not* been narrowed to float, even
though this *is* mandated by the C standard.  If the narrowing casts are obeyed,
as they should, at least the second result must be
12345.123047
on a machine running with 32-bit IEEE float,
and this is what is obtained if compiled with
-mfpmath=sse -msse
As Jim Wilson said, the C standard *allows* for float expressions to be computed
with higher accuracy, but if a (float) cast (or an assignment) is present, it
*must* be obeyed (see my 5th attachment), and this is not the case with gcc
(whatever the optimization level in that case).
*IF* you don't obey this rule, then programs that determine themselve the
machine accuracy in floating-point (like the famous paranoia program, see
  http://gcc.gnu.org/ml/gcc/2002-10/msg00950.html
  http://gcc.gnu.org/ml/gcc/2001-05/msg01583.html
could fail in obscure ways.

To summarize once more, even if
  f1+f2
*may* be computed with higher accuracy than float if f1 and f2 are float
  (float)(f1+f2)
cannot, and the extra precision of the x87 processor *must* be discarded.

Comment 13 Andrew Pinski 2004-03-10 15:44:57 UTC

no you are missing the point that f1+f2 is not done in double precission but only done in single.

   d1 = ((double)f1)+offset;

Will give you the addition done in double precission.

Comment 14 Alain NOULLEZ 2004-03-10 16:30:13 UTC

(In reply to comment #13)
> no you are missing the point that f1+f2 is not done in double precission but
only done in single.
> 
>    d1 = ((double)f1)+offset;
> 
> Will give you the addition done in double precission.

NO, if the addition was done in single, 12345.123047 would be printed (twice)
(check with -mfpmath=sse -msse).
The fact that 12345.123456 is printed twice proves that BOTH additions are done
in double precision, because otherwise 12345.123456 would have been rounded to
12345.123047 because float has not enough accuracy to hold 12345.123456 exactly.
And if you use the line you wrote
>    d1 = ((double)f1)+offset;
you will see that it changes nothing because gcc computes with (double) extra
precision anyway.
And if it is OK to compute with extra precision in the first expression
   d1 = f1+offset;
it is not in the second
   d2 = addf(f1);
because there addf() has the return type float (and I even added an explicit
cast to to float), so the result *HAS* to be rounded to single precision and
produce 12345.123047 as result.

Comment 15 Andrew Pinski 2004-03-10 16:49:25 UTC

Well ICC 6.0 gives exactly the same result (even at -O0).

Comment 16 Alain NOULLEZ 2004-03-10 18:23:39 UTC

Created attachment 5893 [details]
An even simpler source file demonstrating the problem

This even simpler code shows that gcc doesn't round an expression to (float)
and
keeps the extra precision even when an explicit cast is present, where it
shouldn't be allowed to do so.
The same expression is computed twice, in (double) and in forced (float) with
an
explicit cast.	They must differ, unless float and double are the same.

Comment 17 Jim Wilson 2004-03-11 02:34:01 UTC

Subject: Re:  Invalid use of extra precision floating-point
 with -O0 optimization

anz at obs-nice dot fr wrote:
> This even simpler code shows that gcc doesn't round an expression to (float)
> and
> keeps the extra precision even when an explicit cast is present, where it
> shouldn't be allowed to do so.

Yes.  As I explained a week ago, this is the excess-precision problem 
that has been known for over a decade.  There is no easy solution in the 
compiler because the design of the x86 FP register stack is flawed.  You 
either have to use a compiler workaround, or else you have to complain 
to Intel, or else you have to stop doing FP work on the x86 FP register 
stack.

Or you could try writing a fix for gcc yourself, but this will be hard, 
and will hurt performance, and will introduce double-rounding errors. 
So you may still have problems even with a fix.  That is why we haven't 
already done it.

Continuing to complain to us will only annoy us.  We can't fix it.

Comment 18 Andrew Pinski 2004-03-11 05:48:33 UTC

Read up on the mailing lists for more information on why this is considered a non-bug and a dup of 
bug 323.  The main problem is x87 (the FP part of the x86) is not suited for IEEE fp work unless you 
change some stuff, that is one reason why sse improved things in this area and one reason why in 
x86_64 x87 is not recomened any more.

*** This bug has been marked as a duplicate of 323 ***