Bug 29186 - optimization breaks floating point exception flag reading
Summary: optimization breaks floating point exception flag reading
Status: NEW
Alias: None
Product: gcc
Classification: Unclassified
Component: c
Version: 4.1.2
Importance: P3 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords:
Depends on: 30568
Blocks: 16989
Reported: 2006-09-22 19:13 UTC by Richard B. Kreckel
Modified: 2009-12-29 21:48 UTC (History)
CC List: 5 users

See Also:
Host: i486-linux-gnu
Target: i486-linux-gnu
Build: i486-linux-gnu
Known to work:
Known to fail:
Last reconfirmed: 2009-12-29 21:48:42


Description Richard B. Kreckel 2006-09-22 19:13:20 UTC
See <http://gcc.gnu.org/ml/gcc-help/2006-09/msg00232.html>.

Disassembling the code suggests that, with gcc-4.1.2, both calls to fetestexcept(3) mysteriously happen before the division when optimization is turned on. This was not the case with earlier versions of gcc, where the calls to fetestexcept(3) bracket the fdivl instruction.
Comment 1 Andrew Pinski 2006-09-22 19:24:01 UTC
This is not really a bug in C99 unless you use:
#pragma STDC FENV_ACCESS ON

But then again we don't implement that pragma yet ....
Comment 2 Andrew Pinski 2006-09-22 19:25:20 UTC
PR 20785 has a patch but it has not been applied.
Comment 3 Andrew Pinski 2006-09-22 19:27:59 UTC
So this is not a bug except for the fact that GCC does not implement "#pragma STDC FENV_ACCESS".

*** This bug has been marked as a duplicate of 20785 ***
Comment 4 Richard B. Kreckel 2006-09-22 22:34:04 UTC
(In reply to comment #1)
> This is not really a bug in C99 unless you use:
> #pragma STDC FENV_ACCESS ON
> 
> But then again we don't implement that pragma yet ....

Okay, I was not aware of that pragma. Thank you for pointing it out. But what I find hard to grasp is why it works with previous releases. We have this library call fetestexcept(3) and with gcc 4.1 it basically stopped working. I would say this even qualifies as a regression, right?
Comment 5 Richard B. Kreckel 2006-09-23 21:41:32 UTC
(In reply to comment #3)
> So this is not a bug except for the fact GCC does not implement "#pragma STDC
> FENV_ACCESS "

According to C99, 7.6.1, you are technically right. But still: an implementation that does not allow access to floating point flags irritates me. Couldn't that be outright dangerous, in certain circumstances?

Consider a hypothetical train control unit:

#include <fenv.h>
#include <stdio.h>

#define FE_CRITICAL (FE_DIVBYZERO|FE_INVALID|FE_OVERFLOW|FE_UNDERFLOW)

// Actuator hooks, assumed to be defined elsewhere.
void accelerate(void);
void decelerate(void);

double compute_speed(double measurement)
{
    return -1./(measurement); // in reality, some rather hairy computation
}

// Adjusts speed towards nominal speed, given measurement of speed sensor.
// May decelerate, in unforeseen cases.
void control(double nominal_v, double measurement)
{
#pragma STDC FENV_ACCESS ON
    feclearexcept(FE_CRITICAL);
    double v = compute_speed(measurement);
    if (fetestexcept(FE_CRITICAL)) {
        // Unexpected error: should not trust the computed speed.
        decelerate();
        return;
    }
    if (v > nominal_v*1.001) {
        printf("v==%f\n",v);
        decelerate();
        return;
    }
    if (v < nominal_v*0.999) {
        accelerate();
        return;
    }
}

Would you board that train if the train control unit were compiled with GCC?

The function decelerates the train if something unforeseen happens inside the speed computation. At least it did that when it was compiled with GCC 3.3.x, 3.4.x, or 4.0.x. Now, with GCC 4.1.x, all bets are off. Also, no compiler version seems to care to print a warning.

After users have been lulled into a false sense of safety for so long, changing this behavior with an allusion to the standard ("we need not return something meaningful") strikes me as, excuse me, somewhat careless.

Maybe someone can provide other suggestions on how to program defensively? In principle, the functionality used above (testing floating point flags) has been promised for two decades (it's IEEE 754) and has been implemented in almost every major hardware for as long. Can GNU C not be used for such simple things?
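One defensive option that does not depend on the exception flags at all is to inspect the computed value itself. The following is only a sketch under stated assumptions: the names `action` and `choose_action` are hypothetical and appear nowhere in this report, and the thresholds are simply copied from the example above. It catches results that end up as infinity or NaN, but not underflow, and not an exception whose effect is masked by later arithmetic, so it is a complement to flag testing rather than a replacement.

```c
#include <math.h>

/* Hypothetical names; none of this comes from the report or from a library. */
enum action { DECELERATE = -1, HOLD = 0, ACCELERATE = 1 };

/* Decide what to do from the computed speed v, without reading any
   floating-point exception flags. */
enum action choose_action(double nominal_v, double v)
{
    /* Infinity or NaN in either value: do not trust the computation. */
    if (!isfinite(v) || !isfinite(nominal_v))
        return DECELERATE;
    if (v > nominal_v * 1.001)
        return DECELERATE;
    if (v < nominal_v * 0.999)
        return ACCELERATE;
    return HOLD;
}
```

With this shape, control() would call choose_action(nominal_v, compute_speed(measurement)) and act on the returned value; a measurement of zero yields an infinite speed and therefore a deceleration.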
Comment 6 joseph@codesourcery.com 2006-09-23 21:52:29 UTC
Subject: Re:  optimization breaks floating point exception flag reading

On Sat, 23 Sep 2006, kreckel at ginac dot de wrote:

> According to C99, 7.6.1, you are technically right. But still: an
> implementation that does not allow access to floating point flags irritates me.

Use -frounding-math to enable FENV_ACCESS for the whole translation unit, 
but note the warnings in the documentation of that option.  If that option 
does not meet all the requirements of FENV_ACCESS, file a new bug report 
for each specific problem.

Comment 7 Richard B. Kreckel 2006-09-23 22:11:33 UTC
(In reply to comment #6)
> Use -frounding-math to enable FENV_ACCESS for the whole translation unit, 

Sorry, I fail to see what -frounding-math has to do with this. The example in comment #5 was about overflows and divisions by zero. Anyway, adding -frounding-math does not change anything in the case at hand.
Comment 8 joseph@codesourcery.com 2006-09-23 22:19:11 UTC

On Sat, 23 Sep 2006, kreckel at ginac dot de wrote:

> (In reply to comment #6)
> > Use -frounding-math to enable FENV_ACCESS for the whole translation unit, 
> 
> Sorry, I fail to see what -frounding-math has to do with this. The example in
> comment #5 was about overflows and divisions by zero. Anyway, adding
> -frounding-math does not change anything in the case at hand.

Exceptions are meant to be covered by -ftrapping-math, which is on by 
default; with -frounding-math the whole of FENV_ACCESS should be enabled.

Although we don't implement the pragmas for control of these features in 
particular regions of code, we *do* have options that are meant to enable 
them for whole translation units.  So any failure of those options to 
disable problem optimizations is a bug which is *not* a duplicate of the 
lack of the pragmas.

Comment 9 Richard B. Kreckel 2006-09-23 22:58:26 UTC
(In reply to comment #8)
I am still not entirely sure whether we are really talking about the same problem. The original problem was that the compiler optimized assuming that the floating point division cannot have side effects, such that the offending division happens after the call to fetestexcept(3):

#include <fenv.h>
#include <stdio.h>
int main()
{
   double x = (double)printf("") + 1.0; // one
   double y = (double)printf(""); // zero
   feclearexcept(FE_ALL_EXCEPT);
   double z = x / y;  // should set FE_DIVBYZERO
   if (fetestexcept(FE_ALL_EXCEPT)) {
       printf("flag set after call.\n");
   }
   printf("%f/%f==%f\n",x,y,z);
}

Neither -ftrapping-math nor -frounding-math changes anything as long as -O1 is turned on: the printf inside the if statement is *not* executed.
Comment 10 joseph@codesourcery.com 2006-09-23 23:02:28 UTC

On Sat, 23 Sep 2006, kreckel at ginac dot de wrote:

> I am still not entirely sure whether we are really talking about the same
> problem. The original problem was that the compiler optimized assuming that the
> floating point division cannot have side effects, such that the offending
> division happens after the call to fetestexcept(3):

> Neither -ftrapping-math, nor -frounding-math change anything, as long as -O1 is
> turned on: The printf inside the if statement is *not* executed.

In that case you have a bug that is not a duplicate of the lack of 
FENV_ACCESS pragma support.  The relevant semantics are meant to be 
supported by these command line options.

Comment 11 pinskia@gmail.com 2006-09-24 00:34:58 UTC

On Sat, 2006-09-23 at 23:02 +0000, joseph at codesourcery dot com wrote:
> In that case you have a bug that is not a duplicate of the lack of 
> FENV_ACCESS pragma support.  The relevant semantics are meant to be 
> supported by these command line options.

This is a TER bug then and I really doubt it can be fixed easily.

-- Pinski

Comment 12 Richard B. Kreckel 2006-09-24 16:51:07 UTC
(In reply to comment #11)
> This is a TER bug then and I really doubt it can be fixed easily.

It doesn't disappear with -fno-tree-ter, as I would assume if it were a TER bug.
Comment 13 Richard B. Kreckel 2006-10-25 07:54:08 UTC
(In reply to comment #12)
> It doesn't disappear with -fno-tree-ter, as I would assume if it were a TER
> bug.

I just discovered that it does disappear with -fno-tree-sink, though.
Comment 14 Andrew Pinski 2006-10-25 07:57:46 UTC
So what is happening is that there is no explicit barrier for the divide, so we assume we can move it. I don't really know what the correct thing is; scheduling will have the same issue, and so will deleting the divide when its result is not used (and that is not a regression).
Comment 15 Richard B. Kreckel 2006-10-25 13:22:40 UTC
(In reply to comment #14)
Maybe scheduling would have the same issue. The fact that the result of the division is not used is a red herring, though. Of course, the assumption is that it's actually used.
Comment 16 Richard B. Kreckel 2006-10-31 11:48:54 UTC
A quote from <http://www.cs.berkeley.edu/~wkahan/ieee754status/IEEE754.PDF>:

"While on the subject of miscreant compilers, we should remark their increasingly common tendency to reorder operations that can be executed concurrently by pipelined computers. C programmers may declare a variable volatile to inhibit certain reorderings. A programmer's intention is thwarted when an alleged 'optimization' moves a floating-point instruction past a procedure-call intended to deal with a flag in the floating-point status word or to write into the control word to alter trapping or rounding. Bad moves like these have been made even by compilers that come supplied with such procedures in their libraries. (See _control87 , _clear87 and _status87 in compilers for Intel processors.) Operations’ movements would be easier to debug if they were highlighted by the compiler in its annotated re-listing of the source-code. Meanwhile, so long as compilers mishandle attempts to cope with floating-point exceptions, flags and modes in the ways intended by IEEE Standard 754, frustrated programmers will abandon such attempts and compiler writers will infer wrongly that unexercised capabilities are unexercised for lack of demand."
Comment 17 Richard B. Kreckel 2006-11-06 22:23:08 UTC
(In reply to comment #15)
> Maybe scheduling would have the same issue. The fact that the result of the
> division is not used is a red herring, though. Of course, the assumption is
> that it's actually used.

For the record: Andrew was right and the above statement is wrong. The standard explicitly mandates that unused code must not be removed unless the compiler can determine that it cannot raise an exception flag [F.8.1]:

: Concern about side effects may inhibit code motion and removal of seemingly
: useless code. For example, in
: #include <fenv.h>
: #pragma STDC FENV_ACCESS ON
: void f(double x)
: {
:      /* ... */
:      for (i = 0; i < n; i++) x + 1;
:      /* ... */
: }
: x + 1 might raise floating-point exceptions, so cannot be removed. And since
: the loop body might not execute (maybe 0 ≥ n), x + 1 cannot be moved out of
: the loop. (Of course these optimizations are valid if the implementation can
: rule out the nettlesome cases.)
Comment 18 Richard B. Kreckel 2006-11-19 11:22:30 UTC
An idea: Would it help if feholdexcept, fetestexcept and all those standard functions accessing the status and control flags were implemented as builtins, not as extern libcalls?

This probably wouldn't help against elimination of unused statements that nevertheless trigger side effects. But it would inhibit the worst cases of code reordering, I suppose.
Comment 19 Richard Biener 2006-11-19 12:14:38 UTC
The problem is that the division is in no way special to the optimizers.  One possibility I see would be to introduce either a builtin function or a new tree code to access the exception flags.  Of course, the fact that flags are supposed to accumulate doesn't help to simplify things here...

It would be the front end's task to emit compound expressions.  For example, instead of

  D.2529 = x / y;

emit

  { D.2529 = x / y; __builtin_update_except (D.2529); }

(note that __builtin_update_except has to be subject to read/write global
memory to support exception flow across the call-graph).

I bet it's a mess to optimize this stuff correctly without some "clever"
hacks.  Like

  { D.2529 = x / y; *__builtin_flags = __builtin_update_except (D.2529, *__builtin_flags); }

where we can then make __builtin_update_except const [ideally *__builtin_flags
would just be a special alias tag used and clobbered by the various
exception functions]
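The effect of the second form, where the flags are an explicit value that is read and written by every operation, can be imitated in plain C. What follows is a hypothetical software model and not GCC internals: `update_except` and `div_tracked` are made-up names, the model takes the operands as extra arguments because portable C cannot read the hardware flag from the result alone, and it only recognizes division by zero and invalid operations. The point it tries to show is that once the flags flow through ordinary data dependencies, a const helper is enough and reordering past a flag read is no longer possible.

```c
#include <fenv.h>
#include <math.h>

/* Hypothetical model of a const "update" step: it depends only on its
   arguments and returns the accumulated flag set. */
static int update_except(double x, double y, double q, int flags)
{
    if (isnan(q) && !isnan(x) && !isnan(y))
        flags |= FE_INVALID;        /* e.g. 0/0 or inf/inf */
    else if (isinf(q) && isfinite(x) && y == 0.0)
        flags |= FE_DIVBYZERO;      /* finite nonzero value divided by zero */
    return flags;
}

/* Division that threads the flag state explicitly through *flags.
   Flags are sticky: earlier bits are preserved, new ones are OR-ed in. */
double div_tracked(double x, double y, int *flags)
{
    double q = x / y;
    *flags = update_except(x, y, q, *flags);
    return q;
}
```

A caller starts with `int flags = 0;`, performs its divisions through div_tracked(), and tests `flags & FE_DIVBYZERO` afterwards; the test cannot be evaluated before the divisions because it reads the value they wrote.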
Comment 20 Richard B. Kreckel 2009-05-04 06:47:38 UTC
So, Joseph explained that the code should execute as expected, at least with -frounding-math as a workaround. However, with GCC 4.4 it is still not possible to write code that takes advantage of those advanced features of IEEE 754, even on hardware that supports them directly. Could someone, please, set this bug's status to something less inappropriate than "unconfirmed"?