See <http://gcc.gnu.org/ml/gcc-help/2006-09/msg00232.html>. Disassembling the code suggests that, with gcc-4.1.2, both calls to fetestexcept(3) mysteriously happen before the division when optimization is turned on. This was not the case with earlier versions of gcc, where the calls to fetestexcept(3) bracket the fdivl instruction.
This is not really a bug in C99 unless you use:
#pragma STDC FENV_ACCESS ON

But then again we don't implement that pragma yet ....
PR 20785 has a patch but it has not been applied.
So this is not a bug except for the fact that GCC does not implement "#pragma STDC FENV_ACCESS".

*** This bug has been marked as a duplicate of 20785 ***
(In reply to comment #1)
> This is not really a bug in C99 unless you use:
> #pragma STDC FENV_ACCESS ON
>
> But then again we don't implement that pragma yet ....

Okay, I was not aware of that pragma. Thank you for pointing it out. But what I find hard to grasp is why it works with previous releases. We have this library call fetestexcept(3) and with gcc 4.1 it basically stopped working. I would say this even qualifies as a regression, right?
(In reply to comment #3)
> So this is not a bug except for the fact GCC does not implement "#pragma STDC
> FENV_ACCESS "

According to C99, 7.6.1, you are technically right. But still: an implementation that does not allow access to floating point flags irritates me. Couldn't that be outright dangerous, in certain circumstances? Consider a hypothetical train control unit:

#define FE_CRITICAL (FE_DIVBYZERO|FE_INVALID|FE_OVERFLOW|FE_UNDERFLOW)

double compute_speed(double measurement)
{
    return -1./(measurement);  // in reality, some rather hairy computation
}

// Adjusts speed towards nominal speed, given measurement of speed sensor.
// May decelerate, in unforeseen cases.
void control(double nominal_v, double measurement)
{
#pragma STDC FENV_ACCESS ON
    feclearexcept(FE_CRITICAL);
    double v = compute_speed(measurement);
    if (fetestexcept(FE_CRITICAL)) {
        // Unexpected error: should not trust the computed speed.
        decelerate();
        return;
    }
    if (v > nominal_v*1.001) {
        printf("v==%f\n",v);
        decelerate();
        return;
    }
    if (v < nominal_v*0.999) {
        accelerate();
        return;
    }
}

Would you board that train if the train control unit were compiled with GCC?

The function decelerates the train if something unforeseen happens inside the speed computation. At least it did that when it was compiled with GCC 3.3.x, 3.4.x, or 4.0.x. Now, with GCC 4.1.x, all bets are off. Also, no compiler version seems to care to print a warning. Having lulled users into a false sense of safety for so long, this changed behavior, justified by an allusion to the standard ("we need not return something meaningful"), strikes me as, excuse me, somewhat careless.

Maybe someone can provide other suggestions on how to program defensively? In principle, the functionality used above (testing floating point flags) has been promised for two decades (it's IEEE 754) and has been implemented in almost every major hardware for as long. Can GNU-C not be used for such simple things?
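On the question of programming defensively: one option that does not depend on the exception flags at all is to inspect the computed value itself. The following is only a sketch under assumptions, not a replacement for working flag access. compute_speed is copied from the example above, and compute_speed_checked is a hypothetical helper introduced here for illustration. It catches results that are infinite or NaN (the visible outcome of division by zero, overflow and invalid operations) but it cannot detect underflow, and it relies on isfinite() not being optimized away, which rules out -ffast-math.

```c
#include <math.h>
#include <stdbool.h>

/* Stand-in for the "rather hairy" computation from the example above. */
static double compute_speed(double measurement)
{
    return -1.0 / measurement;
}

/* Hypothetical helper: stores the speed in *v and returns true only if the
   result is finite. Returns false for +/-infinity and NaN, leaving *v
   untouched. Underflow is NOT detected by this check. */
static bool compute_speed_checked(double measurement, double *v)
{
    double result = compute_speed(measurement);
    if (!isfinite(result))
        return false;
    *v = result;
    return true;
}
```

In control() above, a false return would take the place of the fetestexcept(FE_CRITICAL) branch and lead to decelerate().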
Subject: Re: optimzation breaks floating point exception flag reading

On Sat, 23 Sep 2006, kreckel at ginac dot de wrote:

> According to C99, 7.6.1, you are technically right. But still: an
> implementation that does not allow access to floating point flags irritates me.

Use -frounding-math to enable FENV_ACCESS for the whole translation unit, but note the warnings in the documentation of that option. If that option does not meet all the requirements of FENV_ACCESS, file a new bug report for each specific problem.
(In reply to comment #6)
> Use -frounding-math to enable FENV_ACCESS for the whole translation unit,

Sorry, I fail to see what -frounding-math has to do with this. The example in comment #5 was about overflows and divisions by zero. Anyway, adding -frounding-math does not change anything in the case at hand.
Subject: Re: optimzation breaks floating point exception flag reading

On Sat, 23 Sep 2006, kreckel at ginac dot de wrote:

> ------- Comment #7 from kreckel at ginac dot de 2006-09-23 22:11 -------
> (In reply to comment #6)
> > Use -frounding-math to enable FENV_ACCESS for the whole translation unit,
>
> Sorry, I fail to see what -frounding-math has to do with this. The example in
> comment #5 was about overflows and divisions by zero. Anyway, adding
> -frounding-math does not change anything in the case at hand.

Exceptions are meant to be covered by -ftrapping-math, which is on by default; with -frounding-math the whole of FENV_ACCESS should be enabled.

Although we don't implement the pragmas for control of these features in particular regions of code, we *do* have options that are meant to enable them for whole translation units. So any failure of those options to disable problem optimizations is a bug which is *not* a duplicate of the lack of the pragmas.
(In reply to comment #8)

I am still not entirely sure whether we are really talking about the same problem. The original problem was that the compiler optimized assuming that the floating point division cannot have side effects, such that the offending division happens after the call to fetestexcept(3):

#include <fenv.h>
#include <stdio.h>

int main()
{
    double x = (double)printf("") + 1.0;  // one
    double y = (double)printf("");        // zero
    feclearexcept(FE_ALL_EXCEPT);
    double z = x / y;  // should set FE_DIVBYZERO
    if (fetestexcept(FE_ALL_EXCEPT)) {
        printf("flag set after call.\n");
    }
    printf("%f/%f==%f\n",x,y,z);
}

Neither -ftrapping-math nor -frounding-math changes anything, as long as -O1 is turned on: the printf inside the if statement is *not* executed.
Subject: Re: optimzation breaks floating point exception flag reading

On Sat, 23 Sep 2006, kreckel at ginac dot de wrote:

> I am still not entirely sure whether we are really talking about the same
> problem. The original problem was that the compiler optimized assuming that the
> floating point division cannot have side effects, such that the offending
> division happens after the call to fetestexcept(3):

> Neither -ftrapping-math, nor -frounding-math change anything, as long as -O1 is
> turned on: The printf inside the if statement is *not* executed.

In that case you have a bug that is not a duplicate of the lack of FENV_ACCESS pragma support. The relevant semantics are meant to be supported by these command line options.
Subject: Re: optimzation breaks floating point exception flag reading

On Sat, 2006-09-23 at 23:02 +0000, joseph at codesourcery dot com wrote:
> In that case you have a bug that is not a duplicate of the lack of
> FENV_ACCESS pragma support. The relevant semantics are meant to be
> supported by these command line options.

This is a TER bug then and I really doubt it can be fixed easily.

-- Pinski
(In reply to comment #11)
> This is a TER bug then and I really doubt it can be fixed easily.

It doesn't disappear with -fno-tree-ter, as I would assume if it were a TER bug.
(In reply to comment #12)
> It doesn't disappear with -fno-tree-ter, as I would assume if it were a TER
> bug.

I just discovered that it does disappear with -fno-tree-sink, though.
So what is happening is that there is no explicit barrier for the divide, so we assume we can move it. I don't know what the correct thing is, really: scheduling will have the same issue, and so will being able to delete the divide when its result is not used (and that is not a regression).
(In reply to comment #14) Maybe scheduling would have the same issue. The fact that the result of the division is not used is a red herring, though. Of course, the assumption is that it's actually used.
A quote from <http://www.cs.berkeley.edu/~wkahan/ieee754status/IEEE754.PDF>: "While on the subject of miscreant compilers, we should remark their increasingly common tendency to reorder operations that can be executed concurrently by pipelined computers. C programmers may declare a variable volatile to inhibit certain reorderings. A programmer's intention is thwarted when an alleged 'optimization' moves a floating-point instruction past a procedure-call intended to deal with a flag in the floating-point status word or to write into the control word to alter trapping or rounding. Bad moves like these have been made even by compilers that come supplied with such procedures in their libraries. (See _control87 , _clear87 and _status87 in compilers for Intel processors.) Operations’ movements would be easier to debug if they were highlighted by the compiler in its annotated re-listing of the source-code. Meanwhile, so long as compilers mishandle attempts to cope with floating-point exceptions, flags and modes in the ways intended by IEEE Standard 754, frustrated programmers will abandon such attempts and compiler writers will infer wrongly that unexercised capabilities are unexercised for lack of demand."
(In reply to comment #15)
> Maybe scheduling would have the same issue. The fact that the result of the
> division is not used is a red herring, though. Of course, the assumption is
> that it's actually used.

For the record: Andrew was right and the above statement is wrong. The standard explicitly mandates that unused code must not be removed unless the compiler can determine that it cannot raise an exception flag [F.8.1]:

: Concern about side effects may inhibit code motion and removal of seemingly
: useless code. For example, in
:     #include <fenv.h>
:     #pragma STDC FENV_ACCESS ON
:     void f(double x)
:     {
:         /* ... */
:         for (i = 0; i < n; i++) x + 1;
:         /* ... */
:     }
: x + 1 might raise floating-point exceptions, so cannot be removed. And since
: the loop body might not execute (maybe 0 ≥ n), x + 1 cannot be moved out of
: the loop. (Of course these optimizations are valid if the implementation can
: rule out the nettlesome cases.)
An idea: Would it help if feholdexcept, fetestexcept and all those standard functions accessing the status and control flags were implemented as builtins, not as extern libcalls? This probably wouldn't help against elimination of unused statements that nevertheless trigger side effects. But it would inhibit the worst cases of code reordering, I suppose.
The problem is that the division is in no way special to optimizers. One possibility I see would be to introduce either a builtin function or a new tree-code to access the exception flags. Of course the fact that flags are supposed to accumulate doesn't help to simplify things here...

It would be the frontend's task to emit compound expressions. Like instead of

    D.2529 = x / y;

emit

    {
        D.2529 = x / y;
        __builtin_update_except (D.2529);
    }

(note that __builtin_update_except has to be subject to read/write global memory to support exception flow across the call-graph).

I bet it's a mess to optimize this stuff correctly without some "clever" hacks. Like

    {
        D.2529 = x / y;
        *__builtin_flags = __builtin_update_except (D.2529, *__builtin_flags);
    }

where we can then make __builtin_update_except const [ideally *__builtin_flags would just be a special alias tag used and clobbered by the various exception functions]
So, Joseph explained that the code should execute as expected, at least with -frounding-math as a workaround. However, with GCC 4.4 it is still not possible to write code that takes advantage of those advanced features of IEEE754, even on hardware that supports it directly. Could someone, please, set this bug's status to something less inappropriate than "unconfirmed"?
(In reply to Richard B. Kreckel from comment #20)
> So, Joseph explained that the code should execute as expected, at least with
> -frounding-math as a workaround. However, with GCC 4.4 it is still not
> possible to write code that takes advantage of those advanced features of
> IEEE754, even on hardware that supports it directly. Could someone, please,
> set this bug's status to something less inappropriate than "unconfirmed"?

I would, but I don't know what'd be more appropriate...
I can't reproduce this bug any more, with any of the optimization settings on x86 or x86_64 going back as far as GCC 4.9.2. Delighted to see that this has been addressed in the meantime (even without supporting that pragma.) I suppose this bug can just be closed. I don't know about 30568. (Don't understand why it's related at all).
(In reply to Richard B. Kreckel from comment #22)
> I can't reproduce this bug any more,

I think you are just lucky. I am sure it hasn't been fixed, and gcc will still happily swap FP operations with function calls like fetestexcept. You still need something like volatile to protect your operation, and even then the compiler could theoretically move some unrelated FP op just before fetestexcept, which would set more flags than the operation you wanted to test.
For the division, when GCC doesn't know that the divisor is not zero, I think we actually fixed the bug, but yes, in general FP operation reordering wrt FP env access isn't fixed. But GCC needs to consider that feclearexcept/fetestexcept may exit the program, and since the division may trap, it cannot hoist it before the feclearexcept call. It might move it after the fetestexcept call, though (delaying traps is OK). As soon as we start to put more knowledge about feclearexcept into GCC this will break again. Oh - we actually do know this since GCC 8 ... So you're just lucky indeed ...
(In reply to Richard Biener from comment #24)
> So you're just lucky indeed ...

This makes me wonder if there is still a way to trigger this.

You suggest this has been fixed for the division (is there a PR or reference?) and I am not able to create a similar bug using addition, multiplication, etc. using GCC 10.
(In reply to Richard B. Kreckel from comment #25)
> (In reply to Richard Biener from comment #24)
> > So you're just lucky indeed ...
>
> This makes me wonder if there is still a way to trigger this.
>
> You suggest this has been fixed for the division (is there a PR or
> reference?) and I am not able to create a similar bug using addition,
> multiplication, etc. using GCC 10.

You just need to give GCC the incentive to break it. Like with

#include <fenv.h>
#include <stdio.h>

int main()
{
    double x = (double)printf("") + 1.0;  // one
    double y = (double)printf("");        // zero
    double breakme = x / y;
    feclearexcept(FE_ALL_EXCEPT);
    double z = x / y;  // should set FE_DIVBYZERO
    if (fetestexcept(FE_ALL_EXCEPT)) {
        printf("flag set after call.\n");
    }
    printf("%f/%f==%f\n",x,y,z + breakme);
}
*** Bug 101063 has been marked as a duplicate of this bug. ***