Bug 34678 - Optimization generates incorrect code with -frounding-math option (#pragma STDC FENV_ACCESS not implemented)
Summary: Optimization generates incorrect code with -frounding-math option (#pragma ST...
Status: NEW
Alias: None
Product: gcc
Classification: Unclassified
Component: middle-end
Version: 4.3.0
Importance: P3 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords:
Duplicates: 37838 46180 47617 48295 56020 108318
Depends on:
Blocks: 16989 fortran-ieee 108329
 
Reported: 2008-01-04 17:48 UTC by merkert
Modified: 2023-08-09 09:47 UTC (History)
CC List: 17 users

See Also:
Host:
Target:
Build: 4.3.0 20071123
Known to work:
Known to fail:
Last reconfirmed: 2008-01-05 11:19:52


Attachments
poor mans solution^Whack (771 bytes, patch)
2019-05-22 08:41 UTC, Richard Biener
other hack (2.62 KB, patch)
2019-06-18 23:39 UTC, Marc Glisse

Description merkert 2008-01-04 17:48:20 UTC
The following function produces only a single division operation even when compiling with -frounding-math. I had read in a different PR (29186) that the pragma FENV_ACCESS is not supported, but that -frounding-math should produce the same effect (recognizing that the option is experimental as well).

Here's the function:

cat > div.c <<EOF
#include <fenv.h>

void xdiv (double x, double y, double* lo, double* hi)
{
  #pragma STDC FENV_ACCESS ON 
  
  fesetround(FE_DOWNWARD);
  *lo = x/y;
  fesetround(FE_UPWARD);
  *hi = x/y;
}
EOF
gcc -O -frounding-math div.c -S

I get the following assembly-fragment on 
xdiv:
.LFB2:
        movq    %rbx, -24(%rsp)
.LCFI0:
        movq    %rbp, -16(%rsp)
.LCFI1:
        movq    %r12, -8(%rsp)
.LCFI2:
        subq    $56, %rsp
.LCFI3:
        movsd   %xmm0, 24(%rsp)
        movsd   %xmm1, 16(%rsp)
        movq    %rdi, %rbx
        movq    %rsi, %r12
        movl    $1024, %edi
        call    fesetround
        movsd   24(%rsp), %xmm0
        divsd   16(%rsp), %xmm0
        movsd   %xmm0, 8(%rsp)
        movq    8(%rsp), %rbp
        movq    %rbp, (%rbx)
        movl    $2048, %edi
        call    fesetround
        movq    %rbp, (%r12)
        movq    32(%rsp), %rbx
        movq    40(%rsp), %rbp
        movq    48(%rsp), %r12
        addq    $56, %rsp
        ret
.LFE2:

Here's also a simple driver program:
#include <stdio.h>
#include <assert.h>

extern void xdiv(double x, double y, double* lo, double* hi);

int main(int argc, char** argv)
{
  double z1,z2;
  
  xdiv(1,10,&z1,&z2);
  printf(" rounding down 1/10 is %30.20g \n", z1);
  printf(" rounding up 1/10 is %30.20g \n", z2);
  assert(z1<z2 && "Rounding mode is not working");
  return 0;
}

I'm sure this is supposed to work in std c, but I'm not sure that it is supposed to work in gcc yet (according to http://gcc.gnu.org/wiki/GeertBosch it might). It doesn't work on either x86-64 or i386.
Comment 1 Richard Biener 2008-01-05 11:19:52 UTC
Sorry, -frounding-math does not help for this case - it assumes the _same_
rounding mode is in effect everywhere, but doesn't assume it is round-to-nearest.
Comment 2 merkert 2008-01-05 15:38:56 UTC
Ok, so how then would one accomplish this in std c without resorting to asm? I still assume the original code is correct even though the rounding-math doesn't do what I wanted.

At any rate, I played a little with it and there was a hint in the asm manual how to do it. This seems to work for me, but I'm not sure I'm using the constraints as efficiently as possible:

#include <fenv.h>

inline void reload(double* x)
{
  asm volatile ("" : "=m"(x) );
}

void xdiv (double x, double y, double* lo, double* hi)
{
  #pragma STDC FENV_ACCESS ON
  fesetround(FE_DOWNWARD);
  *lo = x/y;

  reload(&y);

  fesetround(FE_UPWARD);
  *hi = x/y;
}
Comment 3 jsm-csl@polyomino.org.uk 2008-01-05 17:07:54 UTC
Subject: Re:  Optimization generates incorrect code
 with -frounding-math option (#pragma STDC FENV_ACCESS not implemented)

On Sat, 5 Jan 2008, rguenth at gcc dot gnu dot org wrote:

> Sorry, -frounding-math does not help for this case - it assumes the _same_
> rounding mode is in effect everywhere, but doesn't assume it is
> round-to-nearest.

To be clear: this is a bug in the present implementation of 
-frounding-math - it *should* disable all the assumptions of same rounding 
mode (unless it can prove that no function changing the rounding mode is 
called between the two places where it assumes the same mode), but, as 
documented in the manual, it's not yet fully implemented.

#   This option is experimental and does not currently guarantee to
#   disable all GCC optimizations that are affected by rounding mode.

Comment 4 Richard Biener 2008-01-05 18:24:29 UTC
I wouldn't read the language this way.  Because that will forcefully disable
all redundancy removing optimizations (which is what happens in this testcase).
What it currently guards is expression rewriting that changes the outcome if
a rounding mode different than round-to-nearest is used.

The finer-grained control the documentation mentions should not be globbed
to -frounding-math IMHO.
Comment 5 jsm-csl@polyomino.org.uk 2008-01-06 15:12:54 UTC
Subject: Re:  Optimization generates incorrect code
 with -frounding-math option (#pragma STDC FENV_ACCESS not implemented)

On Sat, 5 Jan 2008, rguenth at gcc dot gnu dot org wrote:

> I wouldn't read the language this way.  Because that will forcefully disable
> all redundancy removing optimizations (which is what happens in this testcase).
> What it currently guards is expression rewriting that changes the outcome if
> a rounding mode different than round-to-nearest is used.

My understanding has always been that -frounding-math should be usable for 
the code that calls rounding-mode-changing functions, rather than having 
no way to compile that code safely with GCC, as well as the code that does 
not call those functions but may execute with non-default rounding modes.  
The FENV_ACCESS pragma does not distinguish between the two.

> The finer-grained control the documentation mentions should not be globbed
> to -frounding-math IMHO.

The pragma would in effect set -frounding-math for particular regions of 
code; it isn't more fine-grained regarding whether the code sets the mode 
or merely runs under a different mode.

It is of course possible that -frounding-math should be split into 
multiple options (more fine-grained than the pragma) as the other related 
flags have been split over time.

Comment 6 Richard Biener 2008-01-06 15:57:30 UTC
I see.  So basically we need to split all floating point operators into two variants, one specifying a default rounding mode is used and one specifying the
rounding mode is unknown.  I suppose the frontend parts would be actually quite
simple then?
Comment 7 jsm-csl@polyomino.org.uk 2008-01-06 16:28:46 UTC
Subject: Re:  Optimization generates incorrect code
 with -frounding-math option (#pragma STDC FENV_ACCESS not implemented)

On Sun, 6 Jan 2008, rguenth at gcc dot gnu dot org wrote:

> I see.  So basically we need to split all floating point operators into two
> variants, one specifying a default rounding mode is used and one specifying the
> rounding mode is unknown.  I suppose the frontend parts would be actually quite
> simple then?

More than two variants, in the end, depending on how you handle all the 
other flags - but eventually, everything about GIMPLE semantics controlled 
by the global flags should be directly represented in the GIMPLE and the 
pragmas, together with the command-line options determining global 
defaults, would map to appropriate choices of operations / flags on those 
operations.  This is desirable for LTO optimizing between objects compiled 
with different options as well.

Comment 8 Richard Biener 2008-10-15 15:43:29 UTC
*** Bug 37838 has been marked as a duplicate of this bug. ***
Comment 9 Vincent Lefèvre 2008-10-15 21:29:43 UTC
What was said in bug 37838 but not here is that -frounding-math sometimes fixes the problem. So, I was suggesting that -frounding-math should be enabled by default.
Comment 10 Richard Biener 2008-10-15 22:14:13 UTC
The default of -fno-rounding-math is chosen with the reason that this is what
a compiler can assume if #pragma STDC FENV_ACCESS is not turned on.

What you may request instead is an implementation of FENV_ACCESS up to that
point that we issue a fatal error whenever you try to turn it on.  Or at least
a warning by default.
Comment 11 Vincent Lefèvre 2008-10-15 22:33:52 UTC
(In reply to comment #10)
> The default of -fno-rounding-math is chosen with the reason that this is what
> a compiler can assume if #pragma STDC FENV_ACCESS is not turned on.

The C standard doesn't require a compiler to recognize the FENV_ACCESS pragma, but if the compiler does not recognize it, then it must assume that this pragma is ON (otherwise the generated code can be incorrect). That's why I suggested that -frounding-math should be the default.
Comment 12 Richard Biener 2008-10-16 09:44:38 UTC
Turning -frounding-math on by default would be a disservice to (most of) our users which is why the decision was made (long ago) to not enable this by default.
Comment 13 merkert 2008-10-16 11:56:57 UTC
Would it make sense to have a function attribute to indicate that rounding mode was changed as a side effect? This way, one could keep the default rounding behavior and not incur a performance penalty, but at the same time setroundingmode would work as expected.
Comment 14 Vincent Lefèvre 2008-10-16 14:20:25 UTC
(In reply to comment #12)
> Turning -frounding-math on by default would be a disservice to (most of) our
> users which is why the decision was made (long ago) to not enable this by
> default.

The compiler should generate correct code by default, and options like -funsafe-math-optimizations are there to allow the users to run the compiler in a non-conforming mode. So, it would be wise to have -frounding-math by default and add -fno-rounding-math to the options enabled by -funsafe-math-optimizations.
Comment 15 jsm-csl@polyomino.org.uk 2008-10-16 16:39:32 UTC
Subject: Re:  Optimization generates incorrect code
 with -frounding-math option (#pragma STDC FENV_ACCESS not implemented)

On Thu, 16 Oct 2008, vincent at vinc17 dot org wrote:

> The compiler should generate correct code by default, and options like

Sure, it generates correct code for the supported features, and 
<http://gcc.gnu.org/c99status.html>, which is linked from the manual, 
documents the state of support for the features: standard pragmas are 
documented as Missing and IEC 60559 support as Broken, with the 
explanation:

     * IEC 60559 is IEEE 754 floating point. This works if and only if
       the hardware is perfectly compliant, but GCC does not define
       __STDC_IEC_559__ or implement the associated standard pragmas; nor
       do some options such as -frounding-math to enable the pragmas
       globally work in all cases (for example, required exceptions may
       not be generated) and contracting expressions (e.g., using fused
       multiply-add) is not restricted to source-language expressions as
       required by C99.

So you are using features documented not to be fully implemented and 
whether they will work is unpredictable.

Comment 16 Vincent Lefèvre 2008-10-16 17:39:50 UTC
I was suggesting to improve the behavior by having -frounding-math by default (at least when the user compiles with -std=c99 -- if he does this, then this means that he shows some interest in a conforming implementation). This is not perfect, but would be better than the current behavior.
Comment 17 Paolo Carlini 2010-10-26 10:32:24 UTC
*** Bug 46180 has been marked as a duplicate of this bug. ***
Comment 18 Andrew Pinski 2011-02-08 01:58:40 UTC
*** Bug 47617 has been marked as a duplicate of this bug. ***
Comment 19 Andrew Pinski 2011-04-01 19:22:05 UTC
*** Bug 48295 has been marked as a duplicate of this bug. ***
Comment 20 Nick Maclaren 2013-11-11 09:30:42 UTC
Richard Biener's approach to the default is the one that matches the C
standard and Vincent Lefèvre is mistaken.  C11 7.6.1p2 says:

"... If part of a program tests floating-point status flags, sets
floating-point control modes, or runs under non-default mode settings,
but was translated with the state for the FENV_ACCESS pragma ‘‘off’’,
the behavior is undefined.  The default state (‘‘on’’ or ‘‘off’’) for
the pragma is implementation-defined."  Defining it to be 'off' and
not setting __STDC_IEC_559__ is very reasonable.

Because generated code and the library are potentially dependent on the
rounding mode (including even floating point to integer conversion!),
the default should remain that rounding mode support is off until each
target has been thoroughly checked that it does NOT break.

There are also very strong grounds for not wanting IEEE 754 support by
default, anyway, because of the performance impact and because a lot of
programs won't reset the state before calling external functions (and
hence may well give wrong answers).  That is especially true if the code
is used within a C++ program or uses GPUs or some SIMD units - let alone
OpenMP :-(
Comment 21 Vincent Lefèvre 2013-11-11 13:37:34 UTC
(In reply to Nick Maclaren from comment #20)
> Richard Biener's approach to the default is the one that matches the C
> standard and Vincent Lefèvre is mistaken.

No, I'm correct.

> Defining it to be 'off' and not setting __STDC_IEC_559__ is very reasonable.

No, this is really stupid. If the user *decides* to set the STDC FENV_ACCESS pragma to "on", then the compiler must not assume that it is "off" (this bug is not about the default state). At least it must behave in this way if -std=c99 (or c11) has been used. Otherwise a compilation failure may be better than getting wrong results.
Comment 22 jsm-csl@polyomino.org.uk 2013-11-11 16:38:28 UTC
On Mon, 11 Nov 2013, nmm1 at cam dot ac.uk wrote:

> There are also very strong grounds for not wanting IEEE 754 support by
> default, anyway, because of the performance impact and because a lot of
> programs won't reset the state before calling external functions (and
> hence may well give wrong answers).  That is especially true if the code
> is used within a C++ program or uses GPUs or some SIMD units - let alone
> OpenMP :-(

Note also that the documented default is -ftrapping-math 
-fno-rounding-math.  I suspect that if -ftrapping-math actually 
implemented everything required for the floating-point exceptions aspects 
of FENV_ACCESS, it would be just as bad for optimization as 
-frounding-math - it would disallow constant-folding inexact 
floating-point expressions because that would eliminate the side effect of 
raising the "inexact" exception, for example (just as -frounding-math does 
disable such constant folding, although not in all cases it should, 
because the result depends on the rounding mode), and would mean a value 
computed before a function call can't be reused for the same computation 
after that call because the computation might raise exceptions that the 
function call could have cleared (just as -frounding-math should prevent 
such reuse because the call might change the rounding mode).

So a key part of actually making rounding modes and exceptions work 
reliably would be working out a definition of GCC's default mode that 
allows more or less the same optimizations as at present, while allowing 
users wanting the full support (and consequent optimization cost) to 
specify the appropriate command-line options or FENV_ACCESS pragma to 
enable it.
Comment 23 Nick Maclaren 2013-11-11 17:32:01 UTC
(In reply to Vincent Lefèvre from comment #21)
>
> > Richard Biener's approach to the default is the one that matches the C
> > standard and Vincent Lefèvre is mistaken.
> 
> No, I'm correct.
> 
> > Defining it to be 'off' and not setting __STDC_IEC_559__ is very reasonable.
> 
> No, this is really stupid. If the user *decides* to set the STDC FENV_ACCESS
> pragma to "on", then the compiler must not assume that it is "off" (this bug
> is not about the default state). At least it must behave in this way if
> -std=c99 (or c11) has been used. Otherwise a compilation failure may be
> better than getting wrong results.

If __STDC_IEC_559__ is unset or does not have the value 1, setting
STDC FENV_ACCESS to "on" is undefined behaviour (see 6.10.8.3, 7.6 and
Annex F), unless the implementation explicitly chooses to extend the
language to support it.  So the user would get what he so richly
deserves.


(In reply to joseph@codesourcery.com from comment #22)
> 
> So a key part of actually making rounding modes and exceptions work 
> reliably would be working out a definition of GCC's default mode that 
> allows more or less the same optimizations as at present, while allowing 
> users wanting the full support (and consequent optimization cost) to 
> specify the appropriate command-line options or FENV_ACCESS pragma to 
> enable it.

Yes.  That won't deal with the correctness problems of introducing
IEEE 754 support into code not set up to handle it, especially C++,
of course.  I tried to get WG21 to take a stand on that issue, but
failed :-(

Working out what on earth to do in such a case is likely to be a far
fouler task than merely dealing with the performance problems :-(
Comment 24 Vincent Lefèvre 2014-01-10 11:40:02 UTC
(In reply to Nick Maclaren from comment #23)
> If __STDC_IEC_559__ is unset or does not have the value 1, setting
> STDC FENV_ACCESS to "on" is undefined behaviour (see 6.10.8.3, 7.6 and
> Annex F), unless the implementation explicitly chooses to extend the
> language to support it.

You're wrong. The C standard doesn't say that.

6.10.8.3 says: "__STDC_IEC_559__ The integer constant 1, intended to indicate conformance to the specifications in annex F (IEC 60559 floating-point arithmetic)." and nothing about STDC FENV_ACCESS.

In 7.6, only 7.6.1 is specifically about the FENV_ACCESS pragma, and it specifies under which conditions the behavior is undefined, but nothing related to __STDC_IEC_559__ and Annex F.

Annex F doesn't apply in the case __STDC_IEC_559__ is unset.
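
Whether a given implementation defines the macro under discussion can be checked at translation time. A minimal sketch; the function name claims_annex_f is hypothetical, and the result says only what the implementation claims, not how FENV_ACCESS is handled.

```c
/* Hypothetical helper: returns 1 if the implementation defines
   __STDC_IEC_559__ with the value 1 (its claim of Annex F conformance),
   and 0 otherwise. */
int claims_annex_f(void)
{
#if defined(__STDC_IEC_559__) && __STDC_IEC_559__ == 1
    return 1;
#else
    return 0;
#endif
}
```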
Comment 25 Nick Maclaren 2014-01-14 14:34:24 UTC
On Jan 10 2014, vincent-gcc at vinc17 dot net wrote:
>
>http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34678
>
>--- Comment #24 from Vincent Lefèvre <vincent-gcc at vinc17 dot net> -
>
>(In reply to Nick Maclaren from comment #23)
>
>> If __STDC_IEC_559__ is unset or does not have the value 1, setting
>> STDC FENV_ACCESS to "on" is undefined behaviour (see 6.10.8.3, 7.6 and
>> Annex F), unless the implementation explicitly chooses to extend the
>> language to support it.
>
>You're wrong. The C standard doesn't say that.

I am sorry, but it is you that is wrong.

>6.10.8.3 says: "__STDC_IEC_559__ The integer constant 1, intended to indicate
>conformance to the specifications in annex F (IEC 60559 floating-point
>arithmetic)." and nothing about STDC FENV_ACCESS.

3.4.3 says:
    undefined behavior
    behavior, upon use of a nonportable or erroneous program construct
    or of erroneous data, for which this International Standard imposes
    no requirements

4. Conformance, paragraph 2, says:
    ...  Undefined behavior is otherwise indicated in this International
    Standard by the words "undefined behavior" or by the omission of any
    explicit definition of behavior.  There is no difference in emphasis
    among these three; they all describe "behavior that is undefined".

What "explicit definition of behavior" is there for the case when
STDC FENV_ACCESS is set to "on" but __STDC_IEC_559__ is not set to one?

As there is none, it is undefined behaviour.  gcc can therefore do
whatever it likes.


Regards,
Nick Maclaren.
Comment 26 Vincent Lefèvre 2014-01-14 14:58:02 UTC
(In reply to Nick Maclaren from comment #25)
> 3.4.3 says:
>     undefined behavior
>     behavior, upon use of a nonportable or erroneous program construct
>     or of erroneous data, for which this International Standard imposes
>     no requirements
> 
> 4. Conformance, paragraph 2, says:
>     ...  Undefined behavior is otherwise indicated in this International
>     Standard by the words "undefined behavior" or by the omission of any
>     explicit definition of behavior.  There is no difference in emphasis
>     among these three; they all describe "behavior that is undefined".
> 
> What "explicit definition of behavior" is there for the case when
> STDC FENV_ACCESS is set to "on" but __STDC_IEC_559__ is not set to one?

The behavior is defined. The standard says, e.g. for C99:

----
7.6.1 The FENV_ACCESS pragma

The FENV_ACCESS pragma provides a means to inform the implementation when a program might access the floating-point environment to test floating-point status flags or run under non-default floating-point control modes.184) [...]

184) The purpose of the FENV_ACCESS pragma is to allow certain optimizations that could subvert flag tests and mode changes (e.g., global common subexpression elimination, code motion, and constant folding). In general, if the state of FENV_ACCESS is ``off'', the translator can assume that default modes are in effect and the flags are not tested.
----

And there is here no relation at all with __STDC_IEC_559__.

> As there is none, it is undefined behaviour.  gcc can therefore do
> whatever it likes.

No.
Comment 27 Nick Maclaren 2014-01-14 15:13:43 UTC
On Jan 14 2014, vincent-gcc at vinc17 dot net wrote:
>
>http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34678
>
>> What "explicit definition of behavior" is there for the case when
>> STDC FENV_ACCESS is set to "on" but __STDC_IEC_559__ is not set to one?
>
>The behavior is defined. The standard says, e.g. for C99:
>
>7.6.1 The FENV_ACCESS pragma
>
>The FENV_ACCESS pragma provides a means to inform the implementation when a
>program might access the floating-point environment to test floating-point
>status flags or run under non-default floating-point control modes.

I suggest looking up the word "explicit" in a dictionary.

Unless __STDC_IEC_559__ is set to 1, what modes and flags exist (and,
even more importantly, what their semantics are) is at best implicit and
more realistically unspecified - see footnote 204 for a clear statement of
that.

Have you ever implemented a C system for an architecture with non-IEEE
arithmetic but with modes and flags?  Because I have, and I have used
several others.

>> As there is none, it is undefined behaviour.  gcc can therefore do
>> whatever it likes.
>
>No.

You are quite simply wrong.


Regards,
Nick Maclaren.
Comment 28 Vincent Lefèvre 2014-01-14 15:52:34 UTC
(In reply to Nick Maclaren from comment #27)
> On Jan 14 2014, vincent-gcc at vinc17 dot net wrote:
> >The FENV_ACCESS pragma provides a means to inform the implementation when a
> >program might access the floating-point environment to test floating-point
> >status flags or run under non-default floating-point control modes.
> 
> I suggest looking up the word "explicit" in a dictionary.

The above is an explicit definition. Where do you see an undefined behavior here?

#include <fenv.h>
#pragma STDC FENV_ACCESS ON
int main (void)
{
  return 0;
}

The modes and so on are dealt with in other parts of the standard, e.g.

----
Each of the macros
        FE_DOWNWARD
        FE_TONEAREST
        FE_TOWARDZERO
        FE_UPWARD
is defined if and only if the implementation supports getting and setting the represented rounding direction by means of the fegetround and fesetround functions.
----

This doesn't mean that the rounding direction will necessarily be honored even for the basic operations (just like the C standard doesn't require "1.0+2.0" to evaluate as 3.0, and a poorly-designed implementation could decide that 1-bit accuracy is OK), but honoring the rounding direction when the processor does[*] is a reasonable QoI feature. Basically, this means: disabling some optimizations when STDC FENV_ACCESS is set to ON. This is what this bug is about.

[*] a weaker requirement than __STDC_IEC_559__ being set to 1.

Note that the C standard doesn't explicitly say how a source file as a sequence of bytes is to be interpreted as a sequence of characters, so that if you just restrict to the C standard, everything is undefined. The discussion is going nowhere.
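
The "defined if and only if" rule quoted in this comment means that portable code can test for each rounding-direction macro with #ifdef. A minimal sketch, with a hypothetical function name; it reports only which macros <fenv.h> provides, not whether the compiler honors the mode in generated code.

```c
#include <fenv.h>

/* Hypothetical helper: counts how many of the four standard
   rounding-direction macros this implementation defines. */
int count_rounding_macros(void)
{
    int n = 0;
#ifdef FE_DOWNWARD
    n++;
#endif
#ifdef FE_TONEAREST
    n++;
#endif
#ifdef FE_TOWARDZERO
    n++;
#endif
#ifdef FE_UPWARD
    n++;
#endif
    return n;
}
```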
Comment 29 Nick Maclaren 2014-01-14 17:22:18 UTC
On Jan 14 2014, vincent-gcc at vinc17 dot net wrote:
>
>http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34678
>
>> >The FENV_ACCESS pragma provides a means to inform the implementation when a
>> >program might access the floating-point environment to test floating-point
>> >status flags or run under non-default floating-point control modes.
>
>> I suggest looking up the word "explicit" in a dictionary.
>
>The above is an explicit definition. Where do you see an undefined behavior
>here?

It is not an "explicit definition of BEHAVIOR" (my emphasis), and what
it implies for any non-IEEE system is completely unclear.  Of the two
countries active during the standardisation of C99, one voted "no" on
these grounds (among others).

>Note that the C standard doesn't explicitly say how a source file as a sequence
>of bytes is to be interpreted as a sequence of character, so that if you just
> restrict to the C standard, everything is undefined.

Yes, it does - it's implementation-defined in 5.1.1.2 Translation phases,
paragraph 1.1:

    Physical source file multibyte characters are mapped, in an
    implementation-defined manner, to the source character set
    (introducing new-line characters for end-of-line indicators) if
    necessary.  ...

You imply that you are also relying on some other standards or
specifications.  ISO/IEC Directives Part II is quite clear (in 6.2.2)
that they shall be referenced in the ISO standard.  Which ones are you
referring to and why?

If you are claiming that C99 and beyond support only systems that conform
to IEEE 754, then I can tell you that was not the intention of WG21 at
the time and is not a requirement of the standard.  To repeat, how many
other such systems are you familiar with?

The grounds on which the UK voted "no" on this aspect were that the whole
'IEEE 754' morass (including "fenv.h") was neither dependent on
__STDC_IEC_559__ nor implementation-dependent nor sufficiently explicit
to be interpreted on any non-IEEE system.

> The discussion is going nowhere.

Now, at least that is true.


Regards,
Nick Maclaren.
Comment 30 Vincent Lefèvre 2014-01-14 23:16:34 UTC
(In reply to Nick Maclaren from comment #29)
> It is not an "explicit definition of BEHAVIOR" (my emphasis),

The pragma is just a directive. It has no additional behavior, so that there is nothing else to define.

> >Note that the C standard doesn't explicitly say how a source file as a sequence
> >of bytes is to be interpreted as a sequence of character, so that if you just
      ^^^^^
> > restrict to the C standard, everything is undefined.
> 
> Yes, it does - it's implementation-defined in 5.1.1.2 Translation phases,
> paragraph 1.1:
> 
>     Physical source file multibyte characters are mapped, in an
                           ^^^^^^^^^^^^^^^^^^^^
Read again. I'm talking of a sequence of bytes. What you're quoting is about a sequence of multibyte characters. The interpretation of the sequences of bytes as a sequence of multibyte characters is not defined.

> You imply that you are also relying on some other standards or
> specifications.

Not other standards, just the implementation.
Comment 31 Jackie Rosen 2014-02-16 10:02:39 UTC Comment hidden (spam)
Comment 32 Stefan Vigerske 2019-05-21 10:42:40 UTC
Is there any hope this could actually be improved?
Now, 10 years later, the FENV_ACCESS pragma seems to be implemented, but the problem here seems to persist.

I run into this with GCC 8.3.0 and code like this:

#pragma STDC FENV_ACCESS ON

#include <stdio.h>
#include <stdlib.h>
#include <fenv.h>

#include <assert.h>

int main()
{
   double op;
   double down;
   double up;
   double near;

   op = atof("0.2");
   
   fesetround(FE_DOWNWARD);
   down = 1.0 / op;
   
   fesetround(FE_UPWARD);
   up = 1.0 / op;
   
   fesetround(FE_TONEAREST);
   near = 1.0 / op;
   
   printf("1/%.16g: down = %.16g near = %.16g up = %.16g\n", op, down, near, up);

   assert(down <= 5.0);
   assert(down <= near);
   assert(near <= up);
   assert(5.0 <= up);
   
   return 0;
}


$ gcc -O3 -lm div.c && ./a.out 
1/0.2: down = 4.999999999999999 near = 4.999999999999999 up = 4.999999999999999
a.out: div.c:32: main: Assertion `5.0 <= up' failed.

Looking at the assembler code, I see that only the first divsd remains and the other two were optimized away.

The Intel compiler 19.0.0.177 handles this better:
$ icc -O3 div.c -lm && ./a.out 
1/0.2: down = 4.999999999999999 near = 5 up = 5
Comment 33 Richard Biener 2019-05-21 11:06:31 UTC
(In reply to Stefan Vigerske from comment #32)
> Is there any hope this could actually be improved?
> Now, 10 years later, the FENV_ACCESS pragma seems to be implemented, but the
> problem here seems to persist.

The pragma still has no effect (not sure if we would be allowed to diagnose that
fact).

Nobody stepped up with a poor-mans "solution" to the issue (not sure if there
even exists one) and a complete solution is not even on a drawing board.

Other people running into this issue live with inserting compiler-barriers
like

__asm__("# %0" : "=r" (x) : "0" (x));

for hiding 'x' from the compiler across this point.
Comment 34 Marc Glisse 2019-05-21 20:54:49 UTC
(In reply to Richard Biener from comment #33)
> (In reply to Stefan Vigerske from comment #32)
> > Is there any hope this could actually be improved?
> > Now, 10 years later, the FENV_ACCESS pragma seems to be implemented, but the
> > problem here seems to persist.
> 
> The pragma still has no effect (not sure if we would be allowed to diagnose
> that fact).

I don't remember ever seeing anything in the C or C++ standard that would prevent us from printing as many diagnostics as we want, even for perfectly valid code. Most warnings would be illegal otherwise.

> Nobody stepped up with a poor-mans "solution" to the issue (not sure if there
> even exists one) and a complete solution is not even on a drawing board.

One approach could be producing IFN_PLUS instead of PLUS_EXPR, for floating-point types inside a "pragma on" region (or everywhere with -frounding-math), and expanding it using the barrier mentioned below by default (maybe with a possibility for targets to override that expansion). But then it is tempting to refine IFN_PLUS and have variants (an argument?) specifying the rounding mode when known, specifying if we only care about rounding, only about exceptions, or both, etc. We could also produce PLUS_EXPR and the barriers directly from the front-end, but with IFN_PLUS we at least have a chance to fold exact constant operations (say 2.+2.), and more with the refined version.

Most lacking, whatever the approach, is a volunteer motivated enough to work on it ;-)

clang doesn't handle the pragma either (AFAIK there was a recent effort, but it stalled). Intel and Microsoft supposedly handle it, but Microsoft only applies it to scalars and not SIMD vectors. That's not a lot of support...

> Other people running into this issue live with inserting compiler-barriers
> like
> 
> __asm__("# %0" : "=r" (x) : "0" (x));
> 
> for hiding 'x' from the compiler across this point.

You need to make it volatile, or it might be moved across fesetround. Also, you can refine the constraint, say "=gx" on x86_64 (or just "=x"), and a different one on each platform. An operation is then: *lo = hide(hide(x)/hide(y));
Comment 35 Richard Biener 2019-05-22 07:58:46 UTC
The IFN_ way may be a possibility indeed.  I believe a volunteer should first tackle -ftrapv in this way then to see how painful an exercise this is.

Note that the issue with FENV access is not so much the operations themselves
but the dependence on the fenv accesses which is what is missing at the moment.
Without IL adjustments for "special" (non-value, non-memory) dependences
the IFNs would need to appear to read and write global memory which may
be somewhat detrimental to optimization.  This issue doesn't exist for
-ftrapv.
Comment 36 Richard Biener 2019-05-22 08:41:41 UTC
Created attachment 46396 [details]
poor mans solution^Whack

So this is what a hack looks like, basically sprinkling those asm()s throughout the code automatically.

Note I need to protect inputs, not outputs, otherwise the last
testcase isn't fixed.

This poor-mans solution might be improved by adding some flow-sensitivity,
such as tracking which values are already protected and whether there is a
possibly harmful FENV access in between, maybe in a way similar to how
tree-complex.c tracks complex components.

Note that the FENV pragma does _not_ enable -frounding-math (it really has
no effect!) so you need to supply -frounding-math yourself (or fix the
frontends to do that).

It's a hack of course.

But it fixes the testcase:

> ./xgcc -B. t.c -O3 -lm
> ./a.out
1/0.2: down = 4.999999999999999 near = 4.999999999999999 up = 4.999999999999999
a.out: t.c:32: main: Assertion `5.0 <= up' failed.
Aborted
> ./xgcc -B. t.c -O3 -lm -frounding-math
> ./a.out
1/0.2: down = 4.999999999999999 near = 5 up = 5

IL after the lowering:

main ()
{
  static const char __PRETTY_FUNCTION__[5] = "main";
  double near;
  double up;
  double down;
  double op;
  int D.3058;

  op = atof ("0.2");
  fesetround (1024);
  __asm__ __volatile__("" : "=g" op : "0" op);
  down = 1.0e+0 / op;
  fesetround (2048);
  __asm__ __volatile__("" : "=g" op : "0" op);
  up = 1.0e+0 / op;
  fesetround (0);
  __asm__ __volatile__("" : "=g" op : "0" op);
  near = 1.0e+0 / op;
  printf ("1/%.16g: down = %.16g near = %.16g up = %.16g\n", op, down, near, up);
...
Comment 37 Marc Glisse 2019-05-22 12:47:49 UTC
(In reply to Richard Biener from comment #36)
> Created attachment 46396 [details]
> poor mans solution^Whack
> 
> So this is what a hack looks like, basically sprinkling those asm()s
> throughout the code automatically.
> 
> Note I need to protect inputs, not outputs, otherwise the last
> testcase isn't fixed.

Actually, you need to protect both inputs *and* outputs...

> Improving this poor-mans solution by writing in some flow-sensitivity
> like tracking which values are already protected

At least if you use "=x" (or whatever the right constraint is on each target) it doesn't really hurt to have a dozen protections on the same variable.

> and if there's a possibly
> harmful FENV access in between maybe in a similar way tree-complex.c tracks
> complex components might work.
> 
> Note that the FENV pragma does _not_ enable -frounding-math (it really has
> no effect!) so you need to supply -frounding-math yourself (or fix the
> frontends to do that).

If you protect even constants, the current effects of -frounding-math become redundant.
Comment 38 Marc Glisse 2019-05-22 12:54:56 UTC
(In reply to Marc Glisse from comment #37)
> If you protect even constants, the current effects of -frounding-math become
> redundant.

Oops, forget that, the hack is too late for this sentence to be true, some constant propagation has already happened by that time.
Comment 39 Jakub Jelinek 2019-05-22 13:57:21 UTC
(In reply to Richard Biener from comment #36)
> Created attachment 46396 [details]
> poor mans solution^Whack
> 
> So this is what a hack looks like, basically sprinkling those asm()s
> throughout the code automatically.
> 
> Note I need to protect inputs, not outputs, otherwise the last
> testcase isn't fixed.
> 
> Improving this poor-mans solution by writing in some flow-sensitivity
> like tracking which values are already protected and if there's a possibly
> harmful FENV access inbetween maybe in a similar way tree-complex.c tracks
> complex components might work.
> 
> Note that the FENV pragma does _not_ enable -frounding-math (it really has
> no effect!) so you need to supply -frounding-math yourself (or fix the
> frontends to do that).
> 
> It's a hack of course.
> 
> But it fixes the testcase:
> 
> > ./xgcc -B. t.c -O3 -lm
> > ./a.out
> 1/0.2: down = 4.999999999999999 near = 4.999999999999999 up =
> 4.999999999999999
> a.out: t.c:32: main: Assertion `5.0 <= up' failed.
> Aborted
> > ./xgcc -B. t.c -O3 -lm -frounding-math
> > ./a.out
> 1/0.2: down = 4.999999999999999 near = 5 up = 5
> 
> IL after the lowering:
> 
> main ()
> {
>   static const char __PRETTY_FUNCTION__[5] = "main";
>   double near;
>   double up;
>   double down;
>   double op;
>   int D.3058;
> 
>   op = atof ("0.2");
>   fesetround (1024);
>   __asm__ __volatile__("" : "=g" op : "0" op);
>   down = 1.0e+0 / op;
>   fesetround (2048);
>   __asm__ __volatile__("" : "=g" op : "0" op);
>   up = 1.0e+0 / op;
>   fesetround (0);
>   __asm__ __volatile__("" : "=g" op : "0" op);
>   near = 1.0e+0 / op;
>   printf ("1/%.16g: down = %.16g near = %.16g up = %.16g\n", op, down, near,
> up);
> ...

How does this work if op is a SSA_NAME?
Comment 40 Richard Biener 2019-05-23 09:49:00 UTC
(In reply to Jakub Jelinek from comment #39)
> (In reply to Richard Biener from comment #36)
> > Created attachment 46396 [details]
> > poor mans solution^Whack

^^^^
 
> How does this work if op is a SSA_NAME?

It doesn't; the patch would have to be fixed to create a new def and adjust
all uses, which isn't possible here (no immediate uses).

It's a proof-of-concept hack - the SSA name issue means we have to
find a better place for such a hack.  Note I don't think we should
go with this kind of hack; if we do, then we should at least not use
an ASM but some special __IFN and a more appropriate construct on
the RTL side (not sure what that would be).
Comment 41 jsm-csl@polyomino.org.uk 2019-05-23 21:53:32 UTC
It's likely that caring about exceptions would actually be worse for 
optimization than caring about rounding modes (because exceptions mean 
that floating-point operations can write global state, not just read it).  
I.e., a proper implementation would also indicate splitting 
-ftrapping-math into the existing parts relating only to local 
transformations, and something new for the global effects of operations 
being considered to write the exception state, and so not be movable past 
any code that might read it (which includes most function calls, and asms 
depending on whether they might read the floating-point state register).

(There are plenty of local bugs in this area - both in machine-independent 
optimizations, and in machine-specific code, or libgcc code, that doesn't 
do the right thing regarding exceptions; lots of thorough tests would be 
needed to find such places.  But in general the local bugs should be 
individually straightforward to fix in a way that the global issues 
aren't.)

(See discussions on the gcc mailing list in Dec 2012 / Jan 2013 / Feb 2013 
for more details in this area.)
Comment 42 Marc Glisse 2019-06-18 23:39:22 UTC
Created attachment 46502 [details]
other hack

Another approach.
* lowering in an optimization pass is idiotic, it only works at -O2+, but it shows the idea and should be easy to move anywhere.
* manually setting SSA_NAME_DEF_STMT seems strange, it probably should happen automatically as it does for an assignment.

I think this kind of approach makes sense. It can be made to work without too much effort, and then can be incrementally improved:
0) handle vectors and complex
1) let targets replace "=g" with something nicer, say "=x" or "=xm" for SSE (we generate nonsense for "=gx").
2) allow targets to expand the operations as they like (add an opcode?)
3) add parsing of #pragma fenv and change flag_rounding_math according to it
4) enable it as well for flag_trapping_math (and stop making that the default!)
5) add some constant folding (mpfr can tell if operations are exact or raise any flag)
6) add other, more specific versions, for cases where we care about rounding but not flags, or the reverse, or when we know the rounding direction (possible in the newest C standard?), or...
etc
Comment 43 Vincent Lefèvre 2020-04-16 15:35:00 UTC
Note that the effect of changing the rounding mode after a computation, whether -frounding-math is used or not, is not just that the change of rounding mode may not be honored. It can yield inconsistencies in a block where the rounding mode is not changed.

#include <stdio.h>
#include <stdlib.h>
#include <fenv.h>

#pragma STDC FENV_ACCESS ON

#define CST 0x1p-200

int main (void)
{
  volatile double a = CST;
  double b = a, c = a, d;
  printf ("%a\n", 1.0 - b);
  fesetround (FE_DOWNWARD);
  printf ("%a\n", 1.0 - c);

  if (b == c && b == CST && c == CST)
    {
      printf ("%d\n", 1.0 - b == 1.0 - c);
      printf ("1: %a\n", 1.0 - b);
      printf ("2: %a\n", 1.0 - c);
      d = b == CST ? b : (abort (), 1.0);
      printf ("3: %a\n", 1.0 - d);
      d = b == CST ? b : 1.0;
      printf ("4: %a\n", 1.0 - d);
    }

  return 0;
}

With -std=c17 -frounding-math -O3 -lm, I get:

0x1p+0
0x1.fffffffffffffp-1
0
1: 0x1p+0
2: 0x1.fffffffffffffp-1
3: 0x1p+0
4: 0x1.fffffffffffffp-1
Comment 44 Andrew Pinski 2023-01-06 17:29:45 UTC
*** Bug 108318 has been marked as a duplicate of this bug. ***
Comment 45 Andrew Pinski 2023-01-06 17:31:50 UTC
*** Bug 56020 has been marked as a duplicate of this bug. ***
Comment 46 Thomas Koenig 2023-01-06 19:29:16 UTC
Fortran gets this right:

$ cat set_rounding_mode.f90
module x
  implicit none
  integer, parameter :: wp = selected_real_kind(15)
contains
  subroutine foo(a,b,c)
    use ieee_arithmetic
    real(kind=wp), dimension(4), intent(out) :: a
    real(kind=wp), intent(in) :: b, c
    type (ieee_round_type), dimension(4), parameter :: mode = &
         [ieee_nearest, ieee_to_zero, ieee_up, ieee_down]
    integer :: i
    do i=1,4
       call ieee_set_rounding_mode (mode(i))
       a(i) = b + c
    end do
  end subroutine foo
end module x

program main
  use x
  real(kind=wp), dimension(4) :: a
  call foo(a, 0.1_wp, 0.2_wp)
  print *,a
end program main
$ gfortran -O3 set_rounding_mode.f90
$ ./a.out
  0.30000000000000004       0.29999999999999999       0.30000000000000004       0.29999999999999999
Comment 47 Thomas Koenig 2023-01-06 22:14:26 UTC
(In reply to Thomas Koenig from comment #46)
> Fortran gets this right:

... but only by accident. This test case shows that it doesn't:

$ cat y.f90
module y
  implicit none
  integer, parameter :: wp = selected_real_kind(15)
contains
  subroutine foo(a,b,c)
    use ieee_arithmetic
    real(kind=wp), dimension(4), intent(out) :: a
    real(kind=wp), intent(in) :: b, c
    type (ieee_round_type), dimension(4), parameter :: mode = &
         [ieee_nearest, ieee_to_zero, ieee_up, ieee_down]
    call ieee_set_rounding_mode (mode(1))
    a(1) = b + c
    call ieee_set_rounding_mode (mode(2))
    a(2) = b + c
    call ieee_set_rounding_mode (mode(3))
    a(3) = b + c
    call ieee_set_rounding_mode (mode(4))
    a(4) = b + c
  end subroutine foo
end module y

program main
  use y
  real(kind=wp), dimension(4) :: a
  call foo(a, 0.1_wp, 0.2_wp)
  print *,a
end program main
$ gfortran -O  y.f90 && ./a.out
  0.30000000000000004       0.30000000000000004       0.30000000000000004       0.30000000000000004     
$ gfortran y.f90 && ./a.out
  0.30000000000000004       0.29999999999999999       0.30000000000000004       0.29999999999999999
Comment 48 Thomas Koenig 2023-01-07 11:14:47 UTC
Clang gets this right, even without the pragma; the original test case is
compiled to

        pushq   %r14
        pushq   %rbx
        subq    $24, %rsp
        movq    %rsi, %r14
        movq    %rdi, %rbx
        movsd   %xmm1, 16(%rsp)                 # 8-byte Spill
        movsd   %xmm0, 8(%rsp)                  # 8-byte Spill
        movl    $1024, %edi                     # imm = 0x400
        callq   fesetround@PLT
        movsd   8(%rsp), %xmm0                  # 8-byte Reload
        divsd   16(%rsp), %xmm0                 # 8-byte Folded Reload
        movsd   %xmm0, (%rbx)
        movl    $2048, %edi                     # imm = 0x800
        callq   fesetround@PLT
        movsd   8(%rsp), %xmm0                  # 8-byte Reload
        divsd   16(%rsp), %xmm0                 # 8-byte Folded Reload
        movsd   %xmm0, (%r14)
        addq    $24, %rsp
        popq    %rbx
        popq    %r14
        retq
Comment 49 Thomas Koenig 2023-01-07 12:14:08 UTC
(In reply to Thomas Koenig from comment #48)
> Clang gets this right, even without the pragma;

The "even without the pragma" part is wrong.