Bug 93806 - Wrong optimization: instability of floating-point results with -funsafe-math-optimizations leads to nonsense
Status: NEW
Alias: None
Product: gcc
Classification: Unclassified
Component: middle-end
Version: 10.0
Importance: P3 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords: wrong-code
Depends on:
Blocks:
 
Reported: 2020-02-18 16:52 UTC by Alexander Cherepanov
Modified: 2021-09-13 01:40 UTC
CC: 5 users

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed: 2020-02-19 00:00:00


Description Alexander Cherepanov 2020-02-18 16:52:32 UTC
With -funsafe-math-optimizations, floating-point results are effectively unstable; this instability can taint everything around it and lead to nonsense. (Similar to pr93681 and pr93682.)

Instability is not limited to FP numbers, it extends to integers too:

----------------------------------------------------------------------
#include <stdio.h>

__attribute__((noipa)) // imagine it in a separate TU
static double opaque(double d) { return d; }

int main()
{
    int x = opaque(1);
    int a = 1 + opaque(0x1p-60) != x;

    printf("a = %d\n", a);
    if (x == 1) {
        opaque(0);
        if (a)
            printf("a is 1\n");
    }
}
----------------------------------------------------------------------
$ gcc -std=c11 -pedantic -Wall -Wextra -funsafe-math-optimizations -O3 test.c && ./a.out
a = 0
a is 1
----------------------------------------------------------------------
gcc x86-64 version: gcc (GCC) 10.0.1 20200218 (experimental)
----------------------------------------------------------------------
Comment 1 Alexander Cherepanov 2020-02-18 16:53:06 UTC
And instability of integers then easily taints surrounding code:

----------------------------------------------------------------------
#include <stdio.h>

__attribute__((noipa)) // imagine it in a separate TU
static double opaque(double d) { return d; }

int main()
{
    int one = opaque(1);

    int x = opaque(1);
    int a = 1 + opaque(0x1p-60) == x;

    printf("one = %d\n", one);

    opaque(a);
    if (one == a) {
        opaque(0);
        if (x == 1) {
            opaque(0);
            if (a == 0) {
                opaque(0);
                if (one == 0)
                    printf("one = %d\n", one);
            }
        }
    }
}
----------------------------------------------------------------------
$ gcc -std=c11 -pedantic -Wall -Wextra -funsafe-math-optimizations -O3 test.c && ./a.out
one = 1
one = 0
----------------------------------------------------------------------
gcc x86-64 version: gcc (GCC) 10.0.1 20200218 (experimental)
----------------------------------------------------------------------
Comment 2 Martin Liška 2020-02-19 09:19:58 UTC
> 
> __attribute__((noipa)) // imagine it in a separate TU
> static double opaque(double d) { return d; }

Putting the function into a separate file, I see the behavior started with:
r6-2248-g612b9d1364bbefb7

Richi, can you take a look?
Comment 3 Richard Biener 2020-02-19 09:52:43 UTC
Well, it's really down to valid simplifications with -ffast-math and conditional equivalences again.  From

    if (one == a) {
...
        if (x == 1) {
...
                if (one == 0)

we're turning the last compare to if (a == 0) and that "simplifies" from

(1 + opaque(0x1p-60) == x) == 0

again via conditional equivalences x == 1

(1 + opaque(0x1p-60) != 1)

to

opaque(0x1p-60) != 0

at -O1 printing

one = 1
one = 1

where we still print a and at -O2

one = 1
one = 0

where we figured to constant-prop the one == 0 constant before substituting
the one == a equivalence in evrp.

I think the patch enabled the x == 1 conditional equivalence to be used.

with -funsafe-math-optimizations you get the 1 + x != 1 -> x != 0
optimization which is unsafe because a rounding step is removed.
But you asked for that.  So no "wrong-code" here, just another case
of "instabilities", or whatever you call that, via conditional equivalence
propagation.
Comment 4 Vincent Lefèvre 2020-02-19 10:45:46 UTC
(In reply to Richard Biener from comment #3)
> with -funsafe-math-optimizations you get the 1 + x != 1 -> x != 0
> optimization which is unsafe because a rounding step is removed.
> But you asked for that.  So no "wrong-code" here, just another case
> of "instabilities", or whatever you call that, via conditional equivalence
> propagation.

I disagree. -funsafe-math-optimizations just means that floating-point expressions can be rearranged at the source level according to math rules (valid over the real numbers), thus changing how rounding is done. Under that reading, an integer variable will always be stable (and IMHO this should also be the case for a floating-point variable, to avoid consistency issues, perhaps unless you design a specific model with multivalued FP variables that change to a stable value once the variable is used to obtain a non-FP value).

This option does not mean that the optimizer may assume that math rules are valid on floating point. Otherwise one can prove that 1 == 0 (as seen above), and then it is valid to transform the program to

  int main () { return 0; }

Indeed, you can assume an "if (1)" around the main code, and since 1 == 0, you can transform it to "if (0)", and since this is always false, you can remove the code entirely. This makes no sense!
Comment 5 rguenther@suse.de 2020-02-19 11:15:24 UTC
On Wed, 19 Feb 2020, vincent-gcc at vinc17 dot net wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93806
> 
> --- Comment #4 from Vincent Lefèvre <vincent-gcc at vinc17 dot net> ---
> (In reply to Richard Biener from comment #3)
> > with -funsafe-math-optimizations you get the 1 + x != 1 -> x != 0
> > optimization which is unsafe because a rounding step is removed.
> > But you asked for that.  So no "wrong-code" here, just another case
> > of "instabilities" or how you call that via conditional equivalence
> > propagation.
> 
> I disagree. -funsafe-math-optimizations just means that floating-point
> expressions can be rearranged at the source level according to math rules
> (valid in the real numbers), thus changing how rounding is done.

From below I implicitly assume you say that "1. + x != 1." -> "x != 0."
isn't "rearranging at the source level".  Note our documentation
on -funsafe-math-optimizations is quite vague and I'd argue that
"rearranging at the source level" is covered by -fassociative-math
instead.

It's not clear how to classify the above specific transform though.
There's -ffp-contract which also enables removing of rounding steps.
So the classification as "unsafe" is probably correct (and vague).

Even only associating differently, like if you'd have 1 + x + x != 1
and re-arrange it as 1 + (x + x) != 1 may change the outcome of
the comparison (and the outcome of a comparison is an integer).

So my answer to you would be - if you don't like them, don't use
any of the special math "optimization" flags.
Comment 6 Alexander Cherepanov 2020-02-19 12:23:35 UTC
I agree that every separate optimization here is quite natural. At the same time, the end result is quite unnatural.

The following is a look at the situation from an outsider's POV.

-funsafe-math-optimizations makes floating-point values unpredictable and unstable. That's fine. And gcc is ready for this. For example, it doesn't propagate conditional equivalence for them (signed zeros would complicate it but -funsafe-math-optimizations enables -fno-signed-zeros).

OTOH gcc assumes that integers are stable. And conditional equivalence propagation is fine for them.

Nonsense results start to appear when the boundary between FPs and integers blurs. So an evident solution would be to stop replacing integers by FP computations. E.g., in this particular case, don't substitute FP equality in place of non-first instances of `a`, all instances of `a` should use the result of the same FP computation.

This approach will also solve some problems with x87 (pr85957, pr93681) and with -mpc64 (pr93682). It will not make them conformant but at least it will fix some egregious issues.
Comment 7 Vincent Lefèvre 2020-02-19 13:33:25 UTC
(In reply to rguenther@suse.de from comment #5)
> From below I implicitly assume you say that "1. + x != 1." -> "x != 0."
> isn't "rearranging at the source level".

No, it depends on how you do that. If in the source you have

  int c = 1. + x != 1.;

then you might choose to transform this to

  int c = x != 0.;

under -funsafe-math-optimizations (though this transformation is currently not documented, see below). What the optimizer MUST NOT do is to replace an occurrence of c in the code by the expression used to compute it, as doing such a thing is likely to yield major inconsistencies (this might be acceptable if the value of c is read only once, but IMHO, even this case should not be optimized as this could be too dangerous).

> Note our documentation
> on -funsafe-math-optimizations is quite vague and I'd argue that
> "rearranging at the source level" is covered by -fassociative-math
> instead.

BTW, strictly speaking, transforming "1. + x != 1." to "x != 0." does not just use the associativity, but also the "additive property" or "cancellation property". In math, if "a = b", then "a + x = b + x". However, for an arbitrary operation, the converse is not necessarily true, even though the operation may be associative. That is, if "a != b", then "a + x != b + x" is not necessarily true. Having the cancellation property under -funsafe-math-optimizations might be OK (here, this is similar to assuming associativity, but possibly stronger, so that it could be preferable to have a separate option for that).

But I think that this is not directly related to this bug.

The gcc(1) man page says:

    -fassociative-math
        Allow re-association of operands in series of floating-point
        operations.  [...]

As I read it, this is possible only inside an expression, otherwise it should not be worded like that. What the optimizer does here is to apply re-association of operands beyond expressions, changing the allowed behaviors to an unexpected one; thus, IMHO, the "as-if rule" is broken by the optimizer here.

> It's not clear how to classify the above specific transform though.
> There's -ffp-contract which also enables removing of rounding steps.
> So the classification as "unsafe" is probably correct (and vague).

Note that the conditions under which FP contraction is allowed are rather strict. A consequence of what -funsafe-math-optimizations can do is to increase the error of the considered expression, which is not allowed by FP contraction.
Comment 8 Alexander Cherepanov 2020-02-20 20:58:30 UTC
A similar problem happens with -fno-signed-zeros -fno-trapping-math. Not sure if a separate PR should be filed...

----------------------------------------------------------------------
#include <stdio.h>

__attribute__((noipa)) // imagine it in a separate TU
static double opaque(double d) { return d; }

int main()
{
    int zero = opaque(0);

    double x = opaque(-0.);
    int a = 1 / x == 1 / 0.;

    printf("zero = %d\n", zero);

    opaque(a);
    if (zero == a) {
        opaque(0);
        if (x == 0) {
            opaque(0);
            if (a) {
                opaque(0);
                if (zero == 1)
                    printf("zero = %d\n", zero);
            }
        }
    }
}
----------------------------------------------------------------------
$ gcc -std=c11 -pedantic -Wall -Wextra -fno-signed-zeros -fno-trapping-math -O3 test.c && ./a.out
zero = 0
zero = 1
----------------------------------------------------------------------
gcc x86-64 version: gcc (GCC) 10.0.1 20200220 (experimental)
----------------------------------------------------------------------
Comment 9 Vincent Lefèvre 2020-02-21 00:40:19 UTC
(In reply to Alexander Cherepanov from comment #8)
> A similar problem happens with -fno-signed-zeros -fno-trapping-math. Not
> sure if a separate PR should be filed...

Concerning -fno-signed-zeros, your code has undefined behavior since the sign of zero is significant in "1 / x == 1 / 0.", i.e. changing the sign of 0 changes the result. If you use this option, you are telling GCC that this cannot be the case. Thus IMHO, this is not a bug.

I would say that -fno-trapping-math should have no effect because there are no traps by default (for floating point). But since your code has undefined behavior, this is not necessarily surprising.
Comment 10 Rich Felker 2020-02-21 00:49:33 UTC
I don't think it's at all clear that -fno-signed-zeros is supposed to mean the programmer is promising that their code has behavior independent of the sign of zeros, and that any construct which would be influenced by the sign of a zero has undefined behavior. I've always read it as a license to optimize in ways that disregard the sign of a zero or change the sign of a zero, but with internal consistency of the program preserved.

If -fno-signed-zeros is really supposed to be an option that vastly expands the scope of what's undefined behavior, rather than just removing part of Annex F and allowing the unspecified quality of floating point results that C otherwise allows in the absence of Annex F, it really needs a much much bigger warning in its documentation!
Comment 11 Vincent Lefèvre 2020-02-21 01:26:06 UTC
(In reply to Rich Felker from comment #10)
> I don't think it's at all clear that -fno-signed-zeros is supposed to mean
> the programmer is promising that their code has behavior independent of the
> sign of zeros, and that any construct which would be influenced by the sign
> of a zero has undefined behavior. I've always read it as a license to
> optimize in ways that disregard the sign of a zero or change the sign of a
> zero, but with internal consistency of the program preserved.

But what does "internal consistency" mean? IMHO, if you choose a strict, safe meaning, then the optimization option is likely to have no effect in practice.

An example:

int foo (double a)
{
  double b, c;
  b = 1 / a;
  c = 1 / a;
  return b == -c;
}

If a is a zero, would you regard a result 1 as correct? Some users may regard this as inconsistent, since even though they do not care about the sign of zero when computing a value, they may assume that the sign of a will not change magically.
Comment 12 Rich Felker 2020-02-21 01:39:18 UTC
To me the meaning of internal consistency is very clear: that the semantics of the C language specification are honored and that the only valid transformations are those that follow the "as-if rule". Since C without Annex F allows arbitrarily awful floating point results, your example in comment 11 is fine. Each instance of 1/a can evaluate to a different value. They could even evaluate to random values. However, if you had written:

  int b = 1/a == 1/0.;
  int c = b;
  return b == c;

then the function must necessarily return 1, because the single instance of 1/a==1/0. in the abstract machine has a single value, either 0 or 1, and in the abstract machine that value is stored to b, then copied to c, and b and c necessarily have the same value. While I don't think it's likely that GCC would mess up this specific example, it seems that it currently _can_ make transformations such that a more elaborate version of the same idea would be broken.
Comment 13 Vincent Lefèvre 2020-02-21 01:49:48 UTC
(In reply to Rich Felker from comment #12)
> To me the meaning of internal consistency is very clear: that the semantics
> of the C language specification are honored and that the only valid
> transformations are those that follow the "as-if rule".

which is not clear...

> Since C without Annex F allows arbitrarily awful floating point results,

In C without Annex F, division by 0 is undefined behavior (really undefined behavior, not an unspecified result, which would be very different).

With the examples using divisions by 0, you need to assume that Annex F applies, but at the same time, with your interpretation, -fno-signed-zeros breaks Annex F in some cases, e.g. if you have floating-point divisions by 0. So I don't follow you...
Comment 14 Rich Felker 2020-02-21 02:04:44 UTC
Indeed, without Annex F, division by zero is UB, so it's fine to do anything if the program performs division by zero. So we need examples without division by zero.
Comment 15 Vincent Lefèvre 2020-02-21 02:31:22 UTC
Note that there are very few ways to be able to distinguish the sign of zero. The main one is division by zero. Other ones are:

* Conversion to a character string, e.g. via printf(). But in this case, if -fno-signed-zeros is used, whether "0" or "-0" is output (even in a way that seems to be inconsistent) doesn't matter since the user does not care about the sign of 0, i.e. "0" and "-0" are regarded as equivalent (IIRC, this would be a bit like NaN, which has a sign bit in IEEE 754, but the output does not need to match its sign bit).

* Memory analysis. Again, the sign does not matter, but for instance, reading an object twice as a byte sequence while the object has not been changed by the code must give the same result. I doubt that this is affected by optimization.

* copysign(). The C standard is clear: "On implementations that represent a signed zero but do not treat negative zero consistently in arithmetic operations, the copysign functions regard the sign of zero as positive." Thus with -fno-signed-zeros, the sign of zero must be regarded as positive with this function. If GCC chooses to deviate from the standard here, this needs to be documented.
Comment 16 Alexander Cherepanov 2020-02-21 02:46:55 UTC
On 21/02/2020 03.40, vincent-gcc at vinc17 dot net wrote:
> Concerning -fno-signed-zeros, your code has undefined behavior since the sign
> of zero is significant in "1 / x == 1 / 0.", i.e. changing the sign of 0
> changes the result. If you use this option, you are telling GCC that this
> cannot be the case. Thus IMHO, this is not a bug.

If I understand correctly what you are saying, then one cannot even do `printf("%f", 0.);` because changing the sign of 0 will add a minus in front of 0. This seems kinda restrictive.

> I would say that -fno-trapping-math should have no effect because there are no
> traps by default (for floating point).

With -fno-trapping-math gcc simplifies `1 / x == 1 / 0.` into `1 / x > 1.79769313486231570814527423731704356798070567525844996599e+308`.
Comment 17 rguenther@suse.de 2020-02-21 08:39:06 UTC
On Fri, 21 Feb 2020, bugdal at aerifal dot cx wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93806
> 
> --- Comment #10 from Rich Felker <bugdal at aerifal dot cx> ---
> I don't think it's at all clear that -fno-signed-zeros is supposed to mean the
> programmer is promising that their code has behavior independent of the sign of
> zeros, and that any construct which would be influenced by the sign of a zero
> has undefined behavior. I've always read it as a license to optimize in ways
> that disregard the sign of a zero or change the sign of a zero, but with
> internal consistency of the program preserved.
> 
> If -fno-signed-zeros is really supposed to be an option that vastly expands the
> scope of what's undefined behavior, rather than just removing part of Annex F
> and allowing the unspecified quality of floating point results that C otherwise
> allows in the absence of Annex F, it really needs a much much bigger warning in
> its documentation!

-fno-signed-zeros internally tells GCC that the hardware's FP format
does not distinguish between -0.0 and 0.0 (not sure if that makes
sense).  There's always the question of what these kinds of options
mean when GCC is presented with a literal -0.0 (or Inf/NaN with
-ffinite-math-only) or an explicit sign or finiteness check, but
we've historically erred on the "optimize" side here.
Comment 18 rguenther@suse.de 2020-02-21 08:49:34 UTC
On Fri, 21 Feb 2020, bugdal at aerifal dot cx wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93806
> 
> --- Comment #12 from Rich Felker <bugdal at aerifal dot cx> ---
> To me the meaning of internal consistency is very clear: that the semantics of
> the C language specification are honored and that the only valid
> transformations are those that follow the "as-if rule". Since C without Annex F
> allows arbitrarily awful floating point results, your example in comment 11 is
> fine. Each instance of 1/a can evaluate to a different value. They could even
> evaluate to random values. However, if you had written:
> 
>   int b = 1/a == 1/0.;
>   int c = b;
>   return b == c;
> 
> then the function must necessarily return 1, because the single instance of
> 1/a==1/0. in the abstract machine has a single value, either 0 or 1, and in the
> abstract machine that value is stored to b, then copied to c, and b and c
> necessarily have the same value. While I don't think it's likely that GCC would
> mess up this specific example, it seems that it currently _can_ make
> transformations such that a more elaborate version of the same idea would be
> broken.

GCC indeed happily evaluates a floating-point expression multiple times,
for example for

void foo(float a, float b, float *x, float *y)
{
  float tem = a + b;
  *x = tem - a;
  *y = tem - b;
}

I would expect GCC to turn this into

  *x = (a + b) - a;
  *y = (a + b) - b;

and then simplify further to b and a.  Does this violate the "as-if rule"?
It certainly affects the result when a + b == a.

Note the fortran frontend uses the PAREN_EXPR which is an association
barrier to prevent optimizations across boundaries the language
standard places (parens IIRC).  That still doesn't prevent GCC from
evaluating a single expression multiple times and possibly
doing different association inside the expression - so it still
violates the as-if rule.  Classical transforms involve tail-duplication
of blocks but also loop unrolling if we happen to duplicate loop
invariant expressions.

So I guess the "as-if rule" practically cannot be honored by an
optimizing compiler.
Comment 19 rguenther@suse.de 2020-02-21 08:52:52 UTC
On Fri, 21 Feb 2020, vincent-gcc at vinc17 dot net wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93806
> 
> --- Comment #15 from Vincent Lefèvre <vincent-gcc at vinc17 dot net> ---
> Note that there are very few ways to be able to distinguish the sign of zero.
> The main one is division by zero. Other ones are:
> 
> * Conversion to a character string, e.g. via printf(). But in this case, if
> -fno-signed-zeros is used, whether "0" or "-0" is output (even in a way that
> seems to be inconsistent) doesn't matter since the user does not care about the
> sign of 0, i.e. "0" and "-0" are regarded as equivalent (IIRC, this would be a
> bit like NaN, which has a sign bit in IEEE 754, but the output does not need to
> match its sign bit).
> 
> * Memory analysis. Again, the sign does not matter, but for instance, reading
> an object twice as a byte sequence while the object has not been changed by the
> code must give the same result. I doubt that this is affected by optimization.
> 
> * copysign(). The C standard is clear: "On implementations that represent a
> signed zero but do not treat negative zero consistently in arithmetic
> operations, the copysign functions regard the sign of zero as positive." Thus
> with -fno-signed-zeros, the sign of zero must be regarded as positive with this
> function. If GCC chooses to deviate from the standard here, this needs to be
> documented.

I'm sure GCC doesn't adhere to this (it also relies on the system's
math library, which doesn't "see" whether -fno-signed-zeros is in effect).
We'd need to special-case -0.0 at runtime for copysign (x, y), which
would be quite wasteful since -fno-signed-zeros is used for performance...
Comment 20 Vincent Lefèvre 2020-02-21 09:36:02 UTC
(In reply to rguenther@suse.de from comment #18)
> GCC indeed happily evaluates a floating-point expression multiple times,
> for example for
> 
> void foo(float a, float b, float *x, float *y)
> {
>   float tem = a + b;
>   *x = tem - a;
>   *y = tem - b;
> }
> 
> I would expect GCC to turn this into
> 
>   *x = (a + b) - a;
>   *y = (a + b) - b;
> 
> and then simplify further to b and a.  Does this violate the "as-if rule"?

I think that if optimization goes beyond expressions, this is unexpected and thus violates the "as-if rule". Assignments should be regarded as barriers. Perhaps casts too.

Now, if you disregard Annex F entirely and assume that the operations may be inaccurate (e.g. due to the optimizer), then in foo(), *x can be any value and *y can be any value, so that the simplification of *x to b and *y to a would be valid (as long as GCC assumes that their values are stable, i.e. that it will not "recompute" a same unmodified variable multiple times). But IMHO, this kind of aggressive optimization is not helpful to the user.

BTW, I fear that due to FP contraction, GCC might be broken even without "unsafe" optimizations. For instance, consider:

  double a, b, c, r, s, t;
  /* ... */
  r = a * b + c;
  s = a * b;
  t = s + c;

possibly slightly modified, without changing the semantics and still allowing FP contraction for r (I mean that things like "opaque" and volatile could be introduced in the code to change how optimization is done).

Here, if FP contraction is allowed, the compiler may replace a * b + c by fma(a,b,c), i.e. compute r with a single rounding instead of two, so that r and t may have different values. My question is the following: Due to the fact that r and t are computed with the same Level-1 expression a * b + c (i.e. at the level of real numbers, without rounding), is it possible that GCC's optimizer regards r and t as equal, even though they may actually be different? If this is possible, this would be a bug.
Comment 21 rguenther@suse.de 2020-02-21 10:10:42 UTC
On Fri, 21 Feb 2020, vincent-gcc at vinc17 dot net wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93806
> 
> --- Comment #20 from Vincent Lefèvre <vincent-gcc at vinc17 dot net> ---
> (In reply to rguenther@suse.de from comment #18)
> > GCC indeed happily evaluates a floating-point expression multiple times,
> > for example for
> > 
> > void foo(float a, float b, float *x, float *y)
> > {
> >   float tem = a + b;
> >   *x = tem - a;
> >   *y = tem - b;
> > }
> > 
> > I would expect GCC to turn this into
> > 
> >   *x = (a + b) - a;
> >   *y = (a + b) - b;
> > 
> > and then simplify further to b and a.  Does this violate the "as-if rule"?
> 
> I think that if optimization goes beyond expressions, this is unexpected, thus
> violate the "as-if rule". Assignments should be regarded as barriers. Perhaps
> casts too.
> 
> Now, if you disregard Annex F entirely and assume that the operations may be
> inaccurate (e.g. due to the optimizer), then in foo(), *x can be any value and
> *y can be any value, so that the simplification of *x to b and *y to a would be
> valid (as long as GCC assumes that their values are stable, i.e. that it will
> not "recompute" a same unmodified variable multiple times). But IMHO, this kind
> of aggressive optimization is not helpful to the user.
> 
> BTW, I fear that due to FP contraction, GCC might be broken even without
> "unsafe" optimizations. For instance, consider:
> 
>   double a, b, c, r, s, t;
>   /* ... */
>   r = a * b + c;
>   s = a * b;
>   t = s + c;
> 
> possibly slightly modified, without changing the semantics and still allowing
> FP contraction for r (I mean that things like "opaque" and volatile could be
> introduced in the code to change how optimization is done).
> 
> Here, if FP contraction is allowed, the compiler may replace a * b + c by
> fma(a,b,c), i.e. compute r with a single rounding instead of two, so that r and
> t may have different values. My question is the following: Due to the fact that
> r and t are computed with the same Level-1 expression a * b + c (i.e. at the
> level of real numbers, without rounding), is it possible that GCC's optimizer
> regard r and t as equal, even though they may actually be different? If this is
> possible, this would be a bug.

Yes, GCC's value-numbering would compute them equal, but then it would
also replace one with the other.  So one would need a rather more
contrived example where the value-numbering results in a second-order
observable difference (hah, compiler side-channel attack!).

Note that FP contraction happens quite late after value-numbering
has the chance to declare both of the above equal.  value-numbering
does not, however, consider fma (a, b, c) and a * b + c as equal.

The issue here, I think, is requiring "consistent optimization"
for correctness, so that in some cases a missed optimization becomes
wrong-code :/

Note that GCC does FP contraction across stmt boundaries so
even s = a * b; t = s + c; is contracted.  If that is already
a bug in your eyes then of course value-numbering replacing
t by r is already a bug.
Comment 22 Vincent Lefèvre 2020-02-21 11:06:45 UTC
(In reply to rguenther@suse.de from comment #21)
> Note that GCC does FP contraction across stmt boundaries so
> even s = a * b; t = s + c; is contracted.  If that is already
> a bug in your eyes then of course value-numbering replacing
> t by r is already a bug.

Yes, through testing I eventually noticed that, and I mentioned this fact in PR20785 (this will be very important when the STDC FP_CONTRACT pragma is implemented, as such cross-statement contraction is not conforming).

I haven't found an example with a major inconsistency, but when contraction is enabled, e.g. with "gcc -march=native -O3" on an x86 processor with a FMA, the following program

#include <stdio.h>

__attribute__((noipa)) // imagine it in a separate TU
static double opaque(double d) { return d; }

void foo (double a, double b, double c)
{
  double s, t, u, v;
  s = a * b;
  t = opaque(s) + c;
  u = a * opaque(b);
  v = u + c;
  printf ("%d %a %a\n", t != v, t, v);
}

int main (void)
{
  volatile double a = 0x1.000000000001p0, c = -1;
  foo (a, a, c);
  return 0;
}

outputs

1 0x1p-47 0x1.0000000000008p-47

Though t and v are computed from equivalent expressions, their values are different. However, in t != v, GCC correctly notices that they are different: it generates the comparison instruction, without optimizing; and this is still the case if -ffinite-math-only is used (since NaNs could prevent the optimization of t != v).

Note that if one adds "if (s == u)" (which is true, and noticed by GCC) before the printf, one correctly gets:

0 0x1p-47 0x1p-47
Comment 23 Vincent Lefèvre 2020-02-21 11:28:19 UTC
(In reply to Vincent Lefèvre from comment #22)
> Note that if one adds "if (s == u)" (which is true, and noticed by GCC)

Sorry, this is not noticed by GCC (I used an incorrect command line).

Anyway, the opaque() calls prevent any optimization.
Comment 24 Alexander Cherepanov 2020-02-21 12:42:41 UTC
(In reply to Vincent Lefèvre from comment #11)
> But what does "internal consistency" mean?
That's a good question. Here we talk about cases (like -funsafe-math-optimizations) that are not covered by any standards. Other PRs (like pr61502 or pr93301) discuss possible changes to the standard. So we need some basic rules to decide what is good and what is bad.

pr61502 taught me that discussing which value is the right result for a particular computation is very interesting but not very conclusive. So now I'm looking for contradictions: if you can derive a contradiction then you can derive any statement, so it's the ultimate goal.

How to apply this to a compiler? I thought the following is supposed to always hold: if you explicitly write a value into a variable (of any type), then you should get back the same value at every future read, no matter how the results of other reads are used or what control flow happens (without other writes to the variable, of course). That is, after `int x = 0;` every `printf("x = %d", x);` should output the same thing no matter how many `if (x == ...)` there are in-between -- either the `printf` doesn't fire at all or it prints `x = 0`. If we demonstrate that this is broken then we have demonstrated a contradiction (nonsense). And I hoped that it would be uncontroversial:-(

Sometimes it's possible to raise the bar even higher and to construct a testcase where `if` connecting the problematic part with the "independent" variable is hidden in non-executed code in such a way that loop unswitching will move it back into executable part (see bug 93301, comment 6 for an example).

OTOH I think the bar should be lowered in gcc and I hope it would be possible to come to an agreement that all integers in gcc should be stable. That is, in this PR the testcase in comment 0 should be enough to demonstrate the problem, without any need for a testcase in comment 1. It's quite easy to get the latter from the former, so this agreement doesn't seem very important. Much more important is to agree on the general principle described above.

It's always possible that any particular testcase is broken itself. Then some undefined behavior should be pointed out. So I totally support how you assessed my testcase from comment 8. We can disagree whether it's UB (I'll get to this a bit later) but we agree that it's either UB or the program should print something sane. What I don't understand is what is happening with my initial testcases.

(In reply to Richard Biener from comment #3)
> But you asked for that.  So no "wrong-code" here, just another case
> of "instabilities" or how you call that via conditional equivalence
> propagation.
Just to be sure: you are saying that everything works as intended -- the testcase doesn't contain UB and the result it prints is one of the allowed? (It could also be read as implying that this pr is a dup of another bug about conditional equivalence propagation or something.) Then we just disagree.

The discussion went into the specifics of particular optimizations. IMHO that's not important at all for deciding whether comment 1 demonstrates a bug or not. Again, IMHO either the testcase contains UB or it shouldn't print nonsense (in the sense described above). And it doesn't matter which options are used, whether it's a standards-compliant mode, etc.
Comment 25 Richard Biener 2020-02-24 11:08:12 UTC
(In reply to Alexander Cherepanov from comment #24)
> (In reply to Vincent Lefèvre from comment #11)
> > But what does "internal consistency" mean?
> That's a good question. Here we talk about cases (like
> -funsafe-math-optimizations) that are not covered by any standards. Other
> PRs (like pr61502 or pr93301) discuss possible changes to the standard. So
> we need some basic rules to decide what is good and what is bad.
> 
> pr61502 taught me that discussing which value is the right result for a
> particular computation is very interesting but not very conclusive. So now
> I'm looking for contradictions. If you can derive a contradiction then you
> can derive any statement, so it's an ultimate goal. How to apply this to a
> compiler? I thought the following is supposed to always hold: if you
> explicitly write a value into a variable (of any type) then you should get
> back the same value at every future read no matter how the results of other
> reads are used or what control flow happens (without other writes to the
> variable, of course). That is, after `int x = 0;` all `printf("x = %d", x);`
> should output the same no matter how many `if (x == ...)` there are
> in-between -- either `printf` doesn't fire at all or it prints `x = 0`. If
> we demonstrated that it's broken then we demonstrated a contradiction
> (nonsense). And I hoped that it would be uncontroversial:-(

It's at least very easy to describe and an obviously useful property.  I also
agree it's a very good QOI goal to target.

Just to clarify - this is for a single execution of the abstract machine,
not for the single "source" instance when the same two values are involved?
So

float foo(float x) { return x + 1; }
void bar()
{
  float f1 = foo(1.);
  float f2 = foo(1.);
}

here x + 1 need not evaluate to the same value?  This would make transforms
such as loop unrolling or tail duplication "safe" since those do not involve
evaluating the same expression more times than in the untransformed abstract
machine evaluations.

[...]

> (In reply to Richard Biener from comment #3)
> > But you asked for that.  So no "wrong-code" here, just another case
> > of "instabilities" or how you call that via conditional equivalence
> > propagation.
> Just to be sure: you are saying that everything works as intended -- the
> testcase doesn't contain UB and the result it prints is one of the allowed?
> (It could also be read as implying that this pr is a dup of another bug
> about conditional equivalence propagation or something.) Then we just
> disagree.

I think that -funsafe-math-optimizations makes the result acceptable.  I see
(and somewhat agree) that this is unfortunate for the QOI design goal
of "stable" values.

But the instability is just a result of optimizers working as intended.
There is no designed mechanism in GCC that avoids "opening up" an expression
and recomputing (parts of) it [in different contexts differently].  And
I have a hard time coming up with a way to implement such a thing that
wouldn't be detrimental to desired optimizations :/

As we see in other testcases "conditional equivalences" and replacing
one variable with another under them is a source of various problems,
but simply disabling those propagations likely will not address the
fundamental issue.

> Discussion went to specifics of particular optimizations. IMHO it's not
> important at all for deciding whether comment 1 demonstrates a bug or not.
> Again, IMHO either the testcase contains UB or it shouldn't print nonsense
> (in the sense described above). And it doesn't matter which options are
> used, whether it's a standards compliant mode, etc.

Overall I'm not sure how to address these bugs (but thanks for raising
awareness and for creating those testcases!).  It's tempting to paper over
the individual issues (like not allow conditional propagation of FP values
or disabling forwarding of once computed, many used FP conditionals) but
I think there's a more fundamental issue around all this which is not
fully understood and which we should try to address in a more coordinated
fashion (also given "artificial" wrong-code testcases don't put a very high
pressure on finding a workaround for them).

Just compare to the very old issue regarding the lack of barriers of FP
arithmetic with FP environment access.

Would it help if we had some --param no-conditional-propagation to turn
off such propagation in the two places I'm aware we do it?  That way
you could see if you can find testcases not involving such propagation
(I think it's most easy to get them there though).
Comment 26 joseph@codesourcery.com 2020-02-25 01:03:27 UTC
I wouldn't be surprised if such a test could be constructed in the absence 
of -funsafe-math-optimizations, that does a single conversion of an 
out-of-range integer to a floating-point type in the abstract machine but 
where that conversion gets duplicated so that one copy is done at compile 
time (valid with -fno-trapping-math, covered by other bugs in the 
-ftrapping-math case where it loses exceptions) and the other copy is done 
at run time and the particular instruction used doesn't follow the logic 
in fold_convert_const_int_from_real of converting NaN to zero and 
saturating other values.

Under Annex F, such an out-of-range conversion is defined to raise 
"invalid" and return an unspecified value (that is, an unspecified value 
represented in the given integer type, with any given execution of the 
conversion in the abstract machine returning a single, stable value, but 
different executions possibly returning different values).
Comment 27 Vincent Lefèvre 2020-02-25 12:29:53 UTC
(In reply to joseph@codesourcery.com from comment #26)
> I wouldn't be surprised if such a test could be constructed in the absence 
> of -funsafe-math-optimizations, that does a single conversion of an 
> out-of-range integer to a floating-point type in the abstract machine but 

I suppose that you meant the opposite: floating-point to integer.

> where that conversion gets duplicated so that one copy is done at compile 
> time (valid with -fno-trapping-math, covered by other bugs in the 
> -ftrapping-math case where it loses exceptions) and the other copy is done 
> at run time and the particular instruction used doesn't follow the logic 
> in fold_convert_const_int_from_real of converting NaN to zero and 
> saturating other values.

Yes, here's an example:

#include <stdio.h>

__attribute__((noipa)) // imagine it in a separate TU
static double opaque(double d) { return d; }

int main (void)
{
  double x = opaque(5000000000.0);
  int i;

  i = x;
  printf ("%d\n", i);
  if (x == 5000000000.0)
    printf ("%d\n", i);
  return 0;
}

With -O3, I get:

-2147483648
2147483647

Tested with:

gcc-10 (Debian 10-20200222-1) 10.0.1 20200222 (experimental) [master revision 01af7e0a0c2:487fe13f218:e99b18cf7101f205bfdd9f0f29ed51caaec52779]
Comment 28 Vincent Lefèvre 2020-02-25 12:36:15 UTC
A slightly modified version of the example, showing the issue with GCC 5 to 7 (the noipa attribute was only added in GCC 8):

#include <stdio.h>

int main (void)
{
  volatile double d = 5000000000.0;
  double x = d;
  int i = x;

  printf ("%d\n", i);
  if (x == 5000000000.0)
    printf ("%d\n", i);
  return 0;
}

The -O1 optimization level is sufficient to make the bug appear.
Comment 29 Vincent Lefèvre 2020-02-25 12:41:37 UTC
And with unsigned too (this should be a bit more readable):

#include <stdio.h>

int main (void)
{
  volatile double d = -1.0;
  double x = d;
  unsigned int i = x;

  printf ("%u\n", i);
  if (x == -1.0)
    printf ("%u\n", i);
  return 0;
}

gives

4294967295
0
Comment 30 Vincent Lefèvre 2020-02-25 12:50:44 UTC
(In reply to Vincent Lefèvre from comment #28)
> A slightly modified version of the example, showing the issue with GCC 5 to
> 7 (as the noipa attribute directive has been added in GCC 8):

Correction: This allows testing with old GCC versions, and it shows that the bug was introduced in GCC 6. GCC 5 outputs consistent values.
Comment 31 rguenther@suse.de 2020-02-25 12:53:22 UTC
On Tue, 25 Feb 2020, vincent-gcc at vinc17 dot net wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93806
> 
> --- Comment #28 from Vincent Lefèvre <vincent-gcc at vinc17 dot net> ---
> A slightly modified version of the example, showing the issue with GCC 5 to 7
> (as the noipa attribute directive has been added in GCC 8):
> 
> #include <stdio.h>
> 
> int main (void)
> {
>   volatile double d = 5000000000.0;
>   double x = d;
>   int i = x;
> 
>   printf ("%d\n", i);
>   if (x == 5000000000.0)
>     printf ("%d\n", i);
>   return 0;
> }
> 
> The -O1 optimization level is sufficient to make the bug appear.

it's DOM at work with conditional equivalences again:

Optimizing statement if (x_3 == 5.0e+9)
LKUP STMT x_3 eq_expr 5.0e+9
Optimizing block #3

1>>> STMT 1 = x_3 ordered_expr 5.0e+9
1>>> STMT 1 = x_3 le_expr 5.0e+9
1>>> STMT 1 = x_3 ge_expr 5.0e+9
1>>> STMT 1 = x_3 eq_expr 5.0e+9
1>>> STMT 0 = x_3 ne_expr 5.0e+9
0>>> COPY x_3 = 5.0e+9
Match-and-simplified (int) x_3 to 2147483647
0>>> COPY i_4 = 2147483647
Optimizing statement printf ("%d\n", i_4);
  Replaced 'i_4' with constant '2147483647'
Comment 32 Rich Felker 2020-02-25 13:28:10 UTC
> A slightly modified version of the example, showing the issue with GCC 5 to 7 (as the noipa attribute directive has been added in GCC 8):

Note that __attribute__((__weak__)) necessarily implies the noipa behavior and works in basically all GCC versions, so you can use it where you want this kind of example for older GCC.
Comment 33 Vincent Lefèvre 2020-02-26 02:23:53 UTC
I couldn't find a failing test with FP contraction, but this seems to be because GCC doesn't optimize as much as it could. Still, the following example could be interesting in the future or as a non-regression test:

#include <stdio.h>

#define A 0x0.ffffffffffffp25
#define C -0x1.p50

int main (void)
{
  volatile double a = A;
  double b = a, c = C;
  int i;

#if 0
  double x = b * b;
  printf ("%a\n", x);
#endif
  // if (b == A && c == C)
  {
    i = b * b + c;
    printf ("%d\n", i);
    if (b == A && c == C)
      {
        // i = b * b + c;
        printf ("%d\n", i == -7);
      }
  }
  return 0;
}

By "GCC doesn't optimize as much as it could", I mean that the comparison with -7 is generated, even though the result of this comparison is known at compile time. Note that this is shown by the fact that uncommenting the second "i = b * b + c;" does not change anything: a comparison with -7 is still generated.

With contraction enabled, one currently gets:

-7
1

(which is correct). If one uncomments the first "if (b == A && c == C)", one gets:

-8
0

(still correct) showing that contraction is disabled at compile time.

Changing the "#if 0" to "#if 1" allows one to avoid contraction at run time. Such a change can be useful if the compile-time rule changes in the future, say: contraction is not used at run time, but at compile time the optimization bug might have the effect of re-evaluating i with contraction, yielding inconsistent output.
Comment 34 Alexander Cherepanov 2020-03-03 15:30:20 UTC
(In reply to Vincent Lefèvre from comment #13)
> > Since C without Annex F allows arbitrarily awful floating point results,
> 
> In C without Annex F, division by 0 is undefined behavior (really undefined
> behavior, not an unspecified result, which would be very different).
> 
> With the examples using divisions by 0, you need to assume that Annex F
> applies, but at the same time, with your interpretation, -fno-signed-zeros
> breaks Annex F in some cases, e.g. if you have floating-point divisions by
> 0. So I don't follow you...

You seem to say that either Annex F is fully there or not at all but why? -fno-signed-zeros breaks Annex F but only parts of it. Isn't it possible to retain the other parts of it? Maybe it's impossible or maybe it's impossible to retain division by zero, I don't know. What is your logic here?

(In reply to Vincent Lefèvre from comment #15)
> Note that there are very few ways to be able to distinguish the sign of
> zero. The main one is division by zero. Other ones are:
> 
> * Conversion to a character string, e.g. via printf(). But in this case, if
> -fno-signed-zeros is used, whether "0" or "-0" is output (even in a way that
> seems to be inconsistent) doesn't matter since the user does not care about
> the sign of 0, i.e. "0" and "-0" are regarded as equivalent (IIRC, this
> would be a bit like NaN, which has a sign bit in IEEE 754, but the output
> does not need to match its sign bit).

This means that you cannot implement your own printf: if you analyze the sign bit of your value to decide whether you need to print '-', the sign of zero is significant in your code.

IOW why do you think that printf is fine while "1 / x == 1 / 0." is not?

> * Memory analysis. Again, the sign does not matter, but for instance,
> reading an object twice as a byte sequence while the object has not been
> changed by the code must give the same result. I doubt that this is affected
> by optimization.

Working with objects at the byte level is often optimized too:

----------------------------------------------------------------------
#include <string.h>
#include <stdio.h>

__attribute__((noipa)) // imagine it in a separate TU
static double opaque(double d) { return d; }

int main()
{
    int zero = opaque(0);

    double x = opaque(-0.);
    long l;
    memcpy(&l, &x, sizeof l);
    int a = l == 0;
    // or just this:
    //int a = (union { double d; long l; }){x}.l == 0;

    printf("zero = %d\n", zero);

    opaque(a);
    if (zero == a) {
        opaque(0);
        if (x == 0) {
            opaque(0);
            if (a) {
                opaque(0);
                if (zero == 1)
                    printf("zero = %d\n", zero);
            }
        }
    }
}
----------------------------------------------------------------------
$ gcc -std=c11 -pedantic -Wall -Wextra -fno-signed-zeros -O3 test.c && ./a.out
zero = 0
zero = 1
----------------------------------------------------------------------
gcc x86-64 version: gcc (GCC) 10.0.1 20200303 (experimental)
----------------------------------------------------------------------

Bonus: bare -fno-signed-zeros is used here, without -fno-trapping-math.
Comment 35 Vincent Lefèvre 2020-03-03 16:17:57 UTC
(In reply to Alexander Cherepanov from comment #34)
> (In reply to Vincent Lefèvre from comment #13)
> > In C without Annex F, division by 0 is undefined behavior (really undefined
> > behavior, not an unspecified result, which would be very different).
> > 
> > With the examples using divisions by 0, you need to assume that Annex F
> > applies, but at the same time, with your interpretation, -fno-signed-zeros
> > breaks Annex F in some cases, e.g. if you have floating-point divisions by
> > 0. So I don't follow you...
> 
> You seem to say that either Annex F is fully there or not at all but why?
> -fno-signed-zeros breaks Annex F but only parts of it. Isn't it possible to
> retain the other parts of it? Maybe it's impossible or maybe it's impossible
> to retain division by zero, I don't know. What is your logic here?

The issue is that the nice property that x == y implies f(x) == f(y) (in particular, that x == y implies 1 / x == 1 / y) is no longer valid with signed zeros. Thus one intent of -fno-signed-zeros could be to enable optimizations based on this property. But this means that division by zero becomes undefined behavior (like in C without Annex F). Major parts of Annex F would still remain valid.

> (In reply to Vincent Lefèvre from comment #15)
> > Note that there are very few ways to be able to distinguish the sign of
> > zero. The main one is division by zero. Other ones are:
> > 
> > * Conversion to a character string, e.g. via printf(). But in this case, if
> > -fno-signed-zeros is used, whether "0" or "-0" is output (even in a way that
> > seems to be inconsistent) doesn't matter since the user does not care about
> > the sign of 0, i.e. "0" and "-0" are regarded as equivalent (IIRC, this
> > would be a bit like NaN, which has a sign bit in IEEE 754, but the output
> > does not need to match its sign bit).
> 
> This means that you cannot implement you own printf: if you analyze sign bit
> of your value to decide whether you need to print '-', the sign of zero is
> significant in your code.

If you want to implement a printf that takes care of the sign of 0, you must not use -fno-signed-zeros.

> IOW why do you think that printf is fine while "1 / x == 1 / 0." is not?

printf is not supposed to trigger undefined behavior. Part of its output is unspecified, but that's all.

> > * Memory analysis. Again, the sign does not matter, but for instance,
> > reading an object twice as a byte sequence while the object has not been
> > changed by the code must give the same result. I doubt that this is affected
> > by optimization.
> 
> Working with objects on byte level is often optimized too:

Indeed, there could be invalid optimization... But I would have thought that in such a case, the same kind of issue could also occur without -fno-signed-zeros. Indeed, if x == y, then this does not mean that x and y have the same memory representation. Where does -fno-signed-zeros introduce a difference?

Note: There's also the case of IEEE 754 decimal floating-point formats (such as _Decimal64), for instance, due to the "cohorts", where two identical values can have different memory representations. Is GCC always correct here?
Comment 36 rguenther@suse.de 2020-03-04 08:01:21 UTC
On Tue, 3 Mar 2020, vincent-gcc at vinc17 dot net wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93806
> 
> --- Comment #35 from Vincent Lefèvre <vincent-gcc at vinc17 dot net> ---
> > > * Memory analysis. Again, the sign does not matter, but for instance,
> > > reading an object twice as a byte sequence while the object has not been
> > > changed by the code must give the same result. I doubt that this is affected
> > > by optimization.
> > 
> > Working with objects on byte level is often optimized too:
> 
> Indeed, there could be invalid optimization... But I would have thought that in
> such a case, the same kind of issue could also occur without -fno-signed-zeros.
> Indeed, if x == y, then this does not mean that x and y have the same memory
> representation. Where does -fno-signed-zeros introduce a difference?
> 
> Note: There's also the case of IEEE 754 decimal floating-point formats (such as
> _Decimal64), for instance, due to the "cohorts", where two identical values can
> have different memory representations. Is GCC always correct here?

We're actually careful about the sign of zero here when recording
equivalences for propagation.  I don't see anything preventing
equivalence based propagation for decimal floats though.
Also not sure if actual FP formats are required to be always
normalized ...
Comment 37 Vincent Lefèvre 2020-03-04 08:56:11 UTC
(In reply to rguenther@suse.de from comment #36)
> We're actually careful about the sign of zero here when recording
> requivalences for propagation.

But shouldn't the use of -fno-signed-zeros imply that the sign of zero never matches (i.e. the actual sign is unknown, because unsafe optimizations could have modified it in an inconsistent way)?
Comment 38 rguenther@suse.de 2020-03-04 09:33:56 UTC
On Wed, 4 Mar 2020, vincent-gcc at vinc17 dot net wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93806
> 
> --- Comment #37 from Vincent Lefèvre <vincent-gcc at vinc17 dot net> ---
> (In reply to rguenther@suse.de from comment #36)
> > We're actually careful about the sign of zero here when recording
> > requivalences for propagation.
> 
> But shouldn't the use of -fno-signed-zeros imply that the sign of zero never
> matches (i.e. the actual sign is unknown, because unsafe optimizations could
> have modified it in an inconsistent way)?

There's no rigorous definition of what -fno-signed-zeros implies and I
fear we'll not arrive at something consistent.  We err on the side
of enabling optimizations that look useful even if those might
result in conflicting such "definitions".  -fno-signed-zeros to
a simple minded person means all zeros have positive sign.  That's
how we treat -ffinite-math-only as well - there doesn't exist
any NaN so isnan() is always false (even if that's not exactly useful
for some people).
Comment 39 Vincent Lefèvre 2020-03-04 11:55:37 UTC
So I wonder whether -fno-signed-zeros should be removed. It seems too dangerous. The -ffinite-math-only option is fine because the programmer may be able to prove that Inf and NaN never occur in his code (at least with non-erroneous inputs). But zero is a value that is too common, and not having a very rigorous specification for -fno-signed-zeros is bad.

Now, if the only issue with -fno-signed-zeros (assuming division by 0 does not occur) concerns values with multiple representations, solving this more general issue could be sufficient, though.
Comment 40 Alexander Cherepanov 2020-03-04 19:16:51 UTC
(In reply to Vincent Lefèvre from comment #35)
> > You seem to say that either Annex F is fully there or not at all but why?
> > -fno-signed-zeros breaks Annex F but only parts of it. Isn't it possible to
> > retain the other parts of it? Maybe it's impossible or maybe it's impossible
> > to retain division by zero, I don't know. What is your logic here?
> 
> This issue is that the nice property x == y implies f(x) == f(y), in
> particular, x == y implies 1 / x == 1 / y is no longer valid with signed
> zeros. Thus one intent of -fno-signed-zeros could be to enable optimizations
> based on this property. But this means that division by zero becomes
> undefined behavior (like in C without Annex F). Major parts of Annex F would
> still remain valid.

I agree that the intent is to enable optimization based on the property "x == y implies f(x) == f(y)". But I'm not sure what follows from this.

Sure, one possibility is to make undefined any program that uses f(x) where x could be a zero and f(x) differs for the two zeros. But this approach makes printf and memory accesses undefined too. Sorry, I don't see how you could undefine division by zero while not undefining printing of zero.

Another approach is to say that we don't care which of the two possible values f(x) returns when x is zero. That is, we don't care whether 1/0. is +inf or -inf, and we don't care whether printf("%g", 0.) outputs 0 or -0.

> > This means that you cannot implement you own printf: if you analyze sign bit
> > of your value to decide whether you need to print '-', the sign of zero is
> > significant in your code.
> 
> If you want to implement a printf that takes care of the sign of 0, you must
> not use -fno-signed-zeros.

So if I use ordinary printf from a libc with -fno-signed-zeros it's fine but if I copy its implementation into my own program it's not fine?

> > IOW why do you think that printf is fine while "1 / x == 1 / 0." is not?
> 
> printf is not supposed to trigger undefined behavior. Part of its output is
> unspecified, but that's all.

Why couldn't the same be said about division? Division by zero is not supposed to trigger undefined behavior. Part of its result (the sign of the infinity) is unspecified, but that's all.

> > > * Memory analysis. Again, the sign does not matter, but for instance,
> > > reading an object twice as a byte sequence while the object has not been
> > > changed by the code must give the same result. I doubt that this is affected
> > > by optimization.
> > 
> > Working with objects on byte level is often optimized too:
> 
> Indeed, there could be invalid optimization... But I would have thought that
> in such a case, the same kind of issue could also occur without
> -fno-signed-zeros. Indeed, if x == y, then this does not mean that x and y
> have the same memory representation. Where does -fno-signed-zeros introduce
> a difference?

Right. But it's well known that x == y doesn't imply that x and y have the same value. And the only such case is zeros of different signs (right?). So compilers deal with this case in a special way. (E.g., the optimization `if (x == C) use(x)` -> `if (x == C) use(C)` is normally done only for non-zero FP constant `C`. -fno-signed-zeros changes this.)

The idea that one value could have different representations is not widely known. I didn't manage to construct a testcase for this yesterday but I succeeded today -- see pr94035 (affects clang too).

The next level -- the same value, the same representation, different meaning. E.g., pointers of different provenance. But that's another story:-)

> Note: There's also the case of IEEE 754 decimal floating-point formats (such
> as _Decimal64), for instance, due to the "cohorts", where two identical
> values can have different memory representations. Is GCC always correct here?

I have used pseudo-denormals in long double (x86_fp80) for this so far. Are decimal floating-point formats more interesting?
Comment 41 joseph@codesourcery.com 2020-03-04 22:09:08 UTC
On Wed, 4 Mar 2020, rguenther at suse dot de wrote:

> We're actually careful about the sign of zero here when recording
> requivalences for propagation.  I don't see anything preventing
> equivalence based propagation for decimal floats though.
> Also not sure if actual FP formats are required to be always
> normalized ...

All finite DFP values that can be represented with fewer decimal digits 
than the maximum for the format, and where the least significant nonzero 
decimal digit has an exponent greater than the smallest possible for the 
format, have representations with more than one possible quantum exponent, 
which are not equivalent even though they compare equal, and thus 
comparing equal should not result in an equivalence being recorded.

E.g. _Decimal32 has 7-digit precision.  1.0DF and 1.00DF are different 
members of the same cohort, with quantum exponents -1 and -2, and compare 
equal but are not equivalent.  1.000001DF uses all 7 digits so is the 
unique member of its cohort, so it is valid to use 1.000001DF in place of 
anything that compared equal to 1.000001DF.  1E-101DF is the least 
positive subnormal value; as -101 is the lowest quantum exponent, anything 
with a nonzero digit in the 10^-101 place is the unique member of its 
cohort and so 1E-101DF can be used in place of anything that compared 
equal to 1E-101DF.

This is a separate matter from noncanonical encodings of DFP values.  A 
noncanonical encoding is a valid representation that *is* fully equivalent 
to some canonical encoding - but most operations are supposed to produce 
only canonically encoded results.  If a cohort has only one member but 
there are noncanonical encodings of that member, a canonical encoding can 
replace a noncanonical one (in general it's not required to propagate 
noncanonical encodings, even when permitted), but not vice versa.
Comment 42 Vincent Lefèvre 2020-03-05 15:47:37 UTC
(In reply to Alexander Cherepanov from comment #40)
> Sure, one possibility is make undefined any program that uses f(x) where x
> could be a zero and f(x) differs for two zeros. But this approach make
> printf and memory-accesss undefined too. Sorry, I don't how you could
> undefine division by zero while not undefining printing of zero.

printf and memory accesses can already yield different results on the same value (for printf on NaN, a sign can be output or not, and the sign bit of NaN is generally unspecified). Moreover, it would not be correct to make printf and memory accesses undefined on zero, because the behavior is defined by the C standard, and more than that, very useful, while the floating-point division by 0 is undefined behavior (and the definition in Annex F makes sense only if one has signed zeros, where we care about the sign -- see more about that below).

> Another approach is to say that we don't care which of possible two values
> f(x) returns x is zero. That is, we don't care whether 1/0. is +inf or -inf
> and we don't care whether printf("%g", 0.) outputs 0 or -0.

But that would disable all the related optimizations. I don't think this would make a noticeable difference for printf in practice (in most cases), but this can be more problematic for division.

Otherwise it should be said that -fno-signed-zeros also implies that infinity gets an arbitrary sign that can change at any time. But I think that in such a case +inf and -inf should compare as equal (+ some other rules), and this would also be bad for optimization.

> > > This means that you cannot implement you own printf: if you analyze sign bit
> > > of your value to decide whether you need to print '-', the sign of zero is
> > > significant in your code.
> > 
> > If you want to implement a printf that takes care of the sign of 0, you must
> > not use -fno-signed-zeros.
> 
> So if I use ordinary printf from a libc with -fno-signed-zeros it's fine but
> if I copy its implementation into my own program it's not fine?

If you use -fno-signed-zeros, you cannot assume that you will get consistent output. But perhaps the call to printf should be changed in a mode where 0 is always regarded as having a positive sign (GCC knows the types of the arguments, and thus could wrap printf; I doubt that this would introduce much overhead).

> > > IOW why do you think that printf is fine while "1 / x == 1 / 0." is not?
> > 
> > printf is not supposed to trigger undefined behavior. Part of its output is
> > unspecified, but that's all.
> 
> Why the same couldn't be said about division? Division by zero is not
> supposed to trigger undefined behavior. Part of its result (the sign of
> infinit) is unspecified, but that's all.

See above.

> Right. But it's well known that x == y doesn't imply that x and y have the
> same value. And the only such case is zeros of different signs (right?).

On numeric types, I think so.

> So compilers deal with this case in a special way.

Only for optimization (the compiler does not have to deal with what the processor does).

> (E.g., the optimization `if (x == C) use(x)` -> `if (x == C) use(C)` is
> normally done only for non-zero FP constant `C`. -fno-signed-zeros changes
> this.)

Yes.

> The idea that one value could have different representations is not widely
> distributed.

s/is/was/ (see below with decimal).

And what about the padding bytes in structures for alignment? Could there be issues?

> I didn't manage to construct a testcase for this yesterday but
> I succeeded today -- see pr94035 (affects clang too).

I'm not sure that pseudo-denormal values of x86 long double are regarded as valid values by GCC (note that they are specified neither by IEEE 754 nor by Annex F). They could be regarded as trap representations, as defined in 3.19.4: "an object representation that need not represent a value of the object type". Reading such a representation yields undefined behavior (6.2.6.1p5), in which case PR94035 would not be a bug.

> > Note: There's also the case of IEEE 754 decimal floating-point formats (such
> > as _Decimal64), for instance, due to the "cohorts", where two identical
> > values can have different memory representations. Is GCC always correct here?
> 
> I have used pseudo-denormals in long double (x86_fp80) for this so far. Are
> decimal floating-point formats more interesting?

Yes, because contrary to pseudo-denormals in long double, the support for different representations of decimal values is fully specified; these representations have their own use (and can easily be generated with usual operations, e.g. if you have a cancellation in a subtraction), and they cannot be trap representations in C. FYI, in IEEE 754-2019:

  cohort: The set of all floating-point representations that represent a
  given floating-point number in a given floating-point format. In this
  context −0 and +0 are considered distinct and are in different cohorts.

Cohorts only appear in decimal interchange formats (such as _Decimal64).

For binary formats, normalization is useful as it allows one to gain 1 bit of precision with the implicit bit. But in decimal, no such gain is possible, so it was decided that instead of requiring normalization (to have a single representation), it was better to keep the quantum information (or exponent information, if you prefer), which can be used by some applications.

Note that even two identical values of a decimal format with the same exponent (i.e. with the same representation in the sense of IEEE 754, but not in the sense of ISO C) can have different encodings (= different representations in the sense of ISO C). However, there is the notion of canonical encoding. A non-canonical encoding may be propagated by some operations, but AFAIK, it may never be generated (i.e. it could be obtained only by reading a value from memory that was generated by other means). Thus, regarding a non-canonical encoding as a trap representation in C could be fine; it would be non-conforming, but this does not apply when using options that drop conformance anyway.
Comment 43 Alexander Cherepanov 2020-03-10 23:34:06 UTC
Joseph, Vincent, thanks a lot for the crash course in decimal floating-point. Indeed, quite interesting types. Findings so far: bug 94035, comment 5, bug 94111, bug 94122.

Let me try to summarize the understanding and to get closer to the C terminology. Please correct me if I'm wrong.

Different representations in the IEEE 754-2019 speak are different values in the C speak, e.g. 1.0DF and 1.00DF are different values in _Decimal32 (in particular, the assignment operator is required to preserve the difference). The set of values corresponding to the same number is a cohort. Cohorts of non-zero non-inf non-nan values in _Decimal32 have from 1 to 7 elements. Both infinities have only 1 element in their cohorts. Both zeros have many more elements in their cohorts (with all possible quantum exponents -- 192 for _Decimal32).

Some values admit several representations (in the C speak; encodings in the IEEE speak). GCC on x86-64 uses the binary encoding for the significand. Hence, non-zero non-inf non-nan values have only one representation each. Significands exceeding the maximum are treated as zero, which gives many (non-canonical) representations for each of the many zero values. Inf and nan values have many representations too (due to the ignored trailing exponent and/or significand).

So the first question: does any platform (that gcc supports) use the decimal encoding for the significand (aka densely packed decimal encoding)?

Then, the rules about (non)propagation of some encodings blur the boundary between values and representations in C. In particular this means that different encodings are _not_ equivalent. Take for example the optimization `x == C ? C + 0 : x` -> `x` for a constant C that is the unique member of its cohort and that has non-canonical encodings (C is an infinity according to the above analysis). I'm not sure about the encoding of literals, but the result of the addition `C + 0` is required to have a canonical encoding. If `x` has a non-canonical encoding then the optimization is invalid.

While at it, convertFormat is required to return canonical encodings, so after `_Decimal32 x = ..., y = (_Decimal32)(_Decimal64)x;` `y` has to have a canonical encoding? But these casts are nops in gcc now.
Comment 44 Vincent Lefèvre 2020-03-11 01:10:21 UTC
(In reply to Alexander Cherepanov from comment #43)
> GCC on x86-64 uses the binary encoding for the significand.

In general, yes. This includes the 32-bit ABI under Linux. But it seems to be different under MS-Windows, at least with MinGW using the 32-bit ABI: according to my tests of MPFR,

MPFR config.status 4.1.0-dev
configured by ./configure, generated by GNU Autoconf 2.69,
  with options "'--host=i686-w64-mingw32' '--disable-shared' '--with-gmp=/usr/local/gmp-6.1.2-mingw32' '--enable-assert=full' '--enable-thread-safe' 'host_alias=i686-w64-mingw32'"
[...]
CC='i686-w64-mingw32-gcc'
[...]
[tversion] Compiler: GCC 8.3-win32 20191201
[...]
[tversion] TLS = yes, float128 = yes, decimal = yes (DPD), GMP internals = no

i.e. GCC uses DPD instead of the usual BID.

> So the first question: does any platform (that gcc supports) use the decimal
> encoding for the significand (aka densely packed decimal encoding)?

DPD is also used on PowerPC (at least the 64-bit ABI), as these processors now have hardware decimal support.

> Then, the rules about (non)propagation of some encodings blur the boundary
> between values and representations in C. In particular this means that
> different encodings are _not_ equivalent. Take for example the optimization
> `x == C ? C + 0 : x` -> `x` for a constant C that is the unique member of
> its cohort and that has non-canonical encodings (C is an infinity according
> to the above analysis). Not sure about encoding of literals but the result
> of addition `C + 0` is required to have canonical encoding. If `x` has
> non-canonical encoding then the optimization is invalid.

In C, it is valid to choose any possible encoding. Concerning the IEEE 754 conformance, this depends on the bindings. But IEEE 754 does not define the ternary operator. It depends whether C considers encodings before or possibly after optimizations (in the C specification, this does not matter, but when IEEE 754 is taken into account, there may be more restrictions).

> While at it, convertFormat is required to return canonical encodings, so
> after `_Decimal32 x = ..., y = (_Decimal32)(_Decimal64)x;` `y` has to have
> canonical encoding? But these casts are nop in gcc now.

A question is whether casts are regarded as explicit convertFormat operations or whether simplification is allowed as it does not affect the value, in which case the canonicalize() function would be needed here. And in any case, when FP contraction is enabled, I suppose that (_Decimal32)(_Decimal64)x can be regarded as x.
Comment 45 Alexander Cherepanov 2020-03-11 18:54:24 UTC
(In reply to Vincent Lefèvre from comment #44)
> (In reply to Alexander Cherepanov from comment #43)
> > GCC on x86-64 uses the binary encoding for the significand.
> 
> In general, yes. This includes the 32-bit ABI under Linux. But it seems to
> be different under MS-Windows, at least with MinGW using the 32-bit ABI:
> according to my tests of MPFR,
> 
> MPFR config.status 4.1.0-dev
> configured by ./configure, generated by GNU Autoconf 2.69,
>   with options "'--host=i686-w64-mingw32' '--disable-shared'
> '--with-gmp=/usr/local/gmp-6.1.2-mingw32' '--enable-assert=full'
> '--enable-thread-safe' 'host_alias=i686-w64-mingw32'"
> [...]
> CC='i686-w64-mingw32-gcc'
> [...]
> [tversion] Compiler: GCC 8.3-win32 20191201
> [...]
> [tversion] TLS = yes, float128 = yes, decimal = yes (DPD), GMP internals = no
> 
> i.e. GCC uses DPD instead of the usual BID.

Strange, I tried mingw from stable Debian on x86-64 and it behaves the same way as the native gcc:

$ echo 'int main() { return (union { _Decimal32 d; int i; }){0.df}.i; }' > test.c

$ gcc -O3 -fdump-tree-optimized=test.out test.c && grep -h return test.out
  return 847249408;
$ gcc --version | head -n 1
gcc (Debian 8.3.0-6) 8.3.0

$ x86_64-w64-mingw32-gcc -O3 -fdump-tree-optimized=test.out test.c && grep -h return test.out
  return 847249408;
$ x86_64-w64-mingw32-gcc --version | head -n 1
x86_64-w64-mingw32-gcc (GCC) 8.3-win32 20190406

$ i686-w64-mingw32-gcc -O3 -fdump-tree-optimized=test.out test.c && grep -h return test.out
  return 847249408;
$ i686-w64-mingw32-gcc --version | head -n 1
i686-w64-mingw32-gcc (GCC) 8.3-win32 20190406

Plus some other cross-compilers:

$ powerpc64-linux-gnu-gcc -O3 -fdump-tree-optimized=test.out test.c && grep -h return test.out
  return 575668224;
$ powerpc64-linux-gnu-gcc --version | head -n 1
powerpc64-linux-gnu-gcc (Debian 8.3.0-2) 8.3.0

$ powerpc64le-linux-gnu-gcc -O3 -fdump-tree-optimized=test.out test.c && grep -h return test.out
  return 575668224;
$ powerpc64le-linux-gnu-gcc --version | head -n 1
powerpc64le-linux-gnu-gcc (Debian 8.3.0-2) 8.3.0

$ s390x-linux-gnu-gcc -O3 -fdump-tree-optimized=test.out test.c && grep -h return test.out
  return 575668224;
$ s390x-linux-gnu-gcc --version | head -n 1
s390x-linux-gnu-gcc (Debian 8.3.0-2) 8.3.0

AIUI the value 847249408 (= 0x32800000) is right for 0.df with BID and 575668224 (= 0x22500000) is right with DPD.

> > So the first question: does any platform (that gcc supports) use the decimal
> > encoding for the significand (aka densely packed decimal encoding)?
> 
> DPD is also used on PowerPC (at least the 64-bit ABI), as these processors
> now have hardware decimal support.

Oh, this means that cohorts differ by platform.

> > Then, the rules about (non)propagation of some encodings blur the boundary
> > between values and representations in C. In particular this means that
> > different encodings are _not_ equivalent. Take for example the optimization
> > `x == C ? C + 0 : x` -> `x` for a constant C that is the unique member of
> > its cohort and that has non-canonical encodings (C is an infinity according
> > to the above analysis). Not sure about encoding of literals but the result
> > of addition `C + 0` is required to have canonical encoding. If `x` has
> > non-canonical encoding then the optimization is invalid.
> 
> In C, it is valid to choose any possible encoding. Concerning the IEEE 754
> conformance, this depends on the bindings. But IEEE 754 does not define the
> ternary operator. It depends whether C considers encodings before or
> possibly after optimizations (in the C specification, this does not matter,
> but when IEEE 754 is taken into account, there may be more restrictions).

The ternary operator is not important, let's replace it with `if`:

----------------------------------------------------------------------
#include <math.h>

_Decimal32 f(_Decimal32 x)
{
    _Decimal32 inf = (_Decimal32)INFINITY + 0;

    if (x == inf)
        return inf;
    else
        return x;
}
----------------------------------------------------------------------

This is optimized into just `return x;`.

> > While at it, convertFormat is required to return canonical encodings, so
> > after `_Decimal32 x = ..., y = (_Decimal32)(_Decimal64)x;` `y` has to have
> > canonical encoding? But these casts are nop in gcc now.
> 
> A question is whether casts are regarded as explicit convertFormat
> operations 

N2478, a recent draft of C2x, lists bindings in F.3 and "convertFormat - different formats" corresponds to "cast and implicit conversions". Is this enough?

BTW "convertFormat - same format" corresponds to "canonicalize", so I guess a cast to the same type is not required to canonicalize.

> or whether simplification is allowed as it does not affect the
> value, in which case the canonicalize() function would be needed here. 

Not sure what this means.

> And
> in any case, when FP contraction is enabled, I suppose that
> (_Decimal32)(_Decimal64)x can be regarded as x.

Perhaps this PR is not ideal for such discussions but the problems with DFP that we see happen in a standards-compliant mode.
Comment 46 Vincent Lefèvre 2020-03-12 01:08:46 UTC
(In reply to Alexander Cherepanov from comment #45)
> (In reply to Vincent Lefèvre from comment #44)
> > (In reply to Alexander Cherepanov from comment #43)
> > > GCC on x86-64 uses the binary encoding for the significand.
> > 
> > In general, yes. This includes the 32-bit ABI under Linux. But it seems to
> > be different under MS-Windows, at least with MinGW using the 32-bit ABI:
> > according to my tests of MPFR,
[...]
> > i.e. GCC uses DPD instead of the usual BID.
> 
> Strange, I tried mingw from stable Debian on x86-64 and see it behaving the
> same way as the native gcc:

Sorry, I've looked at the code (the detection is done by the configure script), and in case of cross-compiling, MPFR just assumes that this is DPD. I confirm that this is actually BID. So MPFR's guess is wrong. It appears that this does not currently have any consequence on the MPFR build and the tests, so that the issue remained unnoticed.

> > In C, it is valid to choose any possible encoding. Concerning the IEEE 754
> > conformance, this depends on the bindings. But IEEE 754 does not define the
> > ternary operator. It depends whether C considers encodings before or
> > possibly after optimizations (in the C specification, this does not matter,
> > but when IEEE 754 is taken into account, there may be more restrictions).
> 
> The ternary operator is not important, let's replace it with `if`:
> 
> ----------------------------------------------------------------------
> #include <math.h>
> 
> _Decimal32 f(_Decimal32 x)
> {
>     _Decimal32 inf = (_Decimal32)INFINITY + 0;
> 
>     if (x == inf)
>         return inf;
>     else
>         return x;
> }
> ----------------------------------------------------------------------
> 
> This is optimized into just `return x;`.

I'm still wondering whether the optimization is valid, since this affects only the encoding, not the value, and in general, C allows the encoding to change when accessing an object. I don't know whether the C2x draft says something special about the decimal formats.

> N2478, a recent draft of C2x, lists bindings in F.3 and "convertFormat -
> different formats" corresponds to "cast and implicit conversions". Is this
> enough?

But ditto, is an optimization that just modifies the encoding allowed?