111655 – [11/12/13/14/15 Regression] wrong code generated for __builtin_signbit and 0./0. on x86-64 -O2

Bug 111655 - [11/12/13/14/15 Regression] wrong code generated for __builtin_signbit and 0./0. on x86-64 -O2

Summary: [11/12/13/14/15 Regression] wrong code generated for __builtin_signbit and 0....

Status:	NEW

Alias:	None

Product:	gcc
Classification:	Unclassified
Component:	middle-end (show other bugs)
Version:	13.2.1

Importance:	P2 normal
Target Milestone:	11.5
Assignee:	Not yet assigned to anyone

URL:
Keywords:	wrong-code

Depends on:
Blocks:	111701
	Show dependency tree / graph

Reported:	2023-10-01 17:22 UTC by Paul Eggert
Modified:	2024-04-26 10:54 UTC (History)
CC List:	7 users (show)

See Also:	51446
Host:
Target:	x86_64-linux-gnu
Build:
Known to work:	4.8.5
Known to fail:
Last reconfirmed:	2023-10-02 00:00:00

Attachments
Add an attachment (proposed patch, testcase, etc.)

Note You need to log in before you can comment on or make changes to this bug.

Description Paul Eggert 2023-10-01 17:22:55 UTC

I ran into this bug when testing Gnulib code on Fedora 38 x86-64, which uses gcc (GCC) 13.2.1 20230728 (Red Hat 13.2.1-1). The problem is a regression from gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-44), which does the right thing.

Here is a stripped-down version of the bug. Compile and run the following code with "gcc -O2 t.i; ./a.out".

  int
  main ()
  {
    double x = 0.0 / 0.0;
    if (!__builtin_signbit (x))
      x = -x;
    return !__builtin_signbit (x);
  }

Although a.out's exit status should be 0, it is 1. If I compile without -O2 the bug goes away.

Here's the key part of the generated assembly language:

  main:
        pxor    %xmm0, %xmm0
        divsd   %xmm0, %xmm0
        xorpd   .LC1(%rip), %xmm0
        movmskpd        %xmm0, %eax
        testb   $1, %al
        sete    %al
        movzbl  %al, %eax
        ret

  .LC1:
        .long   0
        .long   -2147483648

On the x86-64, the "divsd %xmm0, %xmm0" instruction that implements 0.0 / 0.0 generates a NaN with the sign bit set. I determined this by testing on a Xeon W-1350, although I don't see where the NaN's sign bit is documented by Intel in this situation.

It appears that GCC's optimization incorrectly assumes that 0.0 / 0.0 generates a NaN with the sign bit clear, which causes the "if (!__builtin_signbit (x)) x = -x;" to be compiled as if it were merely "x = -x;", which is obviously incorrect.

Comment 1 Andrew Pinski 2023-10-01 17:29:03 UTC

I think this is unspecified if this is supposed to be set or not ...

Comment 2 Andrew Pinski 2023-10-01 17:29:47 UTC

The match pattern which causes the issue:
(simplify
 /* signbit(x) -> 0 if x is nonnegative.  */
 (SIGNBIT tree_expr_nonnegative_p@0)
 { integer_zero_node; })

Comment 3 Andrew Pinski 2023-10-01 17:40:15 UTC

Another testcase which shows a related issue:
```
double t = 0.0/0.0;


  int
  main ()
  {
    double x = 0.0/0.0;
    return __builtin_signbit (x) != __builtin_signbit (t);
  }
```

And another one:
```
double t = 0.0/0.0;


  int
  main ()
  {
  //  double x = 0.0/0.0;
    volatile double tt = 0.0;

    return __builtin_signbit (tt/tt) != __builtin_signbit (t);
  }

```

Note GCC and clang even disagree on the first testcase but agree with the second one.

I am thinking this is all under specified really ...

Comment 4 Andrew Pinski 2023-10-01 17:42:25 UTC

(In reply to Andrew Pinski from comment #3)
> 
> Note GCC and clang even disagree on the first testcase but agree with the
> second one.

Oh and the C and C++ front-end even disagree with each other.

Comment 5 Paul Eggert 2023-10-01 22:08:06 UTC

> I am thinking this is all under specified really ...
Although it is indeed unspecified whether 0.0/0.0 yields -NaN or +NaN, it is well understood that negating a floating point value flips its sign bit. The original test case demonstrates that gcc -O2 currently mishandles this, as that test case negates a floating point value but the sign bit remains unchanged.

Old GCC and Clang handle this correctly, as do the other non-GCC compilers that I checked. As far as I know, only recentish gcc gets this wrong, and even then only when optimization is enabled.

Here is a sharper example of the bug:

  int
  main ()
  {
    double x = 0.0 / 0.0;
    return !__builtin_signbit (x) == !__builtin_signbit(-x);
  }

This should return 0 no matter what X's value is, but it returns 1 with recent gcc -O2 on x86-64.


> The match pattern which causes the issue:
> (simplify
>  /* signbit(x) -> 0 if x is nonnegative.  */
>  (SIGNBIT tree_expr_nonnegative_p@0)
>  { integer_zero_node; })
I don't see anything wrong with that match pattern.

I speculate that what's wrong is that GCC incorrectly thinks that 0.0/0.0 is nonnegative. Although it's tempting to say that the sign bit of a division is the exclusive OR of the sign bits of its operands, evidently this is not true on x86-64 when NaNs are involved.

Comment 6 Andrew Pinski 2023-10-01 23:57:47 UTC

(In reply to Paul Eggert from comment #5)
> > The match pattern which causes the issue:
> > (simplify
> >  /* signbit(x) -> 0 if x is nonnegative.  */
> >  (SIGNBIT tree_expr_nonnegative_p@0)
> >  { integer_zero_node; })
> I don't see anything wrong with that match pattern.
> 
> I speculate that what's wrong is that GCC incorrectly thinks that 0.0/0.0 is
> nonnegative. Although it's tempting to say that the sign bit of a division
> is the exclusive OR of the sign bits of its operands, evidently this is not
> true on x86-64 when NaNs are involved.

tree_expr_nonnegative_p for divide does:
    case RDIV_EXPR:
    case TRUNC_DIV_EXPR:
    case CEIL_DIV_EXPR:
    case FLOOR_DIV_EXPR:
    case ROUND_DIV_EXPR:
      return RECURSE (op0) && RECURSE (op1);

Since 0.0 and 0.0 both don't have their sign bits set, GCC assumes diving them won't produce a value with the sign bit set ...

I really think x86_64 div instruction is broken for IEEE really.

Comment 7 Andrew Pinski 2023-10-01 23:59:56 UTC

What I am trying to say in comment #3 is both GCC and clang's constant folding is different from what the instruction divsd does in the end.

Comment 8 Andrew Pinski 2023-10-02 00:02:02 UTC

Oh this is a dup of bug 51446 , with -fno-trapping-math GCC produces the NaN as a constant folding.

*** This bug has been marked as a duplicate of bug 51446 ***

Comment 9 Alexander Monakov 2023-10-02 11:08:12 UTC

It's true that the sign of 0./0 is unpredictable, but we can fold it only when the division is being eliminated by the folding. 

It's fine to fold

  t = 0./0;
  s = __builtin_signbit(t);

to

  s = 0

with t eliminated from IR, but it's not OK to fold

  t = 0./0
  s = __builtin_signbit(t);

to

  t = 0./0
  s = 0

because the resulting program runs as if 0./0 was evaluated twice, first to a positive NaN (which was used for signbit), then to a negative NaN (which fed the following computations). This is not allowed.

This bug was incorrectly classified as a dup. The fix is either to not fold this, or fold only when we know that the division will be eliminated (e.g. the only use was in the signbit). Reopening.

Comment 10 Richard Biener 2023-10-04 09:37:17 UTC

And this conservatively has to apply to all FP divisions where we might infer
"nonnegative" unless we can also infer !zerop?  On the side of replacing
all uses I'd error on simply not folding.

Note 6.5.5/6 says "In both operations, if the value of the second operand is zero, the behavior is undefined." only remotely implying this doesn't
apply to non-integer types (remotely by including modulo behavior in this
sentence).

Possibly in some other place the C standard makes FP division by zero subject
to other rules.

Comment 11 Alexander Monakov 2023-10-04 11:41:20 UTC

(In reply to Richard Biener from comment #10)
> And this conservatively has to apply to all FP divisions where we might infer
> "nonnegative" unless we can also infer !zerop?

Yes, I think the logic in tree_binary_nonnegative_warnv_p is incorrect for floating-point division. Likewise for multiplication: it returns true for 'x * x', but when x is a NaN, 'x * x' is also a NaN (potentially with the same sign).


> On the side of replacing all uses I'd error on simply not folding.

Yes, as preceding transforms might have duplicated the division already. We can only do such folding very early, when we are sure no duplication might have taken place.


> Note 6.5.5/6 says "In both operations, if the value of the second operand is
> zero, the behavior is undefined." only remotely implying this doesn't
> apply to non-integer types (remotely by including modulo behavior in this
> sentence).
> 
> Possibly in some other place the C standard makes FP division by zero subject
> to other rules.

I think the intention is that Annex F makes it follow IEEE rules (returns an Inf/NaN and sets FE_DIVBYZERO/FE_INVALID), but it doesn't seem to be clearly worded, afaict.

Comment 12 Jakub Jelinek 2023-11-24 09:33:04 UTC

Started with r6-3811-g68e57f040c6330eb853551622d458a67d6f9e572 on this testcase,
but as pointed out tree_binary_nonnegative_warnv_p needs to be more careful.
Given that tree_single_nonnegative_warnv_p may return true on REAL_CST NaN literal with sign bit unset, one question is if e.g. such NaN + nonnegative_value or nonnegative_value + such NaN could result in NaN result with negative sign, in that case even PLUS_EXPR
      if (FLOAT_TYPE_P (type))
        return RECURSE (op0) && RECURSE (op1);
would be incorrect and we'd need to guard it with && !tree_expr_maybe_nan_p (op0) && !tree_expr_maybe_nan_p (op1).
Then there is the MULT_EXPR x * x case, that might suffer from similar problem to PLUS_EXPR if NaN with positive sign * NaN with positive sign can result in NaN with negative sign.  Then there is the MULT_EXPR RECURSE (op0) && RECURSE (op1) case, that
one can certainly result in possibly NaN if one operand is 0 and another +inf, so
we'd need to guard it for FLOAT_TYPE_P with
(tree_expr_nonzero_warnv_p (op0, strict_overflow_p) || !tree_expr_maybe_infinite_p (op1))
&& (tree_expr_nonzero_warnv_p (op1, strict_overflow_p) || !tree_expr_maybe_infinite_p (op0))
And then RDIV_EXPR, again corresponding checks.

Comment 13 Alexander Monakov 2023-11-24 09:48:38 UTC

> Then there is the MULT_EXPR x * x case

This is PR 111701.

It would be nice to clarify what "nonnegative" means in the contracts of this family of functions, because it's ambiguous for NaNs and negative zeros (x < 0 is false while signbit is set, and x >= 0 is also false for positive NaNs).

Comment 14 rguenther@suse.de 2023-11-24 09:54:58 UTC

On Fri, 24 Nov 2023, amonakov at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111655
> 
> --- Comment #13 from Alexander Monakov <amonakov at gcc dot gnu.org> ---
> > Then there is the MULT_EXPR x * x case
> 
> This is PR 111701.
> 
> It would be nice to clarify what "nonnegative" means in the contracts of this
> family of functions, because it's ambiguous for NaNs and negative zeros (x < 0
> is false while signbit is set, and x >= 0 is also false for positive NaNs).

Agreed, I think we're using it in both ways which is the problem in
the end.  Maybe having _compares_nonnegative and _sign_positive
would clarify this.

Comment 15 Jakub Jelinek 2024-03-22 17:09:13 UTC

I don't think a 9+ years old bug should be a P1.  It would be nice to get it fixed, but I'm not sure it is feasible for GCC 14 and very unlikely backportable if we e.g. have to split the *nonnegative* predicates to 2 sets.

Comment 16 Bruno Haible 2024-04-19 11:38:48 UTC

(In reply to Paul Eggert from comment #5)
> Although it is indeed unspecified whether 0.0/0.0 yields -NaN or +NaN, it is
> well understood that negating a floating point value flips its sign bit.

While I agree that this is generally true, it's not the case on mips with clang as compiler.

Namely, this compiler generates 'neg.d' instructions for the unary minus of a 'double' value, and this instruction may be a no-op on NaN values, depending on the CPU's control registers. For details, see the "MIPS® Architecture for Progreammers, Volume II-A: The MIPS® Instruction Set Manual", page 307 (317).

Whereas gcc generates an integer XOR instruction (at least in my test cases), so is fine.