111101 – -finline-small-functions may invert FP arguments breaking FP bit accuracy in case of NaNs

Status:	RESOLVED INVALID

Alias:	None

Product:	gcc
Classification:	Unclassified
Component:	rtl-optimization
Version:	11.3.0

Importance:	P3 normal
Target Milestone:	---
Assignee:	Not yet assigned to anyone

Reported:	2023-08-22 13:30 UTC by Pavel M
Modified:	2023-08-22 14:06 UTC
CC List:	1 user

Description Pavel M 2023-08-22 13:30:05 UTC

Notes:
1. This may not be a bug.
2. This may be a duplicate.
3. I don't have an MRE.

Brief: -finline-small-functions may invert FP arguments, breaking FP bit accuracy in case of NaNs

Demo:
$ gcc t1.c -O1 -std=c11 && ./a
r     nan 7fe5ed65
r_ref nan 7fe5ed65

$ gcc t1.c -O1 -std=c11 -finline-small-functions && ./a
r     -nan fffffffe
r_ref nan 7fe5ed65

Description: In my code I add two FP values (represented in "raw hex"): 0x7fa5ed65 (sNaN) with 0xfffffffe (qNaN). x86_64 instruction addss returns 0x7fe5ed65 (sNaN). However, under -finline-small-functions gcc, I guess, rewrites A+B to B+A, resulting in 0xfffffffe (qNaN), which breaks FP bit accuracy.

I examined generated assembly code:
-O1:
add(x, y)
x => ecx => ebp => xmm0
y => edx => edi => xmm1
addss xmm1, xmm0 (at&t syntax)

-O1 -finline-small-functions:
add(x, y)
x => ecx => esi => xmm1
y => edx => ebx => xmm0
addss xmm1, xmm0 (at&t syntax)

Here we see that in case of -finline-small-functions x and y are inverted.

Notes:
1. Some software may rely on FP bit accuracy in case of NaNs (NaN boxing, etc.).
2. I'm not sure which "Component:" to select: rtl-optimization or tree-optimization.

Comment 1 Alexander Monakov 2023-08-22 14:06:32 UTC

0x7fe5ed65 is a quiet NaN, not signaling (it differs from the input 0x7fa5ed65 sNaN by the leading mantissa bit 0x00400000).
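
The quiet/signaling distinction can be checked directly on the raw bits. The following is a minimal sketch, assuming IEEE-754 binary32 with the usual convention (as on x86) that a set leading mantissa bit means "quiet". The helper names (f32_is_nan, f32_is_quiet_nan, f32_is_signaling_nan, f32_quiet) are made up for illustration and are not part of GCC or libc.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* binary32 layout: 1 sign bit, 8 exponent bits, 23 mantissa bits. */
#define F32_EXP_MASK  0x7f800000u
#define F32_MANT_MASK 0x007fffffu
#define F32_QUIET_BIT 0x00400000u /* leading mantissa bit (assumed: set = quiet) */

/* NaN: exponent all ones and a non-zero mantissa. */
static bool f32_is_nan(uint32_t bits)
{
    return (bits & F32_EXP_MASK) == F32_EXP_MASK && (bits & F32_MANT_MASK) != 0;
}

static bool f32_is_quiet_nan(uint32_t bits)
{
    return f32_is_nan(bits) && (bits & F32_QUIET_BIT) != 0;
}

static bool f32_is_signaling_nan(uint32_t bits)
{
    return f32_is_nan(bits) && (bits & F32_QUIET_BIT) == 0;
}

/* Quiet a NaN by setting the quiet bit and keeping sign and payload. */
static uint32_t f32_quiet(uint32_t bits)
{
    assert(f32_is_nan(bits));
    return bits | F32_QUIET_BIT;
}
```

With these helpers, 0x7fa5ed65 classifies as signaling, while 0x7fe5ed65 and 0xfffffffe classify as quiet, and f32_quiet(0x7fa5ed65) yields 0x7fe5ed65.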

IEEE-754 does not pin down which of the two payloads should be propagated when both operands are NaNs, and neither do language standards, so for GCC, floating-point addition and similar operations are commutative.

Observed NaN payloads are not predictable and may change depending on optimization level, choice of x87 vs. SSE instructions, etc. This is not a bug.
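
For anyone who wants to observe this locally, here is a rough sketch (not the reporter's t1.c, which was not attached) that adds two NaNs in both source orders and returns the raw result bits. The helper names are hypothetical. The volatile qualifiers are only there to discourage compile-time folding. The compiler remains free to commute either sum, so "x + y" and "y + x" refer to source order only, and which payload comes out is not guaranteed.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

static_assert(sizeof(float) == sizeof(uint32_t), "float must be 32 bits wide");

/* Reinterpret raw bits as a float and back without aliasing violations. */
static float f32_from_bits(uint32_t bits)
{
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}

static uint32_t f32_to_bits(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    return bits;
}

/* Add the two values in both source orders and store the raw results.
   No particular payload is promised for either result. */
static void add_both_orders(uint32_t x_bits, uint32_t y_bits,
                            uint32_t *xy, uint32_t *yx)
{
    volatile float x = f32_from_bits(x_bits);
    volatile float y = f32_from_bits(y_bits);
    *xy = f32_to_bits(x + y);
    *yx = f32_to_bits(y + x);
}
```

Calling add_both_orders(0x7fa5ed65u, 0xfffffffeu, &xy, &yx) and printing xy and yx with "%08x" at different optimization levels shows which payload a given build happens to propagate. The only property code can rely on is that both results are NaNs.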