[PATCH] builtin fadd variants implementation
Joseph Myers
joseph@codesourcery.com
Mon Sep 2 16:38:00 GMT 2019
On Mon, 2 Sep 2019, Tejas Joshi wrote:
> Hello.
> Should a result like 1.4 be considered as inexact if truncating
> (narrowing?) from double to float? (due to loss of trailing bits)

If the mathematical result of the arithmetic operation is literally the
decimal number 1.4, as opposed to the double value represented by the C
constant 1.4 which is actually 0x1.6666666666666p+0, then it is inexact
regardless of the (non-decimal) types involved. For example, fdiv (7, 5),
ddivl (7, 5), etc. are always inexact.

If the mathematical result of the arithmetic operation is
0x1.6666666666666p+0, the closest approximation to 1.4 in IEEE binary64,
then it is inexact for result formats narrower than binary64 and exact for
result formats that can represent that value. For example, fadd (1.4,
0.0) is inexact (the truncation to float is inexact although the addition
is exact). But daddl (1.4, 0.0) - note the arguments are double
constants, not long double - is exact, because the mathematical result is
exactly representable in double. Whereas daddl (1.4L, 0.0L) would be
inexact if long double is wider than double.

The question is always whether the infinite-precision mathematical result
of the arithmetic operation - which takes values representable in its
argument types - is exactly representable in the final result type.
--
Joseph S. Myers
joseph@codesourcery.com