on arm-* with #include <fenv.h> #include <stdio.h> int main() { long long x = (1LL << 60) - 1; double y; fesetround(FE_DOWNWARD); __asm__ __volatile__ ("" : "+m" (x)); y = x; __asm__ __volatile__ ("" : "+m" (y)); fesetround(FE_TONEAREST); printf("%a\n", y); } i get 0x1p60 instead of 0x1.fffffffffffffp+59 i assume this is because the conversion is handled by __aeabi_l2d (also known as __floatdidf in libgcc) which is not rounding mode aware. this affects hardfloat targets which otherwise support directed rounding modes.
floating-point exceptions are also missing for the same reason.
Don't you need #pragma STDC FENV_ACCESS?
(In reply to Andreas Schwab from comment #2) > Don't you need #pragma STDC FENV_ACCESS? yes, for iso c conformance you need it, but gcc does not handle it anyway, instead it requires -frounding-math. however if double prec instructions are available, using them may be even faster in the difficult inexact case, e.g. double uconv64(uint64_t x) { double lo = uconv32(x); // single instruction, always exact double hi = uconv32(x>>32); return lo + hi*0x1p32; } so i would not make the fix depend on -frounding-math, just always use hardfloat instructions on hardfloat targets to do the conv. (i suspect it affects more than just armhf)
The libgcc2.c functions for conversions that get used by default on most architectures should respect the rounding mode if the underlying single-word-to-floating-point instruction does so. (They have more complicated logic to handle the __floatdisf case and other such cases where a single word is wider than the precision of the floating-point type.) The libgcc2.c __fix* functions may raise spurious "inexact" exceptions, but that's a different matter.
ok so the real problem is that libgcc does not define FP_INIT_ROUNDMODE and FP_HANDLE_EXCEPTIONS etc for hardfloat arm targets.
Arm only uses soft-fp for Thumb-1; otherwise it uses Arm-specific assembly code. A natural fix might be, for these particular functions, in the hard-float case, to use the libgcc2.c versions instead; I expect that's faster than doing them entirely as software floating-point while respecting exceptions and rounding modes. Arguably for the other functions, in the hard-float case, it would be best to build the t-hardfp versions to keep the libgcc size down, instead of the assembly versions. In both cases you'd need to make sure the functions have the ABI-specified names, that they get properly built for the base PCS rather than the VFP variant, where the ABI requires that, and that any other ABI requirements are met (the default libgcc versions, compiled as C code, might not always follow the same interface as the ABI-defined functions). The only case where soft-fp support for exceptions and rounding modes seems likely to be useful for Arm is those processors that do single-precision-only floating point in hardware, for which use of soft-fp for double precision would make sense in order to support exceptions and rounding modes (along with t-hardfp for single precision). (Also, I suppose, use of soft-fp with exceptions and rounding modes would make sense for half-precision on hardware with floating-point support only for single and maybe double precision.)
i think the code snippet i posted is more efficient and significantly smaller than using libgcc (which also sounds hard to wire up to do the right thing). the code sequence can possibly be even inlined. (and i don't mind if ucontrollers with single precision only fpu don't have correct fenv behaviour)
The code snippet you posted looks like exactly what libgcc2.c would typically do for __floatundidf. Given that, I'd prefer building the relevant function from libgcc2.c rather than having an Arm-specific duplicate. /* When the word size is small, we never get any rounding error. */ FSTYPE f = (UWtype) (u >> W_TYPE_SIZE); f *= Wtype_MAXp1_F; f += (UWtype)u; return f;
ok i was looking at the wrong code, didn't know libgcc2, i agree that's the right way to fix this.
*** Bug 94646 has been marked as a duplicate of this bug. ***
confirmed
There's some awful hand-written asm in libgcc/config/arm/ieee754-df.S replacing the standard libgcc2.c versions; that's the problem. But in order to use the latter it would need to be compiled with -mfloat-abi=softfp since the __aeabi_l2d function (and all the __aeabi_* apparently) use the standard soft-float EABI even on EABIHF targets. I'm not sure why you want a library function to be called for this on hardfloat targets anyway. Inlining the hi*0x1p32+lo is almost surely smaller than the function call, counting spills and conversion of the result back from GP registers to an FP register. It seems like GCC should be able to inline this idiom at a high level for *all* targets that lack a floatdidf operation but have floatsidf. Of course a high level fix is going to be hell to backport, and this really needs a backportable fix or workaround (maintained in mcm not upstream gcc) from musl perspective. Maybe the easiest way to do that is just to hack the right preprocessor conditions for a hardfloat implementation into ieee754-df.S...
bpabi-lib.h is the existing mechanism for renaming libgcc2.c functions and declaring them with __attribute__((pcs("aapcs"))). (But when causing more such functions to be used, or such functions to be used when they weren't before, we'd need to make sure all the declarations therein are being applied properly in the new cases to give the functions the desired ABI.)
This came up in the context of a Linux kernel patch series providing a "Unified cross-architecture kernel-mode FPU API" because the AMD GPU driver fails to build on arm with the series applied because of this issue: https://lore.kernel.org/linux-arm-kernel/75a37a4b-f516-40a3-b6b5-4aa1636f9b60@sifive.com/ Copying the patch author's response here: > In both cases, the issue is that the toolchain requires runtime support to > convert between `unsigned long long` and `double`, even when hardware FP is > enabled. There was some past discussion about GCC inlining some of these > conversions[1], but that did not get implemented. > > The short-term fix would be to drop the `select ARCH_HAS_KERNEL_FPU_SUPPORT` > for 32-bit arm until we can provide these runtime library functions.
(In reply to Thiago Jung Bauermann from comment #14) > This came up in the context of a Linux kernel patch series providing a > "Unified cross-architecture kernel-mode FPU API" because the AMD GPU driver > fails to build on arm with the series applied because of this issue: > > https://lore.kernel.org/linux-arm-kernel/75a37a4b-f516-40a3-b6b5- > 4aa1636f9b60@sifive.com/ That is a totally unrelated issue from this one. The issue there is arm EABI defines the function names rather than GCC and GCC follows the ARM EABI here (so does clang/LLVM). so you would need to do a similar renaming for the functions on arm (32bit).
(In reply to Andrew Pinski from comment #15) > (In reply to Thiago Jung Bauermann from comment #14) > > This came up in the context of a Linux kernel patch series providing a > > "Unified cross-architecture kernel-mode FPU API" because the AMD GPU driver > > fails to build on arm with the series applied because of this issue: > > > > https://lore.kernel.org/linux-arm-kernel/75a37a4b-f516-40a3-b6b5- > > 4aa1636f9b60@sifive.com/ > > That is a totally unrelated issue from this one. The issue there is arm EABI > defines the function names rather than GCC and GCC follows the ARM EABI here > (so does clang/LLVM). so you would need to do a similar renaming for the > functions on arm (32bit). Thank you! I relayed your point (hopefully correctly) in the email thread above.