This is the mail archive of the
gcc-bugs@gcc.gnu.org
mailing list for the GCC project.
[Bug libstdc++/77776] C++17 std::hypot implementation is poor
- From: "kretz at kde dot org" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: Thu, 06 Dec 2018 15:35:04 +0000
- Subject: [Bug libstdc++/77776] C++17 std::hypot implementation is poor
- Auto-submitted: auto-generated
- References: <bug-77776-4@http.gcc.gnu.org/bugzilla/>
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77776
Matthias Kretz <kretz at kde dot org> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |kretz at kde dot org
--- Comment #1 from Matthias Kretz <kretz at kde dot org> ---
Testcase for the inf input case
(https://wandbox.org/permlink/wDla0jQOtKFUsHNs):
constexpr float inf = std::numeric_limits<float>::infinity();
std::cout << std::hypot(inf, 1.f, 1.f) << '\n';
std::cout << std::hypot(inf, inf, 1.f) << '\n';
std::cout << std::hypot(inf, inf, inf) << '\n';
All of these return NaN, but surely should return inf instead. Joseph mentions
this issue in his mails.
here's what I currently have for std::experimental::simd's hypot overload:
template <class _T, class _Abi>
simd<_T, _Abi> hypot(simd<_T, _Abi> __x, simd<_T, _Abi> __y, simd<_T, _Abi>
__z)
{
using namespace __proposed::float_bitwise_operators;
__x = abs(__x); // no error
__y = abs(__y); // no error
__z = abs(__z); // no error
auto __hi = max(max(__x, __y), __z); // no error
auto __l0 = min(__z, max(__x, __y)); // no error
auto __l1 = min(__y, __x); // no error
auto __hi_exp =
__hi & simd<_T, _Abi>(std::numeric_limits<_T>::infinity()); // no error
where(__hi_exp == 0, __hi_exp) = std::numeric_limits<_T>::min();
// scale __hi to 0x1.???p0 and __lo by the same factor
auto __scale = 1 / __hi_exp; // no error
auto __h1 = __hi / __hi_exp; // no error
__l0 *= 1 / __hi_exp; // no error
__l1 *= 1 / __hi_exp; // no error
auto __lo = __l0 * __l0 + __l1 * __l1; // add the two smaller values first
auto __r = __hi_exp * sqrt(__lo + __h1 * __h1);
where(__l0 + __l1 == 0, __r) = __hi;
where(isinf(__x) || isinf(__y) || isinf(__z), __r) =
std::numeric_limits<_T>::infinity();
return __r;
}
IIUC this might still produce underflow exceptions, even though the answer is
precise. And without FMA in the accumulation inside the sqrt there may be a
larger than .5 ULP error on the result. But that's just to the best of my
knowledge.