[committed] libstdc++: Optimise std::future::wait_for and fix futex polling

Mike Crowe mac@mcrowe.com
Fri Nov 13 20:29:34 GMT 2020


On Friday 13 November 2020 at 17:25:22 +0000, Jonathan Wakely wrote:
> On 13/11/20 11:02 +0000, Jonathan Wakely wrote:
> > On 12/11/20 23:49 +0000, Jonathan Wakely wrote:
> > > To poll a std::future to see if it's ready you have to call one of the
> > > timed waiting functions. The most obvious way is wait_for(0s) but this
> > > was previously very inefficient because it would turn the relative
> > > timeout to an absolute one by calling system_clock::now(). When the
> > > relative timeout is zero (or less) we're obviously going to get a time
> > > that has already passed, but the overhead of obtaining the current time
> > > can be dozens of microseconds. The alternative is to call wait_until
> > > with an absolute timeout that is in the past. If you know the clock's
> > > epoch is in the past you can use a default constructed time_point.
> > > Alternatively, using some_clock::time_point::min() gives the earliest
> > > time point supported by the clock, which should be safe to assume is in
> > > the past. However, using a futex wait with an absolute timeout before
> > > the UNIX epoch fails and sets errno=EINVAL. The new code using futex
> > > waits with absolute timeouts was not checking for this case, which could
> > > result in hangs (or killing the process if the libray is built with
> > > assertions enabled).
> > > 
> > > This patch checks for times before the epoch before attempting to wait
> > > on a futex with an absolute timeout, which fixes the hangs or crashes.
> > > It also makes it very fast to poll using an absolute timeout before the
> > > epoch (because we skip the futex syscall).
> > > 
> > > It also makes future::wait_for avoid waiting at all when the relative
> > > timeout is zero or less, to avoid the unnecessary overhead of getting
> > > the current time. This makes polling with wait_for(0s) take only a few
> > > cycles instead of dozens of milliseconds.
> > > 
> > > libstdc++-v3/ChangeLog:
> > > 
> > > 	* include/std/future (future::wait_for): Do not wait for
> > > 	durations less than or equal to zero.
> > > 	* src/c++11/futex.cc (_M_futex_wait_until)
> > > 	(_M_futex_wait_until_steady): Do not wait for timeouts before
> > > 	the epoch.
> > > 	* testsuite/30_threads/future/members/poll.cc: New test.
> > > 
> > > Tested powerpc64le-linux. Committed to trunk.
> > > 
> > > I think the shortcut in future::wait_for is worth backporting. The
> > > changes in src/c++11/futex.cc are not needed because the code using
> > > absolute timeouts with futex waits is not present on any release
> > > branch.
> > 
> > I've committed this fix for the new test.
> 
> Backporting the change to gcc-10 revealed an overflow bug in the
> existing code, resulting in blocking for years when given an absolute
> timeout in the distant past. There's still a similar bug in the new
> code (using futexes with absolute timeouts against clocks) where a
> large chrono::seconds value can overflow and produce an incorrect
> tv_sec value. Apart from the overflow itself being UB, the result in
> that case is just a spurious wakeup (the call says it timed out when
> it didn't reach the specified time). That should still be fixed, but
> I'll do it separately.
> 
> Tested x86_64-linux. Committed to trunk.
> 
> 

> commit e7e0eeeb6e6707be2a6c6da49d4b6be3199e2af8
> Author: Jonathan Wakely <jwakely@redhat.com>
> Date:   Fri Nov 13 15:19:04 2020
> 
>     libstdc++: Avoid 32-bit time_t overflows in futex calls
>     
>     The existing code doesn't check whether the chrono::seconds value is out
>     of range of time_t. When using a timeout before the epoch (with a
>     negative value) subtracting the current time (as time_t) and then
>     assigning it to a time_t can overflow to a large positive value. This
>     means that we end up waiting several years even though the specific
>     timeout was in the distant past.
>     
>     We do have a check for negative timeouts, but that happens after the
>     conversion to time_t so happens after the overflow.
>     
>     The conversion to a relative timeout is done in two places, so this
>     factors it into a new function and adds the overflow checks there.
>     
>     libstdc++-v3/ChangeLog:
>     
>             * src/c++11/futex.cc (relative_timespec): New function to
>             create relative time from two absolute times.
>             (__atomic_futex_unsigned_base::_M_futex_wait_until)
>             (__atomic_futex_unsigned_base::_M_futex_wait_until_steady):
>             Use relative_timespec.
> 
> diff --git a/libstdc++-v3/src/c++11/futex.cc b/libstdc++-v3/src/c++11/futex.cc
> index 57f7dfe87e9e..c2b2d32e8c43 100644
> --- a/libstdc++-v3/src/c++11/futex.cc
> +++ b/libstdc++-v3/src/c++11/futex.cc
> @@ -31,6 +31,7 @@
>  #include <unistd.h>
>  #include <sys/time.h>
>  #include <errno.h>
> +#include <ext/numeric_traits.h>
>  #include <debug/debug.h>
>  
>  #ifdef _GLIBCXX_USE_CLOCK_GETTIME_SYSCALL
> @@ -46,20 +47,55 @@ const unsigned futex_clock_realtime_flag = 256;
>  const unsigned futex_bitset_match_any = ~0;
>  const unsigned futex_wake_op = 1;
>  
> -namespace
> -{
> -  std::atomic<bool> futex_clock_realtime_unavailable;
> -  std::atomic<bool> futex_clock_monotonic_unavailable;
> -}
> -
>  namespace std _GLIBCXX_VISIBILITY(default)
>  {
>  _GLIBCXX_BEGIN_NAMESPACE_VERSION
>  
> +namespace
> +{
> +  std::atomic<bool> futex_clock_realtime_unavailable;
> +  std::atomic<bool> futex_clock_monotonic_unavailable;
> +
> +  // Return the relative duration from (now_s + now_ns) to (abs_s + abs_ns)
> +  // as a timespec.
> +  struct timespec
> +  relative_timespec(chrono::seconds abs_s, chrono::nanoseconds abs_ns,
> +		    time_t now_s, long now_ns)
> +  {
> +    struct timespec rt;
> +
> +    // Did we already time out?
> +    if (now_s > abs_s.count())
> +      {
> +	rt.tv_sec = -1;
> +	return rt;
> +      }
> +
> +    auto rel_s = abs_s.count() - now_s;
> +
> +    // Avoid overflows
> +    if (rel_s > __gnu_cxx::__int_traits<time_t>::__max)
> +      rel_s = __gnu_cxx::__int_traits<time_t>::__max;
> +    else if (rel_s < __gnu_cxx::__int_traits<time_t>::__min)
> +      rel_s = __gnu_cxx::__int_traits<time_t>::__min;

I may be missing something, but if the line above executes...

> +
> +    // Convert the absolute timeout value to a relative timeout
> +    rt.tv_sec = rel_s;
> +    rt.tv_nsec = abs_ns.count() - now_ns;
> +    if (rt.tv_nsec < 0)
> +      {
> +	rt.tv_nsec += 1000000000;
> +	--rt.tv_sec;

...and so does this line above, then I think that we'll end up
underflowing. (Presumably rt.tv_sec will wrap round to being some time in
2038 on most 32-bit targets.)

I'm currently trying to persuade myself that this can actually happen and
if so work out how to come up with a test case for it.

I don't believe that this part changed in your later patch.

> +      }
> +
> +    return rt;
> +  }
> +} // namespace
> +
>    bool
> -  __atomic_futex_unsigned_base::_M_futex_wait_until(unsigned *__addr,
> -      unsigned __val,
> -      bool __has_timeout, chrono::seconds __s, chrono::nanoseconds __ns)
> +  __atomic_futex_unsigned_base::
> +  _M_futex_wait_until(unsigned *__addr, unsigned __val, bool __has_timeout,
> +		      chrono::seconds __s, chrono::nanoseconds __ns)
>    {
>      if (!__has_timeout)
>        {
> @@ -109,15 +145,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>  	// true or has just been set to true.
>  	struct timeval tv;
>  	gettimeofday (&tv, NULL);
> +
>  	// Convert the absolute timeout value to a relative timeout
> -	struct timespec rt;
> -	rt.tv_sec = __s.count() - tv.tv_sec;
> -	rt.tv_nsec = __ns.count() - tv.tv_usec * 1000;
> -	if (rt.tv_nsec < 0)
> -	  {
> -	    rt.tv_nsec += 1000000000;
> -	    --rt.tv_sec;
> -	  }
> +	auto rt = relative_timespec(__s, __ns, tv.tv_sec, tv.tv_usec * 1000);
> +
>  	// Did we already time out?
>  	if (rt.tv_sec < 0)
>  	  return false;
> @@ -134,9 +165,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>    }
>  
>    bool
> -  __atomic_futex_unsigned_base::_M_futex_wait_until_steady(unsigned *__addr,
> -      unsigned __val,
> -      bool __has_timeout, chrono::seconds __s, chrono::nanoseconds __ns)
> +  __atomic_futex_unsigned_base::
> +  _M_futex_wait_until_steady(unsigned *__addr, unsigned __val,
> +			     bool __has_timeout,
> +			     chrono::seconds __s, chrono::nanoseconds __ns)

Oh how I wish all these functions didn't expect already cracked seconds and
nanoseconds. :(

Mike.


More information about the Gcc-patches mailing list