Bug 41861 (DR887) - [DR 887][C++0x] <condition_variable> does not use monotonic_clock
Summary: [DR 887][C++0x] <condition_variable> does not use monotonic_clock
Status: RESOLVED FIXED
Alias: DR887
Product: gcc
Classification: Unclassified
Component: libstdc++ (show other bugs)
Version: 4.4.1
: P3 normal
Target Milestone: 10.0
Assignee: Not yet assigned to anyone
URL:
Keywords: patch
Depends on:
Blocks:
 
Reported: 2009-10-28 23:46 UTC by Aaron Graham
Modified: 2019-09-04 22:47 UTC (History)
8 users (show)

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed: 2019-09-04 00:00:00


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Aaron Graham 2009-10-28 23:46:31 UTC
<mutex> looks like this:

171     #ifdef _GLIBCXX_USE_CLOCK_MONOTONIC
172	    typedef chrono::monotonic_clock 	  	__clock_t;
173	#else
174	    typedef chrono::high_resolution_clock 	__clock_t;
175	#endif

...but <condition_variable> only has this:

56	    typedef chrono::system_clock	__clock_t;

Shouldn't <condition_variable> be using the monotonic_clock if available, just like <mutex>?
Comment 1 Paolo Carlini 2009-10-29 00:31:45 UTC
Chris, can you have a look to this issue? Thanks.
Comment 2 Paolo Carlini 2009-11-03 11:22:14 UTC
Chris??
Comment 3 Chris Fairles 2009-11-03 13:58:41 UTC
Yes, I'm alive! Starting to get back into the GCC swing of things.

Ok, <condition_variable> and <mutex> and clocks. Its a bit of a tricky situation, reading current standard draft and other related docs (i.e. posix) to get myself back up to speed. In my preliminary investigations there seems to be some issues in assuming what "epoch" means when using gthreads and what we assume is a "known" clock.

I think condition_variable is more correct than mutex using system_clock as its "known" clock. Currently mutex doesn't attempt "sync" unknown clocks to a known clock in its *_until/for functions like condition_variable. This could potentially be an issue for gthread implementations (and posix implementations for that matter) where the epoch for system_clock, monotonic and high_res clocks are all different (let alone user-defined clocks).

So, in condition_variable at least, the assumption is the system_clock epoch (system_clock::time_point()) == gthread's expected epoch.

Taking the assumption that system_clock::time_point() == gthread's epoch, in mutex, it erroneously assumes that monotic_clock::time_point() == system_clock::time_point() but a quick reading of the POSIX docs shows:

"For [the monotonic] clock, the value returned by clock_gettime() represents the amount of time (in seconds and nanoseconds) since an unspecified point in the past (for example, system start-up time, or the Epoch)."
(http://www.opengroup.org/onlinepubs/009695399/functions/clock_gettime.html)

For the POSIX gthread_cond* implementation, the POSIX standard suggests that, if Clock Selection is available you should set the appropriate condition attribute (ex. pthread_condattr_setclock if available).

For mutexes (from pthread_mutex_timedlock), 
"If the Timers option is supported, the timeout shall be based on the CLOCK_REALTIME clock; if the Timers option is not supported, the timeout shall be based on the system clock as returned by the time() function."

This almost exactly describes system_clock other than the fact that it might use gettimeofday before using time() (i.e. std::time()), which is still realative to the POSIX epoch so I think thats ok ... for posix gthreads impl.

Long story short, using monotonic_clock as a "known" clock in *mutex* is almost certainly incorrect since, in POSIX, its absolute value is meaningless (epoch is arbitrary). It would be more correct to sync the monotonic_clock to system_clock like condition_varaible does.

30.2.4p4 says implementations should use a monotonic clock for the *_for functions that take a relative time. Unfortunately, gthreads (and its POSIX impl) use absolute time and assumes times are relative to the POSIX epoch.

Let me do a bit more research and poke Howard H. again for some clarification on this before moving forward.

Chris
Comment 5 Paolo Carlini 2009-11-03 15:37:10 UTC
Thanks for your feedback Chris. Gosh, that issue, quite a bit of time spent in Santa Cruz, Detlef arguing was not implementable and Howard disagreeing...
Comment 6 Aaron Graham 2009-11-10 04:38:03 UTC
So it appears that the problem is gthreads. The monotonic_clock support is purely superficial in gcc until gthreads supports such a concept. Developers will need to create their own clock and modify the standard library headers to use it should they require a reasonable level of reliability in the face of a possibly-changing system clock.

But I think the Howard/Detlef debate is a separate issue. I believe they have determined that a condition_variable (and mutex) must continue to use a specific clock once the object is created, and to sync all given time points to that clock, and are arguing over whether or not that is implementable. No big deal. I just don't believe there is any particular requirement that it be the system_clock (and, if there were, I would think that to be a big mistake). In almost every project I've worked on, our purposes would be much better served if a monotonic_clock were used instead. Rarely do we care what the epoch is. What we do care about is timer reliability even when NTP (or some other mechanism) is changing the clock. But that's just my experience.

Thanks for looking into this. I'm hoping for a resolution that doesn't make <condition_variable> and <mutex> all but useless as provided by the standard library sans modification. The boost team has already made some egregious mistakes in this area.
Comment 7 Paolo Carlini 2010-02-04 18:02:20 UTC
It seems to me that if DR887 is indeed resolved per the discussion in Santa Cruz, thus the wait_for functions removed, the wait_until functions use system_clock, then this issue could be immediately resolved. In other terms, I would add [DR 887] to the Subject and suspend. Agreed?
Comment 8 Jonathan Wakely 2012-07-03 16:23:30 UTC
887 is NAD, reopening.

This is a QoI issue, but even if monotonic clock support was added to gthreads (by which I assume people mean providing an API for pthread_condattr_setclock) we couldn't meet all the QoI suggestions.

The standard says implementations should use a steady clock for xxx_for, but should use the user-provided clock for xxx_until.  It's not possible to do both on POSIX, because a condition variable has a single clock, set on construction, and mutexes always use the system clock.
Comment 9 Mike Crowe 2015-07-07 15:49:32 UTC
It seems that there's been lots of talk about this but no firm solution. Here's some interesting links:

http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2009/n2999.html

http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2015/n4486.html (search
for "887")

https://www.redhat.com/archives/posix-c++-wg/2009-July/msg00002.html
(search for "887")

https://github.com/cplusplus/LWG/blob/master/xml/issue0887.xml

To me they seem to boil down to three points of view:

1. The current implementation of converting all other clocks to the system_clock is acceptable behaviour. Unfortunately this is racey when the system clock is changed and could cause the caller to wait for far longer than requested.

2. Implementers should add an extra constructor parameter (or perhaps template parameter) to condition_variable to allow the master clock for the condition_variable to be configured. All clocks would then be converted to this clock. This would allow the client code to express its intent but it seems ugly to require platform-specific code to make the standard behaviour actually work.

3. condition_variable should support wait_until using at least steady_clock (CLOCK_MONOTONIC) and system_clock (CLOCK_REALTIME.) Relative wait operations should use steady_clock. User-defined clocks should probably convert to steady_clock.

I believe that only option 3 makes any sense but this requires an equivalent to pthread_cond_timedwait that supports specifying the clock. The glibc implementation of such a function would be straightforward.
Comment 10 Mike Crowe 2015-07-07 15:51:32 UTC
(In reply to Mike Crowe from comment #9)
> 3. condition_variable should support wait_until using at least steady_clock
> (CLOCK_MONOTONIC) and system_clock (CLOCK_REALTIME.) Relative wait
> operations should use steady_clock. User-defined clocks should probably
> convert to steady_clock.
> 
> I believe that only option 3 makes any sense but this requires an equivalent
> to pthread_cond_timedwait that supports specifying the clock. The glibc
> implementation of such a function would be straightforward.

I've proposed a patch that implements this option at:

 https://gcc.gnu.org/ml/libstdc++/2015-07/msg00024.html

and the required glibc change at:

 https://sourceware.org/ml/libc-alpha/2015-07/msg00193.html
Comment 11 Jonathan Wakely 2015-07-08 07:43:40 UTC
Aaron, your reply got added to Bug 887 for some reason.

Re-posting Aaron's comment here:

>Thanks. I had already patched our gcc so that gthreads cond always gets
>initialized with CLOCK_MONOTONIC, then I switched __clock_t in
>condition_variable to steady_clock. It was a very simple change and works
>well but not nearly as portable as yours.
>
>I also disabled all timed waits on mutex (gcc already has ifdef for that)
>in order to avoid problems there. In my opinion, people shouldn't be using
>timed waits on mutexes anyway, since they are not cooperatively
>interruptible. If we did need them for some reason, I would reimplement
>timed mutex in terms of condition_variable and a regular mutex.
>
>It seems strange that this is no big deal to lots of people.
Comment 12 Roman Fietze 2016-10-17 12:11:36 UTC
Sorry if it is inappropriate to ask for any changes, but how can it be, that there is no fix for this bug for years in any of the GCC releases? 

With this bug condition_variable::wait_until is completely unusable on many targets with non monotonic system clocks, e.g. because they only do connect to any time server from time to time, or targets that get their time via any other method from a system that only connects from time to time.

I'm sure this affects e.g. many embedded systems.

If I can be of any help to get any fix into the release branches please tell me. Be it testing or debugging.
Comment 13 Jonathan Wakely 2016-10-17 12:20:50 UTC
(In reply to Roman Fietze from comment #12)
> Sorry if it is inappropriate to ask for any changes, but how can it be, that
> there is no fix for this bug for years in any of the GCC releases? 

Because it's not possible to implement the C++ requirements purely in terms of POSIX, so it requires a new API in the C library, which is complicated. All the information you need to investigate that is provided in this bug report and the enclosed links.

> With this bug condition_variable::wait_until is completely unusable on many

I find that hard to believe.
Comment 14 Mike Crowe 2016-10-18 16:43:04 UTC
(In reply to Jonathan Wakely from comment #13)
> (In reply to Roman Fietze from comment #12)
> > Sorry if it is inappropriate to ask for any changes, but how can it be, that
> > there is no fix for this bug for years in any of the GCC releases? 
> 
> Because it's not possible to implement the C++ requirements purely in terms
> of POSIX, so it requires a new API in the C library, which is complicated.
> All the information you need to investigate that is provided in this bug
> report and the enclosed links.

I submitted an RFC glibc patch last year: http://patchwork.ozlabs.org/project/glibc/list/?submitter=66786

There were some objections but no-one seemed to outright say no. Unfortunately it is blocked waiting for Torvald Riegel's removal of the assembly "optimisation" for x86 and x86_64 before it can go in. This seems to be taking longer than I expected when I wrote the patch.

> > With this bug condition_variable::wait_until is completely unusable on many
> 
> I find that hard to believe.

Well, it does make it unsafe to use in an environment where CLOCK_REALTIME can change arbitrarily which some may consider to equate to "unusable".

However, in the intervening time I've become aware of the std::synchronic proposal ( http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2016/p0126r2.pdf ) and I've been wondering whether libstdc++ would be better off implementing support for that and then using it to build its own condition_variable implementation. I'm not sure whether I'm qualified to do that though. :)

If we wanted to do something in the short term then we could consider flipping the default clock for std::condition_variable to be std::chrono::steady_clock where it is available. I suspect that most code is using a relative timeout anyway and isn't really expecting the timeout to change when the system clock changes.
Comment 15 Mike Crowe 2017-09-30 16:13:32 UTC
(In reply to Mike Crowe from comment #14)
> I submitted an RFC glibc patch last year:
> http://patchwork.ozlabs.org/project/glibc/list/?submitter=66786

I submitted an updated version earlier this year:

 https://sourceware.org/ml/libc-alpha/2017-06/msg01111.html

There are a few minor niggles to address, but it seems that the greater hurdle now is that two respondents have requested that I raise this with the Austin Group. So, I have done so at:

 http://austingroupbugs.net/view.php?id=1164

If this attempt fails, then I think it would be preferable to switch libstdc++ std::condition_variable to use std::chrono::steady_clock as its primary clock (along with the necessary changes in gthreads to support creating a condition variable using CLOCK_MONOTONIC) and convert all other clocks to that clock. I personally think it's more surprising that:

 cv.wait_for(std::chrono::seconds(2));

risks waiting for potentially huge amount of time if someone changes the system clock, than

 cv.wait_until(std::chrono::system_clock::now() + std::chrono::seconds(2));

waiting for two seconds if someone changes the system clock just afterwards so that the wait should return immediately.
Comment 16 Austin Beer 2019-03-24 01:11:00 UTC
Back in 2017 I spent a lot of time (along with the maintainer) testing and fixing time-related QOI issues in the Boost.Thread library, which were included in Boost 1.67. The two main solutions we employed were:

1. We used chrono::steady_clock as the primary clock inside condition_variable.
2. When converting from another clock to chrono::steady_clock in *_until() functions, we polled the other clock periodically (i.e. every 100 milliseconds) in case the other clock had jumped.

Here is the PR, the test code, and the final test results after fixing as much as we could.

https://github.com/boostorg/thread/pull/142
https://github.com/austin-beer/test-time-jumps/blob/master/test_time_jumps.cpp
https://github.com/austin-beer/test-time-jumps/blob/master/linux_results_develop1_default.txt

I've also created a similar test for libstdc++ and uploaded those results. I tested against GCC 8.3.0.

https://github.com/austin-beer/test-time-jumps/blob/master/test_time_jumps_libstdcxx.cpp
https://github.com/austin-beer/test-time-jumps/blob/master/test_time_jumps_libstdcxx_results.txt

So I support Mike Crowe's suggestion that std::condition_variable be changed to use std::chrono::steady_clock.
Comment 17 Mike Crowe 2019-03-25 11:21:28 UTC
In https://sourceware.org/ml/libc-alpha/2019-02/msg00637.html , I proposed the addition of the pthread_cond_clockwait function (among others) to glibc, and whilst there are a few comments on the patches, there's been no fundamental objections. I'm just working on sorting out some things in the test suite before these patches can land.

Once this function is available in glibc, it will be relatively straightforward to support both steady_clock and system_clock correctly in condition_variable.

After many years of me letting this languish, I'm optimistic that we can get this fixed properly before the tenth anniversay of Aaron's original bug submission!

However, in the meantime, I wouldn't object to using steady_clock by default.
Comment 18 Mike Crowe 2019-08-17 08:45:53 UTC
(In reply to Mike Crowe from comment #17)
> In https://sourceware.org/ml/libc-alpha/2019-02/msg00637.html , I proposed
> the addition of the pthread_cond_clockwait function (among others) to glibc,
> and whilst there are a few comments on the patches, there's been no
> fundamental objections. I'm just working on sorting out some things in the
> test suite before these patches can land.

This has landed in glibc 2.30.

> Once this function is available in glibc, it will be relatively
> straightforward to support both steady_clock and system_clock correctly in
> condition_variable.

Patches posted at https://gcc.gnu.org/ml/libstdc++/2019-07/msg00044.html .

> After many years of me letting this languish, I'm optimistic that we can get
> this fixed properly before the tenth anniversary of Aaron's original bug
> submission!

There's still time! :)
Comment 19 Jonathan Wakely 2019-09-04 22:44:00 UTC
Author: redi
Date: Wed Sep  4 22:43:29 2019
New Revision: 275390

URL: https://gcc.gnu.org/viewcvs?rev=275390&root=gcc&view=rev
Log:
PR libstdc++/41861 Add full steady_clock support to condition_variable

The pthread_cond_clockwait function is available in glibc since the 2.30
release. If this function is available in the C library it can be used
to fix PR libstdc++/41861 by supporting std::chrono::steady_clock
properly with std::condition_variable.

This means that code using std::condition_variable::wait_for or
std::condition_variable::wait_until with std::chrono::steady_clock is no
longer subject to timing out early or potentially waiting for much
longer if the system clock is warped at an inopportune moment.

If pthread_cond_clockwait is available then std::chrono::steady_clock is
deemed to be the "best" clock available which means that it is used for
the relative wait_for calls and absolute wait_until calls using
user-defined clocks. Calls explicitly using std::chrono::system_clock
continue to use CLOCK_REALTIME via __gthread_cond_timedwait.

If pthread_cond_clockwait is not available then
std::chrono::system_clock is deemed to be the "best" clock available
which means that the previous suboptimal behaviour remains.

2019-09-04  Mike Crowe  <mac@mcrowe.com>

	PR libstdc++/41861
	* acinclude.m4 (GLIBCXX_CHECK_PTHREAD_COND_CLOCKWAIT): Check for new
	pthread_cond_clockwait function.
	* configure.ac: Use GLIBCXX_CHECK_PTHREAD_COND_CLOCKWAIT.
	* configure: Regenerate.
	* config.h.in: Regenerate.
	* include/std/condition_variable: (condition_variable): Rename
	__steady_clock_t typedef and add system_clock. Change __clock_t to be
	a typedef for the preferred clock to convert arbitrary other clocks to.
	[_GLIBCXX_USE_PTHREAD_COND_CLOCKWAIT] (wait_until): Add a steady_clock
	overload.
	(wait_until): Change __clock_t overload to use system_clock.
	[_GLIBCXX_USE_PTHREAD_COND_CLOCKWAIT] (__wait_until_impl): Add
	steady_clock overload that calls pthread_cond_clockwait.
	(__wait_until_impl): Change __clock_t overload to use system_clock.
	(condition_variable_any) [_GLIBCXX_USE_PTHREAD_COND_CLOCKWAIT]: Use
	steady_clock for __clock_t if pthread_cond_clockwait is available.

Modified:
    trunk/libstdc++-v3/ChangeLog
    trunk/libstdc++-v3/acinclude.m4
    trunk/libstdc++-v3/config.h.in
    trunk/libstdc++-v3/configure
    trunk/libstdc++-v3/configure.ac
    trunk/libstdc++-v3/include/std/condition_variable
Comment 20 Jonathan Wakely 2019-09-04 22:47:16 UTC
Fixed for GCC 10 by Mike's patches.