<mutex> looks like this: 171 #ifdef _GLIBCXX_USE_CLOCK_MONOTONIC 172 typedef chrono::monotonic_clock __clock_t; 173 #else 174 typedef chrono::high_resolution_clock __clock_t; 175 #endif ...but <condition_variable> only has this: 56 typedef chrono::system_clock __clock_t; Shouldn't <condition_variable> be using the monotonic_clock if available, just like <mutex>?
Chris, can you have a look to this issue? Thanks.
Chris??
Yes, I'm alive! Starting to get back into the GCC swing of things. Ok, <condition_variable> and <mutex> and clocks. Its a bit of a tricky situation, reading current standard draft and other related docs (i.e. posix) to get myself back up to speed. In my preliminary investigations there seems to be some issues in assuming what "epoch" means when using gthreads and what we assume is a "known" clock. I think condition_variable is more correct than mutex using system_clock as its "known" clock. Currently mutex doesn't attempt "sync" unknown clocks to a known clock in its *_until/for functions like condition_variable. This could potentially be an issue for gthread implementations (and posix implementations for that matter) where the epoch for system_clock, monotonic and high_res clocks are all different (let alone user-defined clocks). So, in condition_variable at least, the assumption is the system_clock epoch (system_clock::time_point()) == gthread's expected epoch. Taking the assumption that system_clock::time_point() == gthread's epoch, in mutex, it erroneously assumes that monotic_clock::time_point() == system_clock::time_point() but a quick reading of the POSIX docs shows: "For [the monotonic] clock, the value returned by clock_gettime() represents the amount of time (in seconds and nanoseconds) since an unspecified point in the past (for example, system start-up time, or the Epoch)." (http://www.opengroup.org/onlinepubs/009695399/functions/clock_gettime.html) For the POSIX gthread_cond* implementation, the POSIX standard suggests that, if Clock Selection is available you should set the appropriate condition attribute (ex. pthread_condattr_setclock if available). For mutexes (from pthread_mutex_timedlock), "If the Timers option is supported, the timeout shall be based on the CLOCK_REALTIME clock; if the Timers option is not supported, the timeout shall be based on the system clock as returned by the time() function." This almost exactly describes system_clock other than the fact that it might use gettimeofday before using time() (i.e. std::time()), which is still realative to the POSIX epoch so I think thats ok ... for posix gthreads impl. Long story short, using monotonic_clock as a "known" clock in *mutex* is almost certainly incorrect since, in POSIX, its absolute value is meaningless (epoch is arbitrary). It would be more correct to sync the monotonic_clock to system_clock like condition_varaible does. 30.2.4p4 says implementations should use a monotonic clock for the *_for functions that take a relative time. Unfortunately, gthreads (and its POSIX impl) use absolute time and assumes times are relative to the POSIX epoch. Let me do a bit more research and poke Howard H. again for some clarification on this before moving forward. Chris
See also: http://www.open-std.org/jtc1/sc22/wg21/docs/lwg-active.html#887 http://home.roadrunner.com/~hinnant/issue_review/lwg-active.html#887
Thanks for your feedback Chris. Gosh, that issue, quite a bit of time spent in Santa Cruz, Detlef arguing was not implementable and Howard disagreeing...
So it appears that the problem is gthreads. The monotonic_clock support is purely superficial in gcc until gthreads supports such a concept. Developers will need to create their own clock and modify the standard library headers to use it should they require a reasonable level of reliability in the face of a possibly-changing system clock. But I think the Howard/Detlef debate is a separate issue. I believe they have determined that a condition_variable (and mutex) must continue to use a specific clock once the object is created, and to sync all given time points to that clock, and are arguing over whether or not that is implementable. No big deal. I just don't believe there is any particular requirement that it be the system_clock (and, if there were, I would think that to be a big mistake). In almost every project I've worked on, our purposes would be much better served if a monotonic_clock were used instead. Rarely do we care what the epoch is. What we do care about is timer reliability even when NTP (or some other mechanism) is changing the clock. But that's just my experience. Thanks for looking into this. I'm hoping for a resolution that doesn't make <condition_variable> and <mutex> all but useless as provided by the standard library sans modification. The boost team has already made some egregious mistakes in this area.
It seems to me that if DR887 is indeed resolved per the discussion in Santa Cruz, thus the wait_for functions removed, the wait_until functions use system_clock, then this issue could be immediately resolved. In other terms, I would add [DR 887] to the Subject and suspend. Agreed?
887 is NAD, reopening. This is a QoI issue, but even if monotonic clock support was added to gthreads (by which I assume people mean providing an API for pthread_condattr_setclock) we couldn't meet all the QoI suggestions. The standard says implementations should use a steady clock for xxx_for, but should use the user-provided clock for xxx_until. It's not possible to do both on POSIX, because a condition variable has a single clock, set on construction, and mutexes always use the system clock.
It seems that there's been lots of talk about this but no firm solution. Here's some interesting links: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2009/n2999.html http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2015/n4486.html (search for "887") https://www.redhat.com/archives/posix-c++-wg/2009-July/msg00002.html (search for "887") https://github.com/cplusplus/LWG/blob/master/xml/issue0887.xml To me they seem to boil down to three points of view: 1. The current implementation of converting all other clocks to the system_clock is acceptable behaviour. Unfortunately this is racey when the system clock is changed and could cause the caller to wait for far longer than requested. 2. Implementers should add an extra constructor parameter (or perhaps template parameter) to condition_variable to allow the master clock for the condition_variable to be configured. All clocks would then be converted to this clock. This would allow the client code to express its intent but it seems ugly to require platform-specific code to make the standard behaviour actually work. 3. condition_variable should support wait_until using at least steady_clock (CLOCK_MONOTONIC) and system_clock (CLOCK_REALTIME.) Relative wait operations should use steady_clock. User-defined clocks should probably convert to steady_clock. I believe that only option 3 makes any sense but this requires an equivalent to pthread_cond_timedwait that supports specifying the clock. The glibc implementation of such a function would be straightforward.
(In reply to Mike Crowe from comment #9) > 3. condition_variable should support wait_until using at least steady_clock > (CLOCK_MONOTONIC) and system_clock (CLOCK_REALTIME.) Relative wait > operations should use steady_clock. User-defined clocks should probably > convert to steady_clock. > > I believe that only option 3 makes any sense but this requires an equivalent > to pthread_cond_timedwait that supports specifying the clock. The glibc > implementation of such a function would be straightforward. I've proposed a patch that implements this option at: https://gcc.gnu.org/ml/libstdc++/2015-07/msg00024.html and the required glibc change at: https://sourceware.org/ml/libc-alpha/2015-07/msg00193.html
Aaron, your reply got added to Bug 887 for some reason. Re-posting Aaron's comment here: >Thanks. I had already patched our gcc so that gthreads cond always gets >initialized with CLOCK_MONOTONIC, then I switched __clock_t in >condition_variable to steady_clock. It was a very simple change and works >well but not nearly as portable as yours. > >I also disabled all timed waits on mutex (gcc already has ifdef for that) >in order to avoid problems there. In my opinion, people shouldn't be using >timed waits on mutexes anyway, since they are not cooperatively >interruptible. If we did need them for some reason, I would reimplement >timed mutex in terms of condition_variable and a regular mutex. > >It seems strange that this is no big deal to lots of people.
Sorry if it is inappropriate to ask for any changes, but how can it be, that there is no fix for this bug for years in any of the GCC releases? With this bug condition_variable::wait_until is completely unusable on many targets with non monotonic system clocks, e.g. because they only do connect to any time server from time to time, or targets that get their time via any other method from a system that only connects from time to time. I'm sure this affects e.g. many embedded systems. If I can be of any help to get any fix into the release branches please tell me. Be it testing or debugging.
(In reply to Roman Fietze from comment #12) > Sorry if it is inappropriate to ask for any changes, but how can it be, that > there is no fix for this bug for years in any of the GCC releases? Because it's not possible to implement the C++ requirements purely in terms of POSIX, so it requires a new API in the C library, which is complicated. All the information you need to investigate that is provided in this bug report and the enclosed links. > With this bug condition_variable::wait_until is completely unusable on many I find that hard to believe.
(In reply to Jonathan Wakely from comment #13) > (In reply to Roman Fietze from comment #12) > > Sorry if it is inappropriate to ask for any changes, but how can it be, that > > there is no fix for this bug for years in any of the GCC releases? > > Because it's not possible to implement the C++ requirements purely in terms > of POSIX, so it requires a new API in the C library, which is complicated. > All the information you need to investigate that is provided in this bug > report and the enclosed links. I submitted an RFC glibc patch last year: http://patchwork.ozlabs.org/project/glibc/list/?submitter=66786 There were some objections but no-one seemed to outright say no. Unfortunately it is blocked waiting for Torvald Riegel's removal of the assembly "optimisation" for x86 and x86_64 before it can go in. This seems to be taking longer than I expected when I wrote the patch. > > With this bug condition_variable::wait_until is completely unusable on many > > I find that hard to believe. Well, it does make it unsafe to use in an environment where CLOCK_REALTIME can change arbitrarily which some may consider to equate to "unusable". However, in the intervening time I've become aware of the std::synchronic proposal ( http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2016/p0126r2.pdf ) and I've been wondering whether libstdc++ would be better off implementing support for that and then using it to build its own condition_variable implementation. I'm not sure whether I'm qualified to do that though. :) If we wanted to do something in the short term then we could consider flipping the default clock for std::condition_variable to be std::chrono::steady_clock where it is available. I suspect that most code is using a relative timeout anyway and isn't really expecting the timeout to change when the system clock changes.
(In reply to Mike Crowe from comment #14) > I submitted an RFC glibc patch last year: > http://patchwork.ozlabs.org/project/glibc/list/?submitter=66786 I submitted an updated version earlier this year: https://sourceware.org/ml/libc-alpha/2017-06/msg01111.html There are a few minor niggles to address, but it seems that the greater hurdle now is that two respondents have requested that I raise this with the Austin Group. So, I have done so at: http://austingroupbugs.net/view.php?id=1164 If this attempt fails, then I think it would be preferable to switch libstdc++ std::condition_variable to use std::chrono::steady_clock as its primary clock (along with the necessary changes in gthreads to support creating a condition variable using CLOCK_MONOTONIC) and convert all other clocks to that clock. I personally think it's more surprising that: cv.wait_for(std::chrono::seconds(2)); risks waiting for potentially huge amount of time if someone changes the system clock, than cv.wait_until(std::chrono::system_clock::now() + std::chrono::seconds(2)); waiting for two seconds if someone changes the system clock just afterwards so that the wait should return immediately.
Back in 2017 I spent a lot of time (along with the maintainer) testing and fixing time-related QOI issues in the Boost.Thread library, which were included in Boost 1.67. The two main solutions we employed were: 1. We used chrono::steady_clock as the primary clock inside condition_variable. 2. When converting from another clock to chrono::steady_clock in *_until() functions, we polled the other clock periodically (i.e. every 100 milliseconds) in case the other clock had jumped. Here is the PR, the test code, and the final test results after fixing as much as we could. https://github.com/boostorg/thread/pull/142 https://github.com/austin-beer/test-time-jumps/blob/master/test_time_jumps.cpp https://github.com/austin-beer/test-time-jumps/blob/master/linux_results_develop1_default.txt I've also created a similar test for libstdc++ and uploaded those results. I tested against GCC 8.3.0. https://github.com/austin-beer/test-time-jumps/blob/master/test_time_jumps_libstdcxx.cpp https://github.com/austin-beer/test-time-jumps/blob/master/test_time_jumps_libstdcxx_results.txt So I support Mike Crowe's suggestion that std::condition_variable be changed to use std::chrono::steady_clock.
In https://sourceware.org/ml/libc-alpha/2019-02/msg00637.html , I proposed the addition of the pthread_cond_clockwait function (among others) to glibc, and whilst there are a few comments on the patches, there's been no fundamental objections. I'm just working on sorting out some things in the test suite before these patches can land. Once this function is available in glibc, it will be relatively straightforward to support both steady_clock and system_clock correctly in condition_variable. After many years of me letting this languish, I'm optimistic that we can get this fixed properly before the tenth anniversay of Aaron's original bug submission! However, in the meantime, I wouldn't object to using steady_clock by default.
(In reply to Mike Crowe from comment #17) > In https://sourceware.org/ml/libc-alpha/2019-02/msg00637.html , I proposed > the addition of the pthread_cond_clockwait function (among others) to glibc, > and whilst there are a few comments on the patches, there's been no > fundamental objections. I'm just working on sorting out some things in the > test suite before these patches can land. This has landed in glibc 2.30. > Once this function is available in glibc, it will be relatively > straightforward to support both steady_clock and system_clock correctly in > condition_variable. Patches posted at https://gcc.gnu.org/ml/libstdc++/2019-07/msg00044.html . > After many years of me letting this languish, I'm optimistic that we can get > this fixed properly before the tenth anniversary of Aaron's original bug > submission! There's still time! :)
Author: redi Date: Wed Sep 4 22:43:29 2019 New Revision: 275390 URL: https://gcc.gnu.org/viewcvs?rev=275390&root=gcc&view=rev Log: PR libstdc++/41861 Add full steady_clock support to condition_variable The pthread_cond_clockwait function is available in glibc since the 2.30 release. If this function is available in the C library it can be used to fix PR libstdc++/41861 by supporting std::chrono::steady_clock properly with std::condition_variable. This means that code using std::condition_variable::wait_for or std::condition_variable::wait_until with std::chrono::steady_clock is no longer subject to timing out early or potentially waiting for much longer if the system clock is warped at an inopportune moment. If pthread_cond_clockwait is available then std::chrono::steady_clock is deemed to be the "best" clock available which means that it is used for the relative wait_for calls and absolute wait_until calls using user-defined clocks. Calls explicitly using std::chrono::system_clock continue to use CLOCK_REALTIME via __gthread_cond_timedwait. If pthread_cond_clockwait is not available then std::chrono::system_clock is deemed to be the "best" clock available which means that the previous suboptimal behaviour remains. 2019-09-04 Mike Crowe <mac@mcrowe.com> PR libstdc++/41861 * acinclude.m4 (GLIBCXX_CHECK_PTHREAD_COND_CLOCKWAIT): Check for new pthread_cond_clockwait function. * configure.ac: Use GLIBCXX_CHECK_PTHREAD_COND_CLOCKWAIT. * configure: Regenerate. * config.h.in: Regenerate. * include/std/condition_variable: (condition_variable): Rename __steady_clock_t typedef and add system_clock. Change __clock_t to be a typedef for the preferred clock to convert arbitrary other clocks to. [_GLIBCXX_USE_PTHREAD_COND_CLOCKWAIT] (wait_until): Add a steady_clock overload. (wait_until): Change __clock_t overload to use system_clock. [_GLIBCXX_USE_PTHREAD_COND_CLOCKWAIT] (__wait_until_impl): Add steady_clock overload that calls pthread_cond_clockwait. (__wait_until_impl): Change __clock_t overload to use system_clock. (condition_variable_any) [_GLIBCXX_USE_PTHREAD_COND_CLOCKWAIT]: Use steady_clock for __clock_t if pthread_cond_clockwait is available. Modified: trunk/libstdc++-v3/ChangeLog trunk/libstdc++-v3/acinclude.m4 trunk/libstdc++-v3/config.h.in trunk/libstdc++-v3/configure trunk/libstdc++-v3/configure.ac trunk/libstdc++-v3/include/std/condition_variable
Fixed for GCC 10 by Mike's patches.