Bug 59177 - steady_clock::now() and system_clock::now do not use the vdso (and are therefore very slow)
Summary: steady_clock::now() and system_clock::now do not use the vdso (and are theref...
Status: RESOLVED WORKSFORME
Alias: None
Product: gcc
Classification: Unclassified
Component: libstdc++ (show other bugs)
Version: 4.8.3
: P3 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2013-11-18 18:22 UTC by Andy Lutomirski
Modified: 2013-11-19 00:45 UTC (History)
1 user (show)

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Andy Lutomirski 2013-11-18 18:22:19 UTC
std::chrono::steady_clock::now() does this:

#ifdef _GLIBCXX_USE_CLOCK_GETTIME_SYSCALL
      syscall(SYS_clock_gettime, CLOCK_MONOTONIC, &tp);
#else
      clock_gettime(CLOCK_MONOTONIC, &tp);
#endif

I'm not sure what the intent of this condition is, but the effect is that glibc's clock_gettime (which is very carefully optimized and avoids using syscalls) is ignored in favor of using syscall(2) (which is very slow).

This appears to have been introduced by this commit:

http://gcc.gnu.org/ml/libstdc++/2013-05/txtfX2KusGj9C.txt

It's a serious slowdown:

steady_clock::now(): 0.0933114µs per iteration
clock_gettime: 0.0230129µs per iteration

as measured by:

#include <chrono>
#include <iostream>
#include <time.h>
using namespace std;

constexpr int iters = 10000;
typedef chrono::duration<double> dsecs;

int main()
{
  auto start = chrono::steady_clock::now();
  for (int i = 0; i < iters; i++)
    chrono::steady_clock::now();
  auto end = chrono::steady_clock::now();

  std::cout << "steady_clock::now(): " << 1e6 * chrono::duration_cast<dsecs>(end-start).count() / iters << "µs per iteration\n";

  start = chrono::steady_clock::now();
  timespec ts;
  for (int i = 0; i < iters; i++)
    clock_gettime(CLOCK_MONOTONIC, &ts);
  end = chrono::steady_clock::now();

  std::cout << "clock_gettime: " << 1e6 * chrono::duration_cast<dsecs>(end-start).count() / iters << "µs per iteration\n";
}

system_clock appears to behave identically.
Comment 1 Paolo Carlini 2013-11-18 18:37:12 UTC
CC-ing Jakub, then.
Comment 2 Jakub Jelinek 2013-11-18 18:45:41 UTC
Just install glibc 2.17 or later if you care about the speed, otherwise it is fixing much more important problem that the clocks had ABI incompatible settings depending on common libstdc++ configure options.  If you do have glibc 2.17 or later, clock_gettime(CLOCK_MONOTONIC, &tp); should be used (if not, that would be a bug somewhere), but earlier glibcs have clock_gettime defined only in librt, which is undesirable to require.
Comment 3 Andy Lutomirski 2013-11-18 19:15:19 UTC
I can't get gcc trunk to build right now, but I just distcleaned and rebuilt the 4.8 branch truck on Fedora 19, which has glibc-2.17-19.fc19.x86_64.  It defines _GLIBCXX_USE_CLOCK_GETTIME_SYSCALL.  This happens because:

configure:19349: checking for clock_gettime, nanosleep and sched_yield
configure:19378: result: no

I think that the underlying problem is that --enable-libstdcxx-time defaults to "no".  Shouldn't it default to "yes" (and hence run the fancy configure checks)?  Configuring with ./configure --enable-libstdcxx-time does the right thing.
Comment 4 Jakub Jelinek 2013-11-18 19:24:17 UTC
The default for --enable-libstdcxx-time is =auto, which should be the right thing:
      gnu* | linux* | kfreebsd*-gnu | knetbsd*-gnu)
        AC_MSG_CHECKING([for at least GNU libc 2.17])
        AC_TRY_COMPILE(
          [#include <features.h>],
          [
          #if ! __GLIBC_PREREQ(2, 17)
          #error
          #endif
          ],
          [glibcxx_glibc217=yes], [glibcxx_glibc217=no])
        AC_MSG_RESULT($glibcxx_glibc217)

        if test x"$glibcxx_glibc217" = x"yes"; then
          ac_has_clock_monotonic=yes
          ac_has_clock_realtime=yes
        fi
        ac_has_nanosleep=yes
        ac_has_sched_yield=yes


My trunk build on Fedora 19 today has:
/* Defined if clock_gettime syscall has monotonic and realtime clock support.
   */
/* #undef _GLIBCXX_USE_CLOCK_GETTIME_SYSCALL */

/* Defined if clock_gettime has monotonic clock support. */
#define _GLIBCXX_USE_CLOCK_MONOTONIC 1

/* Defined if clock_gettime has realtime clock support. */
#define _GLIBCXX_USE_CLOCK_REALTIME 1
and there are two clock_gettime calls in libstdc++.so.6 as they should be.
Comment 5 Andy Lutomirski 2013-11-18 19:46:19 UTC
Aha.  I think the issue is only in 4.8.  libstdc++-v3/configure.ac says:

# For clock_gettime, nanosleep and sched_yield support.
# NB: The default is [no], because otherwise it requires linking.
GLIBCXX_ENABLE_LIBSTDCXX_TIME([no])

Aside from the typo in the comment (the end of the sentence seems to have disappeared), I'm still confused.  Even in the 4.8 branch, enable_libstdcxx_time=yes doesn't appear to link in librt; only enable_libstdcxx_time=rt seems to have that effect.

(Please accept my apologies if I'm just completely confused about how autoconf works here.)
Comment 6 Jakub Jelinek 2013-11-18 19:50:47 UTC
Sure, for GCC 4.8.1 we've only changed minimum things that were required for correct operation and ABI compatibility, it didn't receive the extra changes for =auto.
Comment 7 Andy Lutomirski 2013-11-18 19:53:14 UTC
I guess what I'm saying is: what's wrong with "yes" in 4.8?  From looking at the code, it still seems like it does the right thing (i.e. not using clock_gettime if it's not in [posix4]).
Comment 8 Jakub Jelinek 2013-11-18 19:58:44 UTC
For hosted linux --enable-libstdcxx-time can work just fine, just use it.
But, such a change isn't desirable for the branch, e.g. because the fancy checks require link tests which aren't usable for some bare metal configurations, might not be the right thing for Solaris or whatever else, etc.
=auto has been introduced exactly to get some reasonably safe defaults automatically by default.  So, if you want to change anything on the branch, supposedly it would be partial backport of the =auto stuff, but only limit it to the most commonly tested targets or something.  Still, I'd say it is better to just tell people who want it to --enable-libstdcxx-time themselves if they want.
In the past it has been ABI incompatible option, now hopefully is not.
Comment 9 Andy Lutomirski 2013-11-18 20:03:01 UTC
Given that this is C++11-only, it's already fixed on trunk, it's only a performance issue (as opposed to correctness), and it's more complicated than just changing the default, I won't argue for a backport.

From my perspective, feel free to close this.
Comment 10 Paolo Carlini 2013-11-18 20:15:56 UTC
Ok, thanks.
Comment 11 Jonathan Wakely 2013-11-19 00:15:54 UTC
(In reply to Andy Lutomirski from comment #5)
> # For clock_gettime, nanosleep and sched_yield support.
> # NB: The default is [no], because otherwise it requires linking.
> GLIBCXX_ENABLE_LIBSTDCXX_TIME([no])
> 
> Aside from the typo in the comment (the end of the sentence seems to have
> disappeared),

No, the sentence is complete. Comment 8 explains that we don't want to do tests that require running the linker for the default configuration, because that's not possible for bare metal targets.
Comment 12 Andy Lutomirski 2013-11-19 00:45:08 UTC
D'oh!  I assumed that "it" was the resulting library (and thus that it would require linking *against librt*), not that "it" was the configure test.