This is the mail archive of the
mailing list for the GCC project.
Re: Measuring short times
Dennis Clarke wrote:
On IA-32 CPUs (Pentium or K5 and above) rdtsc returns the content of the
64-bit CPU-internal clock counter, which is set to 0 at system start and
incremented by one on every clock tick. This is guaranteed to be not
only monotonic, but even strongly monotonic. Moreover, even on a 4.2 GHz
system it would take 2^32 seconds (>130 years) for the counter to
overflow. For all practical purposes, this should be fine :-)
Dennis, those Solaris functions were not useful for me because I had
no idea how to make them work on Windows.
How do you know that those functions are monotonic and consistent?
The only caveat is that it depends on the CPU clock speed. Not only is
the base clock speed different on every system, it may even change over
time due to power-saving strategies. Nevertheless, for short time
periods this is the best method to use on Intel.
Unfortunately, the rdtsc instruction is not a pipeline sequence point.
This means that the CPU is free to reorder it with other instructions in
the pipeline. Thus, if you want to measure really short portions of
code, the results can be "surprising". To prevent this, one has to
insert a sequence point manually. The cpuid instruction does a good job
here. However, this is clearly only necessary for really short periods,
periods way below the 500 ns the OP asked for.
So how to determine the clock speed?
1) On Windows, this is measured by the system and stored somewhere
in the registry. I don't know the exact key, but if you look around a
bit or google for it you should be able to find it.
2) An alternative would be to wait for a defined amount of time (say,
100 msec) and read the rdtsc value before and afterwards to
calculate the CPU frequency.
For this you have to:
a) give your thread a high priority: SetThreadPriority(...),
b) set the system's internal timer to high precision: timeBeginPeriod(1),
c) do a Sleep(100).
With timeBeginPeriod(1), the system's internal timers can be expected
to be accurate to one msec, so the Sleep(100) should wait for 100 msec
with an error of < 1 percent.