Bug 15516 - assembly snippets for nano second resolution wall clock time
Summary: assembly snippets for nano second resolution wall clock time
Status: RESOLVED WONTFIX
Alias: None
Product: gcc
Classification: Unclassified
Component: libfortran
Version: 4.0.0
Importance: P2 enhancement
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2004-05-18 13:27 UTC by Helge Avlesen
Modified: 2007-05-19 18:48 UTC
CC List: 3 users

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed: 2005-12-28 06:20:35


Description Helge Avlesen 2004-05-18 13:27:58 UTC
If someone feels like making gfortran's system_clock a nanosecond-resolution
wall clock timer, see this page for some assembly snippets (ia32, ia64):

http://www.ncsa.uiuc.edu/UserInfo/Resources/Hardware/IA32LinuxCluster/Doc/timing.html

(relevant lines pasted below in case the page disappears)


1.5 Linux Assembly code
The highest resolution but least portable timers are the Linux ASM timers. These
routines provide wall clock time.
IA64
In order to use the Linux ASM timer on the IA64 platform (titan), you will need
to compile the routine using the GNU C compiler. The routine is:

unsigned long long int nanotime_ia64(void)
{
    unsigned long long int val;
    __asm__ __volatile__("mov %0=ar.itc" : "=r"(val) :: "memory");
    return(val);
}
IA32
On the IA32 platform (platinum) you can use either the Intel or GNU compiler.
The routine is:

unsigned long long int nanotime_ia32(void)
{
    unsigned long long int val;
    /* "=A" selects the edx:eax pair, which holds the 64-bit result on 32-bit x86 only */
    __asm__ __volatile__("rdtsc" : "=A" (val));
    return(val);
}

You can link to the resulting object file with either Intel or GNU compiler from
C or Fortran with the appropriate wrapper if needed. If you extract the object
file get_clockfreq.o from /usr/lib/librt.a then you can call the function
__get_clockfreq() to determine clock frequency. To extract the routine, try:

    ar xv /usr/lib/librt.a get_clockfreq.o

To use the routines as timers, you can use the following routine. Call it before
and after the section of code you want to time and the difference will be the
elapsed time. Be sure to include the appropriate routine from above.


static long int CPS;
static double iCPS;
static unsigned start=0;

/* CPU Clock Freq. in Hz from routine in /usr/lib/librt.a */  
/* extern unsigned long long int __get_clockfreq(void); */ 

double second(void) /* Include an '_' if you will be calling from Fortran */
{
  double foo = 0.0; /* stays 0.0 until one of the lines below is uncommented */
  if (!start)
  {
     /* CPU Clock Freq. in Hz from routine in /usr/lib/librt.a */
     /* CPS=__get_clockfreq(); */
     /* CPU Clock Freq. in Hz taken from /proc/cpuinfo */
     CPS=800134992;
     iCPS=1.0/(double)CPS;
     start=1;
  }

  /* Uncomment one of the following */
  /* foo=iCPS*nanotime_ia32(); */  /* If running on IA32 machine */
  /* foo=iCPS*nanotime_ia64(); */  /* If running on IA64 machine */

  return(foo);
}
Comment 1 Andrew Pinski 2004-05-18 14:02:31 UTC
Confirmed, and here is the PPC32 one too (but I do not know how to get the timebase frequency).
Note: long really needs to be a 32-bit value here, and I wrote this from memory:

unsigned long long GetTimebase(void)
{
  unsigned long low;
  unsigned long high;
  unsigned long high1;
  do
  {
    asm volatile ("mftbu %0":"=r"(high));
    asm volatile ("mftb %0":"=r"(low));
    asm volatile ("mftbu %0":"=r"(high1));
  } while (high != high1);
  return ((unsigned long long)high << 32) | (unsigned long long)low;
}

PPC64 (easier as mftb gives the full timebase register for 64bit processors):

unsigned long long GetTimebase(void)
{
  unsigned long long timebase;
  asm volatile ("mftb %0":"=r"(timebase));
  return timebase;
}
Comment 2 Joost VandeVondele 2007-02-11 10:55:44 UTC
(In reply to comment #0)
> If you extract the object
> file get_clockfreq.o from /usr/lib/librt.a then you can call the function
> __get_clockfreq() to determine clock frequency. To extract the routine, try:
> 
>     ar xv /usr/lib/librt.a get_clockfreq.o
> 
> To use the routines as timers, you can use the following routine. Call it before
> and after the section of code you want to time and the difference will be the
> elapsed time. Be sure to include the appropriate routine from above.

Is this comment about get_clockfreq.o actually correct? I find it returns different values depending on the load of the machine (I guess this is frequency scaling at work), e.g.:

 46799775 1596000000 0.029323167293233084
 46703250 1596000000 0.029262687969924813
 40773807 1596000000 0.02554749812030075
 34589439 2394000000 0.014448387218045113
 33201315 1596000000 0.020802828947368422
 34758144 2394000000 0.014518857142857142
 33325110 1596000000 0.020880394736842105
 34576236 2394000000 0.014442872180451127

where the first number is the tick count obtained as the difference of two nanotime_ia32 calls, the second is the value returned by get_clockfreq, and the third is the estimated time in seconds (quite random, since it is always the same matrix multiply). (An unrelated issue is that it wraps pretty quickly...)
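
To make the arithmetic behind the third column explicit, here is a small sketch in C. The helper name ticks_to_seconds is hypothetical, and it assumes the counter ticks at the reported frequency for the whole interval, which is the assumption that frequency scaling breaks.

```c
#include <stdint.h>

/* Hypothetical helper: convert a tick delta to seconds, assuming the
   counter ran at a fixed frequency of hz ticks per second throughout. */
double ticks_to_seconds(uint64_t ticks, uint64_t hz)
{
    return (double)ticks / (double)hz;
}
```

With the first and fourth rows above, ticks_to_seconds(46799775, 1596000000) gives about 0.0293 s and ticks_to_seconds(34589439, 2394000000) gives about 0.0144 s, so the estimate depends on which frequency happens to be reported.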
Comment 3 Helge.Avlesen@bccs.uib.no 2007-02-12 10:03:24 UTC
Subject: Re:  assembly snippets for nano second resolution wall clock time

"jv244 at cam dot ac dot uk" <gcc-bugzilla@gcc.gnu.org> writes:

> is this comment about get_clockfreq.o actually correct ? I find it returns
> different values depending on the load of the machine (I guess this is
> frequency rescaling at work, i.e.):

Yup, it is rescaling. It should be turned off if you want reliable
high-res measurements.

Helge
Comment 4 Janne Blomqvist 2007-05-19 17:51:35 UTC
There are enough pitfalls with using rdtsc that I don't think it's justifiable to use it for a general purpose timing routine like system_clock. See e.g.:

http://www.ussg.iu.edu/hypermail/linux/kernel/0505.1/1463.html

http://en.wikipedia.org/wiki/RDTSC

http://lkml.org/lkml/2005/11/4/173

http://lwn.net/Articles/209101/

For a general purpose library like libgfortran I think the best way is to use something reasonably portable and consistent, and let the kernel people worry about providing a usable api such as gettimeofday or clock_gettime using whatever HW they deem the most appropriate behind the scenes.
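
As a rough illustration of the portable route suggested here, the following is a minimal C sketch, assuming a POSIX system that provides clock_gettime with CLOCK_MONOTONIC. The helper names wallclock_ns and elapsed_seconds are hypothetical and are not part of libgfortran.

```c
#define _POSIX_C_SOURCE 199309L
#include <stdint.h>
#include <time.h>

/* Hypothetical helper: read a kernel-provided monotonic clock and return
   the value in nanoseconds. Returns 0 if clock_gettime fails. */
uint64_t wallclock_ns(void)
{
    struct timespec ts;
    if (clock_gettime(CLOCK_MONOTONIC, &ts) != 0)
        return 0;
    return (uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec;
}

/* Hypothetical helper: elapsed time in seconds between two readings. */
double elapsed_seconds(uint64_t start_ns, uint64_t end_ns)
{
    return (double)(end_ns - start_ns) / 1e9;
}
```

Call wallclock_ns() before and after the section of code to time and pass both readings to elapsed_seconds(). The actual resolution depends on the kernel and hardware; CLOCK_REALTIME could be used instead if calendar time is needed, at the cost of being subject to clock adjustments.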
Comment 5 Francois-Xavier Coudert 2007-05-19 18:45:59 UTC
(In reply to comment #4)
> For a general purpose library like libgfortran I think the best way is to use
> something reasonably portable and consistent

I agree. Do we close this as WONTFIX?
Comment 6 Jerry DeLisle 2007-05-19 18:48:37 UTC
Yes, agree. Closing