2002-03-06 03:42:20

by Ben Greear

[permalink] [raw]
Subject: a faster way to gettimeofday?

I have a program that I very often need to calculate the current
time, with milisecond accuracy. I've been using gettimeofday(),
but gprof shows it's taking a significant (10% or so) amount of
time. Is there a faster (and perhaps less portable?) way to get
the time information on x86? My program runs as root, so should
have any permissions it needs to use some backdoor hack if that
helps!

Thanks,
Ben

--
Ben Greear <[email protected]> <Ben_Greear AT excite.com>
President of Candela Technologies Inc http://www.candelatech.com
ScryMUD: http://scry.wanfear.com http://scry.wanfear.com/~greear



2002-03-06 03:55:12

by Davide Libenzi

[permalink] [raw]
Subject: Re: a faster way to gettimeofday?

On Tue, 5 Mar 2002, Ben Greear wrote:

> I have a program that I very often need to calculate the current
> time, with milisecond accuracy. I've been using gettimeofday(),
> but gprof shows it's taking a significant (10% or so) amount of
> time. Is there a faster (and perhaps less portable?) way to get
> the time information on x86? My program runs as root, so should
> have any permissions it needs to use some backdoor hack if that
> helps!

If you're on x86 you can use collect rdtsc samples and convert them to ms.
You'll get even more then ms accuracy.




- Davide



2002-03-06 04:20:57

by Ben Greear

[permalink] [raw]
Subject: Re: a faster way to gettimeofday?



Davide Libenzi wrote:

> On Tue, 5 Mar 2002, Ben Greear wrote:
>
>
>>I have a program that I very often need to calculate the current
>>time, with milisecond accuracy. I've been using gettimeofday(),
>>but gprof shows it's taking a significant (10% or so) amount of
>>time. Is there a faster (and perhaps less portable?) way to get
>>the time information on x86? My program runs as root, so should
>>have any permissions it needs to use some backdoor hack if that
>>helps!
>>
>
> If you're on x86 you can use collect rdtsc samples and convert them to ms.
> You'll get even more then ms accuracy.


Can I do this from user space? If so, any examples or docs
you can point me to?

Also, I'm looking primarily for a speed increase, not an accuracy
increase.

Thanks for the response!

Ben


>
>
>
>
> - Davide
>
>
>
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
>
>


--
Ben Greear <[email protected]> <Ben_Greear AT excite.com>
President of Candela Technologies Inc http://www.candelatech.com
ScryMUD: http://scry.wanfear.com http://scry.wanfear.com/~greear


2002-03-06 04:28:09

by Davide Libenzi

[permalink] [raw]
Subject: Re: a faster way to gettimeofday?

On Tue, 5 Mar 2002, Ben Greear wrote:

>
>
> Davide Libenzi wrote:
>
> > On Tue, 5 Mar 2002, Ben Greear wrote:
> >
> >
> >>I have a program that I very often need to calculate the current
> >>time, with milisecond accuracy. I've been using gettimeofday(),
> >>but gprof shows it's taking a significant (10% or so) amount of
> >>time. Is there a faster (and perhaps less portable?) way to get
> >>the time information on x86? My program runs as root, so should
> >>have any permissions it needs to use some backdoor hack if that
> >>helps!
> >>
> >
> > If you're on x86 you can use collect rdtsc samples and convert them to ms.
> > You'll get even more then ms accuracy.
>
>
> Can I do this from user space? If so, any examples or docs
> you can point me to?
>
> Also, I'm looking primarily for a speed increase, not an accuracy
> increase.


#include <linux/timex.h>


unsigned long long mscurr;
cycles_t cys, cye, mscycles;
struct timespec ts1, ts2;

clock_gettime(CLOCK_REALTIME, &ts1);
cys = get_cycles();
sleep(1);
clock_gettime(CLOCK_REALTIME, &ts2);
cye = get_cycles();
mscycles = (cye - cys) / ((ts2.tv_sec - ts1.tv_sec) * 1000 +
(ts2.tv_nsec - ts1.tv_nsec) / 1000000);



mscurr = ts2.tv_sec * 1000 + ts2.tv_nsec * 1000000 + (get_cycles() - cye) / mscycles;





- Davide


2002-03-06 04:31:39

by Davide Libenzi

[permalink] [raw]
Subject: Re: a faster way to gettimeofday?

On Tue, 5 Mar 2002, Davide Libenzi wrote:

> On Tue, 5 Mar 2002, Ben Greear wrote:
>
> >
> >
> > Davide Libenzi wrote:
> >
> > > On Tue, 5 Mar 2002, Ben Greear wrote:
> > >
> > >
> > >>I have a program that I very often need to calculate the current
> > >>time, with milisecond accuracy. I've been using gettimeofday(),
> > >>but gprof shows it's taking a significant (10% or so) amount of
> > >>time. Is there a faster (and perhaps less portable?) way to get
> > >>the time information on x86? My program runs as root, so should
> > >>have any permissions it needs to use some backdoor hack if that
> > >>helps!
> > >>
> > >
> > > If you're on x86 you can use collect rdtsc samples and convert them to ms.
> > > You'll get even more then ms accuracy.
> >
> >
> > Can I do this from user space? If so, any examples or docs
> > you can point me to?
> >
> > Also, I'm looking primarily for a speed increase, not an accuracy
> > increase.
>
>
> #include <linux/timex.h>
>
>
> unsigned long long mscurr;
> cycles_t cys, cye, mscycles;
> struct timespec ts1, ts2;
>
> clock_gettime(CLOCK_REALTIME, &ts1);
> cys = get_cycles();
> sleep(1);
> clock_gettime(CLOCK_REALTIME, &ts2);
> cye = get_cycles();
> mscycles = (cye - cys) / ((ts2.tv_sec - ts1.tv_sec) * 1000 +
> (ts2.tv_nsec - ts1.tv_nsec) / 1000000);
>
>
>
> mscurr = ts2.tv_sec * 1000 + ts2.tv_nsec * 1000000 + (get_cycles() - cye) / mscycles;

it's obviously ... ts2.tv_nsec / 1000000 ...



- Davide


2002-03-06 16:08:13

by Chris Friesen

[permalink] [raw]
Subject: Re: a faster way to gettimeofday?

Ben Greear wrote:
>
> I have a program that I very often need to calculate the current
> time, with milisecond accuracy. I've been using gettimeofday(),
> but gprof shows it's taking a significant (10% or so) amount of
> time. Is there a faster (and perhaps less portable?) way to get
> the time information on x86? My program runs as root, so should
> have any permissions it needs to use some backdoor hack if that
> helps!


#include <asm/msr.h>

/* get this value from the "cpu MHz" line of /proc/cpuinfo */
#define CLOCKSPEED xxxxxxxx

int main()
{
unsigned int lowbegin, lowend, highbegin, highend;
unsigned long long diff;
double elapsed;

rdtsc(lowbegin,highbegin);

//do stuff

rdtsc(lowend,highend);

if (lowend < lowbegin)
highend--;

diff = (((unsigned long long) highend - highbegin) << 32) + (lowend -
lowbegin);

elapsed = (double) diff / CLOCKSPEED;

/* elapsed now has time in microseconds, do whatever you wantwith it */

return 0;
}


--
Chris Friesen | MailStop: 043/33/F10
Nortel Networks | work: (613) 765-0557
3500 Carling Avenue | fax: (613) 765-2986
Nepean, ON K2H 8E9 Canada | email: [email protected]

2002-03-06 16:55:32

by Richard B. Johnson

[permalink] [raw]
Subject: Re: a faster way to gettimeofday?

On Wed, 6 Mar 2002, Chris Friesen wrote:

> Ben Greear wrote:
> >
> > I have a program that I very often need to calculate the current
> > time, with milisecond accuracy. I've been using gettimeofday(),
> > but gprof shows it's taking a significant (10% or so) amount of
> > time. Is there a faster (and perhaps less portable?) way to get
> > the time information on x86? My program runs as root, so should
> > have any permissions it needs to use some backdoor hack if that
> > helps!
>
>
> #include <asm/msr.h>
>
> /* get this value from the "cpu MHz" line of /proc/cpuinfo */
> #define CLOCKSPEED xxxxxxxx
>
> int main()
> {
> unsigned int lowbegin, lowend, highbegin, highend;
> unsigned long long diff;
> double elapsed;
>
> rdtsc(lowbegin,highbegin);
>
> //do stuff
>
> rdtsc(lowend,highend);
>
> if (lowend < lowbegin)
> highend--;
>
> diff = (((unsigned long long) highend - highbegin) << 32) + (lowend -
> lowbegin);
>
> elapsed = (double) diff / CLOCKSPEED;
>
> /* elapsed now has time in microseconds, do whatever you wantwith it */
>
> return 0;
> }
>
>
> --
> Chris Friesen | MailStop: 043/33/F10
> Nortel Networks | work: (613) 765-0557
> 3500 Carling Avenue | fax: (613) 765-2986
> Nepean, ON K2H 8E9 Canada | email: [email protected]
> -

Also, a bit more accurate is enclosed.

Cheers,
Dick Johnson

Penguin : Linux version 2.4.18 on an i686 machine (799.53 BogoMips).

Bill Gates? Who?


Attachments:
timer.tar.gz (7.55 kB)

2002-03-06 20:45:33

by dean gaudet

[permalink] [raw]
Subject: Re: a faster way to gettimeofday?

On Tue, 5 Mar 2002, Ben Greear wrote:

>
> Davide Libenzi wrote:
>
> > If you're on x86 you can use collect rdtsc samples and convert them to ms.
> > You'll get even more then ms accuracy.
>
>
> Can I do this from user space? If so, any examples or docs
> you can point me to?
>
> Also, I'm looking primarily for a speed increase, not an accuracy
> increase.

ingo started the proper work for this, for example, see:
<http://people.redhat.com/mingo/vsyscall-patches/vsyscall-2.3.32-F4>
(there's a documentation file near the bottom of the patch) but it
doesn't appear to support gettimeofday via rdtsc yet.

-dean


2002-03-06 21:31:46

by Chris Ball

[permalink] [raw]
Subject: Re: a faster way to gettimeofday?

>>>>> "dean" == dean gaudet <[email protected]> writes:

dean> ingo started the proper work for this, for example, see:
dean> <http://people.redhat.com/mingo/vsyscall-patches/vsyscall-2.3.32-F4>
dean> (there's a documentation file near the bottom of the patch)
dean> but it doesn't appear to support gettimeofday via rdtsc yet.

Interesting patch; when last I looked, vsyscalls were only being
implemented on the new 64-bit architectures.

Does this patch break binary compatibility? I seem to recall that being
Andrea's reason for not running vsyscalls on standard x86 back in August
last year.

- Chris.
--
$a="printf.net"; Chris Ball | chris@void.$a | www.$a | finger: chris@$a
"In the beginning there was nothing, which exploded."

2002-03-06 22:26:08

by George Anzinger

[permalink] [raw]
Subject: Re: a faster way to gettimeofday?

Chris Ball wrote:
>
> >>>>> "dean" == dean gaudet <[email protected]> writes:
>
> dean> ingo started the proper work for this, for example, see:
> dean> <http://people.redhat.com/mingo/vsyscall-patches/vsyscall-2.3.32-F4>
> dean> (there's a documentation file near the bottom of the patch)
> dean> but it doesn't appear to support gettimeofday via rdtsc yet.
>
> Interesting patch; when last I looked, vsyscalls were only being
> implemented on the new 64-bit architectures.
>
> Does this patch break binary compatibility? I seem to recall that being
> Andrea's reason for not running vsyscalls on standard x86 back in August
> last year.
>
In keeping with the subject, the gettimeofday call in this patch does
NOT get the current timeofday, but the time it was last updated, which
is usually every 1/HZ. The REAL gettimeofday adds a converted delta of
the TSC to get the time to NOW with a resolution of 1 micro second.
(Oh, and by the way, it does not update xtime in the process of doing
this.)

As a wonder, just how long is your system taking to do a gettimeofday.
Mine does it in under one micro second (800 MHZ PIII).

-g

> - Chris.
> --
> $a="printf.net"; Chris Ball | chris@void.$a | www.$a | finger: chris@$a
> "In the beginning there was nothing, which exploded."
>
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/

--
George [email protected]
High-res-timers: http://sourceforge.net/projects/high-res-timers/
Real time sched: http://sourceforge.net/projects/rtsched/

2002-03-07 00:09:23

by Mark Mielke

[permalink] [raw]
Subject: vsyscalls

On Wed, Mar 06, 2002 at 09:31:10PM +0000, Chris Ball wrote:
> dean> ingo started the proper work for this, for example, see:
> dean> <http://people.redhat.com/mingo/vsyscall-patches/vsyscall-2.3.32-F4>
> dean> (there's a documentation file near the bottom of the patch)
> dean> but it doesn't appear to support gettimeofday via rdtsc yet.
> Interesting patch; when last I looked, vsyscalls were only being
> implemented on the new 64-bit architectures.

> Does this patch break binary compatibility? I seem to recall that being
> Andrea's reason for not running vsyscalls on standard x86 back in August
> last year.

WinNT looks like it supports vsyscalls (or at least SYSENTER).

mark

--
[email protected]/[email protected]/[email protected] __________________________
. . _ ._ . . .__ . . ._. .__ . . . .__ | Neighbourhood Coder
|\/| |_| |_| |/ |_ |\/| | |_ | |/ |_ |
| | | | | \ | \ |__ . | | .|. |__ |__ | \ |__ | Ottawa, Ontario, Canada

One ring to rule them all, one ring to find them, one ring to bring them all
and in the darkness bind them...

http://mark.mielke.cc/

2002-03-07 14:15:23

by Terje Eggestad

[permalink] [raw]
Subject: Re: a faster way to gettimeofday? rdtsc strangeness

One word of caution.

If you have a CPU that begin throttling it usually cut the CPU clock in
half and the rdtsc counter count half a fast.

You *should* actually call gettimeofday every few seconds (> 10) to
correct offset.

If you have a SMP system with impropper cooling you actually get into
trouble since you may have one CPU throttling alot and the other don't.
Then the rdtsc of the two CPU's may be quite different.
Have actually seen machines where sleep 1 took 1.2-1.4 secs!

TJ

On Wed, 2002-03-06 at 05:31, Davide Libenzi wrote:
> On Tue, 5 Mar 2002, Ben Greear wrote:
>
> >
> >
> > Davide Libenzi wrote:
> >
> > > On Tue, 5 Mar 2002, Ben Greear wrote:
> > >
> > >
> > >>I have a program that I very often need to calculate the current
> > >>time, with milisecond accuracy. I've been using gettimeofday(),
> > >>but gprof shows it's taking a significant (10% or so) amount of
> > >>time. Is there a faster (and perhaps less portable?) way to get
> > >>the time information on x86? My program runs as root, so should
> > >>have any permissions it needs to use some backdoor hack if that
> > >>helps!
> > >>
> > >
> > > If you're on x86 you can use collect rdtsc samples and convert them to ms.
> > > You'll get even more then ms accuracy.
> >
> >
> > Can I do this from user space? If so, any examples or docs
> > you can point me to?
> >
> > Also, I'm looking primarily for a speed increase, not an accuracy
> > increase.
>
>
> #include <linux/timex.h>
>
>
> unsigned long long mscurr;
> cycles_t cys, cye, mscycles;
> struct timespec ts1, ts2;
>
> clock_gettime(CLOCK_REALTIME, &ts1);
> cys = get_cycles();
> sleep(1);
> clock_gettime(CLOCK_REALTIME, &ts2);
> cye = get_cycles();
> mscycles = (cye - cys) / ((ts2.tv_sec - ts1.tv_sec) * 1000 +
> (ts2.tv_nsec - ts1.tv_nsec) / 1000000);
>
>
>
> mscurr = ts2.tv_sec * 1000 + ts2.tv_nsec * 1000000 + (get_cycles() - cye) / mscycles;
>
>
>
>
>
> - Davide
>
>
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
--
_________________________________________________________________________

Terje Eggestad mailto:[email protected]
Scali Scalable Linux Systems http://www.scali.com

Olaf Helsets Vei 6 tel: +47 22 62 89 61 (OFFICE)
P.O.Box 150, Oppsal +47 975 31 574 (MOBILE)
N-0619 Oslo fax: +47 22 62 89 51
NORWAY
_________________________________________________________________________

2002-03-07 14:27:25

by Alan

[permalink] [raw]
Subject: Re: a faster way to gettimeofday? rdtsc strangeness

> If you have a CPU that begin throttling it usually cut the CPU clock in
> half and the rdtsc counter count half a fast.

They normally adjust the STPCLK which is just fine. I've only seen a few
laptops that do it other ways. More fun are people running mixed 300/450
BP6 boards where the tsc varies by cpu

2002-03-07 15:44:08

by Terje Eggestad

[permalink] [raw]
Subject: Re: a faster way to gettimeofday? rdtsc strangeness

On Thu, 2002-03-07 at 15:41, Alan Cox wrote:
> > If you have a CPU that begin throttling it usually cut the CPU clock in
> > half and the rdtsc counter count half a fast.
>
> They normally adjust the STPCLK which is just fine. I've only seen a few
> laptops that do it other ways. More fun are people running mixed 300/450
> BP6 boards where the tsc varies by cpu

Can /proc/cpuinfo really be trusted in figuring out how long a cycle is?


--
_________________________________________________________________________

Terje Eggestad mailto:[email protected]
Scali Scalable Linux Systems http://www.scali.com

Olaf Helsets Vei 6 tel: +47 22 62 89 61 (OFFICE)
P.O.Box 150, Oppsal +47 975 31 574 (MOBILE)
N-0619 Oslo fax: +47 22 62 89 51
NORWAY
_________________________________________________________________________


2002-03-07 16:03:56

by Alan

[permalink] [raw]
Subject: Re: a faster way to gettimeofday? rdtsc strangeness

> > They normally adjust the STPCLK which is just fine. I've only seen a few
> > laptops that do it other ways. More fun are people running mixed 300/450
> > BP6 boards where the tsc varies by cpu
>
> Can /proc/cpuinfo really be trusted in figuring out how long a cycle is?

The bogomip counter is a pretty good bet for any other system, and
gettimeofday() for the general case.


2002-03-07 18:33:10

by H. Peter Anvin

[permalink] [raw]
Subject: Re: a faster way to gettimeofday? rdtsc strangeness

Followup to: <[email protected]>
By author: Terje Eggestad <[email protected]>
In newsgroup: linux.dev.kernel
>
> Can /proc/cpuinfo really be trusted in figuring out how long a cycle is?
>

It uses RDTSC, so yes.

-hpa
--
<[email protected]> at work, <[email protected]> in private!
"Unix gives you enough rope to shoot yourself in the foot."
http://www.zytor.com/~hpa/puzzle.txt <[email protected]>

2002-03-08 01:33:03

by Jamie Lokier

[permalink] [raw]
Subject: Re: a faster way to gettimeofday? rdtsc strangeness

H. Peter Anvin wrote:
> > Can /proc/cpuinfo really be trusted in figuring out how long a cycle is?
>
> It uses RDTSC, so yes.

Yes and no. The kernel's measurement isn't as accurate as it could be.

I have some code which calibrates TSC cycles against gettimeofday().

Considering that gettimeofday() works by using the kernel's frequency
estimate that it shows in /proc/cpuinfo, the results from the code
should match /proc/cpuinfo very precisely ndeed.

However, over a million gettimeofday() calls, the frequency of the 100Hz
tick would tend to dominate the result, showing any error in the
kernel's estimate which appears in /proc/cpuinfo (which is derived from
measuring the length of a 100Hz tick in TSC cycles).

With 10^6 gettimeofday() calls, and linear regression through the set of
measurements when an interrupt didn't happen (checked by reading TSC
before and after the system call and filtering on the minimum), I get
superb results on a FreeBSD box:

Kernel reports clock frequency: 746.154 MHz
Measured clock frequency: 746.154 MHz

On various Linux boxes, however:

Kernel reports clock frequency: 596.924 MHz
Measured clock frequency: 596.929 MHz

Kernel reports clock frequency: 866.708 MHz
Measured clock frequency: 866.675 MHz

Kernel reports clock frequency: 664.597 MHz
Measured clock frequency: 664.582 MHz

Kernel reports clock frequency: 366.601 MHz
Measured clock frequency: 366.584 MHz

As you can see, the kernel consistently overestimates the CPU clock
frequency, or more precisely it overestimates the number of cycles in a
"100Hz" system clock tick. This is something that could be improved in
the kernel code, as FreeBSD demonstrates. Note that the clock frequency
probably varies by a similar order of magnitude to this error due to
thermal and power variation anyway. But it would be nice if the
measurements matched as they do in FreeBSD!

I have one of those laptops that Alan mentioned which does change the
CPU clock rate. It's a Toshiba 4070CDT, here running in fast then slow
mode:

Kernel reports clock frequency: 366.601 MHz
Measured clock frequency: 366.584 MHz

Kernel reports clock frequency: 366.601 MHz
Measured clock frequency: 184.503 MHz

It can switch between fast and slow modes while a program is running,
which occurs either when you press a special key combination, or the
mains power cuts off.

enjoy,
-- Jamie

2002-03-08 01:37:04

by H. Peter Anvin

[permalink] [raw]
Subject: Re: a faster way to gettimeofday? rdtsc strangeness

I would like to see the error bars on your measurement.

-hpa

2002-03-08 01:57:36

by Jamie Lokier

[permalink] [raw]
Subject: Re: a faster way to gettimeofday? rdtsc strangeness

H. Peter Anvin wrote:
> I would like to see the error bars on your measurement.

I just ran it on my laptop 20 times in a row and it printed exactly the
same values every time:

Kernel reports clock frequency: 366.601 MHz
Measured clock frequency: 366.584 MHz

Well we need more precision than that. So here goes:

Kernel reports clock frequency: 366.601000 MHz

Measured clock frequency: 366.584116 MHz
Measured clock frequency: 366.584071 MHz
Measured clock frequency: 366.584108 MHz
Measured clock frequency: 366.584116 MHz
Measured clock frequency: 366.584090 MHz
Measured clock frequency: 366.583968 MHz
Measured clock frequency: 366.584069 MHz
Measured clock frequency: 366.584106 MHz
Measured clock frequency: 366.584172 MHz
Measured clock frequency: 366.584093 MHz
Measured clock frequency: 366.584077 MHz
Measured clock frequency: 366.583939 MHz
Measured clock frequency: 366.584083 MHz
Measured clock frequency: 366.584118 MHz
Measured clock frequency: 366.584062 MHz
Measured clock frequency: 366.584085 MHz
Measured clock frequency: 366.584066 MHz
Measured clock frequency: 366.610417 MHz <- anomaly :-)
Measured clock frequency: 366.584109 MHz
Measured clock frequency: 366.584062 MHz
Measured clock frequency: 366.582293 MHz
Measured clock frequency: 366.584103 MHz
Measured clock frequency: 366.584093 MHz
Measured clock frequency: 366.584107 MHz
Measured clock frequency: 366.584059 MHz
Measured clock frequency: 366.583985 MHz
Measured clock frequency: 366.584133 MHz
Measured clock frequency: 366.584092 MHz
Measured clock frequency: 366.584087 MHz
Measured clock frequency: 366.555754 MHz <- another deviant!
Measured clock frequency: 366.584044 MHz
Measured clock frequency: 366.584109 MHz
Measured clock frequency: 366.584084 MHz
Measured clock frequency: 366.583608 MHz
Measured clock frequency: 366.584112 MHz
Measured clock frequency: 366.584134 MHz
Measured clock frequency: 366.584049 MHz
Measured clock frequency: 366.584090 MHz
Measured clock frequency: 366.584155 MHz
Measured clock frequency: 366.584006 MHz
Measured clock frequency: 366.584182 MHz
Measured clock frequency: 366.584009 MHz

It takes the median of 1000 samples of the TSC time taken to do a
rdtsc/gettimeofday/rdtsc measurement, then uses that as the threshold
for deciding which of the subsequent 1000000 measurements are accepted.
Then linear regression through the accepted points.

I see a couple of results there which suggests a probable fault in the
filtering algorithm. Perhaps it should simply use the smallest TSC time
taken as the threshold.

Sorry, I'd need permission to post the code.

enjoy,
-- Jamie

2002-03-08 18:31:35

by Jamie Lokier

[permalink] [raw]
Subject: gettimeofday() system call timing curiosity

Jamie Lokier wrote:
> It takes the median of 1000 samples of the TSC time taken to do a
> rdtsc/gettimeofday/rdtsc measurement, then uses that as the threshold
> for deciding which of the subsequent 1000000 measurements are accepted.
> Then linear regression through the accepted points.
>
> I see a couple of results there which suggests a probable fault in the
> filtering algorithm. Perhaps it should simply use the smallest TSC time
> taken as the threshold.

I've looked more closely. Of all the machines I have access to, only my
laptop shows the anomolous measurements.

It turns out that the median of "time in TSC cycles to do a
rdtsc+gettimeofday+rdtsc measurement" varies from run to run. It only
varies between two values, though.

On my laptop, the median of rdtsc+gettimeofday+rdtsc times is 470 cycles
for most runs of 1000, but is occasionally 453 cycles.

Over runs of approx. 10^6 samples, about 1% of the samples take 453
cycles to executre and about 60% take 470 cycles to execute. I have no
idea why the smallest is so rare!

This explains the anomalous measurement on my laptop: the quickest
samples of 453 cycles are quite rare (1-2%, not measured accurately),
over runs of 10^6 samples. Nevertheless, occasionally over a run of
1000 samples, the median comes out as 453. When that happens, over 10^6
samples only about 1-2% are considered valid, and this is not good
enough to get an accurate calibration method.

On all machines except my laptop, the percentage of samples over large
runs which execute in the minimum number of cycles is somewhat larger
(35-60% - it varies between runs), and that is enough that whether the
median selects the minimum or the next value up, the calibration
estimate is good. On one PC, the two median measurement times are 461
and 479 cycles.

On the FreeBSD box I noticed a very curious effect. The median
measurement time comes out as 695 or 712 cycles. (Btw, showing Linux
has faster gettimeofday calls :-). The curiousity is that if I
repeatedly run the program, I see something like 30 consecutive results
where the median came out at 695 cycles, then 50 consecutive results at
712 cycles, then another 30 or so at 695 cycles.

That shows there is something stateful going on - I would guess
something connected with the processor caches (but it's just a guess),
and doing 1000 iterations in a loop doesn't cause the timing to settle
down to the minimum time.

So there are: two oddities: why does my laptop have only a small
proportion of rdtsc+gettimeofday+rdtsc calls executing in the minimum
number of clock cycles, and why does FreeBSD show long runs with one
value alternating with long runs of another.

cheers,
-- Jamie

2002-03-08 18:51:49

by H. Peter Anvin

[permalink] [raw]
Subject: Re: gettimeofday() system call timing curiosity

Jamie Lokier wrote:

>
> On my laptop, the median of rdtsc+gettimeofday+rdtsc times is 470 cycles
> for most runs of 1000, but is occasionally 453 cycles.
>


What that indicates to me is that 1000 is way too small of a sample.
You're only talking a difference of 17,000 cycles, which could --
especially with cache effects -- easily be the time spent in an
interrupt handler.

-hpa



2002-03-08 18:54:58

by Richard B. Johnson

[permalink] [raw]
Subject: Re: gettimeofday() system call timing curiosity

On Fri, 8 Mar 2002, Jamie Lokier wrote:

> Jamie Lokier wrote:
> > It takes the median of 1000 samples of the TSC time taken to do a
> > rdtsc/gettimeofday/rdtsc measurement, then uses that as the threshold
> > for deciding which of the subsequent 1000000 measurements are accepted.
> > Then linear regression through the accepted points.
> >
> > I see a couple of results there which suggests a probable fault in the
> > filtering algorithm. Perhaps it should simply use the smallest TSC time
> > taken as the threshold.
>
> I've looked more closely. Of all the machines I have access to, only my
> laptop shows the anomolous measurements.
>
> It turns out that the median of "time in TSC cycles to do a
> rdtsc+gettimeofday+rdtsc measurement" varies from run to run. It only
> varies between two values, though.
>
[SNIPPED...]

The following program clearly shows that Linux will not return the
same gettimeofday values twice in succession. Since it provably takes
less than 1 microsecond to make a system call on a modern machine,
Linux must be waiting within the gettimeofday procedure long enough
to make certain that the time has changed. This may be screwing up
any performance measurments made with gettimeofday().


#include <stdio.h>
#include <sys/time.h>

int main(void);
int main()
{
struct timeval tv, pv;
tv.tv_sec = tv.tv_usec = pv.tv_sec = pv.tv_usec = 0;
for(;;)
{
(void)gettimeofday(&tv, NULL);
if((tv.tv_sec != pv.tv_sec) || (tv.tv_usec != pv.tv_usec))
printf("sec = %ld usec = %ld\n", tv.tv_sec, tv.tv_usec);
else
puts("The same!");
pv = tv;
}
return 0;
}


Cheers,
Dick Johnson

Penguin : Linux version 2.4.18 on an i686 machine (797.90 BogoMips).

Bill Gates? Who?

2002-03-08 19:05:29

by Alan

[permalink] [raw]
Subject: Re: gettimeofday() system call timing curiosity

> The following program clearly shows that Linux will not return the
> same gettimeofday values twice in succession. Since it provably takes
> less than 1 microsecond to make a system call on a modern machine,

We don't wait. Your computer just sucks 8)

More seriously even on a processor with a TSC we have to take a lock and
do several computations.

Alan

2002-03-08 19:07:39

by Johan Adolfsson

[permalink] [raw]
Subject: Re: gettimeofday() system call timing curiosity

What happens if you remove the printf/puts and simply counts the number
of times the different cases happen?

Another thought: Isn't it quite common that clock generators has a mode
where the frequency differs around the desired frequency to spread the
spectrum
and easier pass EMC tests?
Could that be the case with the laptop?
/Johan

2002-03-08 19:17:59

by H. Peter Anvin

[permalink] [raw]
Subject: Re: gettimeofday() system call timing curiosity

Followup to: <[email protected]>
By author: [email protected]
In newsgroup: linux.dev.kernel
>
> Another thought: Isn't it quite common that clock generators has a
> mode where the frequency differs around the desired frequency to
> spread the spectrum and easier pass EMC tests?
>

Indeed it is. It's usually supposed to average out relatively
quickly, however.

-hpa
--
<[email protected]> at work, <[email protected]> in private!
"Unix gives you enough rope to shoot yourself in the foot."
http://www.zytor.com/~hpa/puzzle.txt <[email protected]>

2002-03-08 19:45:46

by Richard B. Johnson

[permalink] [raw]
Subject: Re: gettimeofday() system call timing curiosity

On Fri, 8 Mar 2002 [email protected] wrote:

> What happens if you remove the printf/puts and simply counts the number
> of times the different cases happen?
>

Try it. It doesn't matter. Alan was correct, my computer sucks. However,
they won't give me a 10 GHz one (yet). Note that although gettimeofday()
has 1 microsecond resolution, not all the codes are exercised. We get
something with the granularity of 50 to 190 microseconds. This is
to be expected. You would need a timer that updates at least twice
the expected resolution to be able to exercise all the bits, i.e.,
something with a 2 MHz update-rate, that could be read before it
changed, or a real 'buzzer' of an ISR at 2 MHz. Before everybody
argues, a 1 MHz ISR would give you a value that is updated at
1 microsecond intervals, but now we have to read in asynchronously
and that's where Shannon (information theory) laws take effect.

> Another thought: Isn't it quite common that clock generators has a mode
> where the frequency differs around the desired frequency to spread the
> spectrum
> and easier pass EMC tests?

Actually, I think they just use a junk piezo-resonator instead of a
quartz crystal. If you try spread-spectrum with a bus-clock, you will
screw up all the timing so the bus won't work. Think PCI Bus with its
'reflected-wave' mechanism. If the clock timimg was to change during
PCI Bus activity, all bets are off.

> Could that be the case with the laptop?
> /Johan
>
Don't think so.


Cheers,
Dick Johnson

Penguin : Linux version 2.4.18 on an i686 machine (797.90 BogoMips).

Bill Gates? Who?

2002-03-08 20:17:12

by Jamie Lokier

[permalink] [raw]
Subject: Re: gettimeofday() system call timing curiosity

H. Peter Anvin wrote:
> > On my laptop, the median of rdtsc+gettimeofday+rdtsc times is 470 cycles
> > for most runs of 1000, but is occasionally 453 cycles.
>
> What that indicates to me is that 1000 is way too small of a sample.
> You're only talking a difference of 17,000 cycles, which could --
> especially with cache effects -- easily be the time spent in an
> interrupt handler.

No, interrupt handlers don't (or shouldn't) affect the measurement.

It's the _median_ that varies from 453 to 470, not the _mean_, so the
accumulation to 17000 cycles doesn't apply.

I do rdtsc, then gettimeofday(), then rdtsc. I record (a) the actual
time values, and (b) the difference between the two rdtsc measurements.

First I store 1000 of the difference measurements -- those are the "time
taken to do measurement in TSC cycles". I don't want to store a very
large number of these measurements because of the memory it takes, but
perhaps 1000 isn't quite enough (it isn't on the laptop, and is fine on
all the other machines).

Then I sort those and take the 499'th to get the median.

Assuming that at least 50% of the calls don't receive an interrupt or
reschedule in the middle (a reasonable assumption), then that median
gives me a good idea of how long a measurement takes _when there is no
interrupt or reschedule_.

Then I take approx. 1000000 measurements using the same code, and
discard all those that took longer than the median threshold determined
earlier.

This means that whenever an interrupt happened, that sample is discarded.

Finally, I do a simple least-squares linear regression through the
remaining samples (normally 400000-600000 or so) to calculate the slope.

The slope is what's printed.

cheers,
-- Jamie

2002-03-08 20:27:13

by Davide Libenzi

[permalink] [raw]
Subject: Re: gettimeofday() system call timing curiosity

On Fri, 8 Mar 2002, Jamie Lokier wrote:

> It's the _median_ that varies from 453 to 470, not the _mean_, so the
> accumulation to 17000 cycles doesn't apply.

I was doing something similar and i got huge numbers out of a small mean.
It was the recalc loop that hit inside the scheduler or in this case it
could be a schedule() entry due a need_resched set by the timer. From the
number of cycles it seems possible.



- Davide


2002-03-08 20:28:53

by Alan

[permalink] [raw]
Subject: Re: gettimeofday() system call timing curiosity

> Try it. It doesn't matter. Alan was correct, my computer sucks. However,
> they won't give me a 10 GHz one (yet). Note that although gettimeofday()

On a 1.8GHz athlon the "same" case occurs almost all the time. Machines
without a TSC will probably never hit it due to the slow ISA transactions
to talk to the PIT

2002-03-08 20:31:33

by Johan Adolfsson

[permalink] [raw]
Subject: Re: gettimeofday() system call timing curiosity

>
> > What happens if you remove the printf/puts and simply counts the number
> > of times the different cases happen?
> >
>
> Try it. It doesn't matter. Alan was correct, my computer sucks. However,
> they won't give me a 10 GHz one (yet). Note that although gettimeofday()
> has 1 microsecond resolution, not all the codes are exercised. We get
> something with the granularity of 50 to 190 microseconds.

I admit that I don't know the details of the x86 implementation,
but that sounds a little to large doesn't it?

I recently fixed the cris port to get 1 us resolution (by reading a
25MHz timer value within a jiffie).
My tests show that the time between calls to gettimeofday() typically
is 4-5 us on the 100MIPS ETRAX100LX - not that bad.
On my Linux PC i got 1us most of the time but never the same value,
but that is an old machine - PII 266 (reminds me I need to change machine:).

..
>
> Cheers,
> Dick Johnson

/Johan


2002-03-08 20:41:53

by George Anzinger

[permalink] [raw]
Subject: Re: gettimeofday() system call timing curiosity

"Richard B. Johnson" wrote:
>
> On Fri, 8 Mar 2002, Jamie Lokier wrote:
>
> > Jamie Lokier wrote:
> > > It takes the median of 1000 samples of the TSC time taken to do a
> > > rdtsc/gettimeofday/rdtsc measurement, then uses that as the threshold
> > > for deciding which of the subsequent 1000000 measurements are accepted.
> > > Then linear regression through the accepted points.
> > >
> > > I see a couple of results there which suggests a probable fault in the
> > > filtering algorithm. Perhaps it should simply use the smallest TSC time
> > > taken as the threshold.
> >
> > I've looked more closely. Of all the machines I have access to, only my
> > laptop shows the anomolous measurements.
> >
> > It turns out that the median of "time in TSC cycles to do a
> > rdtsc+gettimeofday+rdtsc measurement" varies from run to run. It only
> > varies between two values, though.
> >
> [SNIPPED...]
>
> The following program clearly shows that Linux will not return the
> same gettimeofday values twice in succession. Since it provably takes
> less than 1 microsecond to make a system call on a modern machine,
> Linux must be waiting within the gettimeofday procedure long enough
> to make certain that the time has changed. This may be screwing up
> any performance measurments made with gettimeofday().
>
Balderdash!! It is easy to do if you machine is fast enough. On my
800MHZ PIII gettimeofday take about 0.7 micro seconds min. over about
100 calls, NO loop overhead.

-g

> #include <stdio.h>
> #include <sys/time.h>
>
> int main(void);
> int main()
> {
> struct timeval tv, pv;
> tv.tv_sec = tv.tv_usec = pv.tv_sec = pv.tv_usec = 0;
> for(;;)
> {
> (void)gettimeofday(&tv, NULL);
> if((tv.tv_sec != pv.tv_sec) || (tv.tv_usec != pv.tv_usec))
> printf("sec = %ld usec = %ld\n", tv.tv_sec, tv.tv_usec);
> else
> puts("The same!");
> pv = tv;
> }
> return 0;
> }
>
> Cheers,
> Dick Johnson
>
> Penguin : Linux version 2.4.18 on an i686 machine (797.90 BogoMips).
>
> Bill Gates? Who?

--
George [email protected]
High-res-timers: http://sourceforge.net/projects/high-res-timers/
Real time sched: http://sourceforge.net/projects/rtsched/

2002-03-09 07:18:56

by pjd

[permalink] [raw]
Subject: Re: gettimeofday() system call timing curiosity

re: repeated gettimeofday() calls...

Alan Cox wrote:
>
> On a 1.8GHz athlon the "same" case occurs almost all the time. Machines
> without a TSC will probably never hit it due to the slow ISA transactions
> to talk to the PIT

Curiously, on a 1.7GHz P4 running RH7.1 (2.4.2 i686) it never hits
the same case, and takes slightly over a second to call gettimeofday a
million times. On my 600MHz Celeron at home (66MHz FSB, i810) running
2.4.17 with a similar .config (I started with the RH7.1 config out of
the source RPM) it takes .8 sec for a million calls and successive calls
return the same value about 20% of the time.

Peter Desnoyers

2002-03-09 07:19:07

by Kelsey Hudson

[permalink] [raw]
Subject: Re: gettimeofday() system call timing curiosity

On Fri, 8 Mar 2002, Richard B. Johnson wrote:

> Try it. It doesn't matter. Alan was correct, my computer sucks. However,

of course he is; all computers suck! :)


2002-03-09 18:38:02

by Alan

[permalink] [raw]
Subject: Re: gettimeofday() system call timing curiosity

> Curiously, on a 1.7GHz P4 running RH7.1 (2.4.2 i686) it never hits
> the same case, and takes slightly over a second to call gettimeofday a

Nothing curious about that I'm afraid. Every benchmark I've run on a PIV
shows that kind of behaviour - anything but perfectly predicted linear
code flow seems to hurt it. Locked operations really seem to make it suffer.

2002-03-11 22:47:55

by OBATA Noboru

[permalink] [raw]
Subject: Re: a faster way to gettimeofday?

> I have a program that I very often need to calculate the current
> time, with milisecond accuracy. I've been using gettimeofday(),
> but gprof shows it's taking a significant (10% or so) amount of
> time. Is there a faster (and perhaps less portable?) way to get
> the time information on x86? My program runs as root, so should
> have any permissions it needs to use some backdoor hack if that
> helps!

Just from curious, I have implemented the userland gettimeofday
as a shared library. It runs about ten times faster than the
system call on my PC, Pentium4 1.5GHz, Linux-2.4.18-pre9-ac3.

The benchmark program says:

system call: 1.226556 usec/call
userland : 0.133730 usec/call, (usec = micro second = 0.000001 sec)

It is available from:

http://www.cs.rpi.edu/~obatan/ugettime/ugettime-0.1.1.tar.gz

You have to run calibration program "calib" first. Then compile
a shared library ugettime.so, in which these calibrated
parameters are embedded as constants. (So you need to recompile
it whenever you reboot the system. :-) Please read README to
see how to use it.

Preloading this library enables the userland gettimeofday. For
example,

$ LD_PRELOAD=./ugettime.so emacs

will execute emacs and all gettimeofday system calls that emacs
issues will be handled by ./ugettime.so.

Patient calibration gives you very accurate userland
gettimeofday if your original gettimeofday and TSC on your
system has good enough precision.

I'm running verification program for 18 hours, but the userland
gettimeofday still synchronizes with the actual system call.

:
:
sys: 1015885532.519445 <- system call
usr: 1015885532.519444 (-0.000001) <- userland (difference in seconds)
sys: 1015885533.029344
usr: 1015885533.029344 (+0.000000)
sys: 1015885533.539446
usr: 1015885533.539445 (-0.000001)
:
:

Yes, it is just for fun... Enjoy!

--
OBATA Noboru ([email protected])

2002-03-12 13:08:09

by Jamie Lokier

[permalink] [raw]
Subject: Re: a faster way to gettimeofday?

OBATA Noboru wrote:
> I'm running verification program for 18 hours, but the userland
> gettimeofday still synchronizes with the actual system call.
>
> :
> :
> sys: 1015885532.519445 <- system call
> usr: 1015885532.519444 (-0.000001) <- userland (difference in seconds)
> sys: 1015885533.029344
> usr: 1015885533.029344 (+0.000000)
> sys: 1015885533.539446
> usr: 1015885533.539445 (-0.000001)
> :
> :
>
> Yes, it is just for fun... Enjoy!

I was just thinking that I'd expect to see more drift, as in labs it is
really quite difficult to keep _different_ computers synchronised.
Thermal and/or power variation causes the clock frequencies to change,
just a little but enough that <1ppm over 18 hours would be improbable.
I've seen graphs in which night and day, even in an environmentally
regulated lab, show up.

But then I realised that the CPU clock and the PIT chip (which provides
the 100HZ tick) are sometimes derived from the same clock source anyway,
so the drift would be as good as the calibration. And if they weren't,
they might will be physically close and so tend to drift together.

If you run NTP (synchronisation with atomic clock standards over the
network), you get really good real world clock times. It continually
adjust gettimeofday() but not the rdtsc clock. That may highlight any
drift due to thermal effects in the PC.

I would guess you're not running NTP?

cheers,
-- Jamie

2002-03-12 15:16:06

by OBATA Noboru

[permalink] [raw]
Subject: Re: a faster way to gettimeofday?

> If you run NTP (synchronisation with atomic clock standards over the
> network), you get really good real world clock times. It continually
> adjust gettimeofday() but not the rdtsc clock. That may highlight any
> drift due to thermal effects in the PC.
>
> I would guess you're not running NTP?

You are right. I'm not running NTP. Actually, I was using
clockspeed (http://cr.yp.to/clockspeed.html) before, which
adjusts system clock using RDTSC, but I have stopped it to
observe pure system clock.

In the first version my userland gettimeofday, I'm more
interested in how RDTSC could be configured to synchronize with
the system clock, rather than the real world clock time.

To atapt the system clock change, userland gettimeofday should
issue real gettimeofday system call occasionally, as Terje
suggests early in this thread, and adjust calibration
parameters. I'll call it "adaptive userland gettimeofday."

You may not need calibration any longer since it is adaptive.
However, it will be slower and less effective because much
adjusting code should be implemented. So I'm hesitating to
implement adaptive userland gettimeofday :-).

--
OBATA Noboru ([email protected])