2001-03-02 05:30:36

by John Being

[permalink] [raw]
Subject: strange nonmonotonic behavior of gettimeoftheday

I've got following problem with 2.2.17 (Redhat stock kernel)
Linux ***** 2.2.17-14 #1 Mon Feb 5 14:57:25 EST 2001 i586 unknown
on AMD K6, VIA Technologies VT 82C586, Compaq Presario XL119.
Following C program
#include <stdio.h>
#include <sys/time.h>
#include <unistd.h>
#include <time.h>
#define ABS(x) (x < 0 ? -x : x)
#define TIME_T struct timeval
#define TIME_DIFF_T long
#define GET_TIME(x) gettimeofday(&x, NULL)
#define TIME_DIFF(x1, x2) ((x2.tv_sec - x1.tv_sec)*1000000 + (x2.tv_usec -
x1.tv_usec))
int main(int argc, char** argv)
{
TIME_T t1, t2;
TIME_DIFF_T d;

GET_TIME(t2);
while (1) {
GET_TIME(t1);
d = TIME_DIFF(t2, t1);
if (d > 500000 || d < 0) {
fprintf(stderr, "Leap found: %ld msec\n", d);
return 0;
}
t2 = t1;
}
return 1;

gives following result on box in question
root@******:# ./clo
Leap found: -1687 msec
and prints nothing on all other my boxes.
This gives me bunch of troubles with occasional hang ups and I found nothing
in kernel archives at
http://www.uwsg.indiana.edu/hypermail/linux/kernel/index.html
just some notes about smth like this for SMP boxes with ntp. Is this issue
known, and how can I fix it?

Thanks.


_________________________________________________________________________
Get Your Private, Free E-mail from MSN Hotmail at http://www.hotmail.com.


2001-03-02 15:07:55

by Eli Carter

[permalink] [raw]
Subject: Re: strange nonmonotonic behavior of gettimeoftheday

John Being wrote:
>
> I've got following problem with 2.2.17 (Redhat stock kernel)
> Linux ***** 2.2.17-14 #1 Mon Feb 5 14:57:25 EST 2001 i586 unknown
> on AMD K6, VIA Technologies VT 82C586, Compaq Presario XL119.
> Following C program
> #include <stdio.h>
> #include <sys/time.h>
> #include <unistd.h>
> #include <time.h>
> #define ABS(x) (x < 0 ? -x : x)
> #define TIME_T struct timeval
> #define TIME_DIFF_T long
> #define GET_TIME(x) gettimeofday(&x, NULL)
> #define TIME_DIFF(x1, x2) ((x2.tv_sec - x1.tv_sec)*1000000 + (x2.tv_usec -
> x1.tv_usec))
> int main(int argc, char** argv)
> {
> TIME_T t1, t2;
> TIME_DIFF_T d;
>
> GET_TIME(t2);
> while (1) {
> GET_TIME(t1);
> d = TIME_DIFF(t2, t1);
> if (d > 500000 || d < 0) {
> fprintf(stderr, "Leap found: %ld msec\n", d);
> return 0;
> }
> t2 = t1;
> }
> return 1;
>
> gives following result on box in question
> root@******:# ./clo
> Leap found: -1687 msec
> and prints nothing on all other my boxes.
> This gives me bunch of troubles with occasional hang ups and I found nothing
> in kernel archives at
> http://www.uwsg.indiana.edu/hypermail/linux/kernel/index.html
> just some notes about smth like this for SMP boxes with ntp. Is this issue
> known, and how can I fix it?

(That should read "usec" since you are printing microseconds and not
milliseconds.)

I've seen behaviour like that on an ARM processor because the time code
was not considering missed (or rather, delayed response to) timer
interrupts. The time jump in that case was slightly less than 1 jiffie
(jiffie = 10ms). It's likely rather hardware specific; see if you can
get someone with the same hardware to run your test code.

Eli
-----------------------. Rule of Accuracy: When working toward
Eli Carter | the solution of a problem, it always
eli.carter(at)inet.com `------------------ helps if you know the answer.

2001-03-02 15:13:04

by Chris Friesen

[permalink] [raw]
Subject: Re: strange nonmonotonic behavior of gettimeoftheday -- seen similar problem on PPC

John Being wrote:

> gives following result on box in question
> root@******:# ./clo
> Leap found: -1687 msec
> and prints nothing on all other my boxes.
> This gives me bunch of troubles with occasional hang ups and I found nothing
> in kernel archives at
> http://www.uwsg.indiana.edu/hypermail/linux/kernel/index.html
> just some notes about smth like this for SMP boxes with ntp. Is this issue
> known, and how can I fix it?

I've run into non-monotonic gettimeofday() on a PPC system with 2.2.17, but it
always seemed to be almost exactly a jiffy out, as though it was getting
hundredths of a second from the old tick, and microseconds from the new tick.
Your leap seems to be more unusual, and the first one I've seen on an x86 box.

Have you considered storing the results to see what happens on the next call?
Does it make up the difference, or do you just lose that time?

Chris


--
Chris Friesen | MailStop: 043/33/F10
Nortel Networks | work: (613) 765-0557
3500 Carling Avenue | fax: (613) 765-2986
Nepean, ON K2H 8E9 Canada | email: [email protected]

2001-03-02 15:41:33

by Richard B. Johnson

[permalink] [raw]
Subject: Re: strange nonmonotonic behavior of gettimeoftheday -- seen similar problem on PPC

On Fri, 2 Mar 2001, Christopher Friesen wrote:

> John Being wrote:
>
> > gives following result on box in question
> > root@******:# ./clo
> > Leap found: -1687 msec
> > and prints nothing on all other my boxes.
> > This gives me bunch of troubles with occasional hang ups and I found nothing
> > in kernel archives at
> > http://www.uwsg.indiana.edu/hypermail/linux/kernel/index.html
> > just some notes about smth like this for SMP boxes with ntp. Is this issue
> > known, and how can I fix it?
>
> I've run into non-monotonic gettimeofday() on a PPC system with 2.2.17, but it
> always seemed to be almost exactly a jiffy out, as though it was getting
> hundredths of a second from the old tick, and microseconds from the new tick.
> Your leap seems to be more unusual, and the first one I've seen on an x86 box.
>
> Have you considered storing the results to see what happens on the next call?
> Does it make up the difference, or do you just lose that time?
>
> Chris

I think it's a math problem in the test code. Try this:

#include <stdio.h>
#include <sys/time.h>

#define DEB(f)

int main()
{
struct timeval t;
double start_us;
double stop_us;
for(;;)
{
gettimeofday(&t, NULL);
start_us = (double) t.tv_sec * 1e6;
start_us += (double) t.tv_usec;
gettimeofday(&t, NULL);
stop_us = (double) t.tv_sec * 1e6;
stop_us += (double) t.tv_usec;
if(stop_us <= start_us)
break;
DEB(fprintf(stdout, "Start = %f, Stop = %f\n", start_us, stop_us));
}
fprintf(stderr, "Start = %f, Stop = %f\n", start_us, stop_us);
return 0;
}

Note that two subsequent calls to gettimeofday() must not return the
same time even if your CPU runs infinitely fast. I haven't seen any
kernel in the past few years that fails this test.


Cheers,
Dick Johnson

Penguin : Linux version 2.4.1 on an i686 machine (799.53 BogoMips).

"Memory is like gasoline. You use it up when you are running. Of
course you get it all back when you reboot..."; Actual explanation
obtained from the Micro$oft help desk.


2001-03-02 17:07:03

by Manfred Spraul

[permalink] [raw]
Subject: Re: strange nonmonotonic behavior of gettimeoftheday

#include <stdio.h>
#include <sys/time.h>
#include <unistd.h>
#include <time.h>

static unsigned long long get_tsc(void)
{
unsigned long v1;
unsigned long v2;
__asm__ __volatile__(
"rdtsc\n\t"
: "=a" (v1), "=d" (v2));
return (((unsigned long long)v2)<<32)+v1;
}

int main(int argc, char** argv)
{
unsigned long long t1;
unsigned long long t2;

printf("RDTSC tester\n");
t1 = get_tsc();
for(;;) {
t2 = get_tsc();
if(t1 > t2) {
printf("tsc jumped backwards: from %lld to %lld.\n",
t1, t2);
}
#if 0
printf("diff is %lld-%lld=%d.\n",t2,t1,t2-t1);
#endif
t1 = t2;

}
return 1;
}


Attachments:
ms.c (612.00 B)

2001-03-02 18:32:01

by George Anzinger

[permalink] [raw]
Subject: Re: strange nonmonotonic behavior of gettimeoftheday -- seen similar problem on PPC

"Richard B. Johnson" wrote:
>
> On Fri, 2 Mar 2001, Christopher Friesen wrote:
>
> > John Being wrote:
> >
> > > gives following result on box in question
> > > root@******:# ./clo
> > > Leap found: -1687 msec
> > > and prints nothing on all other my boxes.
> > > This gives me bunch of troubles with occasional hang ups and I found nothing
> > > in kernel archives at
> > > http://www.uwsg.indiana.edu/hypermail/linux/kernel/index.html
> > > just some notes about smth like this for SMP boxes with ntp. Is this issue
> > > known, and how can I fix it?
> >
> > I've run into non-monotonic gettimeofday() on a PPC system with 2.2.17, but it
> > always seemed to be almost exactly a jiffy out, as though it was getting
> > hundredths of a second from the old tick, and microseconds from the new tick.
> > Your leap seems to be more unusual, and the first one I've seen on an x86 box.
> >
> > Have you considered storing the results to see what happens on the next call?
> > Does it make up the difference, or do you just lose that time?
> >
> > Chris
>
> I think it's a math problem in the test code. Try this:
>
> #include <stdio.h>
> #include <sys/time.h>
>
> #define DEB(f)
>
> int main()
> {
> struct timeval t;
> double start_us;
> double stop_us;
> for(;;)
> {
> gettimeofday(&t, NULL);
> start_us = (double) t.tv_sec * 1e6;
> start_us += (double) t.tv_usec;
> gettimeofday(&t, NULL);
> stop_us = (double) t.tv_sec * 1e6;
> stop_us += (double) t.tv_usec;
> if(stop_us <= start_us)
> break;
> DEB(fprintf(stdout, "Start = %f, Stop = %f\n", start_us, stop_us));
> }
> fprintf(stderr, "Start = %f, Stop = %f\n", start_us, stop_us);
> return 0;
> }
>
> Note that two subsequent calls to gettimeofday() must not return the
> same time even if your CPU runs infinitely fast. I haven't seen any
> kernel in the past few years that fails this test.

Oh! With only micro second resolution how is this avoided? The only
"legal" thing to do to avoid this is for the fast boxes to loop until
the requirement is satisfied. Is this really done?

George

2001-03-02 19:16:16

by Richard B. Johnson

[permalink] [raw]
Subject: Re: strange nonmonotonic behavior of gettimeoftheday -- seen similar problem on PPC

On Fri, 2 Mar 2001, george anzinger wrote:

> "Richard B. Johnson" wrote:
> >
> > On Fri, 2 Mar 2001, Christopher Friesen wrote:
> >
> > > John Being wrote:
> > >
> > > > gives following result on box in question
> > > > root@******:# ./clo
> > > > Leap found: -1687 msec
> > > > and prints nothing on all other my boxes.
> > > > This gives me bunch of troubles with occasional hang ups and I found nothing
> > > > in kernel archives at
> > > > http://www.uwsg.indiana.edu/hypermail/linux/kernel/index.html
> > > > just some notes about smth like this for SMP boxes with ntp. Is this issue
> > > > known, and how can I fix it?
> > >
> > > I've run into non-monotonic gettimeofday() on a PPC system with 2.2.17, but it
> > > always seemed to be almost exactly a jiffy out, as though it was getting
> > > hundredths of a second from the old tick, and microseconds from the new tick.
> > > Your leap seems to be more unusual, and the first one I've seen on an x86 box.
> > >
> > > Have you considered storing the results to see what happens on the next call?
> > > Does it make up the difference, or do you just lose that time?
> > >
> > > Chris
> >
> > I think it's a math problem in the test code. Try this:
> >
> > #include <stdio.h>
> > #include <sys/time.h>
> >
> > #define DEB(f)
> >
> > int main()
> > {
> > struct timeval t;
> > double start_us;
> > double stop_us;
> > for(;;)
> > {
> > gettimeofday(&t, NULL);
> > start_us = (double) t.tv_sec * 1e6;
> > start_us += (double) t.tv_usec;
> > gettimeofday(&t, NULL);
> > stop_us = (double) t.tv_sec * 1e6;
> > stop_us += (double) t.tv_usec;
> > if(stop_us <= start_us)
> > break;
> > DEB(fprintf(stdout, "Start = %f, Stop = %f\n", start_us, stop_us));
> > }
> > fprintf(stderr, "Start = %f, Stop = %f\n", start_us, stop_us);
> > return 0;
> > }
> >
> > Note that two subsequent calls to gettimeofday() must not return the
> > same time even if your CPU runs infinitely fast. I haven't seen any
> > kernel in the past few years that fails this test.
>
> Oh! With only micro second resolution how is this avoided? The only
> "legal" thing to do to avoid this is for the fast boxes to loop until
> the requirement is satisfied. Is this really done?
>
> George
>

Yes and no. It takes microseconds to call the kernel for anything (time
getpid() ), so it seldom loops. All the kernel has to do is remember
the last value returned. If the time isn't past that time yet, bump
that value and return it instead of waiting.


Cheers,
Dick Johnson

Penguin : Linux version 2.4.1 on an i686 machine (799.53 BogoMips).

"Memory is like gasoline. You use it up when you are running. Of
course you get it all back when you reboot..."; Actual explanation
obtained from the Micro$oft help desk.


2001-03-02 20:13:05

by George Anzinger

[permalink] [raw]
Subject: Re: strange nonmonotonic behavior of gettimeoftheday -- seen similar problem on PPC

"Richard B. Johnson" wrote:
>
> On Fri, 2 Mar 2001, george anzinger wrote:
>
> > "Richard B. Johnson" wrote:
>
~snip~

> > > Note that two subsequent calls to gettimeofday() must not return the
> > > same time even if your CPU runs infinitely fast. I haven't seen any
> > > kernel in the past few years that fails this test.
> >
> > Oh! With only micro second resolution how is this avoided? The only
> > "legal" thing to do to avoid this is for the fast boxes to loop until
> > the requirement is satisfied. Is this really done?
> >
> > George
> >
>
> Yes and no. It takes microseconds to call the kernel for anything (time
> getpid() ), so it seldom loops. All the kernel has to do is remember
> the last value returned. If the time isn't past that time yet, bump
> that value and return it instead of waiting.
>
Well, "has to do" and "does" are two different animals. My reading of
the code shows that it does not. I have a bit of code that does
gettimeofday() calls as fast as possible and on some boxes (ix86) have
seen the difference as low as 1 micro second. It is not beyond
imagination that a box might return the same time two times in a row
given the processors performance increases we are seeing. I, for one,
don't find this objectionable. I WILL take exception to time running
backward, however. (I don't see how this is avoided on the leap second
delete, but I have just started looking at this issue.) As to returning
a time in the future as you suggest, I think this is a bad policy. If
the box can actually do two gettimeofdays in one micro second or less,
it SHOULD return the same time (given the resolution can not resolve the
difference). If this becomes an issue, and it will, those that care
should use the clock_gettime() call which should return time in nano
seconds. This is part of the POSIX standard code for which we are
working on at:


http://sourceforge.net/projects/high-res-timers/

George

2001-03-02 22:33:55

by John Being

[permalink] [raw]
Subject: Re: strange nonmonotonic behavior of gettimeoftheday

OK, short status from the same box. It was up for about 2 weeks, but
yesterday due this problem it become unuseable, as X failed at startup with
message about failed select(). Before reboot I made some tests
and found:
- it triggered by starting of X (without X no backjumps)
- it has something with interrupts, at least when I run program above
(it is correct, at least it can determine problem) as
while [ 1 ]; do ./clo; done
and pressed key, it printed much less strings
- jumps are about 300-2000 microseconds
- there are some cases of such behaviour on Usenet (mainly diagnosed as
screen flickering due incorrect screensaver startup)


After reboot problem goes away( nothing changed in config). Maybe it related
to APM (as I did several suspends before this problem appears). Program
testing RDTSC works OK now. If this problem appears again - I will run it.
Thanks for help.


>From: Manfred Spraul <[email protected]>
>To: [email protected]
>CC: [email protected]
>Subject: Re: strange nonmonotonic behavior of gettimeoftheday
>Date: Fri, 02 Mar 2001 18:06:05 +0100
>
> >
> > on AMD K6, VIA Technologies VT 82C586, Compaq Presario XL119.
> > [snip]
> > gives following result on box in question
> > root@******:# ./clo
> > Leap found: -1687 msec
> > and prints nothing on all other my boxes.
>
>Perhaps APM or SMI problems?
>Could you run the attached program?
>
>--
> Manfred
>#include <stdio.h>
>#include <sys/time.h>
>#include <unistd.h>
>#include <time.h>
>
>static unsigned long long get_tsc(void)
>{
> unsigned long v1;
> unsigned long v2;
> __asm__ __volatile__(
> "rdtsc\n\t"
> : "=a" (v1), "=d" (v2));
> return (((unsigned long long)v2)<<32)+v1;
>}
>
>int main(int argc, char** argv)
>{
> unsigned long long t1;
> unsigned long long t2;
>
> printf("RDTSC tester\n");
> t1 = get_tsc();
> for(;;) {
> t2 = get_tsc();
> if(t1 > t2) {
> printf("tsc jumped backwards: from %lld to %lld.\n",
> t1, t2);
> }
>#if 0
> printf("diff is %lld-%lld=%d.\n",t2,t1,t2-t1);
>#endif
> t1 = t2;
>
> }
> return 1;
>}
>

_________________________________________________________________________
Get Your Private, Free E-mail from MSN Hotmail at http://www.hotmail.com.

2001-03-03 07:43:25

by Mike Galbraith

[permalink] [raw]
Subject: Re: strange nonmonotonic behavior of gettimeoftheday -- seen similar problem on PPC

On Fri, 2 Mar 2001, Richard B. Johnson wrote:

> Yes and no. It takes microseconds to call the kernel for anything (time
> getpid() ), so it seldom loops. All the kernel has to do is remember

Hi,

c0109286 system_call +<22/40> (0.21) pid(4265)
c011c7e7 sys_gettimeofday +<13/a8> (0.27) pid(4265)
c010e3b2 do_gettimeofday +<e/88> (0.48) pid(4265)
c01092aa ret_from_sys_call +<6/21> (0.76) pid(4265)
c0109286 system_call +<22/40> (0.19) pid(4265)
c0120b45 sys_getpid +<d/1c> (0.18) pid(4265)
c01092aa ret_from_sys_call +<6/21> (0.77) pid(4265)
^^^^ time in usecs

This is a 500Mhz PIII. It wouldn't take much more cpu/memory speed
to get under a usec. The overhead of calling the kernel on this box
is almost exactly 1 usec.

-Mike

2001-03-03 22:40:38

by dean gaudet

[permalink] [raw]
Subject: Re: strange nonmonotonic behavior of gettimeoftheday -- seen similar problem on PPC

On Fri, 2 Mar 2001, Richard B. Johnson wrote:

> Note that two subsequent calls to gettimeofday() must not return the
> same time even if your CPU runs infinitely fast. I haven't seen any
> kernel in the past few years that fails this test.

i don't see any requirement for this in SuS.

http://www.opengroup.org/onlinepubs/007908799/xsh/gettimeofday.html

it'd be a pretty disappointing requirement.

-dean