2003-03-29 01:57:47

by Eyal Lebedinsky

Subject: 2.4.20: problem with "ps -olstart"

I have a program that needs to check if a process is still running.
It issues a ps and saves the pid and start time. It then uses this
info to compare against a later "ps --pid".

The monitored process is long-running and unlikely to restart in the
same second with the same pid. I am trying to do better than
just using a pid sentry (I also compare the cmd to be really sure).

I see a different start time returned on different calls. An example
is attached below. This is a show stopper for me. Is this a known
problem? Does it have a solution?

This is vanilla (my build) 2.4.20 on i386.

$ ps
PID TTY TIME CMD
906 pts/0 00:00:01 bash
3026 pts/0 00:00:00 sh
8254 pts/0 00:00:00 ps
$ while true ; do ps --pid "3026" -olstart,cmd --no-headers ; done
Thu Mar 27 22:03:11 2003 sh
Thu Mar 27 22:03:11 2003 sh
Thu Mar 27 22:03:11 2003 sh
Thu Mar 27 22:03:11 2003 sh
Thu Mar 27 22:03:12 2003 sh
Thu Mar 27 22:03:12 2003 sh
Thu Mar 27 22:03:12 2003 sh
Thu Mar 27 22:03:12 2003 sh
Thu Mar 27 22:03:12 2003 sh
Thu Mar 27 22:03:12 2003 sh
Thu Mar 27 22:03:12 2003 sh
Thu Mar 27 22:03:12 2003 sh
Thu Mar 27 22:03:12 2003 sh
Thu Mar 27 22:03:12 2003 sh
Thu Mar 27 22:03:11 2003 sh
Thu Mar 27 22:03:11 2003 sh
Thu Mar 27 22:03:11 2003 sh
Thu Mar 27 22:03:11 2003 sh
Thu Mar 27 22:03:11 2003 sh
Thu Mar 27 22:03:12 2003 sh
Thu Mar 27 22:03:12 2003 sh
Thu Mar 27 22:03:12 2003 sh
Thu Mar 27 22:03:12 2003 sh
Thu Mar 27 22:03:12 2003 sh
Thu Mar 27 22:03:12 2003 sh
Thu Mar 27 22:03:12 2003 sh
Thu Mar 27 22:03:12 2003 sh
Thu Mar 27 22:03:12 2003 sh
Thu Mar 27 22:03:12 2003 sh
Thu Mar 27 22:03:11 2003 sh

--
Eyal Lebedinsky ([email protected]) <http://samba.org/eyal/>


2003-03-29 11:58:13

by Andries Brouwer

Subject: Re: 2.4.20: problem with "ps -olstart"

On Sat, Mar 29, 2003 at 01:08:58PM +1100, Eyal Lebedinsky wrote:

> I see a different start time returned on different calls. An example
> is attached below. This is a show stopper for me. Is this a known
> problem? Does it have a solution?
>
> This is vanilla (my build) 2.4.20 on i386.
>
> $ while true ; do ps --pid "3026" -olstart,cmd --no-headers ; done
> Thu Mar 27 22:03:11 2003 sh
> Thu Mar 27 22:03:11 2003 sh
> Thu Mar 27 22:03:12 2003 sh
> Thu Mar 27 22:03:11 2003 sh

Look at your ps source. There are many incarnations of ps,
but perhaps you'll find something like

seconds_since_boot = uptime(0,0);
seconds_since_1970 = time(NULL);
time_of_boot = seconds_since_1970 - seconds_since_boot;
start = time_of_boot + pp->start_time/Hertz;

The interplay of rounding and truncating you see here
results in what you see. Instead of using ps you might try
a tiny utility that reads the start time directly.
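
To make the effect concrete, here is a minimal sketch in C that mirrors
the computation quoted above. It is not taken from any actual ps source.
The function name and the fractional-second parameters are hypothetical,
and it assumes both clock readings are truncated to whole seconds before
they are subtracted.

```c
#include <assert.h>
#include <time.h>

/*
 * Mirrors the ps-style computation. The two clock readings are passed in
 * as fractional seconds (hypothetical parameters) so that the effect of
 * truncating each one to whole seconds is visible.
 */
time_t ps_style_start(double now, double uptime,
                      unsigned long start_jiffies, long hertz)
{
    assert(hertz > 0);
    time_t seconds_since_boot = (time_t)uptime;  /* fraction dropped */
    time_t seconds_since_1970 = (time_t)now;     /* fraction dropped */
    time_t time_of_boot = seconds_since_1970 - seconds_since_boot;
    return time_of_boot + (time_t)(start_jiffies / (unsigned long)hertz);
}
```

Take a machine that really booted at 900.7 seconds. Sampled at
now = 1000.9 (uptime 100.2), the function computes a boot time of 900.
Sampled at now = 1001.1 (uptime 100.4), it computes 901. The reported
start time of the same process therefore moves by one second depending
on when ps happens to run.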

Andries

2003-03-29 13:27:38

by Eyal Lebedinsky

Subject: Re: 2.4.20: problem with "ps -olstart"

Andries Brouwer wrote:
>
> On Sat, Mar 29, 2003 at 01:08:58PM +1100, Eyal Lebedinsky wrote:
>
> > I see a different start time returned on different calls. An example
> > is attached below. This is a show stopper for me. Is this a known
> > problem? Does it have a solution?
> >
> > This is vanilla (my build) 2.4.20 on i386.
> >
> > $ while true ; do ps --pid "3026" -olstart,cmd --no-headers ; done
> > Thu Mar 27 22:03:11 2003 sh
> > Thu Mar 27 22:03:11 2003 sh
> > Thu Mar 27 22:03:12 2003 sh
> > Thu Mar 27 22:03:11 2003 sh
>
> Look at your ps source. There are many incarnations of ps,
> but perhaps you'll find something like
>
> seconds_since_boot = uptime(0,0);
> seconds_since_1970 = time(NULL);
> time_of_boot = seconds_since_1970 - seconds_since_boot;
> start = time_of_boot + pp->start_time/Hertz;
>
> The interplay of rounding and truncating you see here
> results in what you see. Instead of using ps you might try
> a tiny utility that reads the start time directly.

OK, I can see how ps may be doing a bad job there. It could do
better. For example, I can identify a process using something like:
    grep btime /proc/stat
        time the machine booted (secs since 1970, probably)
    cut -d' ' -f22 </proc/$pid/stat
        process start time (jiffies since boot)
and if I really want, I can convert start_time to seconds and have
a value (the sum of the two) that does not change.
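
The same idea can be sketched as a tiny C utility. This is an
illustration under stated assumptions, not a tested tool. All function
names are hypothetical. It assumes the start time in /proc/<pid>/stat
can be converted to seconds by dividing by sysconf(_SC_CLK_TCK), which
matches HZ=100 on a 2.4 i386 kernel. It also locates field 22 by
counting from the last ')' rather than from the start of the line, so a
command name containing spaces does not shift the fields.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Value of the "btime" line in /proc/stat text, or -1 if there is none. */
long long parse_btime(const char *text)
{
    const char *p = text;
    while (p && *p) {
        if (strncmp(p, "btime ", 6) == 0)
            return strtoll(p + 6, NULL, 10);
        p = strchr(p, '\n');          /* advance to the next line */
        if (p)
            p++;
    }
    return -1;
}

/* Field 22 (start time in ticks) of a /proc/<pid>/stat line, or -1. */
long long parse_starttime(const char *line)
{
    const char *p = strrchr(line, ')');   /* end of the command name */
    if (!p)
        return -1;
    p++;
    for (int field = 3; field < 22; field++) {   /* skip fields 3..21 */
        while (*p == ' ')
            p++;
        if (*p == '\0')
            return -1;
        while (*p && *p != ' ')
            p++;
    }
    while (*p == ' ')
        p++;
    return *p ? strtoll(p, NULL, 10) : -1;
}

/* Boot time plus start ticks converted to whole seconds. */
long long start_epoch(long long btime, long long ticks, long hertz)
{
    return btime + ticks / hertz;
}

/* Start time of pid in seconds since 1970, or -1 on any error. */
long long process_start_epoch(long pid)
{
    char path[64], line[1024];
    long long btime = -1, ticks = -1;
    long hertz = sysconf(_SC_CLK_TCK);
    FILE *f;

    if (hertz <= 0 || (f = fopen("/proc/stat", "r")) == NULL)
        return -1;
    while (btime < 0 && fgets(line, sizeof line, f))
        btime = parse_btime(line);
    fclose(f);

    snprintf(path, sizeof path, "/proc/%ld/stat", pid);
    if ((f = fopen(path, "r")) == NULL)
        return -1;
    if (fgets(line, sizeof line, f))
        ticks = parse_starttime(line);
    fclose(f);

    if (btime < 0 || ticks < 0)
        return -1;
    return start_epoch(btime, ticks, hertz);
}
```

Because both inputs are fixed for the life of the process, calling
process_start_epoch() repeatedly for the same pid should give the same
number each time, which is what a pid-plus-start-time check needs.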

From a user program, and not depending on /proc, I do not see
how I can access the kernel 'xtime' and 'task->start_time'.

--
Eyal Lebedinsky ([email protected]) <http://samba.org/eyal/>