Looks like I dropped the ball on this. Didn't realise the patch would
just evaporate...
On Mon, 25 May 2009, Martin Schwidefsky wrote:
> On Mon, 18 May 2009 14:23:03 +0100 (BST)
> Michael Abbott <[email protected]> wrote:
>
> > diff --git a/fs/proc/uptime.c b/fs/proc/uptime.c
> > index 0c10a0b..0f43395 100644
> > --- a/fs/proc/uptime.c
> > +++ b/fs/proc/uptime.c
> > @@ -4,13 +4,19 @@
> > #include <linux/sched.h>
> > #include <linux/seq_file.h>
> > #include <linux/time.h>
> > +#include <linux/kernel_stat.h>
> > #include <asm/cputime.h>
> >
> > static int uptime_proc_show(struct seq_file *m, void *v)
> > {
> > struct timespec uptime;
> > struct timespec idle;
> > - cputime_t idletime = cputime_add(init_task.utime, init_task.stime);
> > + int len, i;
> > + cputime_t idletime = 0;
> > +
> > + for_each_possible_cpu(i)
> > + idletime = cputime64_add(idletime, kstat_cpu(i).cpustat.idle);
> > + idletime = cputime64_to_clock_t(idletime);
> >
> > do_posix_clock_monotonic_gettime(&uptime);
> > monotonic_to_bootbased(&uptime);
>
> I found another problem with this patch: why do you convert the
> idletime from cputime_t to clock_t ? The call to cputime_to_timespec
> takes a cputime_t and the conversion from cputime to clock_t returns a
> value that is way to small on s390. After removing that line it works
> for me but I wonder why you added it.
I've tidied up the patch a trifle, the idle time processing is now very
nearly an exactly copy of that in fs/proc/stat.c.
Unfortunately this now begs more questions:
1. Why is idle time processing so different from up time, in particular
why am I still allowing uptime to overflow in a year and four months or
so?
Answer: I'm trying to just fix idle time, and the uptime calculation looks
like a can of worms to me.
2. Is this really the right calculation of idle time, in particular what
about multiple CPUs?
Answer: I don't have a multi-process test system, so I'm not qualified to
investigate. My view is that idle time should match uptime, ie should be
in literal elapsed machine time (so should really be divided by the number
of CPUs here), but it's clear that doing this properly is not
straighforward.
As for whether it is "safe" to compute idle time this way, this code is
just a copy of what is done in fs/proc/stat.c, so worry about that first!
It's rather a shame that /proc/uptime idle time reporting has been busted
now for what will be three kernel releases in a row.
>From b885468d04475522486e7004de5dba29df1a28a8 Mon Sep 17 00:00:00 2001
From: Michael Abbott <[email protected]>
Date: Mon, 11 May 2009 07:14:19 +0100
Subject: [PATCH] Fix idle time field in /proc/uptime
Git commit 79741dd changes idle cputime accounting, but unfortunately
the /proc/uptime file hasn't caught up. Here the idle time calculation
from /proc/stat is copied over. Further changes from commit e1c8053
are also included in this fix.
Signed-off-by: Michael Abbott <[email protected]>
---
fs/proc/uptime.c | 24 ++++++++++++++++++------
1 files changed, 18 insertions(+), 6 deletions(-)
diff --git a/fs/proc/uptime.c b/fs/proc/uptime.c
index 0c10a0b..2d573f7 100644
--- a/fs/proc/uptime.c
+++ b/fs/proc/uptime.c
@@ -4,22 +4,34 @@
#include <linux/sched.h>
#include <linux/seq_file.h>
#include <linux/time.h>
+#include <linux/kernel_stat.h>
#include <asm/cputime.h>
+#include <asm/div64.h>
+
+#ifndef arch_idle_time
+#define arch_idle_time(cpu) 0
+#endif
static int uptime_proc_show(struct seq_file *m, void *v)
{
struct timespec uptime;
- struct timespec idle;
- cputime_t idletime = cputime_add(init_task.utime, init_task.stime);
+ int i;
+ cputime64_t idle = cputime64_zero;
+ unsigned long int idle_mod;
+
+ for_each_possible_cpu(i) {
+ idle = cputime64_add(idle, kstat_cpu(i).cpustat.idle);
+ idle = cputime64_add(idle, arch_idle_time(i));
+ }
+ idle = cputime64_to_clock_t(idle);
+ idle_mod = do_div(idle, 100);
do_posix_clock_monotonic_gettime(&uptime);
monotonic_to_bootbased(&uptime);
- cputime_to_timespec(idletime, &idle);
- seq_printf(m, "%lu.%02lu %lu.%02lu\n",
+ seq_printf(m, "%lu.%02lu %llu.%02lu\n",
(unsigned long) uptime.tv_sec,
(uptime.tv_nsec / (NSEC_PER_SEC / 100)),
- (unsigned long) idle.tv_sec,
- (idle.tv_nsec / (NSEC_PER_SEC / 100)));
+ idle, idle_mod);
return 0;
}
--
1.6.1.3
On Monday 2009-07-06 17:48, Michael Abbott wrote:
>[...]
>why am I still allowing uptime to overflow in a year and four months or
>so?
Is not this addressed by 64-bit jiffies already?
On Mon, 6 Jul 2009, Jan Engelhardt wrote:
> On Monday 2009-07-06 17:48, Michael Abbott wrote:
> >[...]
> >why am I still allowing uptime to overflow in a year and four months or
> >so?
> Is not this addressed by 64-bit jiffies already?
Possibly, sorry, not terribly familiar with the underlying infrastructure,
just trying to fix one bug without being dragged under by all the rest!