2006-10-27 03:18:26

by Martin Tostrup Setek

[permalink] [raw]
Subject: [PATCH: 2.6.18.1] delayacct: cpu_count in taskstats updated correctly

from: Martin T. Setek <[email protected]>

cpu_count in struct taskstats should be the same as the corresponding
(third) value found in /proc/<pid>/schedstat
Signed-off-by: <[email protected]>
---
Index: linux-2.6.18.1/kernel/delayacct.c
===================================================================
--- linux-2.6.18.1.orig/kernel/delayacct.c
+++ linux-2.6.18.1/kernel/delayacct.c
@@ -124,7 +124,7 @@ int __delayacct_add_tsk(struct taskstats
t2 = tsk->sched_info.run_delay;
t3 = tsk->sched_info.cpu_time;

- d->cpu_count += t1;
+ d->cpu_count = t1;

jiffies_to_timespec(t2, &ts);
tmp = (s64)d->cpu_delay_total + timespec_to_ns(&ts);


2006-10-27 03:30:00

by David Rientjes

[permalink] [raw]
Subject: Re: [PATCH: 2.6.18.1] delayacct: cpu_count in taskstats updated correctly

On Fri, 27 Oct 2006, Martin Tostrup Setek wrote:

> from: Martin T. Setek <[email protected]>
>
> cpu_count in struct taskstats should be the same as the corresponding (third)
> value found in /proc/<pid>/schedstat

I disagree in favor of Documentation/accounting/taskstats-struct.txt.
cpu_count is the number of delay values recorded, so accumulating them is
appropriate.

David

2006-10-27 04:47:19

by Martin Tostrup Setek

[permalink] [raw]
Subject: Re: [PATCH: 2.6.18.1] delayacct: cpu_count in taskstats updated correctly

On Thu, 26 Oct 2006, David Rientjes wrote:
> On Fri, 27 Oct 2006, Martin Tostrup Setek wrote:
>> from: Martin T. Setek <[email protected]>
>> cpu_count in struct taskstats should be the same as the corresponding (third)
>> value found in /proc/<pid>/schedstat
>
> I disagree in favor of Documentation/accounting/taskstats-struct.txt.
> cpu_count is the number of delay values recorded, so accumulating them is
> appropriate.

Perhaps I'm mistaken, but the code accumulates the value it
finds in sched_info.pcnt in the task_struct. Now, in sched.h I found this:

struct sched_info {
/* cumulative counters */
unsigned long cpu_time, /* time spent on the cpu */
run_delay, /* time spent waiting on a runqueue */
pcnt; /* # of timeslices run on this cpu */

The comment says that these counters are cumulative... The code that
updates them (sched.c: sched_info_arrive()), does accumulate them.

In include/linux/taskstats.h, I found:

* xxx_count is the number of delay values recorded
* xxx_delay_total is the corresponding cumulative delay in nanoseconds

I interpret these comments as saying that:

cpu_delay should be the total number of nanoseconds a task has been
waiting in a runqueue for a CPU, and cpu_count is equal to the number of
times the task got the CPU (or waited for it). If so, then the code
updates taskstats.cpu_delay_total incorrectly too (which my patch didn't
fix).

If not, then the comments in taskstats.h are very confusing....

Martin

2006-10-27 05:51:19

by David Rientjes

[permalink] [raw]
Subject: Re: [PATCH: 2.6.18.1] delayacct: cpu_count in taskstats updated correctly

On Fri, 27 Oct 2006, Martin Tostrup Setek wrote:

> In include/linux/taskstats.h, I found:
>
> * xxx_count is the number of delay values recorded
> * xxx_delay_total is the corresponding cumulative delay in
> nanoseconds
>
> I interpret these comments as saying that:
>
> cpu_delay should be the total number of nanoseconds a task has been waiting in
> a runqueue for a CPU, and cpu_count is equal to the number of times the task
> got the CPU (or waited for it). If so, then the code updates
> taskstats.cpu_delay_total incorrectly too (which my patch didn't fix).
>

I don't see a cpu_delay, I see a cpu_delay_total. This is the CPU's
cumulative delay and is only accessible through the user-space accounting
program Documentation/accounting/getdelays.c. It reports the number of
delay values recorded and the real total, virtual total, and delay total;
each of these are cumulative.

In the use case for cpu_count, taking the difference of two successive
cpu_count readings as reported by getdelays.c would find the number of
delays experienced in that interval. This is described in the program's
documentation so by definition the count must be cumulative. It is also
possible to find the average delay but dividing cpu_delay_total by
cpu_count.

The original code is correct and accumulation is appropriate.

David

2006-10-27 07:23:14

by Martin Tostrup Setek

[permalink] [raw]
Subject: Re: [PATCH: 2.6.18.1] delayacct: cpu_count in taskstats updated correctly

On Thu, 26 Oct 2006, David Rientjes wrote:
> I don't see a cpu_delay, I see a cpu_delay_total. This is the CPU's
> cumulative delay and is only accessible through the user-space accounting
> program Documentation/accounting/getdelays.c. It reports the number of
> delay values recorded and the real total, virtual total, and delay total;
> each of these are cumulative.

Yes, I reread the docs more carefully. You are right, and I was wrong.

> In the use case for cpu_count, taking the difference of two successive
> cpu_count readings as reported by getdelays.c would find the number of
> delays experienced in that interval. This is described in the program's
> documentation so by definition the count must be cumulative. It is also
> possible to find the average delay but dividing cpu_delay_total by
> cpu_count.
>
> The original code is correct and accumulation is appropriate.

Thanks for setting me straight. I withdraw the patch.

Martin

2006-10-27 15:50:07

by Shailabh Nagar

[permalink] [raw]
Subject: Re: [PATCH: 2.6.18.1] delayacct: cpu_count in taskstats updated correctly

David Rientjes wrote:
> On Fri, 27 Oct 2006, Martin Tostrup Setek wrote:
>
>> from: Martin T. Setek <[email protected]>
>>
>> cpu_count in struct taskstats should be the same as the corresponding (third)
>> value found in /proc/<pid>/schedstat
>
> I disagree in favor of Documentation/accounting/taskstats-struct.txt.
> cpu_count is the number of delay values recorded, so accumulating them is
> appropriate.
>
> David


David's right. The delay accounting field cpu_count
is measuring how many delay values are recorded in the field cpu_delay_total
(so that one can divide one by the other to get an average if needed).

For the delays reported for a single task (ie. __delayacct_add_tsk called only
once for a given task), the effect will be what Martin wants i.e. cpu_count will be
the same as /proc/pid/schedstat's third field (sched_info->pcnt)

But when the delays are reported for a tgid, where *accumalation* of delays for
all constituent pids is being done (by calling __delayacct_add_tsk repeatedly), what
is desired is to accumalate both cpu_count and cpu_delay_total.

So the patch proposed by Martin is incorrect.

--Shailabh