2008-06-16 09:41:45

by bharathravi1

[permalink] [raw]
Subject: [PATCH] Delay accounting, fix incorrect delay time when constantly waiting on runqueue

From: Bharath Ravi <[email protected]>

This patch corrects the incorrect value of per process run-queue wait time
reported by delay statistics. The anomaly was due to the following reason.
When a process leaves the CPU and immediately starts waiting for CPU on
the runqueue (which means it remains in the TASK_RUNNABLE state), the time
of re-entry into the run-queue is never recorded. Due to this, the waiting time
on the runqueue from this point of re-entry upto the next time it hits the CPU
is not accounted for. This is solved by recording the time of re-entry of a
process leaving the CPU in the sched_info_depart() function IF the process will
go back to waiting on the run-queue. This IF condition is verified by checking
whether the process is still in the TASK_RUNNABLE state.

The patch was tested on 2.6.26-rc6 using two simple CPU hog programs. The
values noted prior to the fix did not account for the time spent on the
runqueue waiting. After the fix, the correct values were reported back
to user space.

Signed-off-by: Bharath Ravi <[email protected]>
Signed-off-by: Madhava K R <[email protected]>
---
kernel/sched_stats.h | 6 ++++++
1 files changed, 6 insertions(+), 0 deletions(-)

diff --git a/kernel/sched_stats.h b/kernel/sched_stats.h
index a38878e..80179ef 100644
--- a/kernel/sched_stats.h
+++ b/kernel/sched_stats.h
@@ -198,6 +198,9 @@ static inline void sched_info_queued(struct task_struct *t)
/*
* Called when a process ceases being the active-running process, either
* voluntarily or involuntarily. Now we can calculate how long we ran.
+ * Also, if the process is still in the TASK_RUNNING state, call
+ * sched_info_queued() to mark that it has now again started waiting on
+ * the runqueue.
*/
static inline void sched_info_depart(struct task_struct *t)
{
@@ -206,6 +209,9 @@ static inline void sched_info_depart(struct task_struct *t)

t->sched_info.cpu_time += delta;
rq_sched_info_depart(task_rq(t), delta);
+
+ if (t->state == TASK_RUNNING)
+ sched_info_queued(t);
}

/*
--
1.5.5.1


2008-06-16 09:55:32

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH] Delay accounting, fix incorrect delay time when constantly waiting on runqueue

On Mon, 2008-06-16 at 15:11 +0530, [email protected] wrote:
> From: Bharath Ravi <[email protected]>
>
> This patch corrects the incorrect value of per process run-queue wait time
> reported by delay statistics. The anomaly was due to the following reason.
> When a process leaves the CPU and immediately starts waiting for CPU on
> the runqueue (which means it remains in the TASK_RUNNABLE state), the time
> of re-entry into the run-queue is never recorded. Due to this, the waiting time
> on the runqueue from this point of re-entry upto the next time it hits the CPU
> is not accounted for. This is solved by recording the time of re-entry of a
> process leaving the CPU in the sched_info_depart() function IF the process will
> go back to waiting on the run-queue. This IF condition is verified by checking
> whether the process is still in the TASK_RUNNABLE state.
>
> The patch was tested on 2.6.26-rc6 using two simple CPU hog programs. The
> values noted prior to the fix did not account for the time spent on the
> runqueue waiting. After the fix, the correct values were reported back
> to user space.


Have you considered: http://lkml.org/lkml/2008/6/5/10

I'm sad to say it is still pending in my todo list :-( - sorry Ankita.

> Signed-off-by: Bharath Ravi <[email protected]>
> Signed-off-by: Madhava K R <[email protected]>
> ---
> kernel/sched_stats.h | 6 ++++++
> 1 files changed, 6 insertions(+), 0 deletions(-)
>
> diff --git a/kernel/sched_stats.h b/kernel/sched_stats.h
> index a38878e..80179ef 100644
> --- a/kernel/sched_stats.h
> +++ b/kernel/sched_stats.h
> @@ -198,6 +198,9 @@ static inline void sched_info_queued(struct task_struct *t)
> /*
> * Called when a process ceases being the active-running process, either
> * voluntarily or involuntarily. Now we can calculate how long we ran.
> + * Also, if the process is still in the TASK_RUNNING state, call
> + * sched_info_queued() to mark that it has now again started waiting on
> + * the runqueue.
> */
> static inline void sched_info_depart(struct task_struct *t)
> {
> @@ -206,6 +209,9 @@ static inline void sched_info_depart(struct task_struct *t)
>
> t->sched_info.cpu_time += delta;
> rq_sched_info_depart(task_rq(t), delta);
> +
> + if (t->state == TASK_RUNNING)
> + sched_info_queued(t);
> }
>
> /*

2008-06-16 12:00:35

by Madhava K R

[permalink] [raw]
Subject: Re: [PATCH] Delay accounting, fix incorrect delay time when constantly waiting on runqueue

Hello,

On Mon, Jun 16, 2008 at 3:25 PM, Peter Zijlstra <[email protected]> wrote:
> On Mon, 2008-06-16 at 15:11 +0530, [email protected] wrote:
>> From: Bharath Ravi <[email protected]>
>>
>> This patch corrects the incorrect value of per process run-queue wait time
>> reported by delay statistics. The anomaly was due to the following reason.
>> When a process leaves the CPU and immediately starts waiting for CPU on
>> the runqueue (which means it remains in the TASK_RUNNABLE state), the time
>> of re-entry into the run-queue is never recorded. Due to this, the waiting time
>> on the runqueue from this point of re-entry upto the next time it hits the CPU
>> is not accounted for. This is solved by recording the time of re-entry of a
>> process leaving the CPU in the sched_info_depart() function IF the process will
>> go back to waiting on the run-queue. This IF condition is verified by checking
>> whether the process is still in the TASK_RUNNABLE state.
>>
>> The patch was tested on 2.6.26-rc6 using two simple CPU hog programs. The
>> values noted prior to the fix did not account for the time spent on the
>> runqueue waiting. After the fix, the correct values were reported back
>> to user space.
>
>
> Have you considered: http://lkml.org/lkml/2008/6/5/10
>
> I'm sad to say it is still pending in my todo list :-( - sorry Ankita.
>

It seems that Ankita's patch addresses the scenario where a process is
already on the run-queue, and is shuffled about CPUs. This patch
addresses the scenario where a process is pre-empted and returns to
the run-queue, where the last_queued value is not recorded.

I tried our test case with Ankita's patch, and the problem remains.
Our test case involves running two tight loops on an idle CPU.
Ideally, both should experience a run time of 50% and a delay time of
50%. But the results show negligible delay time for both processes.

The problems appear mutually exclusive...

>> Signed-off-by: Bharath Ravi <[email protected]>
>> Signed-off-by: Madhava K R <[email protected]>
>> ---
>> kernel/sched_stats.h | 6 ++++++
>> 1 files changed, 6 insertions(+), 0 deletions(-)
>>
>> diff --git a/kernel/sched_stats.h b/kernel/sched_stats.h
>> index a38878e..80179ef 100644
>> --- a/kernel/sched_stats.h
>> +++ b/kernel/sched_stats.h
>> @@ -198,6 +198,9 @@ static inline void sched_info_queued(struct task_struct *t)
>> /*
>> * Called when a process ceases being the active-running process, either
>> * voluntarily or involuntarily. Now we can calculate how long we ran.
>> + * Also, if the process is still in the TASK_RUNNING state, call
>> + * sched_info_queued() to mark that it has now again started waiting on
>> + * the runqueue.
>> */
>> static inline void sched_info_depart(struct task_struct *t)
>> {
>> @@ -206,6 +209,9 @@ static inline void sched_info_depart(struct task_struct *t)
>>
>> t->sched_info.cpu_time += delta;
>> rq_sched_info_depart(task_rq(t), delta);
>> +
>> + if (t->state == TASK_RUNNING)
>> + sched_info_queued(t);
>> }
>>
>> /*
>
>

2008-06-16 12:42:32

by Ankita Garg

[permalink] [raw]
Subject: Re: [PATCH] Delay accounting, fix incorrect delay time when constantly waiting on runqueue

Hi,

On Mon, Jun 16, 2008 at 05:30:26PM +0530, Madhava K R wrote:
> Hello,
>
> On Mon, Jun 16, 2008 at 3:25 PM, Peter Zijlstra <[email protected]> wrote:
> > On Mon, 2008-06-16 at 15:11 +0530, [email protected] wrote:
> >> From: Bharath Ravi <[email protected]>
> >>
> >> This patch corrects the incorrect value of per process run-queue wait time
> >> reported by delay statistics. The anomaly was due to the following reason.
> >> When a process leaves the CPU and immediately starts waiting for CPU on
> >> the runqueue (which means it remains in the TASK_RUNNABLE state), the time
> >> of re-entry into the run-queue is never recorded. Due to this, the waiting time
> >> on the runqueue from this point of re-entry upto the next time it hits the CPU
> >> is not accounted for. This is solved by recording the time of re-entry of a
> >> process leaving the CPU in the sched_info_depart() function IF the process will
> >> go back to waiting on the run-queue. This IF condition is verified by checking
> >> whether the process is still in the TASK_RUNNABLE state.
> >>
> >> The patch was tested on 2.6.26-rc6 using two simple CPU hog programs. The
> >> values noted prior to the fix did not account for the time spent on the
> >> runqueue waiting. After the fix, the correct values were reported back
> >> to user space.
> >
> >
> > Have you considered: http://lkml.org/lkml/2008/6/5/10
> >
> > I'm sad to say it is still pending in my todo list :-( - sorry Ankita.
> >
>
> It seems that Ankita's patch addresses the scenario where a process is
> already on the run-queue, and is shuffled about CPUs. This patch
> addresses the scenario where a process is pre-empted and returns to
> the run-queue, where the last_queued value is not recorded.
>

Right, so looks like the two patches are addressing different scenarios.
The patch I had sent earlier was addressing the case where the accounting
of the task had already begun and then it was migrated to another
runqueue, leading to skews. This patch addresses the issue that the task
delay accounting could potentially miss some delay stats, as explained by
Madhava.

> I tried our test case with Ankita's patch, and the problem remains.
> Our test case involves running two tight loops on an idle CPU.
> Ideally, both should experience a run time of 50% and a delay time of
> 50%. But the results show negligible delay time for both processes.
>
> The problems appear mutually exclusive...
>
> >> Signed-off-by: Bharath Ravi <[email protected]>
> >> Signed-off-by: Madhava K R <[email protected]>
> >> ---
> >> kernel/sched_stats.h | 6 ++++++
> >> 1 files changed, 6 insertions(+), 0 deletions(-)
> >>
> >> diff --git a/kernel/sched_stats.h b/kernel/sched_stats.h
> >> index a38878e..80179ef 100644
> >> --- a/kernel/sched_stats.h
> >> +++ b/kernel/sched_stats.h
> >> @@ -198,6 +198,9 @@ static inline void sched_info_queued(struct task_struct *t)
> >> /*
> >> * Called when a process ceases being the active-running process, either
> >> * voluntarily or involuntarily. Now we can calculate how long we ran.
> >> + * Also, if the process is still in the TASK_RUNNING state, call
> >> + * sched_info_queued() to mark that it has now again started waiting on
> >> + * the runqueue.
> >> */
> >> static inline void sched_info_depart(struct task_struct *t)
> >> {
> >> @@ -206,6 +209,9 @@ static inline void sched_info_depart(struct task_struct *t)
> >>
> >> t->sched_info.cpu_time += delta;
> >> rq_sched_info_depart(task_rq(t), delta);
> >> +
> >> + if (t->state == TASK_RUNNING)
> >> + sched_info_queued(t);
> >> }
> >>
> >> /*
> >
> >

--
Regards,
Ankita Garg ([email protected])
Linux Technology Center
IBM India Systems & Technology Labs,
Bangalore, India

2008-06-16 16:34:55

by Balbir Singh

[permalink] [raw]
Subject: Re: [PATCH] Delay accounting, fix incorrect delay time when constantly waiting on runqueue

On Mon, Jun 16, 2008 at 3:11 PM, <[email protected]> wrote:
> From: Bharath Ravi <[email protected]>
>
> This patch corrects the incorrect value of per process run-queue wait time
> reported by delay statistics. The anomaly was due to the following reason.
> When a process leaves the CPU and immediately starts waiting for CPU on
> the runqueue (which means it remains in the TASK_RUNNABLE state), the time
> of re-entry into the run-queue is never recorded. Due to this, the waiting time
> on the runqueue from this point of re-entry upto the next time it hits the CPU
> is not accounted for. This is solved by recording the time of re-entry of a
> process leaving the CPU in the sched_info_depart() function IF the process will
> go back to waiting on the run-queue. This IF condition is verified by checking
> whether the process is still in the TASK_RUNNABLE state.
>
> The patch was tested on 2.6.26-rc6 using two simple CPU hog programs. The
> values noted prior to the fix did not account for the time spent on the
> runqueue waiting. After the fix, the correct values were reported back
> to user space.
>
> Signed-off-by: Bharath Ravi <[email protected]>
> Signed-off-by: Madhava K R <[email protected]>

Looks very good to me

Acked-by: Balbir Singh <[email protected]>

Balbir

2008-06-19 10:01:42

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH] Delay accounting, fix incorrect delay time when constantly waiting on runqueue

On Mon, 2008-06-16 at 15:11 +0530, [email protected] wrote:
> From: Bharath Ravi <[email protected]>
>
> This patch corrects the incorrect value of per process run-queue wait time
> reported by delay statistics. The anomaly was due to the following reason.
> When a process leaves the CPU and immediately starts waiting for CPU on
> the runqueue (which means it remains in the TASK_RUNNABLE state), the time
> of re-entry into the run-queue is never recorded. Due to this, the waiting time
> on the runqueue from this point of re-entry upto the next time it hits the CPU
> is not accounted for. This is solved by recording the time of re-entry of a
> process leaving the CPU in the sched_info_depart() function IF the process will
> go back to waiting on the run-queue. This IF condition is verified by checking
> whether the process is still in the TASK_RUNNABLE state.
>
> The patch was tested on 2.6.26-rc6 using two simple CPU hog programs. The
> values noted prior to the fix did not account for the time spent on the
> runqueue waiting. After the fix, the correct values were reported back
> to user space.
>
> Signed-off-by: Bharath Ravi <[email protected]>
> Signed-off-by: Madhava K R <[email protected]>

Acked-by: Peter Zijlstra <[email protected]>

> ---
> kernel/sched_stats.h | 6 ++++++
> 1 files changed, 6 insertions(+), 0 deletions(-)
>
> diff --git a/kernel/sched_stats.h b/kernel/sched_stats.h
> index a38878e..80179ef 100644
> --- a/kernel/sched_stats.h
> +++ b/kernel/sched_stats.h
> @@ -198,6 +198,9 @@ static inline void sched_info_queued(struct task_struct *t)
> /*
> * Called when a process ceases being the active-running process, either
> * voluntarily or involuntarily. Now we can calculate how long we ran.
> + * Also, if the process is still in the TASK_RUNNING state, call
> + * sched_info_queued() to mark that it has now again started waiting on
> + * the runqueue.
> */
> static inline void sched_info_depart(struct task_struct *t)
> {
> @@ -206,6 +209,9 @@ static inline void sched_info_depart(struct task_struct *t)
>
> t->sched_info.cpu_time += delta;
> rq_sched_info_depart(task_rq(t), delta);
> +
> + if (t->state == TASK_RUNNING)
> + sched_info_queued(t);
> }
>
> /*

2008-06-19 12:18:17

by Ingo Molnar

[permalink] [raw]
Subject: Re: [PATCH] Delay accounting, fix incorrect delay time when constantly waiting on runqueue


* Peter Zijlstra <[email protected]> wrote:

> > The patch was tested on 2.6.26-rc6 using two simple CPU hog
> > programs. The values noted prior to the fix did not account for the
> > time spent on the runqueue waiting. After the fix, the correct
> > values were reported back to user space.
> >
> > Signed-off-by: Bharath Ravi <[email protected]>
> > Signed-off-by: Madhava K R <[email protected]>
>
> Acked-by: Peter Zijlstra <[email protected]>

applied to tip/sched/urgent - thanks.

Ingo