2008-08-26 19:15:28

by John Blackwood

Subject: [PATCH] sched_rt_rq_enqueue() resched idle

Hi Peter,

When sysctl_sched_rt_runtime is set to something other than -1 and the
CONFIG_RT_GROUP_SCHED kernel config option is NOT enabled, we get into a
state where we see one or more CPUs idling forever even though there are
real-time tasks in their rt runqueue that are able to run (no longer
throttled).

The sequence is:

- A real-time task is running when the timer sets the rt runqueue
  to throttled, and the rt task is resched_task()ed and switched
  out, and idle is switched in since there are no non-rt tasks to
  run on that cpu.

- Eventually do_sched_rt_period_timer() runs and un-throttles
  the rt runqueue, but we just exit the timer interrupt and go back
  to executing the idle task in the idle loop forever.

If we change the sched_rt_rq_enqueue() routine to use some of the code
from the CONFIG_RT_GROUP_SCHED enabled version of this same routine and
resched_task() the currently executing task (idle in our case) if it has
a lower priority than the highest-priority rt task in the now un-throttled
runqueue, the problem is no longer observed.

Thank you for your time and consideration.

Signed-off-by: John Blackwood <[email protected]>

Index: a/kernel/sched_rt.c
===================================================================
--- a.orig/kernel/sched_rt.c 2008-08-22 15:11:31.000000000 -0400
+++ a/kernel/sched_rt.c 2008-08-22 15:12:36.000000000 -0400
@@ -193,6 +193,12 @@

static inline void sched_rt_rq_enqueue(struct rt_rq *rt_rq)
{
+	if (rt_rq->rt_nr_running) {
+		struct task_struct *curr = rq_of_rt_rq(rt_rq)->curr;
+
+		if (rt_rq->highest_prio < curr->prio)
+			resched_task(curr);
+	}
}

static inline void sched_rt_rq_dequeue(struct rt_rq *rt_rq)
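
The two-step sequence described above can be reproduced in a small user-space toy model. This is only a sketch, not kernel code: every toy_ name is hypothetical, and the model assumes a single CPU with one runnable rt task and no non-rt tasks. The resched_on_enqueue flag stands in for whether the un-throttle path requests a reschedule.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of one CPU's runqueue; all names are hypothetical. */
struct toy_rq {
	int  rt_nr_running;	/* runnable rt tasks queued on this CPU */
	bool rt_throttled;	/* rt runtime budget is exhausted */
	bool curr_is_idle;	/* the CPU is currently running idle */
	bool need_resched;	/* checked on return from interrupt */
};

/* Pick the next task: rt runs only if queued and not throttled. */
static void toy_schedule(struct toy_rq *rq)
{
	rq->need_resched = false;
	rq->curr_is_idle = !(rq->rt_nr_running > 0 && !rq->rt_throttled);
}

/* Timer finds the rt budget used up: throttle and force a switch. */
static void toy_throttle(struct toy_rq *rq)
{
	rq->rt_throttled = true;
	rq->need_resched = true;
}

/* Period timer un-throttles; optionally asks for a reschedule. */
static void toy_unthrottle(struct toy_rq *rq, bool resched_on_enqueue)
{
	rq->rt_throttled = false;
	if (resched_on_enqueue && rq->rt_nr_running)
		rq->need_resched = true;
}

/* Return from the timer interrupt: reschedule only if requested. */
static void toy_irq_exit(struct toy_rq *rq)
{
	if (rq->need_resched)
		toy_schedule(rq);
}

/* Run the sequence; returns true if the CPU is left idling. */
static bool toy_cpu_stuck_idle(bool resched_on_enqueue)
{
	struct toy_rq rq = { .rt_nr_running = 1 };

	toy_schedule(&rq);	/* the rt task is running */
	toy_throttle(&rq);
	toy_irq_exit(&rq);	/* idle is switched in */
	toy_unthrottle(&rq, resched_on_enqueue);
	toy_irq_exit(&rq);	/* back to idle unless rescheduled */
	return rq.curr_is_idle;
}
```

In this model, toy_cpu_stuck_idle(false) leaves the CPU in idle with a runnable rt task still queued, while toy_cpu_stuck_idle(true) switches the rt task back in.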


2008-08-27 08:03:58

by Peter Zijlstra

Subject: Re: [PATCH] sched_rt_rq_enqueue() resched idle

On Tue, 2008-08-26 at 15:09 -0400, John Blackwood wrote:
> Hi Peter,
>
> When sysctl_sched_rt_runtime is set to something other than -1 and the
> CONFIG_RT_GROUP_SCHED kernel config option is NOT enabled, we get into a
> state where we see one or more CPUs idling forever even though there are
> real-time tasks in their rt runqueue that are able to run (no longer
> throttled).
>
> The sequence is:
>
> - A real-time task is running when the timer sets the rt runqueue
>   to throttled, and the rt task is resched_task()ed and switched
>   out, and idle is switched in since there are no non-rt tasks to
>   run on that cpu.
>
> - Eventually do_sched_rt_period_timer() runs and un-throttles
>   the rt runqueue, but we just exit the timer interrupt and go back
>   to executing the idle task in the idle loop forever.
>
> If we change the sched_rt_rq_enqueue() routine to use some of the code
> from the CONFIG_RT_GROUP_SCHED enabled version of this same routine and
> resched_task() the currently executing task (idle in our case) if it has
> a lower priority than the highest-priority rt task in the now un-throttled
> runqueue, the problem is no longer observed.

Very good spotting, Thanks!

However I think the patch isn't quite good, as highest_prio is only
available on SMP || RT_GROUP_SCHED.

Furthermore, on !RT_GROUP_SCHED any RT task will be higher than current,
so we can do the below, do you agree?

---
diff --git a/kernel/sched_rt.c b/kernel/sched_rt.c
index 94daace..f672aee 100644
--- a/kernel/sched_rt.c
+++ b/kernel/sched_rt.c
@@ -199,6 +199,8 @@ static inline struct rt_rq *group_rt_rq(struct sched_rt_entity *rt_se)

static inline void sched_rt_rq_enqueue(struct rt_rq *rt_rq)
{
+	if (rt_rq->rt_nr_running)
+		resched_task(rq_of_rt_rq(rt_rq)->curr);
}

static inline void sched_rt_rq_dequeue(struct rt_rq *rt_rq)
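
The argument that the priority comparison can be dropped rests on the kernel's priority numbering, where a lower number means a higher priority. The sketch below is a user-space illustration only. It assumes the usual layout (rt priorities 0..99, normal tasks 100..139, idle at 140) and assumes that, while the rt runqueue is throttled without RT_GROUP_SCHED, the current task is either idle or a non-rt task. The TOY_ constants and toy_ functions are hypothetical names, not kernel symbols.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed priority layout; lower value means higher priority. */
#define TOY_MAX_RT_PRIO	100			/* rt tasks: 0..99 */
#define TOY_MAX_PRIO	(TOY_MAX_RT_PRIO + 40)	/* idle task: 140 */

/* True if prio falls in the rt range. */
static bool toy_rt_prio(int prio)
{
	return prio < TOY_MAX_RT_PRIO;
}

/* The comparison from the first patch: queued rt beats current? */
static bool toy_should_resched(int highest_rt_prio, int curr_prio)
{
	return highest_rt_prio < curr_prio;
}

/*
 * Check every rt priority against every non-rt current priority
 * (normal tasks and idle). Returns true if the comparison always
 * holds, meaning it adds nothing under the stated assumptions.
 */
static bool toy_check_is_redundant(void)
{
	for (int rt = 0; rt < TOY_MAX_RT_PRIO; rt++)
		for (int curr = TOY_MAX_RT_PRIO; curr <= TOY_MAX_PRIO; curr++)
			if (!toy_should_resched(rt, curr))
				return false;
	return true;
}
```

Under these assumptions the comparison is true for every combination, which is why testing rt_nr_running alone is enough in the !RT_GROUP_SCHED case.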

2008-08-27 13:05:47

by John Blackwood

Subject: Re: [PATCH] sched_rt_rq_enqueue() resched idle

> On Tue, 2008-08-26 at 15:09 -0400, John Blackwood wrote:
> > Hi Peter,
> >
> > When sysctl_sched_rt_runtime is set to something other than -1 and the
> > CONFIG_RT_GROUP_SCHED kernel config option is NOT enabled, we get into a
> > state where we see one or more CPUs idling forever even though there are
> > real-time tasks in their rt runqueue that are able to run (no longer
> > throttled).
> >
> > The sequence is:
> >
> > - A real-time task is running when the timer sets the rt runqueue
> >   to throttled, and the rt task is resched_task()ed and switched
> >   out, and idle is switched in since there are no non-rt tasks to
> >   run on that cpu.
> >
> > - Eventually do_sched_rt_period_timer() runs and un-throttles
> >   the rt runqueue, but we just exit the timer interrupt and go back
> >   to executing the idle task in the idle loop forever.
> >
> > If we change the sched_rt_rq_enqueue() routine to use some of the code
> > from the CONFIG_RT_GROUP_SCHED enabled version of this same routine and
> > resched_task() the currently executing task (idle in our case) if it has
> > a lower priority than the highest-priority rt task in the now un-throttled
> > runqueue, the problem is no longer observed.
>
> Very good spotting, Thanks!

You're welcome.

> However I think the patch isn't quite good, as highest_prio is only
> available on SMP || RT_GROUP_SCHED.
>
> Furthermore, on !RT_GROUP_SCHED any RT task will be higher than current,
> so we can do the below, do you agree?

Yes, I see what you are saying.
The patch version below looks good.
I re-tested with it and it works fine.

Thanks!

> diff --git a/kernel/sched_rt.c b/kernel/sched_rt.c
> index 94daace..f672aee 100644
> --- a/kernel/sched_rt.c
> +++ b/kernel/sched_rt.c
> @@ -199,6 +199,8 @@ static inline struct rt_rq *group_rt_rq(struct sched_rt_entity *rt_se)
>
> static inline void sched_rt_rq_enqueue(struct rt_rq *rt_rq)
> {
> +	if (rt_rq->rt_nr_running)
> +		resched_task(rq_of_rt_rq(rt_rq)->curr);
> }
>
> static inline void sched_rt_rq_dequeue(struct rt_rq *rt_rq)
>
>

2008-08-27 13:19:33

by Peter Zijlstra

Subject: Re: [PATCH] sched_rt_rq_enqueue() resched idle

Ingo, please apply.


---
Subject: sched: sched_rt_rq_enqueue() resched idle
From: John Blackwood <[email protected]>
Date: Tue, 26 Aug 2008 15:09:43 -0400

When sysctl_sched_rt_runtime is set to something other than -1 and the
CONFIG_RT_GROUP_SCHED kernel config option is NOT enabled, we get into a
state where we see one or more CPUs idling forever even though there are
real-time tasks in their rt runqueue that are able to run (no longer
throttled).

The sequence is:

- A real-time task is running when the timer sets the rt runqueue
  to throttled, and the rt task is resched_task()ed and switched
  out, and idle is switched in since there are no non-rt tasks to
  run on that cpu.

- Eventually do_sched_rt_period_timer() runs and un-throttles
  the rt runqueue, but we just exit the timer interrupt and go back
  to executing the idle task in the idle loop forever.

If we change the sched_rt_rq_enqueue() routine to use some of the code
from the CONFIG_RT_GROUP_SCHED enabled version of this same routine and
resched_task() the currently executing task (idle in our case) if it has
a lower priority than the highest-priority rt task in the now un-throttled
runqueue, the problem is no longer observed.

Signed-off-by: John Blackwood <[email protected]>
Signed-off-by: Peter Zijlstra <[email protected]>
---
kernel/sched_rt.c | 2 ++
1 file changed, 2 insertions(+)

Index: linux-2.6/kernel/sched_rt.c
===================================================================
--- linux-2.6.orig/kernel/sched_rt.c
+++ linux-2.6/kernel/sched_rt.c
@@ -199,6 +199,8 @@ static inline struct rt_rq *group_rt_rq(

static inline void sched_rt_rq_enqueue(struct rt_rq *rt_rq)
{
+	if (rt_rq->rt_nr_running)
+		resched_task(rq_of_rt_rq(rt_rq)->curr);
}

static inline void sched_rt_rq_dequeue(struct rt_rq *rt_rq)