2015-04-30 17:07:32

by Xunlei Pang

Subject: [PATCH 1/2] sched/rt: Check to push task away when its affinity is changed

From: Xunlei Pang <[email protected]>

We may suffer from an extra rt-overloaded rq due to task affinity, so
when the affinity of any runnable rt task is changed, we should check
whether to trigger balancing; otherwise it will cause unnecessary
delay in real-time response. Unfortunately, the current RT global
scheduler does nothing about this.

For example: take a 2-cpu system with two runnable FIFO tasks (same
rt_priority) bound to CPU0; let's name them rt1 (running) and rt2
(runnable) respectively. CPU1 has no RT tasks. Then someone sets the
affinity of rt2 to 0x3 (i.e. CPU0 and CPU1), but even after this rt2
still can't be scheduled until rt1 enters schedule(); this definitely
causes some/big response latency for rt2.
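
For illustration only, the affinity change in this example could come
from an ordinary sched_setaffinity(2) call; here is a minimal
userspace sketch, where rt2_pid is a hypothetical placeholder for the
pid of rt2:

#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <sys/types.h>

/* Widen rt2's affinity from CPU0 only to CPU0|CPU1 (mask 0x3).
 * On the kernel side this call ends up in do_set_cpus_allowed(). */
static void widen_rt2_affinity(pid_t rt2_pid)
{
	cpu_set_t mask;

	CPU_ZERO(&mask);
	CPU_SET(0, &mask);	/* CPU0 */
	CPU_SET(1, &mask);	/* CPU1 */

	if (sched_setaffinity(rt2_pid, sizeof(mask), &mask) == -1)
		perror("sched_setaffinity");
}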

This patch introduces a new sched_class::post_set_cpus_allowed() hook
for RT, called after set_cpus_allowed_rt(). In this new function, if
the task is runnable but not running, we try to push it away now that
it has become migratable.

Signed-off-by: Xunlei Pang <[email protected]>
---
kernel/sched/core.c | 3 +++
kernel/sched/rt.c | 17 +++++++++++++++++
kernel/sched/sched.h | 1 +
3 files changed, 21 insertions(+)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index d13fc13..64a1603 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -4773,6 +4773,9 @@ void do_set_cpus_allowed(struct task_struct *p, const struct cpumask *new_mask)

cpumask_copy(&p->cpus_allowed, new_mask);
p->nr_cpus_allowed = cpumask_weight(new_mask);
+
+ if (p->sched_class->post_set_cpus_allowed)
+ p->sched_class->post_set_cpus_allowed(p);
}

/*
diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
index cc49a7c..a9d33a3 100644
--- a/kernel/sched/rt.c
+++ b/kernel/sched/rt.c
@@ -2281,6 +2281,22 @@ static void set_cpus_allowed_rt(struct task_struct *p,
update_rt_migration(&rq->rt);
}

+static void post_set_cpus_allowed_rt(struct task_struct *p)
+{
+ struct rq *rq;
+
+ if (!task_on_rq_queued(p))
+ return;
+
+ rq = task_rq(p);
+ if (!task_running(rq, p) &&
+ p->nr_cpus_allowed > 1 &&
+ !test_tsk_need_resched(rq->curr) &&
+ cpupri_find(&rq->rd->cpupri, p, NULL)) {
+ push_rt_tasks(rq);
+ }
+}
+
/* Assumes rq->lock is held */
static void rq_online_rt(struct rq *rq)
{
@@ -2495,6 +2511,7 @@ const struct sched_class rt_sched_class = {
.select_task_rq = select_task_rq_rt,

.set_cpus_allowed = set_cpus_allowed_rt,
+ .post_set_cpus_allowed = post_set_cpus_allowed_rt,
.rq_online = rq_online_rt,
.rq_offline = rq_offline_rt,
.post_schedule = post_schedule_rt,
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index e0e1299..6f90645 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -1191,6 +1191,7 @@ struct sched_class {

void (*set_cpus_allowed)(struct task_struct *p,
const struct cpumask *newmask);
+ void (*post_set_cpus_allowed)(struct task_struct *p);

void (*rq_online)(struct rq *rq);
void (*rq_offline)(struct rq *rq);
--
1.9.1


2015-04-30 17:05:29

by Xunlei Pang

Subject: [PATCH 2/2] sched/rt: Optimize task_woken_rt()

From: Xunlei Pang <[email protected]>

- Remove "has_pushable_tasks(rq)" condition, because for queued p,
"!task_running(rq, p)" and "p->nr_cpus_allowed > 1" implies true
"has_pushable_tasks(rq)".

- Remove "!test_tsk_need_resched(rq->curr)" condition, because
the flag might be set right before the waking up, but we still
need to push equal or lower priority tasks, it should be removed.
Without this condition, we actually get the right logic.
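
For reference, with both conditions gone the wakeup check reduces to
the sketch below (reconstructed from the hunk that follows; the tail
of the condition and the push_rt_tasks() call sit outside the diff
context and are filled in from the discussion later in this thread):

static void task_woken_rt(struct rq *rq, struct task_struct *p)
{
	if (!task_running(rq, p) &&
	    p->nr_cpus_allowed > 1 &&
	    (dl_task(rq->curr) || rt_task(rq->curr)) &&
	    (rq->curr->nr_cpus_allowed < 2 ||
	     rq->curr->prio <= p->prio))
		push_rt_tasks(rq);
}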

Signed-off-by: Xunlei Pang <[email protected]>
---
kernel/sched/rt.c | 2 --
1 file changed, 2 deletions(-)

diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
index a9d33a3..9d735da 100644
--- a/kernel/sched/rt.c
+++ b/kernel/sched/rt.c
@@ -2233,8 +2233,6 @@ out:
static void task_woken_rt(struct rq *rq, struct task_struct *p)
{
if (!task_running(rq, p) &&
- !test_tsk_need_resched(rq->curr) &&
- has_pushable_tasks(rq) &&
p->nr_cpus_allowed > 1 &&
(dl_task(rq->curr) || rt_task(rq->curr)) &&
(rq->curr->nr_cpus_allowed < 2 ||
--
1.9.1

2015-04-30 17:01:51

by Steven Rostedt

Subject: Re: [PATCH 2/2] sched/rt: Optimize task_woken_rt()

On Fri, 1 May 2015 00:33:18 +0800
Xunlei Pang <[email protected]> wrote:

> From: Xunlei Pang <[email protected]>
>
> - Remove "has_pushable_tasks(rq)" condition, because for queued p,
> "!task_running(rq, p)" and "p->nr_cpus_allowed > 1" implies true
> "has_pushable_tasks(rq)".

This makes sense.

>
> - Remove "!test_tsk_need_resched(rq->curr)" condition, because
> the flag might be set right before the waking up, but we still
> need to push equal or lower priority tasks, it should be removed.
> Without this condition, we actually get the right logic.

But doesn't that happen when we schedule?

-- Steve

>
> Signed-off-by: Xunlei Pang <[email protected]>
> ---
> kernel/sched/rt.c | 2 --
> 1 file changed, 2 deletions(-)
>
> diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
> index a9d33a3..9d735da 100644
> --- a/kernel/sched/rt.c
> +++ b/kernel/sched/rt.c
> @@ -2233,8 +2233,6 @@ out:
> static void task_woken_rt(struct rq *rq, struct task_struct *p)
> {
> if (!task_running(rq, p) &&
> - !test_tsk_need_resched(rq->curr) &&
> - has_pushable_tasks(rq) &&
> p->nr_cpus_allowed > 1 &&
> (dl_task(rq->curr) || rt_task(rq->curr)) &&
> (rq->curr->nr_cpus_allowed < 2 ||

2015-04-30 17:05:51

by Steven Rostedt

Subject: Re: [PATCH 1/2] sched/rt: Check to push task away when its affinity is changed

On Fri, 1 May 2015 00:33:17 +0800
Xunlei Pang <[email protected]> wrote:

> From: Xunlei Pang <[email protected]>
>
> We may suffer from an extra rt-overloaded rq due to task affinity, so
> when the affinity of any runnable rt task is changed, we should check
> whether to trigger balancing; otherwise it will cause unnecessary
> delay in real-time response. Unfortunately, the current RT global
> scheduler does nothing about this.
>
> For example: take a 2-cpu system with two runnable FIFO tasks (same
> rt_priority) bound to CPU0; let's name them rt1 (running) and rt2
> (runnable) respectively. CPU1 has no RT tasks. Then someone sets the
> affinity of rt2 to 0x3 (i.e. CPU0 and CPU1), but even after this rt2
> still can't be scheduled until rt1 enters schedule(); this definitely
> causes some/big response latency for rt2.
>
> This patch introduces a new sched_class::post_set_cpus_allowed() hook
> for RT, called after set_cpus_allowed_rt(). In this new function, if
> the task is runnable but not running, we try to push it away now that
> it has become migratable.
>
> Signed-off-by: Xunlei Pang <[email protected]>
> ---
> kernel/sched/core.c | 3 +++
> kernel/sched/rt.c | 17 +++++++++++++++++
> kernel/sched/sched.h | 1 +
> 3 files changed, 21 insertions(+)
>
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index d13fc13..64a1603 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -4773,6 +4773,9 @@ void do_set_cpus_allowed(struct task_struct *p, const struct cpumask *new_mask)
>
> cpumask_copy(&p->cpus_allowed, new_mask);
> p->nr_cpus_allowed = cpumask_weight(new_mask);
> +
> + if (p->sched_class->post_set_cpus_allowed)
> + p->sched_class->post_set_cpus_allowed(p);
> }
>
> /*
> diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
> index cc49a7c..a9d33a3 100644
> --- a/kernel/sched/rt.c
> +++ b/kernel/sched/rt.c
> @@ -2281,6 +2281,22 @@ static void set_cpus_allowed_rt(struct task_struct *p,
> update_rt_migration(&rq->rt);
> }
>
> +static void post_set_cpus_allowed_rt(struct task_struct *p)
> +{
> + struct rq *rq;
> +
> + if (!task_on_rq_queued(p))
> + return;
> +
> + rq = task_rq(p);
> + if (!task_running(rq, p) &&
> + p->nr_cpus_allowed > 1 &&
> + !test_tsk_need_resched(rq->curr) &&
> + cpupri_find(&rq->rd->cpupri, p, NULL)) {

I don't think we need cpupri_find() call here. It's done in
push_rt_tasks() and doing it twice is just a wasted effort.

> + push_rt_tasks(rq);
> + }
> +}
> +
> /* Assumes rq->lock is held */
> static void rq_online_rt(struct rq *rq)
> {
> @@ -2495,6 +2511,7 @@ const struct sched_class rt_sched_class = {
> .select_task_rq = select_task_rq_rt,
>
> .set_cpus_allowed = set_cpus_allowed_rt,
> + .post_set_cpus_allowed = post_set_cpus_allowed_rt,
> .rq_online = rq_online_rt,
> .rq_offline = rq_offline_rt,
> .post_schedule = post_schedule_rt,
> diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
> index e0e1299..6f90645 100644
> --- a/kernel/sched/sched.h
> +++ b/kernel/sched/sched.h
> @@ -1191,6 +1191,7 @@ struct sched_class {
>
> void (*set_cpus_allowed)(struct task_struct *p,
> const struct cpumask *newmask);
> + void (*post_set_cpus_allowed)(struct task_struct *p);
>
> void (*rq_online)(struct rq *rq);
> void (*rq_offline)(struct rq *rq);

2015-05-01 02:51:59

by Steven Rostedt

Subject: Re: [PATCH 2/2] sched/rt: Optimize task_woken_rt()

On Fri, 1 May 2015 10:02:47 +0800
[email protected] wrote:

> > >
> > > - Remove the "!test_tsk_need_resched(rq->curr)" condition, because
> > > the flag might be set right before the wakeup, yet we still need
> > > to push equal- or lower-priority tasks; without this condition we
> > > actually get the right logic.
> >
> > But doesn't that happen when we schedule?
>
> It does, but will have some latency.

What latency? The need_resched flag is set, that means as soon as this
CPU is in a preemptable situation, it will schedule. The most common
case would be return from interrupt, as interrupts or softirqs are
usually what trigger the wakeups.

But if we do it here, the need resched flag could be set because
another higher priority task is about to preempt the current one that
is higher than what just woke up. So we move it now to another CPU, and
then on return of the interrupt we schedule. Then the current running
task gets preempted by the higher priority task and it too can migrate
to the CPU we just pushed the other one to, and it still doesn't run,
but yet it got moved for no reason at all.

I'd feel better if, when the need resched flag is set, we wait till a
schedule happens to see where everything is about to be moved.


>
> Still "rq->curr->prio <= p->prio" will be enough for us to ensure the
> proper
> push_rt_tasks() without this condition.

I have no idea what the above means.

>
> Besides that, for "rq->curr->nr_cpus_allowed < 2", I noticed it was
> introduced by commit b3bc211cfe7d5fe94b, but with
> "!test_tsk_need_resched(rq->curr)", it actually can't be satisfied.

What can't be satisfied?

>
> So, I think this condition should be removed.

I'm still not convinced.

-- Steve