2021-01-19 08:38:09

by Dietmar Eggemann

[permalink] [raw]
Subject: [PATCH] sched/deadline: Reduce rq lock contention in dl_add_task_root_domain()

dl_add_task_root_domain() is called during sched domain rebuild:

rebuild_sched_domains_locked()
partition_and_rebuild_sched_domains()
rebuild_root_domains()
for all top_cpuset descendants:
update_tasks_root_domain()
for all tasks of cpuset:
dl_add_task_root_domain()

Change it so that only the task pi lock is taken to check if the task
has a SCHED_DEADLINE (DL) policy. In case that p is a DL task take the
rq lock as well to be able to safely de-reference root domain's DL
bandwidth structure.

Most of the tasks will have another policy (namely SCHED_NORMAL) and
can now bail without taking the rq lock.

One thing to note here: Even in case that there aren't any DL user
tasks, a slow frequency switching system with cpufreq gov schedutil has
a DL task (sugov) per frequency domain running which participates in DL
bandwidth management.

Reviewed-by: Quentin Perret <[email protected]>
Signed-off-by: Dietmar Eggemann <[email protected]>
---

The use case in which this makes a noticeable difference is Android's
'CPU pause' power management feature.

It uses CPU hotplug control to clear a CPU from active state to force
all threads which are not per-cpu kthreads away from this CPU.

Making DL bandwidth management faster during sched domain rebuild helps
to reduce the time to pause/un-pause a CPU.

kernel/sched/deadline.c | 11 +++++++----
1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index 5421782fe897..c7b1a63a053b 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -2409,9 +2409,13 @@ void dl_add_task_root_domain(struct task_struct *p)
struct rq *rq;
struct dl_bw *dl_b;

- rq = task_rq_lock(p, &rf);
- if (!dl_task(p))
- goto unlock;
+ raw_spin_lock_irqsave(&p->pi_lock, rf.flags);
+ if (!dl_task(p)) {
+ raw_spin_unlock_irqrestore(&p->pi_lock, rf.flags);
+ return;
+ }
+
+ rq = __task_rq_lock(p, &rf);

dl_b = &rq->rd->dl_bw;
raw_spin_lock(&dl_b->lock);
@@ -2420,7 +2424,6 @@ void dl_add_task_root_domain(struct task_struct *p)

raw_spin_unlock(&dl_b->lock);

-unlock:
task_rq_unlock(rq, p, &rf);
}

--
2.25.1


2021-01-19 20:57:11

by Valentin Schneider

[permalink] [raw]
Subject: Re: [PATCH] sched/deadline: Reduce rq lock contention in dl_add_task_root_domain()

On 19/01/21 09:35, Dietmar Eggemann wrote:
> dl_add_task_root_domain() is called during sched domain rebuild:
>
> rebuild_sched_domains_locked()
> partition_and_rebuild_sched_domains()
> rebuild_root_domains()
> for all top_cpuset descendants:
> update_tasks_root_domain()
> for all tasks of cpuset:
> dl_add_task_root_domain()
>
> Change it so that only the task pi lock is taken to check if the task
> has a SCHED_DEADLINE (DL) policy. In case that p is a DL task take the
> rq lock as well to be able to safely de-reference root domain's DL
> bandwidth structure.
>

A task's policy is stable under ->pi_lock, so LGTM.

Reviewed-by: Valentin Schneider <[email protected]>

> Most of the tasks will have another policy (namely SCHED_NORMAL) and
> can now bail without taking the rq lock.
>
> One thing to note here: Even in case that there aren't any DL user
> tasks, a slow frequency switching system with cpufreq gov schedutil has
> a DL task (sugov) per frequency domain running which participates in DL
> bandwidth management.
>
> Reviewed-by: Quentin Perret <[email protected]>
> Signed-off-by: Dietmar Eggemann <[email protected]>

Subject: Re: [PATCH] sched/deadline: Reduce rq lock contention in dl_add_task_root_domain()

On 1/19/21 9:35 AM, Dietmar Eggemann wrote:
> dl_add_task_root_domain() is called during sched domain rebuild:
>
> rebuild_sched_domains_locked()
> partition_and_rebuild_sched_domains()
> rebuild_root_domains()
> for all top_cpuset descendants:
> update_tasks_root_domain()
> for all tasks of cpuset:
> dl_add_task_root_domain()
>
> Change it so that only the task pi lock is taken to check if the task
> has a SCHED_DEADLINE (DL) policy. In case that p is a DL task take the
> rq lock as well to be able to safely de-reference root domain's DL
> bandwidth structure.
>
> Most of the tasks will have another policy (namely SCHED_NORMAL) and
> can now bail without taking the rq lock.
>
> One thing to note here: Even in case that there aren't any DL user
> tasks, a slow frequency switching system with cpufreq gov schedutil has
> a DL task (sugov) per frequency domain running which participates in DL
> bandwidth management.
>
> Reviewed-by: Quentin Perret <[email protected]>
> Signed-off-by: Dietmar Eggemann <[email protected]>

Reviewed-by: Daniel Bristot de Oliveira <[email protected]>

Thanks!
-- Daniel

2021-01-21 06:33:47

by Juri Lelli

[permalink] [raw]
Subject: Re: [PATCH] sched/deadline: Reduce rq lock contention in dl_add_task_root_domain()

Hi,

On 19/01/21 09:35, Dietmar Eggemann wrote:
> dl_add_task_root_domain() is called during sched domain rebuild:
>
> rebuild_sched_domains_locked()
> partition_and_rebuild_sched_domains()
> rebuild_root_domains()
> for all top_cpuset descendants:
> update_tasks_root_domain()
> for all tasks of cpuset:
> dl_add_task_root_domain()
>
> Change it so that only the task pi lock is taken to check if the task
> has a SCHED_DEADLINE (DL) policy. In case that p is a DL task take the
> rq lock as well to be able to safely de-reference root domain's DL
> bandwidth structure.
>
> Most of the tasks will have another policy (namely SCHED_NORMAL) and
> can now bail without taking the rq lock.
>
> One thing to note here: Even in case that there aren't any DL user
> tasks, a slow frequency switching system with cpufreq gov schedutil has
> a DL task (sugov) per frequency domain running which participates in DL
> bandwidth management.
>
> Reviewed-by: Quentin Perret <[email protected]>
> Signed-off-by: Dietmar Eggemann <[email protected]>

Looks good to me, thanks!

Acked-by: Juri Lelli <[email protected]>

Best,
Juri

2021-02-08 13:49:00

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH] sched/deadline: Reduce rq lock contention in dl_add_task_root_domain()

On Tue, Jan 19, 2021 at 09:35:42AM +0100, Dietmar Eggemann wrote:
> dl_add_task_root_domain() is called during sched domain rebuild:
>
> rebuild_sched_domains_locked()
> partition_and_rebuild_sched_domains()
> rebuild_root_domains()
> for all top_cpuset descendants:
> update_tasks_root_domain()
> for all tasks of cpuset:
> dl_add_task_root_domain()
>
> Change it so that only the task pi lock is taken to check if the task
> has a SCHED_DEADLINE (DL) policy. In case that p is a DL task take the
> rq lock as well to be able to safely de-reference root domain's DL
> bandwidth structure.
>
> Most of the tasks will have another policy (namely SCHED_NORMAL) and
> can now bail without taking the rq lock.
>
> One thing to note here: Even in case that there aren't any DL user
> tasks, a slow frequency switching system with cpufreq gov schedutil has
> a DL task (sugov) per frequency domain running which participates in DL
> bandwidth management.
>
> Reviewed-by: Quentin Perret <[email protected]>
> Signed-off-by: Dietmar Eggemann <[email protected]>

Thanks!

Subject: [tip: sched/core] sched/deadline: Reduce rq lock contention in dl_add_task_root_domain()

The following commit has been merged into the sched/core branch of tip:

Commit-ID: 3096b6fe494b7b4e45d20cb77aa6b715a3efe344
Gitweb: https://git.kernel.org/tip/3096b6fe494b7b4e45d20cb77aa6b715a3efe344
Author: Dietmar Eggemann <[email protected]>
AuthorDate: Tue, 19 Jan 2021 09:35:42 +01:00
Committer: Peter Zijlstra <[email protected]>
CommitterDate: Wed, 10 Feb 2021 14:44:48 +01:00

sched/deadline: Reduce rq lock contention in dl_add_task_root_domain()

dl_add_task_root_domain() is called during sched domain rebuild:

rebuild_sched_domains_locked()
partition_and_rebuild_sched_domains()
rebuild_root_domains()
for all top_cpuset descendants:
update_tasks_root_domain()
for all tasks of cpuset:
dl_add_task_root_domain()

Change it so that only the task pi lock is taken to check if the task
has a SCHED_DEADLINE (DL) policy. In case that p is a DL task take the
rq lock as well to be able to safely de-reference root domain's DL
bandwidth structure.

Most of the tasks will have another policy (namely SCHED_NORMAL) and
can now bail without taking the rq lock.

One thing to note here: Even in case that there aren't any DL user
tasks, a slow frequency switching system with cpufreq gov schedutil has
a DL task (sugov) per frequency domain running which participates in DL
bandwidth management.

Signed-off-by: Dietmar Eggemann <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Reviewed-by: Quentin Perret <[email protected]>
Reviewed-by: Valentin Schneider <[email protected]>
Reviewed-by: Daniel Bristot de Oliveira <[email protected]>
Acked-by: Juri Lelli <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
---
kernel/sched/deadline.c | 11 +++++++----
1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index 1508d12..6f37796 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -2388,9 +2388,13 @@ void dl_add_task_root_domain(struct task_struct *p)
struct rq *rq;
struct dl_bw *dl_b;

- rq = task_rq_lock(p, &rf);
- if (!dl_task(p))
- goto unlock;
+ raw_spin_lock_irqsave(&p->pi_lock, rf.flags);
+ if (!dl_task(p)) {
+ raw_spin_unlock_irqrestore(&p->pi_lock, rf.flags);
+ return;
+ }
+
+ rq = __task_rq_lock(p, &rf);

dl_b = &rq->rd->dl_bw;
raw_spin_lock(&dl_b->lock);
@@ -2399,7 +2403,6 @@ void dl_add_task_root_domain(struct task_struct *p)

raw_spin_unlock(&dl_b->lock);

-unlock:
task_rq_unlock(rq, p, &rf);
}

Subject: [tip: sched/core] sched/deadline: Reduce rq lock contention in dl_add_task_root_domain()

The following commit has been merged into the sched/core branch of tip:

Commit-ID: de40f33e788b0c016bfde512ace2f76339ef7ddb
Gitweb: https://git.kernel.org/tip/de40f33e788b0c016bfde512ace2f76339ef7ddb
Author: Dietmar Eggemann <[email protected]>
AuthorDate: Tue, 19 Jan 2021 09:35:42 +01:00
Committer: Ingo Molnar <[email protected]>
CommitterDate: Wed, 17 Feb 2021 14:12:42 +01:00

sched/deadline: Reduce rq lock contention in dl_add_task_root_domain()

dl_add_task_root_domain() is called during sched domain rebuild:

rebuild_sched_domains_locked()
partition_and_rebuild_sched_domains()
rebuild_root_domains()
for all top_cpuset descendants:
update_tasks_root_domain()
for all tasks of cpuset:
dl_add_task_root_domain()

Change it so that only the task pi lock is taken to check if the task
has a SCHED_DEADLINE (DL) policy. In case that p is a DL task take the
rq lock as well to be able to safely de-reference root domain's DL
bandwidth structure.

Most of the tasks will have another policy (namely SCHED_NORMAL) and
can now bail without taking the rq lock.

One thing to note here: Even in case that there aren't any DL user
tasks, a slow frequency switching system with cpufreq gov schedutil has
a DL task (sugov) per frequency domain running which participates in DL
bandwidth management.

Signed-off-by: Dietmar Eggemann <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
Reviewed-by: Quentin Perret <[email protected]>
Reviewed-by: Valentin Schneider <[email protected]>
Reviewed-by: Daniel Bristot de Oliveira <[email protected]>
Acked-by: Juri Lelli <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
---
kernel/sched/deadline.c | 11 +++++++----
1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index 1508d12..6f37796 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -2388,9 +2388,13 @@ void dl_add_task_root_domain(struct task_struct *p)
struct rq *rq;
struct dl_bw *dl_b;

- rq = task_rq_lock(p, &rf);
- if (!dl_task(p))
- goto unlock;
+ raw_spin_lock_irqsave(&p->pi_lock, rf.flags);
+ if (!dl_task(p)) {
+ raw_spin_unlock_irqrestore(&p->pi_lock, rf.flags);
+ return;
+ }
+
+ rq = __task_rq_lock(p, &rf);

dl_b = &rq->rd->dl_bw;
raw_spin_lock(&dl_b->lock);
@@ -2399,7 +2403,6 @@ void dl_add_task_root_domain(struct task_struct *p)

raw_spin_unlock(&dl_b->lock);

-unlock:
task_rq_unlock(rq, p, &rf);
}