2023-12-14 17:58:15

by Keisuke Nishimura

Subject: [PATCH 1/2] sched/fair: take the scheduling domain into account in select_idle_smt()

When picking out a CPU on a task wakeup, select_idle_smt() has to take
into account the scheduling domain of @target. This is because cpusets
and isolcpus can remove CPUs from the domain to isolate them from other
SMT siblings.

This fix checks if the candidate CPU is in the target scheduling domain.
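
As a hypothetical illustration (the CPU numbering below is made up, not
taken from a real report): on a 4-CPU machine with SMT sibling pairs
{0,1} and {2,3}, booting with isolcpus=3 removes CPU 3 from the
scheduling domains but leaves it in the default affinity mask of
ordinary tasks. A wakeup with @target == 2 then sees roughly:

	cpu_smt_mask(2)       = { 2, 3 }
	p->cpus_ptr           = { 0, 1, 2, 3 }
	sched_domain_span(sd) = { 0, 1, 2 }

so a scan restricted to cpu_smt_mask(target) & p->cpus_ptr alone can
return the isolated CPU 3.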

The commit df3cb4ea1fb6 ("sched/fair: Fix wrong cpu selecting from isolated
domain") originally proposed this fix by adding the check of the scheduling
domain in the loop. However, the commit 3e6efe87cd5cc ("sched/fair: Remove
redundant check in select_idle_smt()") accidentally removed the check.
This commit brings the check back with the tiny optimization of computing
the intersection of the task's CPU mask and the sched domain mask up front.

Fixes: 3e6efe87cd5c ("sched/fair: Remove redundant check in select_idle_smt()")
Signed-off-by: Keisuke Nishimura <[email protected]>
Signed-off-by: Julia Lawall <[email protected]>
---
kernel/sched/fair.c | 15 +++++++++++----
1 file changed, 11 insertions(+), 4 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index bcd0f230e21f..71306b48cf68 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -7284,11 +7284,18 @@ static int select_idle_core(struct task_struct *p, int core, struct cpumask *cpu
/*
* Scan the local SMT mask for idle CPUs.
*/
-static int select_idle_smt(struct task_struct *p, int target)
+static int select_idle_smt(struct task_struct *p, struct sched_domain *sd, int target)
{
int cpu;
+ struct cpumask *cpus = this_cpu_cpumask_var_ptr(select_rq_mask);
+
+ /*
+ * Check if a candidate cpu is in the LLC scheduling domain where target exists.
+ * Due to isolcpus and cpusets, there is no guarantee that it holds.
+ */
+ cpumask_and(cpus, sched_domain_span(sd), p->cpus_ptr);

- for_each_cpu_and(cpu, cpu_smt_mask(target), p->cpus_ptr) {
+ for_each_cpu_and(cpu, cpu_smt_mask(target), cpus) {
if (cpu == target)
continue;
if (available_idle_cpu(cpu) || sched_idle_cpu(cpu))
@@ -7314,7 +7321,7 @@ static inline int select_idle_core(struct task_struct *p, int core, struct cpuma
return __select_idle_cpu(core, p);
}

-static inline int select_idle_smt(struct task_struct *p, int target)
+static inline int select_idle_smt(struct task_struct *p, struct sched_domain *sd, int target)
{
return -1;
}
@@ -7564,7 +7571,7 @@ static int select_idle_sibling(struct task_struct *p, int prev, int target)
has_idle_core = test_idle_cores(target);

if (!has_idle_core && cpus_share_cache(prev, target)) {
- i = select_idle_smt(p, prev);
+ i = select_idle_smt(p, sd, prev);
if ((unsigned int)i < nr_cpumask_bits)
return i;
}
--
2.34.1


2023-12-14 17:58:18

by Keisuke Nishimura

Subject: [PATCH 2/2] sched/fair: take the scheduling domain into account in select_idle_core()

When picking out a CPU on a task wakeup, select_idle_core() has to take
into account the scheduling domain where the function looks for the CPU.
This is because cpusets and isolcpus can remove CPUs from the domain
to isolate them from other SMT siblings.

This change replaces the set of CPUs allowed to run the task, p->cpus_ptr,
with the intersection of p->cpus_ptr and sched_domain_span(sd), which is
already available in the cpus argument provided by select_idle_cpu().
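
For context, a rough sketch of the relevant caller-side lines in
select_idle_cpu() (abridged, not the exact upstream code):

	struct cpumask *cpus = this_cpu_cpumask_var_ptr(select_rq_mask);
	...
	cpumask_and(cpus, sched_domain_span(sd), p->cpus_ptr);
	...
	/* the precomputed intersection is handed down to select_idle_core() */
	i = select_idle_core(p, cpu, cpus, &idle_cpu);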

Fixes: 9fe1f127b913 ("sched/fair: Merge select_idle_core/cpu()")
Signed-off-by: Keisuke Nishimura <[email protected]>
Signed-off-by: Julia Lawall <[email protected]>
---
kernel/sched/fair.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 71306b48cf68..3b7d32632674 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -7262,7 +7262,7 @@ static int select_idle_core(struct task_struct *p, int core, struct cpumask *cpu
if (!available_idle_cpu(cpu)) {
idle = false;
if (*idle_cpu == -1) {
- if (sched_idle_cpu(cpu) && cpumask_test_cpu(cpu, p->cpus_ptr)) {
+ if (sched_idle_cpu(cpu) && cpumask_test_cpu(cpu, cpus)) {
*idle_cpu = cpu;
break;
}
@@ -7270,7 +7270,7 @@ static int select_idle_core(struct task_struct *p, int core, struct cpumask *cpu
}
break;
}
- if (*idle_cpu == -1 && cpumask_test_cpu(cpu, p->cpus_ptr))
+ if (*idle_cpu == -1 && cpumask_test_cpu(cpu, cpus))
*idle_cpu = cpu;
}

--
2.34.1

2023-12-15 15:19:35

by Peter Zijlstra

Subject: Re: [PATCH 1/2] sched/fair: take the scheduling domain into account in select_idle_smt()

On Thu, Dec 14, 2023 at 06:55:50PM +0100, Keisuke Nishimura wrote:
> When picking out a CPU on a task wakeup, select_idle_smt() has to take
> into account the scheduling domain of @target. This is because cpusets
> and isolcpus can remove CPUs from the domain to isolate them from other
> SMT siblings.
>
> This fix checks if the candidate CPU is in the target scheduling domain.
>
> The commit df3cb4ea1fb6 ("sched/fair: Fix wrong cpu selecting from isolated
> domain") originally proposed this fix by adding the check of the scheduling
> domain in the loop. However, the commit 3e6efe87cd5cc ("sched/fair: Remove
> redundant check in select_idle_smt()") accidentally removed the check.
> This commit brings the check back with the tiny optimization of computing
> the intersection of the task's CPU mask and the sched domain mask up front.
>
> Fixes: 3e6efe87cd5c ("sched/fair: Remove redundant check in select_idle_smt()")

Simply reverting that patch is simpler, no? That cpumask_and() is likely
more expensive than anything else that function does.
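
I.e. keep the per-CPU domain check inside the loop instead of building
a temporary mask up front; something like this (sketch, untested):

	for_each_cpu_and(cpu, cpu_smt_mask(target), p->cpus_ptr) {
		if (cpu == target)
			continue;
		/* the sibling must also be in the LLC domain of @target */
		if (!cpumask_test_cpu(cpu, sched_domain_span(sd)))
			continue;
		if (available_idle_cpu(cpu) || sched_idle_cpu(cpu))
			return cpu;
	}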

And I'm probably already in holiday mode, but I don't immediately
understand the problem: if you're doing cpusets, then the affinity in
p->cpus_ptr should never cross your set, so how can it go wrong?

Is this some isolcpus idiocy? (I so hate that option)

> Signed-off-by: Keisuke Nishimura <[email protected]>
> Signed-off-by: Julia Lawall <[email protected]>
> ---
> kernel/sched/fair.c | 15 +++++++++++----
> 1 file changed, 11 insertions(+), 4 deletions(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index bcd0f230e21f..71306b48cf68 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -7284,11 +7284,18 @@ static int select_idle_core(struct task_struct *p, int core, struct cpumask *cpu
> /*
> * Scan the local SMT mask for idle CPUs.
> */
> -static int select_idle_smt(struct task_struct *p, int target)
> +static int select_idle_smt(struct task_struct *p, struct sched_domain *sd, int target)
> {
> int cpu;
> + struct cpumask *cpus = this_cpu_cpumask_var_ptr(select_rq_mask);
> +
> + /*
> + * Check if a candidate cpu is in the LLC scheduling domain where target exists.
> + * Due to isolcpus and cpusets, there is no guarantee that it holds.
> + */
> + cpumask_and(cpus, sched_domain_span(sd), p->cpus_ptr);
>
> - for_each_cpu_and(cpu, cpu_smt_mask(target), p->cpus_ptr) {
> + for_each_cpu_and(cpu, cpu_smt_mask(target), cpus) {
> if (cpu == target)
> continue;
> if (available_idle_cpu(cpu) || sched_idle_cpu(cpu))
> @@ -7314,7 +7321,7 @@ static inline int select_idle_core(struct task_struct *p, int core, struct cpuma
> return __select_idle_cpu(core, p);
> }
>
> -static inline int select_idle_smt(struct task_struct *p, int target)
> +static inline int select_idle_smt(struct task_struct *p, struct sched_domain *sd, int target)
> {
> return -1;
> }
> @@ -7564,7 +7571,7 @@ static int select_idle_sibling(struct task_struct *p, int prev, int target)
> has_idle_core = test_idle_cores(target);
>
> if (!has_idle_core && cpus_share_cache(prev, target)) {
> - i = select_idle_smt(p, prev);
> + i = select_idle_smt(p, sd, prev);
> if ((unsigned int)i < nr_cpumask_bits)
> return i;
> }
> --
> 2.34.1
>

2023-12-15 15:23:44

by Peter Zijlstra

Subject: Re: [PATCH 2/2] sched/fair: take the scheduling domain into account in select_idle_core()

On Thu, Dec 14, 2023 at 06:55:51PM +0100, Keisuke Nishimura wrote:
> When picking out a CPU on a task wakeup, select_idle_core() has to take
> into account the scheduling domain where the function looks for the CPU.
> This is because cpusets and isolcpus can remove CPUs from the domain
> to isolate them from other SMT siblings.

Same question as before: when using cpusets, the cpu should also be unset
from p->cpus_ptr. So I'm thinking you're one of those isolcpus users I
wish would go away ;-)

> This change replaces the set of CPUs allowed to run the task, p->cpus_ptr,
> with the intersection of p->cpus_ptr and sched_domain_span(sd), which is
> already available in the cpus argument provided by select_idle_cpu().
>
> Fixes: 9fe1f127b913 ("sched/fair: Merge select_idle_core/cpu()")
> Signed-off-by: Keisuke Nishimura <[email protected]>
> Signed-off-by: Julia Lawall <[email protected]>
> ---
> kernel/sched/fair.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 71306b48cf68..3b7d32632674 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -7262,7 +7262,7 @@ static int select_idle_core(struct task_struct *p, int core, struct cpumask *cpu
> if (!available_idle_cpu(cpu)) {
> idle = false;
> if (*idle_cpu == -1) {
> - if (sched_idle_cpu(cpu) && cpumask_test_cpu(cpu, p->cpus_ptr)) {
> + if (sched_idle_cpu(cpu) && cpumask_test_cpu(cpu, cpus)) {
> *idle_cpu = cpu;
> break;
> }
> @@ -7270,7 +7270,7 @@ static int select_idle_core(struct task_struct *p, int core, struct cpumask *cpu
> }
> break;
> }
> - if (*idle_cpu == -1 && cpumask_test_cpu(cpu, p->cpus_ptr))
> + if (*idle_cpu == -1 && cpumask_test_cpu(cpu, cpus))
> *idle_cpu = cpu;
> }

Aside from that, the actual patch seems to be fine; it's just the
rationale that needs work.