Michal noted that a cpuset's effective_cpus can be a non-empy mask, but
because of the masking done with housekeeping_cpumask(HK_FLAG_DOMAIN)
further down the line, we can still end up with an empty cpumask being
passed down to partition_sched_domains_locked().
Do the proper thing and don't just check the mask is non-empty - check
that its intersection with housekeeping_cpumask(HK_FLAG_DOMAIN) is
non-empty.
Fixes: cd1cb3350561 ("sched/topology: Don't try to build empty sched domains")
Reported-by: Michal Koutný <[email protected]>
Signed-off-by: Valentin Schneider <[email protected]>
---
kernel/cgroup/cpuset.c | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
index c87ee6412b36..e4c10785dc7c 100644
--- a/kernel/cgroup/cpuset.c
+++ b/kernel/cgroup/cpuset.c
@@ -798,8 +798,14 @@ static int generate_sched_domains(cpumask_var_t **domains,
cpumask_subset(cp->cpus_allowed, top_cpuset.effective_cpus))
continue;
+ /*
+ * Skip cpusets that would lead to an empty sched domain.
+ * That could be because effective_cpus is empty, or because
+ * it's only spanning CPUs outside the housekeeping mask.
+ */
if (is_sched_load_balance(cp) &&
- !cpumask_empty(cp->effective_cpus))
+ cpumask_intersects(cp->effective_cpus,
+ housekeeping_cpumask(HK_FLAG_DOMAIN)))
csa[csn++] = cp;
/* skip @cp's subtree if not a partition root */
--
2.22.0
On 04/11/2019 00:39, Valentin Schneider wrote:
> Michal noted that a cpuset's effective_cpus can be a non-empy mask, but
> because of the masking done with housekeeping_cpumask(HK_FLAG_DOMAIN)
> further down the line, we can still end up with an empty cpumask being
> passed down to partition_sched_domains_locked().
>
> Do the proper thing and don't just check the mask is non-empty - check
> that its intersection with housekeeping_cpumask(HK_FLAG_DOMAIN) is
> non-empty.
>
> Fixes: cd1cb3350561 ("sched/topology: Don't try to build empty sched domains")
> Reported-by: Michal Koutný <[email protected]>
> Signed-off-by: Valentin Schneider <[email protected]>
> ---
Sigh, this is missing:
v2 changes:
- Fix broken diff
At least it *does* apply this time around.
> kernel/cgroup/cpuset.c | 8 +++++++-
> 1 file changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
> index c87ee6412b36..e4c10785dc7c 100644
> --- a/kernel/cgroup/cpuset.c
> +++ b/kernel/cgroup/cpuset.c
> @@ -798,8 +798,14 @@ static int generate_sched_domains(cpumask_var_t **domains,
> cpumask_subset(cp->cpus_allowed, top_cpuset.effective_cpus))
> continue;
>
> + /*
> + * Skip cpusets that would lead to an empty sched domain.
> + * That could be because effective_cpus is empty, or because
> + * it's only spanning CPUs outside the housekeeping mask.
> + */
> if (is_sched_load_balance(cp) &&
> - !cpumask_empty(cp->effective_cpus))
> + cpumask_intersects(cp->effective_cpus,
> + housekeeping_cpumask(HK_FLAG_DOMAIN)))
> csa[csn++] = cp;
>
> /* skip @cp's subtree if not a partition root */
>
On 04/11/2019 00:39, Valentin Schneider wrote:
> Michal noted that a cpuset's effective_cpus can be a non-empy mask, but
> because of the masking done with housekeeping_cpumask(HK_FLAG_DOMAIN)
> further down the line, we can still end up with an empty cpumask being
> passed down to partition_sched_domains_locked().
>
> Do the proper thing and don't just check the mask is non-empty - check
> that its intersection with housekeeping_cpumask(HK_FLAG_DOMAIN) is
> non-empty.
>
> Fixes: cd1cb3350561 ("sched/topology: Don't try to build empty sched domains")
> Reported-by: Michal Koutný <[email protected]>
Michal, could I nag you for a reviewed-by? I'd feel a bit more confident
with any sort of approval from folks who actually do use cpusets.
> Signed-off-by: Valentin Schneider <[email protected]>
Hello.
On Thu, Nov 14, 2019 at 04:03:50PM +0000, Valentin Schneider <[email protected]> wrote:
> Michal, could I nag you for a reviewed-by? I'd feel a bit more confident
> with any sort of approval from folks who actually do use cpusets.
TL;DR I played with the v5.4-rc6 _without_ this fixup and I conclude it
unnecessary (IOW my previous theoretical observation was wrong).
The original problem is non-issue with v2 cpuset controller, because
effective_cpus are never empty. isolcpus doesn't take out cpuset CPUs,
hotplug does. In the case, no online CPU remains in the cpuset, it
inherits ancestor's non-empty cpuset.
I reproduced the problem with v1 (before your fix). However, in v1
effective == allowed (we're destructive and overwrite allowed on
hotunplug) and we already check the emptiness of
cpumask_intersects(cp->cpus_allowed, housekeeping_cpumask(HK_FLAG_DOMAIN)
few lines higher. I.e. the fixup adds redundant check against the empty
sched domain production.
Sorry for the noise and HTH,
Michal
The following commit has been merged into the sched/urgent branch of tip:
Commit-ID: 48a723d23b0d957e5b5861b974864e53c6841de8
Gitweb: https://git.kernel.org/tip/48a723d23b0d957e5b5861b974864e53c6841de8
Author: Valentin Schneider <[email protected]>
AuthorDate: Mon, 04 Nov 2019 00:39:06
Committer: Ingo Molnar <[email protected]>
CommitterDate: Fri, 15 Nov 2019 11:02:20 +01:00
sched/topology, cpuset: Account for housekeeping CPUs to avoid empty cpumasks
Michal noted that a cpuset's effective_cpus can be a non-empy mask, but
because of the masking done with housekeeping_cpumask(HK_FLAG_DOMAIN)
further down the line, we can still end up with an empty cpumask being
passed down to partition_sched_domains_locked().
Do the proper thing and don't just check the mask is non-empty - check
that its intersection with housekeeping_cpumask(HK_FLAG_DOMAIN) is
non-empty.
Reported-by: Michal Koutný <[email protected]>
Signed-off-by: Valentin Schneider <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Cc: [email protected]
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Fixes: cd1cb3350561 ("sched/topology: Don't try to build empty sched domains")
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
---
kernel/cgroup/cpuset.c | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
index c87ee64..e4c1078 100644
--- a/kernel/cgroup/cpuset.c
+++ b/kernel/cgroup/cpuset.c
@@ -798,8 +798,14 @@ static int generate_sched_domains(cpumask_var_t **domains,
cpumask_subset(cp->cpus_allowed, top_cpuset.effective_cpus))
continue;
+ /*
+ * Skip cpusets that would lead to an empty sched domain.
+ * That could be because effective_cpus is empty, or because
+ * it's only spanning CPUs outside the housekeeping mask.
+ */
if (is_sched_load_balance(cp) &&
- !cpumask_empty(cp->effective_cpus))
+ cpumask_intersects(cp->effective_cpus,
+ housekeeping_cpumask(HK_FLAG_DOMAIN)))
csa[csn++] = cp;
/* skip @cp's subtree if not a partition root */
On 15/11/2019 18:13, tip-bot2 for Valentin Schneider wrote:
> The following commit has been merged into the sched/urgent branch of tip:
>
> Commit-ID: 48a723d23b0d957e5b5861b974864e53c6841de8
> Gitweb: https://git.kernel.org/tip/48a723d23b0d957e5b5861b974864e53c6841de8
> Author: Valentin Schneider <[email protected]>
> AuthorDate: Mon, 04 Nov 2019 00:39:06
> Committer: Ingo Molnar <[email protected]>
> CommitterDate: Fri, 15 Nov 2019 11:02:20 +01:00
Michal just noted this might actually *not* be required. I'm setting up some
tests on qemu right now to convince myself.
On 15/11/2019 17:18, Michal Koutn? wrote:
> Hello.
>
> On Thu, Nov 14, 2019 at 04:03:50PM +0000, Valentin Schneider <[email protected]> wrote:
>> Michal, could I nag you for a reviewed-by? I'd feel a bit more confident
>> with any sort of approval from folks who actually do use cpusets.
> TL;DR I played with the v5.4-rc6 _without_ this fixup and I conclude it
> unnecessary (IOW my previous theoretical observation was wrong).
>
Thanks for going through the trouble of testing the thing.
>
> The original problem is non-issue with v2 cpuset controller, because
> effective_cpus are never empty. isolcpus doesn't take out cpuset CPUs,
> hotplug does. In the case, no online CPU remains in the cpuset, it
> inherits ancestor's non-empty cpuset.
>
But we still take out the isolcpus from the domain span before handing it
over to the scheduler:
cpumask_or(dp, dp, b->effective_cpus);
cpumask_and(dp, dp, housekeeping_cpumask(HK_FLAG_DOMAIN));
But...
> I reproduced the problem with v1 (before your fix). However, in v1
> effective == allowed (we're destructive and overwrite allowed on
> hotunplug) and we already check the emptiness of
>
> cpumask_intersects(cp->cpus_allowed, housekeeping_cpumask(HK_FLAG_DOMAIN)
>
> few lines higher. I.e. the fixup adds redundant check against the empty
> sched domain production.
>
...You're right, I've been misreading that as a '!is_sched_load_balance()'
condition ever since. Duh. So this condition will always catch cpusets than
only span outside the housekeeping domain, and my previous fixup will
catch newly-empty cpusets (due to HP). Perhaps it would've been cleaner to
merge the two, but as things stand this patch isn't needed (as you say).
I tried this out to really be sure (8 CPU SMP aarch64 qemu target):
cd /sys/fs/cgroup/cpuset
mkdir cs1
echo 1 > cs1/cpuset.cpu_exclusive
echo 0 > cs1/cpuset.mems
echo 0-4 > cs1/cpuset.cpus
mkdir cs2
echo 1 > cs2/cpuset.cpu_exclusive
echo 0 > cs2/cpuset.mems
echo 5-7 > cs2/cpuset.cpus
echo 0 > cpuset.sched_load_balance
booted with
isolcpus=6-7
It seems that creating a cpuset with CPUs only outside the housekeeping
domain is forbidden, so I'm creating cs2 with *one* CPU in the domain. When
I hotplug it out, nothing dies horribly:
echo 0 > /sys/devices/system/cpu/cpu5/online
[ 24.688145] CPU5: shutdown
[ 24.689438] psci: CPU5 killed.
[ 24.714168] allowed=0-4 effective=0-4 housekeeping=0-5
[ 24.714642] allowed=6-7 effective=6-7 housekeeping=0-5
[ 24.715416] CPU5 attaching NULL sched-domain.
> Sorry for the noise and HTH,
Sure does, thanks!
> Michal
>