2013-04-27 10:00:52

by Zefan Li

Subject: [PATCH 1/2] cpuset: use rebuild_sched_domains() in cpuset_hotplug_workfn()

From: Li Zhong <[email protected]>

In cpuset_hotplug_workfn(), partition_sched_domains() is called without
the hotplug lock held, even though holding it is required (as stated in
the comment header of partition_sched_domains()).

This patch uses rebuild_sched_domains() instead, which fixes the
locking and also makes the code a little simpler.

Signed-off-by: Li Zhong <[email protected]>
Signed-off-by: Li Zefan <[email protected]>
---
kernel/cpuset.c | 13 ++-----------
1 file changed, 2 insertions(+), 11 deletions(-)

diff --git a/kernel/cpuset.c b/kernel/cpuset.c
index 943968d..b0f18ba 100644
--- a/kernel/cpuset.c
+++ b/kernel/cpuset.c
@@ -2184,17 +2184,8 @@ static void cpuset_hotplug_workfn(struct work_struct *work)
 	flush_workqueue(cpuset_propagate_hotplug_wq);
 
 	/* rebuild sched domains if cpus_allowed has changed */
-	if (cpus_updated) {
-		struct sched_domain_attr *attr;
-		cpumask_var_t *doms;
-		int ndoms;
-
-		mutex_lock(&cpuset_mutex);
-		ndoms = generate_sched_domains(&doms, &attr);
-		mutex_unlock(&cpuset_mutex);
-
-		partition_sched_domains(ndoms, doms, attr);
-	}
+	if (cpus_updated)
+		rebuild_sched_domains();
 }
 
 void cpuset_update_active_cpus(bool cpu_online)
--
1.8.0.2
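
For context: rebuild_sched_domains() is a thin wrapper around
rebuild_sched_domains_locked(). The sketch below is a simplified
reconstruction, not part of the patch; it shows how the wrapper supplies
the locking that the open-coded block above lacked (the body of
rebuild_sched_domains_locked() is visible in the context of patch 2/2).

void rebuild_sched_domains(void)
{
	/* same mutex the open-coded version took */
	mutex_lock(&cpuset_mutex);
	/*
	 * rebuild_sched_domains_locked() wraps get_online_cpus() around
	 * generate_sched_domains() + partition_sched_domains(), i.e. it
	 * takes the hotplug lock the open-coded version was missing.
	 */
	rebuild_sched_domains_locked();
	mutex_unlock(&cpuset_mutex);
}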


2013-04-27 09:55:55

by Zefan Li

Subject: [PATCH 2/2] cpuset: fix cpu hotplug vs rebuild_sched_domains() race

rebuild_sched_domains() might pass doms with an offlined cpu to
partition_sched_domains(), which results in an oops:

general protection fault: 0000 [#1] SMP
...
RIP: 0010:[<ffffffff81077a1e>] [<ffffffff81077a1e>] get_group+0x6e/0x90
...
Call Trace:
[<ffffffff8107f07c>] build_sched_domains+0x70c/0xcb0
[<ffffffff8107f2a7>] ? build_sched_domains+0x937/0xcb0
[<ffffffff81173f64>] ? kfree+0xe4/0x1b0
[<ffffffff8107f6e0>] ? partition_sched_domains+0xc0/0x470
[<ffffffff8107f905>] partition_sched_domains+0x2e5/0x470
[<ffffffff8107f6e0>] ? partition_sched_domains+0xc0/0x470
[<ffffffff810c9007>] ? generate_sched_domains+0xc7/0x530
[<ffffffff810c94a8>] rebuild_sched_domains_locked+0x38/0x70
[<ffffffff810cb4a4>] cpuset_write_resmask+0x1a4/0x500
[<ffffffff810c8700>] ? cpuset_mount+0xe0/0xe0
[<ffffffff810c7f50>] ? cpuset_read_u64+0x100/0x100
[<ffffffff810be890>] ? cgroup_iter_next+0x90/0x90
[<ffffffff810cb300>] ? cpuset_css_offline+0x70/0x70
[<ffffffff810c1a73>] cgroup_file_write+0x133/0x2e0
[<ffffffff8118995b>] vfs_write+0xcb/0x130
[<ffffffff8118a174>] sys_write+0x64/0xa0

Reported-by: Li Zhong <[email protected]>
Signed-off-by: Li Zefan <[email protected]>
---
kernel/cpuset.c | 10 +++++++++-
1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/kernel/cpuset.c b/kernel/cpuset.c
index b0f18ba..ef05901 100644
--- a/kernel/cpuset.c
+++ b/kernel/cpuset.c
@@ -769,12 +769,20 @@ static void rebuild_sched_domains_locked(void)
 	lockdep_assert_held(&cpuset_mutex);
 	get_online_cpus();
 
+	/*
+	 * We have raced with CPU hotplug. Don't do anything to avoid
+	 * passing doms with offlined cpu to partition_sched_domains().
+	 * Anyways, hotplug work item will rebuild sched domains.
+	 */
+	if (!cpumask_equal(top_cpuset.cpus_allowed, cpu_active_mask))
+		goto out;
+
 	/* Generate domain masks and attrs */
 	ndoms = generate_sched_domains(&doms, &attr);
 
 	/* Have scheduler rebuild the domains */
 	partition_sched_domains(ndoms, doms, attr);
-
+out:
 	put_online_cpus();
 }
 #else /* !CONFIG_SMP */
--
1.8.0.2
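
For readability, here is rebuild_sched_domains_locked() as it reads with
this patch applied, reconstructed by applying the hunk above to its
context (the local declarations are not shown in the hunk and are
inferred from the identifiers it uses):

static void rebuild_sched_domains_locked(void)
{
	struct sched_domain_attr *attr;
	cpumask_var_t *doms;
	int ndoms;

	lockdep_assert_held(&cpuset_mutex);
	get_online_cpus();

	/*
	 * We have raced with CPU hotplug. Don't do anything to avoid
	 * passing doms with offlined cpu to partition_sched_domains().
	 * Anyways, hotplug work item will rebuild sched domains.
	 */
	if (!cpumask_equal(top_cpuset.cpus_allowed, cpu_active_mask))
		goto out;

	/* Generate domain masks and attrs */
	ndoms = generate_sched_domains(&doms, &attr);

	/* Have scheduler rebuild the domains */
	partition_sched_domains(ndoms, doms, attr);
out:
	put_online_cpus();
}

The race being closed: get_online_cpus() only blocks new hotplug
operations; it does not make top_cpuset.cpus_allowed current, because
that mask is updated by the asynchronous hotplug work item. Between a
CPU going offline and the work item running, generate_sched_domains()
would build doms from the stale mask, so the function now bails out and
lets the work item rebuild the sched domains, as the added comment says.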

2013-04-27 14:23:31

by Tejun Heo

Subject: Re: [PATCH 2/2] cpuset: fix cpu hotplug vs rebuild_sched_domains() race

On Sat, Apr 27, 2013 at 05:53:48PM +0800, Li Zefan wrote:
> rebuild_sched_domains() might pass doms with an offlined cpu to
> partition_sched_domains(), which results in an oops:
>
> general protection fault: 0000 [#1] SMP
> ...
> RIP: 0010:[<ffffffff81077a1e>] [<ffffffff81077a1e>] get_group+0x6e/0x90
> ...
> Call Trace:
> [<ffffffff8107f07c>] build_sched_domains+0x70c/0xcb0
> [<ffffffff8107f2a7>] ? build_sched_domains+0x937/0xcb0
> [<ffffffff81173f64>] ? kfree+0xe4/0x1b0
> [<ffffffff8107f6e0>] ? partition_sched_domains+0xc0/0x470
> [<ffffffff8107f905>] partition_sched_domains+0x2e5/0x470
> [<ffffffff8107f6e0>] ? partition_sched_domains+0xc0/0x470
> [<ffffffff810c9007>] ? generate_sched_domains+0xc7/0x530
> [<ffffffff810c94a8>] rebuild_sched_domains_locked+0x38/0x70
> [<ffffffff810cb4a4>] cpuset_write_resmask+0x1a4/0x500
> [<ffffffff810c8700>] ? cpuset_mount+0xe0/0xe0
> [<ffffffff810c7f50>] ? cpuset_read_u64+0x100/0x100
> [<ffffffff810be890>] ? cgroup_iter_next+0x90/0x90
> [<ffffffff810cb300>] ? cpuset_css_offline+0x70/0x70
> [<ffffffff810c1a73>] cgroup_file_write+0x133/0x2e0
> [<ffffffff8118995b>] vfs_write+0xcb/0x130
> [<ffffffff8118a174>] sys_write+0x64/0xa0
>
> Reported-by: Li Zhong <[email protected]>
> Signed-off-by: Li Zefan <[email protected]>

Applied 1-2 to cgroup/for-3.10.

Thanks.

--
tejun