Hello,
This series contains assorted cleanups which also prepare for the
planned migration taskset handling update.
This patchset contains the following sixteen patches.
0001-cgroup-disallow-xattr-release_agent-and-name-if-sane.patch
0002-cgroup-drop-CGRP_ROOT_SUBSYS_BOUND.patch
0003-cgroup-enable-task_cg_lists-on-the-first-cgroup-moun.patch
0004-cgroup-relocate-cgroup_enable_task_cg_lists.patch
0005-cgroup-implement-cgroup_has_tasks-and-unexport-cgrou.patch
0006-cgroup-reimplement-cgroup_transfer_tasks-without-usi.patch
0007-cgroup-make-css_set_lock-a-rwsem-and-rename-it-to-cs.patch
0008-cpuset-use-css_task_iter_start-next-end-instead-of-c.patch
0009-cgroup-remove-css_scan_tasks.patch
0010-cgroup-separate-out-put_css_set_locked-and-remove-pu.patch
0011-cgroup-move-css_set_rwsem-locking-outside-of-cgroup_.patch
0012-cgroup-drop-skip_css-from-cgroup_taskset_for_each.patch
0013-cpuset-don-t-use-cgroup_taskset_cur_css.patch
0014-cgroup-remove-cgroup_taskset_cur_css-and-cgroup_task.patch
0015-cgroup-cosmetic-updates-to-cgroup_attach_task.patch
0016-cgroup-unexport-functions.patch
The notable ones are:
0003-0004 move task_cg_lists enabling to the first cgroup mount instead
of the first css task iteration.
0005-0009 make css_set_lock a rwsem so that css_task_iter allows
blocking during iteration, and remove css_scan_tasks().
0010-0015 clean up the migration path to prepare for the planned
migration taskset handling update.
This patchset is on top of
cgroup/for-3.15 f7cef064aa01 ("Merge branch 'driver-core-next' into cgroup/for-3.15")
+ [1] [PATCHSET v2 cgroup/for-3.15] cgroup: convert to kernfs
+ [2] [PATCHSET v2 cgroup/for-3.15] cgroup: cleanups after kernfs conversion
and also available in the following git branch.
git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup.git review-moar-cleanups
diffstat follows.
block/blk-cgroup.c | 2
include/linux/cgroup.h | 33 --
kernel/cgroup.c | 579 ++++++++++++++-----------------------------
kernel/cgroup_freezer.c | 2
kernel/cpuset.c | 201 ++++----------
kernel/events/core.c | 2
kernel/sched/core.c | 4
mm/memcontrol.c | 4
net/core/netclassid_cgroup.c | 2
net/core/netprio_cgroup.c | 2
10 files changed, 278 insertions(+), 553 deletions(-)
Thanks.
--
tejun
[1] http://lkml.kernel.org/g/[email protected]
[2] http://lkml.kernel.org/g/[email protected]
Disallow more mount options if sane_behavior is specified. Note that
"xattr" used to only generate a warning.
While at it, simplify the option check in cgroup_mount() and update the
sane_behavior comment in cgroup.h.
Signed-off-by: Tejun Heo <[email protected]>
---
include/linux/cgroup.h | 6 +++---
kernel/cgroup.c | 14 ++++----------
2 files changed, 7 insertions(+), 13 deletions(-)
diff --git a/include/linux/cgroup.h b/include/linux/cgroup.h
index 5f2c629..fa415a8 100644
--- a/include/linux/cgroup.h
+++ b/include/linux/cgroup.h
@@ -225,8 +225,8 @@ enum {
*
* The followings are the behaviors currently affected this flag.
*
- * - Mount options "noprefix" and "clone_children" are disallowed.
- * Also, cgroupfs file cgroup.clone_children is not created.
+ * - Mount options "noprefix", "xattr", "clone_children",
+ * "release_agent" and "name" are disallowed.
*
* - When mounting an existing superblock, mount options should
* match.
@@ -244,7 +244,7 @@ enum {
* - "release_agent" and "notify_on_release" are removed.
* Replacement notification mechanism will be implemented.
*
- * - "xattr" mount option is deprecated. kernfs always enables it.
+ * - "cgroup.clone_children" is removed.
*
* - cpuset: tasks will be kept in empty cpusets when hotplug happens
* and take masks of ancestors with non-empty cpus/mems, instead of
diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index 4c53e90..47160ce 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -1224,18 +1224,12 @@ static int parse_cgroupfs_options(char *data, struct cgroup_sb_opts *opts)
if (opts->flags & CGRP_ROOT_SANE_BEHAVIOR) {
pr_warning("cgroup: sane_behavior: this is still under development and its behaviors will change, proceed at your own risk\n");
- if (opts->flags & CGRP_ROOT_NOPREFIX) {
- pr_err("cgroup: sane_behavior: noprefix is not allowed\n");
+ if ((opts->flags & (CGRP_ROOT_NOPREFIX | CGRP_ROOT_XATTR)) ||
+ opts->cpuset_clone_children || opts->release_agent ||
+ opts->name) {
+ pr_err("cgroup: sane_behavior: noprefix, xattr, clone_children, release_agent and name are not allowed\n");
return -EINVAL;
}
-
- if (opts->cpuset_clone_children) {
- pr_err("cgroup: sane_behavior: clone_children is not allowed\n");
- return -EINVAL;
- }
-
- if (opts->flags & CGRP_ROOT_XATTR)
- pr_warning("cgroup: sane_behavior: xattr is always available, flag unnecessary\n");
}
/*
--
1.8.5.3
cgroup_task_count() read-locks css_set_lock and walks the cgroup's
css_set links to count tasks and then returns the result. The only
thing all of its users want is to determine whether the cgroup is empty
or not. This patch implements cgroup_has_tasks(), which tests whether
cgroup->cset_links is empty, replaces all cgroup_task_count() usages,
and unexports it.
Note that the test isn't synchronized. This is the same as before; the
test has always been racy.
This will help the planned css_set locking update.
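As an illustration of the hint-only semantics, a controller-side check
could look like the sketch below; my_css_is_busy() is a hypothetical
helper used only to show the intended usage, not code added by this
patch.

  static bool my_css_is_busy(struct cgroup_subsys_state *css)
  {
          /*
           * cgroup_has_tasks() is an unsynchronized hint: a task may be
           * migrating in or out concurrently, so a caller that needs a
           * stable answer must pair it with its own serialization, as
           * memcg does with memcg_create_mutex below.
           */
          return cgroup_has_tasks(css->cgroup);
  }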
Signed-off-by: Tejun Heo <[email protected]>
Cc: Li Zefan <[email protected]>
Cc: Johannes Weiner <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Balbir Singh <[email protected]>
Cc: KAMEZAWA Hiroyuki <[email protected]>
---
include/linux/cgroup.h | 8 ++++++--
kernel/cgroup.c | 2 +-
kernel/cpuset.c | 2 +-
mm/memcontrol.c | 4 ++--
4 files changed, 10 insertions(+), 6 deletions(-)
diff --git a/include/linux/cgroup.h b/include/linux/cgroup.h
index 8ca31c1..f173cfb 100644
--- a/include/linux/cgroup.h
+++ b/include/linux/cgroup.h
@@ -455,6 +455,12 @@ static inline bool cgroup_sane_behavior(const struct cgroup *cgrp)
return cgrp->root->flags & CGRP_ROOT_SANE_BEHAVIOR;
}
+/* no synchronization, the result can only be used as a hint */
+static inline bool cgroup_has_tasks(struct cgroup *cgrp)
+{
+ return !list_empty(&cgrp->cset_links);
+}
+
/* returns ino associated with a cgroup, 0 indicates unmounted root */
static inline ino_t cgroup_ino(struct cgroup *cgrp)
{
@@ -514,8 +520,6 @@ int cgroup_rm_cftypes(struct cftype *cfts);
bool cgroup_is_descendant(struct cgroup *cgrp, struct cgroup *ancestor);
-int cgroup_task_count(const struct cgroup *cgrp);
-
/*
* Control Group taskset, used to pass around set of tasks to cgroup_subsys
* methods.
diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index 1ebc6e30..96a3a85 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -2390,7 +2390,7 @@ EXPORT_SYMBOL_GPL(cgroup_add_cftypes);
*
* Return the number of tasks in the cgroup.
*/
-int cgroup_task_count(const struct cgroup *cgrp)
+static int cgroup_task_count(const struct cgroup *cgrp)
{
int count = 0;
struct cgrp_cset_link *link;
diff --git a/kernel/cpuset.c b/kernel/cpuset.c
index e97a6e8..ae190b0 100644
--- a/kernel/cpuset.c
+++ b/kernel/cpuset.c
@@ -467,7 +467,7 @@ static int validate_change(struct cpuset *cur, struct cpuset *trial)
* be changed to have empty cpus_allowed or mems_allowed.
*/
ret = -ENOSPC;
- if ((cgroup_task_count(cur->css.cgroup) || cur->attach_in_progress)) {
+ if ((cgroup_has_tasks(cur->css.cgroup) || cur->attach_in_progress)) {
if (!cpumask_empty(cur->cpus_allowed) &&
cpumask_empty(trial->cpus_allowed))
goto out;
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index c1c2549..d9c6ac1 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -4958,7 +4958,7 @@ static int mem_cgroup_force_empty(struct mem_cgroup *memcg)
struct cgroup *cgrp = memcg->css.cgroup;
/* returns EBUSY if there is a task or if we come here twice. */
- if (cgroup_task_count(cgrp) || !list_empty(&cgrp->children))
+ if (cgroup_has_tasks(cgrp) || !list_empty(&cgrp->children))
return -EBUSY;
/* we call try-to-free pages for make this cgroup empty */
@@ -5140,7 +5140,7 @@ static int __memcg_activate_kmem(struct mem_cgroup *memcg,
* of course permitted.
*/
mutex_lock(&memcg_create_mutex);
- if (cgroup_task_count(memcg->css.cgroup) || memcg_has_children(memcg))
+ if (cgroup_has_tasks(memcg->css.cgroup) || memcg_has_children(memcg))
err = -EBUSY;
mutex_unlock(&memcg_create_mutex);
if (err)
--
1.8.5.3
With module support gone, a lot of functions no longer need to be
exported. Unexport them.
Signed-off-by: Tejun Heo <[email protected]>
---
kernel/cgroup.c | 8 --------
1 file changed, 8 deletions(-)
diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index 057ab96..abb1873 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -242,7 +242,6 @@ bool cgroup_is_descendant(struct cgroup *cgrp, struct cgroup *ancestor)
}
return false;
}
-EXPORT_SYMBOL_GPL(cgroup_is_descendant);
static int cgroup_is_releasable(const struct cgroup *cgrp)
{
@@ -1662,7 +1661,6 @@ struct task_struct *cgroup_taskset_first(struct cgroup_taskset *tset)
return tset->single.task;
}
}
-EXPORT_SYMBOL_GPL(cgroup_taskset_first);
/**
* cgroup_taskset_next - iterate to the next task in taskset
@@ -1681,7 +1679,6 @@ struct task_struct *cgroup_taskset_next(struct cgroup_taskset *tset)
tc = flex_array_get(tset->tc_array, tset->idx++);
return tc->task;
}
-EXPORT_SYMBOL_GPL(cgroup_taskset_next);
/**
* cgroup_task_migrate - move a task from one cgroup to another.
@@ -2364,7 +2361,6 @@ int cgroup_add_cftypes(struct cgroup_subsys *ss, struct cftype *cfts)
mutex_unlock(&cgroup_tree_mutex);
return ret;
}
-EXPORT_SYMBOL_GPL(cgroup_add_cftypes);
/**
* cgroup_task_count - count the number of tasks in a cgroup.
@@ -2438,7 +2434,6 @@ css_next_child(struct cgroup_subsys_state *pos_css,
return cgroup_css(next, parent_css->ss);
}
-EXPORT_SYMBOL_GPL(css_next_child);
/**
* css_next_descendant_pre - find the next descendant for pre-order walk
@@ -2481,7 +2476,6 @@ css_next_descendant_pre(struct cgroup_subsys_state *pos,
return NULL;
}
-EXPORT_SYMBOL_GPL(css_next_descendant_pre);
/**
* css_rightmost_descendant - return the rightmost descendant of a css
@@ -2513,7 +2507,6 @@ css_rightmost_descendant(struct cgroup_subsys_state *pos)
return last;
}
-EXPORT_SYMBOL_GPL(css_rightmost_descendant);
static struct cgroup_subsys_state *
css_leftmost_descendant(struct cgroup_subsys_state *pos)
@@ -2567,7 +2560,6 @@ css_next_descendant_post(struct cgroup_subsys_state *pos,
/* no sibling left, visit parent */
return css_parent(pos);
}
-EXPORT_SYMBOL_GPL(css_next_descendant_post);
/**
* css_advance_task_iter - advance a task itererator to the next css_set
--
1.8.5.3
cgroup_taskset_cur_css() will be removed during the planned
restructuring of the migration path. Its only use is finding the old
cgroup_subsys_state of the leader in cpuset_attach(). This usage can
easily be removed by remembering the old value in cpuset_can_attach().
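Sketched with a hypothetical controller (the my_*() names and
task_my_state() are illustrative, not part of this series), the pattern
is simply to record the value in ->can_attach() and consume it in
->attach(); a single static slot suffices because attaches are
serialized by cgroup_mutex (and by cpuset_mutex in cpuset's case).

  static struct my_state *my_attach_old_state;    /* hypothetical */

  static int my_can_attach(struct cgroup_subsys_state *css,
                           struct cgroup_taskset *tset)
  {
          /* remember state derived from the pre-migration leader */
          my_attach_old_state = task_my_state(cgroup_taskset_first(tset));
          return 0;
  }

  static void my_attach(struct cgroup_subsys_state *css,
                        struct cgroup_taskset *tset)
  {
          /* use the stashed value where cgroup_taskset_cur_css() was used */
          do_my_attach_work(my_attach_old_state, tset);
  }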
Signed-off-by: Tejun Heo <[email protected]>
Cc: Li Zefan <[email protected]>
---
kernel/cpuset.c | 9 ++++++---
1 file changed, 6 insertions(+), 3 deletions(-)
diff --git a/kernel/cpuset.c b/kernel/cpuset.c
index bf20e4a..d8bec21 100644
--- a/kernel/cpuset.c
+++ b/kernel/cpuset.c
@@ -1379,6 +1379,8 @@ static int fmeter_getrate(struct fmeter *fmp)
return val;
}
+static struct cpuset *cpuset_attach_old_cs;
+
/* Called by cgroups to determine if a cpuset is usable; cpuset_mutex held */
static int cpuset_can_attach(struct cgroup_subsys_state *css,
struct cgroup_taskset *tset)
@@ -1387,6 +1389,9 @@ static int cpuset_can_attach(struct cgroup_subsys_state *css,
struct task_struct *task;
int ret;
+ /* used later by cpuset_attach() */
+ cpuset_attach_old_cs = task_cs(cgroup_taskset_first(tset));
+
mutex_lock(&cpuset_mutex);
/*
@@ -1450,10 +1455,8 @@ static void cpuset_attach(struct cgroup_subsys_state *css,
struct mm_struct *mm;
struct task_struct *task;
struct task_struct *leader = cgroup_taskset_first(tset);
- struct cgroup_subsys_state *oldcss = cgroup_taskset_cur_css(tset,
- cpuset_cgrp_id);
struct cpuset *cs = css_cs(css);
- struct cpuset *oldcs = css_cs(oldcss);
+ struct cpuset *oldcs = cpuset_attach_old_cs;
struct cpuset *cpus_cs = effective_cpumask_cpuset(cs);
struct cpuset *mems_cs = effective_nodemask_cpuset(cs);
--
1.8.5.3
cgroup_attach_task() is planned to go through restructuring. Let's
tidy it up a bit in preparation.
* Update cgroup_attach_task() to receive the target task argument in
@leader instead of @tsk.
* Rename @tsk to @task.
* Rename @retval to @ret.
This is purely cosmetic.
Signed-off-by: Tejun Heo <[email protected]>
---
kernel/cgroup.c | 45 +++++++++++++++++++++++----------------------
1 file changed, 23 insertions(+), 22 deletions(-)
diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index 0108753..057ab96 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -1727,20 +1727,20 @@ static void cgroup_task_migrate(struct cgroup *old_cgrp,
/**
* cgroup_attach_task - attach a task or a whole threadgroup to a cgroup
* @cgrp: the cgroup to attach to
- * @tsk: the task or the leader of the threadgroup to be attached
+ * @leader: the task or the leader of the threadgroup to be attached
* @threadgroup: attach the whole threadgroup?
*
* Call holding cgroup_mutex and the group_rwsem of the leader. Will take
* task_lock of @tsk or each thread in the threadgroup individually in turn.
*/
-static int cgroup_attach_task(struct cgroup *cgrp, struct task_struct *tsk,
+static int cgroup_attach_task(struct cgroup *cgrp, struct task_struct *leader,
bool threadgroup)
{
- int retval, i, group_size;
+ int ret, i, group_size;
struct cgroupfs_root *root = cgrp->root;
struct cgroup_subsys_state *css, *failed_css = NULL;
/* threadgroup list cursor and array */
- struct task_struct *leader = tsk;
+ struct task_struct *task;
struct task_and_cgroup *tc;
struct flex_array *group;
struct cgroup_taskset tset = { };
@@ -1753,7 +1753,7 @@ static int cgroup_attach_task(struct cgroup *cgrp, struct task_struct *tsk,
* threads exit, this will just be an over-estimate.
*/
if (threadgroup)
- group_size = get_nr_threads(tsk);
+ group_size = get_nr_threads(task);
else
group_size = 1;
/* flex_array supports very large thread-groups better than kmalloc. */
@@ -1761,8 +1761,8 @@ static int cgroup_attach_task(struct cgroup *cgrp, struct task_struct *tsk,
if (!group)
return -ENOMEM;
/* pre-allocate to guarantee space while iterating in rcu read-side. */
- retval = flex_array_prealloc(group, 0, group_size, GFP_KERNEL);
- if (retval)
+ ret = flex_array_prealloc(group, 0, group_size, GFP_KERNEL);
+ if (ret)
goto out_free_group_list;
i = 0;
@@ -1773,17 +1773,18 @@ static int cgroup_attach_task(struct cgroup *cgrp, struct task_struct *tsk,
*/
down_read(&css_set_rwsem);
rcu_read_lock();
+ task = leader;
do {
struct task_and_cgroup ent;
- /* @tsk either already exited or can't exit until the end */
- if (tsk->flags & PF_EXITING)
+ /* @task either already exited or can't exit until the end */
+ if (task->flags & PF_EXITING)
goto next;
/* as per above, nr_threads may decrease, but not increase. */
BUG_ON(i >= group_size);
- ent.task = tsk;
- ent.cgrp = task_cgroup_from_root(tsk, root);
+ ent.task = task;
+ ent.cgrp = task_cgroup_from_root(task, root);
/* nothing to do if this task is already in the cgroup */
if (ent.cgrp == cgrp)
goto next;
@@ -1791,13 +1792,13 @@ static int cgroup_attach_task(struct cgroup *cgrp, struct task_struct *tsk,
* saying GFP_ATOMIC has no effect here because we did prealloc
* earlier, but it's good form to communicate our expectations.
*/
- retval = flex_array_put(group, i, &ent, GFP_ATOMIC);
- BUG_ON(retval != 0);
+ ret = flex_array_put(group, i, &ent, GFP_ATOMIC);
+ BUG_ON(ret != 0);
i++;
next:
if (!threadgroup)
break;
- } while_each_thread(leader, tsk);
+ } while_each_thread(leader, task);
rcu_read_unlock();
up_read(&css_set_rwsem);
/* remember the number of threads in the array for later. */
@@ -1806,7 +1807,7 @@ static int cgroup_attach_task(struct cgroup *cgrp, struct task_struct *tsk,
tset.tc_array_len = group_size;
/* methods shouldn't be called if no task is actually migrating */
- retval = 0;
+ ret = 0;
if (!group_size)
goto out_free_group_list;
@@ -1815,8 +1816,8 @@ static int cgroup_attach_task(struct cgroup *cgrp, struct task_struct *tsk,
*/
for_each_css(css, i, cgrp) {
if (css->ss->can_attach) {
- retval = css->ss->can_attach(css, &tset);
- if (retval) {
+ ret = css->ss->can_attach(css, &tset);
+ if (ret) {
failed_css = css;
goto out_cancel_attach;
}
@@ -1834,7 +1835,7 @@ static int cgroup_attach_task(struct cgroup *cgrp, struct task_struct *tsk,
old_cset = task_css_set(tc->task);
tc->cset = find_css_set(old_cset, cgrp);
if (!tc->cset) {
- retval = -ENOMEM;
+ ret = -ENOMEM;
goto out_put_css_set_refs;
}
}
@@ -1862,9 +1863,9 @@ static int cgroup_attach_task(struct cgroup *cgrp, struct task_struct *tsk,
/*
* step 5: success! and cleanup
*/
- retval = 0;
+ ret = 0;
out_put_css_set_refs:
- if (retval) {
+ if (ret) {
for (i = 0; i < group_size; i++) {
tc = flex_array_get(group, i);
if (!tc->cset)
@@ -1873,7 +1874,7 @@ out_put_css_set_refs:
}
}
out_cancel_attach:
- if (retval) {
+ if (ret) {
for_each_css(css, i, cgrp) {
if (css == failed_css)
break;
@@ -1883,7 +1884,7 @@ out_cancel_attach:
}
out_free_group_list:
flex_array_free(group);
- return retval;
+ return ret;
}
/*
--
1.8.5.3
Instead of repeatedly locking and unlocking css_set_rwsem inside
cgroup_task_migrate(), update cgroup_attach_task() to grab it outside
of the loop and update cgroup_task_migrate() to use
put_css_set_locked().
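Condensed from the hunks below, the commit loop ends up looking like
this sketch; the write lock is taken once around the loop and
cgroup_task_migrate() merely asserts that it is held.

  down_write(&css_set_rwsem);
  for (i = 0; i < group_size; i++) {
          tc = flex_array_get(group, i);
          /* drops the old css_set ref via put_css_set_locked() */
          cgroup_task_migrate(tc->cgrp, tc->task, tc->cset);
  }
  up_write(&css_set_rwsem);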
Signed-off-by: Tejun Heo <[email protected]>
---
kernel/cgroup.c | 16 +++++++++++-----
1 file changed, 11 insertions(+), 5 deletions(-)
diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index 63d1a4e..ac78311 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -1713,10 +1713,13 @@ int cgroup_taskset_size(struct cgroup_taskset *tset)
EXPORT_SYMBOL_GPL(cgroup_taskset_size);
-/*
+/**
* cgroup_task_migrate - move a task from one cgroup to another.
+ * @old_cgrp; the cgroup @tsk is being migrated from
+ * @tsk: the task being migrated
+ * @new_cset: the new css_set @tsk is being attached to
*
- * Must be called with cgroup_mutex and threadgroup locked.
+ * Must be called with cgroup_mutex, threadgroup and css_set_rwsem locked.
*/
static void cgroup_task_migrate(struct cgroup *old_cgrp,
struct task_struct *tsk,
@@ -1724,6 +1727,9 @@ static void cgroup_task_migrate(struct cgroup *old_cgrp,
{
struct css_set *old_cset;
+ lockdep_assert_held(&cgroup_mutex);
+ lockdep_assert_held(&css_set_rwsem);
+
/*
* We are synchronized through threadgroup_lock() against PF_EXITING
* setting such that we can't race against cgroup_exit() changing the
@@ -1737,9 +1743,7 @@ static void cgroup_task_migrate(struct cgroup *old_cgrp,
task_unlock(tsk);
/* Update the css_set linked lists if we're using them */
- down_write(&css_set_rwsem);
list_move(&tsk->cg_list, &new_cset->tasks);
- up_write(&css_set_rwsem);
/*
* We just gained a reference on old_cset by taking it from the
@@ -1747,7 +1751,7 @@ static void cgroup_task_migrate(struct cgroup *old_cgrp,
* we're safe to drop it here; it will be freed under RCU.
*/
set_bit(CGRP_RELEASABLE, &old_cgrp->flags);
- put_css_set(old_cset, false);
+ put_css_set_locked(old_cset, false);
}
/**
@@ -1870,10 +1874,12 @@ static int cgroup_attach_task(struct cgroup *cgrp, struct task_struct *tsk,
* proceed to move all tasks to the new cgroup. There are no
* failure cases after here, so this is the commit point.
*/
+ down_write(&css_set_rwsem);
for (i = 0; i < group_size; i++) {
tc = flex_array_get(group, i);
cgroup_task_migrate(tc->cgrp, tc->task, tc->cset);
}
+ up_write(&css_set_rwsem);
/* nothing is sensitive to fork() after this point. */
/*
--
1.8.5.3
cgroup_taskset_cur_css() and cgroup_taskset_size() don't have any users
left. Remove them along with cgroup_taskset->cur_cgrp, which was only
needed by cgroup_taskset_cur_css().
Signed-off-by: Tejun Heo <[email protected]>
---
include/linux/cgroup.h | 3 ---
kernel/cgroup.c | 30 ------------------------------
2 files changed, 33 deletions(-)
diff --git a/include/linux/cgroup.h b/include/linux/cgroup.h
index 3c67883..21887b6 100644
--- a/include/linux/cgroup.h
+++ b/include/linux/cgroup.h
@@ -526,9 +526,6 @@ bool cgroup_is_descendant(struct cgroup *cgrp, struct cgroup *ancestor);
struct cgroup_taskset;
struct task_struct *cgroup_taskset_first(struct cgroup_taskset *tset);
struct task_struct *cgroup_taskset_next(struct cgroup_taskset *tset);
-struct cgroup_subsys_state *cgroup_taskset_cur_css(struct cgroup_taskset *tset,
- int subsys_id);
-int cgroup_taskset_size(struct cgroup_taskset *tset);
/**
* cgroup_taskset_for_each - iterate cgroup_taskset
diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index ac78311..0108753 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -1645,7 +1645,6 @@ struct cgroup_taskset {
struct flex_array *tc_array;
int tc_array_len;
int idx;
- struct cgroup *cur_cgrp;
};
/**
@@ -1660,7 +1659,6 @@ struct task_struct *cgroup_taskset_first(struct cgroup_taskset *tset)
tset->idx = 0;
return cgroup_taskset_next(tset);
} else {
- tset->cur_cgrp = tset->single.cgrp;
return tset->single.task;
}
}
@@ -1681,39 +1679,11 @@ struct task_struct *cgroup_taskset_next(struct cgroup_taskset *tset)
return NULL;
tc = flex_array_get(tset->tc_array, tset->idx++);
- tset->cur_cgrp = tc->cgrp;
return tc->task;
}
EXPORT_SYMBOL_GPL(cgroup_taskset_next);
/**
- * cgroup_taskset_cur_css - return the matching css for the current task
- * @tset: taskset of interest
- * @subsys_id: the ID of the target subsystem
- *
- * Return the css for the current (last returned) task of @tset for
- * subsystem specified by @subsys_id. This function must be preceded by
- * either cgroup_taskset_first() or cgroup_taskset_next().
- */
-struct cgroup_subsys_state *cgroup_taskset_cur_css(struct cgroup_taskset *tset,
- int subsys_id)
-{
- return cgroup_css(tset->cur_cgrp, cgroup_subsys[subsys_id]);
-}
-EXPORT_SYMBOL_GPL(cgroup_taskset_cur_css);
-
-/**
- * cgroup_taskset_size - return the number of tasks in taskset
- * @tset: taskset of interest
- */
-int cgroup_taskset_size(struct cgroup_taskset *tset)
-{
- return tset->tc_array ? tset->tc_array_len : 1;
-}
-EXPORT_SYMBOL_GPL(cgroup_taskset_size);
-
-
-/**
* cgroup_task_migrate - move a task from one cgroup to another.
* @old_cgrp; the cgroup @tsk is being migrated from
* @tsk: the task being migrated
--
1.8.5.3
If !NULL, @skip_css makes cgroup_taskset_for_each() skip the matching
css. The intention of the interface is to make it easy to skip css's
(cgroup_subsys_states) which already match the migration target;
however, this is entirely unnecessary as the migration taskset doesn't
include tasks which are already in the target cgroup. Drop @skip_css
from cgroup_taskset_for_each().
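Every converted loop now takes the simpler form sketched below;
do_per_task_work() is a placeholder for whatever the controller does
per task.

  struct task_struct *task;

  cgroup_taskset_for_each(task, tset) {
          /*
           * No @skip_css test is needed: tasks whose current cgroup
           * already equals the migration target never enter the
           * taskset, so every @task here is actually migrating.
           */
          do_per_task_work(task);
  }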
Signed-off-by: Tejun Heo <[email protected]>
Cc: Li Zefan <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Daniel Borkmann <[email protected]>
---
block/blk-cgroup.c | 2 +-
include/linux/cgroup.h | 8 ++------
kernel/cgroup_freezer.c | 2 +-
kernel/cpuset.c | 4 ++--
kernel/events/core.c | 2 +-
kernel/sched/core.c | 4 ++--
net/core/netclassid_cgroup.c | 2 +-
net/core/netprio_cgroup.c | 2 +-
8 files changed, 11 insertions(+), 15 deletions(-)
diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
index 1cef07c..4aefd46 100644
--- a/block/blk-cgroup.c
+++ b/block/blk-cgroup.c
@@ -894,7 +894,7 @@ static int blkcg_can_attach(struct cgroup_subsys_state *css,
int ret = 0;
/* task_lock() is needed to avoid races with exit_io_context() */
- cgroup_taskset_for_each(task, css, tset) {
+ cgroup_taskset_for_each(task, tset) {
task_lock(task);
ioc = task->io_context;
if (ioc && atomic_read(&ioc->nr_tasks) > 1)
diff --git a/include/linux/cgroup.h b/include/linux/cgroup.h
index db5ccf4..3c67883 100644
--- a/include/linux/cgroup.h
+++ b/include/linux/cgroup.h
@@ -533,15 +533,11 @@ int cgroup_taskset_size(struct cgroup_taskset *tset);
/**
* cgroup_taskset_for_each - iterate cgroup_taskset
* @task: the loop cursor
- * @skip_css: skip if task's css matches this, %NULL to iterate through all
* @tset: taskset to iterate
*/
-#define cgroup_taskset_for_each(task, skip_css, tset) \
+#define cgroup_taskset_for_each(task, tset) \
for ((task) = cgroup_taskset_first((tset)); (task); \
- (task) = cgroup_taskset_next((tset))) \
- if (!(skip_css) || \
- cgroup_taskset_cur_css((tset), \
- (skip_css)->ss->id) != (skip_css))
+ (task) = cgroup_taskset_next((tset)))
/*
* Control Group subsystem type.
diff --git a/kernel/cgroup_freezer.c b/kernel/cgroup_freezer.c
index 98ea26a9..7201a63 100644
--- a/kernel/cgroup_freezer.c
+++ b/kernel/cgroup_freezer.c
@@ -187,7 +187,7 @@ static void freezer_attach(struct cgroup_subsys_state *new_css,
* current state before executing the following - !frozen tasks may
* be visible in a FROZEN cgroup and frozen tasks in a THAWED one.
*/
- cgroup_taskset_for_each(task, new_css, tset) {
+ cgroup_taskset_for_each(task, tset) {
if (!(freezer->state & CGROUP_FREEZING)) {
__thaw_task(task);
} else {
diff --git a/kernel/cpuset.c b/kernel/cpuset.c
index 65ae0bd..bf20e4a 100644
--- a/kernel/cpuset.c
+++ b/kernel/cpuset.c
@@ -1398,7 +1398,7 @@ static int cpuset_can_attach(struct cgroup_subsys_state *css,
(cpumask_empty(cs->cpus_allowed) || nodes_empty(cs->mems_allowed)))
goto out_unlock;
- cgroup_taskset_for_each(task, css, tset) {
+ cgroup_taskset_for_each(task, tset) {
/*
* Kthreads which disallow setaffinity shouldn't be moved
* to a new cpuset; we don't want to change their cpu
@@ -1467,7 +1467,7 @@ static void cpuset_attach(struct cgroup_subsys_state *css,
guarantee_online_mems(mems_cs, &cpuset_attach_nodemask_to);
- cgroup_taskset_for_each(task, css, tset) {
+ cgroup_taskset_for_each(task, tset) {
/*
* can_attach beforehand should guarantee that this doesn't
* fail. TODO: have a better way to handle failure here
diff --git a/kernel/events/core.c b/kernel/events/core.c
index a3c3ab5..6dd7149 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -8021,7 +8021,7 @@ static void perf_cgroup_attach(struct cgroup_subsys_state *css,
{
struct task_struct *task;
- cgroup_taskset_for_each(task, css, tset)
+ cgroup_taskset_for_each(task, tset)
task_function_call(task, __perf_cgroup_move, task);
}
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index d4cfc55..ba386a0 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -7600,7 +7600,7 @@ static int cpu_cgroup_can_attach(struct cgroup_subsys_state *css,
{
struct task_struct *task;
- cgroup_taskset_for_each(task, css, tset) {
+ cgroup_taskset_for_each(task, tset) {
#ifdef CONFIG_RT_GROUP_SCHED
if (!sched_rt_can_attach(css_tg(css), task))
return -EINVAL;
@@ -7618,7 +7618,7 @@ static void cpu_cgroup_attach(struct cgroup_subsys_state *css,
{
struct task_struct *task;
- cgroup_taskset_for_each(task, css, tset)
+ cgroup_taskset_for_each(task, tset)
sched_move_task(task);
}
diff --git a/net/core/netclassid_cgroup.c b/net/core/netclassid_cgroup.c
index b865662..22931e1 100644
--- a/net/core/netclassid_cgroup.c
+++ b/net/core/netclassid_cgroup.c
@@ -73,7 +73,7 @@ static void cgrp_attach(struct cgroup_subsys_state *css,
void *v = (void *)(unsigned long)cs->classid;
struct task_struct *p;
- cgroup_taskset_for_each(p, css, tset) {
+ cgroup_taskset_for_each(p, tset) {
task_lock(p);
iterate_fd(p->files, 0, update_classid, v);
task_unlock(p);
diff --git a/net/core/netprio_cgroup.c b/net/core/netprio_cgroup.c
index d7d23e2..f9f3a40 100644
--- a/net/core/netprio_cgroup.c
+++ b/net/core/netprio_cgroup.c
@@ -224,7 +224,7 @@ static void net_prio_attach(struct cgroup_subsys_state *css,
struct task_struct *p;
void *v = (void *)(unsigned long)css->cgroup->id;
- cgroup_taskset_for_each(p, css, tset) {
+ cgroup_taskset_for_each(p, tset) {
task_lock(p);
iterate_fd(p->files, 0, update_netprio, v);
task_unlock(p);
--
1.8.5.3
Now that css_task_iter_start/next/end() supports blocking while
iterating, there's no reason to use css_scan_tasks(), which is more
cumbersome to use and scheduled to be removed.
Convert all css_scan_tasks() usages in cpuset to
css_task_iter_start/next/end(). This simplifies the code by removing
heap allocation and callbacks.
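Each conversion follows the same shape, sketched below with a
placeholder update_one_task(); because css_set_lock is now a rwsem
(previous patch), the loop body is allowed to block, e.g. for
get_task_mm()/mmput() in the nodemask update.

  struct css_task_iter it;
  struct task_struct *task;

  css_task_iter_start(&cs->css, &it);
  while ((task = css_task_iter_next(&it)))
          update_one_task(cs, task);      /* may block */
  css_task_iter_end(&it);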
Signed-off-by: Tejun Heo <[email protected]>
Cc: Li Zefan <[email protected]>
---
kernel/cpuset.c | 186 ++++++++++++++++++--------------------------------------
1 file changed, 58 insertions(+), 128 deletions(-)
diff --git a/kernel/cpuset.c b/kernel/cpuset.c
index ae190b0..65ae0bd 100644
--- a/kernel/cpuset.c
+++ b/kernel/cpuset.c
@@ -829,55 +829,36 @@ static struct cpuset *effective_nodemask_cpuset(struct cpuset *cs)
}
/**
- * cpuset_change_cpumask - make a task's cpus_allowed the same as its cpuset's
- * @tsk: task to test
- * @data: cpuset to @tsk belongs to
- *
- * Called by css_scan_tasks() for each task in a cgroup whose cpus_allowed
- * mask needs to be changed.
- *
- * We don't need to re-check for the cgroup/cpuset membership, since we're
- * holding cpuset_mutex at this point.
- */
-static void cpuset_change_cpumask(struct task_struct *tsk, void *data)
-{
- struct cpuset *cs = data;
- struct cpuset *cpus_cs = effective_cpumask_cpuset(cs);
-
- set_cpus_allowed_ptr(tsk, cpus_cs->cpus_allowed);
-}
-
-/**
* update_tasks_cpumask - Update the cpumasks of tasks in the cpuset.
* @cs: the cpuset in which each task's cpus_allowed mask needs to be changed
- * @heap: if NULL, defer allocating heap memory to css_scan_tasks()
- *
- * Called with cpuset_mutex held
- *
- * The css_scan_tasks() function will scan all the tasks in a cgroup,
- * calling callback functions for each.
*
- * No return value. It's guaranteed that css_scan_tasks() always returns 0
- * if @heap != NULL.
+ * Iterate through each task of @cs updating its cpus_allowed to the
+ * effective cpuset's. As this function is called with cpuset_mutex held,
+ * cpuset membership stays stable.
*/
-static void update_tasks_cpumask(struct cpuset *cs, struct ptr_heap *heap)
+static void update_tasks_cpumask(struct cpuset *cs)
{
- css_scan_tasks(&cs->css, NULL, cpuset_change_cpumask, cs, heap);
+ struct cpuset *cpus_cs = effective_cpumask_cpuset(cs);
+ struct css_task_iter it;
+ struct task_struct *task;
+
+ css_task_iter_start(&cs->css, &it);
+ while ((task = css_task_iter_next(&it)))
+ set_cpus_allowed_ptr(task, cpus_cs->cpus_allowed);
+ css_task_iter_end(&it);
}
/*
* update_tasks_cpumask_hier - Update the cpumasks of tasks in the hierarchy.
* @root_cs: the root cpuset of the hierarchy
* @update_root: update root cpuset or not?
- * @heap: the heap used by css_scan_tasks()
*
* This will update cpumasks of tasks in @root_cs and all other empty cpusets
* which take on cpumask of @root_cs.
*
* Called with cpuset_mutex held
*/
-static void update_tasks_cpumask_hier(struct cpuset *root_cs,
- bool update_root, struct ptr_heap *heap)
+static void update_tasks_cpumask_hier(struct cpuset *root_cs, bool update_root)
{
struct cpuset *cp;
struct cgroup_subsys_state *pos_css;
@@ -898,7 +879,7 @@ static void update_tasks_cpumask_hier(struct cpuset *root_cs,
continue;
rcu_read_unlock();
- update_tasks_cpumask(cp, heap);
+ update_tasks_cpumask(cp);
rcu_read_lock();
css_put(&cp->css);
@@ -914,7 +895,6 @@ static void update_tasks_cpumask_hier(struct cpuset *root_cs,
static int update_cpumask(struct cpuset *cs, struct cpuset *trialcs,
const char *buf)
{
- struct ptr_heap heap;
int retval;
int is_load_balanced;
@@ -947,19 +927,13 @@ static int update_cpumask(struct cpuset *cs, struct cpuset *trialcs,
if (retval < 0)
return retval;
- retval = heap_init(&heap, PAGE_SIZE, GFP_KERNEL, NULL);
- if (retval)
- return retval;
-
is_load_balanced = is_sched_load_balance(trialcs);
mutex_lock(&callback_mutex);
cpumask_copy(cs->cpus_allowed, trialcs->cpus_allowed);
mutex_unlock(&callback_mutex);
- update_tasks_cpumask_hier(cs, true, &heap);
-
- heap_free(&heap);
+ update_tasks_cpumask_hier(cs, true);
if (is_load_balanced)
rebuild_sched_domains_locked();
@@ -1052,53 +1026,22 @@ static void cpuset_change_task_nodemask(struct task_struct *tsk,
task_unlock(tsk);
}
-struct cpuset_change_nodemask_arg {
- struct cpuset *cs;
- nodemask_t *newmems;
-};
-
-/*
- * Update task's mems_allowed and rebind its mempolicy and vmas' mempolicy
- * of it to cpuset's new mems_allowed, and migrate pages to new nodes if
- * memory_migrate flag is set. Called with cpuset_mutex held.
- */
-static void cpuset_change_nodemask(struct task_struct *p, void *data)
-{
- struct cpuset_change_nodemask_arg *arg = data;
- struct cpuset *cs = arg->cs;
- struct mm_struct *mm;
- int migrate;
-
- cpuset_change_task_nodemask(p, arg->newmems);
-
- mm = get_task_mm(p);
- if (!mm)
- return;
-
- migrate = is_memory_migrate(cs);
-
- mpol_rebind_mm(mm, &cs->mems_allowed);
- if (migrate)
- cpuset_migrate_mm(mm, &cs->old_mems_allowed, arg->newmems);
- mmput(mm);
-}
-
static void *cpuset_being_rebound;
/**
* update_tasks_nodemask - Update the nodemasks of tasks in the cpuset.
* @cs: the cpuset in which each task's mems_allowed mask needs to be changed
- * @heap: if NULL, defer allocating heap memory to css_scan_tasks()
*
- * Called with cpuset_mutex held. No return value. It's guaranteed that
- * css_scan_tasks() always returns 0 if @heap != NULL.
+ * Iterate through each task of @cs updating its mems_allowed to the
+ * effective cpuset's. As this function is called with cpuset_mutex held,
+ * cpuset membership stays stable.
*/
-static void update_tasks_nodemask(struct cpuset *cs, struct ptr_heap *heap)
+static void update_tasks_nodemask(struct cpuset *cs)
{
static nodemask_t newmems; /* protected by cpuset_mutex */
struct cpuset *mems_cs = effective_nodemask_cpuset(cs);
- struct cpuset_change_nodemask_arg arg = { .cs = cs,
- .newmems = &newmems };
+ struct css_task_iter it;
+ struct task_struct *task;
cpuset_being_rebound = cs; /* causes mpol_dup() rebind */
@@ -1114,7 +1057,25 @@ static void update_tasks_nodemask(struct cpuset *cs, struct ptr_heap *heap)
* It's ok if we rebind the same mm twice; mpol_rebind_mm()
* is idempotent. Also migrate pages in each mm to new nodes.
*/
- css_scan_tasks(&cs->css, NULL, cpuset_change_nodemask, &arg, heap);
+ css_task_iter_start(&cs->css, &it);
+ while ((task = css_task_iter_next(&it))) {
+ struct mm_struct *mm;
+ bool migrate;
+
+ cpuset_change_task_nodemask(task, &newmems);
+
+ mm = get_task_mm(task);
+ if (!mm)
+ continue;
+
+ migrate = is_memory_migrate(cs);
+
+ mpol_rebind_mm(mm, &cs->mems_allowed);
+ if (migrate)
+ cpuset_migrate_mm(mm, &cs->old_mems_allowed, &newmems);
+ mmput(mm);
+ }
+ css_task_iter_end(&it);
/*
* All the tasks' nodemasks have been updated, update
@@ -1130,15 +1091,13 @@ static void update_tasks_nodemask(struct cpuset *cs, struct ptr_heap *heap)
* update_tasks_nodemask_hier - Update the nodemasks of tasks in the hierarchy.
* @cs: the root cpuset of the hierarchy
* @update_root: update the root cpuset or not?
- * @heap: the heap used by css_scan_tasks()
*
* This will update nodemasks of tasks in @root_cs and all other empty cpusets
* which take on nodemask of @root_cs.
*
* Called with cpuset_mutex held
*/
-static void update_tasks_nodemask_hier(struct cpuset *root_cs,
- bool update_root, struct ptr_heap *heap)
+static void update_tasks_nodemask_hier(struct cpuset *root_cs, bool update_root)
{
struct cpuset *cp;
struct cgroup_subsys_state *pos_css;
@@ -1159,7 +1118,7 @@ static void update_tasks_nodemask_hier(struct cpuset *root_cs,
continue;
rcu_read_unlock();
- update_tasks_nodemask(cp, heap);
+ update_tasks_nodemask(cp);
rcu_read_lock();
css_put(&cp->css);
@@ -1184,7 +1143,6 @@ static int update_nodemask(struct cpuset *cs, struct cpuset *trialcs,
const char *buf)
{
int retval;
- struct ptr_heap heap;
/*
* top_cpuset.mems_allowed tracks node_stats[N_MEMORY];
@@ -1223,17 +1181,11 @@ static int update_nodemask(struct cpuset *cs, struct cpuset *trialcs,
if (retval < 0)
goto done;
- retval = heap_init(&heap, PAGE_SIZE, GFP_KERNEL, NULL);
- if (retval < 0)
- goto done;
-
mutex_lock(&callback_mutex);
cs->mems_allowed = trialcs->mems_allowed;
mutex_unlock(&callback_mutex);
- update_tasks_nodemask_hier(cs, true, &heap);
-
- heap_free(&heap);
+ update_tasks_nodemask_hier(cs, true);
done:
return retval;
}
@@ -1261,38 +1213,22 @@ static int update_relax_domain_level(struct cpuset *cs, s64 val)
}
/**
- * cpuset_change_flag - make a task's spread flags the same as its cpuset's
- * @tsk: task to be updated
- * @data: cpuset to @tsk belongs to
- *
- * Called by css_scan_tasks() for each task in a cgroup.
- *
- * We don't need to re-check for the cgroup/cpuset membership, since we're
- * holding cpuset_mutex at this point.
- */
-static void cpuset_change_flag(struct task_struct *tsk, void *data)
-{
- struct cpuset *cs = data;
-
- cpuset_update_task_spread_flag(cs, tsk);
-}
-
-/**
* update_tasks_flags - update the spread flags of tasks in the cpuset.
* @cs: the cpuset in which each task's spread flags needs to be changed
- * @heap: if NULL, defer allocating heap memory to css_scan_tasks()
- *
- * Called with cpuset_mutex held
*
- * The css_scan_tasks() function will scan all the tasks in a cgroup,
- * calling callback functions for each.
- *
- * No return value. It's guaranteed that css_scan_tasks() always returns 0
- * if @heap != NULL.
+ * Iterate through each task of @cs updating its spread flags. As this
+ * function is called with cpuset_mutex held, cpuset membership stays
+ * stable.
*/
-static void update_tasks_flags(struct cpuset *cs, struct ptr_heap *heap)
+static void update_tasks_flags(struct cpuset *cs)
{
- css_scan_tasks(&cs->css, NULL, cpuset_change_flag, cs, heap);
+ struct css_task_iter it;
+ struct task_struct *task;
+
+ css_task_iter_start(&cs->css, &it);
+ while ((task = css_task_iter_next(&it)))
+ cpuset_update_task_spread_flag(cs, task);
+ css_task_iter_end(&it);
}
/*
@@ -1310,7 +1246,6 @@ static int update_flag(cpuset_flagbits_t bit, struct cpuset *cs,
struct cpuset *trialcs;
int balance_flag_changed;
int spread_flag_changed;
- struct ptr_heap heap;
int err;
trialcs = alloc_trial_cpuset(cs);
@@ -1326,10 +1261,6 @@ static int update_flag(cpuset_flagbits_t bit, struct cpuset *cs,
if (err < 0)
goto out;
- err = heap_init(&heap, PAGE_SIZE, GFP_KERNEL, NULL);
- if (err < 0)
- goto out;
-
balance_flag_changed = (is_sched_load_balance(cs) !=
is_sched_load_balance(trialcs));
@@ -1344,8 +1275,7 @@ static int update_flag(cpuset_flagbits_t bit, struct cpuset *cs,
rebuild_sched_domains_locked();
if (spread_flag_changed)
- update_tasks_flags(cs, &heap);
- heap_free(&heap);
+ update_tasks_flags(cs);
out:
free_trial_cpuset(trialcs);
return err;
@@ -2138,7 +2068,7 @@ retry:
*/
if ((sane && cpumask_empty(cs->cpus_allowed)) ||
(!cpumask_empty(&off_cpus) && !cpumask_empty(cs->cpus_allowed)))
- update_tasks_cpumask(cs, NULL);
+ update_tasks_cpumask(cs);
mutex_lock(&callback_mutex);
nodes_andnot(cs->mems_allowed, cs->mems_allowed, off_mems);
@@ -2152,7 +2082,7 @@ retry:
*/
if ((sane && nodes_empty(cs->mems_allowed)) ||
(!nodes_empty(off_mems) && !nodes_empty(cs->mems_allowed)))
- update_tasks_nodemask(cs, NULL);
+ update_tasks_nodemask(cs);
is_empty = cpumask_empty(cs->cpus_allowed) ||
nodes_empty(cs->mems_allowed);
@@ -2214,7 +2144,7 @@ static void cpuset_hotplug_workfn(struct work_struct *work)
mutex_lock(&callback_mutex);
top_cpuset.mems_allowed = new_mems;
mutex_unlock(&callback_mutex);
- update_tasks_nodemask(&top_cpuset, NULL);
+ update_tasks_nodemask(&top_cpuset);
}
mutex_unlock(&cpuset_mutex);
--
1.8.5.3
put_css_set() is performed in two steps - it first tries to put
without grabbing css_set_rwsem if such put wouldn't make the count
zero. If that fails, it puts after write-locking css_set_rwsem. This
patch separates out the second phase into put_css_set_locked() which
should be called with css_set_rwsem locked.
Also, put_css_set_taskexit() is dropped and put_css_set() is made to
take @taskexit. There are only a handful of users of these functions.
No point in providing different variants.
put_css_set_locked() will be used by later changes. This patch doesn't
introduce any functional changes.
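Condensed from the hunk below, the split keeps the lockless fast path
in put_css_set() and pushes the zero-refcount teardown into
put_css_set_locked(), which callers already holding css_set_rwsem can
call directly.

  static void put_css_set(struct css_set *cset, bool taskexit)
  {
          /* fast path: the count can't hit zero, no locking needed */
          if (atomic_add_unless(&cset->refcount, -1, 1))
                  return;

          /* slow path: may hit zero, tear down under css_set_rwsem */
          down_write(&css_set_rwsem);
          put_css_set_locked(cset, taskexit);
          up_write(&css_set_rwsem);
  }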
Signed-off-by: Tejun Heo <[email protected]>
---
kernel/cgroup.c | 50 +++++++++++++++++++++++---------------------------
1 file changed, 23 insertions(+), 27 deletions(-)
diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index 8c1f840..63d1a4e 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -369,22 +369,14 @@ static unsigned long css_set_hash(struct cgroup_subsys_state *css[])
return key;
}
-static void __put_css_set(struct css_set *cset, int taskexit)
+static void put_css_set_locked(struct css_set *cset, bool taskexit)
{
struct cgrp_cset_link *link, *tmp_link;
- /*
- * Ensure that the refcount doesn't hit zero while any readers
- * can see it. Similar to atomic_dec_and_lock(), but for an
- * rwlock
- */
- if (atomic_add_unless(&cset->refcount, -1, 1))
- return;
- down_write(&css_set_rwsem);
- if (!atomic_dec_and_test(&cset->refcount)) {
- up_write(&css_set_rwsem);
+ lockdep_assert_held(&css_set_rwsem);
+
+ if (!atomic_dec_and_test(&cset->refcount))
return;
- }
/* This css_set is dead. unlink it and release cgroup refcounts */
hash_del(&cset->hlist);
@@ -406,10 +398,24 @@ static void __put_css_set(struct css_set *cset, int taskexit)
kfree(link);
}
- up_write(&css_set_rwsem);
kfree_rcu(cset, rcu_head);
}
+static void put_css_set(struct css_set *cset, bool taskexit)
+{
+ /*
+ * Ensure that the refcount doesn't hit zero while any readers
+ * can see it. Similar to atomic_dec_and_lock(), but for an
+ * rwlock
+ */
+ if (atomic_add_unless(&cset->refcount, -1, 1))
+ return;
+
+ down_write(&css_set_rwsem);
+ put_css_set_locked(cset, taskexit);
+ up_write(&css_set_rwsem);
+}
+
/*
* refcounted get/put for css_set objects
*/
@@ -418,16 +424,6 @@ static inline void get_css_set(struct css_set *cset)
atomic_inc(&cset->refcount);
}
-static inline void put_css_set(struct css_set *cset)
-{
- __put_css_set(cset, 0);
-}
-
-static inline void put_css_set_taskexit(struct css_set *cset)
-{
- __put_css_set(cset, 1);
-}
-
/**
* compare_css_sets - helper function for find_existing_css_set().
* @cset: candidate css_set being tested
@@ -1751,7 +1747,7 @@ static void cgroup_task_migrate(struct cgroup *old_cgrp,
* we're safe to drop it here; it will be freed under RCU.
*/
set_bit(CGRP_RELEASABLE, &old_cgrp->flags);
- put_css_set(old_cset);
+ put_css_set(old_cset, false);
}
/**
@@ -1897,7 +1893,7 @@ out_put_css_set_refs:
tc = flex_array_get(group, i);
if (!tc->cset)
break;
- put_css_set(tc->cset);
+ put_css_set(tc->cset, false);
}
}
out_cancel_attach:
@@ -3714,7 +3710,7 @@ static int cgroup_destroy_locked(struct cgroup *cgrp)
/*
* css_set_rwsem synchronizes access to ->cset_links and prevents
- * @cgrp from being removed while __put_css_set() is in progress.
+ * @cgrp from being removed while put_css_set() is in progress.
*/
down_read(&css_set_rwsem);
empty = list_empty(&cgrp->cset_links);
@@ -4266,7 +4262,7 @@ void cgroup_exit(struct task_struct *tsk, int run_callbacks)
}
task_unlock(tsk);
- put_css_set_taskexit(cset);
+ put_css_set(cset, true);
}
static void check_for_release(struct cgroup *cgrp)
--
1.8.5.3
css_scan_tasks() doesn't have any user left. Remove it.
Signed-off-by: Tejun Heo <[email protected]>
---
include/linux/cgroup.h | 6 --
kernel/cgroup.c | 162 -------------------------------------------------
2 files changed, 168 deletions(-)
diff --git a/include/linux/cgroup.h b/include/linux/cgroup.h
index f173cfb..db5ccf4 100644
--- a/include/linux/cgroup.h
+++ b/include/linux/cgroup.h
@@ -14,7 +14,6 @@
#include <linux/rcupdate.h>
#include <linux/rculist.h>
#include <linux/cgroupstats.h>
-#include <linux/prio_heap.h>
#include <linux/rwsem.h>
#include <linux/idr.h>
#include <linux/workqueue.h>
@@ -811,11 +810,6 @@ void css_task_iter_start(struct cgroup_subsys_state *css,
struct task_struct *css_task_iter_next(struct css_task_iter *it);
void css_task_iter_end(struct css_task_iter *it);
-int css_scan_tasks(struct cgroup_subsys_state *css,
- bool (*test)(struct task_struct *, void *),
- void (*process)(struct task_struct *, void *),
- void *data, struct ptr_heap *heap);
-
int cgroup_attach_task_all(struct task_struct *from, struct task_struct *);
int cgroup_transfer_tasks(struct cgroup *to, struct cgroup *from);
diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index 5ad1b25..8c1f840 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -2696,168 +2696,6 @@ void css_task_iter_end(struct css_task_iter *it)
up_read(&css_set_rwsem);
}
-static inline int started_after_time(struct task_struct *t1,
- struct timespec *time,
- struct task_struct *t2)
-{
- int start_diff = timespec_compare(&t1->start_time, time);
- if (start_diff > 0) {
- return 1;
- } else if (start_diff < 0) {
- return 0;
- } else {
- /*
- * Arbitrarily, if two processes started at the same
- * time, we'll say that the lower pointer value
- * started first. Note that t2 may have exited by now
- * so this may not be a valid pointer any longer, but
- * that's fine - it still serves to distinguish
- * between two tasks started (effectively) simultaneously.
- */
- return t1 > t2;
- }
-}
-
-/*
- * This function is a callback from heap_insert() and is used to order
- * the heap.
- * In this case we order the heap in descending task start time.
- */
-static inline int started_after(void *p1, void *p2)
-{
- struct task_struct *t1 = p1;
- struct task_struct *t2 = p2;
- return started_after_time(t1, &t2->start_time, t2);
-}
-
-/**
- * css_scan_tasks - iterate though all the tasks in a css
- * @css: the css to iterate tasks of
- * @test: optional test callback
- * @process: process callback
- * @data: data passed to @test and @process
- * @heap: optional pre-allocated heap used for task iteration
- *
- * Iterate through all the tasks in @css, calling @test for each, and if it
- * returns %true, call @process for it also.
- *
- * @test may be NULL, meaning always true (select all tasks), which
- * effectively duplicates css_task_iter_{start,next,end}() but does not
- * lock css_set_rwsem for the call to @process.
- *
- * It is guaranteed that @process will act on every task that is a member
- * of @css for the duration of this call. This function may or may not
- * call @process for tasks that exit or move to a different css during the
- * call, or are forked or move into the css during the call.
- *
- * Note that @test may be called with locks held, and may in some
- * situations be called multiple times for the same task, so it should be
- * cheap.
- *
- * If @heap is non-NULL, a heap has been pre-allocated and will be used for
- * heap operations (and its "gt" member will be overwritten), else a
- * temporary heap will be used (allocation of which may cause this function
- * to fail).
- */
-int css_scan_tasks(struct cgroup_subsys_state *css,
- bool (*test)(struct task_struct *, void *),
- void (*process)(struct task_struct *, void *),
- void *data, struct ptr_heap *heap)
-{
- int retval, i;
- struct css_task_iter it;
- struct task_struct *p, *dropped;
- /* Never dereference latest_task, since it's not refcounted */
- struct task_struct *latest_task = NULL;
- struct ptr_heap tmp_heap;
- struct timespec latest_time = { 0, 0 };
-
- if (heap) {
- /* The caller supplied our heap and pre-allocated its memory */
- heap->gt = &started_after;
- } else {
- /* We need to allocate our own heap memory */
- heap = &tmp_heap;
- retval = heap_init(heap, PAGE_SIZE, GFP_KERNEL, &started_after);
- if (retval)
- /* cannot allocate the heap */
- return retval;
- }
-
- again:
- /*
- * Scan tasks in the css, using the @test callback to determine
- * which are of interest, and invoking @process callback on the
- * ones which need an update. Since we don't want to hold any
- * locks during the task updates, gather tasks to be processed in a
- * heap structure. The heap is sorted by descending task start
- * time. If the statically-sized heap fills up, we overflow tasks
- * that started later, and in future iterations only consider tasks
- * that started after the latest task in the previous pass. This
- * guarantees forward progress and that we don't miss any tasks.
- */
- heap->size = 0;
- css_task_iter_start(css, &it);
- while ((p = css_task_iter_next(&it))) {
- /*
- * Only affect tasks that qualify per the caller's callback,
- * if he provided one
- */
- if (test && !test(p, data))
- continue;
- /*
- * Only process tasks that started after the last task
- * we processed
- */
- if (!started_after_time(p, &latest_time, latest_task))
- continue;
- dropped = heap_insert(heap, p);
- if (dropped == NULL) {
- /*
- * The new task was inserted; the heap wasn't
- * previously full
- */
- get_task_struct(p);
- } else if (dropped != p) {
- /*
- * The new task was inserted, and pushed out a
- * different task
- */
- get_task_struct(p);
- put_task_struct(dropped);
- }
- /*
- * Else the new task was newer than anything already in
- * the heap and wasn't inserted
- */
- }
- css_task_iter_end(&it);
-
- if (heap->size) {
- for (i = 0; i < heap->size; i++) {
- struct task_struct *q = heap->ptrs[i];
- if (i == 0) {
- latest_time = q->start_time;
- latest_task = q;
- }
- /* Process the task per the caller's callback */
- process(q, data);
- put_task_struct(q);
- }
- /*
- * If we had to process any tasks at all, scan again
- * in case some of them were in the middle of forking
- * children that didn't get processed.
- * Not the most efficient way to do it, but it avoids
- * having to take callback_mutex in the fork path
- */
- goto again;
- }
- if (heap == &tmp_heap)
- heap_free(&tmp_heap);
- return 0;
-}
-
/**
* cgroup_trasnsfer_tasks - move tasks from one cgroup to another
* @to: cgroup to which the tasks will be moved
--
1.8.5.3
Currently there are two ways to walk the tasks of a cgroup:
css_task_iter_start/next/end() and css_scan_tasks(). The latter builds
on the former but allows blocking while iterating. Unfortunately,
css_scan_tasks() is implemented rather nastily: it uses a priority heap
of pointers to extract some number of tasks in task creation order,
loops over them invoking the callback, and repeats that until it
reaches the end. It requires either a preallocated heap or may fail
under memory pressure; while unlikely to be problematic in practice,
the complexity is O(N^2) and it is in general just nasty.
We're going to convert all css_scan_tasks() users to
css_task_iter_start/next/end() and remove css_scan_tasks(). As
css_scan_tasks() users may block, let's convert css_set_lock to a rwsem
so that tasks can block while css_task_iter_*() is in progress.
While this does increase the chance of possible deadlock scenarios,
given the current usage the probability is relatively low, and even if
that happens, the right thing to do is to update the iteration in a way
similar to the css iterators so that it can handle blocking.
Most conversions are trivial; however, task_cgroup_from_root() now
expects to be called with css_set_rwsem held instead of locking it
itself. This is because it can be called with the RCU read lock held
and rwsem locking must nest outside the RCU read lock.
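The practical effect is that an iteration like the sketch below becomes
legal: css_task_iter_start() read-locks css_set_rwsem, so the loop body
may sleep, which the cpuset conversion later in the series relies on
for get_task_mm()/mmput().

  struct css_task_iter it;
  struct task_struct *task;
  struct mm_struct *mm;

  css_task_iter_start(css, &it);          /* down_read(&css_set_rwsem) */
  while ((task = css_task_iter_next(&it))) {
          mm = get_task_mm(task);
          if (!mm)
                  continue;
          /* blocking operations are now permitted here */
          mmput(mm);
  }
  css_task_iter_end(&it);                 /* up_read(&css_set_rwsem) */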
Signed-off-by: Tejun Heo <[email protected]>
---
kernel/cgroup.c | 104 +++++++++++++++++++++++++++++++-------------------------
1 file changed, 57 insertions(+), 47 deletions(-)
diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index a5f965c..5ad1b25 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -42,6 +42,7 @@
#include <linux/sched.h>
#include <linux/slab.h>
#include <linux/spinlock.h>
+#include <linux/rwsem.h>
#include <linux/string.h>
#include <linux/sort.h>
#include <linux/kmod.h>
@@ -341,11 +342,10 @@ static struct css_set init_css_set;
static struct cgrp_cset_link init_cgrp_cset_link;
/*
- * css_set_lock protects the list of css_set objects, and the chain of
- * tasks off each css_set. Nests outside task->alloc_lock due to
- * css_task_iter_start().
+ * css_set_rwsem protects the list of css_set objects, and the chain of
+ * tasks off each css_set.
*/
-static DEFINE_RWLOCK(css_set_lock);
+static DECLARE_RWSEM(css_set_rwsem);
static int css_set_count;
/*
@@ -380,9 +380,9 @@ static void __put_css_set(struct css_set *cset, int taskexit)
*/
if (atomic_add_unless(&cset->refcount, -1, 1))
return;
- write_lock(&css_set_lock);
+ down_write(&css_set_rwsem);
if (!atomic_dec_and_test(&cset->refcount)) {
- write_unlock(&css_set_lock);
+ up_write(&css_set_rwsem);
return;
}
@@ -396,7 +396,7 @@ static void __put_css_set(struct css_set *cset, int taskexit)
list_del(&link->cset_link);
list_del(&link->cgrp_link);
- /* @cgrp can't go away while we're holding css_set_lock */
+ /* @cgrp can't go away while we're holding css_set_rwsem */
if (list_empty(&cgrp->cset_links) && notify_on_release(cgrp)) {
if (taskexit)
set_bit(CGRP_RELEASABLE, &cgrp->flags);
@@ -406,7 +406,7 @@ static void __put_css_set(struct css_set *cset, int taskexit)
kfree(link);
}
- write_unlock(&css_set_lock);
+ up_write(&css_set_rwsem);
kfree_rcu(cset, rcu_head);
}
@@ -627,11 +627,11 @@ static struct css_set *find_css_set(struct css_set *old_cset,
/* First see if we already have a cgroup group that matches
* the desired set */
- read_lock(&css_set_lock);
+ down_read(&css_set_rwsem);
cset = find_existing_css_set(old_cset, cgrp, template);
if (cset)
get_css_set(cset);
- read_unlock(&css_set_lock);
+ up_read(&css_set_rwsem);
if (cset)
return cset;
@@ -655,7 +655,7 @@ static struct css_set *find_css_set(struct css_set *old_cset,
* find_existing_css_set() */
memcpy(cset->subsys, template, sizeof(cset->subsys));
- write_lock(&css_set_lock);
+ down_write(&css_set_rwsem);
/* Add reference counts and links from the new css_set. */
list_for_each_entry(link, &old_cset->cgrp_links, cgrp_link) {
struct cgroup *c = link->cgrp;
@@ -673,7 +673,7 @@ static struct css_set *find_css_set(struct css_set *old_cset,
key = css_set_hash(cset->subsys);
hash_add(css_set_table, &cset->hlist, key);
- write_unlock(&css_set_lock);
+ up_write(&css_set_rwsem);
return cset;
}
@@ -739,14 +739,14 @@ static void cgroup_destroy_root(struct cgroupfs_root *root)
* Release all the links from cset_links to this hierarchy's
* root cgroup
*/
- write_lock(&css_set_lock);
+ down_write(&css_set_rwsem);
list_for_each_entry_safe(link, tmp_link, &cgrp->cset_links, cset_link) {
list_del(&link->cset_link);
list_del(&link->cgrp_link);
kfree(link);
}
- write_unlock(&css_set_lock);
+ up_write(&css_set_rwsem);
if (!list_empty(&root->root_list)) {
list_del(&root->root_list);
@@ -764,7 +764,7 @@ static void cgroup_destroy_root(struct cgroupfs_root *root)
/*
* Return the cgroup for "task" from the given hierarchy. Must be
- * called with cgroup_mutex held.
+ * called with cgroup_mutex and css_set_rwsem held.
*/
static struct cgroup *task_cgroup_from_root(struct task_struct *task,
struct cgroupfs_root *root)
@@ -772,8 +772,9 @@ static struct cgroup *task_cgroup_from_root(struct task_struct *task,
struct css_set *cset;
struct cgroup *res = NULL;
- BUG_ON(!mutex_is_locked(&cgroup_mutex));
- read_lock(&css_set_lock);
+ lockdep_assert_held(&cgroup_mutex);
+ lockdep_assert_held(&css_set_rwsem);
+
/*
* No need to lock the task - since we hold cgroup_mutex the
* task can't change groups, so the only thing that can happen
@@ -794,7 +795,7 @@ static struct cgroup *task_cgroup_from_root(struct task_struct *task,
}
}
}
- read_unlock(&css_set_lock);
+
BUG_ON(!res);
return res;
}
@@ -1308,7 +1309,7 @@ static void cgroup_enable_task_cg_lists(void)
{
struct task_struct *p, *g;
- write_lock(&css_set_lock);
+ down_write(&css_set_rwsem);
if (use_task_css_set_links)
goto out_unlock;
@@ -1341,7 +1342,7 @@ static void cgroup_enable_task_cg_lists(void)
} while_each_thread(g, p);
read_unlock(&tasklist_lock);
out_unlock:
- write_unlock(&css_set_lock);
+ up_write(&css_set_rwsem);
}
static void init_cgroup_housekeeping(struct cgroup *cgrp)
@@ -1406,7 +1407,7 @@ static int cgroup_setup_root(struct cgroupfs_root *root, unsigned long ss_mask)
root_cgrp->id = ret;
/*
- * We're accessing css_set_count without locking css_set_lock here,
+ * We're accessing css_set_count without locking css_set_rwsem here,
* but that's OK - it can only be increased by someone holding
* cgroup_lock, and that's us. The worst that can happen is that we
* have some link structures left over
@@ -1449,10 +1450,10 @@ static int cgroup_setup_root(struct cgroupfs_root *root, unsigned long ss_mask)
* Link the top cgroup in this hierarchy into all the css_set
* objects.
*/
- write_lock(&css_set_lock);
+ down_write(&css_set_rwsem);
hash_for_each(css_set_table, i, cset, hlist)
link_css_set(&tmp_links, cset, root_cgrp);
- write_unlock(&css_set_lock);
+ up_write(&css_set_rwsem);
BUG_ON(!list_empty(&root_cgrp->children));
BUG_ON(atomic_read(&root->nr_cgrps) != 1);
@@ -1615,6 +1616,7 @@ char *task_cgroup_path(struct task_struct *task, char *buf, size_t buflen)
char *path = NULL;
mutex_lock(&cgroup_mutex);
+ down_read(&css_set_rwsem);
root = idr_get_next(&cgroup_hierarchy_idr, &hierarchy_id);
@@ -1627,6 +1629,7 @@ char *task_cgroup_path(struct task_struct *task, char *buf, size_t buflen)
path = buf;
}
+ up_read(&css_set_rwsem);
mutex_unlock(&cgroup_mutex);
return path;
}
@@ -1738,9 +1741,9 @@ static void cgroup_task_migrate(struct cgroup *old_cgrp,
task_unlock(tsk);
/* Update the css_set linked lists if we're using them */
- write_lock(&css_set_lock);
+ down_write(&css_set_rwsem);
list_move(&tsk->cg_list, &new_cset->tasks);
- write_unlock(&css_set_lock);
+ up_write(&css_set_rwsem);
/*
* We just gained a reference on old_cset by taking it from the
@@ -1798,6 +1801,7 @@ static int cgroup_attach_task(struct cgroup *cgrp, struct task_struct *tsk,
* already PF_EXITING could be freed from underneath us unless we
* take an rcu_read_lock.
*/
+ down_read(&css_set_rwsem);
rcu_read_lock();
do {
struct task_and_cgroup ent;
@@ -1825,6 +1829,7 @@ static int cgroup_attach_task(struct cgroup *cgrp, struct task_struct *tsk,
break;
} while_each_thread(leader, tsk);
rcu_read_unlock();
+ up_read(&css_set_rwsem);
/* remember the number of threads in the array for later. */
group_size = i;
tset.tc_array = group;
@@ -2002,7 +2007,11 @@ int cgroup_attach_task_all(struct task_struct *from, struct task_struct *tsk)
mutex_lock(&cgroup_mutex);
for_each_active_root(root) {
- struct cgroup *from_cgrp = task_cgroup_from_root(from, root);
+ struct cgroup *from_cgrp;
+
+ down_read(&css_set_rwsem);
+ from_cgrp = task_cgroup_from_root(from, root);
+ up_read(&css_set_rwsem);
retval = cgroup_attach_task(from_cgrp, tsk, false);
if (retval)
@@ -2395,10 +2404,10 @@ static int cgroup_task_count(const struct cgroup *cgrp)
int count = 0;
struct cgrp_cset_link *link;
- read_lock(&css_set_lock);
+ down_read(&css_set_rwsem);
list_for_each_entry(link, &cgrp->cset_links, cset_link)
count += atomic_read(&link->cset->refcount);
- read_unlock(&css_set_lock);
+ up_read(&css_set_rwsem);
return count;
}
@@ -2629,12 +2638,12 @@ static void css_advance_task_iter(struct css_task_iter *it)
*/
void css_task_iter_start(struct cgroup_subsys_state *css,
struct css_task_iter *it)
- __acquires(css_set_lock)
+ __acquires(css_set_rwsem)
{
/* no one should try to iterate before mounting cgroups */
WARN_ON_ONCE(!use_task_css_set_links);
- read_lock(&css_set_lock);
+ down_read(&css_set_rwsem);
it->origin_css = css;
it->cset_link = &css->cgroup->cset_links;
@@ -2682,9 +2691,9 @@ struct task_struct *css_task_iter_next(struct css_task_iter *it)
* Finish task iteration started by css_task_iter_start().
*/
void css_task_iter_end(struct css_task_iter *it)
- __releases(css_set_lock)
+ __releases(css_set_rwsem)
{
- read_unlock(&css_set_lock);
+ up_read(&css_set_rwsem);
}
static inline int started_after_time(struct task_struct *t1,
@@ -2734,7 +2743,7 @@ static inline int started_after(void *p1, void *p2)
*
* @test may be NULL, meaning always true (select all tasks), which
* effectively duplicates css_task_iter_{start,next,end}() but does not
- * lock css_set_lock for the call to @process.
+ * lock css_set_rwsem for the call to @process.
*
* It is guaranteed that @process will act on every task that is a member
* of @css for the duration of this call. This function may or may not
@@ -3866,12 +3875,12 @@ static int cgroup_destroy_locked(struct cgroup *cgrp)
lockdep_assert_held(&cgroup_mutex);
/*
- * css_set_lock synchronizes access to ->cset_links and prevents
+ * css_set_rwsem synchronizes access to ->cset_links and prevents
* @cgrp from being removed while __put_css_set() is in progress.
*/
- read_lock(&css_set_lock);
+ down_read(&css_set_rwsem);
empty = list_empty(&cgrp->cset_links);
- read_unlock(&css_set_lock);
+ up_read(&css_set_rwsem);
if (!empty)
return -EBUSY;
@@ -4207,6 +4216,7 @@ int proc_cgroup_show(struct seq_file *m, void *v)
retval = 0;
mutex_lock(&cgroup_mutex);
+ down_read(&css_set_rwsem);
for_each_active_root(root) {
struct cgroup_subsys *ss;
@@ -4232,6 +4242,7 @@ int proc_cgroup_show(struct seq_file *m, void *v)
}
out_unlock:
+ up_read(&css_set_rwsem);
mutex_unlock(&cgroup_mutex);
put_task_struct(tsk);
out_free:
@@ -4327,12 +4338,12 @@ void cgroup_post_fork(struct task_struct *child)
* lock on fork.
*/
if (use_task_css_set_links) {
- write_lock(&css_set_lock);
+ down_write(&css_set_rwsem);
task_lock(child);
if (list_empty(&child->cg_list))
list_add(&child->cg_list, &task_css_set(child)->tasks);
task_unlock(child);
- write_unlock(&css_set_lock);
+ up_write(&css_set_rwsem);
}
/*
@@ -4389,15 +4400,14 @@ void cgroup_exit(struct task_struct *tsk, int run_callbacks)
int i;
/*
- * Unlink from the css_set task list if necessary.
- * Optimistically check cg_list before taking
- * css_set_lock
+ * Unlink from the css_set task list if necessary. Optimistically
+ * check cg_list before taking css_set_rwsem.
*/
if (!list_empty(&tsk->cg_list)) {
- write_lock(&css_set_lock);
+ down_write(&css_set_rwsem);
if (!list_empty(&tsk->cg_list))
list_del_init(&tsk->cg_list);
- write_unlock(&css_set_lock);
+ up_write(&css_set_rwsem);
}
/* Reassign the task to the init_css_set. */
@@ -4649,7 +4659,7 @@ static int current_css_set_cg_links_read(struct seq_file *seq, void *v)
if (!name_buf)
return -ENOMEM;
- read_lock(&css_set_lock);
+ down_read(&css_set_rwsem);
rcu_read_lock();
cset = rcu_dereference(current->cgroups);
list_for_each_entry(link, &cset->cgrp_links, cgrp_link) {
@@ -4665,7 +4675,7 @@ static int current_css_set_cg_links_read(struct seq_file *seq, void *v)
c->root->hierarchy_id, name);
}
rcu_read_unlock();
- read_unlock(&css_set_lock);
+ up_read(&css_set_rwsem);
kfree(name_buf);
return 0;
}
@@ -4676,7 +4686,7 @@ static int cgroup_css_links_read(struct seq_file *seq, void *v)
struct cgroup_subsys_state *css = seq_css(seq);
struct cgrp_cset_link *link;
- read_lock(&css_set_lock);
+ down_read(&css_set_rwsem);
list_for_each_entry(link, &css->cgroup->cset_links, cset_link) {
struct css_set *cset = link->cset;
struct task_struct *task;
@@ -4692,7 +4702,7 @@ static int cgroup_css_links_read(struct seq_file *seq, void *v)
}
}
}
- read_unlock(&css_set_lock);
+ up_read(&css_set_rwsem);
return 0;
}
--
1.8.5.3
Reimplement cgroup_transfer_tasks() so that it repeatedly fetches the
first task in the cgroup and then transfers it. This achieves the same
result without using css_scan_tasks(), which is scheduled to be
removed.
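A condensed sketch of that loop (same calls as in the hunk below; the
comments are editorial and not part of the patch). The point is that the
task must be pinned with get_task_struct() before the iterator is torn
down, since css_task_iter_end() drops the locking that kept the task from
going away, and a successful attach moves the task out of @from so the
next round fetches a different "first" task:

	struct css_task_iter it;
	struct task_struct *task;
	int ret = 0;

	do {
		/* fetch whatever is currently first in @from */
		css_task_iter_start(&from->dummy_css, &it);
		task = css_task_iter_next(&it);
		if (task)
			get_task_struct(task);	/* pin before dropping the iterator */
		css_task_iter_end(&it);

		if (task) {
			/* safe to block here; the iterator is no longer held */
			mutex_lock(&cgroup_mutex);
			ret = cgroup_attach_task(to, task, false);
			mutex_unlock(&cgroup_mutex);
			put_task_struct(task);
		}
	} while (task && !ret);	/* stop when @from is drained or attach fails */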
Signed-off-by: Tejun Heo <[email protected]>
---
kernel/cgroup.c | 31 ++++++++++++++++++++-----------
1 file changed, 20 insertions(+), 11 deletions(-)
diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index 96a3a85..a5f965c 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -2849,15 +2849,6 @@ int css_scan_tasks(struct cgroup_subsys_state *css,
return 0;
}
-static void cgroup_transfer_one_task(struct task_struct *task, void *data)
-{
- struct cgroup *new_cgroup = data;
-
- mutex_lock(&cgroup_mutex);
- cgroup_attach_task(new_cgroup, task, false);
- mutex_unlock(&cgroup_mutex);
-}
-
/**
* cgroup_trasnsfer_tasks - move tasks from one cgroup to another
* @to: cgroup to which the tasks will be moved
@@ -2865,8 +2856,26 @@ static void cgroup_transfer_one_task(struct task_struct *task, void *data)
*/
int cgroup_transfer_tasks(struct cgroup *to, struct cgroup *from)
{
- return css_scan_tasks(&from->dummy_css, NULL, cgroup_transfer_one_task,
- to, NULL);
+ struct css_task_iter it;
+ struct task_struct *task;
+ int ret = 0;
+
+ do {
+ css_task_iter_start(&from->dummy_css, &it);
+ task = css_task_iter_next(&it);
+ if (task)
+ get_task_struct(task);
+ css_task_iter_end(&it);
+
+ if (task) {
+ mutex_lock(&cgroup_mutex);
+ ret = cgroup_attach_task(to, task, false);
+ mutex_unlock(&cgroup_mutex);
+ put_task_struct(task);
+ }
+ } while (task && !ret);
+
+ return ret;
}
/*
--
1.8.5.3
Before kernfs conversion, due to the way super_block lookup works,
cgroup roots were created and made visible before being fully
initialized. This in turn required a special flag to mark that the
root hasn't been fully initialized so that the destruction path can
tell fully bound roots from half-initialized ones.
That flag is CGRP_ROOT_SUBSYS_BOUND and is no longer necessary after the
kernfs conversion as the lookup and creation of a new root are atomic
w.r.t. cgroup_mutex. This patch removes the flag and passes the
requested subsystem mask to cgroup_setup_root() so that it can set the
respective mask bits as subsystems are bound.
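Condensed from the hunks below (nothing here is new relative to the
patch), the mask now simply rides along as an argument instead of being
staged in root->subsys_mask ahead of binding:

	/* cgroup_mount() */
	ret = cgroup_setup_root(root, opts.subsys_mask);

	/* cgroup_setup_root(root, ss_mask) then binds directly */
	ret = rebind_subsystems(root, ss_mask, 0);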
Signed-off-by: Tejun Heo <[email protected]>
---
include/linux/cgroup.h | 2 --
kernel/cgroup.c | 28 ++++------------------------
2 files changed, 4 insertions(+), 26 deletions(-)
diff --git a/include/linux/cgroup.h b/include/linux/cgroup.h
index fa415a8..8ca31c1 100644
--- a/include/linux/cgroup.h
+++ b/include/linux/cgroup.h
@@ -265,8 +265,6 @@ enum {
/* mount options live below bit 16 */
CGRP_ROOT_OPTION_MASK = (1 << 16) - 1,
-
- CGRP_ROOT_SUBSYS_BOUND = (1 << 16), /* subsystems finished binding */
};
/*
diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index 47160ce..caed061 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -733,7 +733,6 @@ static void cgroup_destroy_root(struct cgroupfs_root *root)
{
struct cgroup *cgrp = &root->top_cgroup;
struct cgrp_cset_link *link, *tmp_link;
- int ret;
mutex_lock(&cgroup_tree_mutex);
mutex_lock(&cgroup_mutex);
@@ -742,11 +741,7 @@ static void cgroup_destroy_root(struct cgroupfs_root *root)
BUG_ON(!list_empty(&cgrp->children));
/* Rebind all subsystems back to the default hierarchy */
- if (root->flags & CGRP_ROOT_SUBSYS_BOUND) {
- ret = rebind_subsystems(root, 0, root->subsys_mask);
- /* Shouldn't be able to fail ... */
- BUG_ON(ret);
- }
+ WARN_ON(rebind_subsystems(root, 0, root->subsys_mask));
/*
* Release all the links from cset_links to this hierarchy's
@@ -1053,13 +1048,7 @@ static int rebind_subsystems(struct cgroupfs_root *root,
}
}
- /*
- * Mark @root has finished binding subsystems. @root->subsys_mask
- * now matches the bound subsystems.
- */
- root->flags |= CGRP_ROOT_SUBSYS_BOUND;
kernfs_activate(cgrp->kn);
-
return 0;
}
@@ -1351,15 +1340,6 @@ static struct cgroupfs_root *cgroup_root_from_opts(struct cgroup_sb_opts *opts)
init_cgroup_root(root);
- /*
- * We need to set @root->subsys_mask now so that @root can be
- * matched by cgroup_test_super() before it finishes
- * initialization; otherwise, competing mounts with the same
- * options may try to bind the same subsystems instead of waiting
- * for the first one leading to unexpected mount errors.
- * SUBSYS_BOUND will be set once actual binding is complete.
- */
- root->subsys_mask = opts->subsys_mask;
root->flags = opts->flags;
if (opts->release_agent)
strcpy(root->release_agent_path, opts->release_agent);
@@ -1370,7 +1350,7 @@ static struct cgroupfs_root *cgroup_root_from_opts(struct cgroup_sb_opts *opts)
return root;
}
-static int cgroup_setup_root(struct cgroupfs_root *root)
+static int cgroup_setup_root(struct cgroupfs_root *root, unsigned long ss_mask)
{
LIST_HEAD(tmp_links);
struct cgroup *root_cgrp = &root->top_cgroup;
@@ -1413,7 +1393,7 @@ static int cgroup_setup_root(struct cgroupfs_root *root)
if (ret)
goto destroy_root;
- ret = rebind_subsystems(root, root->subsys_mask, 0);
+ ret = rebind_subsystems(root, ss_mask, 0);
if (ret)
goto destroy_root;
@@ -1530,7 +1510,7 @@ retry:
goto out_unlock;
}
- ret = cgroup_setup_root(root);
+ ret = cgroup_setup_root(root, opts.subsys_mask);
if (ret)
cgroup_free_root(root);
--
1.8.5.3
Move it above so that the forward declaration isn't necessary. Let's also move the
definition of use_task_css_set_links next to it.
This is purely cosmetic.
Signed-off-by: Tejun Heo <[email protected]>
---
kernel/cgroup.c | 103 ++++++++++++++++++++++++++------------------------------
1 file changed, 48 insertions(+), 55 deletions(-)
diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index efae7e2..1ebc6e30 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -173,7 +173,6 @@ static int cgroup_destroy_locked(struct cgroup *cgrp);
static int cgroup_addrm_files(struct cgroup *cgrp, struct cftype cfts[],
bool is_add);
static void cgroup_pidlist_destroy_all(struct cgroup *cgrp);
-static void cgroup_enable_task_cg_lists(void);
/**
* cgroup_css - obtain a cgroup's css for the specified subsystem
@@ -370,14 +369,6 @@ static unsigned long css_set_hash(struct cgroup_subsys_state *css[])
return key;
}
-/*
- * We don't maintain the lists running through each css_set to its task
- * until after the first call to css_task_iter_start(). This reduces the
- * fork()/exit() overhead for people who have cgroups compiled into their
- * kernel but not actually in use.
- */
-static bool use_task_css_set_links __read_mostly;
-
static void __put_css_set(struct css_set *cset, int taskexit)
{
struct cgrp_cset_link *link, *tmp_link;
@@ -1305,6 +1296,54 @@ static int cgroup_remount(struct kernfs_root *kf_root, int *flags, char *data)
return ret;
}
+/*
+ * To reduce the fork() overhead for systems that are not actually using
+ * their cgroups capability, we don't maintain the lists running through
+ * each css_set to its tasks until we see the list actually used - in other
+ * words after the first mount.
+ */
+static bool use_task_css_set_links __read_mostly;
+
+static void cgroup_enable_task_cg_lists(void)
+{
+ struct task_struct *p, *g;
+
+ write_lock(&css_set_lock);
+
+ if (use_task_css_set_links)
+ goto out_unlock;
+
+ use_task_css_set_links = true;
+
+ /*
+ * We need tasklist_lock because RCU is not safe against
+ * while_each_thread(). Besides, a forking task that has passed
+ * cgroup_post_fork() without seeing use_task_css_set_links = 1
+ * is not guaranteed to have its child immediately visible in the
+ * tasklist if we walk through it with RCU.
+ */
+ read_lock(&tasklist_lock);
+ do_each_thread(g, p) {
+ task_lock(p);
+
+ WARN_ON_ONCE(!list_empty(&p->cg_list) ||
+ task_css_set(p) != &init_css_set);
+
+ /*
+ * We should check if the process is exiting, otherwise
+ * it will race with cgroup_exit() in that the list
+ * entry won't be deleted though the process has exited.
+ */
+ if (!(p->flags & PF_EXITING))
+ list_add(&p->cg_list, &task_css_set(p)->tasks);
+
+ task_unlock(p);
+ } while_each_thread(g, p);
+ read_unlock(&tasklist_lock);
+out_unlock:
+ write_unlock(&css_set_lock);
+}
+
static void init_cgroup_housekeeping(struct cgroup *cgrp)
{
atomic_set(&cgrp->refcnt, 1);
@@ -2363,52 +2402,6 @@ int cgroup_task_count(const struct cgroup *cgrp)
return count;
}
-/*
- * To reduce the fork() overhead for systems that are not actually using
- * their cgroups capability, we don't maintain the lists running through
- * each css_set to its tasks until we see the list actually used - in other
- * words after the first mount.
- */
-static void cgroup_enable_task_cg_lists(void)
-{
- struct task_struct *p, *g;
-
- write_lock(&css_set_lock);
-
- if (use_task_css_set_links)
- goto out_unlock;
-
- use_task_css_set_links = true;
-
- /*
- * We need tasklist_lock because RCU is not safe against
- * while_each_thread(). Besides, a forking task that has passed
- * cgroup_post_fork() without seeing use_task_css_set_links = 1
- * is not guaranteed to have its child immediately visible in the
- * tasklist if we walk through it with RCU.
- */
- read_lock(&tasklist_lock);
- do_each_thread(g, p) {
- task_lock(p);
-
- WARN_ON_ONCE(!list_empty(&p->cg_list) ||
- task_css_set(p) != &init_css_set);
-
- /*
- * We should check if the process is exiting, otherwise
- * it will race with cgroup_exit() in that the list
- * entry won't be deleted though the process has exited.
- */
- if (!(p->flags & PF_EXITING))
- list_add(&p->cg_list, &task_css_set(p)->tasks);
-
- task_unlock(p);
- } while_each_thread(g, p);
- read_unlock(&tasklist_lock);
-out_unlock:
- write_unlock(&css_set_lock);
-}
-
/**
* css_next_child - find the next child of a given css
* @pos_css: the current position (%NULL to initiate traversal)
--
1.8.5.3
Tasks are not linked on their css_sets until cgroup task iteration is
actually used. This is to avoid incurring overhead on the fork and
exit paths for systems which have cgroup compiled in but don't use it.
This lazy binding also affects the task migration path. It has to be
careful so that it doesn't link tasks to css_sets when task_cg_lists
linking is not enabled yet. Unfortunately, this conditional linking
in the migration path interferes with planned migration updates.
This patch moves the lazy binding a bit earlier, to the first cgroup
mount. It's a clear indication that cgroup is being used on the
system and task_cg_lists linking is highly likely to be enabled soon
anyway through "tasks" and "cgroup.procs" files.
This allows cgroup_task_migrate() to always link @tsk->cg_list. Note
that it may still race with cgroup_post_fork() but who wins that race
is inconsequential.
While at it, make use_task_css_set_links a bool, add sanity checks in
cgroup_enable_task_cg_lists() and css_task_iter_start(), and update
the former so that the enabling is guaranteed to happen only once.
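The two ends of the change, condensed from the hunks below (comments are
editorial): the enabling hook moves to mount time, which in turn lets the
migration path drop its conditional.

	/* cgroup_mount(): the first mount links every existing task */
	if (!use_task_css_set_links)
		cgroup_enable_task_cg_lists();

	/* cgroup_task_migrate(): by the time a migration can happen the task
	 * is guaranteed to be on a cg_list, so no list_empty() check */
	write_lock(&css_set_lock);
	list_move(&tsk->cg_list, &new_cset->tasks);
	write_unlock(&css_set_lock);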
Signed-off-by: Tejun Heo <[email protected]>
---
kernel/cgroup.c | 40 +++++++++++++++++++++++++++-------------
1 file changed, 27 insertions(+), 13 deletions(-)
diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index caed061..efae7e2 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -173,6 +173,7 @@ static int cgroup_destroy_locked(struct cgroup *cgrp);
static int cgroup_addrm_files(struct cgroup *cgrp, struct cftype cfts[],
bool is_add);
static void cgroup_pidlist_destroy_all(struct cgroup *cgrp);
+static void cgroup_enable_task_cg_lists(void);
/**
* cgroup_css - obtain a cgroup's css for the specified subsystem
@@ -375,7 +376,7 @@ static unsigned long css_set_hash(struct cgroup_subsys_state *css[])
* fork()/exit() overhead for people who have cgroups compiled into their
* kernel but not actually in use.
*/
-static int use_task_css_set_links __read_mostly;
+static bool use_task_css_set_links __read_mostly;
static void __put_css_set(struct css_set *cset, int taskexit)
{
@@ -1439,6 +1440,13 @@ static struct dentry *cgroup_mount(struct file_system_type *fs_type,
struct cgroup_sb_opts opts;
struct dentry *dentry;
int ret;
+
+ /*
+ * The first time anyone tries to mount a cgroup, enable the list
+ * linking each css_set to its tasks and fix up all existing tasks.
+ */
+ if (!use_task_css_set_links)
+ cgroup_enable_task_cg_lists();
retry:
mutex_lock(&cgroup_tree_mutex);
mutex_lock(&cgroup_mutex);
@@ -1692,8 +1700,7 @@ static void cgroup_task_migrate(struct cgroup *old_cgrp,
/* Update the css_set linked lists if we're using them */
write_lock(&css_set_lock);
- if (!list_empty(&tsk->cg_list))
- list_move(&tsk->cg_list, &new_cset->tasks);
+ list_move(&tsk->cg_list, &new_cset->tasks);
write_unlock(&css_set_lock);
/*
@@ -2360,13 +2367,19 @@ int cgroup_task_count(const struct cgroup *cgrp)
* To reduce the fork() overhead for systems that are not actually using
* their cgroups capability, we don't maintain the lists running through
* each css_set to its tasks until we see the list actually used - in other
- * words after the first call to css_task_iter_start().
+ * words after the first mount.
*/
static void cgroup_enable_task_cg_lists(void)
{
struct task_struct *p, *g;
+
write_lock(&css_set_lock);
- use_task_css_set_links = 1;
+
+ if (use_task_css_set_links)
+ goto out_unlock;
+
+ use_task_css_set_links = true;
+
/*
* We need tasklist_lock because RCU is not safe against
* while_each_thread(). Besides, a forking task that has passed
@@ -2377,16 +2390,22 @@ static void cgroup_enable_task_cg_lists(void)
read_lock(&tasklist_lock);
do_each_thread(g, p) {
task_lock(p);
+
+ WARN_ON_ONCE(!list_empty(&p->cg_list) ||
+ task_css_set(p) != &init_css_set);
+
/*
* We should check if the process is exiting, otherwise
* it will race with cgroup_exit() in that the list
* entry won't be deleted though the process has exited.
*/
- if (!(p->flags & PF_EXITING) && list_empty(&p->cg_list))
+ if (!(p->flags & PF_EXITING))
list_add(&p->cg_list, &task_css_set(p)->tasks);
+
task_unlock(p);
} while_each_thread(g, p);
read_unlock(&tasklist_lock);
+out_unlock:
write_unlock(&css_set_lock);
}
@@ -2619,13 +2638,8 @@ void css_task_iter_start(struct cgroup_subsys_state *css,
struct css_task_iter *it)
__acquires(css_set_lock)
{
- /*
- * The first time anyone tries to iterate across a css, we need to
- * enable the list linking each css_set to its tasks, and fix up
- * all existing tasks.
- */
- if (!use_task_css_set_links)
- cgroup_enable_task_cg_lists();
+ /* no one should try to iterate before mounting cgroups */
+ WARN_ON_ONCE(!use_task_css_set_links);
read_lock(&css_set_lock);
--
1.8.5.3
On Sun 09-02-14 08:52:33, Tejun Heo wrote:
> cgroup_task_count() read-locks css_set_lock and walks all tasks to
> count them and then returns the result. The only thing all the users
> want is determining whether the cgroup is empty or not. This patch
> implements cgroup_has_tasks() which tests whether cgroup->cset_links
> is empty, replaces all cgroup_task_count() usages and unexports it.
>
> Note that the test isn't synchronized. This is the same as before.
> The test has always been racy.
>
> This will help planned css_set locking update.
>
> Signed-off-by: Tejun Heo <[email protected]>
> Cc: Li Zefan <[email protected]>
> Cc: Johannes Weiner <[email protected]>
> Cc: Michal Hocko <[email protected]>
> Cc: Balbir Singh <[email protected]>
> Cc: KAMEZAWA Hiroyuki <[email protected]>
Looks good.
Acked-by: Michal Hocko <[email protected]>
> ---
> include/linux/cgroup.h | 8 ++++++--
> kernel/cgroup.c | 2 +-
> kernel/cpuset.c | 2 +-
> mm/memcontrol.c | 4 ++--
> 4 files changed, 10 insertions(+), 6 deletions(-)
>
> diff --git a/include/linux/cgroup.h b/include/linux/cgroup.h
> index 8ca31c1..f173cfb 100644
> --- a/include/linux/cgroup.h
> +++ b/include/linux/cgroup.h
> @@ -455,6 +455,12 @@ static inline bool cgroup_sane_behavior(const struct cgroup *cgrp)
> return cgrp->root->flags & CGRP_ROOT_SANE_BEHAVIOR;
> }
>
> +/* no synchronization, the result can only be used as a hint */
> +static inline bool cgroup_has_tasks(struct cgroup *cgrp)
> +{
> + return !list_empty(&cgrp->cset_links);
> +}
> +
> /* returns ino associated with a cgroup, 0 indicates unmounted root */
> static inline ino_t cgroup_ino(struct cgroup *cgrp)
> {
> @@ -514,8 +520,6 @@ int cgroup_rm_cftypes(struct cftype *cfts);
>
> bool cgroup_is_descendant(struct cgroup *cgrp, struct cgroup *ancestor);
>
> -int cgroup_task_count(const struct cgroup *cgrp);
> -
> /*
> * Control Group taskset, used to pass around set of tasks to cgroup_subsys
> * methods.
> diff --git a/kernel/cgroup.c b/kernel/cgroup.c
> index 1ebc6e30..96a3a85 100644
> --- a/kernel/cgroup.c
> +++ b/kernel/cgroup.c
> @@ -2390,7 +2390,7 @@ EXPORT_SYMBOL_GPL(cgroup_add_cftypes);
> *
> * Return the number of tasks in the cgroup.
> */
> -int cgroup_task_count(const struct cgroup *cgrp)
> +static int cgroup_task_count(const struct cgroup *cgrp)
> {
> int count = 0;
> struct cgrp_cset_link *link;
> diff --git a/kernel/cpuset.c b/kernel/cpuset.c
> index e97a6e8..ae190b0 100644
> --- a/kernel/cpuset.c
> +++ b/kernel/cpuset.c
> @@ -467,7 +467,7 @@ static int validate_change(struct cpuset *cur, struct cpuset *trial)
> * be changed to have empty cpus_allowed or mems_allowed.
> */
> ret = -ENOSPC;
> - if ((cgroup_task_count(cur->css.cgroup) || cur->attach_in_progress)) {
> + if ((cgroup_has_tasks(cur->css.cgroup) || cur->attach_in_progress)) {
> if (!cpumask_empty(cur->cpus_allowed) &&
> cpumask_empty(trial->cpus_allowed))
> goto out;
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index c1c2549..d9c6ac1 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -4958,7 +4958,7 @@ static int mem_cgroup_force_empty(struct mem_cgroup *memcg)
> struct cgroup *cgrp = memcg->css.cgroup;
>
> /* returns EBUSY if there is a task or if we come here twice. */
> - if (cgroup_task_count(cgrp) || !list_empty(&cgrp->children))
> + if (cgroup_has_tasks(cgrp) || !list_empty(&cgrp->children))
> return -EBUSY;
>
> /* we call try-to-free pages for make this cgroup empty */
> @@ -5140,7 +5140,7 @@ static int __memcg_activate_kmem(struct mem_cgroup *memcg,
> * of course permitted.
> */
> mutex_lock(&memcg_create_mutex);
> - if (cgroup_task_count(memcg->css.cgroup) || memcg_has_children(memcg))
> + if (cgroup_has_tasks(memcg->css.cgroup) || memcg_has_children(memcg))
> err = -EBUSY;
> mutex_unlock(&memcg_create_mutex);
> if (err)
> --
> 1.8.5.3
>
--
Michal Hocko
SUSE Labs
cgroup_attach_task() is planned to go through restructuring. Let's
tidy it up a bit in preparation.
* Update cgroup_attach_task() to receive the target task argument in
@leader instead of @tsk.
* Rename @tsk to @task.
* Rename @retval to @ret.
This is purely cosmetic.
v2: get_nr_threads() was using uninitialized @task instead of @leader.
Fixed. Reported by Dan Carpenter.
Signed-off-by: Tejun Heo <[email protected]>
Cc: Dan Carpenter <[email protected]>
---
Hello,
The fixed part doesn't affect the end result. It gets removed by the
next patchset. I'll leave the git branch as-is for now.
Thanks.
kernel/cgroup.c | 45 +++++++++++++++++++++++----------------------
1 file changed, 23 insertions(+), 22 deletions(-)
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -1726,20 +1726,20 @@ static void cgroup_task_migrate(struct c
/**
* cgroup_attach_task - attach a task or a whole threadgroup to a cgroup
* @cgrp: the cgroup to attach to
- * @tsk: the task or the leader of the threadgroup to be attached
+ * @leader: the task or the leader of the threadgroup to be attached
* @threadgroup: attach the whole threadgroup?
*
* Call holding cgroup_mutex and the group_rwsem of the leader. Will take
* task_lock of @tsk or each thread in the threadgroup individually in turn.
*/
-static int cgroup_attach_task(struct cgroup *cgrp, struct task_struct *tsk,
+static int cgroup_attach_task(struct cgroup *cgrp, struct task_struct *leader,
bool threadgroup)
{
- int retval, i, group_size;
+ int ret, i, group_size;
struct cgroupfs_root *root = cgrp->root;
struct cgroup_subsys_state *css, *failed_css = NULL;
/* threadgroup list cursor and array */
- struct task_struct *leader = tsk;
+ struct task_struct *task;
struct task_and_cgroup *tc;
struct flex_array *group;
struct cgroup_taskset tset = { };
@@ -1752,7 +1752,7 @@ static int cgroup_attach_task(struct cgr
* threads exit, this will just be an over-estimate.
*/
if (threadgroup)
- group_size = get_nr_threads(tsk);
+ group_size = get_nr_threads(leader);
else
group_size = 1;
/* flex_array supports very large thread-groups better than kmalloc. */
@@ -1760,8 +1760,8 @@ static int cgroup_attach_task(struct cgr
if (!group)
return -ENOMEM;
/* pre-allocate to guarantee space while iterating in rcu read-side. */
- retval = flex_array_prealloc(group, 0, group_size, GFP_KERNEL);
- if (retval)
+ ret = flex_array_prealloc(group, 0, group_size, GFP_KERNEL);
+ if (ret)
goto out_free_group_list;
i = 0;
@@ -1772,17 +1772,18 @@ static int cgroup_attach_task(struct cgr
*/
down_read(&css_set_rwsem);
rcu_read_lock();
+ task = leader;
do {
struct task_and_cgroup ent;
- /* @tsk either already exited or can't exit until the end */
- if (tsk->flags & PF_EXITING)
+ /* @task either already exited or can't exit until the end */
+ if (task->flags & PF_EXITING)
goto next;
/* as per above, nr_threads may decrease, but not increase. */
BUG_ON(i >= group_size);
- ent.task = tsk;
- ent.cgrp = task_cgroup_from_root(tsk, root);
+ ent.task = task;
+ ent.cgrp = task_cgroup_from_root(task, root);
/* nothing to do if this task is already in the cgroup */
if (ent.cgrp == cgrp)
goto next;
@@ -1790,13 +1791,13 @@ static int cgroup_attach_task(struct cgr
* saying GFP_ATOMIC has no effect here because we did prealloc
* earlier, but it's good form to communicate our expectations.
*/
- retval = flex_array_put(group, i, &ent, GFP_ATOMIC);
- BUG_ON(retval != 0);
+ ret = flex_array_put(group, i, &ent, GFP_ATOMIC);
+ BUG_ON(ret != 0);
i++;
next:
if (!threadgroup)
break;
- } while_each_thread(leader, tsk);
+ } while_each_thread(leader, task);
rcu_read_unlock();
up_read(&css_set_rwsem);
/* remember the number of threads in the array for later. */
@@ -1805,7 +1806,7 @@ static int cgroup_attach_task(struct cgr
tset.tc_array_len = group_size;
/* methods shouldn't be called if no task is actually migrating */
- retval = 0;
+ ret = 0;
if (!group_size)
goto out_free_group_list;
@@ -1814,8 +1815,8 @@ static int cgroup_attach_task(struct cgr
*/
for_each_css(css, i, cgrp) {
if (css->ss->can_attach) {
- retval = css->ss->can_attach(css, &tset);
- if (retval) {
+ ret = css->ss->can_attach(css, &tset);
+ if (ret) {
failed_css = css;
goto out_cancel_attach;
}
@@ -1833,7 +1834,7 @@ static int cgroup_attach_task(struct cgr
old_cset = task_css_set(tc->task);
tc->cset = find_css_set(old_cset, cgrp);
if (!tc->cset) {
- retval = -ENOMEM;
+ ret = -ENOMEM;
goto out_put_css_set_refs;
}
}
@@ -1861,9 +1862,9 @@ static int cgroup_attach_task(struct cgr
/*
* step 5: success! and cleanup
*/
- retval = 0;
+ ret = 0;
out_put_css_set_refs:
- if (retval) {
+ if (ret) {
for (i = 0; i < group_size; i++) {
tc = flex_array_get(group, i);
if (!tc->cset)
@@ -1872,7 +1873,7 @@ out_put_css_set_refs:
}
}
out_cancel_attach:
- if (retval) {
+ if (ret) {
for_each_css(css, i, cgrp) {
if (css == failed_css)
break;
@@ -1882,7 +1883,7 @@ out_cancel_attach:
}
out_free_group_list:
flex_array_free(group);
- return retval;
+ return ret;
}
/*
On 2014/2/9 21:52, Tejun Heo wrote:
> Hello,
>
> This series contains assorted cleanups which also prepare for the
> planned migration taskset handling update.
>
> This patchset contains the following sixteen patches.
>
> 0001-cgroup-disallow-xattr-release_agent-and-name-if-sane.patch
> 0002-cgroup-drop-CGRP_ROOT_SUBSYS_BOUND.patch
> 0003-cgroup-enable-task_cg_lists-on-the-first-cgroup-moun.patch
> 0004-cgroup-relocate-cgroup_enable_task_cg_lists.patch
> 0005-cgroup-implement-cgroup_has_tasks-and-unexport-cgrou.patch
> 0006-cgroup-reimplement-cgroup_transfer_tasks-without-usi.patch
> 0007-cgroup-make-css_set_lock-a-rwsem-and-rename-it-to-cs.patch
> 0008-cpuset-use-css_task_iter_start-next-end-instead-of-c.patch
> 0009-cgroup-remove-css_scan_tasks.patch
> 0010-cgroup-separate-out-put_css_set_locked-and-remove-pu.patch
> 0011-cgroup-move-css_set_rwsem-locking-outside-of-cgroup_.patch
> 0012-cgroup-drop-skip_css-from-cgroup_taskset_for_each.patch
> 0013-cpuset-don-t-use-cgroup_taskset_cur_css.patch
> 0014-cgroup-remove-cgroup_taskset_cur_css-and-cgroup_task.patch
> 0015-cgroup-cosmetic-updates-to-cgroup_attach_task.patch
> 0016-cgroup-unexport-functions.patch
>
> The notables ones are
>
> 0003-0004 move task_cg_list enabling to the first mount instead of
> the first css task iteration.
>
> 0005-0009 make css_set_lock a rwsem so that css_task_iter allows
> blocking during iteration and removes css_scan_tasks().
>
> 0010-0015 clean up migration path to prepare for the planned
> migration taskset handling update.
>
Acked-by: Li Zefan <[email protected]>
On 2014/2/9 21:52, Tejun Heo wrote:
> css_scan_tasks() doesn't have any user left. Remove it.
>
I've always disliked css_scan_tasks().
On Sun, Feb 09, 2014 at 08:52:28AM -0500, Tejun Heo wrote:
> Hello,
>
> This series contains assorted cleanups which also prepare for the
> planned migration taskset handling update.
Applied to cgroup/for-3.15. Thanks.
--
tejun