2011-04-19 10:07:47

by Konstantin Khlebnikov

[permalink] [raw]
Subject: [PATCH] cpuset: allow empty cpu/node masks

Allow to attach tasks to cpusets with empty cpu/node masks.
Handle empty masks with help guarantee_online_cpus() and guarantee_online_mems()

This aimed to fix attaching tasks to the newly created cgroups in hierarchies with
cpuset subsystem. Cpuset always require initializing cpuset.cpus and cpuset.mems,
because they are empty by default, this fact block task attaching with -ENOSPC.

After this patch empty masks are always handled by waking up the cpuset hierarchy
and searching valid masks on parents. cpuset_attach() already handle it corretly.

Signed-off-by: Konstantin Khlebnikov <[email protected]>
---
kernel/cpuset.c | 22 ++++++++++------------
1 files changed, 10 insertions(+), 12 deletions(-)

diff --git a/kernel/cpuset.c b/kernel/cpuset.c
index 33eee16..e32453a 100644
--- a/kernel/cpuset.c
+++ b/kernel/cpuset.c
@@ -226,6 +226,9 @@ static char cpuset_name[CPUSET_NAME_LEN];
static char cpuset_nodelist[CPUSET_NODELIST_LEN];
static DEFINE_SPINLOCK(cpuset_buffer_lock);

+/* Protected by cgroup_lock */
+static cpumask_var_t cpus_attach;
+
/*
* This is ugly, but preserves the userspace API for existing cpuset
* users. If someone tries to mount the "cpuset" filesystem, we
@@ -428,8 +431,10 @@ static int validate_change(const struct cpuset *cur, const struct cpuset *trial)

/* Cpusets with tasks can't have empty cpus_allowed or mems_allowed */
if (cgroup_task_count(cur->css.cgroup)) {
- if (cpumask_empty(trial->cpus_allowed) ||
- nodes_empty(trial->mems_allowed)) {
+ if ((!cpumask_empty(cur->cpus_allowed) &&
+ cpumask_empty(trial->cpus_allowed)) ||
+ (!nodes_empty(cur->mems_allowed) &&
+ nodes_empty(trial->mems_allowed))) {
return -ENOSPC;
}
}
@@ -797,8 +802,7 @@ void rebuild_sched_domains(void)
static int cpuset_test_cpumask(struct task_struct *tsk,
struct cgroup_scanner *scan)
{
- return !cpumask_equal(&tsk->cpus_allowed,
- (cgroup_cs(scan->cg))->cpus_allowed);
+ return !cpumask_equal(&tsk->cpus_allowed, cpus_attach);
}

/**
@@ -815,7 +819,7 @@ static int cpuset_test_cpumask(struct task_struct *tsk,
static void cpuset_change_cpumask(struct task_struct *tsk,
struct cgroup_scanner *scan)
{
- set_cpus_allowed_ptr(tsk, ((cgroup_cs(scan->cg))->cpus_allowed));
+ set_cpus_allowed_ptr(tsk, cpus_attach);
}

/**
@@ -835,6 +839,7 @@ static void update_tasks_cpumask(struct cpuset *cs, struct ptr_heap *heap)
{
struct cgroup_scanner scan;

+ guarantee_online_cpus(cs, cpus_attach);
scan.cg = cs->css.cgroup;
scan.test_task = cpuset_test_cpumask;
scan.process_task = cpuset_change_cpumask;
@@ -1367,18 +1372,11 @@ static int fmeter_getrate(struct fmeter *fmp)
return val;
}

-/* Protected by cgroup_lock */
-static cpumask_var_t cpus_attach;
-
/* Called by cgroups to determine if a cpuset is usable; cgroup_mutex held */
static int cpuset_can_attach(struct cgroup_subsys *ss, struct cgroup *cont,
struct task_struct *tsk, bool threadgroup)
{
int ret;
- struct cpuset *cs = cgroup_cs(cont);
-
- if (cpumask_empty(cs->cpus_allowed) || nodes_empty(cs->mems_allowed))
- return -ENOSPC;

/*
* Kthreads bound to specific cpus cannot be moved to a new cpuset; we


2011-04-19 10:18:15

by Paul Menage

[permalink] [raw]
Subject: Re: [PATCH] cpuset: allow empty cpu/node masks

On Tue, Apr 19, 2011 at 12:07 PM, Konstantin Khlebnikov
<[email protected]> wrote:
>
> This aimed to fix attaching tasks to the newly created cgroups in hierarchies with
> cpuset subsystem. Cpuset always require initializing cpuset.cpus and cpuset.mems,
> because they are empty by default, this fact block task attaching with -ENOSPC.

You can set the cgroup.clone_children to 1 to get the parent masks
automatically copied into the child. So I don't think this patch is
needed.

Paul

2011-04-19 11:08:21

by Konstantin Khlebnikov

[permalink] [raw]
Subject: Re: [PATCH] cpuset: allow empty cpu/node masks

Paul Menage wrote:
> On Tue, Apr 19, 2011 at 12:07 PM, Konstantin Khlebnikov
> <[email protected]> wrote:
>>
>> This aimed to fix attaching tasks to the newly created cgroups in hierarchies with
>> cpuset subsystem. Cpuset always require initializing cpuset.cpus and cpuset.mems,
>> because they are empty by default, this fact block task attaching with -ENOSPC.
>
> You can set the cgroup.clone_children to 1 to get the parent masks
> automatically copied into the child. So I don't think this patch is
> needed.

cgroup.clone_children does not work if ns_cgroup is mounted,
but it is ok if ns_cgroup is alredy scheduled to remove soon.

However, I still think that all cgroups must have reasonable default state.
Is cgroup.clone_children=1 planned to be default?

So, I just want to make cgroup worked out of the box:

mount -t cgroup cgroup /cgroup
mkdir /cgroup/foo
echo $$ > /cgroup/foo/tasks

Initialization before tasks attaching should be optional.

>
> Paul

2011-04-19 11:47:45

by Paul Menage

[permalink] [raw]
Subject: Re: [PATCH] cpuset: allow empty cpu/node masks

On Tue, Apr 19, 2011 at 1:08 PM, Konstantin Khlebnikov
<[email protected]> wrote:
> Is cgroup.clone_children=1 planned to be default?

I don't think so. It would be a user-visible API change, for no
significant benefit.

>
> So, I just want to make cgroup worked out of the box:
>
> mount -t cgroup cgroup /cgroup
> mkdir /cgroup/foo
> echo $$ > /cgroup/foo/tasks

Just because cgroups has a very simple filesystem-based ASCII API,
that doesn't stop it from being a complex and
barely-human-comprehendable system. :-)

I'm not sure what we gain by making that approach work - to do useful
stuff with cgroups (rather than simply playing around with moving
tasks into cgroups) you do need to have a better understanding of
what's going on. It's not too hard to make the initial instructions
explain how to only mount the subsystems that you want, or pass the
clone_children option.

Paul

2011-04-19 12:17:38

by Konstantin Khlebnikov

[permalink] [raw]
Subject: Re: [PATCH] cpuset: allow empty cpu/node masks

Paul Menage wrote:
> On Tue, Apr 19, 2011 at 1:08 PM, Konstantin Khlebnikov
> <[email protected]> wrote:
>> Is cgroup.clone_children=1 planned to be default?
>
> I don't think so. It would be a user-visible API change, for no
> significant benefit.
>
>>
>> So, I just want to make cgroup worked out of the box:
>>
>> mount -t cgroup cgroup /cgroup
>> mkdir /cgroup/foo
>> echo $$> /cgroup/foo/tasks
>
> Just because cgroups has a very simple filesystem-based ASCII API,
> that doesn't stop it from being a complex and
> barely-human-comprehendable system. :-)
>
> I'm not sure what we gain by making that approach work - to do useful
> stuff with cgroups (rather than simply playing around with moving
> tasks into cgroups) you do need to have a better understanding of
> what's going on. It's not too hard to make the initial instructions
> explain how to only mount the subsystems that you want, or pass the
> clone_children option.
>

Ok, then the problem is that the cgroupfs can be mounted without the
specifying set of subsystems. Thus this operation is unsafe for future compatibily.

I think it is best to oblige all subsystems to initialize their css
by default with reasonable unlimited values or to inherit they from parent.

Currently cpuset is only one cgroup that "does not work by default",
and I see no reason why it should be special.