2008-07-07 22:58:36

by Matt Helsley

[permalink] [raw]
Subject: [PATCH 0/4] Container Freezer: Reuse Suspend Freezer


This patchset reuses the container infrastructure and the swsusp freezer to
freeze a group of tasks.

The freezer subsystem in the container filesystem defines a file named
freezer.state. Writing "FROZEN" to the state file will freeze all tasks in the
cgroup. Subsequently writing "RUNNING" will unfreeze the tasks in the cgroup.
Reading will return the current state.

* Examples of usage :

# mkdir /containers/freezer
# mount -t cgroup -ofreezer,signal freezer /containers
# mkdir /containers/0
# echo $some_pid > /containers/0/tasks

to get status of the freezer subsystem :

# cat /containers/0/freezer.state
RUNNING

to freeze all tasks in the container :

# echo FROZEN > /containers/0/freezer.state
# cat /containers/0/freezer.state
FREEZING
# cat /containers/0/freezer.state
FROZEN

to unfreeze all tasks in the container :

# echo RUNNING > /containers/0/freezer.state
# cat /containers/0/freezer.state
RUNNING

to kill all tasks in the container :

# echo 9 > /containers/0/signal.kill

I've reworked Cedric's patches to use task_lock() to protect access to the
task's cgroup.

Paul, Pavel asked me to send these to Rafael next. They are patches to make
the freezer useful for checkpoint/restart using cgroups so it would be nice
to get an explicit [N]Ack from you first.

Rafael, if Paul agrees, please consider applying these patches.

Changes since v3:
v4 (Almost all of these changes are confined to patch 3):
Reworked the series to use task_lock() instead of RCU.
Reworked the series to use write_string() and read_seq_string()
cgroup methods.
Fixed the race Paul Menage identified.
Fixed up check_if_frozen() to do more than just test the FROZEN
flag. In some cases tasks could be stopped (T) and marked
FREEZING. When that happens we can safely assume that it
will be frozen immediately upon waking up in the kernel.
Waiting for it to get marked with PF_FROZEN in order to
transition to the FROZEN state would block unnecessarily.
Removed freezer_ prefix from static functions in cgroup_freezer.c.
Simplified STATE_ switch.
Updated the locking comments.
v3:
Ported to 2.6.26-rc5-mm2 with Rafael's freezer patches
Tested on 24 combinations of 3 architectures (x86, x86_64, ppc64)
with 8 different kernel configs varying power management
and cgroup config variables. Each patch builds and boots
in these 24 combinations.
Passes functional testing.
v2 (roughly patches 3 and 5):
Moved the "kill" file into a separate cgroup subsystem (signal) and
it's own patch.
Changed the name of the file from freezer.freeze to freezer.state.
Switched from taking 1 and 0 as input to the strings "FROZEN" and
"RUNNING", respectively. This helps keep the interface
human-usable if/when we need to more states.
Checked that stopped or interrupted is "frozen enough"
Since try_to_freeze() is called upon wakeup of these tasks
this should be fine. This idea comes from recent changes to
the freezer.
Checked that if (task == current) whilst freezing cgroup we're ok
Fixed bug where -EBUSY would always be returned when freezing
Added code to handle userspace retries for any remaining -EBUSY

Cheers,
-Matt Helsley

--


2008-07-07 23:02:37

by Matt Helsley

[permalink] [raw]
Subject: Re: [PATCH 0/4] Container Freezer: Reuse Suspend Freezer


On Mon, 2008-07-07 at 15:58 -0700, Matt Helsley wrote:
> This patchset reuses the container infrastructure and the swsusp freezer to
> freeze a group of tasks.
>
> The freezer subsystem in the container filesystem defines a file named
> freezer.state. Writing "FROZEN" to the state file will freeze all tasks in the
> cgroup. Subsequently writing "RUNNING" will unfreeze the tasks in the cgroup.
> Reading will return the current state.
>
> * Examples of usage :
>
> # mkdir /containers/freezer
> # mount -t cgroup -ofreezer,signal freezer /containers
> # mkdir /containers/0
> # echo $some_pid > /containers/0/tasks
>
> to get status of the freezer subsystem :
>
> # cat /containers/0/freezer.state
> RUNNING
>
> to freeze all tasks in the container :
>
> # echo FROZEN > /containers/0/freezer.state
> # cat /containers/0/freezer.state
> FREEZING
> # cat /containers/0/freezer.state
> FROZEN
>
> to unfreeze all tasks in the container :
>
> # echo RUNNING > /containers/0/freezer.state
> # cat /containers/0/freezer.state
> RUNNING
>
> to kill all tasks in the container :
>
> # echo 9 > /containers/0/signal.kill
>
> I've reworked Cedric's patches to use task_lock() to protect access to the
> task's cgroup.
>
> Paul, Pavel asked me to send these to Rafael next. They are patches to make
> the freezer useful for checkpoint/restart using cgroups so it would be nice
> to get an explicit [N]Ack from you first.
>
> Rafael, if Paul agrees, please consider applying these patches.
>
> Changes since v3:
> v4 (Almost all of these changes are confined to patch 3):
> Reworked the series to use task_lock() instead of RCU.
> Reworked the series to use write_string() and read_seq_string()
> cgroup methods.

FYI - This means these patches need Paul's patches introducing
write_string(). I can certainly restore the old code for .read
and .write, but I was anticipating write_string() making it into various
trees first. If that's not necessarily the case please let me know.

Cheers,
-Matt

2008-07-08 03:29:19

by Kamezawa Hiroyuki

[permalink] [raw]
Subject: Re: [PATCH 0/4] Container Freezer: Reuse Suspend Freezer

Hi, could I make brief questions ?

On Mon, 07 Jul 2008 15:58:23 -0700
Matt Helsley <[email protected]> wrote:
> to get status of the freezer subsystem :
>
> # cat /containers/0/freezer.state
> RUNNING
>
> to freeze all tasks in the container :
>
> # echo FROZEN > /containers/0/freezer.state
> # cat /containers/0/freezer.state
> FREEZING
> # cat /containers/0/freezer.state
> FROZEN
>
I'm just curious.

1. When we see FREEZING and have to retry ?
While there are some threads which wait for some event ?

2. What happens when FROZEN threads are moved to other group ?
Can we move them ?
Can we wake them up (if it can be moved) ?
What operations are allowed to frozen threads ?

Thanks,
-Kame

2008-07-08 19:57:46

by Matt Helsley

[permalink] [raw]
Subject: Re: [PATCH 0/4] Container Freezer: Reuse Suspend Freezer


Cedric was accidentally not Cc'd on the introduction (PATCH 0). Adding
him.

On Tue, 2008-07-08 at 12:31 +0900, KAMEZAWA Hiroyuki wrote:
> Hi, could I make brief questions ?
>
> On Mon, 07 Jul 2008 15:58:23 -0700
> Matt Helsley <[email protected]> wrote:
> > to get status of the freezer subsystem :
> >
> > # cat /containers/0/freezer.state
> > RUNNING
> >
> > to freeze all tasks in the container :
> >
> > # echo FROZEN > /containers/0/freezer.state
> > # cat /containers/0/freezer.state
> > FREEZING
> > # cat /containers/0/freezer.state
> > FROZEN
> >
> I'm just curious.
>
> 1. When we see FREEZING and have to retry ?

One example is when some processes are in the a specific portion of
vfork() you might see FREEZING and have to retry.

> While there are some threads which wait for some event ?

Depending on which kind of "wait" they are performing, yes. If it's
uninterruptible sleep and the PF_FROZEN flag is not set then you might
see FREEZING.

> 2. What happens when FROZEN threads are moved to other group ?
> Can we move them ?

If the destination cgroup is not FROZEN, yes.

If the destination cgroup is FREEZING this works as expected.

If the destination cgroup is RUNNING you'd have a problem unfreezing the
task. This happends because the cgroup has a state inconsistent with the
task's state. To unfreeze the task you'd have to try to freeze and then
unfreeze the destination cgroup.

There are several ways I could change this.

One is to try and disallow users from moving frozen tasks. That doesn't
seem like a good approach since it would require a new cgroups interface
"can_detach()".

However we can prevent attach so I could add checks that look at the
attaching tasks's state and refuse the attach when the task's state
(unfrozen, frozen) is inconsistent with the cgroup state (RUNNING,
FREEZING, FROZEN). I'll send a fifth patch on top of this series showing
this idea.

Rather than refuse to allow attach we could change the destination
cgroup's state during attach so that the two states are consistent.
However, this introduces more ugly cases for userspace to be aware of.

> Can we wake them up (if it can be moved) ?

You can't wake them until you've unfrozen them.

> What operations are allowed to frozen threads ?

Any operation that doesn't require the threads to run.

Cheers,
-Matt Helsley

2008-07-08 20:06:52

by Paul Menage

[permalink] [raw]
Subject: Re: [PATCH 0/4] Container Freezer: Reuse Suspend Freezer

On Tue, Jul 8, 2008 at 12:39 PM, Matt Helsley <[email protected]> wrote:
>
> One is to try and disallow users from moving frozen tasks. That doesn't
> seem like a good approach since it would require a new cgroups interface
> "can_detach()".

Detaching from the old cgroup happens at the same time as attaching to
the new cgroup, so can_attach() would work here.

Paul

2008-07-08 20:07:33

by Paul Menage

[permalink] [raw]
Subject: Re: [PATCH 0/4] Container Freezer: Reuse Suspend Freezer

On Tue, Jul 8, 2008 at 1:06 PM, Paul Menage <[email protected]> wrote:
> On Tue, Jul 8, 2008 at 12:39 PM, Matt Helsley <[email protected]> wrote:
>>
>> One is to try and disallow users from moving frozen tasks. That doesn't
>> seem like a good approach since it would require a new cgroups interface
>> "can_detach()".
>
> Detaching from the old cgroup happens at the same time as attaching to
> the new cgroup, so can_attach() would work here.

And the whole can_attach()/attach() protocol needs reworking anyway,
see my email (hopefully) later today.

Paul

2008-07-09 21:59:20

by Matt Helsley

[permalink] [raw]
Subject: Re: [PATCH 0/4] Container Freezer: Reuse Suspend Freezer


On Tue, 2008-07-08 at 13:07 -0700, Paul Menage wrote:
> On Tue, Jul 8, 2008 at 1:06 PM, Paul Menage <[email protected]> wrote:
> > On Tue, Jul 8, 2008 at 12:39 PM, Matt Helsley <[email protected]> wrote:
> >>
> >> One is to try and disallow users from moving frozen tasks. That doesn't
> >> seem like a good approach since it would require a new cgroups interface
> >> "can_detach()".
> >
> > Detaching from the old cgroup happens at the same time as attaching to
> > the new cgroup, so can_attach() would work here.

Update: I've made a patch implementing this. However it might be better
to just modify attach() to thaw the moving task rather than disallow
moving the frozen task. Serge, Cedric, Kame-san, do you have any
thoughts on which is more useful and/or intuitive?

> And the whole can_attach()/attach() protocol needs reworking anyway,
> see my email (hopefully) later today.
>
> Paul

Interesting. I look forward to seeing this.

Cheers,
-Matt

2008-07-10 00:39:29

by Kamezawa Hiroyuki

[permalink] [raw]
Subject: Re: [PATCH 0/4] Container Freezer: Reuse Suspend Freezer

On Wed, 09 Jul 2008 14:58:43 -0700
Matt Helsley <[email protected]> wrote:

>
> On Tue, 2008-07-08 at 13:07 -0700, Paul Menage wrote:
> > On Tue, Jul 8, 2008 at 1:06 PM, Paul Menage <[email protected]> wrote:
> > > On Tue, Jul 8, 2008 at 12:39 PM, Matt Helsley <[email protected]> wrote:
> > >>
> > >> One is to try and disallow users from moving frozen tasks. That doesn't
> > >> seem like a good approach since it would require a new cgroups interface
> > >> "can_detach()".
> > >
> > > Detaching from the old cgroup happens at the same time as attaching to
> > > the new cgroup, so can_attach() would work here.
>
> Update: I've made a patch implementing this. However it might be better
> to just modify attach() to thaw the moving task rather than disallow
> moving the frozen task. Serge, Cedric, Kame-san, do you have any
> thoughts on which is more useful and/or intuitive?
>

Thank you for explanation in previous mail.

Hmm, just thawing seems atractive but it will confuse people (I think).

I think some kind of process-group is freezed by this freezer and "moving
freezed task" is wrong(unexpected) operation in general. And there will
be no demand to do that from users.
I think just taking "moving freezed task" as error-operation and returning
-EBUSY is better.

Thanks,
-Kame

> > And the whole can_attach()/attach() protocol needs reworking anyway,
> > see my email (hopefully) later today.
> >
> > Paul
>
> Interesting. I look forward to seeing this.
>
> Cheers,
> -Matt
>
>

2008-07-10 02:18:44

by Matt Helsley

[permalink] [raw]
Subject: [RFC][PATCH] Container Freezer: Don't Let Frozen Stuff Change


On Thu, 2008-07-10 at 09:42 +0900, KAMEZAWA Hiroyuki wrote:
> On Wed, 09 Jul 2008 14:58:43 -0700
> Matt Helsley <[email protected]> wrote:
>
> >
> > On Tue, 2008-07-08 at 13:07 -0700, Paul Menage wrote:
> > > On Tue, Jul 8, 2008 at 1:06 PM, Paul Menage <[email protected]> wrote:
> > > > On Tue, Jul 8, 2008 at 12:39 PM, Matt Helsley <[email protected]> wrote:
> > > >>
> > > >> One is to try and disallow users from moving frozen tasks. That doesn't
> > > >> seem like a good approach since it would require a new cgroups interface
> > > >> "can_detach()".
> > > >
> > > > Detaching from the old cgroup happens at the same time as attaching to
> > > > the new cgroup, so can_attach() would work here.
> >
> > Update: I've made a patch implementing this. However it might be better
> > to just modify attach() to thaw the moving task rather than disallow
> > moving the frozen task. Serge, Cedric, Kame-san, do you have any
> > thoughts on which is more useful and/or intuitive?
> >
>
> Thank you for explanation in previous mail.
>
> Hmm, just thawing seems atractive but it will confuse people (I think).
>
> I think some kind of process-group is freezed by this freezer and "moving
> freezed task" is wrong(unexpected) operation in general. And there will
> be no demand to do that from users.
> I think just taking "moving freezed task" as error-operation and returning
> -EBUSY is better.

Kame-san,

I've been working on changes to the can_attach() code so it was pretty
easy to try this out.

Don't let frozen tasks or cgroups change. This means frozen tasks can't
leave their current cgroup for another cgroup. It also means that tasks
cannot be added to or removed from a cgroup in the FROZEN state. We
enforce these rules by checking for frozen tasks and cgroups in the
can_attach() function.

Signed-off-by: Matt Helsley <[email protected]>
---
Builds, boots, passes testing against 2.6.26-rc5-mm2

kernel/cgroup_freezer.c | 42 +++++++++++++++++++++++++-----------------
1 file changed, 25 insertions(+), 17 deletions(-)

Index: linux-2.6.26-rc5-mm2/kernel/cgroup_freezer.c
===================================================================
--- linux-2.6.26-rc5-mm2.orig/kernel/cgroup_freezer.c
+++ linux-2.6.26-rc5-mm2/kernel/cgroup_freezer.c
@@ -89,26 +89,43 @@ static void freezer_destroy(struct cgrou
struct cgroup *cgroup)
{
kfree(cgroup_freezer(cgroup));
}

+/* Task is frozen or will freeze immediately when next it gets woken */
+static bool is_task_frozen_enough(struct task_struct *task)
+{
+ return (frozen(task) || (task_is_stopped_or_traced(task) && freezing(task)));
+}

+/*
+ * The call to cgroup_lock() in the freezer.state write method prevents
+ * a write to that file racing against an attach, and hence the
+ * can_attach() result will remain valid until the attach completes.
+ */
static int freezer_can_attach(struct cgroup_subsys *ss,
struct cgroup *new_cgroup,
struct task_struct *task)
{
struct freezer *freezer;
- int retval = 0;
+ int retval;
+
+ /* Anything frozen can't move or be moved to/from */
+
+ if (is_task_frozen_enough(task))
+ return -EBUSY;

- /*
- * The call to cgroup_lock() in the freezer.state write method prevents
- * a write to that file racing against an attach, and hence the
- * can_attach() result will remain valid until the attach completes.
- */
freezer = cgroup_freezer(new_cgroup);
if (freezer->state == STATE_FROZEN)
+ return -EBUSY;
+
+ retval = 0;
+ task_lock(task);
+ freezer = task_freezer(task);
+ if (freezer->state == STATE_FROZEN)
retval = -EBUSY;
+ task_unlock(task);
return retval;
}

static void freezer_fork(struct cgroup_subsys *ss, struct task_struct *task)
{
@@ -139,16 +156,11 @@ static void check_if_frozen(struct cgrou
unsigned int nfrozen = 0, ntotal = 0;

cgroup_iter_start(cgroup, &it);
while ((task = cgroup_iter_next(cgroup, &it))) {
ntotal++;
- /*
- * Task is frozen or will freeze immediately when next it gets
- * woken
- */
- if (frozen(task) ||
- (task_is_stopped_or_traced(task) && freezing(task)))
+ if (is_task_frozen_enough(task))
nfrozen++;
}

/*
* Transition to FROZEN when no new tasks can be added ensures
@@ -195,15 +207,11 @@ static int try_to_freeze_cgroup(struct c
freezer->state = STATE_FREEZING;
cgroup_iter_start(cgroup, &it);
while ((task = cgroup_iter_next(cgroup, &it))) {
if (!freeze_task(task, true))
continue;
- if (task_is_stopped_or_traced(task) && freezing(task))
- /*
- * The freeze flag is set so these tasks will
- * immediately go into the fridge upon waking.
- */
+ if (is_task_frozen_enough(task))
continue;
if (!freezing(task) && !freezer_should_skip(task))
num_cant_freeze_now++;
}
cgroup_iter_end(cgroup, &it);

2008-07-10 03:22:14

by Li Zefan

[permalink] [raw]
Subject: Re: [RFC][PATCH] Container Freezer: Don't Let Frozen Stuff Change

Matt Helsley wrote:
> On Thu, 2008-07-10 at 09:42 +0900, KAMEZAWA Hiroyuki wrote:
>> On Wed, 09 Jul 2008 14:58:43 -0700
>> Matt Helsley <[email protected]> wrote:
>>
>>> On Tue, 2008-07-08 at 13:07 -0700, Paul Menage wrote:
>>>> On Tue, Jul 8, 2008 at 1:06 PM, Paul Menage <[email protected]> wrote:
>>>>> On Tue, Jul 8, 2008 at 12:39 PM, Matt Helsley <[email protected]> wrote:
>>>>>> One is to try and disallow users from moving frozen tasks. That doesn't
>>>>>> seem like a good approach since it would require a new cgroups interface
>>>>>> "can_detach()".
>>>>> Detaching from the old cgroup happens at the same time as attaching to
>>>>> the new cgroup, so can_attach() would work here.
>>> Update: I've made a patch implementing this. However it might be better
>>> to just modify attach() to thaw the moving task rather than disallow
>>> moving the frozen task. Serge, Cedric, Kame-san, do you have any
>>> thoughts on which is more useful and/or intuitive?
>>>
>> Thank you for explanation in previous mail.
>>
>> Hmm, just thawing seems atractive but it will confuse people (I think).
>>
>> I think some kind of process-group is freezed by this freezer and "moving
>> freezed task" is wrong(unexpected) operation in general. And there will
>> be no demand to do that from users.
>> I think just taking "moving freezed task" as error-operation and returning
>> -EBUSY is better.
>
> Kame-san,
>
> I've been working on changes to the can_attach() code so it was pretty
> easy to try this out.
>
> Don't let frozen tasks or cgroups change. This means frozen tasks can't
> leave their current cgroup for another cgroup. It also means that tasks
> cannot be added to or removed from a cgroup in the FROZEN state. We
> enforce these rules by checking for frozen tasks and cgroups in the
> can_attach() function.
>
> Signed-off-by: Matt Helsley <[email protected]>
> ---
> Builds, boots, passes testing against 2.6.26-rc5-mm2
>
> kernel/cgroup_freezer.c | 42 +++++++++++++++++++++++++-----------------
> 1 file changed, 25 insertions(+), 17 deletions(-)
>
> Index: linux-2.6.26-rc5-mm2/kernel/cgroup_freezer.c
> ===================================================================
> --- linux-2.6.26-rc5-mm2.orig/kernel/cgroup_freezer.c
> +++ linux-2.6.26-rc5-mm2/kernel/cgroup_freezer.c
> @@ -89,26 +89,43 @@ static void freezer_destroy(struct cgrou
> struct cgroup *cgroup)
> {
> kfree(cgroup_freezer(cgroup));
> }
>
> +/* Task is frozen or will freeze immediately when next it gets woken */
> +static bool is_task_frozen_enough(struct task_struct *task)
> +{
> + return (frozen(task) || (task_is_stopped_or_traced(task) && freezing(task)));
> +}
>
> +/*
> + * The call to cgroup_lock() in the freezer.state write method prevents
> + * a write to that file racing against an attach, and hence the
> + * can_attach() result will remain valid until the attach completes.
> + */
> static int freezer_can_attach(struct cgroup_subsys *ss,
> struct cgroup *new_cgroup,
> struct task_struct *task)
> {
> struct freezer *freezer;
> - int retval = 0;
> + int retval;
> +
> + /* Anything frozen can't move or be moved to/from */
> +
> + if (is_task_frozen_enough(task))
> + return -EBUSY;
>

cgroup_lock() can prevent the state change of old_cgroup and new_cgroup, but
will the following racy happen ?
1 2
can_attach(tsk)
is_task_frozen_enough(tsk) == false
freeze_task(tsk)
attach(tsk)

i.e., will is_task_frozen_enough(tsk) remain valid through can_attach() and attach()?

> - /*
> - * The call to cgroup_lock() in the freezer.state write method prevents
> - * a write to that file racing against an attach, and hence the
> - * can_attach() result will remain valid until the attach completes.
> - */
> freezer = cgroup_freezer(new_cgroup);
> if (freezer->state == STATE_FROZEN)
> + return -EBUSY;
> +
> + retval = 0;
> + task_lock(task);
> + freezer = task_freezer(task);
> + if (freezer->state == STATE_FROZEN)
> retval = -EBUSY;
> + task_unlock(task);
> return retval;
> }
>
> static void freezer_fork(struct cgroup_subsys *ss, struct task_struct *task)
> {
> @@ -139,16 +156,11 @@ static void check_if_frozen(struct cgrou
> unsigned int nfrozen = 0, ntotal = 0;
>
> cgroup_iter_start(cgroup, &it);
> while ((task = cgroup_iter_next(cgroup, &it))) {
> ntotal++;
> - /*
> - * Task is frozen or will freeze immediately when next it gets
> - * woken
> - */
> - if (frozen(task) ||
> - (task_is_stopped_or_traced(task) && freezing(task)))
> + if (is_task_frozen_enough(task))
> nfrozen++;
> }
>
> /*
> * Transition to FROZEN when no new tasks can be added ensures
> @@ -195,15 +207,11 @@ static int try_to_freeze_cgroup(struct c
> freezer->state = STATE_FREEZING;
> cgroup_iter_start(cgroup, &it);
> while ((task = cgroup_iter_next(cgroup, &it))) {
> if (!freeze_task(task, true))
> continue;
> - if (task_is_stopped_or_traced(task) && freezing(task))
> - /*
> - * The freeze flag is set so these tasks will
> - * immediately go into the fridge upon waking.
> - */
> + if (is_task_frozen_enough(task))
> continue;
> if (!freezing(task) && !freezer_should_skip(task))
> num_cant_freeze_now++;
> }
> cgroup_iter_end(cgroup, &it);
>

2008-07-10 15:44:01

by Serge E. Hallyn

[permalink] [raw]
Subject: Re: [PATCH 0/4] Container Freezer: Reuse Suspend Freezer

Quoting KAMEZAWA Hiroyuki ([email protected]):
> On Wed, 09 Jul 2008 14:58:43 -0700
> Matt Helsley <[email protected]> wrote:
>
> >
> > On Tue, 2008-07-08 at 13:07 -0700, Paul Menage wrote:
> > > On Tue, Jul 8, 2008 at 1:06 PM, Paul Menage <[email protected]> wrote:
> > > > On Tue, Jul 8, 2008 at 12:39 PM, Matt Helsley <[email protected]> wrote:
> > > >>
> > > >> One is to try and disallow users from moving frozen tasks. That doesn't
> > > >> seem like a good approach since it would require a new cgroups interface
> > > >> "can_detach()".
> > > >
> > > > Detaching from the old cgroup happens at the same time as attaching to
> > > > the new cgroup, so can_attach() would work here.
> >
> > Update: I've made a patch implementing this. However it might be better
> > to just modify attach() to thaw the moving task rather than disallow
> > moving the frozen task. Serge, Cedric, Kame-san, do you have any
> > thoughts on which is more useful and/or intuitive?
> >
>
> Thank you for explanation in previous mail.
>
> Hmm, just thawing seems atractive but it will confuse people (I think).
>
> I think some kind of process-group is freezed by this freezer and "moving
> freezed task" is wrong(unexpected) operation in general. And there will
> be no demand to do that from users.
> I think just taking "moving freezed task" as error-operation and returning
> -EBUSY is better.
>
> Thanks,
> -Kame

I'm torn. Allowing the moves is kind of cool, but I think I agree that
we should start out with the simpler semantics, which in this case is
disallowing the move. The race Li may have found will only become more
complicated when both sides of the race can change the task's frozen
state.

-serge

2008-07-11 23:52:44

by Matt Helsley

[permalink] [raw]
Subject: Re: [RFC][PATCH] Container Freezer: Don't Let Frozen Stuff Change


On Thu, 2008-07-10 at 11:20 +0800, Li Zefan wrote:
> Matt Helsley wrote:
> > On Thu, 2008-07-10 at 09:42 +0900, KAMEZAWA Hiroyuki wrote:
> >> On Wed, 09 Jul 2008 14:58:43 -0700
> >> Matt Helsley <[email protected]> wrote:
> >>
> >>> On Tue, 2008-07-08 at 13:07 -0700, Paul Menage wrote:
> >>>> On Tue, Jul 8, 2008 at 1:06 PM, Paul Menage <[email protected]> wrote:
> >>>>> On Tue, Jul 8, 2008 at 12:39 PM, Matt Helsley <[email protected]> wrote:
> >>>>>> One is to try and disallow users from moving frozen tasks. That doesn't
> >>>>>> seem like a good approach since it would require a new cgroups interface
> >>>>>> "can_detach()".
> >>>>> Detaching from the old cgroup happens at the same time as attaching to
> >>>>> the new cgroup, so can_attach() would work here.
> >>> Update: I've made a patch implementing this. However it might be better
> >>> to just modify attach() to thaw the moving task rather than disallow
> >>> moving the frozen task. Serge, Cedric, Kame-san, do you have any
> >>> thoughts on which is more useful and/or intuitive?
> >>>
> >> Thank you for explanation in previous mail.
> >>
> >> Hmm, just thawing seems atractive but it will confuse people (I think).
> >>
> >> I think some kind of process-group is freezed by this freezer and "moving
> >> freezed task" is wrong(unexpected) operation in general. And there will
> >> be no demand to do that from users.
> >> I think just taking "moving freezed task" as error-operation and returning
> >> -EBUSY is better.
> >
> > Kame-san,
> >
> > I've been working on changes to the can_attach() code so it was pretty
> > easy to try this out.
> >
> > Don't let frozen tasks or cgroups change. This means frozen tasks can't
> > leave their current cgroup for another cgroup. It also means that tasks
> > cannot be added to or removed from a cgroup in the FROZEN state. We
> > enforce these rules by checking for frozen tasks and cgroups in the
> > can_attach() function.
> >
> > Signed-off-by: Matt Helsley <[email protected]>
> > ---
> > Builds, boots, passes testing against 2.6.26-rc5-mm2
> >
> > kernel/cgroup_freezer.c | 42 +++++++++++++++++++++++++-----------------
> > 1 file changed, 25 insertions(+), 17 deletions(-)
> >
> > Index: linux-2.6.26-rc5-mm2/kernel/cgroup_freezer.c
> > ===================================================================
> > --- linux-2.6.26-rc5-mm2.orig/kernel/cgroup_freezer.c
> > +++ linux-2.6.26-rc5-mm2/kernel/cgroup_freezer.c
> > @@ -89,26 +89,43 @@ static void freezer_destroy(struct cgrou
> > struct cgroup *cgroup)
> > {
> > kfree(cgroup_freezer(cgroup));
> > }
> >
> > +/* Task is frozen or will freeze immediately when next it gets woken */
> > +static bool is_task_frozen_enough(struct task_struct *task)
> > +{
> > + return (frozen(task) || (task_is_stopped_or_traced(task) && freezing(task)));
> > +}
> >
> > +/*
> > + * The call to cgroup_lock() in the freezer.state write method prevents
> > + * a write to that file racing against an attach, and hence the
> > + * can_attach() result will remain valid until the attach completes.
> > + */
> > static int freezer_can_attach(struct cgroup_subsys *ss,
> > struct cgroup *new_cgroup,
> > struct task_struct *task)
> > {
> > struct freezer *freezer;
> > - int retval = 0;
> > + int retval;
> > +
> > + /* Anything frozen can't move or be moved to/from */
> > +
> > + if (is_task_frozen_enough(task))
> > + return -EBUSY;
> >
>
> cgroup_lock() can prevent the state change of old_cgroup and new_cgroup, but
> will the following racy happen ?
> 1 2

For most of the paths using these functions we have:

cgroup_lock() cgroup_lock()
... ...
> can_attach(tsk)
> is_task_frozen_enough(tsk) == false
> freeze_task(tsk)
or thaw_process(tsk)
> attach(tsk)
... ...
cgroup_unlock() cgroup_unlock()

I've checked the cgroup freezer subsystem and the cgroup "core" and
this interleaving isn't possible between those two pieces. Only the
swsusp invocation of freeze_task() does not protect freeze/thaw with the
cgroup_lock. I'll be looking into this some more to see if that's really
a problem and if so how we might solve it.

Thanks for this excellent question.

Cheers,
-Matt Helsley