2012-05-23 10:40:50

by Cyrill Gorcunov

Subject: [rfc v2 0/7] procfs fdinfo extension v2

Hi guys,

here is an updated version of "get more detailed fdinfo" on
eventfd/epoll/fsnotify files. The main change from the previous
version is that now almost everything is under the CONFIG_CHECKPOINT_RESTORE
config symbol (except for the conversion of the fdinfo proc handling routines
to seq files, which I think increases readability for mainline as well).
I've tried hard to minimize the impact on the source code.
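
As an illustration only (not part of this series), here is a minimal Python sketch of how a userspace tool might read /proc/<pid>/fdinfo/<fd> and split it into fields. It assumes only the generic "key: value" line layout already used by the existing "pos" and "flags" entries; the helper names parse_fdinfo and read_fdinfo are hypothetical, and the exact fields the extension emits are not assumed here.

```python
from collections import defaultdict
from pathlib import Path


def parse_fdinfo(text):
    """Parse 'key: value' lines of an fdinfo file into {key: [values]}.

    A key may repeat (one line per watched object, for example), so each
    key maps to a list of values in the order they appeared.
    """
    fields = defaultdict(list)
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # skip lines that do not look like 'key: value'
        fields[key.strip()].append(value.strip())
    return dict(fields)


def read_fdinfo(fd, pid="self"):
    """Read and parse /proc/<pid>/fdinfo/<fd> (Linux only)."""
    return parse_fdinfo(Path(f"/proc/{pid}/fdinfo/{fd}").read_text())


if __name__ == "__main__":
    # Sample in the generic fdinfo layout; the values are made up.
    sample = "pos:\t0\nflags:\t02\n"
    print(parse_fdinfo(sample))
```

Keeping the parser separate from the file read means it can be exercised on sample text without a live /proc.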

Comments, and especially complaints, are highly appreciated (even a proposal
of a different design is welcome).

Andrew, as to the question of what else is needed for c/r in mainline -- that's
what I counted (at least at the moment)

- fdinfo to restore eventfd/epoll/fsnotify (this series)

- restore of file-owner UIDs (there was a series from
Eric for mapping kernel-uids to user-uids, so I'm waiting
for its final merge before I adapt my patch and send it out;
maybe I've missed something and Eric's whole series is already
in linux-next, need to check)

- IPC c/r (there was a series from Stas; not sure what its status
is at the moment, as far as I remember it should be reworked)

- finally it would be great to have the ability to attach tasks to
a frozen tasks cgroup to thaw them at one moment (I proposed a preliminary
patch pretty long ago, but Tejun was modifying the cgroups
code and asked to wait until 3.4 is released, so I haven't checked
the current status of task cgroups yet; it's in my todo list)

that's all I know.

Cyrill


2012-05-24 18:02:22

by Matt Helsley

Subject: Re: [rfc v2 0/7] procfs fdinfo extension v2

On Wed, May 23, 2012 at 02:25:41PM +0400, Cyrill Gorcunov wrote:

<snip>

> - finally it would be great to have ability to attach tasks to
> frozen tasks cgroup to thaw them at one moment (the preliminary
> patch I've proposed pretty long ago, but Tejun was modifying cgroups
> code and asked to wait until 3.4 is released, so I didn't check
> the current status of task cgroups at moment, I've it in my todo list)

This still strikes me as the wrong way to go about freezing for c/r.
You never explained why you had to do it this way. Why can't you inject
the parasite thread, move that thread out of the cgroup-to-be-frozen,
then freeze?

As best I can tell your reply last time only fleshed out the details
of *how* you would like it to work, not *why it needs to* work that way:

http://lkml.org/lkml/2011/11/30/27

Cheers,
-Matt Helsley

2012-05-24 18:23:53

by Cyrill Gorcunov

Subject: Re: [rfc v2 0/7] procfs fdinfo extension v2

On Thu, May 24, 2012 at 11:01:21AM -0700, Matt Helsley wrote:
> On Wed, May 23, 2012 at 02:25:41PM +0400, Cyrill Gorcunov wrote:
>
> <snip>
>
> > - finally it would be great to have ability to attach tasks to
> > frozen tasks cgroup to thaw them at one moment (the preliminary
> > patch I've proposed pretty long ago, but Tejun was modifying cgroups
> > code and asked to wait until 3.4 is released, so I didn't check
> > the current status of task cgroups at moment, I've it in my todo list)
>
> This still strikes me as the wrong way to go about freezing for c/r.
> You never explained why you had to do it this way. Why can't you inject
> the parasite thread, move that thread out of the cgroup-to-be-frozen,
> then freeze?

Hi Matt,

last time I checked the code, ptrace was unable to work with
frozen cgroups (!); that's the main problem, and moving the thread out of
the frozen cgroup will not help. The same applies at restore time: we
want to create a frozen cgroup, restore everything up to the original IP
via ptrace (or the parasite restorer) and thaw the cgroup. Or are you asking
something else?

>
> As best I can tell your reply last time only fleshed out the details
> of *how* you would like it to work, not *why it needs to* work that way:
>
> http://lkml.org/lkml/2011/11/30/27

Cyrill

2012-05-24 18:42:08

by Cyrill Gorcunov

Subject: Re: [rfc v2 0/7] procfs fdinfo extension v2

On Thu, May 24, 2012 at 10:23:47PM +0400, Cyrill Gorcunov wrote:
> >
> > As best I can tell your reply last time only fleshed out the details
> > of *how* you would like it to work, not *why it needs to* work that way:
> >
> > http://lkml.org/lkml/2011/11/30/27

Matt, as to "why it needs to work that way" -- this approach will simply
work well without a need to patch the mainline kernel much, since almost everything
needed for transparent restore is already in the kernel except "ptrace with frozen cgroup"
(we even restore without a frozen cgroup now, but with a couple of nasty tricks in the
crtools utility). But if there is some other approach which we've missed, I would
really appreciate it if you shared it.

Cyrill

2012-05-24 19:49:10

by Pavel Emelyanov

Subject: Re: [rfc v2 0/7] procfs fdinfo extension v2

On 05/24/2012 10:01 PM, Matt Helsley wrote:
> On Wed, May 23, 2012 at 02:25:41PM +0400, Cyrill Gorcunov wrote:
>
> <snip>
>
>> - finally it would be great to have ability to attach tasks to
>> frozen tasks cgroup to thaw them at one moment (the preliminary
>> patch I've proposed pretty long ago, but Tejun was modifying cgroups
>> code and asked to wait until 3.4 is released, so I didn't check
>> the current status of task cgroups at moment, I've it in my todo list)
>
> This still strikes me as the wrong way to go about freezing for c/r.
> You never explained why you had to do it this way. Why can't you inject
> the parasite thread, move that thread out of the cgroup-to-be-frozen,
> then freeze?

Matt, I think that Cyrill copied this from some old wishlist and I didn't update him
in time :(

The thing is that we seem to have resolved all the issues with freezing/unfreezing
the processes we're checkpointing/restoring, and currently we are OK with the existing
ptrace functionality. No more modifications of the freezer cgroup are required.

Cyrill?

> As best I can tell your reply last time only fleshed out the details
> of *how* you would like it to work, not *why it needs to* work that way:
>
> http://lkml.org/lkml/2011/11/30/27
>
> Cheers,
> -Matt Helsley
>
> .
>

2012-05-24 20:27:46

by Cyrill Gorcunov

Subject: Re: [rfc v2 0/7] procfs fdinfo extension v2

On Thu, May 24, 2012 at 11:48:15PM +0400, Pavel Emelyanov wrote:
>
> The thing is that we seem to have resolved all the issues with freezing/unfreezing
> the processes we're checkpointing/restoring, and currently we are OK with the existing
> ptrace functionality. No more modifications of freeze cgroup are required.
>
> Cyrill?

Oh, crap, sorry Matt, Pavel, indeed I somehow missed that this is not a
"must have" anymore; this bullet should be dropped from the list. Sorry again
guys for the confusion.

Cyrill

2012-05-24 20:32:43

by Matt Helsley

Subject: Re: [rfc v2 0/7] procfs fdinfo extension v2

On Fri, May 25, 2012 at 12:27:39AM +0400, Cyrill Gorcunov wrote:
> On Thu, May 24, 2012 at 11:48:15PM +0400, Pavel Emelyanov wrote:
> >
> > The thing is that we seem to have resolved all the issues with freezing/unfreezing
> > the processes we're checkpointing/restoring, and currently we are OK with the existing
> > ptrace functionality. No more modifications of freeze cgroup are required.
> >
> > Cyrill?
>
> Oh, crap, sorry Matt, Pavel, indeed I somehow missed that this is not
> "must have" anymore, this bullet should be dropped from list. Sorry again
> guys for confusion.

OK, no problem -- if anything I'm glad to have verification that this
change is not needed.

Cheers,
-Matt