Hi, this series is hopefully in good shape now:
- sys_kcmp now depends on CONFIG_CHECKPOINT_RESTORE
- the extension of /proc/pid/stat is now done against
linux-next/master
Please let me know if I've missed something.
Thanks,
Cyrill
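Since the series appends fields to /proc/pid/stat, here is a minimal sketch of
how a user-space reader might parse that file so that extra trailing fields do
no harm. The helper name and the sample line are made up for illustration; the
only fact relied on is that the comm field is wrapped in parentheses and may
itself contain spaces or parentheses.

```python
def parse_proc_stat(line):
    """Split one /proc/<pid>/stat line into (pid, comm, remaining fields)."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so locate it via the first '(' and the last ')'.
    lparen = line.index("(")
    rparen = line.rindex(")")
    pid = int(line[:lparen])
    comm = line[lparen + 1:rparen]
    # Everything after comm is whitespace-separated; new fields appended by
    # the kernel simply show up at the end of this list.
    rest = line[rparen + 1:].split()
    return pid, comm, rest


# Made-up, shortened sample line (a real line carries many more fields).
sample = "1234 (my (odd) name) S 1 1234 1234 0 -1"
pid, comm, rest = parse_proc_stat(sample)
state, ppid = rest[0], int(rest[1])
print(pid, comm, state, ppid, len(rest))
```

Indexing the remaining fields by position, rather than assuming a fixed total
count, is what keeps such a parser working when fields are added at the end.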
On Mon, 13 Feb 2012 20:48:22 +0400
Cyrill Gorcunov <[email protected]> wrote:
> Hi, this series hopefully in a good shape
>
> - sys_kcmp now depends on CONFIG_CHECKPOINT_RESTORE
>
> - the extension of /proc/pid/stat now done against
> linux-next/master
>
> Please let me know if I've missed something.
Thus far our (my) approach has been to trickle the c/r support code
into mainline as it is developed. Under the assumption that the end
result will be acceptable and useful kernel code.
I'm afraid that I'm losing confidence in that approach. We have this
patchset, we have Stanislav's "IPC: checkpoint/restore in userspace
enhancements" (which apparently needs to get more complex to support
LSM context c/r). I simply *don't know* what additional patchsets are
expected. And from what you told me it sounds like networking support
is at a very early stage and I fear for what the end result of that
will look like.
So I don't feel that I can continue feeding these things into mainline
until someone can convince me that we won't have a nasty mess (and/or
an insufficiently useful feature) at the end of the project.
The traditional approach is to develop the feature out-of-tree until it
is "finished". That's a lot more hackwork for you guys and it leads to
a poorer feature - this approach inevitably has a lower level of review
and inhibits code rework.
An alternative is for me to buffer the patches in my tree until it is
all sufficiently finished. That also is more work for your team, but
it will produce better code, because of additional review and code
rework resulting from that review.
I don't know how many patches that would end up being (this is part of
the problem!) nor how long they would be carried for.
So. Please talk to me. How long is this all going to take, and what
will the final result look like?
On 02/15/2012 02:51 AM, Andrew Morton wrote:
> On Mon, 13 Feb 2012 20:48:22 +0400
> Cyrill Gorcunov <[email protected]> wrote:
>
>> Hi, this series hopefully in a good shape
>>
>> - sys_kcmp now depends on CONFIG_CHECKPOINT_RESTORE
>>
>> - the extension of /proc/pid/stat now done against
>> linux-next/master
>>
>> Please let me know if I've missed something.
>
> Thus far our (my) approach has been to trickle the c/r support code
> into mainline as it is developed. Under the assumption that the end
> result will be acceptable and useful kernel code.
>
> I'm afraid that I'm losing confidence in that approach. We have this
> patchset, we have Stanislav's "IPC: checkpoint/restore in userspace
> enhancements" (which apparently needs to get more complex to support
> LSM context c/r). I simply *don't know* what additional patchsets are
> expected. And from what you told me it sounds like networking support
> is at a very early stage and I fear for what the end result of that
> will look like.
I understand. But the understanding was that nobody wanted the c/r stuff to
be "one big kernel subsystem"; it should rather be "a bunch of small
APIs for what is required". The initial C/R attempt took ~100 patches.
Supporting our user-space C/R implementation *alone* takes ~10, and the
feature sets of the two are already comparable.
As far as networking is concerned -- we will not require any additional
patches to implement basic netns configuration migration (ip can show and
re-configure all we need about routing, interfaces, devices, etc., and
iptables-save/iptables-restore will handle 99.9% of the netfilter part). What
we currently need is the ability to explore socket queues, but so far this
doesn't turn out to be a lot of code -- I have a 60-line patch for unix
sockets, and Tejun showed how to do the same with TCP using 130 lines of
code. UDP won't require anything; its queues can be silently dropped. The
recent 50 patches with the *_diag stuff don't count, because they are not for
C/R only: the ss tool can benefit from 100% of the added functionality (this,
btw, shows that not every piece of code we add for C/R is for C/R *only*).
> So I don't feel that I can continue feeding these things into mainline
> until someone can convince me that we won't have a nasty mess (and/or
> an insufficiently useful feature) at the end of the project.
Isn't having the CONFIG_CHECKPOINT_RESTORE option turned off by default enough?
> The traditional approach is to develop the feature out-of-tree until it
> is "finished". That's a lot more hackwork for you guys and it leads to
> a poorer feature - this approach inevitably has a lower level of review
> and inhibits code rework.
That's why we started sending patches early.
> An alternative is for me to buffer the patches in my tree until it is
> all sufficiently finished. That also is more work for your team, but
> it will produce better code, because of additional review and code
> rework resulting from that review.
>
> I don't know how many patches that would end up being (this is part of
> the problem!) nor how long they would be carried for.
Neither do I :(
> So. Please talk to me. How long is this all going to take, and what
> will the final result look like?
The Big Intermediate Result we're trying to achieve is -- take a basic
OpenVZ or LXC container based on e.g. a rhel6 template and make sure we can
checkpoint and restore it without breaking it.
The More-or-less Finished state of the project would be when it's able to
do everything that OpenVZ's implementation can. The list of major
features which are still absent from CRIU and for which we will require
kernel support includes
* shared kernel objects (this thread)
* tcp connection
* pty stuff
* sysvipc
* iterative working set migration
The last one is the ability to find out which pages processes use and to catch
when they change data on them. I planned to discuss this at LSF, but we can
start earlier if you want.
The other currently missing stuff is quite minor or doesn't require anything
new from the kernel, like signalfd-s or netfilter.
The Ultimate Goal is hard to describe because we have a variety of ideas
about what CRIU can do, including such things as checkpointing desktop apps
with their xserver state or live-migrating parts of a multi-process app from
one box to another.
Thanks,
Pavel
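To illustrate the "shared kernel objects" item in the list above: a checkpoint
tool has to find out which handles (for example file descriptors in different
tasks) refer to the same kernel object. The sketch below assumes a comparator
that follows the return convention documented in kcmp(2): 0 for "same object",
1 for "orders before", 2 for "orders after". The compare callback, the handle
tuples and the object ids are hypothetical stand-ins; no real syscall is made.

```python
from functools import cmp_to_key


def group_shared(handles, compare):
    """Group handles whose underlying kernel object is the same.

    `compare(a, b)` is a hypothetical stand-in for a kcmp-style call:
    0 means same object, 1 means a orders before b, 2 means a orders after b.
    """
    def as_cmp(a, b):
        # Translate the kcmp-style result into the -1/0/1 that sorted() expects.
        return {0: 0, 1: -1, 2: 1}[compare(a, b)]

    # Sorting needs O(n log n) comparisons instead of comparing every pair.
    ordered = sorted(handles, key=cmp_to_key(as_cmp))

    groups = []
    for handle in ordered:
        # After sorting, handles sharing an object are adjacent.
        if groups and compare(groups[-1][0], handle) == 0:
            groups[-1].append(handle)
        else:
            groups.append([handle])
    return groups


# Fake "kernel": maps (pid, fd) handles to made-up object ids.
FAKE_OBJECTS = {(100, 3): 7, (100, 4): 9, (101, 3): 7, (101, 5): 2}


def fake_compare(a, b):
    x, y = FAKE_OBJECTS[a], FAKE_OBJECTS[b]
    return 0 if x == y else (1 if x < y else 2)


groups = group_shared(list(FAKE_OBJECTS), fake_compare)
print(groups)
```

The point of an ordering result, rather than a plain equal/not-equal answer, is
that it lets user space sort handles and avoid a quadratic number of
comparisons when many tasks and descriptors are involved.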
On Wed, Feb 15, 2012 at 08:52:36AM +0400, Pavel Emelyanov wrote:
> On 02/15/2012 02:51 AM, Andrew Morton wrote:
> > On Mon, 13 Feb 2012 20:48:22 +0400
> > Cyrill Gorcunov <[email protected]> wrote:
> >
> >> Hi, this series hopefully in a good shape
> >>
> >> - sys_kcmp now depends on CONFIG_CHECKPOINT_RESTORE
> >>
> >> - the extension of /proc/pid/stat now done against
> >> linux-next/master
> >>
> > >> Please let me know if I've missed something.
> >
> > Thus far our (my) approach has been to trickle the c/r support code
> > into mainline as it is developed. Under the assumption that the end
> > result will be acceptable and useful kernel code.
> >
> > I'm afraid that I'm losing confidence in that approach. We have this
> > patchset, we have Stanislav's "IPC: checkpoint/restore in userspace
> > enhancements" (which apparently needs to get more complex to support
> > LSM context c/r). I simply *don't know* what additional patchsets are
> > expected. And from what you told me it sounds like networking support
> > is at a very early stage and I fear for what the end result of that
> > will look like.
>
> I understand. But there was a confidence that nobody wanted the c/r stuff to
> be the "one big kernel subsystem", but it should rather be "a bunch of small
> API-s for what is required". The amount of code for the initial C/R attempt was
> ~100 patches. The amount of code to support our user-space C/R implementation
> *only* is ~10 and the feature-set of both is already comparable.
>
Andrew, I hope Pavel has addressed all your concerns? What I'm personally
trying to achieve, mostly, is that the patches should be as minimal as
possible while still being usable. I believe the patches which are already in
the tree are useful for other projects as well (for example --
/proc/pid/task/tid/"children" to find all children and build the process
topology fast). The prctl extensions look a bit redundant for the kernel in
general, but they can easily be turned off via a Kconfig option.
/proc/pid/map_files/ might be redundant too, but it can be eliminated via
Kconfig as well. So I think both series actually do not bring much noise
into the kernel itself.
Cyrill
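As a rough illustration of the "children" use case mentioned above, the sketch
below builds a process tree from per-task children lists. It assumes the file
holds a whitespace-separated list of child PIDs and, as a simplification, that
every task is single-threaded (so tid equals pid). The function names are
hypothetical, and the demo uses made-up data instead of reading a real /proc.

```python
def build_tree(root_pid, read_children):
    """Return {pid: [child pids]} for root_pid and all of its descendants."""
    tree = {}
    pending = [root_pid]
    while pending:
        pid = pending.pop()
        # One read per task yields all of its direct children.
        kids = [int(tok) for tok in read_children(pid).split()]
        tree[pid] = kids
        pending.extend(kids)
    return tree


def read_children_from_proc(pid):
    # Assumption: single-threaded task, so the main thread's tid equals pid.
    # The file is only present if the kernel was built with support for it.
    with open(f"/proc/{pid}/task/{pid}/children") as f:
        return f.read()


# Made-up contents standing in for the per-task children files.
FAKE = {1: "2 3 ", 2: "4 ", 3: "", 4: ""}
tree = build_tree(1, lambda pid: FAKE[pid])
print(tree)
```

On a live system one would pass read_children_from_proc instead of the fake
reader, and a multi-threaded task would need every tid under
/proc/pid/task/ visited, since children are listed per thread.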