2009-03-02 13:44:18

by Serge E. Hallyn

[permalink] [raw]
Subject: Re: [RFC][PATCH 8/8] check files for checkpointability

Quoting Dave Hansen ([email protected]):
>
> Introduce a files_struct counter to indicate whether a particular
> file_struct has ever contained a file which can not be
> checkpointed. This flag is a one-way trip; once it is set, it may
> not be unset.
>
> We assume at allocation that a new files_struct is clean and may
> be checkpointed. However, as soon as it has had its files filled
> from its parent's, we check it for real in __scan_files_for_cr().
> At that point, we mark it if it contained any uncheckpointable
> files.
>
> We also check each 'struct file' when it is installed in a fd
> slot. This way, if anyone open()s or managed to dup() an
> unsuppored file, we can catch it.
>
> Signed-off-by: Dave Hansen <[email protected]>

So on a practical note, Ingo's scheme appears to be paying off. In
order for any program's files_struct to be checkpointable right now,
it must be statically compiled, else ld.so (I assume) looks up
/proc/$$/status. So since proc is not checkpointable, the result
is irreversibly non-checkpointable.

So... does it make sense to mark proc as checkpointable? Do we
reasonably assume that the same procfile will be available at
restart?

-serge


2009-03-02 15:56:53

by Dave Hansen

[permalink] [raw]
Subject: Re: [RFC][PATCH 8/8] check files for checkpointability

On Mon, 2009-03-02 at 07:37 -0600, Serge E. Hallyn wrote:
> Quoting Dave Hansen ([email protected]):
> >
> > Introduce a files_struct counter to indicate whether a particular
> > file_struct has ever contained a file which can not be
> > checkpointed. This flag is a one-way trip; once it is set, it may
> > not be unset.
> >
> > We assume at allocation that a new files_struct is clean and may
> > be checkpointed. However, as soon as it has had its files filled
> > from its parent's, we check it for real in __scan_files_for_cr().
> > At that point, we mark it if it contained any uncheckpointable
> > files.
> >
> > We also check each 'struct file' when it is installed in a fd
> > slot. This way, if anyone open()s or managed to dup() an
> > unsuppored file, we can catch it.
> >
> > Signed-off-by: Dave Hansen <[email protected]>
>
> So on a practical note, Ingo's scheme appears to be paying off. In
> order for any program's files_struct to be checkpointable right now,
> it must be statically compiled, else ld.so (I assume) looks up
> /proc/$$/status. So since proc is not checkpointable, the result
> is irreversibly non-checkpointable.
>
> So... does it make sense to mark proc as checkpointable? Do we
> reasonably assume that the same procfile will be available at
> restart?

Can I kick and scream for a minute? :)

dave@nimitz:~/lse/linux/2.5/linux-2.6.git$ grep -r 'struct file_operations.*{' fs/ | grep /proc/ | wc -l
51

I'll need to go actually look at (and mark) each of those. But, the
upside is that I'll have to go look at each of those.

-- Dave

2009-03-02 15:59:45

by Nathan Lynch

[permalink] [raw]
Subject: Re: [RFC][PATCH 8/8] check files for checkpointability

On Mon, 2 Mar 2009 07:37:54 -0600
"Serge E. Hallyn" <[email protected]> wrote:

> Quoting Dave Hansen ([email protected]):
> >
> > Introduce a files_struct counter to indicate whether a particular
> > file_struct has ever contained a file which can not be
> > checkpointed. This flag is a one-way trip; once it is set, it may
> > not be unset.
> >
> > We assume at allocation that a new files_struct is clean and may
> > be checkpointed. However, as soon as it has had its files filled
> > from its parent's, we check it for real in __scan_files_for_cr().
> > At that point, we mark it if it contained any uncheckpointable
> > files.
> >
> > We also check each 'struct file' when it is installed in a fd
> > slot. This way, if anyone open()s or managed to dup() an
> > unsuppored file, we can catch it.
> >
> > Signed-off-by: Dave Hansen <[email protected]>
>
> So on a practical note, Ingo's scheme appears to be paying off. In
> order for any program's files_struct to be checkpointable right now,
> it must be statically compiled, else ld.so (I assume) looks up
> /proc/$$/status. So since proc is not checkpointable, the result
> is irreversibly non-checkpointable.
>
> So... does it make sense to mark proc as checkpointable? Do we
> reasonably assume that the same procfile will be available at
> restart?

With respect to /proc/$x/* where $x is the pid the restarted task wants,
is that not a chicken-and-egg problem?

2009-03-02 16:27:52

by Dave Hansen

[permalink] [raw]
Subject: Re: [RFC][PATCH 8/8] check files for checkpointability

On Mon, 2009-03-02 at 09:59 -0600, Nathan Lynch wrote:
> On Mon, 2 Mar 2009 07:37:54 -0600
> "Serge E. Hallyn" <[email protected]> wrote:
> > So on a practical note, Ingo's scheme appears to be paying off. In
> > order for any program's files_struct to be checkpointable right now,
> > it must be statically compiled, else ld.so (I assume) looks up
> > /proc/$$/status. So since proc is not checkpointable, the result
> > is irreversibly non-checkpointable.
> >
> > So... does it make sense to mark proc as checkpointable? Do we
> > reasonably assume that the same procfile will be available at
> > restart?
>
> With respect to /proc/$x/* where $x is the pid the restarted task wants,
> is that not a chicken-and-egg problem?

Do you mean that we have to go look into /proc to figure out which task
we want before we can checkpoint it? That makes the process *doing* the
checkpoint uncheckpointable, but no the process being examined.

Anyway, I'll fix /proc. It is pretty important.

-- Dave

2009-03-02 16:29:35

by Serge E. Hallyn

[permalink] [raw]
Subject: Re: [RFC][PATCH 8/8] check files for checkpointability

Quoting Nathan Lynch ([email protected]):
> On Mon, 2 Mar 2009 07:37:54 -0600
> "Serge E. Hallyn" <[email protected]> wrote:
>
> > Quoting Dave Hansen ([email protected]):
> > >
> > > Introduce a files_struct counter to indicate whether a particular
> > > file_struct has ever contained a file which can not be
> > > checkpointed. This flag is a one-way trip; once it is set, it may
> > > not be unset.
> > >
> > > We assume at allocation that a new files_struct is clean and may
> > > be checkpointed. However, as soon as it has had its files filled
> > > from its parent's, we check it for real in __scan_files_for_cr().
> > > At that point, we mark it if it contained any uncheckpointable
> > > files.
> > >
> > > We also check each 'struct file' when it is installed in a fd
> > > slot. This way, if anyone open()s or managed to dup() an
> > > unsuppored file, we can catch it.
> > >
> > > Signed-off-by: Dave Hansen <[email protected]>
> >
> > So on a practical note, Ingo's scheme appears to be paying off. In
> > order for any program's files_struct to be checkpointable right now,
> > it must be statically compiled, else ld.so (I assume) looks up
> > /proc/$$/status. So since proc is not checkpointable, the result
> > is irreversibly non-checkpointable.
> >
> > So... does it make sense to mark proc as checkpointable? Do we
> > reasonably assume that the same procfile will be available at
> > restart?
>
> With respect to /proc/$x/* where $x is the pid the restarted task wants,
> is that not a chicken-and-egg problem?

I don't think so... the task will get the pid back (eventually :). So
sure it won't really be supported yet but we can ignore that for now
imo.

The question is, do we worry about the fact that the procfile contents
might be different at restart (different kernel, etc).

-serge

2009-03-02 17:23:17

by Nathan Lynch

[permalink] [raw]
Subject: Re: [RFC][PATCH 8/8] check files for checkpointability

On Mon, 02 Mar 2009 08:27:31 -0800
Dave Hansen <[email protected]> wrote:

> On Mon, 2009-03-02 at 09:59 -0600, Nathan Lynch wrote:
> > On Mon, 2 Mar 2009 07:37:54 -0600
> > "Serge E. Hallyn" <[email protected]> wrote:
> > > So on a practical note, Ingo's scheme appears to be paying off.
> > > In order for any program's files_struct to be checkpointable
> > > right now, it must be statically compiled, else ld.so (I assume)
> > > looks up /proc/$$/status. So since proc is not checkpointable,
> > > the result is irreversibly non-checkpointable.
> > >
> > > So... does it make sense to mark proc as checkpointable? Do we
> > > reasonably assume that the same procfile will be available at
> > > restart?
> >
> > With respect to /proc/$x/* where $x is the pid the restarted task
> > wants, is that not a chicken-and-egg problem?
>
> Do you mean that we have to go look into /proc to figure out which
> task we want before we can checkpoint it? That makes the process
> *doing* the checkpoint uncheckpointable, but no the process being
> examined.

No.. I mean what if a process 1234 does

f = fopen("/proc/1234/stat", "r");

and is then checkpointed. Can that path be resolved during restart,
before pid 1234 is alive?

2009-03-02 17:31:18

by Dave Hansen

[permalink] [raw]
Subject: Re: [RFC][PATCH 8/8] check files for checkpointability

On Mon, 2009-03-02 at 11:22 -0600, Nathan Lynch wrote:
> No.. I mean what if a process 1234 does
>
> f = fopen("/proc/1234/stat", "r");
>
> and is then checkpointed. Can that path be resolved during restart,
> before pid 1234 is alive?

Heh, that's a good one.

It does mean that we can't do restore like this:

for_each_cr_task()
restore_task_struct()
restore_files()
...

We have to do:

for_each_cr_task()
restore_task_struct()
for_each_cr_task()
restore_files()


-- Dave

2009-03-02 17:44:57

by Serge E. Hallyn

[permalink] [raw]
Subject: Re: [RFC][PATCH 8/8] check files for checkpointability

Quoting Dave Hansen ([email protected]):
> On Mon, 2009-03-02 at 11:22 -0600, Nathan Lynch wrote:
> > No.. I mean what if a process 1234 does
> >
> > f = fopen("/proc/1234/stat", "r");
> >
> > and is then checkpointed. Can that path be resolved during restart,
> > before pid 1234 is alive?
>
> Heh, that's a good one.
>
> It does mean that we can't do restore like this:
>
> for_each_cr_task()
> restore_task_struct()
> restore_files()
> ...
>
> We have to do:
>
> for_each_cr_task()
> restore_task_struct()
> for_each_cr_task()
> restore_files()
>
>
> -- Dave

Which is what we actually do, right?

Actually we have userspace create the tasks first, and
then each task calls sys_restart which does restore_files().

-serge

2009-03-02 17:58:45

by Dave Hansen

[permalink] [raw]
Subject: Re: [RFC][PATCH 8/8] check files for checkpointability

On Mon, 2009-03-02 at 11:44 -0600, Serge E. Hallyn wrote:
> Which is what we actually do, right?
>
> Actually we have userspace create the tasks first, and
> then each task calls sys_restart which does restore_files().

Quoting:

http://git.ncl.cs.columbia.edu/?p=linux-cr.git;a=commitdiff;h=3c1b1900f92ed12f5020a7b566065bffda2908d8;hp=7aec1a8f3345bb33c3f93226c895a45ec269bb59

> Restarting of multiple processes expects all restarting tasks to call
> sys_restart(). Once inside the system call, each task will restart
> itself at the same order that they were saved. The internals of the
> syscall will take care of in-kernel synchronization bewteen tasks.

I guess it is OK since everybody sleeps once they enter sys_restart()
until the container init decides it is ready to go.

-- Dave

2009-03-02 18:13:38

by Dave Hansen

[permalink] [raw]
Subject: Re: [RFC][PATCH 8/8] check files for checkpointability

On Mon, 2009-03-02 at 11:44 -0600, Serge E. Hallyn wrote:
> Quoting Dave Hansen ([email protected]):
> > On Mon, 2009-03-02 at 11:22 -0600, Nathan Lynch wrote:
> > > No.. I mean what if a process 1234 does
> > >
> > > f = fopen("/proc/1234/stat", "r");
> > >
> > > and is then checkpointed. Can that path be resolved during restart,
> > > before pid 1234 is alive?
> >
> > Heh, that's a good one.
> >
> > It does mean that we can't do restore like this:
> >
> > for_each_cr_task()
> > restore_task_struct()
> > restore_files()
> > ...
> >
> > We have to do:
> >
> > for_each_cr_task()
> > restore_task_struct()
> > for_each_cr_task()
> > restore_files()
> >
> Which is what we actually do, right?

OK, I have a really evil one.

What if task 1234 does:

open(O_RDONLY, "/proc/5678/fdinfo/44");

and task 5678 does:

open(O_RDONLY, "/proc/5678/fdinfo/55");

There is no right order.

The only right way I can think to do it is that we have to loop on the
restore and defer files that we can't seem to find right now, hoping
that they'll show up as the restore progresses.

Basically:

for_each_cr_task()
deferred_files = restore_files()
retry:
making_progress = 0
for_each(deferred_file)
restore(deferred_file)
if (making_progress)
goto retry;

-- Dave

2009-03-02 18:35:57

by Serge E. Hallyn

[permalink] [raw]
Subject: Re: [RFC][PATCH 8/8] check files for checkpointability

Quoting Dave Hansen ([email protected]):
> On Mon, 2009-03-02 at 11:44 -0600, Serge E. Hallyn wrote:
> > Quoting Dave Hansen ([email protected]):
> > > On Mon, 2009-03-02 at 11:22 -0600, Nathan Lynch wrote:
> > > > No.. I mean what if a process 1234 does
> > > >
> > > > f = fopen("/proc/1234/stat", "r");
> > > >
> > > > and is then checkpointed. Can that path be resolved during restart,
> > > > before pid 1234 is alive?
> > >
> > > Heh, that's a good one.
> > >
> > > It does mean that we can't do restore like this:
> > >
> > > for_each_cr_task()
> > > restore_task_struct()
> > > restore_files()
> > > ...
> > >
> > > We have to do:
> > >
> > > for_each_cr_task()
> > > restore_task_struct()
> > > for_each_cr_task()
> > > restore_files()
> > >
> > Which is what we actually do, right?
>
> OK, I have a really evil one.
>
> What if task 1234 does:
>
> open(O_RDONLY, "/proc/5678/fdinfo/44");
>
> and task 5678 does:
>
> open(O_RDONLY, "/proc/5678/fdinfo/55");

Nice one. Let's make fdinfo files uncheckpointable for now :)

-serge

2009-03-05 08:20:55

by Cedric Le Goater

[permalink] [raw]
Subject: Re: [RFC][PATCH 8/8] check files for checkpointability

Dave Hansen wrote:
> On Mon, 2009-03-02 at 11:44 -0600, Serge E. Hallyn wrote:
>> Quoting Dave Hansen ([email protected]):
>>> On Mon, 2009-03-02 at 11:22 -0600, Nathan Lynch wrote:
>>>> No.. I mean what if a process 1234 does
>>>>
>>>> f = fopen("/proc/1234/stat", "r");
>>>>
>>>> and is then checkpointed. Can that path be resolved during restart,
>>>> before pid 1234 is alive?
>>> Heh, that's a good one.
>>>
>>> It does mean that we can't do restore like this:
>>>
>>> for_each_cr_task()
>>> restore_task_struct()
>>> restore_files()
>>> ...
>>>
>>> We have to do:
>>>
>>> for_each_cr_task()
>>> restore_task_struct()
>>> for_each_cr_task()
>>> restore_files()
>>>
>> Which is what we actually do, right?
>
> OK, I have a really evil one.
>
> What if task 1234 does:
>
> open(O_RDONLY, "/proc/5678/fdinfo/44");
>
> and task 5678 does:
>
> open(O_RDONLY, "/proc/5678/fdinfo/55");
>
> There is no right order.
>
> The only right way I can think to do it is that we have to loop on the
> restore and defer files that we can't seem to find right now, hoping
> that they'll show up as the restore progresses.

or the restore algorithm should support recursion. for example, epoll,
attached 'struct files' to af_unix socket, pipes (2 ends), fifos (idem),
connected socket (you need the listening end), etc.

C.

> Basically:
>
> for_each_cr_task()
> deferred_files = restore_files()
> retry:
> making_progress = 0
> for_each(deferred_file)
> restore(deferred_file)
> if (making_progress)
> goto retry;