Date: Thu, 21 Aug 2008 11:35:29 +0200
From: Louis Rilling
To: Oren Laadan
Cc: dave@linux.vnet.ibm.com, arnd@arndb.de, jeremy@goop.org,
    linux-kernel@vger.kernel.org, containers@lists.linux-foundation.org
Subject: Re: [RFC v2][PATCH 2/9] General infrastructure for checkpoint restart
Message-ID: <20080821093529.GG581@hawkmoon.kerlabs.com>
Reply-To: Louis.Rilling@kerlabs.com

On Wed, Aug 20, 2008 at 11:04:13PM -0400, Oren Laadan wrote:
>
> Add those interfaces, as well as helpers needed to easily manage the
> file format.
> The code is roughly broken out as follows:
>
> ckpt/sys.c - user/kernel data transfer, as well as setup of the
> checkpoint/restart context (a per-checkpoint data structure for
> housekeeping)
>
> ckpt/checkpoint.c - output wrappers and basic checkpoint handling
>
> ckpt/restart.c - input wrappers and basic restart handling
>
> Patches to add the per-architecture support as well as the actual
> work to do the memory checkpoint follow in subsequent patches.
>

[...]

> diff --git a/checkpoint/sys.c b/checkpoint/sys.c
> new file mode 100644
> index 0000000..2891c48
> --- /dev/null
> +++ b/checkpoint/sys.c

[...]

> +/*
> + * helpers to manage CR contexts: allocated for each checkpoint and/or
> + * restart operation, and persists until the operation is completed.
> + */
> +
> +static atomic_t cr_ctx_count;	/* unique checkpoint identifier */

I thought we agreed that this counter should be per-container. Perhaps add a
TODO here?

> +
> +void cr_ctx_free(struct cr_ctx *ctx)
> +{
> +
> +	if (ctx->file)
> +		fput(ctx->file);
> +	if (ctx->vfsroot)
> +		path_put(ctx->vfsroot);
> +
> +	free_pages((unsigned long) ctx->tbuf, CR_TBUF_ORDER);
> +	free_pages((unsigned long) ctx->hbuf, CR_HBUF_ORDER);
> +
> +	kfree(ctx);
> +}
> +
> +struct cr_ctx *cr_ctx_alloc(pid_t pid, struct file *file, unsigned long flags)
> +{
> +	struct cr_ctx *ctx;
> +
> +	ctx = kzalloc(sizeof(*ctx), GFP_KERNEL);
> +	if (!ctx)
> +		return NULL;
> +
> +	ctx->tbuf = (void *) __get_free_pages(GFP_KERNEL, CR_TBUF_ORDER);
> +	ctx->hbuf = (void *) __get_free_pages(GFP_KERNEL, CR_HBUF_ORDER);
> +	if (!ctx->tbuf || !ctx->hbuf)
> +		goto nomem;
> +
> +	ctx->pid = pid;
> +	ctx->flags = flags;
> +
> +	ctx->file = file;
> +	get_file(file);
> +
> +	/* assume checkpointer is in container's root vfs */

I'm a bit puzzled by this assumption. I would say: either this is a
self-checkpoint (only current process), or this is a container checkpoint.
In the latter case, I expect that in the general case the checkpointer lives
outside the container (and the interface of sys_checkpoint() below confirms
this), so its root fs is probably not the container's one. Does it differ from
what you're planning?

Thanks,

Louis

> +	ctx->vfsroot = &current->fs->root;
> +	path_get(ctx->vfsroot);
> +
> +	ctx->crid = atomic_inc_return(&cr_ctx_count);
> +
> +	return ctx;
> +
> + nomem:
> +	cr_ctx_free(ctx);
> +	return NULL;
> +}
> +
> +/**
> + * sys_checkpoint - checkpoint a container
> + * @pid: pid of the container init(1) process
> + * @fd: file to which dump the checkpoint image
> + * @flags: checkpoint operation flags
> + */
> +asmlinkage long sys_checkpoint(pid_t pid, int fd, unsigned long flags)
> +{
> +	struct cr_ctx *ctx;
> +	struct file *file;
> +	int fput_needed;
> +	int ret;
> +
> +	file = fget_light(fd, &fput_needed);
> +	if (!file)
> +		return -EBADF;
> +
> +	/* no flags for now */
> +	if (flags)
> +		return -EINVAL;
> +
> +	ctx = cr_ctx_alloc(pid, file, flags | CR_CTX_CKPT);
> +	if (!ctx) {
> +		fput_light(file, fput_needed);
> +		return -ENOMEM;
> +	}
> +
> +	ret = do_checkpoint(ctx);
> +
> +	cr_ctx_free(ctx);
> +	fput_light(file, fput_needed);
> +
> +	return ret;
> +}
> +
> +/**
> + * sys_restart - restart a container
> + * @crid: checkpoint image identifier
> + * @fd: file from which read the checkpoint image
> + * @flags: restart operation flags
> + */
> +asmlinkage long sys_restart(int crid, int fd, unsigned long flags)
> +{
> +	struct cr_ctx *ctx;
> +	struct file *file;
> +	int fput_needed;
> +	int ret;
> +
> +	file = fget_light(fd, &fput_needed);
> +	if (!file)
> +		return -EBADF;
> +
> +	/* no flags for now */
> +	if (flags)
> +		return -EINVAL;
> +
> +	ctx = cr_ctx_alloc(crid, file, flags | CR_CTX_RSTR);
> +	if (!ctx) {
> +		fput_light(file, fput_needed);
> +		return -ENOMEM;
> +	}
> +
> +	ret = do_restart(ctx);
> +
> +	cr_ctx_free(ctx);
> +	fput_light(file, fput_needed);
> +
> +	return ret;
> +}

-- 
Dr Louis Rilling                        Kerlabs
Skype: louis.rilling                    Batiment Germanium
Phone: (+33|0) 6 80 89 08 23            80 avenue des Buttes de Coesmes
http://www.kerlabs.com/                 35700 Rennes
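
For reference, the per-container counter suggested in the first comment could
be sketched in userspace C11 roughly as follows. This is only an illustration
under stated assumptions: struct container, container_init() and
container_next_crid() are hypothetical names that do not appear in the patch,
and C11 atomics stand in for the kernel's atomic_t and atomic_inc_return().

```c
/*
 * Userspace sketch only. struct container and the two helpers below are
 * hypothetical; they are not part of the patch under review.
 */
#include <stdatomic.h>

/* Hypothetical stand-in for whatever object represents a container. */
struct container {
	atomic_int cr_ctx_count;	/* source of checkpoint ids for this container */
};

/* Start the container's identifier sequence at zero. */
static void container_init(struct container *c)
{
	atomic_init(&c->cr_ctx_count, 0);
}

/*
 * Same semantics as atomic_inc_return(): atomically increment the
 * counter and return the new value.
 */
static int container_next_crid(struct container *c)
{
	return atomic_fetch_add(&c->cr_ctx_count, 1) + 1;
}
```

With this layout each container hands out its own sequence of identifiers, so
a crid is unique only within its container rather than system-wide, unlike
with the file-scope cr_ctx_count in the patch.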