Message-ID: <1423364852.2641.2.camel@pluto.fritz.box>
Subject: Re: [RFC PATCH 3/8] kmod - teach call_usermodehelper() to use a
 namespace
From: Ian Kent <ikent@redhat.com>
To: Jeff Layton <jeff.layton@primarydata.com>
Cc: Kernel Mailing List <linux-kernel@vger.kernel.org>,
        David Howells <dhowells@redhat.com>,
        Oleg Nesterov <onestero@redhat.com>,
        Trond Myklebust <trond.myklebust@primarydata.com>,
        "J. Bruce Fields" <bfields@fieldses.org>,
        Benjamin Coddington <bcodding@redhat.com>,
        Al Viro <viro@ZenIV.linux.org.uk>,
        "Eric W. Biederman" <ebiederm@xmission.com>
Date: Sun, 08 Feb 2015 11:07:32 +0800
In-Reply-To: <20150206070859.7eb499b0@tlielax.poochiereds.net>
References: <20150205021553.8382.16297.stgit@pluto.fritz.box>
	 <20150205023410.8382.13695.stgit@pluto.fritz.box>
	 <20150206070859.7eb499b0@tlielax.poochiereds.net>
Content-Type: text/plain; charset="UTF-8"
Mime-Version: 1.0
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
Content-Length: 4180
Lines: 113

On Fri, 2015-02-06 at 07:08 -0500, Jeff Layton wrote:
> On Thu, 05 Feb 2015 10:34:11 +0800
> Ian Kent <ikent@redhat.com> wrote:
> 
> > The call_usermodehelper() function executes all binaries in the
> > global "init" root context. This doesn't allow a binary to be run
> > within a namespace (eg. the namespace of a container).
> > 
> > Both containerized NFS client and NFS server need the ability to
> > execute a binary in a container's context. To do this, the init
> > process of the caller's environment is used to set up the namespaces
> > in the same way the root init process is used otherwise.
> > 
> > Signed-off-by: Ian Kent <ikent@redhat.com>
> > Cc: Benjamin Coddington <bcodding@redhat.com>
> > Cc: Al Viro <viro@ZenIV.linux.org.uk>
> > Cc: J. Bruce Fields <bfields@fieldses.org>
> > Cc: David Howells <dhowells@redhat.com>
> > Cc: Trond Myklebust <trond.myklebust@primarydata.com>
> > Cc: Oleg Nesterov <onestero@redhat.com>
> > Cc: Eric W. Biederman <ebiederm@xmission.com>
> > Cc: Jeff Layton <jeff.layton@primarydata.com>
> > ---
> >  include/linux/kmod.h |   16 +++++++
> >  kernel/kmod.c        |  115 +++++++++++++++++++++++++++++++++++++++++++++++++-
> >  2 files changed, 128 insertions(+), 3 deletions(-)
> > 
> > diff --git a/include/linux/kmod.h b/include/linux/kmod.h
> > index 15bdeed..b0f1b3c 100644
> > --- a/include/linux/kmod.h
> > +++ b/include/linux/kmod.h
> > @@ -52,6 +52,7 @@ struct file;
> >  #define UMH_WAIT_EXEC	1	/* wait for the exec, but not the process */
> >  #define UMH_WAIT_PROC	2	/* wait for the process to complete */
> >  #define UMH_KILLABLE	4	/* wait for EXEC/PROC killable */
> > +#define UMH_USE_NS	8	/* exec using caller's init namespace */
> >  
> >  struct subprocess_info {
> >  	struct work_struct work;
> > @@ -69,6 +70,21 @@ struct subprocess_info {
> >  extern int
> >  call_usermodehelper(char *path, char **argv, char **envp, int flags);
> >  
> > +#if !defined(CONFIG_PROC_FS) || !defined(CONFIG_NAMESPACES)
> > +inline struct task_struct *umh_get_init_task(void)
> > +{
> > +	return ERR_PTR(-ENOTSUP);
> > +}
> > +
> > +inline int umh_enter_ns(struct task_struct *tsk, struct cred *new)
> > +{
> > +	return -ENOTSUP;
> > +}
> > +#else
> > +struct task_struct *umh_get_init_pid(void);
> > +int umh_enter_ns(struct task_struct *tsk, struct cred *new);
> > +#endif
> > +
> >  extern struct subprocess_info *
> >  call_usermodehelper_setup(char *path, char **argv, char **envp, gfp_t gfp_mask,
> >  			  int (*init)(struct subprocess_info *info, struct cred *new),
> > diff --git a/kernel/kmod.c b/kernel/kmod.c
> > index 14c0188..4c649d6 100644
> > --- a/kernel/kmod.c
> > +++ b/kernel/kmod.c
> > @@ -582,6 +582,98 @@ unlock:
> >  }
> >  EXPORT_SYMBOL(call_usermodehelper_exec);
> >  
> > +#if defined(CONFIG_PROC_FS) && defined(CONFIG_NAMESPACES)
> > +#define NS_PATH_MAX	35
> > +#define NS_PATH_FMT	"%lu/ns/%s"
> > +
> > +/* Note namespace name order is significant */
> > +static const char *ns_names[] = { "user", "ipc", "uts", "net", "pid", "mnt", NULL };
> > +
> > +struct task_struct *umh_get_init_pid(void)
> 
> nit: we're not getting a pid here but a task_struct pointer. Maybe this
> should be called umh_get_init_task?

Ha, yep.

> 
> > +{
> > +	struct task_struct *tsk;
> > +
> > +	rcu_read_lock();
> > +	tsk = find_task_by_vpid(1);
> > +	if (tsk)
> > +		get_task_struct(tsk);
> > +	rcu_read_unlock();
> 
> I'm not terribly familiar with the task_struct lifetime rules...
> 
> I assume that you can be assured that tsk won't go away while you hold
> the rcu_read_lock, but is doing a get_task_struct while holding it
> sufficient to pin it after you drop the lock?
> 
> IOW, could the refcount on the task_struct do a 0->1 transition here and
> end up being freed anyway after you've grabbed a reference?

Good point. I thought getting a reference under the read lock would be
enough, but maybe I need more checks, as I do with dentries. I'll check
that.

Ian
