Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S932934AbbBJAIl (ORCPT ); Mon, 9 Feb 2015 19:08:41 -0500 Received: from mx1.redhat.com ([209.132.183.28]:41232 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1761661AbbBJAIY (ORCPT ); Mon, 9 Feb 2015 19:08:24 -0500 Message-ID: <1423526889.2612.14.camel@pluto.fritz.box> Subject: Re: [RFC PATCH 3/8] kmod - teach call_usermodehelper() to use a namespace From: Ian Kent To: Oleg Nesterov Cc: Kernel Mailing List , David Howells , Trond Myklebust , "J. Bruce Fields" , Benjamin Coddington , Al Viro , Jeff Layton , "Eric W. Biederman" Date: Tue, 10 Feb 2015 08:08:09 +0800 In-Reply-To: <20150209160328.GA2817@redhat.com> References: <20150205021553.8382.16297.stgit@pluto.fritz.box> <20150205023410.8382.13695.stgit@pluto.fritz.box> <20150208190019.GA22948@redhat.com> <1423446203.2583.18.camel@pluto.fritz.box> <20150209160328.GA2817@redhat.com> Content-Type: text/plain; charset="UTF-8" Mime-Version: 1.0 Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 3333 Lines: 92 On Mon, 2015-02-09 at 17:03 +0100, Oleg Nesterov wrote: > On 02/09, Ian Kent wrote: > > > > On Sun, 2015-02-08 at 20:00 +0100, Oleg Nesterov wrote: > > > > + > > > > + this = file_open_root(mnt->mnt_root, mnt, path, O_RDONLY); > > > > + if (unlikely(IS_ERR(this))) { > > > > + err = PTR_ERR(this); > > > > + break; > > > > + } > > > > + > > > > + err = setns_inode(file_inode(this), 0); > > > > + fput(this); > > > > + if (err) > > > > + break; > > > > + } > > > > + > > > > + return err; > > > > +} > > > > > > Yes, I need to actually read this series and setns paths, but at first glance > > > there must be a simpler method to call ops->install's and switch_task_namespaces. > > > > Yes, the namespaces implementation does seem a bit strange in this > > respect. I mentioned that concern the first time I posted these. But I'm > > still not that clear on the big picture of how namespace are meant to > > work. > > > > It's not just access to ops->install() that's the problem. > > > > For each of the individual namespaces we open a file handle, to get > > access to ops->install() for that namespace, install it, drop "all" the > > namespaces then replace them with the new set that essentially has one > > namespace changed. > > I understand. but I still can't understand why we can't implement something > like > enter_ns(struct nsproxy *p) > { > new_nsproxy = create_new_namespaces(...); > > p->mnt_ns->ns->ops->install(new_nsproxy, ...); > p->pid_ns_for_children->ns->ops->install(new_nsproxy, ...); > ... > > switch_task_namespaces(new_nsproxy); > } > > Why we should abuse fs/proc ? That sounds like a much better approach. Your saying just take a reference to the nsproxy from the located process and use it instead, right? Working out if there's a difference with what you from the open is challenging (I already tried), I'll have another go at it. > > See also below. > > > > Sorry if this was already discussed before, but to me it looks a bit strange > > > to abuse /proc/ files for this. And again, iiuc file_open_root() can fail if > > > tsk has already exited (init can be multithreaded). > > > > Not sure that the failure is a problem though as long as it's handled > > since, if the init process of the container is gone (or will be gone > > once were done), so is the container and the caller. > > Not really. Individual thread can exit while the whole "init" process can be alive. > In particular the main thread can exit and become a zombie, so find_task_by_vpid(1) > can't work in general. > > You can probably use task_active_pid_ns()-child_reaper, but again I do not think > you should pass "task_struct *" to enter_ns(). OK, I need to check this, btw, thanks for the comments. > > > And. Whatever we do, ops->install() or setns_inode() can't solve the problem with > pid_ns. You need the additional clone() to "activate" it. pidns_install() does not > actually change task_active_pid_ns(). Right, but all this is done in preparation for the following do_execve() call. Isn't that enough or am I missing something? Ian -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/