Date: Tue, 18 Nov 2014 11:13:21 -0600
From: Seth Forshee
To: Andy Lutomirski
Cc: Miklos Szeredi, "Eric W. Biederman", "Serge H. Hallyn", Michael j Theall, fuse-devel@lists.sourceforge.net, "linux-kernel@vger.kernel.org", Linux FS Devel, seth.forhsee@canonical.com
Subject: Re: [PATCH v5 2/4] fuse: Support fuse filesystems outside of init_user_ns
Message-ID: <20141118171321.GB21726@ubuntu-mba51>
References: <1414013060-137148-1-git-send-email-seth.forshee@canonical.com> <1414013060-137148-3-git-send-email-seth.forshee@canonical.com> <20141111140454.GD333@tucsk> <87mw7xd9zt.fsf@x220.int.ebiederm.org> <20141112130915.GG333@tucsk> <20141112162254.GB31775@ubuntu-hedt> <20141118152156.GA21726@ubuntu-mba51>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Nov 18, 2014 at 09:09:34AM -0800, Andy Lutomirski wrote:
> On Tue, Nov 18, 2014 at 7:21 AM, Seth Forshee wrote:
> > On Wed, Nov 12, 2014 at 10:22:54AM -0600, Seth Forshee wrote:
> >> On Wed, Nov 12, 2014 at 02:09:15PM +0100, Miklos Szeredi wrote:
> >> > On Tue, Nov 11, 2014 at 09:37:10AM -0600, Eric W. Biederman wrote:
> >> >
> >> > > > Maybe I'm being dense, but can someone give a concrete example of such an
> >> > > > attack?
> >> > >
> >> > > There are two variants of things at play here.
> >> > >
> >> > > There is the classic if you don't freeze your context at open time when
> >> > > you pass that file descriptor to another process unexpected things can
> >> > > happen.
> >> > >
> >> > > An essentially harmless but extremely confusing example is what happens
> >> > > to a partial read when it stops halfway through a uid value and the next
> >> > > read on the same file descriptor is from a process in a different user
> >> > > namespace. Which uid value should be returned to userspace?
> >> >
> >> > Fuse device doesn't currently do partial reads, so that's a non-issue.
> >> >
> >> > > Now if I am in a nefarious mood I can create an unprivileged user
> >> > > namespace, open /dev/fuse and mount a fuse filesystem. Pass the file
> >> > > descriptor to /dev/fuse to a process that is in the default user
> >> > > namespace (and thus can use any uid/gid). With that file descriptor
> >> > > report that there is a setuid 0 executable on that file system.
> >> >
> >> > Yes, and this would also be prevented by MNT_NOSUID, which would be a good idea
> >> > anyway. I just don't see the reason we'd want to allow clearing MNT_NOSUID in a
> >> > private namespace.
> >> >
> >> > So we don't currently see a use case for relaxing either the MNT_NOSUID
> >> > restriction or for relaxing the requirement on the user namespace the fuse
> >> > server is in. Is that correct?
> >> >
> >> > If so, we should leave both restrictions in place since that allows the greatest
> >> > flexibility in the future, if either of those needs to be relaxed.
> >>
> >> I'm not aware of specific use cases for either at this point. However,
> >> Andy's patch [1] will limit suid to the set of namespaces where the user
> >> who mounted the filesystem already has privileges. Enforcing MNT_NOSUID
> >> will require enforcement in the vfs, and in that case we definitely need
> >> to decide whether the policy is to implicitly add the flag or fail the
> >> mount attempt if the flag is not present [2].
> >
> > I asked around a bit, and it turns out there are use cases for nested
> > containers (i.e. a container within a container) where the rootfs for
> > the outer container mounts a filesystem containing the rootfs for the
> > inner container. If that mount is nosuid then suid utilities like ping
> > aren't going to work in the inner container.
> >
> > So since there's a use case for suid in a userns mount and we have what
> > we believe are sufficient protections against using this as a vector to
> > get privileges outside the container, I'm planning to move ahead without
> > the MNT_NOSUID restriction. Any objections?
>
> Are you talking about MNT_NOSUID the flag or my ns-dependent thing?

I'm talking about dropping the proposed requirement from Miklos that all
fuse userns mounts are required to have the MNT_NOSUID flag. I intend to
keep your ns-dependent thing.

Thanks,
Seth
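
The nosuid point above (suid utilities such as ping stop working when the containing mount is nosuid) can be checked from userspace. What follows is a minimal sketch in Python, not code from the patch series: the helper names `mount_is_nosuid` and `setuid_bit_usable` are made up for illustration, and the check looks only at the mount flag and the file mode. It does not model the namespace-dependent suid restriction discussed in this thread.

```python
import os
import stat


def mount_is_nosuid(path):
    """Return True if the filesystem containing `path` is mounted nosuid.

    Hypothetical helper: it only inspects the ST_NOSUID flag reported
    by statvfs(3) for the mount that holds `path`.
    """
    return bool(os.statvfs(path).f_flag & os.ST_NOSUID)


def setuid_bit_usable(path):
    """Return True if `path` has the setuid bit set and its mount is not nosuid.

    Assumption: this ignores any user-namespace based restriction on
    honoring suid; it covers the mount flag and the file mode only.
    """
    mode = os.stat(path).st_mode
    return bool(mode & stat.S_ISUID) and not mount_is_nosuid(path)
```

Calling `setuid_bit_usable()` on the inner container's ping binary would return False whenever the filesystem holding the inner rootfs was mounted nosuid, which is the failure mode described for nested containers.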