Date: Wed, 12 Jul 2017 21:34:25 -0500
From: "Serge E. Hallyn"
To: "Theodore Ts'o", Stefan Berger, "Eric W. Biederman", "Serge E. Hallyn", containers@lists.linux-foundation.org, lkp@01.org, linux-kernel@vger.kernel.org, zohar@linux.vnet.ibm.com, tycho@docker.com, James.Bottomley@HansenPartnership.com, vgoyal@redhat.com, christian.brauner@mailbox.org, amir73il@gmail.com, linux-security-module@vger.kernel.org, casey@schaufler-ca.com
Subject: Re: [PATCH v2] xattr: Enable security.capability in user namespaces
Message-ID: <20170713023425.GA24103@mail.hallyn.com>
In-Reply-To: <20170713011554.xwmrgkzfwnibvgcu@thunk.org>

Quoting Theodore Ts'o (tytso@mit.edu):
> I'm really confused what problem that is trying to be solved, here,
> but it **feels** really, really wrong.

Hi,

The intro to my original patch might help (or maybe not), as it has a
different motivating text:

http://lkml.org/lkml/2016/11/19/158

We want file capabilities to be supported in unprivileged containers, so
that a piece of software can count on them being available rather than
having to support multiple ways of getting+dropping privilege (for
instance, being installed as uid 1000 with cap_net_raw=pe, versus being
installed setuid-root and being expected to do PR_SET_KEEPCAPS and
setuid).

If subuids 100000-200000 are delegated to uid 1001 on the host, and uid
1001 sets up a container with subuid 100000 mapped to container uid 0,
then the container root should be able to write file capabilities which
affect (that is, delegate container root's privilege to) all ids over
which it has privilege (all uids mapped into the container), but should
not have privilege over any uids not mapped into the container. With
regular file capabilities, this is impossible, since any filecap he
writes can then be exercised on the host by uid 1001. The point of this
set (and the ones before it) is to make it so that the filecap written
by the container root is tagged on disk as belonging to subuid 100000.

> Why do we need to store all of this state on a per-file basis, instead
> of some kind of per-file system or per-container data structure?

This needs to be writeable by an unprivileged user, with no help from
the admin. AFAICS that rules out a per-fs data structure. Note we are
not assuming a filesystem per container. The typical case is (for
instance) ~/.local/share/lxc/c1/rootfs being the root of container c1's
filesystem. Mounting a filesystem from inside a user namespace is still
mostly science fiction today.

> And how many of these security.foo@uid=bar xattrs do you expect there
> to be? How many "foo", and how many "bar"?

For now I'm expecting two foos - security and ima.
The '@uid=bar' is generic enough that it *can* be re-used for a
different kind of property if we decide to later, but I have no
intention of adding anything. Casey has mentioned 'smack=', but I think
only to keep the option open. I don't believe he has concrete plans.

> Maybe I missed the full write up, in which case please send me a link
> to the full writeup --- ideally in the form of a design doc that
> explains the problem statement, gives some examples of how it's going
> to be used, what were the other alternatives that were considered, and
> why they were rejected, etc.

As I'd mentioned in an even older patch,
http://lkml.org/lkml/2016/5/18/622 , I had considered using a completely
separate xattr name, but that would have required invasive userspace
changes.

There's no design doc as such, mainly a progressive series of patches to
lkml. I am very seriously considering writing a paper to detail both
this design and the user ns design in general, as it has become clear
(in unrelated conversations) there is still a lot of confusion out
there regarding uid namespaces and targeted capabilities. But it's not
written yet.

-serge
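
For illustration only, the tagging idea described above (a filecap
written by container root is recorded on disk against the host uid that
is mapped to container uid 0) can be modelled in a few lines of Python.
This is a toy model, not the kernel code or the patch itself: the names
UidMap, FileCap, write_filecap and effective_caps are hypothetical, the
map length of 65536 is an assumption, and the check only compares against
the executing namespace's own root uid.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, Optional


@dataclass(frozen=True)
class UidMap:
    """One mapping extent: container uids [inside, inside + count)
    correspond to host uids [outside, outside + count)."""
    inside: int
    outside: int
    count: int

    def to_host(self, container_uid: int) -> Optional[int]:
        """Translate a container uid to a host uid, or None if unmapped."""
        offset = container_uid - self.inside
        if 0 <= offset < self.count:
            return self.outside + offset
        return None


@dataclass(frozen=True)
class FileCap:
    """A file capability tagged with the host uid of the namespace root
    that wrote it (the '@uid=bar' part in this toy model)."""
    caps: FrozenSet[str]
    root_uid: int


def write_filecap(ns_map: UidMap, caps: Iterable[str]) -> FileCap:
    """Container root writes a filecap; it is tagged with its host uid."""
    root_uid = ns_map.to_host(0)
    if root_uid is None:
        raise PermissionError("uid 0 is not mapped in this namespace")
    return FileCap(frozenset(caps), root_uid)


def effective_caps(cap: FileCap, exec_ns_root_uid: int) -> FrozenSet[str]:
    """Honour the filecap only where the tagged uid is the namespace root."""
    if cap.root_uid == exec_ns_root_uid:
        return cap.caps
    return frozenset()


# Container c1: host subuid 100000 is mapped to container uid 0.
# The length 65536 is an assumed value for the example.
c1 = UidMap(inside=0, outside=100000, count=65536)
cap = write_filecap(c1, ["cap_net_raw"])

in_container = effective_caps(cap, exec_ns_root_uid=100000)
on_host = effective_caps(cap, exec_ns_root_uid=0)
```

In this model in_container holds cap_net_raw while on_host is empty,
which is the property wanted: the capability cannot be exercised on the
host by the unprivileged user who owns the container.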