Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1752720AbdGMVUy (ORCPT ); Thu, 13 Jul 2017 17:20:54 -0400
Received: from out02.mta.xmission.com ([166.70.13.232]:53822 "EHLO
        out02.mta.xmission.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S1751153AbdGMVUx (ORCPT );
        Thu, 13 Jul 2017 17:20:53 -0400
From: ebiederm@xmission.com (Eric W. Biederman)
To: "Serge E. Hallyn"
Cc: Stefan Berger, "Theodore Ts'o",
        containers@lists.linux-foundation.org, lkp@01.org,
        linux-kernel@vger.kernel.org, zohar@linux.vnet.ibm.com,
        tycho@docker.com, James.Bottomley@HansenPartnership.com,
        vgoyal@redhat.com, christian.brauner@mailbox.org,
        amir73il@gmail.com, linux-security-module@vger.kernel.org,
        casey@schaufler-ca.com
References: <87mv89iy7q.fsf@xmission.com>
        <20170712170346.GA17974@mail.hallyn.com>
        <877ezdgsey.fsf@xmission.com>
        <74664cc8-bc3e-75d6-5892-f8934404349f@linux.vnet.ibm.com>
        <20170713011554.xwmrgkzfwnibvgcu@thunk.org>
        <87y3rscz9j.fsf@xmission.com>
        <20170713164012.brj2flnkaaks2oci@thunk.org>
        <87k23cb6os.fsf@xmission.com>
        <847ccb2a-30c0-a94c-df6f-091c8901eaa0@linux.vnet.ibm.com>
        <87bmoo8bxb.fsf@xmission.com>
        <20170713194842.GB4895@mail.hallyn.com>
Date: Thu, 13 Jul 2017 16:12:37 -0500
In-Reply-To: <20170713194842.GB4895@mail.hallyn.com> (Serge E. Hallyn's
        message of "Thu, 13 Jul 2017 14:48:42 -0500")
Message-ID: <87mv886ny2.fsf@xmission.com>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/25.1 (gnu/linux)
MIME-Version: 1.0
Content-Type: text/plain
Subject: Re: [PATCH v2] xattr: Enable security.capability in user namespaces
Sender: linux-kernel-owner@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org

"Serge E. Hallyn" writes:

> Quoting Eric W.
> Biederman (ebiederm@xmission.com):
>> Stefan Berger writes:
>>
>> > On 07/13/2017 01:14 PM, Eric W. Biederman wrote:
>> >> Theodore Ts'o writes:
>> >>
>> >>> On Thu, Jul 13, 2017 at 07:11:36AM -0500, Eric W. Biederman wrote:
>> >>>> The concise summary:
>> >>>>
>> >>>> Today we have the xattr security.capability that holds a set of
>> >>>> capabilities that an application gains when executed.  AKA setuid
>> >>>> root exec without actually being setuid root.
>> >>>>
>> >>>> User namespaces have the concept of capabilities that are not global but
>> >>>> are limited to their user namespace.  We do not currently have
>> >>>> filesystem support for this concept.
>> >>> So correct me if I am wrong; in general, there will only be one
>> >>> variant of the form:
>> >>>
>> >>>    security.foo@uid=15000
>> >>>
>> >>> It's not like there will be:
>> >>>
>> >>>    security.foo@uid=1000
>> >>>    security.foo@uid=2000
>> >>>
>> >>> Except.... if you have a distribution root directory which is shared
>> >>> by many containers, you would need to put the xattrs in the overlay
>> >>> inodes.  Worse, each time you launch a new container, with a new
>> >>> subuid allocation, you will have to iterate over all files with
>> >>> capabilities and do a copy-up operation on the xattrs in overlayfs.
>> >>> So that's actually a bit of a disaster.
>> >>>
>> >>> So for distribution overlays, you will need to do things a different
>> >>> way, which is to map the distro subdirectory so you know that the
>> >>> capability with the global uid 0 should be used for the container
>> >>> "root" uid, right?
>> >>>
>> >>> So this hack of using security.foo@uid=1000 is *only* useful when the
>> >>> subcontainer root wants to create the privileged executable.  You
>> >>> still have to do things the other way.
>> >>>
>> >>> So can we make perhaps the assertion that *either*:
>> >>>
>> >>>    security.foo
>> >>>
>> >>> exists, *or*
>> >>>
>> >>>    security.foo@uid=BAR
>> >>>
>> >>> exists, but never both?  And there BAR is exclusive to only one
>> >>> instance?
>> >>>
>> >>> Otherwise, I suspect that the architecture is going to turn around and
>> >>> bite us in the *ss eventually, because someone will want to do
>> >>> something crazy and the solution will not be scalable.
>> >> Yep.  That is what it looks like from here.
>> >>
>> >> Which is why I asked the question about scalability of the xattr
>> >> implementations.  It looks like trying to accommodate the general
>> >> case just gets us in trouble, and sets unrealistic expectations.
>> >>
>> >> Which strongly suggests that Serge's previous version that
>> >> just revved the format of security.capability so that a uid field
>> >> could be added is likely to be the better approach.
>> >>
>> >> I want to see what Serge and Stefan have to say but the case looks
>> >> pretty clear cut at the moment.
>
> I'm fine with that.  Now, we'll be doing the enforcement at xattr
> write time, meaning someone *can* come up with an fs image with >1
> such xattrs.  Which is *fine*, I believe, it won't break anything
> security-wise, and our goal is only to stop users from thinking it
> is legitimate to write multiple such xattrs, so that they don't later
> bug the fs folks like Ted saying "hey why can't I write 1000 of these,
> I think that's a bug."
>
> So at xattr write time,
>
>   1. if there is already an xattr, and it is either the global
>      non-namespaced xattr, or it has kuid=X where X is the kuid
>      mapped to root in a parent of the container, then we refuse
>      the write
>   2. if there is already an xattr, and it is for a kuid=X where
>      X is mapped into the container, then we overwrite the existing
>      xattr.
>
> At read/use time, we use the rules we have now.
>
> Does that seem reasonable?

That sounds like it would keep us to one xattr of any given type so yes.

It occurs to me while I am writing this that this is also important for
ima/evm.  There is an xattr that has a hash of all of the other security
relevant xattrs.
Without a limit on the number of xattrs, calculating that security xattr
could become prohibitively expensive.

Eric
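
For illustration only, the two write-time rules quoted above could be
sketched roughly as follows.  This is a Python model, not kernel code and
not taken from any posted patch.  The names ExistingXattr, Container and
check_write are hypothetical, the single-extent uid mapping is a
simplification, and the handling of the two cases the rules do not
mention (no existing xattr, and an xattr for a kuid that is neither
mapped into the container nor root in a parent) is an assumption.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class ExistingXattr:
    """Hypothetical model of the capability xattr already on the file."""
    # kuid the xattr is bound to; None stands for the global,
    # non-namespaced xattr.
    rootid: Optional[int]


@dataclass(frozen=True)
class Container:
    """Hypothetical model of the user namespace doing the write."""
    first_kuid: int                      # start of the mapped kuid range
    count: int                           # length of the mapped kuid range
    ancestor_root_kuids: FrozenSet[int]  # kuids that are root in a parent

    def maps(self, kuid: int) -> bool:
        # True if kuid is mapped into this container.
        return self.first_kuid <= kuid < self.first_kuid + self.count


def check_write(existing: Optional[ExistingXattr], writer: Container) -> str:
    """Return "create", "overwrite" or "refuse" for a capability xattr write."""
    if existing is None:
        # Assumption: with no xattr present there is nothing to protect.
        return "create"
    # Rule 1: the global xattr, or one owned by a parent's root, wins.
    if existing.rootid is None or existing.rootid in writer.ancestor_root_kuids:
        return "refuse"
    # Rule 2: an xattr for a kuid mapped into the container is replaced,
    # which keeps the file at a single xattr of this type.
    if writer.maps(existing.rootid):
        return "overwrite"
    # Assumption: anything else is refused rather than adding a second xattr.
    return "refuse"
```

With a container mapped at kuids 100000-165535 under a host whose root
is kuid 0, an existing global xattr or one for kuid 0 is refused, while
one for kuid 100000 is overwritten.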