From: ebiederm@xmission.com (Eric W. Biederman)
To: Maciej Żenczykowski
Cc: linux-security-module@vger.kernel.org, Linux Kernel Mailing List, Mahesh Bandewar, Willem de Bruijn, Linux Containers
References: <20171221210605.181720-1-zenczykowski@gmail.com> <87wp1foiwa.fsf@xmission.com> <87fu83lfw5.fsf@xmission.com>
Date: Fri, 22 Dec 2017 08:08:04 -0600
Message-ID: <87o9mqhn3v.fsf@xmission.com>
Subject: Re: [PATCH] userns: honour no_new_privs for cap_bset during user ns creation/switch
X-Mailing-List: linux-kernel@vger.kernel.org

Maciej Żenczykowski writes:

>> Good point about CAP_DAC_OVERRIDE on files you own.
>>
>> I think there is an argument that you are playing dangerous games with
>> the permission system there, as it isn't effectively a file you own if
>> you can't read it, and you can't change its permissions.
>
> Append-only files are useful - particularly for logging.
> It could also simply be a non-readable file on a R/O filesystem.
>
>> Given little things like that I can completely see no_new_privs meaning
>> you can't create a user namespace.  That seems consistent with the
>> meaning and philosophy of no_new_privs.  So simple it is hard to get
>> wrong.
>
> Yes, I could totally buy the argument that no_new_privs should prevent
> creating a user ns.
>
> However, there's also setns() and that's a fair bit harder to reason about.
> Entirely deny it?  But that actually seems potentially useful...
> Allow it but cap it?  That's what this does...
>
>> We could do more clever things like plug this hole in user namespaces,
>> and that would not hurt my feelings.
>
> Sure, this particular one wouldn't be all that easy I think... and how
> many such holes are there?
> I found this particular one *after* your first reply in this thread.
>
>> However unless that is our only
>> choice to avoid badly breaking userspace I would hate to have to depend
>> on user namespaces being perfect for no_new_privs to be a proper jail.
>
> This stuff is ridiculously complex to get right from userspace. :-(

>> As a general rule user namespaces are where we tackle the subtle scary
>> things that should work, and no_new_privs is where we implement a simple
>> hard to get wrong jail.  Most of the time the effect is the same to an
>> outside observer (bounded permissions), but there is a real difference
>> in difficulty of implementation.
>
> So, where to now...
>
> Would you accept patches that:
>
> - make no_new_priv block user ns creation?
>
> - make no_new_priv block user ns transition?

Yes.

The approach will need to be rethought if there is anything deliberately
combining user namespaces and no_new_privs, as regressions are a no-no.
So we need widespread testing to avoid that.  But as much as possible I
want no_new_privs to be simple and doing its job.

I will also take and encourage patches that close this minor privilege
escalation from the user namespace side, as ideally creating a user
namespace should be as safe as no_new_privs.

> Or perhaps we can assume that lack of create privs is sufficient, and
> if there's a pre-existing user ns for you to enter, then that's
> acceptable...
> Although this implies you probably always want to combine no_new_privs
> with a leaf user ns, or no_new_privs isn't all that useful for root in
> root ns...
> This added complexity, probably means it should be blocked...

Yes.

> - inherits bset across user ns creation/transition based on X?
> [this is the one we care about, because there are simply too many bugs
> in the kernel wrt. certain caps]

That was my suspicion, and attack surface reduction is a different
discussion.

Would no_new_privs preventing a userns transition be enough for the
cases you care about?  Otherwise this is a different conversation because
it is not about semantics but about making the code safer to use.

In general if code is simply not safe to use in a user namespace I would
prefer to tighten the permission checks, and just not allow that code.

Mostly what I have seen in previous conversations is simply concerns
about code that is not used or needed, being a problem.

> X could be:
> - a new flag similar to no_new_priv
> - a new securebit flag (w/lockbit) [provided securebits survive a
> userns transition, haven't checked]
> - or perhaps a new capability
> - something else?
>
> How do we make forward progress?

We start by causing no_new_privs to block userns creation and entering.

Eric
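
A minimal sketch of the interaction discussed in this thread, not part of
the original message: a child process sets no_new_privs and then attempts
unshare(CLONE_NEWUSER), so one can observe how a given kernel currently
treats the combination.  The helper name probe_userns_under_nnp is
hypothetical; the numeric constants are the values from the Linux uapi
headers.  The unshare result is only reported, since it depends on the
kernel version and on local sysctl or sandbox settings.

```python
import ctypes
import os

# Values from <linux/prctl.h> and <linux/sched.h>.
PR_SET_NO_NEW_PRIVS = 38
PR_GET_NO_NEW_PRIVS = 39
CLONE_NEWUSER = 0x10000000

libc = ctypes.CDLL(None, use_errno=True)
UL = ctypes.c_ulong  # prctl() takes unsigned long arguments


def probe_userns_under_nnp():
    """Hypothetical helper: in a forked child, set no_new_privs and then
    try to create a user namespace.

    Returns (nnp_set, unshare_errno).  nnp_set is True if the child read
    back no_new_privs == 1; unshare_errno is 0 if unshare() succeeded,
    otherwise the errno it failed with.
    """
    r, w = os.pipe()
    pid = os.fork()
    if pid == 0:  # child: never return into the caller's code
        try:
            os.close(r)
            libc.prctl(PR_SET_NO_NEW_PRIVS, UL(1), UL(0), UL(0), UL(0))
            nnp = libc.prctl(PR_GET_NO_NEW_PRIVS, UL(0), UL(0), UL(0), UL(0))
            ctypes.set_errno(0)
            rc = libc.unshare(CLONE_NEWUSER)
            err = 0 if rc == 0 else ctypes.get_errno()
            os.write(w, f"{nnp} {err}".encode())
        finally:
            os._exit(0)
    # parent: collect the child's report
    os.close(w)
    data = b""
    while True:
        chunk = os.read(r, 64)
        if not chunk:
            break
        data += chunk
    os.close(r)
    os.waitpid(pid, 0)
    nnp, err = data.decode().split()
    return nnp == "1", int(err)


if __name__ == "__main__":
    nnp_set, err = probe_userns_under_nnp()
    print("no_new_privs set in child:", nnp_set)
    if err == 0:
        print("unshare(CLONE_NEWUSER) succeeded under no_new_privs")
    else:
        print("unshare(CLONE_NEWUSER) failed:", os.strerror(err))
```

The probe runs in a forked child because both no_new_privs and a user
namespace switch are irreversible for the calling process.  A failure
here does not by itself show that no_new_privs was the cause; comparing
against a run without the PR_SET_NO_NEW_PRIVS call would be needed for
that.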