Date: Mon, 6 Nov 2017 21:23:10 -0600
From: "Serge E. Hallyn"
To: Daniel Micay
Cc: "Serge E. Hallyn", Mahesh Bandewar (महेश बंडेवार), Mahesh Bandewar,
    LKML, Netdev, Kernel-hardening, Linux API, Kees Cook,
    "Eric W . Biederman", Eric Dumazet, David Miller
Subject: Re: [kernel-hardening] Re: [PATCH resend 2/2] userns: control capabilities of some user namespaces
Message-ID: <20171107032310.GA6429@mail.hallyn.com>
References: <20171103004436.40026-1-mahesh@bandewar.net>
    <20171104235346.GA17170@mail.hallyn.com>
    <20171106150302.GA26634@mail.hallyn.com>
    <1510003994.736.0.camel@gmail.com>
    <20171106221418.GA32543@mail.hallyn.com>
    <1510020963.736.42.camel@gmail.com>
In-Reply-To: <1510020963.736.42.camel@gmail.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, Nov 06, 2017 at 09:16:03PM -0500, Daniel Micay wrote:
> On Mon, 2017-11-06 at 16:14 -0600, Serge E. Hallyn wrote:
> > Quoting Daniel Micay (danielmicay@gmail.com):
> > > Substantial added attack surface will never go away as a problem.
> > > There aren't a finite number of vulnerabilities to be found.
> >
> > There's varying levels of usefulness and quality. There is code
> > which I want to be able to use in a container, and code which I
> > can't ever see a reason for using there. The latter, especially if
> > it's also in a staging driver, would be nice to have a toggle to
> > disable.
> >
> > You're not advocating dropping the added attack surface, only adding
> > a way of dealing with an 0day after the fact. Privilege raising
> > 0days can exist anywhere, not just in code which only root in a user
> > namespace can exercise. So from that point of view, ksplice seems a
> > more complete solution. Why not just actually fix the bad code block
> > when we know about it?
>
> That's not what I'm advocating. I only care about it for proactive
> attack surface reduction downstream. I have no interest in using it to
> block access to known vulnerabilities.
>
> > Finally, it has been well argued that you can gain many new caps
> > from having only a few others. Given that, how could you ever be
> > sure that, if an 0day is found which allows root in a user ns to
> > abuse CAP_NET_ADMIN against the host, just keeping CAP_NET_ADMIN
> > from them would suffice?
>
> I didn't suggest using it that way...
>
> > It seems to me that the existing control in
> > /proc/sys/kernel/unprivileged_userns_clone might be the better duct
> > tape in that case.
>
> There's no such thing as unprivileged_userns_clone in mainline.

Hm. I was sure Kees had gotten that in... I guess I was wrong.

> The advantage of this over unprivileged_userns_clone in Debian and maybe
> some other distributions is not giving up unprivileged app containers /
> sandboxes implemented via user namespaces. For example, Chromium's user
> namespace sandbox likely only needs to have CAP_SYS_CHROOT. Chromium
> will be dropping their setuid sandbox, forcing usage of user namespaces
> to avoid losing the sandbox which will greatly increase local kernel
> attack surface on the host by exposing netfilter management, etc. to
> unprivileged users.
>
> The proposed approach isn't necessarily the best way to implement this
> kind of mitigation but I think it's filling a real need.

I think I definitely prefer what I mentioned in the email to Boris.
Basically a "permanent capability bounding set". The normal bounding
set gets reset to a full set on every new user_ns creation. In this
proposal, it would instead be set to the calling task's permanent
capability set, which starts (at boot) full, and which privileged
tasks can pull capabilities out of.

-serge
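
To make the "permanent capability bounding set" idea a bit more concrete, here is a rough user-space model in Python. It is only a sketch under stated assumptions, not kernel code and not an existing interface: the `Task` class, its `bounding`/`permanent`/`privileged` fields, the `drop_permanent` and `create_user_ns` helpers, and the three-entry capability list are all hypothetical names made up for illustration. It also assumes that the permanent set is inherited unchanged into the new namespace and that the task cannot shrink or restore it from there, which the proposal above does not spell out.

```python
from dataclasses import dataclass, replace

# A reduced, illustrative capability universe (the real kernel has many more).
ALL_CAPS = frozenset({"CAP_SYS_CHROOT", "CAP_NET_ADMIN", "CAP_SYS_ADMIN"})


@dataclass(frozen=True)
class Task:
    """Hypothetical model of a task's capability state."""
    bounding: frozenset = ALL_CAPS    # ordinary bounding set
    permanent: frozenset = ALL_CAPS   # proposed set; full at boot
    privileged: bool = False          # privileged on the host


def drop_permanent(task, caps):
    """A privileged task pulls capabilities out of its permanent set."""
    if not task.privileged:
        raise PermissionError("only privileged tasks may shrink the permanent set")
    return replace(task, permanent=task.permanent - frozenset(caps))


def create_user_ns(task, proposed=True):
    """Return the task as it would look inside a newly created user namespace.

    Current behaviour: the bounding set is reset to the full set.
    Proposed behaviour: it is set to the caller's permanent set instead.
    """
    bounding = task.permanent if proposed else ALL_CAPS
    # Assumption: inside the new namespace the task is not privileged on the
    # host, and the permanent set is carried over as-is.
    return replace(task, bounding=bounding, privileged=False)


# A privileged task removes everything except CAP_SYS_CHROOT up front.
init = Task(privileged=True)
restricted = drop_permanent(init, {"CAP_NET_ADMIN", "CAP_SYS_ADMIN"})

child_now = create_user_ns(restricted, proposed=False)       # today's reset
child_proposed = create_user_ns(restricted, proposed=True)   # this proposal
```

In this model `child_now.bounding` is the full set again, while `child_proposed.bounding` contains only `CAP_SYS_CHROOT`, which is roughly the Chromium sandbox case Daniel describes: user namespaces stay usable, but CAP_NET_ADMIN and friends are never handed out inside them.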