Date: Mon, 6 Nov 2017 17:39:13 -0600
From: "Serge E. Hallyn"
To: Boris Lukashev
Cc: "Serge E. Hallyn", Daniel Micay, Mahesh Bandewar (महेश बंडेवार),
 Mahesh Bandewar, LKML, Netdev, Kernel-hardening, Linux API, Kees Cook,
 "Eric W .
 Biederman", Eric Dumazet, David Miller
Subject: Re: [kernel-hardening] Re: [PATCH resend 2/2] userns: control capabilities of some user namespaces
Message-ID: <20171106233913.GA1518@mail.hallyn.com>
References: <20171103004436.40026-1-mahesh@bandewar.net>
 <20171104235346.GA17170@mail.hallyn.com>
 <20171106150302.GA26634@mail.hallyn.com>
 <1510003994.736.0.camel@gmail.com>
 <20171106221418.GA32543@mail.hallyn.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Quoting Boris Lukashev (blukashev@sempervictus.com):
> On Mon, Nov 6, 2017 at 5:14 PM, Serge E. Hallyn wrote:
> > Quoting Daniel Micay (danielmicay@gmail.com):
> >> Substantial added attack surface will never go away as a problem. There
> >> aren't a finite number of vulnerabilities to be found.
> >
> > There's varying levels of usefulness and quality. There is code which I
> > want to be able to use in a container, and code which I can't ever see a
> > reason for using there. The latter, especially if it's also in a
> > staging driver, would be nice to have a toggle to disable.
> >
> > You're not advocating dropping the added attack surface, only adding a
> > way of dealing with an 0day after the fact. Privilege raising 0days can
> > exist anywhere, not just in code which only root in a user namespace can
> > exercise. So from that point of view, ksplice seems a more complete
> > solution. Why not just actually fix the bad code block when we know
> > about it?
> >
> > Finally, it has been well argued that you can gain many new caps from
> > having only a few others. Given that, how could you ever be sure that,
> > if an 0day is found which allows root in a user ns to abuse
> > CAP_NET_ADMIN against the host, just keeping CAP_NET_ADMIN from them
> > would suffice?
> > It seems to me that the existing control in
> > /proc/sys/kernel/unprivileged_userns_clone might be the better duct tape
> > in that case.
> >
> > -serge
>
> This seems to be heading toward "we need full zones in Linux" with
> their own procfs and sysfs namespace and a stricter isolation model
> for resources and capabilities. So long as things can happen in a
> namespace which have a privileged relationship with host resources,
> this is going to be cat-and-mouse to one degree or another.
>
> Containers and namespaces dont have a one-to-one relationship, so i'm
> not sure that's the best term to use in the kernel security context

Sorry - what's not the best term to use?

> since there's a bunch of userspace and implementation delta across the
> different systems (with their own security models and so forth).
> Without accounting for what a specific implementation may or may not
> do, and only looking at "how do we reduce privileged impact on parent
> context from unprivileged namespaces," this patch does seem to provide
> a logical way of reducing the privileges available in such a namespace
> and often needed to mount escapes/impact parent context.

What different implementations do is irrelevant - as an unprivileged
user I can always, with no help, create a new user namespace mapping my
current uid to root, and exercise this code. So the security model
implemented by a particular userspace namespace-using driver doesn't
matter, as it only restricts me if I choose to use it.

But, I guess you're actually saying that some program might know that it
should never use network code and so want to drop CAP_NET_*? And you're
saying that a "global capability bounding set" might be useful? Would
it be better to actually implement it as a new bounding set that is
maintained across user namespace creations, but is per-task (inherited by
children of course)? Instead of a sysctl?
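
For reference, the "create a new user namespace mapping my current uid to
root" step can be sketched in a few lines. This is only an illustration, not
code from the patch under discussion: the helper names (id_map_line,
become_root_in_new_userns, try_in_child) are made up for this sketch, and it
assumes a Linux kernel with user namespaces enabled and a libc that exports
unshare(). On kernels carrying the unprivileged_userns_clone sysctl, setting
it to 0 should make the unshare() call fail for unprivileged users.

```python
import ctypes
import os

CLONE_NEWUSER = 0x10000000  # value from <linux/sched.h>


def id_map_line(inside_id: int, outside_id: int, count: int = 1) -> str:
    """Format one line for /proc/<pid>/uid_map or gid_map (hypothetical helper)."""
    return f"{inside_id} {outside_id} {count}\n"


def become_root_in_new_userns() -> None:
    """Unshare a user namespace and map the caller's uid/gid to 0 inside it."""
    # Capture the ids first; after unshare() they read as the overflow id
    # until a mapping has been written.
    uid, gid = os.getuid(), os.getgid()
    libc = ctypes.CDLL(None, use_errno=True)
    if libc.unshare(CLONE_NEWUSER) != 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))
    # An unprivileged process must deny setgroups before writing gid_map.
    with open("/proc/self/setgroups", "w") as f:
        f.write("deny")
    with open("/proc/self/uid_map", "w") as f:
        f.write(id_map_line(0, uid))
    with open("/proc/self/gid_map", "w") as f:
        f.write(id_map_line(0, gid))


def try_in_child() -> int:
    """Run the experiment in a forked child so the caller's namespace is untouched.

    Returns 0 if the child saw euid 0 inside the new namespace, 1 if it did
    not, 2 if namespace setup failed, and -1 if the child was killed.
    """
    pid = os.fork()
    if pid == 0:
        try:
            become_root_in_new_userns()
            os._exit(0 if os.geteuid() == 0 else 1)
        except OSError:
            os._exit(2)  # user namespaces unavailable or disabled by policy
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status) if os.WIFEXITED(status) else -1


if __name__ == "__main__":
    print("child exit code:", try_in_child())
```

No userspace container manager is involved at any point, which is why the
security model of any particular namespace-using tool cannot be relied on to
keep this code from being exercised.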

-serge