From: Boris Lukashev
Date: Mon, 6 Nov 2017 19:01:58 -0500
Subject: Re: [kernel-hardening] Re: [PATCH resend 2/2] userns: control capabilities of some user namespaces
To: "Serge E. Hallyn"
Cc: Daniel Micay, Mahesh Bandewar (महेश बंडेवार), Mahesh Bandewar, LKML, Netdev, Kernel-hardening, Linux API, Kees Cook, "Eric W . Biederman", Eric Dumazet, David Miller
In-Reply-To: <20171106233913.GA1518@mail.hallyn.com>
References: <20171103004436.40026-1-mahesh@bandewar.net> <20171104235346.GA17170@mail.hallyn.com> <20171106150302.GA26634@mail.hallyn.com> <1510003994.736.0.camel@gmail.com> <20171106221418.GA32543@mail.hallyn.com> <20171106233913.GA1518@mail.hallyn.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, Nov 6, 2017 at 6:39 PM, Serge E. Hallyn wrote:
> Quoting Boris Lukashev (blukashev@sempervictus.com):
>> On Mon, Nov 6, 2017 at 5:14 PM, Serge E. Hallyn wrote:
>> > Quoting Daniel Micay (danielmicay@gmail.com):
>> >> Substantial added attack surface will never go away as a problem. There
>> >> aren't a finite number of vulnerabilities to be found.
>> >
>> > There's varying levels of usefulness and quality.
>> > There is code which I want to be able to use in a container, and
>> > code which I can't ever see a reason for using there. The latter,
>> > especially if it's also in a staging driver, would be nice to have
>> > a toggle to disable.
>> >
>> > You're not advocating dropping the added attack surface, only adding a
>> > way of dealing with an 0day after the fact. Privilege raising 0days can
>> > exist anywhere, not just in code which only root in a user namespace can
>> > exercise. So from that point of view, ksplice seems a more complete
>> > solution. Why not just actually fix the bad code block when we know
>> > about it?
>> >
>> > Finally, it has been well argued that you can gain many new caps from
>> > having only a few others. Given that, how could you ever be sure that,
>> > if an 0day is found which allows root in a user ns to abuse
>> > CAP_NET_ADMIN against the host, just keeping CAP_NET_ADMIN from them
>> > would suffice? It seems to me that the existing control in
>> > /proc/sys/kernel/unprivileged_userns_clone might be the better duct tape
>> > in that case.
>> >
>> > -serge
>>
>> This seems to be heading toward "we need full zones in Linux" with
>> their own procfs and sysfs namespace and a stricter isolation model
>> for resources and capabilities. So long as things can happen in a
>> namespace which have a privileged relationship with host resources,
>> this is going to be cat-and-mouse to one degree or another.
>>
>> Containers and namespaces dont have a one-to-one relationship, so i'm
>> not sure that's the best term to use in the kernel security context
>
> Sorry - what's not the best term to use?

Pardon, I meant "containers," since a container is a
namespaces+system construct.

>
>> since there's a bunch of userspace and implementation delta across the
>> different systems (with their own security models and so forth).
>> Without accounting for what a specific implementation may or may not
>> do, and only looking at "how do we reduce privileged impact on parent
>> context from unprivileged namespaces," this patch does seem to provide
>> a logical way of reducing the privileges available in such a namespace
>> and often needed to mount escapes/impact parent context.
>
> What different implementations do is irrelevant - as an unprivileged user
> I can always, with no help, create a new user namespace mapping my current
> uid to root, and exercise this code. So the security model implemented
> by a particular userspace namespace-using driver doesn't matter, as it
> only restricts me if I choose to use it.
>
> But, I guess you're actually saying that some program might know that it
> should never use network code so want to drop CAP_NET_*? And you're
> saying that a "global capability bounding set" might be useful?
>

A "global capability bounding set" with forced inheritance can be used
to prevent the vector you describe: the capabilities of UID 0 in the
child NS would be implicitly restricted by the parent. So yes, that
nomenclature seems appropriate.

> Would it be better to actually implement it as a new bounding set that
> is maintained across user namespace creations, but is per-task (inherted
> by children of course)? Instead of a sysctl?
>
> -serge

In line with the previous comment, inheritance across subsequent
invocations should be forced to prevent the scenario you described.
Please pardon my ignorance, but I'm not sure what you mean by
"per-task" across namespace creation.

-Boris

--
Boris Lukashev
Systems Architect
Semper Victus