Date: Thu, 9 Nov 2017 11:22:01 -0600
From: "Serge E. Hallyn"
To: Mahesh Bandewar
Cc: LKML, Netdev, Kernel-hardening, Linux API, Kees Cook, Serge Hallyn,
    "Eric W . Biederman", Eric Dumazet, David Miller, Mahesh Bandewar
Subject: Re: [PATCH resend 1/2] capability: introduce sysctl for controlled user-ns capability whitelist
Message-ID: <20171109172201.GA26229@mail.hallyn.com>
References: <20171103004433.39954-1-mahesh@bandewar.net>
In-Reply-To: <20171103004433.39954-1-mahesh@bandewar.net>
X-Mailing-List: linux-kernel@vger.kernel.org

Quoting Mahesh Bandewar (mahesh@bandewar.net):
> From: Mahesh Bandewar
>
> Add a sysctl variable kernel.controlled_userns_caps_whitelist. This
> takes input as capability mask expressed as two comma separated hex
> u32 words. The mask, however, is stored in kernel as kernel_cap_t type.
>
> Any capabilities that are not part of this mask will be controlled and
> will not be allowed to processes in controlled user-ns.
>
> Signed-off-by: Mahesh Bandewar
> ---
>  Documentation/sysctl/kernel.txt | 21 ++++++++++++++++++
>  include/linux/capability.h | 3 +++
>  kernel/capability.c | 47 +++++++++++++++++++++++++++++++++++++++++
>  kernel/sysctl.c | 5 +++++
>  4 files changed, 76 insertions(+)
>
> diff --git a/Documentation/sysctl/kernel.txt b/Documentation/sysctl/kernel.txt
> index 694968c7523c..a1d39dbae847 100644
> --- a/Documentation/sysctl/kernel.txt
> +++ b/Documentation/sysctl/kernel.txt
> @@ -25,6 +25,7 @@ show up in /proc/sys/kernel:
>  - bootloader_version [ X86 only ]
>  - callhome [ S390 only ]
>  - cap_last_cap
> +- controlled_userns_caps_whitelist
>  - core_pattern
>  - core_pipe_limit
>  - core_uses_pid
> @@ -187,6 +188,26 @@ CAP_LAST_CAP from the kernel.
>
>  ==============================================================
>
> +controlled_userns_caps_whitelist
> +
> +Capability mask that is whitelisted for "controlled" user namespaces.
> +Any capability that is missing from this mask will not be allowed to
> +any process that is attached to a controlled-userns. e.g. if CAP_NET_RAW
> +is not part of this mask, then processes running inside any controlled
> +userns's will not be allowed to perform action that needs CAP_NET_RAW
> +capability. However, processes that are attached to a parent user-ns
> +hierarchy that is *not* controlled and has CAP_NET_RAW can continue
> +performing those actions. User-namespaces are marked "controlled" at
> +the time of their creation based on the capabilities of the creator.
> +A process that does not have CAP_SYS_ADMIN will create user-namespaces
> +that are controlled.

Hm.  I think that's fine (the way 'controlled' user namespaces are
defined), but that is a design decision in itself, and should perhaps be
discussed.  Did you consider other ways?  What about using CAP_SETPCAP?

> +The value is expressed as two comma separated hex words (u32). This

Why comma separated?  Whitespace ok?  Leading 0x ok?  What is the
default at boot?  (Obviously the patch tells me, I'm asking for it to
be spelled out in the doc.)

Otherwise looks good, thanks!

Serge

> +sysctl is available in init-ns and users with CAP_SYS_ADMIN in init-ns
> +are allowed to make changes.
> +
> +==============================================================
> +
>  core_pattern:
>
>  core_pattern is used to specify a core dumpfile pattern name.
> diff --git a/include/linux/capability.h b/include/linux/capability.h
> index b52e278e4744..6c0b9677c03f 100644
> --- a/include/linux/capability.h
> +++ b/include/linux/capability.h
> @@ -13,6 +13,7 @@
>  #define _LINUX_CAPABILITY_H
>
>  #include
> +#include
>
>
>  #define _KERNEL_CAPABILITY_VERSION _LINUX_CAPABILITY_VERSION_3
> @@ -247,6 +248,8 @@ extern bool ptracer_capable(struct task_struct *tsk, struct user_namespace *ns);
>
>  /* audit system wants to get cap info from files as well */
>  extern int get_vfs_caps_from_disk(const struct dentry *dentry, struct cpu_vfs_cap_data *cpu_caps);
> +int proc_douserns_caps_whitelist(struct ctl_table *table, int write,
> +			void __user *buff, size_t *lenp, loff_t *ppos);
>
>  extern int cap_convert_nscap(struct dentry *dentry, void **ivalue, size_t size);
>
> diff --git a/kernel/capability.c b/kernel/capability.c
> index f97fe77ceb88..62dbe3350c1b 100644
> --- a/kernel/capability.c
> +++ b/kernel/capability.c
> @@ -28,6 +28,8 @@ EXPORT_SYMBOL(__cap_empty_set);
>
>  int file_caps_enabled = 1;
>
> +kernel_cap_t controlled_userns_caps_whitelist = CAP_FULL_SET;
> +
>  static int __init file_caps_disable(char *str)
>  {
>  	file_caps_enabled = 0;
> @@ -506,3 +508,48 @@ bool ptracer_capable(struct task_struct *tsk, struct user_namespace *ns)
>  	rcu_read_unlock();
>  	return (ret == 0);
>  }
> +
> +/* Controlled-userns capabilities routines */
> +#ifdef CONFIG_SYSCTL
> +int proc_douserns_caps_whitelist(struct ctl_table *table, int write,
> +				 void __user *buff, size_t *lenp, loff_t *ppos)
> +{
> +	DECLARE_BITMAP(caps_bitmap, CAP_LAST_CAP);
> +	struct ctl_table caps_table;
> +	char tbuf[NAME_MAX];
> +	int ret;
> +
> +	ret = bitmap_from_u32array(caps_bitmap, CAP_LAST_CAP,
> +				   controlled_userns_caps_whitelist.cap,
> +				   _KERNEL_CAPABILITY_U32S);
> +	if (ret != CAP_LAST_CAP)
> +		return -1;
> +
> +	scnprintf(tbuf, NAME_MAX, "%*pb", CAP_LAST_CAP, caps_bitmap);
> +
> +	caps_table.data = tbuf;
> +	caps_table.maxlen = NAME_MAX;
> +	caps_table.mode = table->mode;
> +	ret = proc_dostring(&caps_table, write, buff, lenp, ppos);
> +	if (ret)
> +		return ret;
> +	if (write) {
> +		kernel_cap_t tmp;
> +
> +		if (!capable(CAP_SYS_ADMIN))
> +			return -EPERM;
> +
> +		ret = bitmap_parse_user(buff, *lenp, caps_bitmap, CAP_LAST_CAP);
> +		if (ret)
> +			return ret;
> +
> +		ret = bitmap_to_u32array(tmp.cap, _KERNEL_CAPABILITY_U32S,
> +					 caps_bitmap, CAP_LAST_CAP);
> +		if (ret != CAP_LAST_CAP)
> +			return -1;
> +
> +		controlled_userns_caps_whitelist = tmp;
> +	}
> +	return 0;
> +}
> +#endif /* CONFIG_SYSCTL */
> diff --git a/kernel/sysctl.c b/kernel/sysctl.c
> index d9c31bc2eaea..25c3f7b76ece 100644
> --- a/kernel/sysctl.c
> +++ b/kernel/sysctl.c
> @@ -1226,6 +1226,11 @@ static struct ctl_table kern_table[] = {
>  		.extra2 = &one,
>  	},
>  #endif
> +	{
> +		.procname = "controlled_userns_caps_whitelist",
> +		.mode = 0644,
> +		.proc_handler = proc_douserns_caps_whitelist,
> +	},
>  	{ }
>  };
>
> --
> 2.15.0.403.gc27cc4dac6-goog
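
For reference on the format question, here is a minimal userspace sketch of a "two comma separated hex u32 words" encoding of a 64-bit capability mask. It is an illustration only, not the patch's parser: the helper names capmask_format() and capmask_parse() are hypothetical, and it assumes the high word comes first, which is how the kernel's "%*pb" bitmap format orders its 32-bit chunks. The parser is deliberately strict (no whitespace, no leading 0x) purely to make the choice visible; what the sysctl should accept is the open question.

```c
#include <ctype.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: format a 64-bit mask as "HIGH,LOW" hex words.
 * Assumption: the high u32 word is printed first. */
static int capmask_format(uint64_t mask, char *buf, size_t len)
{
	uint32_t hi = (uint32_t)(mask >> 32);
	uint32_t lo = (uint32_t)(mask & 0xffffffffu);

	return snprintf(buf, len, "%08" PRIx32 ",%08" PRIx32, hi, lo);
}

/* Consume 1 to 8 hex digits at *p; no sign, whitespace or 0x prefix. */
static int parse_hex_word(const char **p, uint32_t *out)
{
	uint32_t v = 0;
	int digits = 0;

	while (digits < 8 && isxdigit((unsigned char)**p)) {
		int c = tolower((unsigned char)**p);

		v = (v << 4) | (uint32_t)(c <= '9' ? c - '0' : c - 'a' + 10);
		digits++;
		(*p)++;
	}
	if (digits == 0)
		return -1;
	*out = v;
	return 0;
}

/* Hypothetical helper: parse "HIGH,LOW" with an optional trailing
 * newline. Returns 0 on success, -1 on malformed input. */
static int capmask_parse(const char *s, uint64_t *mask)
{
	uint32_t hi, lo;

	if (parse_hex_word(&s, &hi))
		return -1;
	if (*s != ',')
		return -1;
	s++;
	if (parse_hex_word(&s, &lo))
		return -1;
	if (*s == '\n')
		s++;
	if (*s != '\0')
		return -1;

	*mask = ((uint64_t)hi << 32) | lo;
	return 0;
}
```

With these helpers, a mask with bits 0 through 37 set formats as 0000003f,ffffffff, and "3f,ffffffff" parses back to the same value, while "0x3f,ffffffff" and "3f ffffffff" are rejected.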