From: Michael Weiß
To: Alexander Mikhalitsyn, Christian Brauner, Alexei Starovoitov, Paul Moore
Cc: Daniel Borkmann, Andrii Nakryiko, Martin KaFai Lau, Song Liu,
 Yonghong Song, John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo,
 Jiri Olsa, Quentin Monnet, Alexander Viro, Miklos Szeredi, Amir Goldstein,
 "Serge E. Hallyn", bpf@vger.kernel.org, linux-kernel@vger.kernel.org,
 linux-fsdevel@vger.kernel.org, gyroidos@aisec.fraunhofer.de, Michael Weiß
Subject: [RFC PATCH v2 14/14] device_cgroup: Allow mknod in non-initial userns if guarded
Date: Wed, 18 Oct 2023 12:50:33 +0200
Message-Id: <20231018105033.13669-15-michael.weiss@aisec.fraunhofer.de>
In-Reply-To: <20231018105033.13669-1-michael.weiss@aisec.fraunhofer.de>
References: <20231018105033.13669-1-michael.weiss@aisec.fraunhofer.de>

If a container manager restricts its unprivileged (user namespaced)
children with a device cgroup, it is no longer necessary to deny mknod().
User space applications can then map devices at different locations in
the file system by calling mknod() inside the container.

One use case, which we also rely on in GyroidOS, is running virsh for VMs
inside an unprivileged container. virsh creates device nodes, e.g.,
"/var/run/libvirt/qemu/11-fgfg.dev/null", which currently fails in a
non-initial userns even if a device cgroup whitelist entry with the
major and minor numbers of /dev/null exists. In this case, the usual
bind mounts or pre-populated device nodes under /dev are not sufficient.

To lift this limitation, implement the security_inode_mknod_nscap() hook
so that mknod() can be allowed by checking CAP_MKNOD in the userns. The
hook implementation checks whether the permission flag
BPF_DEVCG_ACC_MKNOD_UNS is set for the device in the bpf program. To
avoid creating unusable inodes in user space, the hook also checks
SB_I_NODEV on the corresponding super block.

Further, the security_sb_alloc_userns() hook is implemented using
cgroup_bpf_current_enabled() to allow the use of device nodes on super
blocks mounted by a guarded task.

Signed-off-by: Michael Weiß
---
 security/device_cgroup/lsm.c | 27 +++++++++++++++++++++++++++
 1 file changed, 27 insertions(+)

diff --git a/security/device_cgroup/lsm.c b/security/device_cgroup/lsm.c
index a963536d0a15..6bc984d9c9d1 100644
--- a/security/device_cgroup/lsm.c
+++ b/security/device_cgroup/lsm.c
@@ -66,10 +66,37 @@ static int devcg_inode_mknod(struct inode *dir, struct dentry *dentry,
 	return __devcg_inode_mknod(mode, dev, DEVCG_ACC_MKNOD);
 }
 
+#ifdef CONFIG_CGROUP_BPF
+static int devcg_sb_alloc_userns(struct super_block *sb)
+{
+	if (cgroup_bpf_current_enabled(CGROUP_DEVICE))
+		return 0;
+
+	return -EPERM;
+}
+
+static int devcg_inode_mknod_nscap(struct inode *dir, struct dentry *dentry,
+				   umode_t mode, dev_t dev)
+{
+	if (!cgroup_bpf_current_enabled(CGROUP_DEVICE))
+		return -EPERM;
+
+	/* Avoid creating inodes that would be unusable from user space. */
+	if (dentry->d_sb->s_iflags & SB_I_NODEV)
+		return -EPERM;
+
+	return __devcg_inode_mknod(mode, dev, BPF_DEVCG_ACC_MKNOD_UNS);
+}
+#endif /* CONFIG_CGROUP_BPF */
+
 static struct security_hook_list devcg_hooks[] __ro_after_init = {
 	LSM_HOOK_INIT(inode_permission, devcg_inode_permission),
 	LSM_HOOK_INIT(inode_mknod, devcg_inode_mknod),
 	LSM_HOOK_INIT(dev_permission, devcg_dev_permission),
+#ifdef CONFIG_CGROUP_BPF
+	LSM_HOOK_INIT(sb_alloc_userns, devcg_sb_alloc_userns),
+	LSM_HOOK_INIT(inode_mknod_nscap, devcg_inode_mknod_nscap),
+#endif
 };
 
 static int __init devcgroup_init(void)
-- 
2.30.2
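
For illustration only, the user space side of the use case described in the commit message could look roughly like the sketch below: a process inside the container tries to create a /dev/null equivalent at a path outside /dev, as virsh does. The helper name make_null_node() is hypothetical and not part of this series. The sketch assumes Linux, where /dev/null is the character device with major 1 and minor 3.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/stat.h>
#include <sys/sysmacros.h>
#include <sys/types.h>

/* On Linux, /dev/null is the character device with major 1, minor 3. */
#define NULL_MAJOR 1
#define NULL_MINOR 3

/*
 * Hypothetical helper (not part of the patch): try to create a
 * /dev/null-equivalent character device node at `path`.
 *
 * Returns 0 on success or a negative errno value on failure.
 */
static int make_null_node(const char *path)
{
	assert(path != NULL);

	if (mknod(path, S_IFCHR | 0666, makedev(NULL_MAJOR, NULL_MINOR)) < 0)
		return -errno;

	return 0;
}
```

Without this series, calling make_null_node() from a non-initial userns is expected to fail (typically with -EPERM), which is the limitation the commit message describes. With the series applied and a device cgroup bpf program that grants the new flag for 1:3, the same call would be expected to succeed, provided the target super block is not marked SB_I_NODEV.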