From: Stefan Berger
To: linux-integrity@vger.kernel.org
Cc: zohar@linux.ibm.com, serge@hallyn.com, christian.brauner@ubuntu.com, containers@lists.linux.dev, dmitry.kasatkin@gmail.com, ebiederm@xmission.com, krzysztof.struczynski@huawei.com, roberto.sassu@huawei.com, mpeters@redhat.com, lhinds@redhat.com, lsturman@redhat.com, puiterwi@redhat.com, jejb@linux.ibm.com, jamjoom@us.ibm.com, linux-kernel@vger.kernel.org, paul@paul-moore.com, rgb@redhat.com,
    linux-security-module@vger.kernel.org, jmorris@namei.org, Stefan Berger, Christian Brauner, James Bottomley
Subject: [PATCH v10 22/27] securityfs: Extend securityfs with namespacing support
Date: Tue, 1 Feb 2022 15:37:30 -0500
Message-Id: <20220201203735.164593-23-stefanb@linux.ibm.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20220201203735.164593-1-stefanb@linux.ibm.com>
References: <20220201203735.164593-1-stefanb@linux.ibm.com>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Mailing-List: linux-kernel@vger.kernel.org

Enable multiple instances of securityfs by keying each instance with a
pointer to the user namespace it belongs to.

Since we do not need the pinning of the filesystem for the virtualization
case, limit the usage of simple_pin_fs() and simple_release_fs() to the
case when the init_user_ns is active. This simplifies the cleanup for the
virtualization case where usage of securityfs_remove() to free dentries
is therefore not needed anymore.

For the initial securityfs, i.e. the one mounted in the host userns mount,
nothing changes. The rules for securityfs_remove() are as before and it is
still paired with securityfs_create(). Specifically, a file created via
securityfs_create_dentry() in the initial securityfs mount still needs to
be removed by a call to securityfs_remove().
Creating a new dentry in the initial securityfs mount still pins the
filesystem like it always did. Consequently, the initial securityfs mount
is not destroyed on umount/shutdown as long as at least one user of it
still has dentries that it hasn't removed with a call to
securityfs_remove().

Prevent mounting of an instance of securityfs in a user namespace other
than the one it belongs to. Also, prevent accesses to files and
directories by a user namespace that is neither the user namespace it
belongs to nor an ancestor of the user namespace that the instance of
securityfs belongs to. Do not prevent access if securityfs was
bind-mounted and therefore the init_user_ns is the owning user namespace.

Suggested-by: Christian Brauner
Signed-off-by: Stefan Berger
Signed-off-by: James Bottomley
---
 security/inode.c | 72 ++++++++++++++++++++++++++++++++++++++++--------
 1 file changed, 61 insertions(+), 11 deletions(-)

diff --git a/security/inode.c b/security/inode.c
index 13e6780c4444..e525ba960063 100644
--- a/security/inode.c
+++ b/security/inode.c
@@ -21,9 +21,37 @@
 #include
 #include
 #include
+#include
 
-static struct vfsmount *mount;
-static int mount_count;
+static struct vfsmount *init_securityfs_mount;
+static int init_securityfs_mount_count;
+
+static int securityfs_permission(struct user_namespace *mnt_userns,
+				 struct inode *inode, int mask)
+{
+	int err;
+
+	err = generic_permission(&init_user_ns, inode, mask);
+	if (!err) {
+		/* Unless bind-mounted, deny access if current_user_ns() is not
+		 * ancestor.
+		 */
+		if (inode->i_sb->s_user_ns != &init_user_ns &&
+		    !in_userns(current_user_ns(), inode->i_sb->s_user_ns))
+			err = -EACCES;
+	}
+
+	return err;
+}
+
+static const struct inode_operations securityfs_dir_inode_operations = {
+	.permission = securityfs_permission,
+	.lookup = simple_lookup,
+};
+
+static const struct inode_operations securityfs_file_inode_operations = {
+	.permission = securityfs_permission,
+};
 
 static void securityfs_free_inode(struct inode *inode)
 {
@@ -40,20 +68,25 @@ static const struct super_operations securityfs_super_operations = {
 static int securityfs_fill_super(struct super_block *sb, struct fs_context *fc)
 {
 	static const struct tree_descr files[] = {{""}};
+	struct user_namespace *ns = fc->user_ns;
 	int error;
 
+	if (WARN_ON(ns != current_user_ns()))
+		return -EINVAL;
+
 	error = simple_fill_super(sb, SECURITYFS_MAGIC, files);
 	if (error)
 		return error;
 
 	sb->s_op = &securityfs_super_operations;
+	sb->s_root->d_inode->i_op = &securityfs_dir_inode_operations;
 
 	return 0;
 }
 
 static int securityfs_get_tree(struct fs_context *fc)
 {
-	return get_tree_single(fc, securityfs_fill_super);
+	return get_tree_keyed(fc, securityfs_fill_super, fc->user_ns);
 }
 
 static const struct fs_context_operations securityfs_context_ops = {
@@ -71,6 +104,7 @@ static struct file_system_type fs_type = {
 	.name = "securityfs",
 	.init_fs_context = securityfs_init_fs_context,
 	.kill_sb = kill_litter_super,
+	.fs_flags = FS_USERNS_MOUNT,
 };
 
 /**
@@ -109,6 +143,7 @@ static struct dentry *securityfs_create_dentry(const char *name, umode_t mode,
 					const struct file_operations *fops,
 					const struct inode_operations *iops)
 {
+	struct user_namespace *ns = current_user_ns();
 	struct dentry *dentry;
 	struct inode *dir, *inode;
 	int error;
@@ -118,12 +153,19 @@ static struct dentry *securityfs_create_dentry(const char *name, umode_t mode,
 
 	pr_debug("securityfs: creating file '%s'\n",name);
 
-	error = simple_pin_fs(&fs_type, &mount, &mount_count);
-	if (error)
-		return ERR_PTR(error);
+	if (ns == &init_user_ns) {
+		error = simple_pin_fs(&fs_type, &init_securityfs_mount,
+				      &init_securityfs_mount_count);
+		if (error)
+			return ERR_PTR(error);
+	}
 
-	if (!parent)
-		parent = mount->mnt_root;
+	if (!parent) {
+		if (ns == &init_user_ns)
+			parent = init_securityfs_mount->mnt_root;
+		else
+			return ERR_PTR(-EINVAL);
+	}
 
 	dir = d_inode(parent);
 
@@ -148,7 +190,7 @@ static struct dentry *securityfs_create_dentry(const char *name, umode_t mode,
 	inode->i_atime = inode->i_mtime = inode->i_ctime = current_time(inode);
 	inode->i_private = data;
 	if (S_ISDIR(mode)) {
-		inode->i_op = &simple_dir_inode_operations;
+		inode->i_op = &securityfs_dir_inode_operations;
 		inode->i_fop = &simple_dir_operations;
 		inc_nlink(inode);
 		inc_nlink(dir);
@@ -156,6 +198,7 @@ static struct dentry *securityfs_create_dentry(const char *name, umode_t mode,
 		inode->i_op = iops ? iops : &simple_symlink_inode_operations;
 		inode->i_link = data;
 	} else {
+		inode->i_op = &securityfs_file_inode_operations;
 		inode->i_fop = fops;
 	}
 	d_instantiate(dentry, inode);
@@ -167,7 +210,9 @@ static struct dentry *securityfs_create_dentry(const char *name, umode_t mode,
 	dentry = ERR_PTR(error);
 out:
 	inode_unlock(dir);
-	simple_release_fs(&mount, &mount_count);
+	if (ns == &init_user_ns)
+		simple_release_fs(&init_securityfs_mount,
+				  &init_securityfs_mount_count);
 	return dentry;
 }
 
@@ -293,11 +338,14 @@ EXPORT_SYMBOL_GPL(securityfs_create_symlink);
  */
 void securityfs_remove(struct dentry *dentry)
 {
+	struct user_namespace *ns;
 	struct inode *dir;
 
 	if (!dentry || IS_ERR(dentry))
 		return;
 
+	ns = dentry->d_sb->s_user_ns;
+
 	dir = d_inode(dentry->d_parent);
 	inode_lock(dir);
 	if (simple_positive(dentry)) {
@@ -310,7 +358,9 @@ void securityfs_remove(struct dentry *dentry)
 		dput(dentry);
 	}
 	inode_unlock(dir);
-	simple_release_fs(&mount, &mount_count);
+	if (ns == &init_user_ns)
+		simple_release_fs(&init_securityfs_mount,
+				  &init_securityfs_mount_count);
 }
 EXPORT_SYMBOL_GPL(securityfs_remove);
 
-- 
2.31.1
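
The namespace rule that securityfs_permission() adds can be hard to picture from the diff alone, so here is a small userspace model of it. This is only a sketch and not kernel code: struct model_userns, model_in_userns(), model_securityfs_permission() and the example namespaces are hypothetical names made up for illustration. The model covers only the user namespace part of the check and leaves out the generic_permission() call; it assumes in_userns(ancestor, child) is true when child is ancestor itself or one of its descendants.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct user_namespace: only the parent link. */
struct model_userns {
	const struct model_userns *parent;	/* NULL for the initial namespace */
};

/*
 * Illustrative hierarchy (all names are assumptions):
 *   model_init_ns -> model_ns_a -> model_ns_b
 *   model_init_ns -> model_ns_c
 */
static const struct model_userns model_init_ns = { .parent = NULL };
static const struct model_userns model_ns_a = { .parent = &model_init_ns };
static const struct model_userns model_ns_b = { .parent = &model_ns_a };
static const struct model_userns model_ns_c = { .parent = &model_init_ns };

/*
 * Models in_userns(ancestor, child): true if child is ancestor itself or
 * a descendant of it. Implemented here by walking the parent chain.
 */
static bool model_in_userns(const struct model_userns *ancestor,
			    const struct model_userns *child)
{
	const struct model_userns *ns;

	for (ns = child; ns; ns = ns->parent)
		if (ns == ancestor)
			return true;
	return false;
}

/*
 * Namespace part of the check only. "caller" plays the role of
 * current_user_ns(), "owner" the role of inode->i_sb->s_user_ns.
 */
static int model_securityfs_permission(const struct model_userns *caller,
				       const struct model_userns *owner)
{
	if (owner != &model_init_ns && !model_in_userns(caller, owner))
		return -EACCES;
	return 0;
}

/* Walks through the cases named in the commit message. */
static void model_check_examples(void)
{
	/* Owning namespace and its ancestors may access the instance. */
	assert(model_securityfs_permission(&model_ns_a, &model_ns_a) == 0);
	assert(model_securityfs_permission(&model_init_ns, &model_ns_a) == 0);

	/* Descendants and unrelated namespaces are denied. */
	assert(model_securityfs_permission(&model_ns_b, &model_ns_a) == -EACCES);
	assert(model_securityfs_permission(&model_ns_c, &model_ns_a) == -EACCES);

	/* An instance owned by the initial namespace is never denied here. */
	assert(model_securityfs_permission(&model_ns_b, &model_init_ns) == 0);
}
```

Calling model_check_examples() exercises each case: the owning namespace and its ancestors pass, a child or sibling namespace gets -EACCES, and an instance owned by the initial namespace (the bind-mount case in the commit message) is not restricted by this part of the check.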