From: Christian Brauner
To: Alexander Viro, Christoph Hellwig, linux-fsdevel@vger.kernel.org
Cc: John Johansen, James Morris, Mimi Zohar, Dmitry Kasatkin,
    Stephen Smalley, Casey Schaufler, Arnd Bergmann, Andreas Dilger,
    OGAWA Hirofumi, Geoffrey Thomas, Mrunal Patel, Josh Triplett,
    Andy Lutomirski, Theodore Tso, Alban Crequy, Tycho Andersen,
    David Howells, James Bottomley, Seth Forshee, Stéphane Graber,
    Linus Torvalds, Aleksa Sarai, Lennart Poettering,
    "Eric W. Biederman", smbarber@chromium.org, Phil Estes,
    Serge Hallyn, Kees Cook, Todd Kjos, Paul Moore, Jonathan Corbet,
    containers@lists.linux-foundation.org,
    linux-security-module@vger.kernel.org, linux-api@vger.kernel.org,
    linux-ext4@vger.kernel.org, linux-xfs@vger.kernel.org,
    linux-integrity@vger.kernel.org, selinux@vger.kernel.org,
    Christian Brauner, Christoph Hellwig
Subject: [PATCH v5 09/42] mount: attach mappings to mounts
Date: Tue, 12 Jan 2021 23:00:51 +0100
Message-Id: <20210112220124.837960-10-christian.brauner@ubuntu.com>
In-Reply-To: <20210112220124.837960-1-christian.brauner@ubuntu.com>
References: <20210112220124.837960-1-christian.brauner@ubuntu.com>
X-Mailing-List: linux-ext4@vger.kernel.org

In order to support per-mount idmappings vfsmounts will be marked with
user namespaces. The idmapping associated with that user namespace will
be used to map the ids of vfs objects when they are accessed through
that mount. By default all vfsmounts are marked with the initial user
namespace. The initial user namespace is used to indicate that a mount
is not idmapped. All operations behave as before.

Based on prior discussions we want to attach the whole user namespace
and not just a dedicated idmapping struct. This allows us to reuse all
the helpers that already exist for dealing with idmappings instead of
introducing a whole new range of helpers.
In addition, if we decide in the future that we are confident enough to
enable unprivileged users to set up idmapped mounts, we can allow the
user namespace an already idmapped mount has been marked with to be
replaced with another one. The permission checking would then take into
account whether the caller is privileged in the user namespace the
mount is currently marked with and that is about to be replaced with
another one. For now, we will enforce in later patches that once a
mount has been idmapped it can't be remapped. This keeps permission
checking and life-cycle management simple, especially since users can
always create a new mount with a different idmapping anyway.

The idea to attach user namespaces to vfsmounts has been floated around
in various forms at Linux Plumbers in ~2018 with the original idea
tracing back to a discussion during a conference in St. Petersburg
between Christoph, Tycho, and myself.

Cc: Christoph Hellwig
Cc: David Howells
Cc: Al Viro
Cc: linux-fsdevel@vger.kernel.org
Signed-off-by: Christian Brauner
---
/* v2 */
patch introduced
- Christoph Hellwig:
  - Split internal implementation into separate patch and move syscall
    implementation later.

/* v3 */
- David Howells:
  - Remove MNT_IDMAPPED flag. We can simply check the pointer and use
    smp_load_acquire() in later patches.

- Tycho Andersen:
  - Use READ_ONCE() in mnt_user_ns().

/* v4 */
- Serge Hallyn:
  - Use "mnt_userns" to refer to a vfsmount's userns everywhere to make
    terminology consistent.

- Christoph Hellwig:
  - Drop the READ_ONCE() from this patch. At this point in the series
    we don't allow changing the vfsmount's userns. The infra to do
    that is only introduced as almost the last patch in the series and
    there we immediately use smp_load_acquire() and
    smp_store_release().

/* v5 */
base-commit: 7c53f6b671f4aba70ff15e1b05148b10d58c2837
---
 fs/namespace.c        | 9 +++++++++
 include/linux/fs.h    | 1 +
 include/linux/mount.h | 6 ++++++
 3 files changed, 16 insertions(+)

diff --git a/fs/namespace.c b/fs/namespace.c
index 6efae2681bcd..ceb2943f8458 100644
--- a/fs/namespace.c
+++ b/fs/namespace.c
@@ -228,6 +228,7 @@ static struct mount *alloc_vfsmnt(const char *name)
 		INIT_HLIST_NODE(&mnt->mnt_mp_list);
 		INIT_LIST_HEAD(&mnt->mnt_umounting);
 		INIT_HLIST_HEAD(&mnt->mnt_stuck_children);
+		mnt->mnt.mnt_userns = &init_user_ns;
 	}
 	return mnt;
 
@@ -567,6 +568,11 @@ int sb_prepare_remount_readonly(struct super_block *sb)
 
 static void free_vfsmnt(struct mount *mnt)
 {
+	struct user_namespace *mnt_userns;
+
+	mnt_userns = mnt_user_ns(&mnt->mnt);
+	if (mnt_userns != &init_user_ns)
+		put_user_ns(mnt_userns);
 	kfree_const(mnt->mnt_devname);
 #ifdef CONFIG_SMP
 	free_percpu(mnt->mnt_pcp);
@@ -1075,6 +1081,9 @@ static struct mount *clone_mnt(struct mount *old, struct dentry *root,
 	mnt->mnt.mnt_flags &= ~(MNT_WRITE_HOLD|MNT_MARKED|MNT_INTERNAL);
 
 	atomic_inc(&sb->s_active);
+	mnt->mnt.mnt_userns = mnt_user_ns(&old->mnt);
+	if (mnt->mnt.mnt_userns != &init_user_ns)
+		mnt->mnt.mnt_userns = get_user_ns(mnt->mnt.mnt_userns);
 	mnt->mnt.mnt_sb = sb;
 	mnt->mnt.mnt_root = dget(root);
 	mnt->mnt_mountpoint = mnt->mnt.mnt_root;
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 1da4c21fb588..0e2b8d235dca 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2278,6 +2278,7 @@ struct file_system_type {
 #define FS_HAS_SUBTYPE		4
 #define FS_USERNS_MOUNT		8	/* Can be mounted by userns root */
 #define FS_DISALLOW_NOTIFY_PERM	16	/* Disable fanotify permission events */
+#define FS_ALLOW_IDMAP		32	/* FS has been updated to handle vfs idmappings. */
 #define FS_THP_SUPPORT		8192	/* Remove once all fs converted */
 #define FS_RENAME_DOES_D_MOVE	32768	/* FS will handle d_move() during rename() internally. */
 	int (*init_fs_context)(struct fs_context *);
diff --git a/include/linux/mount.h b/include/linux/mount.h
index aaf343b38671..52de25e08319 100644
--- a/include/linux/mount.h
+++ b/include/linux/mount.h
@@ -72,8 +72,14 @@ struct vfsmount {
 	struct dentry *mnt_root;	/* root of the mounted tree */
 	struct super_block *mnt_sb;	/* pointer to superblock */
 	int mnt_flags;
+	struct user_namespace *mnt_userns;
 } __randomize_layout;
 
+static inline struct user_namespace *mnt_user_ns(const struct vfsmount *mnt)
+{
+	return mnt->mnt_userns;
+}
+
 struct file; /* forward dec */
 struct path;
 
-- 
2.30.0
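
The reference handling in the fs/namespace.c hunks can be modelled in
userspace roughly as follows. This is only a sketch to illustrate the
life cycle: struct user_ns, struct mount_model, init_ns, get_ns(),
put_ns() and the other helper names are hypothetical stand-ins, not
kernel APIs, and the plain int counter ignores all locking and memory
ordering concerns.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct user_namespace: only a refcount. */
struct user_ns {
	int refcount;
};

/* Models init_user_ns. Mounts never take or drop references on it. */
static struct user_ns init_ns = { .refcount = 1 };

/* Hypothetical stand-in for struct vfsmount. */
struct mount_model {
	struct user_ns *mnt_userns;
};

static struct user_ns *get_ns(struct user_ns *ns)
{
	ns->refcount++;
	return ns;
}

static void put_ns(struct user_ns *ns)
{
	ns->refcount--;
}

/* A mount counts as idmapped iff it is not marked with init_ns. */
static bool is_idmapped(const struct mount_model *m)
{
	return m->mnt_userns != &init_ns;
}

/* Mirrors alloc_vfsmnt(): every new mount starts out not idmapped. */
static struct mount_model *alloc_mount(void)
{
	struct mount_model *m = calloc(1, sizeof(*m));

	if (m)
		m->mnt_userns = &init_ns;
	return m;
}

/* Mirrors clone_mnt(): inherit the marking, pin it unless it is init_ns. */
static struct mount_model *clone_mount(const struct mount_model *old)
{
	struct mount_model *m = alloc_mount();

	if (!m)
		return NULL;
	m->mnt_userns = old->mnt_userns;
	if (is_idmapped(m))
		m->mnt_userns = get_ns(m->mnt_userns);
	return m;
}

/* Mirrors free_vfsmnt(): drop the reference taken for an idmapped mount. */
static void free_mount(struct mount_model *m)
{
	if (is_idmapped(m))
		put_ns(m->mnt_userns);
	free(m);
}

/* Walks one namespace through mark, clone and free; returns its final count. */
static int lifecycle_demo(void)
{
	struct user_ns ns = { .refcount = 1 };
	struct mount_model *a = alloc_mount();
	struct mount_model *b;

	assert(a && !is_idmapped(a));
	a->mnt_userns = get_ns(&ns);	/* mark the mount (done by later patches) */
	b = clone_mount(a);
	assert(b && is_idmapped(b) && ns.refcount == 3);
	free_mount(b);
	free_mount(a);
	return ns.refcount;
}
```

In this model lifecycle_demo() ends with the namespace back at its
starting count, and init_ns is never touched. That is the property the
"!= &init_user_ns" checks in clone_mnt() and free_vfsmnt() are meant to
preserve.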