2024-01-08 12:10:05

by Aleksandr Mikhalitsyn

[permalink] [raw]
Subject: [PATCH v1 0/9] fuse: basic support for idmapped mounts

Dear friends,

This patch series aimed to provide support for idmapped mounts
for fuse. We already have idmapped mounts support for almost all
widely-used filesystems:
* local (ext4, btrfs, xfs, fat, vfat, ntfs3, squashfs, f2fs, erofs, ZFS (out-of-tree))
* network (ceph)

Git tree (based on https://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse.git/log/?h=for-next):
v1: https://github.com/mihalicyn/linux/commits/fuse_idmapped_mounts.v1
current: https://github.com/mihalicyn/linux/commits/fuse_idmapped_mounts

Having fuse supported looks like a good next step. At the same time
fuse conceptually close to the network filesystems and supporting it is
a quite challenging task.

Let me briefly explain what was done in this series and which obstacles we have.

With this series, you can use idmapped mounts with fuse if the following
conditions are met:
1. The filesystem daemon declares idmap support (new FUSE_INIT response feature
flags FUSE_OWNER_UID_GID_EXT and FUSE_ALLOW_IDMAP)
2. The filesystem superblock was mounted with the "default_permissions" parameter
3. The filesystem fuse daemon does not perform any UID/GID-based checks internally
and fully trusts the kernel to do that (yes, it's almost the same as 2.)

I have prepared a bunch of real-world examples of the user space modifications
that can be done to use this extension:
- libfuse support
https://github.com/mihalicyn/libfuse/commits/idmap_support
- fuse-overlayfs support:
https://github.com/mihalicyn/fuse-overlayfs/commits/idmap_support
- cephfs-fuse conversion example
https://github.com/mihalicyn/ceph/commits/fuse_idmap
- glusterfs conversion example (there is a conceptual issue)
https://github.com/mihalicyn/glusterfs/commits/fuse_idmap

The glusterfs is a bit problematic, unfortunately, because even if the glusterfs
superblock was mounted with the "default_permissions" parameter (1 and 2 conditions
are satisfied), it fails to satisfy the 3rd condition. The glusterfs fuse daemon sends
caller UIDs/GIDs over the wire and all the permission checks are done twice (first
on the client side (in the fuse kernel module) and second on the glusterfs server side).
Just for demonstration's sake, I found a hacky (but working) solution for glusterfs
that disables these server-side permission checks (see [1]). This allows you to play
with the filesystem and idmapped mounts and it works just fine.

The problem described above is the main problem that we can meet when
working on idmapped mounts support for network-based filesystems (or network-like filesystems
like fuse). When people look at the idmapped mounts feature at first they tend to think
that idmaps are for faking caller UIDs/GIDs, but that's not the case. There was a big
discussion about this in the "ceph: support idmapped mounts" patch series [2], [3].
The brief outcome from this discussion is that we don't want and don't have to fool
filesystem code and map a caller's UID/GID everywhere, but only in VFS i_op's
which are provided with a "struct mnt_idmap *idmap"). For example ->lookup()
callback is not provided with it and that's on purpose! We don't expect the low-level
filesystem code to do any permissions checks inside this callback because everything
was already checked on the higher level (see may_lookup() helper). For local filesystems
this assumption works like a charm, but for network-based, unfortunately, not.
For example, the cephfs kernel client *always* send called UID/GID with *any* request
(->lookup included!) and then *may* (depending on the MDS configuration) perform any
permissions checks on the server side based on these values, which obviously leads
to issues/inconsistencies if VFS idmaps are involved.

Fuse filesystem very-very close to cephfs example, because we have req->in.h.uid/req->in.h.gid
and these values are present in all fuse requests and userspace may use them as it wants.

All of the above explains why we have a "default_permissions" requirement. If filesystem
does not use it, then permission checks will be widespread across all the i_op's like
->lookup, ->unlink, ->readlink instead of being consolidated in the one place (->permission callback).

In this series, my approach is the same as in cephfs [4], [5]. Don't touch req->in.h.uid/req->in.h.gid values
at all (because we can't properly idmap them as we don't have "struct mnt_idmap *idmap" everywhere),
instead, provide the userspace with a new optional (FUSE_OWNER_UID_GID_EXT) UID/GID suitable
only for ->mknod, ->mkdir, ->symlink, ->atomic_open and these values have to be used as the
owner UID and GID for newly created inodes.

Things to discuss:
- we enable idmapped mounts support only if "default_permissions" mode is enabled,
because otherwise, we would need to deal with UID/GID mappings on the userspace side OR
provide the userspace with idmapped req->in.h.uid/req->in.h.gid values which is not
something that we probably want to do. Idmapped mounts philosophy is not about faking
caller uid/gid.

- We have a small offlist discussion with Christian about adding fs_type->allow_idmap
hook. Christian pointed out that it would be nice to have a superblock flag instead like
SB_I_NOIDMAP and we can set this flag during mount time if we see that the filesystem does not
support idmappings. But, unfortunately, I didn't succeed here because the kernel will
know if the filesystem supports idmapping or not after FUSE_INIT request, but FUSE_INIT request
is being sent at the end of the mounting process, so the mount and superblock will exist and
visible by the userspace in that time. It seems like setting SB_I_NOIDMAP flag, in this
case, is too late as a user may do the trick by creating an idmapped mount while it wasn't
restricted by SB_I_NOIDMAP. Alternatively, we can introduce a "positive" version SB_I_ALLOWIDMAP
and a "weak" version of FS_ALLOW_IDMAP like FS_MAY_ALLOW_IDMAP. So if FS_MAY_ALLOW_IDMAP is set,
then SB_I_ALLOWIDMAP has to be set on the superblock to allow the creation of an idmapped mount.
But that's a matter of our discussion.

How to play with it:
1. take any patched filesystem from the list (fuse-overlayfs, cephfs-fuse, glusterfs) and mount it
2. ./mount-idmapped --map-mount b:1000:0:2 /mnt/my_fuse_mount /mnt/my_fuse_mount_idmapped
(maps UID/GIDs as 1000 -> 0, 1001 -> 1)
[ taken from https://raw.githubusercontent.com/brauner/mount-idmapped/master/mount-idmapped.c ]

[1] https://github.com/mihalicyn/glusterfs/commit/ab3ec2c7cbe22618cba9cc94a52a492b1904d0b2
[2] https://lore.kernel.org/lkml/[email protected]/
[3] https://lore.kernel.org/lkml/CAEivzxfw1fHO2TFA4dx3u23ZKK6Q+EThfzuibrhA3RKM=ZOYLg@mail.gmail.com/
[4] https://github.com/ceph/ceph/pull/52575
[5] https://lore.kernel.org/all/[email protected]/

Thanks!
Alex

Cc: Christian Brauner <[email protected]>
Cc: Seth Forshee <[email protected]>
Cc: Miklos Szeredi <[email protected]>
Cc: Amir Goldstein <[email protected]>
Cc: Bernd Schubert <[email protected]>
Cc: <[email protected]>
Signed-off-by: Alexander Mikhalitsyn <[email protected]>

Alexander Mikhalitsyn (9):
fs/namespace: introduce fs_type->allow_idmap hook
fs/fuse: add FUSE_OWNER_UID_GID_EXT extension
fs/fuse: support idmap for mkdir/mknod/symlink/create
fs/fuse: support idmapped getattr inode op
fs/fuse: support idmapped ->permission inode op
fs/fuse: support idmapped ->setattr op
fs/fuse: drop idmap argument from __fuse_get_acl
fs/fuse: support idmapped ->set_acl
fs/fuse: allow idmapped mounts

fs/fuse/acl.c | 10 ++-
fs/fuse/dir.c | 143 +++++++++++++++++++++++++-------------
fs/fuse/file.c | 2 +-
fs/fuse/fuse_i.h | 10 ++-
fs/fuse/inode.c | 24 ++++++-
fs/namespace.c | 3 +-
include/linux/fs.h | 1 +
include/uapi/linux/fuse.h | 24 ++++++-
8 files changed, 153 insertions(+), 64 deletions(-)

--
2.34.1



2024-01-08 12:10:28

by Aleksandr Mikhalitsyn

[permalink] [raw]
Subject: [PATCH v1 1/9] fs/namespace: introduce fs_type->allow_idmap hook

Right now we determine if filesystem support vfs idmappings
or not basing on the FS_ALLOW_IDMAP flag presence. This
"static" way works perfecly well for local filesystems
like ext4, xfs, btrfs, etc. But for network-like filesystems
like fuse, cephfs this approach is not ideal, because sometimes
proper support of vfs idmaps requires some extensions for the on-wire
protocol, which implies that changes have to be made not only
in the Linux kernel code but also in the 3rd party components like
libfuse, cephfs MDS server and so on.

We have seen that issue during our work on cephfs idmapped mounts [1]
with Christian, but right now I'm working on the idmapped mounts
support for fuse and I think that it is a right time for this extension.

[1] 5ccd8530dd7 ("ceph: handle idmapped mounts in create_request_message()")

Cc: Christian Brauner <[email protected]>
Cc: Seth Forshee <[email protected]>
Cc: Miklos Szeredi <[email protected]>
Cc: Amir Goldstein <[email protected]>
Cc: Bernd Schubert <[email protected]>
Cc: <[email protected]>
Signed-off-by: Alexander Mikhalitsyn <[email protected]>
---
fs/namespace.c | 3 ++-
include/linux/fs.h | 1 +
2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/fs/namespace.c b/fs/namespace.c
index fbf0e596fcd3..02eb47b3d728 100644
--- a/fs/namespace.c
+++ b/fs/namespace.c
@@ -4300,7 +4300,8 @@ static int can_idmap_mount(const struct mount_kattr *kattr, struct mount *mnt)
return -EPERM;

/* The underlying filesystem doesn't support idmapped mounts yet. */
- if (!(m->mnt_sb->s_type->fs_flags & FS_ALLOW_IDMAP))
+ if (!(m->mnt_sb->s_type->fs_flags & FS_ALLOW_IDMAP) ||
+ (m->mnt_sb->s_type->allow_idmap && !m->mnt_sb->s_type->allow_idmap(m->mnt_sb)))
return -EINVAL;

/* We're not controlling the superblock. */
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 98b7a7a8c42e..f2e373b5420a 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2377,6 +2377,7 @@ struct file_system_type {
#define FS_RENAME_DOES_D_MOVE 32768 /* FS will handle d_move() during rename() internally. */
int (*init_fs_context)(struct fs_context *);
const struct fs_parameter_spec *parameters;
+ bool (*allow_idmap)(struct super_block *);
struct dentry *(*mount) (struct file_system_type *, int,
const char *, void *);
void (*kill_sb) (struct super_block *);
--
2.34.1


2024-01-08 12:10:48

by Aleksandr Mikhalitsyn

[permalink] [raw]
Subject: [PATCH v1 2/9] fs/fuse: add FUSE_OWNER_UID_GID_EXT extension

To properly support vfs idmappings we need to provide
a fuse daemon with the correct owner uid/gid for
inode creation requests like mkdir, mknod, atomic_open,
symlink.

Right now, fuse daemons use req->in.h.uid/req->in.h.gid
to set inode owner. These fields contain fsuid/fsgid of the
syscall's caller. And that's perfectly fine, because inode
owner have to be set to these values. But, for idmapped mounts
it's not the case and caller fsuid/fsgid != inode owner, because
idmapped mounts do nothing with the caller fsuid/fsgid, but
affect inode owner uid/gid. It means that we can't apply vfsid
mapping to caller fsuid/fsgid, but instead we have to introduce
a new fields to store inode owner uid/gid which will be appropriately
transformed.

Christian and I have done the same to support idmapped mounts in
the cephfs recently [1].

[1] 5ccd8530 ("ceph: handle idmapped mounts in create_request_message()")

Cc: Miklos Szeredi <[email protected]>
Cc: Christian Brauner <[email protected]>
Cc: Seth Forshee <[email protected]>
Cc: Amir Goldstein <[email protected]>
Cc: Bernd Schubert <[email protected]>
Cc: <[email protected]>
Signed-off-by: Alexander Mikhalitsyn <[email protected]>
---
fs/fuse/dir.c | 34 +++++++++++++++++++++++++++++++---
fs/fuse/fuse_i.h | 3 +++
fs/fuse/inode.c | 4 +++-
include/uapi/linux/fuse.h | 19 +++++++++++++++++++
4 files changed, 56 insertions(+), 4 deletions(-)

diff --git a/fs/fuse/dir.c b/fs/fuse/dir.c
index 6f5f9ff95380..e78ad4742aef 100644
--- a/fs/fuse/dir.c
+++ b/fs/fuse/dir.c
@@ -568,7 +568,33 @@ static int get_create_supp_group(struct inode *dir, struct fuse_in_arg *ext)
return 0;
}

-static int get_create_ext(struct fuse_args *args,
+static int get_owner_uid_gid(struct mnt_idmap *idmap, struct fuse_conn *fc, struct fuse_in_arg *ext)
+{
+ struct fuse_ext_header *xh;
+ struct fuse_owner_uid_gid *owner_creds;
+ u32 owner_creds_len = fuse_ext_size(sizeof(*owner_creds));
+ kuid_t owner_fsuid;
+ kgid_t owner_fsgid;
+
+ xh = extend_arg(ext, owner_creds_len);
+ if (!xh)
+ return -ENOMEM;
+
+ xh->size = owner_creds_len;
+ xh->type = FUSE_EXT_OWNER_UID_GID;
+
+ owner_creds = (struct fuse_owner_uid_gid *) &xh[1];
+
+ owner_fsuid = mapped_fsuid(idmap, fc->user_ns);
+ owner_fsgid = mapped_fsgid(idmap, fc->user_ns);
+ owner_creds->uid = from_kuid(fc->user_ns, owner_fsuid);
+ owner_creds->gid = from_kgid(fc->user_ns, owner_fsgid);
+
+ return 0;
+}
+
+static int get_create_ext(struct mnt_idmap *idmap,
+ struct fuse_args *args,
struct inode *dir, struct dentry *dentry,
umode_t mode)
{
@@ -580,6 +606,8 @@ static int get_create_ext(struct fuse_args *args,
err = get_security_context(dentry, mode, &ext);
if (!err && fc->create_supp_group)
err = get_create_supp_group(dir, &ext);
+ if (!err && fc->owner_uid_gid_ext)
+ err = get_owner_uid_gid(idmap, fc, &ext);

if (!err && ext.size) {
WARN_ON(args->in_numargs >= ARRAY_SIZE(args->in_args));
@@ -662,7 +690,7 @@ static int fuse_create_open(struct inode *dir, struct dentry *entry,
args.out_args[1].size = sizeof(outopen);
args.out_args[1].value = &outopen;

- err = get_create_ext(&args, dir, entry, mode);
+ err = get_create_ext(&nop_mnt_idmap, &args, dir, entry, mode);
if (err)
goto out_put_forget_req;

@@ -790,7 +818,7 @@ static int create_new_entry(struct fuse_mount *fm, struct fuse_args *args,
args->out_args[0].value = &outarg;

if (args->opcode != FUSE_LINK) {
- err = get_create_ext(args, dir, entry, mode);
+ err = get_create_ext(&nop_mnt_idmap, args, dir, entry, mode);
if (err)
goto out_put_forget_req;
}
diff --git a/fs/fuse/fuse_i.h b/fs/fuse/fuse_i.h
index 1df83eebda92..15ec95dea276 100644
--- a/fs/fuse/fuse_i.h
+++ b/fs/fuse/fuse_i.h
@@ -806,6 +806,9 @@ struct fuse_conn {
/* Add supplementary group info when creating a new inode */
unsigned int create_supp_group:1;

+ /* Add owner_{u,g}id info when creating a new inode */
+ unsigned int owner_uid_gid_ext:1;
+
/* Does the filesystem support per inode DAX? */
unsigned int inode_dax:1;

diff --git a/fs/fuse/inode.c b/fs/fuse/inode.c
index ab824a8908b7..08cd3714b32d 100644
--- a/fs/fuse/inode.c
+++ b/fs/fuse/inode.c
@@ -1284,6 +1284,8 @@ static void process_init_reply(struct fuse_mount *fm, struct fuse_args *args,
fc->create_supp_group = 1;
if (flags & FUSE_DIRECT_IO_ALLOW_MMAP)
fc->direct_io_allow_mmap = 1;
+ if (flags & FUSE_OWNER_UID_GID_EXT)
+ fc->owner_uid_gid_ext = 1;
} else {
ra_pages = fc->max_read / PAGE_SIZE;
fc->no_lock = 1;
@@ -1330,7 +1332,7 @@ void fuse_send_init(struct fuse_mount *fm)
FUSE_NO_OPENDIR_SUPPORT | FUSE_EXPLICIT_INVAL_DATA |
FUSE_HANDLE_KILLPRIV_V2 | FUSE_SETXATTR_EXT | FUSE_INIT_EXT |
FUSE_SECURITY_CTX | FUSE_CREATE_SUPP_GROUP |
- FUSE_HAS_EXPIRE_ONLY | FUSE_DIRECT_IO_ALLOW_MMAP;
+ FUSE_HAS_EXPIRE_ONLY | FUSE_DIRECT_IO_ALLOW_MMAP | FUSE_OWNER_UID_GID_EXT;
#ifdef CONFIG_FUSE_DAX
if (fm->fc->dax)
flags |= FUSE_MAP_ALIGNMENT;
diff --git a/include/uapi/linux/fuse.h b/include/uapi/linux/fuse.h
index e7418d15fe39..ebe82104b172 100644
--- a/include/uapi/linux/fuse.h
+++ b/include/uapi/linux/fuse.h
@@ -211,6 +211,10 @@
* 7.39
* - add FUSE_DIRECT_IO_ALLOW_MMAP
* - add FUSE_STATX and related structures
+ *
+ * 7.40
+ * - add FUSE_EXT_OWNER_UID_GID
+ * - add FUSE_OWNER_UID_GID_EXT
*/

#ifndef _LINUX_FUSE_H
@@ -410,6 +414,8 @@ struct fuse_file_lock {
* symlink and mknod (single group that matches parent)
* FUSE_HAS_EXPIRE_ONLY: kernel supports expiry-only entry invalidation
* FUSE_DIRECT_IO_ALLOW_MMAP: allow shared mmap in FOPEN_DIRECT_IO mode.
+ * FUSE_OWNER_UID_GID_EXT: add inode owner UID/GID info to create, mkdir,
+ * symlink and mknod
*/
#define FUSE_ASYNC_READ (1 << 0)
#define FUSE_POSIX_LOCKS (1 << 1)
@@ -452,6 +458,7 @@ struct fuse_file_lock {

/* Obsolete alias for FUSE_DIRECT_IO_ALLOW_MMAP */
#define FUSE_DIRECT_IO_RELAX FUSE_DIRECT_IO_ALLOW_MMAP
+#define FUSE_OWNER_UID_GID_EXT (1ULL << 37)

/**
* CUSE INIT request/reply flags
@@ -561,11 +568,13 @@ struct fuse_file_lock {
* extension type
* FUSE_MAX_NR_SECCTX: maximum value of &fuse_secctx_header.nr_secctx
* FUSE_EXT_GROUPS: &fuse_supp_groups extension
+ * FUSE_EXT_OWNER_UID_GID: &fuse_owner_uid_gid extension
*/
enum fuse_ext_type {
/* Types 0..31 are reserved for fuse_secctx_header */
FUSE_MAX_NR_SECCTX = 31,
FUSE_EXT_GROUPS = 32,
+ FUSE_EXT_OWNER_UID_GID = 33,
};

enum fuse_opcode {
@@ -1153,4 +1162,14 @@ struct fuse_supp_groups {
uint32_t groups[];
};

+/**
+ * struct fuse_owner_uid_gid - Inode owner UID/GID extension
+ * @uid: inode owner UID
+ * @gid: inode owner GID
+ */
+struct fuse_owner_uid_gid {
+ uint32_t uid;
+ uint32_t gid;
+};
+
#endif /* _LINUX_FUSE_H */
--
2.34.1


2024-01-08 12:11:04

by Aleksandr Mikhalitsyn

[permalink] [raw]
Subject: [PATCH v1 3/9] fs/fuse: support idmap for mkdir/mknod/symlink/create

We have all the infrastructure in place, we just need
to pass an idmapping here.

Cc: Christian Brauner <[email protected]>
Cc: Seth Forshee <[email protected]>
Cc: Miklos Szeredi <[email protected]>
Cc: Amir Goldstein <[email protected]>
Cc: Bernd Schubert <[email protected]>
Cc: <[email protected]>
Signed-off-by: Alexander Mikhalitsyn <[email protected]>
---
fs/fuse/dir.c | 33 +++++++++++++++++----------------
1 file changed, 17 insertions(+), 16 deletions(-)

diff --git a/fs/fuse/dir.c b/fs/fuse/dir.c
index e78ad4742aef..a0968f086b62 100644
--- a/fs/fuse/dir.c
+++ b/fs/fuse/dir.c
@@ -633,9 +633,9 @@ static void free_ext_value(struct fuse_args *args)
* If the filesystem doesn't support this, then fall back to separate
* 'mknod' + 'open' requests.
*/
-static int fuse_create_open(struct inode *dir, struct dentry *entry,
- struct file *file, unsigned int flags,
- umode_t mode, u32 opcode)
+static int fuse_create_open(struct mnt_idmap *idmap, struct inode *dir,
+ struct dentry *entry, struct file *file,
+ unsigned int flags, umode_t mode, u32 opcode)
{
int err;
struct inode *inode;
@@ -690,7 +690,7 @@ static int fuse_create_open(struct inode *dir, struct dentry *entry,
args.out_args[1].size = sizeof(outopen);
args.out_args[1].value = &outopen;

- err = get_create_ext(&nop_mnt_idmap, &args, dir, entry, mode);
+ err = get_create_ext(idmap, &args, dir, entry, mode);
if (err)
goto out_put_forget_req;

@@ -749,6 +749,7 @@ static int fuse_atomic_open(struct inode *dir, struct dentry *entry,
umode_t mode)
{
int err;
+ struct mnt_idmap *idmap = file_mnt_idmap(file);
struct fuse_conn *fc = get_fuse_conn(dir);
struct dentry *res = NULL;

@@ -773,7 +774,7 @@ static int fuse_atomic_open(struct inode *dir, struct dentry *entry,
if (fc->no_create)
goto mknod;

- err = fuse_create_open(dir, entry, file, flags, mode, FUSE_CREATE);
+ err = fuse_create_open(idmap, dir, entry, file, flags, mode, FUSE_CREATE);
if (err == -ENOSYS) {
fc->no_create = 1;
goto mknod;
@@ -784,7 +785,7 @@ static int fuse_atomic_open(struct inode *dir, struct dentry *entry,
return err;

mknod:
- err = fuse_mknod(&nop_mnt_idmap, dir, entry, mode, 0);
+ err = fuse_mknod(idmap, dir, entry, mode, 0);
if (err)
goto out_dput;
no_open:
@@ -794,9 +795,9 @@ static int fuse_atomic_open(struct inode *dir, struct dentry *entry,
/*
* Code shared between mknod, mkdir, symlink and link
*/
-static int create_new_entry(struct fuse_mount *fm, struct fuse_args *args,
- struct inode *dir, struct dentry *entry,
- umode_t mode)
+static int create_new_entry(struct mnt_idmap *idmap, struct fuse_mount *fm,
+ struct fuse_args *args, struct inode *dir,
+ struct dentry *entry, umode_t mode)
{
struct fuse_entry_out outarg;
struct inode *inode;
@@ -818,7 +819,7 @@ static int create_new_entry(struct fuse_mount *fm, struct fuse_args *args,
args->out_args[0].value = &outarg;

if (args->opcode != FUSE_LINK) {
- err = get_create_ext(&nop_mnt_idmap, args, dir, entry, mode);
+ err = get_create_ext(idmap, args, dir, entry, mode);
if (err)
goto out_put_forget_req;
}
@@ -884,13 +885,13 @@ static int fuse_mknod(struct mnt_idmap *idmap, struct inode *dir,
args.in_args[0].value = &inarg;
args.in_args[1].size = entry->d_name.len + 1;
args.in_args[1].value = entry->d_name.name;
- return create_new_entry(fm, &args, dir, entry, mode);
+ return create_new_entry(idmap, fm, &args, dir, entry, mode);
}

static int fuse_create(struct mnt_idmap *idmap, struct inode *dir,
struct dentry *entry, umode_t mode, bool excl)
{
- return fuse_mknod(&nop_mnt_idmap, dir, entry, mode, 0);
+ return fuse_mknod(idmap, dir, entry, mode, 0);
}

static int fuse_tmpfile(struct mnt_idmap *idmap, struct inode *dir,
@@ -902,7 +903,7 @@ static int fuse_tmpfile(struct mnt_idmap *idmap, struct inode *dir,
if (fc->no_tmpfile)
return -EOPNOTSUPP;

- err = fuse_create_open(dir, file->f_path.dentry, file, file->f_flags, mode, FUSE_TMPFILE);
+ err = fuse_create_open(idmap, dir, file->f_path.dentry, file, file->f_flags, mode, FUSE_TMPFILE);
if (err == -ENOSYS) {
fc->no_tmpfile = 1;
err = -EOPNOTSUPP;
@@ -929,7 +930,7 @@ static int fuse_mkdir(struct mnt_idmap *idmap, struct inode *dir,
args.in_args[0].value = &inarg;
args.in_args[1].size = entry->d_name.len + 1;
args.in_args[1].value = entry->d_name.name;
- return create_new_entry(fm, &args, dir, entry, S_IFDIR);
+ return create_new_entry(idmap, fm, &args, dir, entry, S_IFDIR);
}

static int fuse_symlink(struct mnt_idmap *idmap, struct inode *dir,
@@ -945,7 +946,7 @@ static int fuse_symlink(struct mnt_idmap *idmap, struct inode *dir,
args.in_args[0].value = entry->d_name.name;
args.in_args[1].size = len;
args.in_args[1].value = link;
- return create_new_entry(fm, &args, dir, entry, S_IFLNK);
+ return create_new_entry(idmap, fm, &args, dir, entry, S_IFLNK);
}

void fuse_flush_time_update(struct inode *inode)
@@ -1139,7 +1140,7 @@ static int fuse_link(struct dentry *entry, struct inode *newdir,
args.in_args[0].value = &inarg;
args.in_args[1].size = newent->d_name.len + 1;
args.in_args[1].value = newent->d_name.name;
- err = create_new_entry(fm, &args, newdir, newent, inode->i_mode);
+ err = create_new_entry(&nop_mnt_idmap, fm, &args, newdir, newent, inode->i_mode);
if (!err)
fuse_update_ctime_in_cache(inode);
else if (err == -EINTR)
--
2.34.1


2024-01-08 12:11:30

by Aleksandr Mikhalitsyn

[permalink] [raw]
Subject: [PATCH v1 4/9] fs/fuse: support idmapped getattr inode op

We have to:
- pass an idmapping to the generic_fillattr()
to properly handle UIG/GID mapping for the userspace.
- pass -/- to fuse_fillattr() (analog of generic_fillattr() in fuse).

Difference between these two is that generic_fillattr() takes all
the stat() data from the inode directly, while fuse_fillattr() codepath
takes a fresh data just from the userspace reply on the FUSE_GETATTR request.

Cc: Christian Brauner <[email protected]>
Cc: Seth Forshee <[email protected]>
Cc: Miklos Szeredi <[email protected]>
Cc: Amir Goldstein <[email protected]>
Cc: Bernd Schubert <[email protected]>
Cc: <[email protected]>
Signed-off-by: Alexander Mikhalitsyn <[email protected]>
---
fs/fuse/dir.c | 44 ++++++++++++++++++++++++--------------------
1 file changed, 24 insertions(+), 20 deletions(-)

diff --git a/fs/fuse/dir.c b/fs/fuse/dir.c
index a0968f086b62..5efcf06622f0 100644
--- a/fs/fuse/dir.c
+++ b/fs/fuse/dir.c
@@ -1149,18 +1149,22 @@ static int fuse_link(struct dentry *entry, struct inode *newdir,
return err;
}

-static void fuse_fillattr(struct inode *inode, struct fuse_attr *attr,
- struct kstat *stat)
+static void fuse_fillattr(struct mnt_idmap *idmap, struct inode *inode,
+ struct fuse_attr *attr, struct kstat *stat)
{
unsigned int blkbits;
struct fuse_conn *fc = get_fuse_conn(inode);
+ vfsuid_t vfsuid = make_vfsuid(idmap, fc->user_ns,
+ make_kuid(fc->user_ns, attr->uid));
+ vfsgid_t vfsgid = make_vfsgid(idmap, fc->user_ns,
+ make_kgid(fc->user_ns, attr->gid));

stat->dev = inode->i_sb->s_dev;
stat->ino = attr->ino;
stat->mode = (inode->i_mode & S_IFMT) | (attr->mode & 07777);
stat->nlink = attr->nlink;
- stat->uid = make_kuid(fc->user_ns, attr->uid);
- stat->gid = make_kgid(fc->user_ns, attr->gid);
+ stat->uid = vfsuid_into_kuid(vfsuid);
+ stat->gid = vfsgid_into_kgid(vfsgid);
stat->rdev = inode->i_rdev;
stat->atime.tv_sec = attr->atime;
stat->atime.tv_nsec = attr->atimensec;
@@ -1199,8 +1203,8 @@ static void fuse_statx_to_attr(struct fuse_statx *sx, struct fuse_attr *attr)
attr->blksize = sx->blksize;
}

-static int fuse_do_statx(struct inode *inode, struct file *file,
- struct kstat *stat)
+static int fuse_do_statx(struct mnt_idmap *idmap, struct inode *inode,
+ struct file *file, struct kstat *stat)
{
int err;
struct fuse_attr attr;
@@ -1253,15 +1257,15 @@ static int fuse_do_statx(struct inode *inode, struct file *file,
stat->result_mask = sx->mask & (STATX_BASIC_STATS | STATX_BTIME);
stat->btime.tv_sec = sx->btime.tv_sec;
stat->btime.tv_nsec = min_t(u32, sx->btime.tv_nsec, NSEC_PER_SEC - 1);
- fuse_fillattr(inode, &attr, stat);
+ fuse_fillattr(idmap, inode, &attr, stat);
stat->result_mask |= STATX_TYPE;
}

return 0;
}

-static int fuse_do_getattr(struct inode *inode, struct kstat *stat,
- struct file *file)
+static int fuse_do_getattr(struct mnt_idmap *idmap, struct inode *inode,
+ struct kstat *stat, struct file *file)
{
int err;
struct fuse_getattr_in inarg;
@@ -1300,15 +1304,15 @@ static int fuse_do_getattr(struct inode *inode, struct kstat *stat,
ATTR_TIMEOUT(&outarg),
attr_version);
if (stat)
- fuse_fillattr(inode, &outarg.attr, stat);
+ fuse_fillattr(idmap, inode, &outarg.attr, stat);
}
}
return err;
}

-static int fuse_update_get_attr(struct inode *inode, struct file *file,
- struct kstat *stat, u32 request_mask,
- unsigned int flags)
+static int fuse_update_get_attr(struct mnt_idmap *idmap, struct inode *inode,
+ struct file *file, struct kstat *stat,
+ u32 request_mask, unsigned int flags)
{
struct fuse_inode *fi = get_fuse_inode(inode);
struct fuse_conn *fc = get_fuse_conn(inode);
@@ -1339,16 +1343,16 @@ static int fuse_update_get_attr(struct inode *inode, struct file *file,
forget_all_cached_acls(inode);
/* Try statx if BTIME is requested */
if (!fc->no_statx && (request_mask & ~STATX_BASIC_STATS)) {
- err = fuse_do_statx(inode, file, stat);
+ err = fuse_do_statx(idmap, inode, file, stat);
if (err == -ENOSYS) {
fc->no_statx = 1;
goto retry;
}
} else {
- err = fuse_do_getattr(inode, stat, file);
+ err = fuse_do_getattr(idmap, inode, stat, file);
}
} else if (stat) {
- generic_fillattr(&nop_mnt_idmap, request_mask, inode, stat);
+ generic_fillattr(idmap, request_mask, inode, stat);
stat->mode = fi->orig_i_mode;
stat->ino = fi->orig_ino;
if (test_bit(FUSE_I_BTIME, &fi->state)) {
@@ -1362,7 +1366,7 @@ static int fuse_update_get_attr(struct inode *inode, struct file *file,

int fuse_update_attributes(struct inode *inode, struct file *file, u32 mask)
{
- return fuse_update_get_attr(inode, file, NULL, mask, 0);
+ return fuse_update_get_attr(&nop_mnt_idmap, inode, file, NULL, mask, 0);
}

int fuse_reverse_inval_entry(struct fuse_conn *fc, u64 parent_nodeid,
@@ -1506,7 +1510,7 @@ static int fuse_perm_getattr(struct inode *inode, int mask)
return -ECHILD;

forget_all_cached_acls(inode);
- return fuse_do_getattr(inode, NULL, NULL);
+ return fuse_do_getattr(&nop_mnt_idmap, inode, NULL, NULL);
}

/*
@@ -2062,7 +2066,7 @@ static int fuse_setattr(struct mnt_idmap *idmap, struct dentry *entry,
* ia_mode calculation may have used stale i_mode.
* Refresh and recalculate.
*/
- ret = fuse_do_getattr(inode, NULL, file);
+ ret = fuse_do_getattr(&nop_mnt_idmap, inode, NULL, file);
if (ret)
return ret;

@@ -2119,7 +2123,7 @@ static int fuse_getattr(struct mnt_idmap *idmap,
return -EACCES;
}

- return fuse_update_get_attr(inode, NULL, stat, request_mask, flags);
+ return fuse_update_get_attr(idmap, inode, NULL, stat, request_mask, flags);
}

static const struct inode_operations fuse_dir_inode_operations = {
--
2.34.1


2024-01-08 12:12:05

by Aleksandr Mikhalitsyn

[permalink] [raw]
Subject: [PATCH v1 6/9] fs/fuse: support idmapped ->setattr op

Cc: Christian Brauner <[email protected]>
Cc: Seth Forshee <[email protected]>
Cc: Miklos Szeredi <[email protected]>
Cc: Amir Goldstein <[email protected]>
Cc: Bernd Schubert <[email protected]>
Cc: <[email protected]>
Signed-off-by: Alexander Mikhalitsyn <[email protected]>
---
fs/fuse/dir.c | 32 +++++++++++++++++++++-----------
fs/fuse/file.c | 2 +-
fs/fuse/fuse_i.h | 4 ++--
3 files changed, 24 insertions(+), 14 deletions(-)

diff --git a/fs/fuse/dir.c b/fs/fuse/dir.c
index f7c2c54f7122..5fbb7100ad1c 100644
--- a/fs/fuse/dir.c
+++ b/fs/fuse/dir.c
@@ -1739,17 +1739,27 @@ static bool update_mtime(unsigned ivalid, bool trust_local_mtime)
return true;
}

-static void iattr_to_fattr(struct fuse_conn *fc, struct iattr *iattr,
- struct fuse_setattr_in *arg, bool trust_local_cmtime)
+static void iattr_to_fattr(struct mnt_idmap *idmap, struct fuse_conn *fc,
+ struct iattr *iattr, struct fuse_setattr_in *arg,
+ bool trust_local_cmtime)
{
unsigned ivalid = iattr->ia_valid;

if (ivalid & ATTR_MODE)
arg->valid |= FATTR_MODE, arg->mode = iattr->ia_mode;
- if (ivalid & ATTR_UID)
- arg->valid |= FATTR_UID, arg->uid = from_kuid(fc->user_ns, iattr->ia_uid);
- if (ivalid & ATTR_GID)
- arg->valid |= FATTR_GID, arg->gid = from_kgid(fc->user_ns, iattr->ia_gid);
+
+ if (ivalid & ATTR_UID) {
+ kuid_t fsuid = from_vfsuid(idmap, fc->user_ns, iattr->ia_vfsuid);
+ arg->valid |= FATTR_UID;
+ arg->uid = from_kuid(fc->user_ns, fsuid);
+ }
+
+ if (ivalid & ATTR_GID) {
+ kgid_t fsgid = from_vfsgid(idmap, fc->user_ns, iattr->ia_vfsgid);
+ arg->valid |= FATTR_GID;
+ arg->gid = from_kgid(fc->user_ns, fsgid);
+ }
+
if (ivalid & ATTR_SIZE)
arg->valid |= FATTR_SIZE, arg->size = iattr->ia_size;
if (ivalid & ATTR_ATIME) {
@@ -1869,8 +1879,8 @@ int fuse_flush_times(struct inode *inode, struct fuse_file *ff)
* vmtruncate() doesn't allow for this case, so do the rlimit checking
* and the actual truncation by hand.
*/
-int fuse_do_setattr(struct dentry *dentry, struct iattr *attr,
- struct file *file)
+int fuse_do_setattr(struct mnt_idmap *idmap, struct dentry *dentry,
+ struct iattr *attr, struct file *file)
{
struct inode *inode = d_inode(dentry);
struct fuse_mount *fm = get_fuse_mount(inode);
@@ -1890,7 +1900,7 @@ int fuse_do_setattr(struct dentry *dentry, struct iattr *attr,
if (!fc->default_permissions)
attr->ia_valid |= ATTR_FORCE;

- err = setattr_prepare(&nop_mnt_idmap, dentry, attr);
+ err = setattr_prepare(idmap, dentry, attr);
if (err)
return err;

@@ -1949,7 +1959,7 @@ int fuse_do_setattr(struct dentry *dentry, struct iattr *attr,

memset(&inarg, 0, sizeof(inarg));
memset(&outarg, 0, sizeof(outarg));
- iattr_to_fattr(fc, attr, &inarg, trust_local_cmtime);
+ iattr_to_fattr(idmap, fc, attr, &inarg, trust_local_cmtime);
if (file) {
struct fuse_file *ff = file->private_data;
inarg.valid |= FATTR_FH;
@@ -2084,7 +2094,7 @@ static int fuse_setattr(struct mnt_idmap *idmap, struct dentry *entry,
if (!attr->ia_valid)
return 0;

- ret = fuse_do_setattr(entry, attr, file);
+ ret = fuse_do_setattr(idmap, entry, attr, file);
if (!ret) {
/*
* If filesystem supports acls it may have updated acl xattrs in
diff --git a/fs/fuse/file.c b/fs/fuse/file.c
index a660f1f21540..e0fe5497a548 100644
--- a/fs/fuse/file.c
+++ b/fs/fuse/file.c
@@ -2870,7 +2870,7 @@ static void fuse_do_truncate(struct file *file)
attr.ia_file = file;
attr.ia_valid |= ATTR_FILE;

- fuse_do_setattr(file_dentry(file), &attr, file);
+ fuse_do_setattr(&nop_mnt_idmap, file_dentry(file), &attr, file);
}

static inline loff_t fuse_round_up(struct fuse_conn *fc, loff_t off)
diff --git a/fs/fuse/fuse_i.h b/fs/fuse/fuse_i.h
index 15ec95dea276..94b25ea5344a 100644
--- a/fs/fuse/fuse_i.h
+++ b/fs/fuse/fuse_i.h
@@ -1288,8 +1288,8 @@ bool fuse_write_update_attr(struct inode *inode, loff_t pos, ssize_t written);
int fuse_flush_times(struct inode *inode, struct fuse_file *ff);
int fuse_write_inode(struct inode *inode, struct writeback_control *wbc);

-int fuse_do_setattr(struct dentry *dentry, struct iattr *attr,
- struct file *file);
+int fuse_do_setattr(struct mnt_idmap *idmap, struct dentry *dentry,
+ struct iattr *attr, struct file *file);

void fuse_set_initialized(struct fuse_conn *fc);

--
2.34.1


2024-01-08 12:12:23

by Aleksandr Mikhalitsyn

[permalink] [raw]
Subject: [PATCH v1 5/9] fs/fuse: support idmapped ->permission inode op

We only cover the case when "default_permissions" flag
is used. A reason for that is that otherwise all the permission
checks are done in the userspace and we have to deal with
VFS idmapping in the userspace (which is bad), alternatively
we have to provide the userspace with idmapped req->in.h.uid/req->in.h.gid
which is also not align with VFS idmaps philosophy.

Cc: Christian Brauner <[email protected]>
Cc: Seth Forshee <[email protected]>
Cc: Miklos Szeredi <[email protected]>
Cc: Amir Goldstein <[email protected]>
Cc: Bernd Schubert <[email protected]>
Cc: <[email protected]>
Signed-off-by: Alexander Mikhalitsyn <[email protected]>
---
fs/fuse/dir.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/fs/fuse/dir.c b/fs/fuse/dir.c
index 5efcf06622f0..f7c2c54f7122 100644
--- a/fs/fuse/dir.c
+++ b/fs/fuse/dir.c
@@ -1558,7 +1558,7 @@ static int fuse_permission(struct mnt_idmap *idmap,
}

if (fc->default_permissions) {
- err = generic_permission(&nop_mnt_idmap, inode, mask);
+ err = generic_permission(idmap, inode, mask);

/* If permission is denied, try to refresh file
attributes. This is also needed, because the root
@@ -1566,7 +1566,7 @@ static int fuse_permission(struct mnt_idmap *idmap,
if (err == -EACCES && !refreshed) {
err = fuse_perm_getattr(inode, mask);
if (!err)
- err = generic_permission(&nop_mnt_idmap,
+ err = generic_permission(idmap,
inode, mask);
}

--
2.34.1


2024-01-08 12:12:29

by Aleksandr Mikhalitsyn

[permalink] [raw]
Subject: [PATCH v1 7/9] fs/fuse: drop idmap argument from __fuse_get_acl

We don't need to have idmap in the __fuse_get_acl as we don't
have any use for it.

In the current POSIX ACL implementation, idmapped mounts are
taken into account on the userspace/kernel border
(see vfs_set_acl_idmapped_mnt() and vfs_posix_acl_to_xattr()).

Cc: Christian Brauner <[email protected]>
Cc: Seth Forshee <[email protected]>
Cc: Miklos Szeredi <[email protected]>
Cc: Amir Goldstein <[email protected]>
Cc: Bernd Schubert <[email protected]>
Cc: <[email protected]>
Signed-off-by: Alexander Mikhalitsyn <[email protected]>
---
fs/fuse/acl.c | 6 ++----
1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/fs/fuse/acl.c b/fs/fuse/acl.c
index 3d192b80a561..3a3cd88bd3d7 100644
--- a/fs/fuse/acl.c
+++ b/fs/fuse/acl.c
@@ -12,7 +12,6 @@
#include <linux/posix_acl_xattr.h>

static struct posix_acl *__fuse_get_acl(struct fuse_conn *fc,
- struct mnt_idmap *idmap,
struct inode *inode, int type, bool rcu)
{
int size;
@@ -74,7 +73,7 @@ struct posix_acl *fuse_get_acl(struct mnt_idmap *idmap,
if (fuse_no_acl(fc, inode))
return ERR_PTR(-EOPNOTSUPP);

- return __fuse_get_acl(fc, idmap, inode, type, false);
+ return __fuse_get_acl(fc, inode, type, false);
}

struct posix_acl *fuse_get_inode_acl(struct inode *inode, int type, bool rcu)
@@ -90,8 +89,7 @@ struct posix_acl *fuse_get_inode_acl(struct inode *inode, int type, bool rcu)
*/
if (!fc->posix_acl)
return NULL;
-
- return __fuse_get_acl(fc, &nop_mnt_idmap, inode, type, rcu);
+ return __fuse_get_acl(fc, inode, type, rcu);
}

int fuse_set_acl(struct mnt_idmap *idmap, struct dentry *dentry,
--
2.34.1


2024-01-08 12:12:48

by Aleksandr Mikhalitsyn

[permalink] [raw]
Subject: [PATCH v1 8/9] fs/fuse: support idmapped ->set_acl

It's just a matter of adjusting a permission check condition
for S_ISGID flag. All the rest is already handled in the generic
VFS code.

Notice that this permission check is the analog of what
we have in posix_acl_update_mode() generic helper, but
fuse doesn't use this helper as on the kernel side we don't
care about ensuring that POSIX ACL and CHMOD permissions are in sync
as it is a responsibility of a userspace daemon to handle that.
For the same reason we don't have a calls to posix_acl_chmod(),
while most of other filesystem do.

Cc: Christian Brauner <[email protected]>
Cc: Seth Forshee <[email protected]>
Cc: Miklos Szeredi <[email protected]>
Cc: Amir Goldstein <[email protected]>
Cc: Bernd Schubert <[email protected]>
Cc: <[email protected]>
Signed-off-by: Alexander Mikhalitsyn <[email protected]>
---
fs/fuse/acl.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/fs/fuse/acl.c b/fs/fuse/acl.c
index 3a3cd88bd3d7..727fe50e255e 100644
--- a/fs/fuse/acl.c
+++ b/fs/fuse/acl.c
@@ -144,8 +144,8 @@ int fuse_set_acl(struct mnt_idmap *idmap, struct dentry *dentry,
* be stripped.
*/
if (fc->posix_acl &&
- !vfsgid_in_group_p(i_gid_into_vfsgid(&nop_mnt_idmap, inode)) &&
- !capable_wrt_inode_uidgid(&nop_mnt_idmap, inode, CAP_FSETID))
+ !vfsgid_in_group_p(i_gid_into_vfsgid(idmap, inode)) &&
+ !capable_wrt_inode_uidgid(idmap, inode, CAP_FSETID))
extra_flags |= FUSE_SETXATTR_ACL_KILL_SGID;

ret = fuse_setxattr(inode, name, value, size, 0, extra_flags);
--
2.34.1


2024-01-08 12:13:08

by Aleksandr Mikhalitsyn

[permalink] [raw]
Subject: [PATCH v1 9/9] fs/fuse: allow idmapped mounts

Now we have everything in place and we can allow idmapped mounts
by setting the FS_ALLOW_IDMAP flag. Notice that real availability
of idmapped mounts will depend on the fuse daemon. Fuse daemon
have to set FUSE_ALLOW_IDMAP flag in the FUSE_INIT reply.

To discuss:
- we enable idmapped mounts support only if "default_permissions" mode is enabled,
because otherwise we would need to deal with UID/GID mappings in the userspace side OR
provide the userspace with idmapped req->in.h.uid/req->in.h.gid values which is not
something that we probably want to. Idmapped mounts phylosophy is not about faking
caller uid/gid.

- We have a small offlist discussion with Christian around adding fs_type->allow_idmap
hook. Christian pointed that it would be nice to have a superblock flag instead like
SB_I_NOIDMAP and we can set this flag during mount time if we see that filesystem does not
support idmappings. But, unfortunately I didn't succeed here because the kernel will
know if the filesystem supports idmapping or not after FUSE_INIT request, but FUSE_INIT request
is being sent at the end of mounting process, so mount and superblock will exist and
visible by the userspace in that time. It seems like setting SB_I_NOIDMAP flag in this
case is too late as user may do the trick with creating a idmapped mount while it wasn't
restricted by SB_I_NOIDMAP. Alternatively, we can introduce a "positive" version SB_I_ALLOWIDMAP
and "weak" version of FS_ALLOW_IDMAP like FS_MAY_ALLOW_IDMAP. So if FS_MAY_ALLOW_IDMAP is set,
then SB_I_ALLOWIDMAP has to be set on the superblock to allow creation of an idmapped mount.
But that's a matter of our discussion.

Some extra links and examples:

- libfuse support
https://github.com/mihalicyn/libfuse/commits/idmap_support

- fuse-overlayfs support:
https://github.com/mihalicyn/fuse-overlayfs/commits/idmap_support

- cephfs-fuse conversion example
https://github.com/mihalicyn/ceph/commits/fuse_idmap

- glusterfs conversion example
https://github.com/mihalicyn/glusterfs/commits/fuse_idmap

Cc: Christian Brauner <[email protected]>
Cc: Seth Forshee <[email protected]>
Cc: Miklos Szeredi <[email protected]>
Cc: Amir Goldstein <[email protected]>
Cc: Bernd Schubert <[email protected]>
Cc: <[email protected]>
Signed-off-by: Alexander Mikhalitsyn <[email protected]>
---
fs/fuse/fuse_i.h | 3 +++
fs/fuse/inode.c | 22 +++++++++++++++++++---
include/uapi/linux/fuse.h | 5 ++++-
3 files changed, 26 insertions(+), 4 deletions(-)

diff --git a/fs/fuse/fuse_i.h b/fs/fuse/fuse_i.h
index 94b25ea5344a..9317b8c35191 100644
--- a/fs/fuse/fuse_i.h
+++ b/fs/fuse/fuse_i.h
@@ -809,6 +809,9 @@ struct fuse_conn {
/* Add owner_{u,g}id info when creating a new inode */
unsigned int owner_uid_gid_ext:1;

+ /* Allow creation of idmapped mounts */
+ unsigned int allow_idmap:1;
+
/* Does the filesystem support per inode DAX? */
unsigned int inode_dax:1;

diff --git a/fs/fuse/inode.c b/fs/fuse/inode.c
index 08cd3714b32d..47e32a8baed3 100644
--- a/fs/fuse/inode.c
+++ b/fs/fuse/inode.c
@@ -1286,6 +1286,12 @@ static void process_init_reply(struct fuse_mount *fm, struct fuse_args *args,
fc->direct_io_allow_mmap = 1;
if (flags & FUSE_OWNER_UID_GID_EXT)
fc->owner_uid_gid_ext = 1;
+ if (flags & FUSE_ALLOW_IDMAP) {
+ if (fc->owner_uid_gid_ext && fc->default_permissions)
+ fc->allow_idmap = 1;
+ else
+ ok = false;
+ }
} else {
ra_pages = fc->max_read / PAGE_SIZE;
fc->no_lock = 1;
@@ -1332,7 +1338,8 @@ void fuse_send_init(struct fuse_mount *fm)
FUSE_NO_OPENDIR_SUPPORT | FUSE_EXPLICIT_INVAL_DATA |
FUSE_HANDLE_KILLPRIV_V2 | FUSE_SETXATTR_EXT | FUSE_INIT_EXT |
FUSE_SECURITY_CTX | FUSE_CREATE_SUPP_GROUP |
- FUSE_HAS_EXPIRE_ONLY | FUSE_DIRECT_IO_ALLOW_MMAP | FUSE_OWNER_UID_GID_EXT;
+ FUSE_HAS_EXPIRE_ONLY | FUSE_DIRECT_IO_ALLOW_MMAP |
+ FUSE_OWNER_UID_GID_EXT | FUSE_ALLOW_IDMAP;
#ifdef CONFIG_FUSE_DAX
if (fm->fc->dax)
flags |= FUSE_MAP_ALIGNMENT;
@@ -1915,12 +1922,20 @@ static void fuse_kill_sb_anon(struct super_block *sb)
fuse_mount_destroy(get_fuse_mount_super(sb));
}

+static bool fuse_allow_idmap(struct super_block *sb)
+{
+ struct fuse_conn *fc = get_fuse_conn_super(sb);
+
+ return fc->allow_idmap;
+}
+
static struct file_system_type fuse_fs_type = {
.owner = THIS_MODULE,
.name = "fuse",
- .fs_flags = FS_HAS_SUBTYPE | FS_USERNS_MOUNT,
+ .fs_flags = FS_HAS_SUBTYPE | FS_USERNS_MOUNT | FS_ALLOW_IDMAP,
.init_fs_context = fuse_init_fs_context,
.parameters = fuse_fs_parameters,
+ .allow_idmap = fuse_allow_idmap,
.kill_sb = fuse_kill_sb_anon,
};
MODULE_ALIAS_FS("fuse");
@@ -1938,8 +1953,9 @@ static struct file_system_type fuseblk_fs_type = {
.name = "fuseblk",
.init_fs_context = fuse_init_fs_context,
.parameters = fuse_fs_parameters,
+ .allow_idmap = fuse_allow_idmap,
.kill_sb = fuse_kill_sb_blk,
- .fs_flags = FS_REQUIRES_DEV | FS_HAS_SUBTYPE,
+ .fs_flags = FS_REQUIRES_DEV | FS_HAS_SUBTYPE | FS_ALLOW_IDMAP,
};
MODULE_ALIAS_FS("fuseblk");

diff --git a/include/uapi/linux/fuse.h b/include/uapi/linux/fuse.h
index ebe82104b172..d8e1235d9796 100644
--- a/include/uapi/linux/fuse.h
+++ b/include/uapi/linux/fuse.h
@@ -215,6 +215,7 @@
* 7.40
* - add FUSE_EXT_OWNER_UID_GID
* - add FUSE_OWNER_UID_GID_EXT
+ * - add FUSE_ALLOW_IDMAP
*/

#ifndef _LINUX_FUSE_H
@@ -250,7 +251,7 @@
#define FUSE_KERNEL_VERSION 7

/** Minor version number of this interface */
-#define FUSE_KERNEL_MINOR_VERSION 39
+#define FUSE_KERNEL_MINOR_VERSION 40

/** The node ID of the root inode */
#define FUSE_ROOT_ID 1
@@ -416,6 +417,7 @@ struct fuse_file_lock {
* FUSE_DIRECT_IO_ALLOW_MMAP: allow shared mmap in FOPEN_DIRECT_IO mode.
* FUSE_OWNER_UID_GID_EXT: add inode owner UID/GID info to create, mkdir,
* symlink and mknod
+ * FUSE_ALLOW_IDMAP: allow creation of idmapped mounts
*/
#define FUSE_ASYNC_READ (1 << 0)
#define FUSE_POSIX_LOCKS (1 << 1)
@@ -459,6 +461,7 @@ struct fuse_file_lock {
/* Obsolete alias for FUSE_DIRECT_IO_ALLOW_MMAP */
#define FUSE_DIRECT_IO_RELAX FUSE_DIRECT_IO_ALLOW_MMAP
#define FUSE_OWNER_UID_GID_EXT (1ULL << 37)
+#define FUSE_ALLOW_IDMAP (1ULL << 38)

/**
* CUSE INIT request/reply flags
--
2.34.1


2024-01-20 15:21:26

by Christian Brauner

[permalink] [raw]
Subject: Re: [PATCH v1 4/9] fs/fuse: support idmapped getattr inode op

> int fuse_update_attributes(struct inode *inode, struct file *file, u32 mask)
> {
> - return fuse_update_get_attr(inode, file, NULL, mask, 0);
> + return fuse_update_get_attr(&nop_mnt_idmap, inode, file, NULL, mask, 0);
> }
>
> int fuse_reverse_inval_entry(struct fuse_conn *fc, u64 parent_nodeid,
> @@ -1506,7 +1510,7 @@ static int fuse_perm_getattr(struct inode *inode, int mask)
> return -ECHILD;
>
> forget_all_cached_acls(inode);
> - return fuse_do_getattr(inode, NULL, NULL);
> + return fuse_do_getattr(&nop_mnt_idmap, inode, NULL, NULL);
> }
>
> /*
> @@ -2062,7 +2066,7 @@ static int fuse_setattr(struct mnt_idmap *idmap, struct dentry *entry,
> * ia_mode calculation may have used stale i_mode.
> * Refresh and recalculate.
> */
> - ret = fuse_do_getattr(inode, NULL, file);
> + ret = fuse_do_getattr(&nop_mnt_idmap, inode, NULL, file);
> if (ret)
> return ret;

These are internal getattr requests that don't originate from a specific mount?
Can you please add a comment about this in the commit message so it's
clear why it's ok to not pass the idmapping?

2024-01-20 15:23:55

by Christian Brauner

[permalink] [raw]
Subject: Re: [PATCH v1 6/9] fs/fuse: support idmapped ->setattr op

On Mon, Jan 08, 2024 at 01:08:21PM +0100, Alexander Mikhalitsyn wrote:
> Cc: Christian Brauner <[email protected]>
> Cc: Seth Forshee <[email protected]>
> Cc: Miklos Szeredi <[email protected]>
> Cc: Amir Goldstein <[email protected]>
> Cc: Bernd Schubert <[email protected]>
> Cc: <[email protected]>
> Signed-off-by: Alexander Mikhalitsyn <[email protected]>
> ---
> fs/fuse/dir.c | 32 +++++++++++++++++++++-----------
> fs/fuse/file.c | 2 +-
> fs/fuse/fuse_i.h | 4 ++--
> 3 files changed, 24 insertions(+), 14 deletions(-)
>
> diff --git a/fs/fuse/dir.c b/fs/fuse/dir.c
> index f7c2c54f7122..5fbb7100ad1c 100644
> --- a/fs/fuse/dir.c
> +++ b/fs/fuse/dir.c
> @@ -1739,17 +1739,27 @@ static bool update_mtime(unsigned ivalid, bool trust_local_mtime)
> return true;
> }
>
> -static void iattr_to_fattr(struct fuse_conn *fc, struct iattr *iattr,
> - struct fuse_setattr_in *arg, bool trust_local_cmtime)
> +static void iattr_to_fattr(struct mnt_idmap *idmap, struct fuse_conn *fc,
> + struct iattr *iattr, struct fuse_setattr_in *arg,
> + bool trust_local_cmtime)
> {
> unsigned ivalid = iattr->ia_valid;
>
> if (ivalid & ATTR_MODE)
> arg->valid |= FATTR_MODE, arg->mode = iattr->ia_mode;
> - if (ivalid & ATTR_UID)
> - arg->valid |= FATTR_UID, arg->uid = from_kuid(fc->user_ns, iattr->ia_uid);
> - if (ivalid & ATTR_GID)
> - arg->valid |= FATTR_GID, arg->gid = from_kgid(fc->user_ns, iattr->ia_gid);
> +
> + if (ivalid & ATTR_UID) {
> + kuid_t fsuid = from_vfsuid(idmap, fc->user_ns, iattr->ia_vfsuid);
> + arg->valid |= FATTR_UID;
> + arg->uid = from_kuid(fc->user_ns, fsuid);
> + }
> +
> + if (ivalid & ATTR_GID) {
> + kgid_t fsgid = from_vfsgid(idmap, fc->user_ns, iattr->ia_vfsgid);
> + arg->valid |= FATTR_GID;
> + arg->gid = from_kgid(fc->user_ns, fsgid);
> + }
> +
> if (ivalid & ATTR_SIZE)
> arg->valid |= FATTR_SIZE, arg->size = iattr->ia_size;
> if (ivalid & ATTR_ATIME) {
> @@ -1869,8 +1879,8 @@ int fuse_flush_times(struct inode *inode, struct fuse_file *ff)
> * vmtruncate() doesn't allow for this case, so do the rlimit checking
> * and the actual truncation by hand.
> */
> -int fuse_do_setattr(struct dentry *dentry, struct iattr *attr,
> - struct file *file)
> +int fuse_do_setattr(struct mnt_idmap *idmap, struct dentry *dentry,
> + struct iattr *attr, struct file *file)
> {
> struct inode *inode = d_inode(dentry);
> struct fuse_mount *fm = get_fuse_mount(inode);
> @@ -1890,7 +1900,7 @@ int fuse_do_setattr(struct dentry *dentry, struct iattr *attr,
> if (!fc->default_permissions)
> attr->ia_valid |= ATTR_FORCE;
>
> - err = setattr_prepare(&nop_mnt_idmap, dentry, attr);
> + err = setattr_prepare(idmap, dentry, attr);
> if (err)
> return err;
>
> @@ -1949,7 +1959,7 @@ int fuse_do_setattr(struct dentry *dentry, struct iattr *attr,
>
> memset(&inarg, 0, sizeof(inarg));
> memset(&outarg, 0, sizeof(outarg));
> - iattr_to_fattr(fc, attr, &inarg, trust_local_cmtime);
> + iattr_to_fattr(idmap, fc, attr, &inarg, trust_local_cmtime);
> if (file) {
> struct fuse_file *ff = file->private_data;
> inarg.valid |= FATTR_FH;
> @@ -2084,7 +2094,7 @@ static int fuse_setattr(struct mnt_idmap *idmap, struct dentry *entry,
> if (!attr->ia_valid)
> return 0;
>
> - ret = fuse_do_setattr(entry, attr, file);
> + ret = fuse_do_setattr(idmap, entry, attr, file);
> if (!ret) {
> /*
> * If filesystem supports acls it may have updated acl xattrs in
> diff --git a/fs/fuse/file.c b/fs/fuse/file.c
> index a660f1f21540..e0fe5497a548 100644
> --- a/fs/fuse/file.c
> +++ b/fs/fuse/file.c
> @@ -2870,7 +2870,7 @@ static void fuse_do_truncate(struct file *file)
> attr.ia_file = file;
> attr.ia_valid |= ATTR_FILE;
>
> - fuse_do_setattr(file_dentry(file), &attr, file);
> + fuse_do_setattr(&nop_mnt_idmap, file_dentry(file), &attr, file);

Same as for the other patch. Please leave a comment in the commit
message that briefly explains why it's ok to pass &nop_mnt_idmap here.
It'll help us later. :)

2024-01-20 15:25:10

by Christian Brauner

[permalink] [raw]
Subject: Re: [PATCH v1 7/9] fs/fuse: drop idmap argument from __fuse_get_acl

On Mon, Jan 08, 2024 at 01:08:22PM +0100, Alexander Mikhalitsyn wrote:
> We don't need to have idmap in the __fuse_get_acl as we don't
> have any use for it.
>
> In the current POSIX ACL implementation, idmapped mounts are
> taken into account on the userspace/kernel border
> (see vfs_set_acl_idmapped_mnt() and vfs_posix_acl_to_xattr()).
>
> Cc: Christian Brauner <[email protected]>
> Cc: Seth Forshee <[email protected]>
> Cc: Miklos Szeredi <[email protected]>
> Cc: Amir Goldstein <[email protected]>
> Cc: Bernd Schubert <[email protected]>
> Cc: <[email protected]>
> Signed-off-by: Alexander Mikhalitsyn <[email protected]>
> ---

Ah, that probably became obsolete when I did the VFS POSIX ACL api.
Thanks,
Reviewed-by: Christian Brauner <[email protected]>

2024-01-21 17:51:13

by Christian Brauner

[permalink] [raw]
Subject: Re: [PATCH v1 0/9] fuse: basic support for idmapped mounts

On Mon, Jan 08, 2024 at 01:08:15PM +0100, Alexander Mikhalitsyn wrote:
> Dear friends,
>
> This patch series aimed to provide support for idmapped mounts
> for fuse. We already have idmapped mounts support for almost all
> widely-used filesystems:
> * local (ext4, btrfs, xfs, fat, vfat, ntfs3, squashfs, f2fs, erofs, ZFS (out-of-tree))
> * network (ceph)
>
> Git tree (based on https://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse.git/log/?h=for-next):
> v1: https://github.com/mihalicyn/linux/commits/fuse_idmapped_mounts.v1
> current: https://github.com/mihalicyn/linux/commits/fuse_idmapped_mounts

Great work!

> Things to discuss:
> - we enable idmapped mounts support only if "default_permissions" mode is enabled,
> because otherwise, we would need to deal with UID/GID mappings on the userspace side OR
> provide the userspace with idmapped req->in.h.uid/req->in.h.gid values which is not
> something that we probably want to do. Idmapped mounts philosophy is not about faking
> caller uid/gid.

Having VFS idmaps but then outsourcing permission checking to userspace
is conceptually strange so requiring default_permissions is the correct
thing to do.

> - We have a small offlist discussion with Christian about adding fs_type->allow_idmap
> hook. Christian pointed out that it would be nice to have a superblock flag instead like
> SB_I_NOIDMAP and we can set this flag during mount time if we see that the filesystem does not
> support idmappings. But, unfortunately, I didn't succeed here because the kernel will
> know if the filesystem supports idmapping or not after FUSE_INIT request, but FUSE_INIT request
> is being sent at the end of the mounting process, so the mount and superblock will exist and
> visible by the userspace in that time. It seems like setting SB_I_NOIDMAP flag, in this
> case, is too late as a user may do the trick by creating an idmapped mount while it wasn't
> restricted by SB_I_NOIDMAP. Alternatively, we can introduce a "positive" version SB_I_ALLOWIDMAP

I see.

> and a "weak" version of FS_ALLOW_IDMAP like FS_MAY_ALLOW_IDMAP. So if FS_MAY_ALLOW_IDMAP is set,
> then SB_I_ALLOWIDMAP has to be set on the superblock to allow the creation of an idmapped mount.
> But that's a matter of our discussion.

I dislike making adding a struct super_block method. Because it means that we
call into the filesystem from generic mount code and specifically with the
namespace semaphore held. If there's ever any network filesystem that e.g.,
calls to a hung server it will lockup the whole system. So I'm opposed to
calling into the filesystem here at all. It's also ugly because this is really
a vfs level change. The only involvement should be whether the filesystem type
can actually support this ideally.

I think we should handle this within FUSE. So we allow the creation of idmapped
mounts just based on FS_ALLOW_IDMAP. And if the server doesn't support the
FUSE_OWNER_UID_GID_EXT then we simply refuse all creation requests originating
from an idmapped mount. Either we return EOPNOSUPP or we return EOVERFLOW to
indicate that we can't represent the owner correctly because the server is
missing the required extension.

diff --git a/fs/fuse/dir.c b/fs/fuse/dir.c
index 3f37ba6a7a10..0726da21150a 100644
--- a/fs/fuse/dir.c
+++ b/fs/fuse/dir.c
@@ -606,8 +606,16 @@ static int get_create_ext(struct mnt_idmap *idmap,
err = get_security_context(dentry, mode, &ext);
if (!err && fc->create_supp_group)
err = get_create_supp_group(dir, &ext);
- if (!err && fc->owner_uid_gid_ext)
- err = get_owner_uid_gid(idmap, fc, &ext);
+ if (!err) {
+ /*
+ * If the server doesn't support FUSE_OWNER_UID_GID_EXT and
+ * this is a creation request from an idmapped mount refuse it.
+ */
+ if (fc->owner_uid_gid_ext)
+ err = get_owner_uid_gid(idmap, fc, &ext);
+ else if (idmap != &nop_mnt_idmap)
+ err = -EOPNOTSUPP;
+ }

if (!err && ext.size) {
WARN_ON(args->in_numargs >= ARRAY_SIZE(args->in_args));

2024-01-22 21:13:26

by Seth Forshee

[permalink] [raw]
Subject: Re: [PATCH v1 0/9] fuse: basic support for idmapped mounts

On Sun, Jan 21, 2024 at 06:50:57PM +0100, Christian Brauner wrote:
> > - We have a small offlist discussion with Christian about adding fs_type->allow_idmap
> > hook. Christian pointed out that it would be nice to have a superblock flag instead like
> > SB_I_NOIDMAP and we can set this flag during mount time if we see that the filesystem does not
> > support idmappings. But, unfortunately, I didn't succeed here because the kernel will
> > know if the filesystem supports idmapping or not after FUSE_INIT request, but FUSE_INIT request
> > is being sent at the end of the mounting process, so the mount and superblock will exist and
> > visible by the userspace in that time. It seems like setting SB_I_NOIDMAP flag, in this
> > case, is too late as a user may do the trick by creating an idmapped mount while it wasn't
> > restricted by SB_I_NOIDMAP. Alternatively, we can introduce a "positive" version SB_I_ALLOWIDMAP
>
> I see.
>
> > and a "weak" version of FS_ALLOW_IDMAP like FS_MAY_ALLOW_IDMAP. So if FS_MAY_ALLOW_IDMAP is set,
> > then SB_I_ALLOWIDMAP has to be set on the superblock to allow the creation of an idmapped mount.
> > But that's a matter of our discussion.
>
> I dislike making adding a struct super_block method. Because it means that we
> call into the filesystem from generic mount code and specifically with the
> namespace semaphore held. If there's ever any network filesystem that e.g.,
> calls to a hung server it will lockup the whole system. So I'm opposed to
> calling into the filesystem here at all. It's also ugly because this is really
> a vfs level change. The only involvement should be whether the filesystem type
> can actually support this ideally.
>
> I think we should handle this within FUSE. So we allow the creation of idmapped
> mounts just based on FS_ALLOW_IDMAP. And if the server doesn't support the
> FUSE_OWNER_UID_GID_EXT then we simply refuse all creation requests originating
> from an idmapped mount. Either we return EOPNOSUPP or we return EOVERFLOW to
> indicate that we can't represent the owner correctly because the server is
> missing the required extension.

Could fuse just set SB_I_NOIDMAP initially then clear it if the init
reply indicates idmap support? This is like the "weak" FS_ALLOW_IDMAP
option without requiring another file_system_type flag.

2024-01-23 15:57:05

by Christian Brauner

[permalink] [raw]
Subject: Re: [PATCH v1 0/9] fuse: basic support for idmapped mounts

On Mon, Jan 22, 2024 at 03:13:12PM -0600, Seth Forshee wrote:
> On Sun, Jan 21, 2024 at 06:50:57PM +0100, Christian Brauner wrote:
> > > - We have a small offlist discussion with Christian about adding fs_type->allow_idmap
> > > hook. Christian pointed out that it would be nice to have a superblock flag instead like
> > > SB_I_NOIDMAP and we can set this flag during mount time if we see that the filesystem does not
> > > support idmappings. But, unfortunately, I didn't succeed here because the kernel will
> > > know if the filesystem supports idmapping or not after FUSE_INIT request, but FUSE_INIT request
> > > is being sent at the end of the mounting process, so the mount and superblock will exist and
> > > visible by the userspace in that time. It seems like setting SB_I_NOIDMAP flag, in this
> > > case, is too late as a user may do the trick by creating an idmapped mount while it wasn't
> > > restricted by SB_I_NOIDMAP. Alternatively, we can introduce a "positive" version SB_I_ALLOWIDMAP
> >
> > I see.
> >
> > > and a "weak" version of FS_ALLOW_IDMAP like FS_MAY_ALLOW_IDMAP. So if FS_MAY_ALLOW_IDMAP is set,
> > > then SB_I_ALLOWIDMAP has to be set on the superblock to allow the creation of an idmapped mount.
> > > But that's a matter of our discussion.
> >
> > I dislike making adding a struct super_block method. Because it means that we
> > call into the filesystem from generic mount code and specifically with the
> > namespace semaphore held. If there's ever any network filesystem that e.g.,
> > calls to a hung server it will lockup the whole system. So I'm opposed to
> > calling into the filesystem here at all. It's also ugly because this is really
> > a vfs level change. The only involvement should be whether the filesystem type
> > can actually support this ideally.
> >
> > I think we should handle this within FUSE. So we allow the creation of idmapped
> > mounts just based on FS_ALLOW_IDMAP. And if the server doesn't support the
> > FUSE_OWNER_UID_GID_EXT then we simply refuse all creation requests originating
> > from an idmapped mount. Either we return EOPNOSUPP or we return EOVERFLOW to
> > indicate that we can't represent the owner correctly because the server is
> > missing the required extension.
>
> Could fuse just set SB_I_NOIDMAP initially then clear it if the init
> reply indicates idmap support? This is like the "weak" FS_ALLOW_IDMAP
> option without requiring another file_system_type flag.

Would probably works too, yes. I'm just trying to avoid any additional
flag usage. Ideally we'd just do it in FUSE but this way is possibly
also effective. Although it kinda leaks internal FUSE details as in
"SB_I_NOIDMAP" really means "SB_I_INITIALIZED" which I kind of dislike.

2024-01-29 15:43:56

by Aleksandr Mikhalitsyn

[permalink] [raw]
Subject: Re: [PATCH v1 4/9] fs/fuse: support idmapped getattr inode op

On Sat, 20 Jan 2024 16:21:08 +0100
Christian Brauner <[email protected]> wrote:

> > int fuse_update_attributes(struct inode *inode, struct file *file, u32 mask)
> > {
> > - return fuse_update_get_attr(inode, file, NULL, mask, 0);
> > + return fuse_update_get_attr(&nop_mnt_idmap, inode, file, NULL, mask, 0);
> > }
> >
> > int fuse_reverse_inval_entry(struct fuse_conn *fc, u64 parent_nodeid,
> > @@ -1506,7 +1510,7 @@ static int fuse_perm_getattr(struct inode *inode, int mask)
> > return -ECHILD;
> >
> > forget_all_cached_acls(inode);
> > - return fuse_do_getattr(inode, NULL, NULL);
> > + return fuse_do_getattr(&nop_mnt_idmap, inode, NULL, NULL);
> > }
> >
> > /*
> > @@ -2062,7 +2066,7 @@ static int fuse_setattr(struct mnt_idmap *idmap, struct dentry *entry,
> > * ia_mode calculation may have used stale i_mode.
> > * Refresh and recalculate.
> > */
> > - ret = fuse_do_getattr(inode, NULL, file);
> > + ret = fuse_do_getattr(&nop_mnt_idmap, inode, NULL, file);
> > if (ret)
> > return ret;
>

Hi, Christian!

> These are internal getattr requests that don't originate from a specific mount?

These requests do originate from a specific mount, but we don't need an idmapping in there
because 3rd argument of fuse_do_getattr(struct mnt_idmap *idmap, struct inode *inode, struct kstat *stat, struct file *file) is NULL,
which means that this request will not fill an stat structure => we don't need to take an idmapping into account.

> Can you please add a comment about this in the commit message so it's
> clear why it's ok to not pass the idmapping?

Sure, that's my bad that I haven't added this explanation! It's not obvious at all.

Will be fixed in -v2.

Thanks,
Alex


2024-01-29 15:49:02

by Aleksandr Mikhalitsyn

[permalink] [raw]
Subject: Re: [PATCH v1 6/9] fs/fuse: support idmapped ->setattr op

On Sat, 20 Jan 2024 16:23:38 +0100
Christian Brauner <[email protected]> wrote:

> On Mon, Jan 08, 2024 at 01:08:21PM +0100, Alexander Mikhalitsyn wrote:
> > Cc: Christian Brauner <[email protected]>
> > Cc: Seth Forshee <[email protected]>
> > Cc: Miklos Szeredi <[email protected]>
> > Cc: Amir Goldstein <[email protected]>
> > Cc: Bernd Schubert <[email protected]>
> > Cc: <[email protected]>
> > Signed-off-by: Alexander Mikhalitsyn <[email protected]>
> > ---
> > fs/fuse/dir.c | 32 +++++++++++++++++++++-----------
> > fs/fuse/file.c | 2 +-
> > fs/fuse/fuse_i.h | 4 ++--
> > 3 files changed, 24 insertions(+), 14 deletions(-)
> >
> > diff --git a/fs/fuse/dir.c b/fs/fuse/dir.c
> > index f7c2c54f7122..5fbb7100ad1c 100644
> > --- a/fs/fuse/dir.c
> > +++ b/fs/fuse/dir.c
> > @@ -1739,17 +1739,27 @@ static bool update_mtime(unsigned ivalid, bool trust_local_mtime)
> > return true;
> > }
> >
> > -static void iattr_to_fattr(struct fuse_conn *fc, struct iattr *iattr,
> > - struct fuse_setattr_in *arg, bool trust_local_cmtime)
> > +static void iattr_to_fattr(struct mnt_idmap *idmap, struct fuse_conn *fc,
> > + struct iattr *iattr, struct fuse_setattr_in *arg,
> > + bool trust_local_cmtime)
> > {
> > unsigned ivalid = iattr->ia_valid;
> >
> > if (ivalid & ATTR_MODE)
> > arg->valid |= FATTR_MODE, arg->mode = iattr->ia_mode;
> > - if (ivalid & ATTR_UID)
> > - arg->valid |= FATTR_UID, arg->uid = from_kuid(fc->user_ns, iattr->ia_uid);
> > - if (ivalid & ATTR_GID)
> > - arg->valid |= FATTR_GID, arg->gid = from_kgid(fc->user_ns, iattr->ia_gid);
> > +
> > + if (ivalid & ATTR_UID) {
> > + kuid_t fsuid = from_vfsuid(idmap, fc->user_ns, iattr->ia_vfsuid);
> > + arg->valid |= FATTR_UID;
> > + arg->uid = from_kuid(fc->user_ns, fsuid);
> > + }
> > +
> > + if (ivalid & ATTR_GID) {
> > + kgid_t fsgid = from_vfsgid(idmap, fc->user_ns, iattr->ia_vfsgid);
> > + arg->valid |= FATTR_GID;
> > + arg->gid = from_kgid(fc->user_ns, fsgid);
> > + }
> > +
> > if (ivalid & ATTR_SIZE)
> > arg->valid |= FATTR_SIZE, arg->size = iattr->ia_size;
> > if (ivalid & ATTR_ATIME) {
> > @@ -1869,8 +1879,8 @@ int fuse_flush_times(struct inode *inode, struct fuse_file *ff)
> > * vmtruncate() doesn't allow for this case, so do the rlimit checking
> > * and the actual truncation by hand.
> > */
> > -int fuse_do_setattr(struct dentry *dentry, struct iattr *attr,
> > - struct file *file)
> > +int fuse_do_setattr(struct mnt_idmap *idmap, struct dentry *dentry,
> > + struct iattr *attr, struct file *file)
> > {
> > struct inode *inode = d_inode(dentry);
> > struct fuse_mount *fm = get_fuse_mount(inode);
> > @@ -1890,7 +1900,7 @@ int fuse_do_setattr(struct dentry *dentry, struct iattr *attr,
> > if (!fc->default_permissions)
> > attr->ia_valid |= ATTR_FORCE;
> >
> > - err = setattr_prepare(&nop_mnt_idmap, dentry, attr);
> > + err = setattr_prepare(idmap, dentry, attr);
> > if (err)
> > return err;
> >
> > @@ -1949,7 +1959,7 @@ int fuse_do_setattr(struct dentry *dentry, struct iattr *attr,
> >
> > memset(&inarg, 0, sizeof(inarg));
> > memset(&outarg, 0, sizeof(outarg));
> > - iattr_to_fattr(fc, attr, &inarg, trust_local_cmtime);
> > + iattr_to_fattr(idmap, fc, attr, &inarg, trust_local_cmtime);
> > if (file) {
> > struct fuse_file *ff = file->private_data;
> > inarg.valid |= FATTR_FH;
> > @@ -2084,7 +2094,7 @@ static int fuse_setattr(struct mnt_idmap *idmap, struct dentry *entry,
> > if (!attr->ia_valid)
> > return 0;
> >
> > - ret = fuse_do_setattr(entry, attr, file);
> > + ret = fuse_do_setattr(idmap, entry, attr, file);
> > if (!ret) {
> > /*
> > * If filesystem supports acls it may have updated acl xattrs in
> > diff --git a/fs/fuse/file.c b/fs/fuse/file.c
> > index a660f1f21540..e0fe5497a548 100644
> > --- a/fs/fuse/file.c
> > +++ b/fs/fuse/file.c
> > @@ -2870,7 +2870,7 @@ static void fuse_do_truncate(struct file *file)
> > attr.ia_file = file;
> > attr.ia_valid |= ATTR_FILE;
> >
> > - fuse_do_setattr(file_dentry(file), &attr, file);
> > + fuse_do_setattr(&nop_mnt_idmap, file_dentry(file), &attr, file);
>
> Same as for the other patch. Please leave a comment in the commit
> message that briefly explains why it's ok to pass &nop_mnt_idmap here.
> It'll help us later. :)

Sure, will be fixed in -v2 ;-)

Explanation here is that in this specific case attr.ia_valid = ATTR_SIZE | ATTR_FILE,
which but we only need an idmapping for ATTR_UID | ATTR_GID.

From the other side, having struct file pointer means that getting an idmapping as easy as file_mnt_idmap(file),
and probably it's easier to pass an idmapping in this specific case rather than skipping it for a valid reasons.
What do you think about this?

Kind regards,
Alex

2024-01-29 15:55:32

by Aleksandr Mikhalitsyn

[permalink] [raw]
Subject: Re: [PATCH v1 7/9] fs/fuse: drop idmap argument from __fuse_get_acl

On Sat, 20 Jan 2024 16:24:55 +0100
Christian Brauner <[email protected]> wrote:

> On Mon, Jan 08, 2024 at 01:08:22PM +0100, Alexander Mikhalitsyn wrote:
> > We don't need to have idmap in the __fuse_get_acl as we don't
> > have any use for it.
> >
> > In the current POSIX ACL implementation, idmapped mounts are
> > taken into account on the userspace/kernel border
> > (see vfs_set_acl_idmapped_mnt() and vfs_posix_acl_to_xattr()).
> >
> > Cc: Christian Brauner <[email protected]>
> > Cc: Seth Forshee <[email protected]>
> > Cc: Miklos Szeredi <[email protected]>
> > Cc: Amir Goldstein <[email protected]>
> > Cc: Bernd Schubert <[email protected]>
> > Cc: <[email protected]>
> > Signed-off-by: Alexander Mikhalitsyn <[email protected]>
> > ---
>
> Ah, that probably became obsolete when I did the VFS POSIX ACL api.

Precisely ;-)

> Thanks,
> Reviewed-by: Christian Brauner <[email protected]>



2024-01-29 16:53:51

by Aleksandr Mikhalitsyn

[permalink] [raw]
Subject: Re: [PATCH v1 0/9] fuse: basic support for idmapped mounts

On Sun, 21 Jan 2024 18:50:57 +0100
Christian Brauner <[email protected]> wrote:

> On Mon, Jan 08, 2024 at 01:08:15PM +0100, Alexander Mikhalitsyn wrote:
> > Dear friends,
> >
> > This patch series aimed to provide support for idmapped mounts
> > for fuse. We already have idmapped mounts support for almost all
> > widely-used filesystems:
> > * local (ext4, btrfs, xfs, fat, vfat, ntfs3, squashfs, f2fs, erofs, ZFS (out-of-tree))
> > * network (ceph)
> >
> > Git tree (based on https://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse.git/log/?h=for-next):
> > v1: https://github.com/mihalicyn/linux/commits/fuse_idmapped_mounts.v1
> > current: https://github.com/mihalicyn/linux/commits/fuse_idmapped_mounts
>
> Great work!
>
> > Things to discuss:
> > - we enable idmapped mounts support only if "default_permissions" mode is enabled,
> > because otherwise, we would need to deal with UID/GID mappings on the userspace side OR
> > provide the userspace with idmapped req->in.h.uid/req->in.h.gid values which is not
> > something that we probably want to do. Idmapped mounts philosophy is not about faking
> > caller uid/gid.
>
> Having VFS idmaps but then outsourcing permission checking to userspace
> is conceptually strange so requiring default_permissions is the correct
> thing to do.
>
> > - We have a small offlist discussion with Christian about adding fs_type->allow_idmap
> > hook. Christian pointed out that it would be nice to have a superblock flag instead like
> > SB_I_NOIDMAP and we can set this flag during mount time if we see that the filesystem does not
> > support idmappings. But, unfortunately, I didn't succeed here because the kernel will
> > know if the filesystem supports idmapping or not after FUSE_INIT request, but FUSE_INIT request
> > is being sent at the end of the mounting process, so the mount and superblock will exist and
> > visible by the userspace in that time. It seems like setting SB_I_NOIDMAP flag, in this
> > case, is too late as a user may do the trick by creating an idmapped mount while it wasn't
> > restricted by SB_I_NOIDMAP. Alternatively, we can introduce a "positive" version SB_I_ALLOWIDMAP
>
> I see.
>
> > and a "weak" version of FS_ALLOW_IDMAP like FS_MAY_ALLOW_IDMAP. So if FS_MAY_ALLOW_IDMAP is set,
> > then SB_I_ALLOWIDMAP has to be set on the superblock to allow the creation of an idmapped mount.
> > But that's a matter of our discussion.
>
> I dislike making adding a struct super_block method. Because it means that we
> call into the filesystem from generic mount code and specifically with the
> namespace semaphore held. If there's ever any network filesystem that e.g.,
> calls to a hung server it will lockup the whole system.So I'm opposed to
> calling into the filesystem here at all. It's also ugly because this is really
> a vfs level change. The only involvement should be whether the filesystem type
> can actually support this ideally.

That's a very interesting point about holding a semaphore. Thanks!

>
> I think we should handle this within FUSE. So we allow the creation of idmapped
> mounts just based on FS_ALLOW_IDMAP. And if the server doesn't support the
> FUSE_OWNER_UID_GID_EXT then we simply refuse all creation requests originating
> from an idmapped mount. Either we return EOPNOSUPP or we return EOVERFLOW to
> indicate that we can't represent the owner correctly because the server is
> missing the required extension.

Ok, that's effectively the same approach that we already have in cephfs:
https://github.com/torvalds/linux/blob/41bccc98fb7931d63d03f326a746ac4d429c1dd3/fs/ceph/mds_client.c#L3059

I personally comfortable with this too.

It's interesting to hear what Miklos thinks about it.

Kind regards,
Alex

>
> diff --git a/fs/fuse/dir.c b/fs/fuse/dir.c
> index 3f37ba6a7a10..0726da21150a 100644
> --- a/fs/fuse/dir.c
> +++ b/fs/fuse/dir.c
> @@ -606,8 +606,16 @@ static int get_create_ext(struct mnt_idmap *idmap,
> err = get_security_context(dentry, mode, &ext);
> if (!err && fc->create_supp_group)
> err = get_create_supp_group(dir, &ext);
> - if (!err && fc->owner_uid_gid_ext)
> - err = get_owner_uid_gid(idmap, fc, &ext);
> + if (!err) {
> + /*
> + * If the server doesn't support FUSE_OWNER_UID_GID_EXT and
> + * this is a creation request from an idmapped mount refuse it.
> + */
> + if (fc->owner_uid_gid_ext)
> + err = get_owner_uid_gid(idmap, fc, &ext);
> + else if (idmap != &nop_mnt_idmap)
> + err = -EOPNOTSUPP;
> + }
>
> if (!err && ext.size) {
> WARN_ON(args->in_numargs >= ARRAY_SIZE(args->in_args));


--
Alexander Mikhalitsyn <[email protected]>

2024-01-30 10:24:55

by Christian Brauner

[permalink] [raw]
Subject: Re: [PATCH v1 6/9] fs/fuse: support idmapped ->setattr op

On Mon, Jan 29, 2024 at 04:48:49PM +0100, Alexander Mikhalitsyn wrote:
> On Sat, 20 Jan 2024 16:23:38 +0100
> Christian Brauner <[email protected]> wrote:
>
> > On Mon, Jan 08, 2024 at 01:08:21PM +0100, Alexander Mikhalitsyn wrote:
> > > Cc: Christian Brauner <[email protected]>
> > > Cc: Seth Forshee <[email protected]>
> > > Cc: Miklos Szeredi <[email protected]>
> > > Cc: Amir Goldstein <[email protected]>
> > > Cc: Bernd Schubert <[email protected]>
> > > Cc: <[email protected]>
> > > Signed-off-by: Alexander Mikhalitsyn <[email protected]>
> > > ---
> > > fs/fuse/dir.c | 32 +++++++++++++++++++++-----------
> > > fs/fuse/file.c | 2 +-
> > > fs/fuse/fuse_i.h | 4 ++--
> > > 3 files changed, 24 insertions(+), 14 deletions(-)
> > >
> > > diff --git a/fs/fuse/dir.c b/fs/fuse/dir.c
> > > index f7c2c54f7122..5fbb7100ad1c 100644
> > > --- a/fs/fuse/dir.c
> > > +++ b/fs/fuse/dir.c
> > > @@ -1739,17 +1739,27 @@ static bool update_mtime(unsigned ivalid, bool trust_local_mtime)
> > > return true;
> > > }
> > >
> > > -static void iattr_to_fattr(struct fuse_conn *fc, struct iattr *iattr,
> > > - struct fuse_setattr_in *arg, bool trust_local_cmtime)
> > > +static void iattr_to_fattr(struct mnt_idmap *idmap, struct fuse_conn *fc,
> > > + struct iattr *iattr, struct fuse_setattr_in *arg,
> > > + bool trust_local_cmtime)
> > > {
> > > unsigned ivalid = iattr->ia_valid;
> > >
> > > if (ivalid & ATTR_MODE)
> > > arg->valid |= FATTR_MODE, arg->mode = iattr->ia_mode;
> > > - if (ivalid & ATTR_UID)
> > > - arg->valid |= FATTR_UID, arg->uid = from_kuid(fc->user_ns, iattr->ia_uid);
> > > - if (ivalid & ATTR_GID)
> > > - arg->valid |= FATTR_GID, arg->gid = from_kgid(fc->user_ns, iattr->ia_gid);
> > > +
> > > + if (ivalid & ATTR_UID) {
> > > + kuid_t fsuid = from_vfsuid(idmap, fc->user_ns, iattr->ia_vfsuid);
> > > + arg->valid |= FATTR_UID;
> > > + arg->uid = from_kuid(fc->user_ns, fsuid);
> > > + }
> > > +
> > > + if (ivalid & ATTR_GID) {
> > > + kgid_t fsgid = from_vfsgid(idmap, fc->user_ns, iattr->ia_vfsgid);
> > > + arg->valid |= FATTR_GID;
> > > + arg->gid = from_kgid(fc->user_ns, fsgid);
> > > + }
> > > +
> > > if (ivalid & ATTR_SIZE)
> > > arg->valid |= FATTR_SIZE, arg->size = iattr->ia_size;
> > > if (ivalid & ATTR_ATIME) {
> > > @@ -1869,8 +1879,8 @@ int fuse_flush_times(struct inode *inode, struct fuse_file *ff)
> > > * vmtruncate() doesn't allow for this case, so do the rlimit checking
> > > * and the actual truncation by hand.
> > > */
> > > -int fuse_do_setattr(struct dentry *dentry, struct iattr *attr,
> > > - struct file *file)
> > > +int fuse_do_setattr(struct mnt_idmap *idmap, struct dentry *dentry,
> > > + struct iattr *attr, struct file *file)
> > > {
> > > struct inode *inode = d_inode(dentry);
> > > struct fuse_mount *fm = get_fuse_mount(inode);
> > > @@ -1890,7 +1900,7 @@ int fuse_do_setattr(struct dentry *dentry, struct iattr *attr,
> > > if (!fc->default_permissions)
> > > attr->ia_valid |= ATTR_FORCE;
> > >
> > > - err = setattr_prepare(&nop_mnt_idmap, dentry, attr);
> > > + err = setattr_prepare(idmap, dentry, attr);
> > > if (err)
> > > return err;
> > >
> > > @@ -1949,7 +1959,7 @@ int fuse_do_setattr(struct dentry *dentry, struct iattr *attr,
> > >
> > > memset(&inarg, 0, sizeof(inarg));
> > > memset(&outarg, 0, sizeof(outarg));
> > > - iattr_to_fattr(fc, attr, &inarg, trust_local_cmtime);
> > > + iattr_to_fattr(idmap, fc, attr, &inarg, trust_local_cmtime);
> > > if (file) {
> > > struct fuse_file *ff = file->private_data;
> > > inarg.valid |= FATTR_FH;
> > > @@ -2084,7 +2094,7 @@ static int fuse_setattr(struct mnt_idmap *idmap, struct dentry *entry,
> > > if (!attr->ia_valid)
> > > return 0;
> > >
> > > - ret = fuse_do_setattr(entry, attr, file);
> > > + ret = fuse_do_setattr(idmap, entry, attr, file);
> > > if (!ret) {
> > > /*
> > > * If filesystem supports acls it may have updated acl xattrs in
> > > diff --git a/fs/fuse/file.c b/fs/fuse/file.c
> > > index a660f1f21540..e0fe5497a548 100644
> > > --- a/fs/fuse/file.c
> > > +++ b/fs/fuse/file.c
> > > @@ -2870,7 +2870,7 @@ static void fuse_do_truncate(struct file *file)
> > > attr.ia_file = file;
> > > attr.ia_valid |= ATTR_FILE;
> > >
> > > - fuse_do_setattr(file_dentry(file), &attr, file);
> > > + fuse_do_setattr(&nop_mnt_idmap, file_dentry(file), &attr, file);
> >
> > Same as for the other patch. Please leave a comment in the commit
> > message that briefly explains why it's ok to pass &nop_mnt_idmap here.
> > It'll help us later. :)
>
> Sure, will be fixed in -v2 ;-)
>
> Explanation here is that in this specific case attr.ia_valid = ATTR_SIZE | ATTR_FILE,
> which but we only need an idmapping for ATTR_UID | ATTR_GID.
>
> From the other side, having struct file pointer means that getting an idmapping as easy as file_mnt_idmap(file),
> and probably it's easier to pass an idmapping in this specific case rather than skipping it for a valid reasons.
> What do you think about this?

Yeah, I'd just pass it through because then we don't have to think about
why we're not passing it through here.

2024-03-05 14:49:04

by Miklos Szeredi

[permalink] [raw]
Subject: Re: [PATCH v1 2/9] fs/fuse: add FUSE_OWNER_UID_GID_EXT extension

On Mon, 8 Jan 2024 at 13:10, Alexander Mikhalitsyn
<[email protected]> wrote:
>
> To properly support vfs idmappings we need to provide
> a fuse daemon with the correct owner uid/gid for
> inode creation requests like mkdir, mknod, atomic_open,
> symlink.
>
> Right now, fuse daemons use req->in.h.uid/req->in.h.gid
> to set inode owner. These fields contain fsuid/fsgid of the
> syscall's caller. And that's perfectly fine, because inode
> owner have to be set to these values. But, for idmapped mounts
> it's not the case and caller fsuid/fsgid != inode owner, because
> idmapped mounts do nothing with the caller fsuid/fsgid, but
> affect inode owner uid/gid. It means that we can't apply vfsid
> mapping to caller fsuid/fsgid, but instead we have to introduce
> a new fields to store inode owner uid/gid which will be appropriately
> transformed.

Does fsuid/fsgid have any meaning to the server?

Shouldn't this just set in.h.uid/in.h.gid to the mapped ids?

Thanks,
Miklos