2021-02-03 12:45:28

by Miklos Szeredi

[permalink] [raw]
Subject: [PATCH 01/18] vfs: add miscattr ops

There's a substantial amount of boilerplate in filesystems handling
FS_IOC_[GS]ETFLAGS/ FS_IOC_FS[GS]ETXATTR ioctls.

Also due to userspace buffers being involved in the ioctl API this is
difficult to stack, as shown by overlayfs issues related to these ioctls.

Introduce a new internal API named "miscattr" (fsxattr can be confused with
xattr, xflags is inappropriate, since this is more than just flags).

There's significant overlap between flags and xflags and this API handles
the conversions automatically, so filesystems may choose which one to use.

In ->miscattr_get() a hint is provided to the filesystem whether flags or
xattr are being requested by userspace, but in this series this hint is
ignored by all filesystems, since generating all the attributes is cheap.

If a filesystem doesn't implemement the miscattr API, just fall back to
f_op->ioctl(). When all filesystems are converted, the fallback can be
removed.

32bit compat ioctls are now handled by the generic code as well.

Signed-off-by: Miklos Szeredi <[email protected]>
---
Documentation/filesystems/locking.rst | 4 +
Documentation/filesystems/vfs.rst | 14 ++
fs/ioctl.c | 329 ++++++++++++++++++++++++++
include/linux/fs.h | 3 +
include/linux/miscattr.h | 52 ++++
5 files changed, 402 insertions(+)
create mode 100644 include/linux/miscattr.h

diff --git a/Documentation/filesystems/locking.rst b/Documentation/filesystems/locking.rst
index c0f2c7586531..5c0a30dd8336 100644
--- a/Documentation/filesystems/locking.rst
+++ b/Documentation/filesystems/locking.rst
@@ -80,6 +80,8 @@ prototypes::
struct file *, unsigned open_flag,
umode_t create_mode);
int (*tmpfile) (struct inode *, struct dentry *, umode_t);
+ int (*miscattr_set)(struct dentry *dentry, struct miscattr *ma);
+ int (*miscattr_get)(struct dentry *dentry, struct miscattr *ma);

locking rules:
all may block
@@ -107,6 +109,8 @@ fiemap: no
update_time: no
atomic_open: shared (exclusive if O_CREAT is set in open flags)
tmpfile: no
+miscattr_get: no or exclusive
+miscattr_set: exclusive
============ =============================================


diff --git a/Documentation/filesystems/vfs.rst b/Documentation/filesystems/vfs.rst
index ca52c82e5bb5..a0ee8fc85678 100644
--- a/Documentation/filesystems/vfs.rst
+++ b/Documentation/filesystems/vfs.rst
@@ -437,6 +437,8 @@ As of kernel 2.6.22, the following members are defined:
int (*atomic_open)(struct inode *, struct dentry *, struct file *,
unsigned open_flag, umode_t create_mode);
int (*tmpfile) (struct inode *, struct dentry *, umode_t);
+ int (*miscattr_set)(struct dentry *dentry, struct miscattr *ma);
+ int (*miscattr_get)(struct dentry *dentry, struct miscattr *ma);
};

Again, all methods are called without any locks being held, unless
@@ -584,6 +586,18 @@ otherwise noted.
atomically creating, opening and unlinking a file in given
directory.

+``miscattr_get``
+ called on ioctl(FS_IOC_GETFLAGS) and ioctl(FS_IOC_FSGETXATTR) to
+ retrieve miscellaneous filesystem flags and attributes. Also
+ called before the relevant SET operation to check what is being
+ changed (in this case with i_rwsem locked exclusive). If unset,
+ then fall back to f_op->ioctl().
+
+``miscattr_set``
+ called on ioctl(FS_IOC_SETFLAGS) and ioctl(FS_IOC_FSSETXATTR) to
+ change miscellaneous filesystem flags and attributes. Callers hold
+ i_rwsem exclusive. If unset, then fall back to f_op->ioctl().
+

The Address Space Object
========================
diff --git a/fs/ioctl.c b/fs/ioctl.c
index 4e6cc0a7d69c..f38f75c82e76 100644
--- a/fs/ioctl.c
+++ b/fs/ioctl.c
@@ -19,6 +19,9 @@
#include <linux/falloc.h>
#include <linux/sched/signal.h>
#include <linux/fiemap.h>
+#include <linux/mount.h>
+#include <linux/fscrypt.h>
+#include <linux/miscattr.h>

#include "internal.h"

@@ -657,6 +660,311 @@ static int ioctl_file_dedupe_range(struct file *file,
return ret;
}

+/**
+ * miscattr_fill_xflags - initialize miscattr with xflags
+ * @ma: miscattr pointer
+ * @xflags: FS_XFLAG_* flags
+ *
+ * Set ->fsx_xflags, ->xattr_valid and ->flags (translated xflags). All
+ * other fields are zeroed.
+ */
+void miscattr_fill_xflags(struct miscattr *ma, u32 xflags)
+{
+ memset(ma, 0, sizeof(*ma));
+ ma->xattr_valid = true;
+ ma->fsx_xflags = xflags;
+ if (ma->fsx_xflags & FS_XFLAG_IMMUTABLE)
+ ma->flags |= FS_IMMUTABLE_FL;
+ if (ma->fsx_xflags & FS_XFLAG_APPEND)
+ ma->flags |= FS_APPEND_FL;
+ if (ma->fsx_xflags & FS_XFLAG_SYNC)
+ ma->flags |= FS_SYNC_FL;
+ if (ma->fsx_xflags & FS_XFLAG_NOATIME)
+ ma->flags |= FS_NOATIME_FL;
+ if (ma->fsx_xflags & FS_XFLAG_NODUMP)
+ ma->flags |= FS_NODUMP_FL;
+ if (ma->fsx_xflags & FS_XFLAG_DAX)
+ ma->flags |= FS_DAX_FL;
+ if (ma->fsx_xflags & FS_XFLAG_PROJINHERIT)
+ ma->flags |= FS_PROJINHERIT_FL;
+}
+EXPORT_SYMBOL(miscattr_fill_xflags);
+
+/**
+ * miscattr_fill_flags - initialize miscattr with flags
+ * @ma: miscattr pointer
+ * @flags: FS_*_FL flags
+ *
+ * Set ->flags, ->flags_valid and ->fsx_xflags (translated flags).
+ * All other fields are zeroed.
+ */
+void miscattr_fill_flags(struct miscattr *ma, u32 flags)
+{
+ memset(ma, 0, sizeof(*ma));
+ ma->flags_valid = true;
+ ma->flags = flags;
+ if (ma->flags & FS_SYNC_FL)
+ ma->fsx_xflags |= FS_XFLAG_SYNC;
+ if (ma->flags & FS_IMMUTABLE_FL)
+ ma->fsx_xflags |= FS_XFLAG_IMMUTABLE;
+ if (ma->flags & FS_APPEND_FL)
+ ma->fsx_xflags |= FS_XFLAG_APPEND;
+ if (ma->flags & FS_NODUMP_FL)
+ ma->fsx_xflags |= FS_XFLAG_NODUMP;
+ if (ma->flags & FS_NOATIME_FL)
+ ma->fsx_xflags |= FS_XFLAG_NOATIME;
+ if (ma->flags & FS_DAX_FL)
+ ma->fsx_xflags |= FS_XFLAG_DAX;
+ if (ma->flags & FS_PROJINHERIT_FL)
+ ma->fsx_xflags |= FS_XFLAG_PROJINHERIT;
+}
+EXPORT_SYMBOL(miscattr_fill_flags);
+
+/**
+ * vfs_miscattr_get - retrieve miscellaneous inode attributes
+ * @dentry: the object to retrieve from
+ * @ma: miscattr pointer
+ *
+ * Call i_op->miscattr_get() callback, if exists.
+ *
+ * Returns 0 on success, or a negative error on failure.
+ */
+int vfs_miscattr_get(struct dentry *dentry, struct miscattr *ma)
+{
+ struct inode *inode = d_inode(dentry);
+
+ if (d_is_special(dentry))
+ return -ENOTTY;
+
+ if (!inode->i_op->miscattr_get)
+ return -ENOIOCTLCMD;
+
+ memset(ma, 0, sizeof(*ma));
+ return inode->i_op->miscattr_get(dentry, ma);
+}
+EXPORT_SYMBOL(vfs_miscattr_get);
+
+/**
+ * fsxattr_copy_to_user - copy fsxattr to userspace.
+ * @ma: miscattr pointer
+ * @ufa: fsxattr user pointer
+ *
+ * Returns 0 on success, or -EFAULT on failure.
+ */
+int fsxattr_copy_to_user(const struct miscattr *ma, struct fsxattr __user *ufa)
+{
+ struct fsxattr fa = {
+ .fsx_xflags = ma->fsx_xflags,
+ .fsx_extsize = ma->fsx_extsize,
+ .fsx_nextents = ma->fsx_nextents,
+ .fsx_projid = ma->fsx_projid,
+ .fsx_cowextsize = ma->fsx_cowextsize,
+ };
+
+ if (copy_to_user(ufa, &fa, sizeof(fa)))
+ return -EFAULT;
+
+ return 0;
+}
+EXPORT_SYMBOL(fsxattr_copy_to_user);
+
+static int fsxattr_copy_from_user(struct miscattr *ma,
+ struct fsxattr __user *ufa)
+{
+ struct fsxattr fa;
+
+ if (copy_from_user(&fa, ufa, sizeof(fa)))
+ return -EFAULT;
+
+ miscattr_fill_xflags(ma, fa.fsx_xflags);
+ ma->fsx_extsize = fa.fsx_extsize;
+ ma->fsx_nextents = fa.fsx_nextents;
+ ma->fsx_projid = fa.fsx_projid;
+ ma->fsx_cowextsize = fa.fsx_cowextsize;
+
+ return 0;
+}
+
+/*
+ * Generic function to check FS_IOC_FSSETXATTR/FS_IOC_SETFLAGS values and reject
+ * any invalid configurations.
+ *
+ * Note: must be called with inode lock held.
+ */
+static int miscattr_set_prepare(struct inode *inode,
+ const struct miscattr *old_ma,
+ struct miscattr *ma)
+{
+ int err;
+
+ /*
+ * The IMMUTABLE and APPEND_ONLY flags can only be changed by
+ * the relevant capability.
+ */
+ if ((ma->flags ^ old_ma->flags) & (FS_APPEND_FL | FS_IMMUTABLE_FL) &&
+ !capable(CAP_LINUX_IMMUTABLE))
+ return -EPERM;
+
+ err = fscrypt_prepare_setflags(inode, old_ma->flags, ma->flags);
+ if (err)
+ return err;
+
+ /*
+ * Project Quota ID state is only allowed to change from within the init
+ * namespace. Enforce that restriction only if we are trying to change
+ * the quota ID state. Everything else is allowed in user namespaces.
+ */
+ if (current_user_ns() != &init_user_ns) {
+ if (old_ma->fsx_projid != ma->fsx_projid)
+ return -EINVAL;
+ if ((old_ma->fsx_xflags ^ ma->fsx_xflags) &
+ FS_XFLAG_PROJINHERIT)
+ return -EINVAL;
+ }
+
+ /* Check extent size hints. */
+ if ((ma->fsx_xflags & FS_XFLAG_EXTSIZE) && !S_ISREG(inode->i_mode))
+ return -EINVAL;
+
+ if ((ma->fsx_xflags & FS_XFLAG_EXTSZINHERIT) &&
+ !S_ISDIR(inode->i_mode))
+ return -EINVAL;
+
+ if ((ma->fsx_xflags & FS_XFLAG_COWEXTSIZE) &&
+ !S_ISREG(inode->i_mode) && !S_ISDIR(inode->i_mode))
+ return -EINVAL;
+
+ /*
+ * It is only valid to set the DAX flag on regular files and
+ * directories on filesystems.
+ */
+ if ((ma->fsx_xflags & FS_XFLAG_DAX) &&
+ !(S_ISREG(inode->i_mode) || S_ISDIR(inode->i_mode)))
+ return -EINVAL;
+
+ /* Extent size hints of zero turn off the flags. */
+ if (ma->fsx_extsize == 0)
+ ma->fsx_xflags &= ~(FS_XFLAG_EXTSIZE | FS_XFLAG_EXTSZINHERIT);
+ if (ma->fsx_cowextsize == 0)
+ ma->fsx_xflags &= ~FS_XFLAG_COWEXTSIZE;
+
+ return 0;
+}
+
+/**
+ * vfs_miscattr_set - change miscellaneous inode attributes
+ * @dentry: the object to change
+ * @ma: miscattr pointer
+ *
+ * After verifying permissions, call i_op->miscattr_set() callback, if
+ * exists.
+ *
+ * Verifying attributes involves retrieving current attributes with
+ * i_op->miscattr_get(), this also allows initilaizing attributes that have
+ * not been set by the caller to current values. Inode lock is held
+ * thoughout to prevent racing with another instance.
+ *
+ * Returns 0 on success, or a negative error on failure.
+ */
+int vfs_miscattr_set(struct dentry *dentry, struct miscattr *ma)
+{
+ struct inode *inode = d_inode(dentry);
+ struct miscattr old_ma = {};
+ int err;
+
+ if (d_is_special(dentry))
+ return -ENOTTY;
+
+ if (!inode->i_op->miscattr_set)
+ return -ENOIOCTLCMD;
+
+ if (!inode_owner_or_capable(inode))
+ return -EPERM;
+
+ inode_lock(inode);
+ err = vfs_miscattr_get(dentry, &old_ma);
+ if (!err) {
+ /* initialize missing bits from old_ma */
+ if (ma->flags_valid) {
+ ma->fsx_xflags |= old_ma.fsx_xflags & ~FS_XFLAG_COMMON;
+ ma->fsx_extsize = old_ma.fsx_extsize;
+ ma->fsx_nextents = old_ma.fsx_nextents;
+ ma->fsx_projid = old_ma.fsx_projid;
+ ma->fsx_cowextsize = old_ma.fsx_cowextsize;
+ } else {
+ ma->flags |= old_ma.flags & ~FS_COMMON_FL;
+ }
+ err = miscattr_set_prepare(inode, &old_ma, ma);
+ if (!err)
+ err = inode->i_op->miscattr_set(dentry, ma);
+ }
+ inode_unlock(inode);
+
+ return err;
+}
+EXPORT_SYMBOL(vfs_miscattr_set);
+
+static int ioctl_getflags(struct file *file, void __user *argp)
+{
+ struct miscattr ma = { .flags_valid = true }; /* hint only */
+ unsigned int flags;
+ int err;
+
+ err = vfs_miscattr_get(file_dentry(file), &ma);
+ if (!err) {
+ flags = ma.flags;
+ if (copy_to_user(argp, &flags, sizeof(flags)))
+ err = -EFAULT;
+ }
+ return err;
+}
+
+static int ioctl_setflags(struct file *file, void __user *argp)
+{
+ struct miscattr ma;
+ unsigned int flags;
+ int err;
+
+ if (copy_from_user(&flags, argp, sizeof(flags)))
+ return -EFAULT;
+
+ err = mnt_want_write_file(file);
+ if (!err) {
+ miscattr_fill_flags(&ma, flags);
+ err = vfs_miscattr_set(file_dentry(file), &ma);
+ mnt_drop_write_file(file);
+ }
+ return err;
+}
+
+static int ioctl_fsgetxattr(struct file *file, void __user *argp)
+{
+ struct miscattr ma = { .xattr_valid = true }; /* hint only */
+ int err;
+
+ err = vfs_miscattr_get(file_dentry(file), &ma);
+ if (!err)
+ err = fsxattr_copy_to_user(&ma, argp);
+
+ return err;
+}
+
+static int ioctl_fssetxattr(struct file *file, void __user *argp)
+{
+ struct miscattr ma;
+ int err;
+
+ err = fsxattr_copy_from_user(&ma, argp);
+ if (!err) {
+ err = mnt_want_write_file(file);
+ if (!err) {
+ err = vfs_miscattr_set(file_dentry(file), &ma);
+ mnt_drop_write_file(file);
+ }
+ }
+ return err;
+}
+
/*
* do_vfs_ioctl() is not for drivers and not intended to be EXPORT_SYMBOL()'d.
* It's just a simple helper for sys_ioctl and compat_sys_ioctl.
@@ -727,6 +1035,18 @@ static int do_vfs_ioctl(struct file *filp, unsigned int fd,
return put_user(i_size_read(inode) - filp->f_pos,
(int __user *)argp);

+ case FS_IOC_GETFLAGS:
+ return ioctl_getflags(filp, argp);
+
+ case FS_IOC_SETFLAGS:
+ return ioctl_setflags(filp, argp);
+
+ case FS_IOC_FSGETXATTR:
+ return ioctl_fsgetxattr(filp, argp);
+
+ case FS_IOC_FSSETXATTR:
+ return ioctl_fssetxattr(filp, argp);
+
default:
if (S_ISREG(inode->i_mode))
return file_ioctl(filp, cmd, argp);
@@ -827,6 +1147,15 @@ COMPAT_SYSCALL_DEFINE3(ioctl, unsigned int, fd, unsigned int, cmd,
break;
#endif

+ /*
+ * These access 32-bit values anyway so no further handling is
+ * necessary.
+ */
+ case FS_IOC32_GETFLAGS:
+ case FS_IOC32_SETFLAGS:
+ cmd = (cmd == FS_IOC32_GETFLAGS) ?
+ FS_IOC_GETFLAGS : FS_IOC_SETFLAGS;
+ fallthrough;
/*
* everything else in do_vfs_ioctl() takes either a compatible
* pointer argument or no argument -- call it with a modified
diff --git a/include/linux/fs.h b/include/linux/fs.h
index fd47deea7c17..ca3bab7d5f6e 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -68,6 +68,7 @@ struct fsverity_info;
struct fsverity_operations;
struct fs_context;
struct fs_parameter_spec;
+struct miscattr;

extern void __init inode_init(void);
extern void __init inode_init_early(void);
@@ -1887,6 +1888,8 @@ struct inode_operations {
umode_t create_mode);
int (*tmpfile) (struct inode *, struct dentry *, umode_t);
int (*set_acl)(struct inode *, struct posix_acl *, int);
+ int (*miscattr_set)(struct dentry *dentry, struct miscattr *ma);
+ int (*miscattr_get)(struct dentry *dentry, struct miscattr *ma);
} ____cacheline_aligned;

static inline ssize_t call_read_iter(struct file *file, struct kiocb *kio,
diff --git a/include/linux/miscattr.h b/include/linux/miscattr.h
new file mode 100644
index 000000000000..1c531779dc9d
--- /dev/null
+++ b/include/linux/miscattr.h
@@ -0,0 +1,52 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+#ifndef _LINUX_MISCATTR_H
+#define _LINUX_MISCATTR_H
+
+/* Flags shared betwen flags/xflags */
+#define FS_COMMON_FL \
+ (FS_SYNC_FL | FS_IMMUTABLE_FL | FS_APPEND_FL | \
+ FS_NODUMP_FL | FS_NOATIME_FL | FS_DAX_FL | \
+ FS_PROJINHERIT_FL)
+
+#define FS_XFLAG_COMMON \
+ (FS_XFLAG_SYNC | FS_XFLAG_IMMUTABLE | FS_XFLAG_APPEND | \
+ FS_XFLAG_NODUMP | FS_XFLAG_NOATIME | FS_XFLAG_DAX | \
+ FS_XFLAG_PROJINHERIT)
+
+struct miscattr {
+ u32 flags; /* flags (FS_IOC_GETFLAGS/FS_IOC_SETFLAGS) */
+ /* struct fsxattr: */
+ u32 fsx_xflags; /* xflags field value (get/set) */
+ u32 fsx_extsize; /* extsize field value (get/set)*/
+ u32 fsx_nextents; /* nextents field value (get) */
+ u32 fsx_projid; /* project identifier (get/set) */
+ u32 fsx_cowextsize; /* CoW extsize field value (get/set)*/
+ /* selectors: */
+ bool flags_valid:1;
+ bool xattr_valid:1;
+};
+
+int fsxattr_copy_to_user(const struct miscattr *ma, struct fsxattr __user *ufa);
+
+void miscattr_fill_xflags(struct miscattr *ma, u32 xflags);
+void miscattr_fill_flags(struct miscattr *ma, u32 flags);
+
+/**
+ * miscattr_has_xattr - check for extentended flags/attributes
+ * @ma: miscattr pointer
+ *
+ * Returns true if any attributes are present that are not represented in
+ * ->flags.
+ */
+static inline bool miscattr_has_xattr(const struct miscattr *ma)
+{
+ return ma->xattr_valid &&
+ ((ma->fsx_xflags & ~FS_XFLAG_COMMON) || ma->fsx_extsize != 0 ||
+ ma->fsx_projid != 0 || ma->fsx_cowextsize != 0);
+}
+
+int vfs_miscattr_get(struct dentry *dentry, struct miscattr *ma);
+int vfs_miscattr_set(struct dentry *dentry, struct miscattr *ma);
+
+#endif /* _LINUX_MISCATTR_H */
--
2.26.2


2021-02-03 15:16:24

by Jan Kara

[permalink] [raw]
Subject: Re: [PATCH 01/18] vfs: add miscattr ops

Hi Miklos!

On Wed 03-02-21 13:40:55, Miklos Szeredi wrote:
> There's a substantial amount of boilerplate in filesystems handling
> FS_IOC_[GS]ETFLAGS/ FS_IOC_FS[GS]ETXATTR ioctls.
>
> Also due to userspace buffers being involved in the ioctl API this is
> difficult to stack, as shown by overlayfs issues related to these ioctls.
>
> Introduce a new internal API named "miscattr" (fsxattr can be confused with
> xattr, xflags is inappropriate, since this is more than just flags).
>
> There's significant overlap between flags and xflags and this API handles
> the conversions automatically, so filesystems may choose which one to use.
>
> In ->miscattr_get() a hint is provided to the filesystem whether flags or
> xattr are being requested by userspace, but in this series this hint is
> ignored by all filesystems, since generating all the attributes is cheap.
>
> If a filesystem doesn't implemement the miscattr API, just fall back to
> f_op->ioctl(). When all filesystems are converted, the fallback can be
> removed.
>
> 32bit compat ioctls are now handled by the generic code as well.
>
> Signed-off-by: Miklos Szeredi <[email protected]>

Getting rid of the boilerplate code and improving stacking of these calls
look like a nice goal to me :) Some technical comments below.

> +/**
> + * miscattr_fill_xflags - initialize miscattr with xflags
> + * @ma: miscattr pointer
> + * @xflags: FS_XFLAG_* flags
> + *
> + * Set ->fsx_xflags, ->xattr_valid and ->flags (translated xflags). All
> + * other fields are zeroed.
> + */
> +void miscattr_fill_xflags(struct miscattr *ma, u32 xflags)

Maybe call this miscattr_fill_from_xflags() and the next function
miscattr_fill_from_flags()? At least to me it would be clearer when I want
to use which function just by looking at the name...

> +{
> + memset(ma, 0, sizeof(*ma));
> + ma->xattr_valid = true;
> + ma->fsx_xflags = xflags;
> + if (ma->fsx_xflags & FS_XFLAG_IMMUTABLE)
> + ma->flags |= FS_IMMUTABLE_FL;
> + if (ma->fsx_xflags & FS_XFLAG_APPEND)
> + ma->flags |= FS_APPEND_FL;
> + if (ma->fsx_xflags & FS_XFLAG_SYNC)
> + ma->flags |= FS_SYNC_FL;
> + if (ma->fsx_xflags & FS_XFLAG_NOATIME)
> + ma->flags |= FS_NOATIME_FL;
> + if (ma->fsx_xflags & FS_XFLAG_NODUMP)
> + ma->flags |= FS_NODUMP_FL;
> + if (ma->fsx_xflags & FS_XFLAG_DAX)
> + ma->flags |= FS_DAX_FL;
> + if (ma->fsx_xflags & FS_XFLAG_PROJINHERIT)
> + ma->flags |= FS_PROJINHERIT_FL;
> +}
> +EXPORT_SYMBOL(miscattr_fill_xflags);
> +
> +/**
> + * miscattr_fill_flags - initialize miscattr with flags
> + * @ma: miscattr pointer
> + * @flags: FS_*_FL flags
> + *
> + * Set ->flags, ->flags_valid and ->fsx_xflags (translated flags).
> + * All other fields are zeroed.
> + */
> +void miscattr_fill_flags(struct miscattr *ma, u32 flags)
> +{
> + memset(ma, 0, sizeof(*ma));
> + ma->flags_valid = true;
> + ma->flags = flags;
> + if (ma->flags & FS_SYNC_FL)
> + ma->fsx_xflags |= FS_XFLAG_SYNC;
> + if (ma->flags & FS_IMMUTABLE_FL)
> + ma->fsx_xflags |= FS_XFLAG_IMMUTABLE;
> + if (ma->flags & FS_APPEND_FL)
> + ma->fsx_xflags |= FS_XFLAG_APPEND;
> + if (ma->flags & FS_NODUMP_FL)
> + ma->fsx_xflags |= FS_XFLAG_NODUMP;
> + if (ma->flags & FS_NOATIME_FL)
> + ma->fsx_xflags |= FS_XFLAG_NOATIME;
> + if (ma->flags & FS_DAX_FL)
> + ma->fsx_xflags |= FS_XFLAG_DAX;
> + if (ma->flags & FS_PROJINHERIT_FL)
> + ma->fsx_xflags |= FS_XFLAG_PROJINHERIT;
> +}
> +EXPORT_SYMBOL(miscattr_fill_flags);
> +
> +/**
> + * vfs_miscattr_get - retrieve miscellaneous inode attributes
> + * @dentry: the object to retrieve from
> + * @ma: miscattr pointer
> + *
> + * Call i_op->miscattr_get() callback, if exists.
> + *
> + * Returns 0 on success, or a negative error on failure.
> + */
> +int vfs_miscattr_get(struct dentry *dentry, struct miscattr *ma)
> +{
> + struct inode *inode = d_inode(dentry);
> +
> + if (d_is_special(dentry))
> + return -ENOTTY;
> +
> + if (!inode->i_op->miscattr_get)
> + return -ENOIOCTLCMD;
> +
> + memset(ma, 0, sizeof(*ma));

So here we clear whole 'ma' but callers already set e.g. xattr_valid field
and cleared the 'ma' as well which just looks silly...

> + return inode->i_op->miscattr_get(dentry, ma);
> +}
> +EXPORT_SYMBOL(vfs_miscattr_get);

...

> +/*
> + * Generic function to check FS_IOC_FSSETXATTR/FS_IOC_SETFLAGS values and reject
> + * any invalid configurations.
> + *
> + * Note: must be called with inode lock held.
> + */
> +static int miscattr_set_prepare(struct inode *inode,
> + const struct miscattr *old_ma,
> + struct miscattr *ma)
> +{
> + int err;
> +
> + /*
> + * The IMMUTABLE and APPEND_ONLY flags can only be changed by
> + * the relevant capability.
> + */
> + if ((ma->flags ^ old_ma->flags) & (FS_APPEND_FL | FS_IMMUTABLE_FL) &&
> + !capable(CAP_LINUX_IMMUTABLE))
> + return -EPERM;
> +
> + err = fscrypt_prepare_setflags(inode, old_ma->flags, ma->flags);
> + if (err)
> + return err;
> +
> + /*
> + * Project Quota ID state is only allowed to change from within the init
> + * namespace. Enforce that restriction only if we are trying to change
> + * the quota ID state. Everything else is allowed in user namespaces.
> + */
> + if (current_user_ns() != &init_user_ns) {
> + if (old_ma->fsx_projid != ma->fsx_projid)
> + return -EINVAL;
> + if ((old_ma->fsx_xflags ^ ma->fsx_xflags) &
> + FS_XFLAG_PROJINHERIT)
> + return -EINVAL;
> + }
> +
> + /* Check extent size hints. */
> + if ((ma->fsx_xflags & FS_XFLAG_EXTSIZE) && !S_ISREG(inode->i_mode))
> + return -EINVAL;
> +
> + if ((ma->fsx_xflags & FS_XFLAG_EXTSZINHERIT) &&
> + !S_ISDIR(inode->i_mode))
> + return -EINVAL;
> +
> + if ((ma->fsx_xflags & FS_XFLAG_COWEXTSIZE) &&
> + !S_ISREG(inode->i_mode) && !S_ISDIR(inode->i_mode))
> + return -EINVAL;
> +
> + /*
> + * It is only valid to set the DAX flag on regular files and
> + * directories on filesystems.
> + */
> + if ((ma->fsx_xflags & FS_XFLAG_DAX) &&
> + !(S_ISREG(inode->i_mode) || S_ISDIR(inode->i_mode)))
> + return -EINVAL;
> +
> + /* Extent size hints of zero turn off the flags. */
> + if (ma->fsx_extsize == 0)
> + ma->fsx_xflags &= ~(FS_XFLAG_EXTSIZE | FS_XFLAG_EXTSZINHERIT);
> + if (ma->fsx_cowextsize == 0)
> + ma->fsx_xflags &= ~FS_XFLAG_COWEXTSIZE;
> +
> + return 0;
> +}
> +
> +/**
> + * vfs_miscattr_set - change miscellaneous inode attributes
> + * @dentry: the object to change
> + * @ma: miscattr pointer
> + *
> + * After verifying permissions, call i_op->miscattr_set() callback, if
> + * exists.
> + *
> + * Verifying attributes involves retrieving current attributes with
> + * i_op->miscattr_get(), this also allows initilaizing attributes that have
> + * not been set by the caller to current values. Inode lock is held
> + * thoughout to prevent racing with another instance.
> + *
> + * Returns 0 on success, or a negative error on failure.
> + */
> +int vfs_miscattr_set(struct dentry *dentry, struct miscattr *ma)
> +{
> + struct inode *inode = d_inode(dentry);
> + struct miscattr old_ma = {};
> + int err;
> +
> + if (d_is_special(dentry))
> + return -ENOTTY;
> +
> + if (!inode->i_op->miscattr_set)
> + return -ENOIOCTLCMD;
> +
> + if (!inode_owner_or_capable(inode))
> + return -EPERM;
> +
> + inode_lock(inode);
> + err = vfs_miscattr_get(dentry, &old_ma);
> + if (!err) {
> + /* initialize missing bits from old_ma */
> + if (ma->flags_valid) {
> + ma->fsx_xflags |= old_ma.fsx_xflags & ~FS_XFLAG_COMMON;
> + ma->fsx_extsize = old_ma.fsx_extsize;
> + ma->fsx_nextents = old_ma.fsx_nextents;
> + ma->fsx_projid = old_ma.fsx_projid;
> + ma->fsx_cowextsize = old_ma.fsx_cowextsize;
> + } else {
> + ma->flags |= old_ma.flags & ~FS_COMMON_FL;
> + }
> + err = miscattr_set_prepare(inode, &old_ma, ma);
> + if (!err)
> + err = inode->i_op->miscattr_set(dentry, ma);

So I somewhat wonder here - not all filesystems support all the xflags or
other extended attributes. Currently these would be just silently ignored
AFAICT. Which seems a bit dangerous to me - most notably because it makes
future extensions of these filesystems difficult. So how are we going to go
about this? Is every filesystem supposed to check what it supports and
refuse other stuff (but currently e.g. your ext2 conversion patch doesn't do
that AFAICT)? Shouldn't we make things easier for filesystems to provide a
bitmask of changing fields (instead of flags / xflags bools) so that they
can refuse unsupported stuff with a single mask check?

To make things more complex, ext2/4 has traditionally silently cleared
unknown flags for setflags but not for setxflags. Unlike e.g. XFS which
refuses unknown flags.

Honza
--
Jan Kara <[email protected]>
SUSE Labs, CR

2021-02-03 15:24:59

by Miklos Szeredi

[permalink] [raw]
Subject: Re: [PATCH 01/18] vfs: add miscattr ops

On Wed, Feb 3, 2021 at 4:05 PM Jan Kara <[email protected]> wrote:

[...]
> > +/**
> > + * miscattr_fill_xflags - initialize miscattr with xflags
> > + * @ma: miscattr pointer
> > + * @xflags: FS_XFLAG_* flags
> > + *
> > + * Set ->fsx_xflags, ->xattr_valid and ->flags (translated xflags). All
> > + * other fields are zeroed.
> > + */
> > +void miscattr_fill_xflags(struct miscattr *ma, u32 xflags)
>
> Maybe call this miscattr_fill_from_xflags() and the next function
> miscattr_fill_from_flags()? At least to me it would be clearer when I want
> to use which function just by looking at the name...

Yes, more clarity for the cost of a longer name. I'm not sure...

[...]
> > +/**
> > + * vfs_miscattr_get - retrieve miscellaneous inode attributes
> > + * @dentry: the object to retrieve from
> > + * @ma: miscattr pointer
> > + *
> > + * Call i_op->miscattr_get() callback, if exists.
> > + *
> > + * Returns 0 on success, or a negative error on failure.
> > + */
> > +int vfs_miscattr_get(struct dentry *dentry, struct miscattr *ma)
> > +{
> > + struct inode *inode = d_inode(dentry);
> > +
> > + if (d_is_special(dentry))
> > + return -ENOTTY;
> > +
> > + if (!inode->i_op->miscattr_get)
> > + return -ENOIOCTLCMD;
> > +
> > + memset(ma, 0, sizeof(*ma));
>
> So here we clear whole 'ma' but callers already set e.g. xattr_valid field
> and cleared the 'ma' as well which just looks silly...

Well spotted. Fixed.

[...]
> > +/**
> > + * vfs_miscattr_set - change miscellaneous inode attributes
> > + * @dentry: the object to change
> > + * @ma: miscattr pointer
> > + *
> > + * After verifying permissions, call i_op->miscattr_set() callback, if
> > + * exists.
> > + *
> > + * Verifying attributes involves retrieving current attributes with
> > + * i_op->miscattr_get(), this also allows initilaizing attributes that have
> > + * not been set by the caller to current values. Inode lock is held
> > + * thoughout to prevent racing with another instance.
> > + *
> > + * Returns 0 on success, or a negative error on failure.
> > + */
> > +int vfs_miscattr_set(struct dentry *dentry, struct miscattr *ma)
> > +{
> > + struct inode *inode = d_inode(dentry);
> > + struct miscattr old_ma = {};
> > + int err;
> > +
> > + if (d_is_special(dentry))
> > + return -ENOTTY;
> > +
> > + if (!inode->i_op->miscattr_set)
> > + return -ENOIOCTLCMD;
> > +
> > + if (!inode_owner_or_capable(inode))
> > + return -EPERM;
> > +
> > + inode_lock(inode);
> > + err = vfs_miscattr_get(dentry, &old_ma);
> > + if (!err) {
> > + /* initialize missing bits from old_ma */
> > + if (ma->flags_valid) {
> > + ma->fsx_xflags |= old_ma.fsx_xflags & ~FS_XFLAG_COMMON;
> > + ma->fsx_extsize = old_ma.fsx_extsize;
> > + ma->fsx_nextents = old_ma.fsx_nextents;
> > + ma->fsx_projid = old_ma.fsx_projid;
> > + ma->fsx_cowextsize = old_ma.fsx_cowextsize;
> > + } else {
> > + ma->flags |= old_ma.flags & ~FS_COMMON_FL;
> > + }
> > + err = miscattr_set_prepare(inode, &old_ma, ma);
> > + if (!err)
> > + err = inode->i_op->miscattr_set(dentry, ma);
>
> So I somewhat wonder here - not all filesystems support all the xflags or
> other extended attributes. Currently these would be just silently ignored
> AFAICT. Which seems a bit dangerous to me - most notably because it makes
> future extensions of these filesystems difficult. So how are we going to go
> about this? Is every filesystem supposed to check what it supports and
> refuse other stuff (but currently e.g. your ext2 conversion patch doesn't do
> that AFAICT)? Shouldn't we make things easier for filesystems to provide a
> bitmask of changing fields (instead of flags / xflags bools) so that they
> can refuse unsupported stuff with a single mask check?

Ah, ext2 one is missing miscattr_has_xattr() check and doesn't use the
miscattr_fill_flags() helper. It was one of the earlier fs I
converted, and the API wasn't so refined then. Fixed.

Will review all conversions too for this type of omission.

Creating a mask instead of bool makes sense, I'll look into this.

> To make things more complex, ext2/4 has traditionally silently cleared
> unknown flags for setflags but not for setxflags. Unlike e.g. XFS which
> refuses unknown flags.

Right. Not sure if this can be fixed. Documenting rules and
exceptions should be a first step.

Thanks,
Miklos

2021-03-23 00:25:11

by Eric Biggers

[permalink] [raw]
Subject: Re: [PATCH 01/18] vfs: add miscattr ops

On Wed, Feb 03, 2021 at 01:40:55PM +0100, Miklos Szeredi wrote:
> + * Verifying attributes involves retrieving current attributes with
> + * i_op->miscattr_get(), this also allows initilaizing attributes that have

initilaizing => initializing

> +int vfs_miscattr_set(struct dentry *dentry, struct miscattr *ma)
> +{
> + struct inode *inode = d_inode(dentry);
> + struct miscattr old_ma = {};
> + int err;
> +
> + if (d_is_special(dentry))
> + return -ENOTTY;
> +
> + if (!inode->i_op->miscattr_set)
> + return -ENOIOCTLCMD;
> +
> + if (!inode_owner_or_capable(inode))
> + return -EPERM;

Shouldn't this be EACCES, not EPERM?

> +/**
> + * miscattr_has_xattr - check for extentended flags/attributes

extentended => extended

- Eric

2021-03-25 14:55:37

by Miklos Szeredi

[permalink] [raw]
Subject: Re: [PATCH 01/18] vfs: add miscattr ops

On Tue, Mar 23, 2021 at 1:24 AM Eric Biggers <[email protected]> wrote:

> > +int vfs_miscattr_set(struct dentry *dentry, struct miscattr *ma)
> > +{
> > + struct inode *inode = d_inode(dentry);
> > + struct miscattr old_ma = {};
> > + int err;
> > +
> > + if (d_is_special(dentry))
> > + return -ENOTTY;
> > +
> > + if (!inode->i_op->miscattr_set)
> > + return -ENOIOCTLCMD;
> > +
> > + if (!inode_owner_or_capable(inode))
> > + return -EPERM;
>
> Shouldn't this be EACCES, not EPERM?

$ git diff master.. | grep -C1 inode_owner_or_capable | grep
"^-.*\(EPERM\|EACCES\)" | cut -d- -f3 | sort | uniq -c
12 EACCES;
4 EPERM;

So EACCES would win if this was a democracy. However:

"[EACCES]
Permission denied. An attempt was made to access a file in a way
forbidden by its file access permissions."

"[EPERM]
Operation not permitted. An attempt was made to perform an operation
limited to processes with appropriate privileges or to the owner of a
file or other resource."

The EPERM description matches the semantics of
inode_owner_or_capable() exactly. It's a pretty clear choice.

Thanks,
Miklos

2021-03-25 15:22:12

by Al Viro

[permalink] [raw]
Subject: Re: [PATCH 01/18] vfs: add miscattr ops

On Thu, Mar 25, 2021 at 03:52:28PM +0100, Miklos Szeredi wrote:
> On Tue, Mar 23, 2021 at 1:24 AM Eric Biggers <[email protected]> wrote:
>
> > > +int vfs_miscattr_set(struct dentry *dentry, struct miscattr *ma)
> > > +{
> > > + struct inode *inode = d_inode(dentry);
> > > + struct miscattr old_ma = {};
> > > + int err;
> > > +
> > > + if (d_is_special(dentry))
> > > + return -ENOTTY;
> > > +
> > > + if (!inode->i_op->miscattr_set)
> > > + return -ENOIOCTLCMD;
> > > +
> > > + if (!inode_owner_or_capable(inode))
> > > + return -EPERM;
> >
> > Shouldn't this be EACCES, not EPERM?
>
> $ git diff master.. | grep -C1 inode_owner_or_capable | grep
> "^-.*\(EPERM\|EACCES\)" | cut -d- -f3 | sort | uniq -c
> 12 EACCES;
> 4 EPERM;
>
> So EACCES would win if this was a democracy. However:
>
> "[EACCES]
> Permission denied. An attempt was made to access a file in a way
> forbidden by its file access permissions."
>
> "[EPERM]
> Operation not permitted. An attempt was made to perform an operation
> limited to processes with appropriate privileges or to the owner of a
> file or other resource."
>
> The EPERM description matches the semantics of
> inode_owner_or_capable() exactly. It's a pretty clear choice.

Except that existing implementation (e.g. for ext2) gives -EACCES here...
OTOH, EPERM matches the behaviour of chown(2), as well as that of
*BSD chflags(2), which is the best match to functionality (setting and
clearing immutable/append-only/etc.)

So I'd probably go with EPERM, and watched for userland breakage;
if something *does* rely upon the historical EACCES here, we might
have to restore that.