2020-07-08 03:11:12

by Daniel Rosenberg

Subject: [PATCH v11 2/4] fs: Add standard casefolding support

This adds general supporting functions for filesystems that use
utf8 casefolding. It provides standard dentry_operations and adds the
necessary structures in struct super_block to allow this standardization.

The new dentry operations are functionally equivalent to the existing
operations in ext4 and f2fs, apart from the use of utf8_casefold_hash to
avoid an allocation.

By providing a common implementation, all users can benefit from any
optimizations without needing to port over improvements.

Signed-off-by: Daniel Rosenberg <[email protected]>
---
fs/libfs.c | 94 ++++++++++++++++++++++++++++++++++++++++++++++
include/linux/fs.h | 16 ++++++++
2 files changed, 110 insertions(+)

diff --git a/fs/libfs.c b/fs/libfs.c
index 4d08edf19c78..fe22e2be6f7a 100644
--- a/fs/libfs.c
+++ b/fs/libfs.c
@@ -20,6 +20,8 @@
#include <linux/fs_context.h>
#include <linux/pseudo_fs.h>
#include <linux/fsnotify.h>
+#include <linux/unicode.h>
+#include <linux/fscrypt.h>

#include <linux/uaccess.h>

@@ -1363,3 +1365,95 @@ bool is_empty_dir_inode(struct inode *inode)
return (inode->i_fop == &empty_dir_operations) &&
(inode->i_op == &empty_dir_inode_operations);
}
+
+#ifdef CONFIG_UNICODE
+/*
+ * Determine if the name of a dentry should be casefolded.
+ *
+ * Return: if names will need casefolding
+ */
+static bool needs_casefold(const struct inode *dir)
+{
+	return IS_CASEFOLDED(dir) && dir->i_sb->s_encoding;
+}
+
+/**
+ * generic_ci_d_compare - generic d_compare implementation for casefolding filesystems
+ * @dentry: dentry whose name we are checking against
+ * @len: len of name of dentry
+ * @str: str pointer to name of dentry
+ * @name: Name to compare against
+ *
+ * Return: 0 if names match, 1 if mismatch, or -ERRNO
+ */
+int generic_ci_d_compare(const struct dentry *dentry, unsigned int len,
+			 const char *str, const struct qstr *name)
+{
+	const struct dentry *parent = READ_ONCE(dentry->d_parent);
+	const struct inode *inode = READ_ONCE(parent->d_inode);
+	const struct super_block *sb = dentry->d_sb;
+	const struct unicode_map *um = sb->s_encoding;
+	struct qstr qstr = QSTR_INIT(str, len);
+	char strbuf[DNAME_INLINE_LEN];
+	int ret;
+
+	if (!inode || !needs_casefold(inode))
+		goto fallback;
+	/*
+	 * If the dentry name is stored in-line, then it may be concurrently
+	 * modified by a rename. If this happens, the VFS will eventually retry
+	 * the lookup, so it doesn't matter what ->d_compare() returns.
+	 * However, it's unsafe to call utf8_strncasecmp() with an unstable
+	 * string. Therefore, we have to copy the name into a temporary buffer.
+	 */
+	if (len <= DNAME_INLINE_LEN - 1) {
+		memcpy(strbuf, str, len);
+		strbuf[len] = 0;
+		qstr.name = strbuf;
+		/* prevent compiler from optimizing out the temporary buffer */
+		barrier();
+	}
+	ret = utf8_strncasecmp(um, name, &qstr);
+	if (ret >= 0)
+		return ret;
+
+	if (sb_has_strict_encoding(sb))
+		return -EINVAL;
+fallback:
+	if (len != name->len)
+		return 1;
+	return !!memcmp(str, name->name, len);
+}
+EXPORT_SYMBOL(generic_ci_d_compare);
+
+/**
+ * generic_ci_d_hash - generic d_hash implementation for casefolding filesystems
+ * @dentry: dentry whose name we are hashing
+ * @str: qstr of name whose hash we should fill in
+ *
+ * Return: 0 if hash was successful, or -ERRNO
+ */
+int generic_ci_d_hash(const struct dentry *dentry, struct qstr *str)
+{
+	const struct inode *inode = READ_ONCE(dentry->d_inode);
+	struct super_block *sb = dentry->d_sb;
+	const struct unicode_map *um = sb->s_encoding;
+	int ret = 0;
+
+	if (!inode || !needs_casefold(inode))
+		return 0;
+
+	ret = utf8_casefold_hash(um, dentry, str);
+	if (ret < 0)
+		goto err;
+
+	return 0;
+err:
+	if (sb_has_strict_encoding(sb))
+		ret = -EINVAL;
+	else
+		ret = 0;
+	return ret;
+}
+EXPORT_SYMBOL(generic_ci_d_hash);
+#endif
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 3f881a892ea7..af8f2ecec8ff 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1392,6 +1392,12 @@ extern int send_sigurg(struct fown_struct *fown);
#define SB_ACTIVE (1<<30)
#define SB_NOUSER (1<<31)

+/* These flags relate to encoding and casefolding */
+#define SB_ENC_STRICT_MODE_FL (1 << 0)
+
+#define sb_has_strict_encoding(sb) \
+	(sb->s_encoding_flags & SB_ENC_STRICT_MODE_FL)
+
/*
* Umount options
*/
@@ -1461,6 +1467,10 @@ struct super_block {
#endif
#ifdef CONFIG_FS_VERITY
const struct fsverity_operations *s_vop;
+#endif
+#ifdef CONFIG_UNICODE
+	struct unicode_map *s_encoding;
+	__u16 s_encoding_flags;
#endif
struct hlist_bl_head s_roots; /* alternate root dentries for NFS */
struct list_head s_mounts; /* list of mounts; _not_ for fs use */
@@ -3385,6 +3395,12 @@ extern int generic_file_fsync(struct file *, loff_t, loff_t, int);

extern int generic_check_addressable(unsigned, u64);

+#ifdef CONFIG_UNICODE
+extern int generic_ci_d_hash(const struct dentry *dentry, struct qstr *str);
+extern int generic_ci_d_compare(const struct dentry *dentry, unsigned int len,
+ const char *str, const struct qstr *name);
+#endif
+
#ifdef CONFIG_MIGRATION
extern int buffer_migrate_page(struct address_space *,
struct page *, struct page *,
--
2.27.0.383.g050319c2ae-goog
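
Illustrative note: the control flow of generic_ci_d_compare() above can be modelled in user space. The sketch below is not kernel code. model_ci_compare(), ascii_casecmp() and INLINE_LEN are hypothetical stand-ins for generic_ci_d_compare(), utf8_strncasecmp() and DNAME_INLINE_LEN, and the stand-in only folds ASCII and treats any byte >= 0x80 as invalid, which is an assumption made to exercise the error path. It shows the three outcomes: casefolded match, strict-mode error, and byte-wise fallback.

```c
#include <ctype.h>
#include <errno.h>
#include <stdbool.h>
#include <string.h>

#define INLINE_LEN 32	/* hypothetical stand-in for DNAME_INLINE_LEN */

/*
 * Hypothetical stand-in for utf8_strncasecmp(): returns 0 on match,
 * 1 on mismatch, -EINVAL if either string has a byte >= 0x80.
 */
static int ascii_casecmp(const char *a, size_t alen,
			 const char *b, size_t blen)
{
	size_t i;

	for (i = 0; i < alen; i++)
		if ((unsigned char)a[i] >= 0x80)
			return -EINVAL;
	for (i = 0; i < blen; i++)
		if ((unsigned char)b[i] >= 0x80)
			return -EINVAL;
	if (alen != blen)
		return 1;
	for (i = 0; i < alen; i++)
		if (tolower((unsigned char)a[i]) != tolower((unsigned char)b[i]))
			return 1;
	return 0;
}

/* Models the decision flow only; locking and RCU concerns are not modelled. */
static int model_ci_compare(bool casefolded, bool strict,
			    const char *str, size_t len,
			    const char *name, size_t name_len)
{
	char buf[INLINE_LEN];
	const char *stable = str;
	int ret;

	if (!casefolded)
		goto fallback;
	/* Short names are copied so the comparison sees a stable string. */
	if (len <= INLINE_LEN - 1) {
		memcpy(buf, str, len);
		buf[len] = 0;
		stable = buf;
	}
	ret = ascii_casecmp(name, name_len, stable, len);
	if (ret >= 0)
		return ret;
	if (strict)
		return -EINVAL;
fallback:
	/* Exact byte comparison, as for a non-casefolded directory. */
	if (len != name_len)
		return 1;
	return !!memcmp(str, name, len);
}
```

In this model, an invalid name in a non-strict setup is not an error: it is compared byte for byte, so it can still match itself exactly.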


2020-07-08 04:19:26

by Eric Biggers

Subject: Re: [PATCH v11 2/4] fs: Add standard casefolding support

On Tue, Jul 07, 2020 at 08:05:50PM -0700, Daniel Rosenberg wrote:
> +/**
> + * generic_ci_d_compare - generic d_compare implementation for casefolding filesystems
> + * @dentry: dentry whose name we are checking against
> + * @len: len of name of dentry
> + * @str: str pointer to name of dentry
> + * @name: Name to compare against
> + *
> + * Return: 0 if names match, 1 if mismatch, or -ERRNO
> + */
> +int generic_ci_d_compare(const struct dentry *dentry, unsigned int len,
> + const char *str, const struct qstr *name)
> +{
> + const struct dentry *parent = READ_ONCE(dentry->d_parent);
> + const struct inode *inode = READ_ONCE(parent->d_inode);

How about calling the 'inode' variable 'dir' instead?

That would help avoid confusion about what is the directory and what is a file
in the directory.

Likewise in generic_ci_d_hash().

> +/**
> + * generic_ci_d_hash - generic d_hash implementation for casefolding filesystems
> + * @dentry: dentry whose name we are hashing

This comment for @dentry needs to be updated.

It's the parent dentry, not the dentry whose name we are hashing.

> + * @str: qstr of name whose hash we should fill in
> + *
> + * Return: 0 if hash was successful, or -ERRNO

As I mentioned on v9, this can also return 0 if the hashing was not done because
it wants to fall back to the standard hashing. Can you please fix the comment?

> +int generic_ci_d_hash(const struct dentry *dentry, struct qstr *str)
> +{
> + const struct inode *inode = READ_ONCE(dentry->d_inode);
> + struct super_block *sb = dentry->d_sb;
> + const struct unicode_map *um = sb->s_encoding;
> + int ret = 0;
> +
> + if (!inode || !needs_casefold(inode))
> + return 0;
> +
> + ret = utf8_casefold_hash(um, dentry, str);
> + if (ret < 0)
> + goto err;
> +
> + return 0;
> +err:
> + if (sb_has_strict_encoding(sb))
> + ret = -EINVAL;
> + else
> + ret = 0;
> + return ret;
> +}

On v9, Gabriel suggested simplifying this to:

	ret = utf8_casefold_hash(um, dentry, str);
	if (ret < 0 && sb_has_enc_strict_mode(sb))
		return -EINVAL;
	return 0;

Any reason not to do that?

- Eric
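
Illustrative note: the two error-handling forms discussed above can be compared in a small user-space sketch. The function names below are hypothetical. hash_ret models the return value of utf8_casefold_hash() and strict models the strict-encoding check. The suggestion is quoted with the name sb_has_enc_strict_mode() while the patch defines sb_has_strict_encoding(), and the sketch assumes both refer to the same predicate.

```c
#include <errno.h>
#include <stdbool.h>

/* Error handling as posted in v11, using a goto-based error path. */
static int d_hash_result_v11(int hash_ret, bool strict)
{
	int ret = hash_ret;

	if (ret < 0)
		goto err;

	return 0;
err:
	if (strict)
		ret = -EINVAL;
	else
		ret = 0;
	return ret;
}

/* Simplified form from the review suggestion. */
static int d_hash_result_simplified(int hash_ret, bool strict)
{
	if (hash_ret < 0 && strict)
		return -EINVAL;
	return 0;
}
```

Under these assumptions both forms return -EINVAL only when hashing failed and strict mode is set, and 0 in every other case, so the simplification does not change behaviour.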

2020-07-08 08:37:58

by Daniel Rosenberg

Subject: Re: [PATCH v11 2/4] fs: Add standard casefolding support

On Tue, Jul 7, 2020 at 9:12 PM Eric Biggers <[email protected]> wrote:
>
> On Tue, Jul 07, 2020 at 08:05:50PM -0700, Daniel Rosenberg wrote:
> > +/**
> > + * generic_ci_d_compare - generic d_compare implementation for casefolding filesystems
> > + * @dentry: dentry whose name we are checking against
> > + * @len: len of name of dentry
> > + * @str: str pointer to name of dentry
> > + * @name: Name to compare against
> > + *
> > + * Return: 0 if names match, 1 if mismatch, or -ERRNO
> > + */
> > +int generic_ci_d_compare(const struct dentry *dentry, unsigned int len,
> > + const char *str, const struct qstr *name)
> > +{
> > + const struct dentry *parent = READ_ONCE(dentry->d_parent);
> > + const struct inode *inode = READ_ONCE(parent->d_inode);
>
> How about calling the 'inode' variable 'dir' instead?
>
> That would help avoid confusion about what is the directory and what is a file
> in the directory.
>
> Likewise in generic_ci_d_hash().
>
> > +/**
> > + * generic_ci_d_hash - generic d_hash implementation for casefolding filesystems
> > + * @dentry: dentry whose name we are hashing
>
> This comment for @dentry needs to be updated.
>
> It's the parent dentry, not the dentry whose name we are hashing.
>
> > + * @str: qstr of name whose hash we should fill in
> > + *
> > + * Return: 0 if hash was successful, or -ERRNO
>
> As I mentioned on v9, this can also return 0 if the hashing was not done because
> it wants to fall back to the standard hashing. Can you please fix the comment?
>
> > +int generic_ci_d_hash(const struct dentry *dentry, struct qstr *str)
> > +{
> > + const struct inode *inode = READ_ONCE(dentry->d_inode);
> > + struct super_block *sb = dentry->d_sb;
> > + const struct unicode_map *um = sb->s_encoding;
> > + int ret = 0;
> > +
> > + if (!inode || !needs_casefold(inode))
> > + return 0;
> > +
> > + ret = utf8_casefold_hash(um, dentry, str);
> > + if (ret < 0)
> > + goto err;
> > +
> > + return 0;
> > +err:
> > + if (sb_has_strict_encoding(sb))
> > + ret = -EINVAL;
> > + else
> > + ret = 0;
> > + return ret;
> > +}
>
> On v9, Gabriel suggested simplifying this to:
>
>	ret = utf8_casefold_hash(um, dentry, str);
>	if (ret < 0 && sb_has_enc_strict_mode(sb))
>		return -EINVAL;
>	return 0;
>
> Any reason not to do that?
>
> - Eric

Guh, I remember making those changes, must've lost them in a rebase :(
I'll resend shortly.
-Daniel