2021-03-23 20:02:10

by André Almeida

[permalink] [raw]
Subject: [RFC PATCH 2/4] mm: shmem: Support case-insensitive file name lookups

This patch implements the support for case-insensitive file name lookups
in tmpfs, based on the encoding passed in the mount options.

A filesystem that has the casefold feature set is able to configure
directories with the +F (TMPFS_CASEFOLD_FL) attribute, enabling lookups
to succeed in that directory in a case-insensitive fashion, i.e: match
a directory entry even if the name used by userspace is not a byte per
byte match with the disk name, but is an equivalent case-insensitive
version of the Unicode string. This operation is called a
case-insensitive file name lookup.

The feature is configured as an inode attribute applied to directories
and inherited by its children. This attribute can only be enabled on
empty directories for filesystems that support the encoding feature,
thus preventing collision of file names that only differ by case.

* dcache handling:

For a +F directory, tmpfs only stores the first equivalent name dentry
used in the dcache. This is done to prevent unintentional duplication of
dentries in the dcache, while also allowing the VFS code to quickly find
the right entry in the cache despite which equivalent string was used in
a previous lookup, without having to resort to ->lookup().

d_hash() of casefolded directories is implemented as the hash of the
casefolded string, such that we always have a well-known bucket for all
the equivalencies of the same string. d_compare() uses the
utf8_strncasecmp() infrastructure, which handles the comparison of
equivalent, same case, names as well.

For now, negative lookups are not inserted in the dcache, since they
would need to be invalidated anyway, because we can't trust missing file
dentries. This is bad for performance but requires some leveraging of
the VFS layer to fix. We can live without that for now, and so does
everyone else.

The lookup() path at tmpfs creates negatives dentries, that are later
instantiated if the file is created. In that way, all files in tmpfs
have a dentry given that the filesystem exists exclusively in memory.
As explained above, we don't have negative dentries for casefold files,
so dentries are created at lookup() iff files aren't casefolded. Else,
the dentry is created just before being instantiated at create path.
At the remove path, dentries are invalidated for casefolded files.

* Dealing with invalid sequences:

By default, when an invalid UTF-8 sequence is identified, tmpfs will treat
it as an opaque byte sequence, ignoring the encoding and reverting to
the old behavior for that unique file. This means that case-insensitive
file name lookup will not work only for that file. An optional flag
(cf_strict) can be set in the mount arguments telling the filesystem
code and userspace tools to enforce the encoding. When that optional
flag is set, any attempt to create a file name using an invalid UTF-8
sequence will fail and return an error to userspace.

Signed-off-by: André Almeida <[email protected]>
---
include/linux/shmem_fs.h | 1 +
mm/shmem.c | 91 +++++++++++++++++++++++++++++++++++++++-
2 files changed, 90 insertions(+), 2 deletions(-)

diff --git a/include/linux/shmem_fs.h b/include/linux/shmem_fs.h
index d82b6f396588..29ee64352807 100644
--- a/include/linux/shmem_fs.h
+++ b/include/linux/shmem_fs.h
@@ -43,6 +43,7 @@ struct shmem_sb_info {
spinlock_t shrinklist_lock; /* Protects shrinklist */
struct list_head shrinklist; /* List of shinkable inodes */
unsigned long shrinklist_len; /* Length of shrinklist */
+ bool casefold; /* If this mount point supports casefolding */
};

static inline struct shmem_inode_info *SHMEM_I(struct inode *inode)
diff --git a/mm/shmem.c b/mm/shmem.c
index b2db4ed0fbc7..20df81763995 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -38,6 +38,7 @@
#include <linux/hugetlb.h>
#include <linux/frontswap.h>
#include <linux/fs_parser.h>
+#include <linux/unicode.h>

#include <asm/tlbflush.h> /* for arch/microblaze update_mmu_cache() */

@@ -117,6 +118,8 @@ struct shmem_options {
bool full_inums;
int huge;
int seen;
+ struct unicode_map *encoding;
+ bool cf_strict;
#define SHMEM_SEEN_BLOCKS 1
#define SHMEM_SEEN_INODES 2
#define SHMEM_SEEN_HUGE 4
@@ -161,6 +164,13 @@ static inline struct shmem_sb_info *SHMEM_SB(struct super_block *sb)
return sb->s_fs_info;
}

+#ifdef CONFIG_UNICODE
+static const struct dentry_operations casefold_dentry_ops = {
+ .d_hash = generic_ci_d_hash,
+ .d_compare = generic_ci_d_compare,
+};
+#endif
+
/*
* shmem_file_setup pre-accounts the whole fixed size of a VM object,
* for shared memory and for shared anonymous (/dev/zero) mappings
@@ -2859,8 +2869,18 @@ shmem_mknod(struct user_namespace *mnt_userns, struct inode *dir,
struct inode *inode;
int error = -ENOSPC;

+#ifdef CONFIG_UNICODE
+ struct super_block *sb = dir->i_sb;
+
+ if (sb_has_strict_encoding(sb) && IS_CASEFOLDED(dir) &&
+ sb->s_encoding && utf8_validate(sb->s_encoding, &dentry->d_name))
+ return -EINVAL;
+#endif
+
inode = shmem_get_inode(dir->i_sb, dir, mode, dev, VM_NORESERVE);
if (inode) {
+ inode->i_flags |= dir->i_flags;
+
error = simple_acl_create(dir, inode);
if (error)
goto out_iput;
@@ -2870,6 +2890,9 @@ shmem_mknod(struct user_namespace *mnt_userns, struct inode *dir,
if (error && error != -EOPNOTSUPP)
goto out_iput;

+ if (IS_CASEFOLDED(dir))
+ d_add(dentry, NULL);
+
error = 0;
dir->i_size += BOGO_DIRENT_SIZE;
dir->i_ctime = dir->i_mtime = current_time(dir);
@@ -2925,6 +2948,19 @@ static int shmem_create(struct user_namespace *mnt_userns, struct inode *dir,
return shmem_mknod(&init_user_ns, dir, dentry, mode | S_IFREG, 0);
}

+static struct dentry *shmem_lookup(struct inode *dir, struct dentry *dentry, unsigned int flags)
+{
+ if (dentry->d_name.len > NAME_MAX)
+ return ERR_PTR(-ENAMETOOLONG);
+
+ if (IS_CASEFOLDED(dir))
+ return NULL;
+
+ d_add(dentry, NULL);
+
+ return NULL;
+}
+
/*
* Link a file..
*/
@@ -2946,6 +2982,9 @@ static int shmem_link(struct dentry *old_dentry, struct inode *dir, struct dentr
goto out;
}

+ if (IS_CASEFOLDED(dir))
+ d_add(dentry, NULL);
+
dir->i_size += BOGO_DIRENT_SIZE;
inode->i_ctime = dir->i_ctime = dir->i_mtime = current_time(inode);
inc_nlink(inode);
@@ -2967,6 +3006,10 @@ static int shmem_unlink(struct inode *dir, struct dentry *dentry)
inode->i_ctime = dir->i_ctime = dir->i_mtime = current_time(inode);
drop_nlink(inode);
dput(dentry); /* Undo the count from "create" - this does all the work */
+
+ if (IS_CASEFOLDED(dir))
+ d_invalidate(dentry);
+
return 0;
}

@@ -3128,6 +3171,8 @@ static int shmem_symlink(struct user_namespace *mnt_userns, struct inode *dir,
}
dir->i_size += BOGO_DIRENT_SIZE;
dir->i_ctime = dir->i_mtime = current_time(dir);
+ if (IS_CASEFOLDED(dir))
+ d_add(dentry, NULL);
d_instantiate(dentry, inode);
dget(dentry);
return 0;
@@ -3364,6 +3409,8 @@ enum shmem_param {
Opt_uid,
Opt_inode32,
Opt_inode64,
+ Opt_casefold,
+ Opt_cf_strict,
};

static const struct constant_table shmem_param_enums_huge[] = {
@@ -3385,6 +3432,8 @@ const struct fs_parameter_spec shmem_fs_parameters[] = {
fsparam_u32 ("uid", Opt_uid),
fsparam_flag ("inode32", Opt_inode32),
fsparam_flag ("inode64", Opt_inode64),
+ fsparam_string("casefold", Opt_casefold),
+ fsparam_flag ("cf_strict", Opt_cf_strict),
{}
};

@@ -3392,9 +3441,11 @@ static int shmem_parse_one(struct fs_context *fc, struct fs_parameter *param)
{
struct shmem_options *ctx = fc->fs_private;
struct fs_parse_result result;
+ struct unicode_map *encoding;
unsigned long long size;
+ char version[10];
char *rest;
- int opt;
+ int opt, ret;

opt = fs_parse(fc, shmem_fs_parameters, param, &result);
if (opt < 0)
@@ -3468,6 +3519,23 @@ static int shmem_parse_one(struct fs_context *fc, struct fs_parameter *param)
ctx->full_inums = true;
ctx->seen |= SHMEM_SEEN_INUMS;
break;
+ case Opt_casefold:
+ if (strncmp(param->string, "utf8-", 5))
+ return invalfc(fc, "Only utf8 encondings are supported");
+ ret = strscpy(version, param->string + 5, sizeof(version));
+ if (ret < 0)
+ return invalfc(fc, "Invalid enconding argument: %s", param->string);
+
+ encoding = utf8_load(version);
+ if (IS_ERR(encoding))
+ return invalfc(fc, "Invalid utf8 version: %s", version);
+ pr_info("tmpfs: Using encoding defined by mount options: %s\n",
+ param->string);
+ ctx->encoding = encoding;
+ break;
+ case Opt_cf_strict:
+ ctx->cf_strict = true;
+ break;
}
return 0;

@@ -3646,6 +3714,11 @@ static void shmem_put_super(struct super_block *sb)
{
struct shmem_sb_info *sbinfo = SHMEM_SB(sb);

+#ifdef CONFIG_UNICODE
+ if (sbinfo->casefold)
+ utf8_unload(sb->s_encoding);
+#endif
+
free_percpu(sbinfo->ino_batch);
percpu_counter_destroy(&sbinfo->used_blocks);
mpol_put(sbinfo->mpol);
@@ -3686,6 +3759,18 @@ static int shmem_fill_super(struct super_block *sb, struct fs_context *fc)
}
sb->s_export_op = &shmem_export_ops;
sb->s_flags |= SB_NOSEC;
+
+#ifdef CONFIG_UNICODE
+ if (ctx->encoding) {
+ sb->s_d_op = &casefold_dentry_ops;
+ sb->s_encoding = ctx->encoding;
+ if (ctx->cf_strict)
+ sb->s_encoding_flags = SB_ENC_STRICT_MODE_FL;
+ sbinfo->casefold = true;
+ } else if (ctx->cf_strict) {
+ pr_warn("tmpfs: casefold strict mode enabled without encoding, ignoring\n");
+ }
+#endif /* CONFIG_UNICODE */
#else
sb->s_flags |= SB_NOUSER;
#endif
@@ -3846,7 +3931,7 @@ static const struct inode_operations shmem_inode_operations = {
static const struct inode_operations shmem_dir_inode_operations = {
#ifdef CONFIG_TMPFS
.create = shmem_create,
- .lookup = simple_lookup,
+ .lookup = shmem_lookup,
.link = shmem_link,
.unlink = shmem_unlink,
.symlink = shmem_symlink,
@@ -3912,6 +3997,8 @@ int shmem_init_fs_context(struct fs_context *fc)
ctx->mode = 0777 | S_ISVTX;
ctx->uid = current_fsuid();
ctx->gid = current_fsgid();
+ ctx->encoding = NULL;
+ ctx->cf_strict = false;

fc->fs_private = ctx;
fc->ops = &shmem_fs_context_ops;
--
2.31.0


2021-03-23 20:20:54

by Gabriel Krisman Bertazi

[permalink] [raw]
Subject: Re: [RFC PATCH 2/4] mm: shmem: Support case-insensitive file name lookups

André Almeida <[email protected]> writes:

> This patch implements the support for case-insensitive file name lookups
> in tmpfs, based on the encoding passed in the mount options.

Thanks for doing this.

>
> +#ifdef CONFIG_UNICODE
> +static const struct dentry_operations casefold_dentry_ops = {
> + .d_hash = generic_ci_d_hash,
> + .d_compare = generic_ci_d_compare,
> +};
> +#endif

Why not reuse struct generic_ci_dentry_ops ?

> +
> /*
> * shmem_file_setup pre-accounts the whole fixed size of a VM object,
> * for shared memory and for shared anonymous (/dev/zero) mappings
> @@ -2859,8 +2869,18 @@ shmem_mknod(struct user_namespace *mnt_userns, struct inode *dir,
> struct inode *inode;
> int error = -ENOSPC;
>
> +#ifdef CONFIG_UNICODE
> + struct super_block *sb = dir->i_sb;
> +
> + if (sb_has_strict_encoding(sb) && IS_CASEFOLDED(dir) &&
> + sb->s_encoding && utf8_validate(sb->s_encoding, &dentry->d_name))
> + return -EINVAL;
> +#endif
> +
> inode = shmem_get_inode(dir->i_sb, dir, mode, dev, VM_NORESERVE);
> if (inode) {
> + inode->i_flags |= dir->i_flags;
> +
> error = simple_acl_create(dir, inode);
> if (error)
> goto out_iput;
> @@ -2870,6 +2890,9 @@ shmem_mknod(struct user_namespace *mnt_userns, struct inode *dir,
> if (error && error != -EOPNOTSUPP)
> goto out_iput;
>
> + if (IS_CASEFOLDED(dir))
> + d_add(dentry, NULL);
> +
> error = 0;
> dir->i_size += BOGO_DIRENT_SIZE;
> dir->i_ctime = dir->i_mtime = current_time(dir);
> @@ -2925,6 +2948,19 @@ static int shmem_create(struct user_namespace *mnt_userns, struct inode *dir,
> return shmem_mknod(&init_user_ns, dir, dentry, mode | S_IFREG, 0);
> }
>
> +static struct dentry *shmem_lookup(struct inode *dir, struct dentry *dentry, unsigned int flags)
> +{
> + if (dentry->d_name.len > NAME_MAX)
> + return ERR_PTR(-ENAMETOOLONG);
> +
> + if (IS_CASEFOLDED(dir))
> + return NULL;

I think this deserves a comment explaining why it is necessary.

> +
> + d_add(dentry, NULL);
> +
> + return NULL;
> +}
> +
> /*
> * Link a file..
> */
> @@ -2946,6 +2982,9 @@ static int shmem_link(struct dentry *old_dentry, struct inode *dir, struct dentr
> goto out;
> }
>
> + if (IS_CASEFOLDED(dir))
> + d_add(dentry, NULL);
> +
> dir->i_size += BOGO_DIRENT_SIZE;
> inode->i_ctime = dir->i_ctime = dir->i_mtime = current_time(inode);
> inc_nlink(inode);
> @@ -2967,6 +3006,10 @@ static int shmem_unlink(struct inode *dir, struct dentry *dentry)
> inode->i_ctime = dir->i_ctime = dir->i_mtime = current_time(inode);
> drop_nlink(inode);
> dput(dentry); /* Undo the count from "create" - this does all the work */
> +
> + if (IS_CASEFOLDED(dir))
> + d_invalidate(dentry);
> +
> return 0;
> }
>
> @@ -3128,6 +3171,8 @@ static int shmem_symlink(struct user_namespace *mnt_userns, struct inode *dir,
> }
> dir->i_size += BOGO_DIRENT_SIZE;
> dir->i_ctime = dir->i_mtime = current_time(dir);
> + if (IS_CASEFOLDED(dir))
> + d_add(dentry, NULL);
> d_instantiate(dentry, inode);
> dget(dentry);
> return 0;
> @@ -3364,6 +3409,8 @@ enum shmem_param {
> Opt_uid,
> Opt_inode32,
> Opt_inode64,
> + Opt_casefold,
> + Opt_cf_strict,
> };
>
> static const struct constant_table shmem_param_enums_huge[] = {
> @@ -3385,6 +3432,8 @@ const struct fs_parameter_spec shmem_fs_parameters[] = {
> fsparam_u32 ("uid", Opt_uid),
> fsparam_flag ("inode32", Opt_inode32),
> fsparam_flag ("inode64", Opt_inode64),
> + fsparam_string("casefold", Opt_casefold),
> + fsparam_flag ("cf_strict", Opt_cf_strict),
> {}
> };
>
> @@ -3392,9 +3441,11 @@ static int shmem_parse_one(struct fs_context *fc, struct fs_parameter *param)
> {
> struct shmem_options *ctx = fc->fs_private;
> struct fs_parse_result result;
> + struct unicode_map *encoding;
> unsigned long long size;
> + char version[10];
> char *rest;
> - int opt;
> + int opt, ret;
>
> opt = fs_parse(fc, shmem_fs_parameters, param, &result);
> if (opt < 0)
> @@ -3468,6 +3519,23 @@ static int shmem_parse_one(struct fs_context *fc, struct fs_parameter *param)
> ctx->full_inums = true;
> ctx->seen |= SHMEM_SEEN_INUMS;
> break;
> + case Opt_casefold:
> + if (strncmp(param->string, "utf8-", 5))
> + return invalfc(fc, "Only utf8 encondings are supported");
> + ret = strscpy(version, param->string + 5, sizeof(version));

Ugh. Now we are doing two strscpy for the parse api (in unicode_load).
Can change the unicode_load api to reuse it?

thanks,

--
Gabriel Krisman Bertazi

2021-03-24 20:22:06

by André Almeida

[permalink] [raw]
Subject: Re: [RFC PATCH 2/4] mm: shmem: Support case-insensitive file name lookups

Às 17:18 de 23/03/21, Gabriel Krisman Bertazi escreveu:
> André Almeida <[email protected]> writes:
>> opt = fs_parse(fc, shmem_fs_parameters, param, &result);
>> if (opt < 0)
>> @@ -3468,6 +3519,23 @@ static int shmem_parse_one(struct fs_context *fc, struct fs_parameter *param)
>> ctx->full_inums = true;
>> ctx->seen |= SHMEM_SEEN_INUMS;
>> break;
>> + case Opt_casefold:
>> + if (strncmp(param->string, "utf8-", 5))
>> + return invalfc(fc, "Only utf8 encondings are supported");
>> + ret = strscpy(version, param->string + 5, sizeof(version));
>
> Ugh. Now we are doing two strscpy for the parse api (in unicode_load).
> Can change the unicode_load api to reuse it?
>

So instead of getting just the version number (e.g. "12.1.0") as
parameter, utf8_load/unicode_load would get the full encoding string
(e.g. "utf8-12.1.0") right?

2021-03-24 21:36:20

by Al Viro

[permalink] [raw]
Subject: Re: [RFC PATCH 2/4] mm: shmem: Support case-insensitive file name lookups

On Tue, Mar 23, 2021 at 04:59:39PM -0300, Andr? Almeida wrote:

> * dcache handling:
>
> For a +F directory, tmpfs only stores the first equivalent name dentry
> used in the dcache. This is done to prevent unintentional duplication of
> dentries in the dcache, while also allowing the VFS code to quickly find
> the right entry in the cache despite which equivalent string was used in
> a previous lookup, without having to resort to ->lookup().
>
> d_hash() of casefolded directories is implemented as the hash of the
> casefolded string, such that we always have a well-known bucket for all
> the equivalencies of the same string. d_compare() uses the
> utf8_strncasecmp() infrastructure, which handles the comparison of
> equivalent, same case, names as well.
>
> For now, negative lookups are not inserted in the dcache, since they
> would need to be invalidated anyway, because we can't trust missing file
> dentries. This is bad for performance but requires some leveraging of
> the VFS layer to fix. We can live without that for now, and so does
> everyone else.

"For now"? Not a single practical suggestion has ever materialized.
Pardon me, but by now I'm very sceptical about the odds of that
ever changing. And no, I don't have any suggestions either.

> The lookup() path at tmpfs creates negatives dentries, that are later
> instantiated if the file is created. In that way, all files in tmpfs
> have a dentry given that the filesystem exists exclusively in memory.
> As explained above, we don't have negative dentries for casefold files,
> so dentries are created at lookup() iff files aren't casefolded. Else,
> the dentry is created just before being instantiated at create path.
> At the remove path, dentries are invalidated for casefolded files.

Umm... What happens to those assertions if previously sane directory
gets case-buggered? You've got an ioctl for doing just that...
Incidentally, that ioctl is obviously racy - result of that simple_empty()
might have nothing to do with reality before it is returned to caller.
And while we are at it, simple_empty() doesn't check a damn thing about
negative dentries in there...

2021-03-25 03:25:17

by André Almeida

[permalink] [raw]
Subject: Re: [RFC PATCH 2/4] mm: shmem: Support case-insensitive file name lookups

Hi Al Viro,

Às 20:19 de 23/03/21, Al Viro escreveu:
> On Tue, Mar 23, 2021 at 04:59:39PM -0300, André Almeida wrote:
>
>> * dcache handling:
>>
>> For now, negative lookups are not inserted in the dcache, since they
>> would need to be invalidated anyway, because we can't trust missing file
>> dentries. This is bad for performance but requires some leveraging of
>> the VFS layer to fix. We can live without that for now, and so does
>> everyone else.
>
> "For now"? Not a single practical suggestion has ever materialized.
> Pardon me, but by now I'm very sceptical about the odds of that
> ever changing. And no, I don't have any suggestions either.

Right, I'll reword this to reflect that there's no expectation that this
will be done, while keeping documented this performance issue.

>
>> The lookup() path at tmpfs creates negatives dentries, that are later
>> instantiated if the file is created. In that way, all files in tmpfs
>> have a dentry given that the filesystem exists exclusively in memory.
>> As explained above, we don't have negative dentries for casefold files,
>> so dentries are created at lookup() iff files aren't casefolded. Else,
>> the dentry is created just before being instantiated at create path.
>> At the remove path, dentries are invalidated for casefolded files.
>
> Umm... What happens to those assertions if previously sane directory
> gets case-buggered? You've got an ioctl for doing just that...
> Incidentally, that ioctl is obviously racy - result of that simple_empty()
> might have nothing to do with reality before it is returned to caller.
> And while we are at it, simple_empty() doesn't check a damn thing about
> negative dentries in there...
>

Thanks for pointing those issues. I'll move my lock at IOCTL to make
impossible to change directory attributes and add a file there at the
same time. About the negative dentries that existed before at that
directory, I believe the way to solve this is by invalidating them all.
How that sound to you?

Thanks,
André