2022-03-22 03:30:40

by Gabriel Krisman Bertazi

[permalink] [raw]
Subject: [PATCH 0/5] Clean up the case-insenstive lookup path

The case-insensitive implementations in f2fs and ext4 have quite a bit
of duplicated code. This series simplifies the ext4 version, with the
goal (not completed) of extracting ext4_ci_compare into a helper library
that can be used by both filesystems.

While there, I noticed we can leverage the utf8 functions to detect
encoded names that are corrupted in the filesystem. The final patch
adds an ext4 error on that scenario, to mark the filesystem as
corrupted.

This series survived passes of xfstests -g quick.

Gabriel Krisman Bertazi (5):
ext4: Match the f2fs ci_compare implementation
ext4: Simplify the handling of chached insensitive names
ext4: Implement ci comparison using fscrypt_name
ext4: Simplify hash check on ext4_match
ext4: Log error when lookup of encoded dentry fails

fs/ext4/ext4.h | 2 +-
fs/ext4/namei.c | 110 +++++++++++++++++++++++-----------------
include/linux/fscrypt.h | 4 ++
3 files changed, 68 insertions(+), 48 deletions(-)

--
2.35.1


2022-03-22 03:30:50

by Gabriel Krisman Bertazi

[permalink] [raw]
Subject: [PATCH 1/5] ext4: Match the f2fs ci_compare implementation

ext4_ci_compare originally follows utf8_*_strcmp, which means return
zero on match. This means that every usage of that in ext4 negates
the return.

Turn it into a predicate function, let it follow the kernel convention
and return true on match, which means it's now the same as its f2fs
counterpart and can be extracted into generic code.

This change also makes it more obvious that we are ignoring error
handling in ext4_match, which can occur since casefolding support (bad
utf8 name due to disk corruption on strict mode causes -EINVAL) and
casefold+encryption (-ENOMEM). For now, keep the behavior. It is
handled by the following patches.

While we are there, change the comment to the kernel-doc style.

Signed-off-by: Gabriel Krisman Bertazi <[email protected]>
---
fs/ext4/namei.c | 62 +++++++++++++++++++++++++++++++++----------------
1 file changed, 42 insertions(+), 20 deletions(-)

diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c
index 8cf0a924a49b..24ea3bb446d0 100644
--- a/fs/ext4/namei.c
+++ b/fs/ext4/namei.c
@@ -1318,13 +1318,20 @@ static void dx_insert_block(struct dx_frame *frame, u32 hash, ext4_lblk_t block)
}

#if IS_ENABLED(CONFIG_UNICODE)
-/*
+/**
+ * ext4_ci_compare() - Match (case-insensitive) a name with a dirent.
+ * @parent: Inode of the parent of the dentry.
+ * @name: name under lookup.
+ * @de_name: Dirent name.
+ * @de_name_len: dirent name length.
+ * @quick: whether @name is already casefolded.
+ *
* Test whether a case-insensitive directory entry matches the filename
- * being searched for. If quick is set, assume the name being looked up
- * is already in the casefolded form.
+ * being searched. If quick is set, the @name being looked up is
+ * already in the casefolded form.
*
- * Returns: 0 if the directory entry matches, more than 0 if it
- * doesn't match or less than zero on error.
+ * Return: > 0 if the directory entry matches, 0 if it doesn't match, or
+ * < 0 on error.
*/
static int ext4_ci_compare(const struct inode *parent, const struct qstr *name,
u8 *de_name, size_t de_name_len, bool quick)
@@ -1333,7 +1340,7 @@ static int ext4_ci_compare(const struct inode *parent, const struct qstr *name,
const struct unicode_map *um = sb->s_encoding;
struct fscrypt_str decrypted_name = FSTR_INIT(NULL, de_name_len);
struct qstr entry = QSTR_INIT(de_name, de_name_len);
- int ret;
+ int ret, match = false;

if (IS_ENCRYPTED(parent)) {
const struct fscrypt_str encrypted_name =
@@ -1354,20 +1361,22 @@ static int ext4_ci_compare(const struct inode *parent, const struct qstr *name,
ret = utf8_strncasecmp_folded(um, name, &entry);
else
ret = utf8_strncasecmp(um, name, &entry);
- if (ret < 0) {
- /* Handle invalid character sequence as either an error
- * or as an opaque byte sequence.
+
+ if (!ret)
+ match = true;
+ else if (ret < 0 && !sb_has_strict_encoding(sb)) {
+ /*
+ * In non-strict mode, fallback to a byte comparison if
+ * the names have invalid characters.
*/
- if (sb_has_strict_encoding(sb))
- ret = -EINVAL;
- else if (name->len != entry.len)
- ret = 1;
- else
- ret = !!memcmp(name->name, entry.name, entry.len);
+ ret = 0;
+ match = ((name->len == entry.len) &&
+ !memcmp(name->name, entry.name, entry.len));
}
+
out:
kfree(decrypted_name.name);
- return ret;
+ return (ret >= 0) ? match : ret;
}

int ext4_fname_setup_ci_filename(struct inode *dir, const struct qstr *iname,
@@ -1418,6 +1427,7 @@ static bool ext4_match(struct inode *parent,
struct ext4_dir_entry_2 *de)
{
struct fscrypt_name f;
+ int ret;

if (!de->inode)
return false;
@@ -1442,11 +1452,23 @@ static bool ext4_match(struct inode *parent,
return false;
}
}
- return !ext4_ci_compare(parent, &cf, de->name,
- de->name_len, true);
+ ret = ext4_ci_compare(parent, &cf, de->name,
+ de->name_len, true);
+ } else {
+ ret = ext4_ci_compare(parent, fname->usr_fname,
+ de->name, de->name_len, false);
}
- return !ext4_ci_compare(parent, fname->usr_fname, de->name,
- de->name_len, false);
+
+ if (ret < 0) {
+ /*
+ * Treat comparison errors as not a match. The
+ * only case where it happens is on a disk
+ * corruption or ENOMEM.
+ */
+ return false;
+ }
+ return ret;
+
}
#endif

--
2.35.1

2022-03-22 03:30:54

by Gabriel Krisman Bertazi

[permalink] [raw]
Subject: [PATCH 2/5] ext4: Simplify the handling of chached insensitive names

Keeping it as qstr avoids the unnecessary conversion in ext4_match

Signed-off-by: Gabriel Krisman Bertazi <[email protected]>
---
fs/ext4/ext4.h | 2 +-
fs/ext4/namei.c | 23 +++++++++++------------
2 files changed, 12 insertions(+), 13 deletions(-)

diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index bcd3b9bf8069..46e729ce7b35 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -2484,7 +2484,7 @@ struct ext4_filename {
struct fscrypt_str crypto_buf;
#endif
#if IS_ENABLED(CONFIG_UNICODE)
- struct fscrypt_str cf_name;
+ struct qstr cf_name;
#endif
};

diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c
index 24ea3bb446d0..8976e5a28c73 100644
--- a/fs/ext4/namei.c
+++ b/fs/ext4/namei.c
@@ -1382,28 +1382,29 @@ static int ext4_ci_compare(const struct inode *parent, const struct qstr *name,
int ext4_fname_setup_ci_filename(struct inode *dir, const struct qstr *iname,
struct ext4_filename *name)
{
- struct fscrypt_str *cf_name = &name->cf_name;
+ struct qstr *cf_name = &name->cf_name;
+ unsigned char *buf;
struct dx_hash_info *hinfo = &name->hinfo;
int len;

if (!IS_CASEFOLDED(dir) || !dir->i_sb->s_encoding ||
(IS_ENCRYPTED(dir) && !fscrypt_has_encryption_key(dir))) {
- cf_name->name = NULL;
+ name->cf_name.name = NULL;
return 0;
}

- cf_name->name = kmalloc(EXT4_NAME_LEN, GFP_NOFS);
- if (!cf_name->name)
+ buf = kmalloc(EXT4_NAME_LEN, GFP_NOFS);
+ if (!buf)
return -ENOMEM;

- len = utf8_casefold(dir->i_sb->s_encoding,
- iname, cf_name->name,
- EXT4_NAME_LEN);
+ len = utf8_casefold(dir->i_sb->s_encoding, iname, buf, EXT4_NAME_LEN);
if (len <= 0) {
- kfree(cf_name->name);
- cf_name->name = NULL;
+ kfree(buf);
+ buf = NULL;
}
+ cf_name->name = buf;
cf_name->len = (unsigned) len;
+
if (!IS_ENCRYPTED(dir))
return 0;

@@ -1442,8 +1443,6 @@ static bool ext4_match(struct inode *parent,
if (parent->i_sb->s_encoding && IS_CASEFOLDED(parent) &&
(!IS_ENCRYPTED(parent) || fscrypt_has_encryption_key(parent))) {
if (fname->cf_name.name) {
- struct qstr cf = {.name = fname->cf_name.name,
- .len = fname->cf_name.len};
if (IS_ENCRYPTED(parent)) {
if (fname->hinfo.hash != EXT4_DIRENT_HASH(de) ||
fname->hinfo.minor_hash !=
@@ -1452,7 +1451,7 @@ static bool ext4_match(struct inode *parent,
return false;
}
}
- ret = ext4_ci_compare(parent, &cf, de->name,
+ ret = ext4_ci_compare(parent, &fname->cf_name, de->name,
de->name_len, true);
} else {
ret = ext4_ci_compare(parent, fname->usr_fname,
--
2.35.1

2022-03-22 03:31:13

by Gabriel Krisman Bertazi

[permalink] [raw]
Subject: [PATCH 3/5] ext4: Implement ci comparison using fscrypt_name

By using fscrypt_name here, we can hide most of the caching casefold
logic from ext4. The condition in ext4_match is now quite redundant,
but this is addressed in the next patch.

Signed-off-by: Gabriel Krisman Bertazi <[email protected]>
---
fs/ext4/namei.c | 26 ++++++++++++--------------
include/linux/fscrypt.h | 4 ++++
2 files changed, 16 insertions(+), 14 deletions(-)

diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c
index 8976e5a28c73..71b4b05fae89 100644
--- a/fs/ext4/namei.c
+++ b/fs/ext4/namei.c
@@ -1321,10 +1321,9 @@ static void dx_insert_block(struct dx_frame *frame, u32 hash, ext4_lblk_t block)
/**
* ext4_ci_compare() - Match (case-insensitive) a name with a dirent.
* @parent: Inode of the parent of the dentry.
- * @name: name under lookup.
+ * @fname: name under lookup.
* @de_name: Dirent name.
* @de_name_len: dirent name length.
- * @quick: whether @name is already casefolded.
*
* Test whether a case-insensitive directory entry matches the filename
* being searched. If quick is set, the @name being looked up is
@@ -1333,8 +1332,9 @@ static void dx_insert_block(struct dx_frame *frame, u32 hash, ext4_lblk_t block)
* Return: > 0 if the directory entry matches, 0 if it doesn't match, or
* < 0 on error.
*/
-static int ext4_ci_compare(const struct inode *parent, const struct qstr *name,
- u8 *de_name, size_t de_name_len, bool quick)
+static int ext4_ci_compare(const struct inode *parent,
+ const struct fscrypt_name *fname,
+ u8 *de_name, size_t de_name_len)
{
const struct super_block *sb = parent->i_sb;
const struct unicode_map *um = sb->s_encoding;
@@ -1357,10 +1357,10 @@ static int ext4_ci_compare(const struct inode *parent, const struct qstr *name,
entry.len = decrypted_name.len;
}

- if (quick)
- ret = utf8_strncasecmp_folded(um, name, &entry);
+ if (fname->cf_name.name)
+ ret = utf8_strncasecmp_folded(um, &fname->cf_name, &entry);
else
- ret = utf8_strncasecmp(um, name, &entry);
+ ret = utf8_strncasecmp(um, fname->usr_fname, &entry);

if (!ret)
match = true;
@@ -1370,8 +1370,8 @@ static int ext4_ci_compare(const struct inode *parent, const struct qstr *name,
* the names have invalid characters.
*/
ret = 0;
- match = ((name->len == entry.len) &&
- !memcmp(name->name, entry.name, entry.len));
+ match = ((fname->usr_fname->len == entry.len) &&
+ !memcmp(fname->usr_fname->name, entry.name, entry.len));
}

out:
@@ -1440,6 +1440,8 @@ static bool ext4_match(struct inode *parent,
#endif

#if IS_ENABLED(CONFIG_UNICODE)
+ f.cf_name = fname->cf_name;
+
if (parent->i_sb->s_encoding && IS_CASEFOLDED(parent) &&
(!IS_ENCRYPTED(parent) || fscrypt_has_encryption_key(parent))) {
if (fname->cf_name.name) {
@@ -1451,13 +1453,9 @@ static bool ext4_match(struct inode *parent,
return false;
}
}
- ret = ext4_ci_compare(parent, &fname->cf_name, de->name,
- de->name_len, true);
- } else {
- ret = ext4_ci_compare(parent, fname->usr_fname,
- de->name, de->name_len, false);
}

+ ret = ext4_ci_compare(parent, &f, de->name, de->name_len);
if (ret < 0) {
/*
* Treat comparison errors as not a match. The
diff --git a/include/linux/fscrypt.h b/include/linux/fscrypt.h
index 91ea9477e9bd..5dc4b3c805e4 100644
--- a/include/linux/fscrypt.h
+++ b/include/linux/fscrypt.h
@@ -36,6 +36,10 @@ struct fscrypt_name {
u32 minor_hash;
struct fscrypt_str crypto_buf;
bool is_nokey_name;
+
+#ifdef CONFIG_UNICODE
+ struct qstr cf_name;
+#endif
};

#define FSTR_INIT(n, l) { .name = n, .len = l }
--
2.35.1

2022-03-22 03:31:17

by Gabriel Krisman Bertazi

[permalink] [raw]
Subject: [PATCH 4/5] ext4: Simplify hash check on ext4_match

The existence of fname->cf_name.name requires s_encoding & IS_CASEFOLDED,
therefore this can be simplified.

Signed-off-by: Gabriel Krisman Bertazi <[email protected]>
---
fs/ext4/namei.c | 18 ++++++------------
1 file changed, 6 insertions(+), 12 deletions(-)

diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c
index 71b4b05fae89..8520115cd5c2 100644
--- a/fs/ext4/namei.c
+++ b/fs/ext4/namei.c
@@ -1442,19 +1442,13 @@ static bool ext4_match(struct inode *parent,
#if IS_ENABLED(CONFIG_UNICODE)
f.cf_name = fname->cf_name;

- if (parent->i_sb->s_encoding && IS_CASEFOLDED(parent) &&
- (!IS_ENCRYPTED(parent) || fscrypt_has_encryption_key(parent))) {
- if (fname->cf_name.name) {
- if (IS_ENCRYPTED(parent)) {
- if (fname->hinfo.hash != EXT4_DIRENT_HASH(de) ||
- fname->hinfo.minor_hash !=
- EXT4_DIRENT_MINOR_HASH(de)) {
-
- return false;
- }
- }
- }
+ if (IS_ENCRYPTED(parent) && fname->cf_name.name) {
+ if (fname->hinfo.hash != EXT4_DIRENT_HASH(de) ||
+ fname->hinfo.minor_hash != EXT4_DIRENT_MINOR_HASH(de))
+ return false;
+ }

+ if (parent->i_sb->s_encoding && IS_CASEFOLDED(parent)) {
ret = ext4_ci_compare(parent, &f, de->name, de->name_len);
if (ret < 0) {
/*
--
2.35.1

2022-03-22 03:31:21

by Gabriel Krisman Bertazi

[permalink] [raw]
Subject: [PATCH 5/5] ext4: Log error when lookup of encoded dentry fails

If the volume is in strict mode, ext4_ci_compare can report a broken
encoding name. This will not trigger on a bad lookup, which is caught
earlier, only if the actual disk name is bad.

Signed-off-by: Gabriel Krisman Bertazi <[email protected]>
---
fs/ext4/namei.c | 3 +++
1 file changed, 3 insertions(+)

diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c
index 8520115cd5c2..c321c6fdb4ae 100644
--- a/fs/ext4/namei.c
+++ b/fs/ext4/namei.c
@@ -1456,6 +1456,9 @@ static bool ext4_match(struct inode *parent,
* only case where it happens is on a disk
* corruption or ENOMEM.
*/
+ if (ret == -EINVAL)
+ EXT4_ERROR_INODE(parent,
+ "Bad encoded file in directory");
return false;
}
return ret;
--
2.35.1

2022-03-29 03:10:46

by Eric Biggers

[permalink] [raw]
Subject: Re: [PATCH 3/5] ext4: Implement ci comparison using fscrypt_name

On Mon, Mar 21, 2022 at 11:00:02PM -0400, Gabriel Krisman Bertazi wrote:
> By using fscrypt_name here, we can hide most of the caching casefold
> logic from ext4. The condition in ext4_match is now quite redundant,
> but this is addressed in the next patch.
>
> Signed-off-by: Gabriel Krisman Bertazi <[email protected]>
> ---
> fs/ext4/namei.c | 26 ++++++++++++--------------
> include/linux/fscrypt.h | 4 ++++
> 2 files changed, 16 insertions(+), 14 deletions(-)
>
> diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c
> index 8976e5a28c73..71b4b05fae89 100644
> --- a/fs/ext4/namei.c
> +++ b/fs/ext4/namei.c
> @@ -1321,10 +1321,9 @@ static void dx_insert_block(struct dx_frame *frame, u32 hash, ext4_lblk_t block)
> /**
> * ext4_ci_compare() - Match (case-insensitive) a name with a dirent.
> * @parent: Inode of the parent of the dentry.
> - * @name: name under lookup.
> + * @fname: name under lookup.
> * @de_name: Dirent name.
> * @de_name_len: dirent name length.
> - * @quick: whether @name is already casefolded.
> *
> * Test whether a case-insensitive directory entry matches the filename
> * being searched. If quick is set, the @name being looked up is
> @@ -1333,8 +1332,9 @@ static void dx_insert_block(struct dx_frame *frame, u32 hash, ext4_lblk_t block)
> * Return: > 0 if the directory entry matches, 0 if it doesn't match, or
> * < 0 on error.
> */
> -static int ext4_ci_compare(const struct inode *parent, const struct qstr *name,
> - u8 *de_name, size_t de_name_len, bool quick)
> +static int ext4_ci_compare(const struct inode *parent,
> + const struct fscrypt_name *fname,
> + u8 *de_name, size_t de_name_len)
> {
> const struct super_block *sb = parent->i_sb;
> const struct unicode_map *um = sb->s_encoding;
> @@ -1357,10 +1357,10 @@ static int ext4_ci_compare(const struct inode *parent, const struct qstr *name,
> entry.len = decrypted_name.len;
> }
>
> - if (quick)
> - ret = utf8_strncasecmp_folded(um, name, &entry);
> + if (fname->cf_name.name)
> + ret = utf8_strncasecmp_folded(um, &fname->cf_name, &entry);
> else
> - ret = utf8_strncasecmp(um, name, &entry);
> + ret = utf8_strncasecmp(um, fname->usr_fname, &entry);
>
> if (!ret)
> match = true;
> @@ -1370,8 +1370,8 @@ static int ext4_ci_compare(const struct inode *parent, const struct qstr *name,
> * the names have invalid characters.
> */
> ret = 0;
> - match = ((name->len == entry.len) &&
> - !memcmp(name->name, entry.name, entry.len));
> + match = ((fname->usr_fname->len == entry.len) &&
> + !memcmp(fname->usr_fname->name, entry.name, entry.len));
> }
>
> out:
> @@ -1440,6 +1440,8 @@ static bool ext4_match(struct inode *parent,
> #endif
>
> #if IS_ENABLED(CONFIG_UNICODE)
> + f.cf_name = fname->cf_name;
> +
> if (parent->i_sb->s_encoding && IS_CASEFOLDED(parent) &&
> (!IS_ENCRYPTED(parent) || fscrypt_has_encryption_key(parent))) {
> if (fname->cf_name.name) {
> @@ -1451,13 +1453,9 @@ static bool ext4_match(struct inode *parent,
> return false;
> }
> }
> - ret = ext4_ci_compare(parent, &fname->cf_name, de->name,
> - de->name_len, true);
> - } else {
> - ret = ext4_ci_compare(parent, fname->usr_fname,
> - de->name, de->name_len, false);
> }
>
> + ret = ext4_ci_compare(parent, &f, de->name, de->name_len);
> if (ret < 0) {
> /*
> * Treat comparison errors as not a match. The
> diff --git a/include/linux/fscrypt.h b/include/linux/fscrypt.h
> index 91ea9477e9bd..5dc4b3c805e4 100644
> --- a/include/linux/fscrypt.h
> +++ b/include/linux/fscrypt.h
> @@ -36,6 +36,10 @@ struct fscrypt_name {
> u32 minor_hash;
> struct fscrypt_str crypto_buf;
> bool is_nokey_name;
> +
> +#ifdef CONFIG_UNICODE
> + struct qstr cf_name;
> +#endif
> };
>

This seems like the wrong approach. struct fscrypt_name shouldn't have fields
that aren't used by the fs/crypto/ layer.

Did you check what f2fs does? It has a struct f2fs_filename to represent
everything f2fs needs to know about a filename, and it only uses
struct fscrypt_name when communicating with the fs/crypto/ layer.

struct ext4_filename already exists. Couldn't you use that here?

- Eric

2022-03-29 03:18:24

by Eric Biggers

[permalink] [raw]
Subject: Re: [PATCH 1/5] ext4: Match the f2fs ci_compare implementation

On Mon, Mar 21, 2022 at 11:00:00PM -0400, Gabriel Krisman Bertazi wrote:
> ext4_ci_compare originally follows utf8_*_strcmp, which means return
> zero on match. This means that every usage of that in ext4 negates
> the return.
>
> Turn it into a predicate function, let it follow the kernel convention
> and return true on match, which means it's now the same as its f2fs
> counterpart and can be extracted into generic code.
>
> This change also makes it more obvious that we are ignoring error
> handling in ext4_match, which can occur since casefolding support (bad
> utf8 name due to disk corruption on strict mode causes -EINVAL) and
> casefold+encryption (-ENOMEM). For now, keep the behavior. It is
> handled by the following patches.
>
> While we are there, change the comment to the kernel-doc style.
>
> Signed-off-by: Gabriel Krisman Bertazi <[email protected]>
> ---
> fs/ext4/namei.c | 62 +++++++++++++++++++++++++++++++++----------------
> 1 file changed, 42 insertions(+), 20 deletions(-)
>
> diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c
> index 8cf0a924a49b..24ea3bb446d0 100644
> --- a/fs/ext4/namei.c
> +++ b/fs/ext4/namei.c
> @@ -1318,13 +1318,20 @@ static void dx_insert_block(struct dx_frame *frame, u32 hash, ext4_lblk_t block)
> }
>
> #if IS_ENABLED(CONFIG_UNICODE)
> -/*
> +/**
> + * ext4_ci_compare() - Match (case-insensitive) a name with a dirent.
> + * @parent: Inode of the parent of the dentry.
> + * @name: name under lookup.
> + * @de_name: Dirent name.
> + * @de_name_len: dirent name length.
> + * @quick: whether @name is already casefolded.
> + *
> * Test whether a case-insensitive directory entry matches the filename
> - * being searched for. If quick is set, assume the name being looked up
> - * is already in the casefolded form.
> + * being searched. If quick is set, the @name being looked up is
> + * already in the casefolded form.
> *
> - * Returns: 0 if the directory entry matches, more than 0 if it
> - * doesn't match or less than zero on error.
> + * Return: > 0 if the directory entry matches, 0 if it doesn't match, or
> + * < 0 on error.
> */
> static int ext4_ci_compare(const struct inode *parent, const struct qstr *name,
> u8 *de_name, size_t de_name_len, bool quick)

Shouldn't this be renamed to ext4_match_ci() as well? The f2fs equivalent is
called f2fs_match_ci_name(), and this is called from ext4_match().
ext4_match_ci() would better fit the "return 1 on match" behavior, I think.

- Eric

2022-03-29 03:24:57

by Eric Biggers

[permalink] [raw]
Subject: Re: [PATCH 5/5] ext4: Log error when lookup of encoded dentry fails

On Mon, Mar 21, 2022 at 11:00:04PM -0400, Gabriel Krisman Bertazi wrote:
> If the volume is in strict mode, ext4_ci_compare can report a broken
> encoding name. This will not trigger on a bad lookup, which is caught
> earlier, only if the actual disk name is bad.
>
> Signed-off-by: Gabriel Krisman Bertazi <[email protected]>
> ---
> fs/ext4/namei.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c
> index 8520115cd5c2..c321c6fdb4ae 100644
> --- a/fs/ext4/namei.c
> +++ b/fs/ext4/namei.c
> @@ -1456,6 +1456,9 @@ static bool ext4_match(struct inode *parent,
> * only case where it happens is on a disk
> * corruption or ENOMEM.
> */
> + if (ret == -EINVAL)
> + EXT4_ERROR_INODE(parent,
> + "Bad encoded file in directory");

Maybe have this say "Bad encoded filename" instead of "Bad encoded file"?

- Eric

2022-03-29 06:32:55

by Eric Biggers

[permalink] [raw]
Subject: Re: [PATCH 2/5] ext4: Simplify the handling of chached insensitive names

On Mon, Mar 21, 2022 at 11:00:01PM -0400, Gabriel Krisman Bertazi wrote:
> Keeping it as qstr avoids the unnecessary conversion in ext4_match
>
> Signed-off-by: Gabriel Krisman Bertazi <[email protected]>
> ---
> fs/ext4/ext4.h | 2 +-
> fs/ext4/namei.c | 23 +++++++++++------------
> 2 files changed, 12 insertions(+), 13 deletions(-)
>
> diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
> index bcd3b9bf8069..46e729ce7b35 100644
> --- a/fs/ext4/ext4.h
> +++ b/fs/ext4/ext4.h
> @@ -2484,7 +2484,7 @@ struct ext4_filename {
> struct fscrypt_str crypto_buf;
> #endif
> #if IS_ENABLED(CONFIG_UNICODE)
> - struct fscrypt_str cf_name;
> + struct qstr cf_name;
> #endif
> };
>
> diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c
> index 24ea3bb446d0..8976e5a28c73 100644
> --- a/fs/ext4/namei.c
> +++ b/fs/ext4/namei.c
> @@ -1382,28 +1382,29 @@ static int ext4_ci_compare(const struct inode *parent, const struct qstr *name,
> int ext4_fname_setup_ci_filename(struct inode *dir, const struct qstr *iname,
> struct ext4_filename *name)
> {
> - struct fscrypt_str *cf_name = &name->cf_name;
> + struct qstr *cf_name = &name->cf_name;
> + unsigned char *buf;
> struct dx_hash_info *hinfo = &name->hinfo;
> int len;
>
> if (!IS_CASEFOLDED(dir) || !dir->i_sb->s_encoding ||
> (IS_ENCRYPTED(dir) && !fscrypt_has_encryption_key(dir))) {
> - cf_name->name = NULL;
> + name->cf_name.name = NULL;
> return 0;
> }

Why not keep "cf_name->name = NULL;" above?

- Eric

2022-03-29 17:22:48

by Gabriel Krisman Bertazi

[permalink] [raw]
Subject: Re: [PATCH 3/5] ext4: Implement ci comparison using fscrypt_name

Eric Biggers <[email protected]> writes:

> On Mon, Mar 21, 2022 at 11:00:02PM -0400, Gabriel Krisman Bertazi wrote:
>> By using fscrypt_name here, we can hide most of the caching casefold
>> logic from ext4. The condition in ext4_match is now quite redundant,
>> but this is addressed in the next patch.
>>
>> Signed-off-by: Gabriel Krisman Bertazi <[email protected]>
>> ---
>> fs/ext4/namei.c | 26 ++++++++++++--------------
>> include/linux/fscrypt.h | 4 ++++
>> 2 files changed, 16 insertions(+), 14 deletions(-)
>>
>> diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c
>> index 8976e5a28c73..71b4b05fae89 100644
>> --- a/fs/ext4/namei.c
>> +++ b/fs/ext4/namei.c
>> @@ -1321,10 +1321,9 @@ static void dx_insert_block(struct dx_frame *frame, u32 hash, ext4_lblk_t block)
>> /**
>> * ext4_ci_compare() - Match (case-insensitive) a name with a dirent.
>> * @parent: Inode of the parent of the dentry.
>> - * @name: name under lookup.
>> + * @fname: name under lookup.
>> * @de_name: Dirent name.
>> * @de_name_len: dirent name length.
>> - * @quick: whether @name is already casefolded.
>> *
>> * Test whether a case-insensitive directory entry matches the filename
>> * being searched. If quick is set, the @name being looked up is
>> @@ -1333,8 +1332,9 @@ static void dx_insert_block(struct dx_frame *frame, u32 hash, ext4_lblk_t block)
>> * Return: > 0 if the directory entry matches, 0 if it doesn't match, or
>> * < 0 on error.
>> */
>> -static int ext4_ci_compare(const struct inode *parent, const struct qstr *name,
>> - u8 *de_name, size_t de_name_len, bool quick)
>> +static int ext4_ci_compare(const struct inode *parent,
>> + const struct fscrypt_name *fname,
>> + u8 *de_name, size_t de_name_len)
>> {
>> const struct super_block *sb = parent->i_sb;
>> const struct unicode_map *um = sb->s_encoding;
>> @@ -1357,10 +1357,10 @@ static int ext4_ci_compare(const struct inode *parent, const struct qstr *name,
>> entry.len = decrypted_name.len;
>> }
>>
>> - if (quick)
>> - ret = utf8_strncasecmp_folded(um, name, &entry);
>> + if (fname->cf_name.name)
>> + ret = utf8_strncasecmp_folded(um, &fname->cf_name, &entry);
>> else
>> - ret = utf8_strncasecmp(um, name, &entry);
>> + ret = utf8_strncasecmp(um, fname->usr_fname, &entry);
>>
>> if (!ret)
>> match = true;
>> @@ -1370,8 +1370,8 @@ static int ext4_ci_compare(const struct inode *parent, const struct qstr *name,
>> * the names have invalid characters.
>> */
>> ret = 0;
>> - match = ((name->len == entry.len) &&
>> - !memcmp(name->name, entry.name, entry.len));
>> + match = ((fname->usr_fname->len == entry.len) &&
>> + !memcmp(fname->usr_fname->name, entry.name, entry.len));
>> }
>>
>> out:
>> @@ -1440,6 +1440,8 @@ static bool ext4_match(struct inode *parent,
>> #endif
>>
>> #if IS_ENABLED(CONFIG_UNICODE)
>> + f.cf_name = fname->cf_name;
>> +
>> if (parent->i_sb->s_encoding && IS_CASEFOLDED(parent) &&
>> (!IS_ENCRYPTED(parent) || fscrypt_has_encryption_key(parent))) {
>> if (fname->cf_name.name) {
>> @@ -1451,13 +1453,9 @@ static bool ext4_match(struct inode *parent,
>> return false;
>> }
>> }
>> - ret = ext4_ci_compare(parent, &fname->cf_name, de->name,
>> - de->name_len, true);
>> - } else {
>> - ret = ext4_ci_compare(parent, fname->usr_fname,
>> - de->name, de->name_len, false);
>> }
>>
>> + ret = ext4_ci_compare(parent, &f, de->name, de->name_len);
>> if (ret < 0) {
>> /*
>> * Treat comparison errors as not a match. The
>> diff --git a/include/linux/fscrypt.h b/include/linux/fscrypt.h
>> index 91ea9477e9bd..5dc4b3c805e4 100644
>> --- a/include/linux/fscrypt.h
>> +++ b/include/linux/fscrypt.h
>> @@ -36,6 +36,10 @@ struct fscrypt_name {
>> u32 minor_hash;
>> struct fscrypt_str crypto_buf;
>> bool is_nokey_name;
>> +
>> +#ifdef CONFIG_UNICODE
>> + struct qstr cf_name;
>> +#endif
>> };
>>
>
> This seems like the wrong approach. struct fscrypt_name shouldn't have fields
> that aren't used by the fs/crypto/ layer.
>
> Did you check what f2fs does? It has a struct f2fs_filename to represent
> everything f2fs needs to know about a filename, and it only uses
> struct fscrypt_name when communicating with the fs/crypto/ layer.
>
> struct ext4_filename already exists. Couldn't you use that here?

Hi Eric,

The reason I'm not using struct ext4_filename here is because I'm trying
to make this generic, so this function can be shared across filesystems
implementing casefold. Since the fscrypt_name abstraction is used for
case-sensitive comparison, I was trying to reuse that type for
case-insensitive as well. It seemed unnecessary to define a generic
casefold_name type just for passing the cf_name and disk_name to this
function, considering that fscrypt_name is already initialized by
ext4_match.

--
Gabriel Krisman Bertazi

2022-03-31 22:05:06

by Eric Biggers

[permalink] [raw]
Subject: Re: [PATCH 3/5] ext4: Implement ci comparison using fscrypt_name

On Tue, Mar 29, 2022 at 12:11:04PM -0400, Gabriel Krisman Bertazi wrote:
> Eric Biggers <[email protected]> writes:
>
> > On Mon, Mar 21, 2022 at 11:00:02PM -0400, Gabriel Krisman Bertazi wrote:
> >> By using fscrypt_name here, we can hide most of the caching casefold
> >> logic from ext4. The condition in ext4_match is now quite redundant,
> >> but this is addressed in the next patch.
> >>
> >> Signed-off-by: Gabriel Krisman Bertazi <[email protected]>
> >> ---
> >> fs/ext4/namei.c | 26 ++++++++++++--------------
> >> include/linux/fscrypt.h | 4 ++++
> >> 2 files changed, 16 insertions(+), 14 deletions(-)
> >>
> >> diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c
> >> index 8976e5a28c73..71b4b05fae89 100644
> >> --- a/fs/ext4/namei.c
> >> +++ b/fs/ext4/namei.c
> >> @@ -1321,10 +1321,9 @@ static void dx_insert_block(struct dx_frame *frame, u32 hash, ext4_lblk_t block)
> >> /**
> >> * ext4_ci_compare() - Match (case-insensitive) a name with a dirent.
> >> * @parent: Inode of the parent of the dentry.
> >> - * @name: name under lookup.
> >> + * @fname: name under lookup.
> >> * @de_name: Dirent name.
> >> * @de_name_len: dirent name length.
> >> - * @quick: whether @name is already casefolded.
> >> *
> >> * Test whether a case-insensitive directory entry matches the filename
> >> * being searched. If quick is set, the @name being looked up is
> >> @@ -1333,8 +1332,9 @@ static void dx_insert_block(struct dx_frame *frame, u32 hash, ext4_lblk_t block)
> >> * Return: > 0 if the directory entry matches, 0 if it doesn't match, or
> >> * < 0 on error.
> >> */
> >> -static int ext4_ci_compare(const struct inode *parent, const struct qstr *name,
> >> - u8 *de_name, size_t de_name_len, bool quick)
> >> +static int ext4_ci_compare(const struct inode *parent,
> >> + const struct fscrypt_name *fname,
> >> + u8 *de_name, size_t de_name_len)
> >> {
> >> const struct super_block *sb = parent->i_sb;
> >> const struct unicode_map *um = sb->s_encoding;
> >> @@ -1357,10 +1357,10 @@ static int ext4_ci_compare(const struct inode *parent, const struct qstr *name,
> >> entry.len = decrypted_name.len;
> >> }
> >>
> >> - if (quick)
> >> - ret = utf8_strncasecmp_folded(um, name, &entry);
> >> + if (fname->cf_name.name)
> >> + ret = utf8_strncasecmp_folded(um, &fname->cf_name, &entry);
> >> else
> >> - ret = utf8_strncasecmp(um, name, &entry);
> >> + ret = utf8_strncasecmp(um, fname->usr_fname, &entry);
> >>
> >> if (!ret)
> >> match = true;
> >> @@ -1370,8 +1370,8 @@ static int ext4_ci_compare(const struct inode *parent, const struct qstr *name,
> >> * the names have invalid characters.
> >> */
> >> ret = 0;
> >> - match = ((name->len == entry.len) &&
> >> - !memcmp(name->name, entry.name, entry.len));
> >> + match = ((fname->usr_fname->len == entry.len) &&
> >> + !memcmp(fname->usr_fname->name, entry.name, entry.len));
> >> }
> >>
> >> out:
> >> @@ -1440,6 +1440,8 @@ static bool ext4_match(struct inode *parent,
> >> #endif
> >>
> >> #if IS_ENABLED(CONFIG_UNICODE)
> >> + f.cf_name = fname->cf_name;
> >> +
> >> if (parent->i_sb->s_encoding && IS_CASEFOLDED(parent) &&
> >> (!IS_ENCRYPTED(parent) || fscrypt_has_encryption_key(parent))) {
> >> if (fname->cf_name.name) {
> >> @@ -1451,13 +1453,9 @@ static bool ext4_match(struct inode *parent,
> >> return false;
> >> }
> >> }
> >> - ret = ext4_ci_compare(parent, &fname->cf_name, de->name,
> >> - de->name_len, true);
> >> - } else {
> >> - ret = ext4_ci_compare(parent, fname->usr_fname,
> >> - de->name, de->name_len, false);
> >> }
> >>
> >> + ret = ext4_ci_compare(parent, &f, de->name, de->name_len);
> >> if (ret < 0) {
> >> /*
> >> * Treat comparison errors as not a match. The
> >> diff --git a/include/linux/fscrypt.h b/include/linux/fscrypt.h
> >> index 91ea9477e9bd..5dc4b3c805e4 100644
> >> --- a/include/linux/fscrypt.h
> >> +++ b/include/linux/fscrypt.h
> >> @@ -36,6 +36,10 @@ struct fscrypt_name {
> >> u32 minor_hash;
> >> struct fscrypt_str crypto_buf;
> >> bool is_nokey_name;
> >> +
> >> +#ifdef CONFIG_UNICODE
> >> + struct qstr cf_name;
> >> +#endif
> >> };
> >>
> >
> > This seems like the wrong approach. struct fscrypt_name shouldn't have fields
> > that aren't used by the fs/crypto/ layer.
> >
> > Did you check what f2fs does? It has a struct f2fs_filename to represent
> > everything f2fs needs to know about a filename, and it only uses
> > struct fscrypt_name when communicating with the fs/crypto/ layer.
> >
> > struct ext4_filename already exists. Couldn't you use that here?
>
> Hi Eric,
>
> The reason I'm not using struct ext4_filename here is because I'm trying
> to make this generic, so this function can be shared across filesystems
> implementing casefold. Since the fscrypt_name abstraction is used for
> case-sensitive comparison, I was trying to reuse that type for
> case-insensitive as well. It seemed unnecessary to define a generic
> casefold_name type just for passing the cf_name and disk_name to this
> function, considering that fscrypt_name is already initialized by
> ext4_match.
>

Which function, specifically, are you trying to share across filesystems?
Do you have patches that show what your end goal is?

- Eric

2022-04-05 01:56:15

by Gabriel Krisman Bertazi

[permalink] [raw]
Subject: Re: [PATCH 3/5] ext4: Implement ci comparison using fscrypt_name

Eric Biggers <[email protected]> writes:

> On Tue, Mar 29, 2022 at 12:11:04PM -0400, Gabriel Krisman Bertazi wrote:
>> Eric Biggers <[email protected]> writes:
>>
>> > On Mon, Mar 21, 2022 at 11:00:02PM -0400, Gabriel Krisman Bertazi wrote:
>> >> By using fscrypt_name here, we can hide most of the caching casefold
>> >> logic from ext4. The condition in ext4_match is now quite redundant,
>> >> but this is addressed in the next patch.
>> >>
>> >> Signed-off-by: Gabriel Krisman Bertazi <[email protected]>
>> >> ---
>> >> fs/ext4/namei.c | 26 ++++++++++++--------------
>> >> include/linux/fscrypt.h | 4 ++++
>> >> 2 files changed, 16 insertions(+), 14 deletions(-)
>> >>
>> >> diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c
>> >> index 8976e5a28c73..71b4b05fae89 100644
>> >> --- a/fs/ext4/namei.c
>> >> +++ b/fs/ext4/namei.c
>> >> @@ -1321,10 +1321,9 @@ static void dx_insert_block(struct dx_frame *frame, u32 hash, ext4_lblk_t block)
>> >> /**
>> >> * ext4_ci_compare() - Match (case-insensitive) a name with a dirent.
>> >> * @parent: Inode of the parent of the dentry.
>> >> - * @name: name under lookup.
>> >> + * @fname: name under lookup.
>> >> * @de_name: Dirent name.
>> >> * @de_name_len: dirent name length.
>> >> - * @quick: whether @name is already casefolded.
>> >> *
>> >> * Test whether a case-insensitive directory entry matches the filename
>> >> * being searched. If quick is set, the @name being looked up is
>> >> @@ -1333,8 +1332,9 @@ static void dx_insert_block(struct dx_frame *frame, u32 hash, ext4_lblk_t block)
>> >> * Return: > 0 if the directory entry matches, 0 if it doesn't match, or
>> >> * < 0 on error.
>> >> */
>> >> -static int ext4_ci_compare(const struct inode *parent, const struct qstr *name,
>> >> - u8 *de_name, size_t de_name_len, bool quick)
>> >> +static int ext4_ci_compare(const struct inode *parent,
>> >> + const struct fscrypt_name *fname,
>> >> + u8 *de_name, size_t de_name_len)
>> >> {
>> >> const struct super_block *sb = parent->i_sb;
>> >> const struct unicode_map *um = sb->s_encoding;
>> >> @@ -1357,10 +1357,10 @@ static int ext4_ci_compare(const struct inode *parent, const struct qstr *name,
>> >> entry.len = decrypted_name.len;
>> >> }
>> >>
>> >> - if (quick)
>> >> - ret = utf8_strncasecmp_folded(um, name, &entry);
>> >> + if (fname->cf_name.name)
>> >> + ret = utf8_strncasecmp_folded(um, &fname->cf_name, &entry);
>> >> else
>> >> - ret = utf8_strncasecmp(um, name, &entry);
>> >> + ret = utf8_strncasecmp(um, fname->usr_fname, &entry);
>> >>
>> >> if (!ret)
>> >> match = true;
>> >> @@ -1370,8 +1370,8 @@ static int ext4_ci_compare(const struct inode *parent, const struct qstr *name,
>> >> * the names have invalid characters.
>> >> */
>> >> ret = 0;
>> >> - match = ((name->len == entry.len) &&
>> >> - !memcmp(name->name, entry.name, entry.len));
>> >> + match = ((fname->usr_fname->len == entry.len) &&
>> >> + !memcmp(fname->usr_fname->name, entry.name, entry.len));
>> >> }
>> >>
>> >> out:
>> >> @@ -1440,6 +1440,8 @@ static bool ext4_match(struct inode *parent,
>> >> #endif
>> >>
>> >> #if IS_ENABLED(CONFIG_UNICODE)
>> >> + f.cf_name = fname->cf_name;
>> >> +
>> >> if (parent->i_sb->s_encoding && IS_CASEFOLDED(parent) &&
>> >> (!IS_ENCRYPTED(parent) || fscrypt_has_encryption_key(parent))) {
>> >> if (fname->cf_name.name) {
>> >> @@ -1451,13 +1453,9 @@ static bool ext4_match(struct inode *parent,
>> >> return false;
>> >> }
>> >> }
>> >> - ret = ext4_ci_compare(parent, &fname->cf_name, de->name,
>> >> - de->name_len, true);
>> >> - } else {
>> >> - ret = ext4_ci_compare(parent, fname->usr_fname,
>> >> - de->name, de->name_len, false);
>> >> }
>> >>
>> >> + ret = ext4_ci_compare(parent, &f, de->name, de->name_len);
>> >> if (ret < 0) {
>> >> /*
>> >> * Treat comparison errors as not a match. The
>> >> diff --git a/include/linux/fscrypt.h b/include/linux/fscrypt.h
>> >> index 91ea9477e9bd..5dc4b3c805e4 100644
>> >> --- a/include/linux/fscrypt.h
>> >> +++ b/include/linux/fscrypt.h
>> >> @@ -36,6 +36,10 @@ struct fscrypt_name {
>> >> u32 minor_hash;
>> >> struct fscrypt_str crypto_buf;
>> >> bool is_nokey_name;
>> >> +
>> >> +#ifdef CONFIG_UNICODE
>> >> + struct qstr cf_name;
>> >> +#endif
>> >> };
>> >>
>> >
>> > This seems like the wrong approach. struct fscrypt_name shouldn't have fields
>> > that aren't used by the fs/crypto/ layer.
>> >
>> > Did you check what f2fs does? It has a struct f2fs_filename to represent
>> > everything f2fs needs to know about a filename, and it only uses
>> > struct fscrypt_name when communicating with the fs/crypto/ layer.
>> >
>> > struct ext4_filename already exists. Couldn't you use that here?
>>
>> Hi Eric,
>>
>> The reason I'm not using struct ext4_filename here is because I'm trying
>> to make this generic, so this function can be shared across filesystems
>> implementing casefold. Since the fscrypt_name abstraction is used for
>> case-sensitive comparison, I was trying to reuse that type for
>> case-insensitive as well. It seemed unnecessary to define a generic
>> casefold_name type just for passing the cf_name and disk_name to this
>> function, considering that fscrypt_name is already initialized by
>> ext4_match.
>>
>
> Which function, specifically, are you trying to share across filesystems?
> Do you have patches that show what your end goal is?

ext4_ci_compare/f2fs_match_ci_name :)

Let me follow up with a v2 that merges them so it makes more sense.

--
Gabriel Krisman Bertazi