2022-04-29 11:36:57

by Gabriel Krisman Bertazi

[permalink] [raw]
Subject: [PATCH v2 0/7] Clean up the case-insenstive lookup path

This is the v2 of this series. it applies Eric's comments and extend
the series to complete the merge of generic_ci_match for ext4 and f2fs.

The case-insensitive implementations in f2fs and ext4 have quite a bit
of duplicated code. This series simplifies the ext4 version, with the
goal of extracting ext4_ci_compare into a helper library that can be
used by both filesystems.

While there, I noticed we can leverage the utf8 functions to detect
encoded names that are corrupted in the filesystem. The final patch
adds an ext4 error on that scenario, to mark the filesystem as
corrupted.

This series survived passes of xfstests -g quick.

Gabriel Krisman Bertazi (7):
ext4: Match the f2fs ci_compare implementation
ext4: Simplify the handling of cached insensitive names
ext4: Implement ci comparison using unicode_name
ext4: Simplify hash check on ext4_match
ext4: Log error when lookup of encoded dentry fails
ext4: Move ext4_match_ci into libfs
f2fs: Reuse generic_ci_match for ci comparisons

fs/ext4/ext4.h | 2 +-
fs/ext4/namei.c | 110 ++++++++++++++-------------------------------
fs/f2fs/dir.c | 58 ++----------------------
fs/libfs.c | 60 +++++++++++++++++++++++++
include/linux/fs.h | 8 ++++
5 files changed, 106 insertions(+), 132 deletions(-)

--
2.35.1


2022-04-29 12:39:31

by Gabriel Krisman Bertazi

[permalink] [raw]
Subject: [PATCH v2 2/7] ext4: Simplify the handling of cached insensitive names

Keeping it as qstr avoids the unnecessary conversion in ext4_match

Signed-off-by: Gabriel Krisman Bertazi <[email protected]>

--
Changes since v1:
- Simplify hunk (eric)
---
fs/ext4/ext4.h | 2 +-
fs/ext4/namei.c | 22 +++++++++++-----------
2 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index a743b1e3b89e..93a28fcb2e22 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -2490,7 +2490,7 @@ struct ext4_filename {
struct fscrypt_str crypto_buf;
#endif
#if IS_ENABLED(CONFIG_UNICODE)
- struct fscrypt_str cf_name;
+ struct qstr cf_name;
#endif
};

diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c
index c363f637057d..c1a8a09369d1 100644
--- a/fs/ext4/namei.c
+++ b/fs/ext4/namei.c
@@ -1382,7 +1382,8 @@ static int ext4_match_ci(const struct inode *parent, const struct qstr *name,
int ext4_fname_setup_ci_filename(struct inode *dir, const struct qstr *iname,
struct ext4_filename *name)
{
- struct fscrypt_str *cf_name = &name->cf_name;
+ struct qstr *cf_name = &name->cf_name;
+ unsigned char *buf;
struct dx_hash_info *hinfo = &name->hinfo;
int len;

@@ -1392,18 +1393,18 @@ int ext4_fname_setup_ci_filename(struct inode *dir, const struct qstr *iname,
return 0;
}

- cf_name->name = kmalloc(EXT4_NAME_LEN, GFP_NOFS);
- if (!cf_name->name)
+ buf = kmalloc(EXT4_NAME_LEN, GFP_NOFS);
+ if (!buf)
return -ENOMEM;

- len = utf8_casefold(dir->i_sb->s_encoding,
- iname, cf_name->name,
- EXT4_NAME_LEN);
+ len = utf8_casefold(dir->i_sb->s_encoding, iname, buf, EXT4_NAME_LEN);
if (len <= 0) {
- kfree(cf_name->name);
- cf_name->name = NULL;
+ kfree(buf);
+ buf = NULL;
}
+ cf_name->name = buf;
cf_name->len = (unsigned) len;
+
if (!IS_ENCRYPTED(dir))
return 0;

@@ -1442,8 +1443,6 @@ static bool ext4_match(struct inode *parent,
if (parent->i_sb->s_encoding && IS_CASEFOLDED(parent) &&
(!IS_ENCRYPTED(parent) || fscrypt_has_encryption_key(parent))) {
if (fname->cf_name.name) {
- struct qstr cf = {.name = fname->cf_name.name,
- .len = fname->cf_name.len};
if (IS_ENCRYPTED(parent)) {
if (fname->hinfo.hash != EXT4_DIRENT_HASH(de) ||
fname->hinfo.minor_hash !=
@@ -1452,7 +1451,8 @@ static bool ext4_match(struct inode *parent,
return false;
}
}
- ret = ext4_match_ci(parent, &cf, de->name,
+
+ ret = ext4_match_ci(parent, &fname->cf_name, de->name,
de->name_len, true);
} else {
ret = ext4_match_ci(parent, fname->usr_fname,
--
2.35.1

2022-04-29 14:35:51

by Gabriel Krisman Bertazi

[permalink] [raw]
Subject: [PATCH v2 1/7] ext4: Match the f2fs ci_compare implementation

ext4_ci_compare originally follows utf8_*_strcmp, which means return
zero on match. This means that every usage of that in ext4 negates
the return.

Turn it into a predicate function, let it follow the kernel convention
and return true on match, which means it's now the same as its f2fs
counterpart and can be extracted into generic code.

This change also makes it more obvious that we are ignoring error
handling in ext4_match, which can occur since casefolding support (bad
utf8 name due to disk corruption on strict mode causes -EINVAL) and
casefold+encryption (-ENOMEM). For now, keep the behavior. It is
handled by the following patches.

While we are there, change the comment to the kernel-doc style.

Signed-off-by: Gabriel Krisman Bertazi <[email protected]>
---
changes since v1:
- rename to match f2fs naming (Eric)
---
fs/ext4/namei.c | 65 ++++++++++++++++++++++++++++++++-----------------
1 file changed, 43 insertions(+), 22 deletions(-)

diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c
index 767b4bfe39c3..c363f637057d 100644
--- a/fs/ext4/namei.c
+++ b/fs/ext4/namei.c
@@ -1318,22 +1318,29 @@ static void dx_insert_block(struct dx_frame *frame, u32 hash, ext4_lblk_t block)
}

#if IS_ENABLED(CONFIG_UNICODE)
-/*
+/**
+ * ext4_match_ci() - Match (case-insensitive) a name with a dirent.
+ * @parent: Inode of the parent of the dentry.
+ * @name: name under lookup.
+ * @de_name: Dirent name.
+ * @de_name_len: dirent name length.
+ * @quick: whether @name is already casefolded.
+ *
* Test whether a case-insensitive directory entry matches the filename
- * being searched for. If quick is set, assume the name being looked up
- * is already in the casefolded form.
+ * being searched. If quick is set, the @name being looked up is
+ * already in the casefolded form.
*
- * Returns: 0 if the directory entry matches, more than 0 if it
- * doesn't match or less than zero on error.
+ * Return: > 0 if the directory entry matches, 0 if it doesn't match, or
+ * < 0 on error.
*/
-static int ext4_ci_compare(const struct inode *parent, const struct qstr *name,
- u8 *de_name, size_t de_name_len, bool quick)
+static int ext4_match_ci(const struct inode *parent, const struct qstr *name,
+ u8 *de_name, size_t de_name_len, bool quick)
{
const struct super_block *sb = parent->i_sb;
const struct unicode_map *um = sb->s_encoding;
struct fscrypt_str decrypted_name = FSTR_INIT(NULL, de_name_len);
struct qstr entry = QSTR_INIT(de_name, de_name_len);
- int ret;
+ int ret, match = false;

if (IS_ENCRYPTED(parent)) {
const struct fscrypt_str encrypted_name =
@@ -1354,20 +1361,22 @@ static int ext4_ci_compare(const struct inode *parent, const struct qstr *name,
ret = utf8_strncasecmp_folded(um, name, &entry);
else
ret = utf8_strncasecmp(um, name, &entry);
- if (ret < 0) {
- /* Handle invalid character sequence as either an error
- * or as an opaque byte sequence.
+
+ if (!ret)
+ match = true;
+ else if (ret < 0 && !sb_has_strict_encoding(sb)) {
+ /*
+ * In non-strict mode, fallback to a byte comparison if
+ * the names have invalid characters.
*/
- if (sb_has_strict_encoding(sb))
- ret = -EINVAL;
- else if (name->len != entry.len)
- ret = 1;
- else
- ret = !!memcmp(name->name, entry.name, entry.len);
+ ret = 0;
+ match = ((name->len == entry.len) &&
+ !memcmp(name->name, entry.name, entry.len));
}
+
out:
kfree(decrypted_name.name);
- return ret;
+ return (ret >= 0) ? match : ret;
}

int ext4_fname_setup_ci_filename(struct inode *dir, const struct qstr *iname,
@@ -1418,6 +1427,7 @@ static bool ext4_match(struct inode *parent,
struct ext4_dir_entry_2 *de)
{
struct fscrypt_name f;
+ int ret;

if (!de->inode)
return false;
@@ -1442,11 +1452,22 @@ static bool ext4_match(struct inode *parent,
return false;
}
}
- return !ext4_ci_compare(parent, &cf, de->name,
- de->name_len, true);
+ ret = ext4_match_ci(parent, &cf, de->name,
+ de->name_len, true);
+ } else {
+ ret = ext4_match_ci(parent, fname->usr_fname,
+ de->name, de->name_len, false);
+ }
+
+ if (ret < 0) {
+ /*
+ * Treat comparison errors as not a match. The
+ * only case where it happens is on a disk
+ * corruption or ENOMEM.
+ */
+ return false;
}
- return !ext4_ci_compare(parent, fname->usr_fname, de->name,
- de->name_len, false);
+ return ret;
}
#endif

--
2.35.1

2022-05-03 00:33:25

by Gabriel Krisman Bertazi

[permalink] [raw]
Subject: [PATCH v2 4/7] ext4: Simplify hash check on ext4_match

The existence of fname->cf_name.name requires s_encoding & IS_CASEFOLDED,
therefore this can be simplified.

Signed-off-by: Gabriel Krisman Bertazi <[email protected]>
---
fs/ext4/namei.c | 18 ++++++------------
1 file changed, 6 insertions(+), 12 deletions(-)

diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c
index 5102652b5af4..e450e52eef48 100644
--- a/fs/ext4/namei.c
+++ b/fs/ext4/namei.c
@@ -1440,19 +1440,13 @@ static bool ext4_match(struct inode *parent,
#endif

#if IS_ENABLED(CONFIG_UNICODE)
- if (parent->i_sb->s_encoding && IS_CASEFOLDED(parent) &&
- (!IS_ENCRYPTED(parent) || fscrypt_has_encryption_key(parent))) {
- if (fname->cf_name.name) {
- if (IS_ENCRYPTED(parent)) {
- if (fname->hinfo.hash != EXT4_DIRENT_HASH(de) ||
- fname->hinfo.minor_hash !=
- EXT4_DIRENT_MINOR_HASH(de)) {
-
- return false;
- }
- }
- }
+ if (IS_ENCRYPTED(parent) && fname->cf_name.name) {
+ if (fname->hinfo.hash != EXT4_DIRENT_HASH(de) ||
+ fname->hinfo.minor_hash != EXT4_DIRENT_MINOR_HASH(de))
+ return false;
+ }

+ if (parent->i_sb->s_encoding && IS_CASEFOLDED(parent)) {
u.folded_name = &fname->cf_name;
u.usr_name = fname->usr_fname;

--
2.35.1