Hi,
Thank you, Eric and Al, for your feedback on the previous version.
This version has some big changes: instead of only configuring on the
case-insensitive case, we do it for case-sensitive fscrypt as well, and
disable d_revalidate as needed. This pretty much reverses the way
fscrypt operated (only enable d_revalidate for dentries that require
it), but has the advantage we can be consistent among variations of
case-insensitive/sensitive, encrypted/unencrypted configurations.
You'll find the code is simpler than v1 and v2. I dropped the dcache
patch because now we always try to disable DCACHE_OP_REVALIDATE while
holding the d_lock already, so I do it inline; I also changed the way we
drop d_revalidate when the key is made available, because we couldn't
really do it the way I originally proposed on the RCU case, which would
require falling back to non-RCU lookup just to disable d_revalidate; I
also included a patch fixing the overlayfs issue that I mentioned on the
previous thread. While unrelated to the rest of the patchset, it is a
quick fix that I might merge earlier if you are happy with it.
More details can be found in the per-patch changelog.
This survived fstests on ext4 and f2fs. I also verified that fscrypt
continues to work when combined to overlayfs as Eric requested.
..
original cover letter:
When case-insensitive and fscrypt were adapted to work together, we moved the
code that sets the dentry operations for case-insensitive dentries(d_hash and
d_compare) to happen from a helper inside ->lookup. This is because fscrypt
wants to set d_revalidate only on some dentries, so it does it only for them in
d_revalidate.
But, case-insensitive hooks are actually set on all dentries in the filesystem,
so the natural place to do it is through s_d_op and let d_alloc handle it [1].
In addition, doing it inside the ->lookup is a problem for case-insensitive
dentries that are not created through ->lookup, like those coming
open-by-fhandle[2], which will not see the required d_ops.
This patchset therefore reverts to using sb->s_d_op to set the dentry operations
for case-insensitive filesystems. In order to set case-insensitive hooks early
and not require every dentry to have d_revalidate in case-insensitive
filesystems, it introduces a patch suggested by Al Viro to disable d_revalidate
on some dentries on the fly.
It survives fstests encrypt and quick groups without regressions. Based on
v6.7-rc1.
[1] https://lore.kernel.org/linux-fsdevel/20231123195327.GP38156@ZenIV/
[2] https://lore.kernel.org/linux-fsdevel/20231123171255.GN38156@ZenIV/
Gabriel Krisman Bertazi (10):
ovl: Reject mounting case-insensitive filesystems
fscrypt: Share code between functions that prepare lookup
fscrypt: Drop d_revalidate for valid dentries during lookup
fscrypt: Drop d_revalidate once the key is added
libfs: Merge encrypted_ci_dentry_ops and ci_dentry_ops
libfs: Add helper to choose dentry operations at mount
ext4: Configure dentry operations at dentry-creation time
f2fs: Configure dentry operations at dentry-creation time
ubifs: Configure dentry operations at dentry-creation time
libfs: Drop generic_set_encrypted_ci_d_ops
fs/ceph/dir.c | 2 +-
fs/ceph/file.c | 2 +-
fs/crypto/hooks.c | 62 +++++++++++++++++++++--------------------
fs/ext4/namei.c | 1 -
fs/ext4/super.c | 1 +
fs/f2fs/namei.c | 1 -
fs/f2fs/super.c | 1 +
fs/libfs.c | 61 +++++++++++-----------------------------
fs/overlayfs/params.c | 13 +++++++--
fs/ubifs/dir.c | 1 -
fs/ubifs/super.c | 1 +
include/linux/fs.h | 11 +++++++-
include/linux/fscrypt.h | 51 ++++++++++++++++++++-------------
13 files changed, 106 insertions(+), 102 deletions(-)
--
2.43.0
Unencrypted and encrypted-dentries where the key is available don't need
to be revalidated with regards to fscrypt, since they don't go stale
from under VFS and the key cannot be removed for the encrypted case
without evicting the dentry. Mark them with d_set_always_valid, to
avoid unnecessary revalidation, in preparation to always configuring
d_op through sb->s_d_op.
Since the filesystem might have other features that require
revalidation, only apply this optimization if the d_revalidate handler
is fscrypt_d_revalidate itself.
Signed-off-by: Gabriel Krisman Bertazi <[email protected]>
---
fs/crypto/hooks.c | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/fs/crypto/hooks.c b/fs/crypto/hooks.c
index 41df986d1230..53381acc83e7 100644
--- a/fs/crypto/hooks.c
+++ b/fs/crypto/hooks.c
@@ -127,6 +127,15 @@ int fscrypt_prepare_lookup_dentry(struct inode *dir,
spin_lock(&dentry->d_lock);
if (nokey_name) {
dentry->d_flags |= DCACHE_NOKEY_NAME;
+ } else if (dentry->d_flags & DCACHE_OP_REVALIDATE &&
+ dentry->d_op->d_revalidate == fscrypt_d_revalidate) {
+ /*
+ * Unencrypted dentries and encrypted dentries where the
+ * key is available are always valid from fscrypt
+ * perspective. Avoid the cost of calling
+ * fscrypt_d_revalidate unnecessarily.
+ */
+ dentry->d_flags &= ~DCACHE_OP_REVALIDATE;
}
spin_unlock(&dentry->d_lock);
--
2.43.0
Right now, we have two functions that can be called by the filesystem
during lookup to set up fscrypt internal state for the dentry and
inode under lookup:
1) fscrypt_prepare_lookup_dentry_partial: used only by ceph. It sets
encryption information in the inode and sets the dentry flag if the
key is not available.
2) fscrypt_prepare_lookup: used by everything else. Does all the
above, plus also sets struct fname.
This patch refactors the code to implement the later using the former.
This way, we'll have a single place where we set DCACHE_NOKEY_NAME, in
preparation to add more churn to that condition in the following patch.
To make the patch simpler, we now call fscrypt_get_encryption_info twice
for fscrypt_prepare_lookup, once inside fscrypt_setup_filename and once
inside fscrypt_prepare_lookup_dentry. It seems safe to do, and
considering it will bail early in the second lookup and most lookups
should go to the dcache anyway, it doesn't seem problematic for
performance. In addition, we add a function call for the unencrypted
case, also during lookup.
Apart from the above, it should have no behavior change.
I tested ext4/f2fs manually and with fstests, which yielded no regressions.
Signed-off-by: Gabriel Krisman Bertazi <[email protected]>
---
fs/ceph/dir.c | 2 +-
fs/ceph/file.c | 2 +-
fs/crypto/hooks.c | 53 ++++++++++++++++++-----------------------
include/linux/fscrypt.h | 40 +++++++++++++++++--------------
4 files changed, 47 insertions(+), 50 deletions(-)
diff --git a/fs/ceph/dir.c b/fs/ceph/dir.c
index 91709934c8b1..835421e2845d 100644
--- a/fs/ceph/dir.c
+++ b/fs/ceph/dir.c
@@ -813,7 +813,7 @@ static struct dentry *ceph_lookup(struct inode *dir, struct dentry *dentry,
if (IS_ENCRYPTED(dir)) {
bool had_key = fscrypt_has_encryption_key(dir);
- err = fscrypt_prepare_lookup_partial(dir, dentry);
+ err = fscrypt_prepare_lookup_dentry(dir, dentry);
if (err < 0)
return ERR_PTR(err);
diff --git a/fs/ceph/file.c b/fs/ceph/file.c
index 3b5aae29e944..5c9206670533 100644
--- a/fs/ceph/file.c
+++ b/fs/ceph/file.c
@@ -812,7 +812,7 @@ int ceph_atomic_open(struct inode *dir, struct dentry *dentry,
ihold(dir);
if (IS_ENCRYPTED(dir)) {
set_bit(CEPH_MDS_R_FSCRYPT_FILE, &req->r_req_flags);
- err = fscrypt_prepare_lookup_partial(dir, dentry);
+ err = fscrypt_prepare_lookup_dentry(dir, dentry);
if (err < 0)
goto out_req;
}
diff --git a/fs/crypto/hooks.c b/fs/crypto/hooks.c
index 52504dd478d3..41df986d1230 100644
--- a/fs/crypto/hooks.c
+++ b/fs/crypto/hooks.c
@@ -94,52 +94,45 @@ int __fscrypt_prepare_rename(struct inode *old_dir, struct dentry *old_dentry,
}
EXPORT_SYMBOL_GPL(__fscrypt_prepare_rename);
-int __fscrypt_prepare_lookup(struct inode *dir, struct dentry *dentry,
- struct fscrypt_name *fname)
-{
- int err = fscrypt_setup_filename(dir, &dentry->d_name, 1, fname);
-
- if (err && err != -ENOENT)
- return err;
-
- if (fname->is_nokey_name) {
- spin_lock(&dentry->d_lock);
- dentry->d_flags |= DCACHE_NOKEY_NAME;
- spin_unlock(&dentry->d_lock);
- }
- return err;
-}
-EXPORT_SYMBOL_GPL(__fscrypt_prepare_lookup);
-
/**
- * fscrypt_prepare_lookup_partial() - prepare lookup without filename setup
+ * fscrypt_prepare_lookup_dentry() - prepare lookup without filename setup
* @dir: the encrypted directory being searched
* @dentry: the dentry being looked up in @dir
*
- * This function should be used by the ->lookup and ->atomic_open methods of
- * filesystems that handle filename encryption and no-key name encoding
- * themselves and thus can't use fscrypt_prepare_lookup(). Like
- * fscrypt_prepare_lookup(), this will try to set up the directory's encryption
- * key and will set DCACHE_NOKEY_NAME on the dentry if the key is unavailable.
- * However, this function doesn't set up a struct fscrypt_name for the filename.
+ * This function should be used by the ->lookup and ->atomic_open
+ * methods of filesystems that handle filename encryption and no-key
+ * name encoding themselves and thus can't use fscrypt_prepare_lookup()
+ * directly. This will try to set up the directory's encryption key and
+ * will set DCACHE_NOKEY_NAME on the dentry if the key is unavailable.
+ * However, this function doesn't set up a struct fscrypt_name for the
+ * filename.
*
* Return: 0 on success; -errno on error. Note that the encryption key being
* unavailable is not considered an error. It is also not an error if
* the encryption policy is unsupported by this kernel; that is treated
* like the key being unavailable, so that files can still be deleted.
*/
-int fscrypt_prepare_lookup_partial(struct inode *dir, struct dentry *dentry)
+int fscrypt_prepare_lookup_dentry(struct inode *dir,
+ struct dentry *dentry)
{
- int err = fscrypt_get_encryption_info(dir, true);
+ bool nokey_name = false;
+ int err = 0;
- if (!err && !fscrypt_has_encryption_key(dir)) {
- spin_lock(&dentry->d_lock);
+ if (IS_ENCRYPTED(dir)) {
+ err = fscrypt_get_encryption_info(dir, true);
+ if (!err && !fscrypt_has_encryption_key(dir))
+ nokey_name = true;
+ }
+
+ spin_lock(&dentry->d_lock);
+ if (nokey_name) {
dentry->d_flags |= DCACHE_NOKEY_NAME;
- spin_unlock(&dentry->d_lock);
}
+ spin_unlock(&dentry->d_lock);
+
return err;
}
-EXPORT_SYMBOL_GPL(fscrypt_prepare_lookup_partial);
+EXPORT_SYMBOL_GPL(fscrypt_prepare_lookup_dentry);
int __fscrypt_prepare_readdir(struct inode *dir)
{
diff --git a/include/linux/fscrypt.h b/include/linux/fscrypt.h
index 12f9e455d569..3801c5c94fb6 100644
--- a/include/linux/fscrypt.h
+++ b/include/linux/fscrypt.h
@@ -382,9 +382,7 @@ int __fscrypt_prepare_link(struct inode *inode, struct inode *dir,
int __fscrypt_prepare_rename(struct inode *old_dir, struct dentry *old_dentry,
struct inode *new_dir, struct dentry *new_dentry,
unsigned int flags);
-int __fscrypt_prepare_lookup(struct inode *dir, struct dentry *dentry,
- struct fscrypt_name *fname);
-int fscrypt_prepare_lookup_partial(struct inode *dir, struct dentry *dentry);
+int fscrypt_prepare_lookup_dentry(struct inode *dir, struct dentry *dentry);
int __fscrypt_prepare_readdir(struct inode *dir);
int __fscrypt_prepare_setattr(struct dentry *dentry, struct iattr *attr);
int fscrypt_prepare_setflags(struct inode *inode,
@@ -704,14 +702,7 @@ static inline int __fscrypt_prepare_rename(struct inode *old_dir,
return -EOPNOTSUPP;
}
-static inline int __fscrypt_prepare_lookup(struct inode *dir,
- struct dentry *dentry,
- struct fscrypt_name *fname)
-{
- return -EOPNOTSUPP;
-}
-
-static inline int fscrypt_prepare_lookup_partial(struct inode *dir,
+static inline int fscrypt_prepare_lookup_dentry(struct inode *dir,
struct dentry *dentry)
{
return -EOPNOTSUPP;
@@ -975,14 +966,27 @@ static inline int fscrypt_prepare_lookup(struct inode *dir,
struct dentry *dentry,
struct fscrypt_name *fname)
{
- if (IS_ENCRYPTED(dir))
- return __fscrypt_prepare_lookup(dir, dentry, fname);
+ int ret, err = 0;
+
+ if (IS_ENCRYPTED(dir)) {
+ err = fscrypt_setup_filename(dir, &dentry->d_name, 1, fname);
+ if (err && err != -ENOENT)
+ return err;
+ } else {
+ memset(fname, 0, sizeof(*fname));
+ fname->usr_fname = &dentry->d_name;
+ fname->disk_name.name = (unsigned char *)dentry->d_name.name;
+ fname->disk_name.len = dentry->d_name.len;
+ }
- memset(fname, 0, sizeof(*fname));
- fname->usr_fname = &dentry->d_name;
- fname->disk_name.name = (unsigned char *)dentry->d_name.name;
- fname->disk_name.len = dentry->d_name.len;
- return 0;
+ /*
+ * fscrypt_prepare_lookup_dentry will succeed even if the
+ * directory key is unavailable but the filename isn't a valid
+ * no-key name (-ENOENT). Thus, propagate the previous
+ * error, if any.
+ */
+ ret = fscrypt_prepare_lookup_dentry(dir, dentry);
+ return err ? err : ret;
}
/**
--
2.43.0
In preparation to drop the similar helper that sets d_op at lookup time,
add a version to set the right d_op filesystem-wide, through sb->s_d_op.
The operations structures are shared across filesystems supporting
fscrypt and/or casefolding, therefore we can keep it in common libfs
code.
Signed-off-by: Gabriel Krisman Bertazi <[email protected]>
---
fs/libfs.c | 29 +++++++++++++++++++++++++++++
include/linux/fs.h | 1 +
2 files changed, 30 insertions(+)
diff --git a/fs/libfs.c b/fs/libfs.c
index c4be0961faf0..9cd4df6969d2 100644
--- a/fs/libfs.c
+++ b/fs/libfs.c
@@ -1822,6 +1822,35 @@ void generic_set_encrypted_ci_d_ops(struct dentry *dentry)
}
EXPORT_SYMBOL(generic_set_encrypted_ci_d_ops);
+/**
+ * generic_set_sb_d_ops - helper for choosing the set of
+ * filesystem-wide dentry operations for the enabled features
+ * @sb: superblock to be configured
+ *
+ * Filesystems supporting casefolding and/or fscrypt can call this
+ * helper at mount-time to configure sb->s_d_root to best set of dentry
+ * operations required for the enabled features. The helper must be
+ * called after these have been configured, but before the root
+ * dentry is created.
+ *
+ */
+void generic_set_sb_d_ops(struct super_block *sb)
+{
+#if IS_ENABLED(CONFIG_UNICODE)
+ if (sb->s_encoding) {
+ sb->s_d_op = &generic_ci_dentry_ops;
+ return;
+ }
+#endif
+#ifdef CONFIG_FS_ENCRYPTION
+ if (sb->s_cop) {
+ sb->s_d_op = &generic_encrypted_dentry_ops;
+ return;
+ }
+#endif
+}
+EXPORT_SYMBOL(generic_set_sb_d_ops);
+
/**
* inode_maybe_inc_iversion - increments i_version
* @inode: inode with the i_version that should be updated
diff --git a/include/linux/fs.h b/include/linux/fs.h
index e6667ece5e64..c985d9392b61 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -3202,6 +3202,7 @@ extern int generic_file_fsync(struct file *, loff_t, loff_t, int);
extern int generic_check_addressable(unsigned, u64);
extern void generic_set_encrypted_ci_d_ops(struct dentry *dentry);
+extern void generic_set_sb_d_ops(struct super_block *sb);
static inline bool sb_has_encoding(const struct super_block *sb)
{
--
2.43.0
This was already the case for case-insensitive before commit
bb9cd9106b22 ("fscrypt: Have filesystems handle their d_ops"), but it
was changed to set at lookup-time to facilitate the integration with
fscrypt. But it's a problem because dentries that don't get created
through ->lookup() won't have any visibility of the operations.
Since fscrypt now also supports configuring dentry operations at
creation-time, do it for any encrypted and/or casefold volume,
simplifying the implementation across these features.
Signed-off-by: Gabriel Krisman Bertazi <[email protected]>
---
fs/ext4/namei.c | 1 -
fs/ext4/super.c | 1 +
2 files changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c
index d252935f9c8a..3f0b853a371e 100644
--- a/fs/ext4/namei.c
+++ b/fs/ext4/namei.c
@@ -1762,7 +1762,6 @@ static struct buffer_head *ext4_lookup_entry(struct inode *dir,
struct buffer_head *bh;
err = ext4_fname_prepare_lookup(dir, dentry, &fname);
- generic_set_encrypted_ci_d_ops(dentry);
if (err == -ENOENT)
return NULL;
if (err)
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index c5fcf377ab1f..de80a9cc699a 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -5493,6 +5493,7 @@ static int __ext4_fill_super(struct fs_context *fc, struct super_block *sb)
goto failed_mount4;
}
+ generic_set_sb_d_ops(sb);
sb->s_root = d_make_root(root);
if (!sb->s_root) {
ext4_msg(sb, KERN_ERR, "get root dentry failed");
--
2.43.0
This was already the case for case-insensitive before commit
bb9cd9106b22 ("fscrypt: Have filesystems handle their d_ops"), but it
was changed to set at lookup-time to facilitate the integration with
fscrypt. But it's a problem because dentries that don't get created
through ->lookup() won't have any visibility of the operations.
Since fscrypt now also supports configuring dentry operations at
creation-time, do it for any encrypted and/or casefold volume,
simplifying the implementation across these features.
Signed-off-by: Gabriel Krisman Bertazi <[email protected]>
---
fs/f2fs/namei.c | 1 -
fs/f2fs/super.c | 1 +
2 files changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/f2fs/namei.c b/fs/f2fs/namei.c
index d0053b0284d8..b40c6c393bd6 100644
--- a/fs/f2fs/namei.c
+++ b/fs/f2fs/namei.c
@@ -532,7 +532,6 @@ static struct dentry *f2fs_lookup(struct inode *dir, struct dentry *dentry,
}
err = f2fs_prepare_lookup(dir, dentry, &fname);
- generic_set_encrypted_ci_d_ops(dentry);
if (err == -ENOENT)
goto out_splice;
if (err)
diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
index 033af907c3b1..abfdb6e25b1c 100644
--- a/fs/f2fs/super.c
+++ b/fs/f2fs/super.c
@@ -4663,6 +4663,7 @@ static int f2fs_fill_super(struct super_block *sb, void *data, int silent)
goto free_node_inode;
}
+ generic_set_sb_d_ops(sb);
sb->s_root = d_make_root(root); /* allocate root dentry */
if (!sb->s_root) {
err = -ENOMEM;
--
2.43.0
From fscrypt perspective, once the key is available, the dentry will
remain valid until evicted for other reasons, since keyed dentries don't
require revalidation and, if the key is removed, the dentry is
forcefully evicted. Therefore, we don't need to keep revalidating them
repeatedly.
Obviously, we can only do this if fscrypt is the only thing requiring
revalidation for a dentry. For this reason, we only disable
d_revalidate if the .d_revalidate hook is fscrypt_d_revalidate itself.
It is safe to do it here because when moving the dentry to the
plain-text version, we are holding the d_lock. We might race with a
concurrent RCU lookup but this is harmless because, at worst, we will
get an extra d_revalidate on the keyed dentry, which is will find the
dentry is valid.
Signed-off-by: Gabriel Krisman Bertazi <[email protected]>
---
Changes since v2:
- Do it when moving instead of when revalidating the dentry. (me)
Changes since v1:
- Improve commit message (Eric)
- Drop & in function comparison (Eric)
---
include/linux/fscrypt.h | 11 ++++++++++-
1 file changed, 10 insertions(+), 1 deletion(-)
diff --git a/include/linux/fscrypt.h b/include/linux/fscrypt.h
index 3801c5c94fb6..379b423802fa 100644
--- a/include/linux/fscrypt.h
+++ b/include/linux/fscrypt.h
@@ -192,6 +192,8 @@ struct fscrypt_operations {
unsigned int *num_devs);
};
+int fscrypt_d_revalidate(struct dentry *dentry, unsigned int flags);
+
static inline struct fscrypt_inode_info *
fscrypt_get_inode_info(const struct inode *inode)
{
@@ -230,6 +232,14 @@ static inline bool fscrypt_needs_contents_encryption(const struct inode *inode)
static inline void fscrypt_handle_d_move(struct dentry *dentry)
{
dentry->d_flags &= ~DCACHE_NOKEY_NAME;
+
+ /*
+ * Save the d_revalidate call cost during VFS operations. We
+ * can do it because, when the key is available, the dentry
+ * can't go stale and the key won't go away without eviction.
+ */
+ if (dentry->d_op->d_revalidate == fscrypt_d_revalidate)
+ dentry->d_flags &= ~DCACHE_OP_REVALIDATE;
}
/**
@@ -368,7 +378,6 @@ int fscrypt_fname_disk_to_usr(const struct inode *inode,
bool fscrypt_match_name(const struct fscrypt_name *fname,
const u8 *de_name, u32 de_name_len);
u64 fscrypt_fname_siphash(const struct inode *dir, const struct qstr *name);
-int fscrypt_d_revalidate(struct dentry *dentry, unsigned int flags);
/* bio.c */
bool fscrypt_decrypt_bio(struct bio *bio);
--
2.43.0
fscrypt now supports configuring dentry operations at dentry-creation
time through the preset sb->s_d_op, instead of at lookup time.
Enable this in ubifs, since the lookup-time mechanism is going away.
Signed-off-by: Gabriel Krisman Bertazi <[email protected]>
---
fs/ubifs/dir.c | 1 -
fs/ubifs/super.c | 1 +
2 files changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/ubifs/dir.c b/fs/ubifs/dir.c
index 3b13c648d490..51b9a10a9851 100644
--- a/fs/ubifs/dir.c
+++ b/fs/ubifs/dir.c
@@ -205,7 +205,6 @@ static struct dentry *ubifs_lookup(struct inode *dir, struct dentry *dentry,
dbg_gen("'%pd' in dir ino %lu", dentry, dir->i_ino);
err = fscrypt_prepare_lookup(dir, dentry, &nm);
- generic_set_encrypted_ci_d_ops(dentry);
if (err == -ENOENT)
return d_splice_alias(NULL, dentry);
if (err)
diff --git a/fs/ubifs/super.c b/fs/ubifs/super.c
index 09e270d6ed02..304646b03e99 100644
--- a/fs/ubifs/super.c
+++ b/fs/ubifs/super.c
@@ -2239,6 +2239,7 @@ static int ubifs_fill_super(struct super_block *sb, void *data, int silent)
goto out_umount;
}
+ generic_set_sb_d_ops(sb);
sb->s_root = d_make_root(root);
if (!sb->s_root) {
err = -ENOMEM;
--
2.43.0
No filesystems depend on it anymore, and it is generally a bad idea.
Since all dentries should have the same set of dentry operations in
case-insensitive filesystems, it should be propagated through ->s_d_op.
Signed-off-by: Gabriel Krisman Bertazi <[email protected]>
---
fs/libfs.c | 34 ----------------------------------
include/linux/fs.h | 1 -
2 files changed, 35 deletions(-)
diff --git a/fs/libfs.c b/fs/libfs.c
index 9cd4df6969d2..c5c92ac76ba7 100644
--- a/fs/libfs.c
+++ b/fs/libfs.c
@@ -1788,40 +1788,6 @@ static const struct dentry_operations generic_encrypted_dentry_ops = {
};
#endif
-/**
- * generic_set_encrypted_ci_d_ops - helper for setting d_ops for given dentry
- * @dentry: dentry to set ops on
- *
- * Casefolded directories need d_hash and d_compare set, so that the dentries
- * contained in them are handled case-insensitively. Note that these operations
- * are needed on the parent directory rather than on the dentries in it, and
- * while the casefolding flag can be toggled on and off on an empty directory,
- * dentry_operations can't be changed later. As a result, if the filesystem has
- * casefolding support enabled at all, we have to give all dentries the
- * casefolding operations even if their inode doesn't have the casefolding flag
- * currently (and thus the casefolding ops would be no-ops for now).
- *
- * Encryption works differently in that the only dentry operation it needs is
- * d_revalidate, which it only needs on dentries that have the no-key name flag.
- * The no-key flag can't be set "later", so we don't have to worry about that.
- */
-void generic_set_encrypted_ci_d_ops(struct dentry *dentry)
-{
-#if IS_ENABLED(CONFIG_UNICODE)
- if (dentry->d_sb->s_encoding) {
- d_set_d_op(dentry, &generic_ci_dentry_ops);
- return;
- }
-#endif
-#ifdef CONFIG_FS_ENCRYPTION
- if (dentry->d_flags & DCACHE_NOKEY_NAME) {
- d_set_d_op(dentry, &generic_encrypted_dentry_ops);
- return;
- }
-#endif
-}
-EXPORT_SYMBOL(generic_set_encrypted_ci_d_ops);
-
/**
* generic_set_sb_d_ops - helper for choosing the set of
* filesystem-wide dentry operations for the enabled features
diff --git a/include/linux/fs.h b/include/linux/fs.h
index c985d9392b61..c0cfc53f95bb 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -3201,7 +3201,6 @@ extern int generic_file_fsync(struct file *, loff_t, loff_t, int);
extern int generic_check_addressable(unsigned, u64);
-extern void generic_set_encrypted_ci_d_ops(struct dentry *dentry);
extern void generic_set_sb_d_ops(struct super_block *sb);
static inline bool sb_has_encoding(const struct super_block *sb)
--
2.43.0
In preparation to get case-insensitive dentry operations from
sb->s_d_op again, use the same structure for case-insensitive
filesystems with and without fscrypt.
This means that on a casefolded filesystem without fscrypt, we end up
having to call fscrypt_d_revalidate once per dentry, which does the
function call extra cost. We could avoid it by calling
d_set_always_valid in generic_set_encrypted_ci_d_ops, but this entire
function will go away in the following patches, and it is not worth the
extra complexity. Also, since the first fscrypt_d_revalidate will call
d_set_always_valid for us, we'll only have only pay the cost once, and
not per-lookup.
Signed-off-by: Gabriel Krisman Bertazi <[email protected]>
---
Changes since v1:
- fix header guard (eric)
---
fs/libfs.c | 34 ++++++----------------------------
1 file changed, 6 insertions(+), 28 deletions(-)
diff --git a/fs/libfs.c b/fs/libfs.c
index c2aa6fd4795c..c4be0961faf0 100644
--- a/fs/libfs.c
+++ b/fs/libfs.c
@@ -1776,19 +1776,14 @@ static int generic_ci_d_hash(const struct dentry *dentry, struct qstr *str)
static const struct dentry_operations generic_ci_dentry_ops = {
.d_hash = generic_ci_d_hash,
.d_compare = generic_ci_d_compare,
-};
-#endif
-
#ifdef CONFIG_FS_ENCRYPTION
-static const struct dentry_operations generic_encrypted_dentry_ops = {
.d_revalidate = fscrypt_d_revalidate,
+#endif
};
#endif
-#if defined(CONFIG_FS_ENCRYPTION) && IS_ENABLED(CONFIG_UNICODE)
-static const struct dentry_operations generic_encrypted_ci_dentry_ops = {
- .d_hash = generic_ci_d_hash,
- .d_compare = generic_ci_d_compare,
+#ifdef CONFIG_FS_ENCRYPTION
+static const struct dentry_operations generic_encrypted_dentry_ops = {
.d_revalidate = fscrypt_d_revalidate,
};
#endif
@@ -1809,38 +1804,21 @@ static const struct dentry_operations generic_encrypted_ci_dentry_ops = {
* Encryption works differently in that the only dentry operation it needs is
* d_revalidate, which it only needs on dentries that have the no-key name flag.
* The no-key flag can't be set "later", so we don't have to worry about that.
- *
- * Finally, to maximize compatibility with overlayfs (which isn't compatible
- * with certain dentry operations) and to avoid taking an unnecessary
- * performance hit, we use custom dentry_operations for each possible
- * combination rather than always installing all operations.
*/
void generic_set_encrypted_ci_d_ops(struct dentry *dentry)
{
-#ifdef CONFIG_FS_ENCRYPTION
- bool needs_encrypt_ops = dentry->d_flags & DCACHE_NOKEY_NAME;
-#endif
#if IS_ENABLED(CONFIG_UNICODE)
- bool needs_ci_ops = dentry->d_sb->s_encoding;
-#endif
-#if defined(CONFIG_FS_ENCRYPTION) && IS_ENABLED(CONFIG_UNICODE)
- if (needs_encrypt_ops && needs_ci_ops) {
- d_set_d_op(dentry, &generic_encrypted_ci_dentry_ops);
+ if (dentry->d_sb->s_encoding) {
+ d_set_d_op(dentry, &generic_ci_dentry_ops);
return;
}
#endif
#ifdef CONFIG_FS_ENCRYPTION
- if (needs_encrypt_ops) {
+ if (dentry->d_flags & DCACHE_NOKEY_NAME) {
d_set_d_op(dentry, &generic_encrypted_dentry_ops);
return;
}
#endif
-#if IS_ENABLED(CONFIG_UNICODE)
- if (needs_ci_ops) {
- d_set_d_op(dentry, &generic_ci_dentry_ops);
- return;
- }
-#endif
}
EXPORT_SYMBOL(generic_set_encrypted_ci_d_ops);
--
2.43.0
Hello,
kernel test robot noticed "BUG:kernel_NULL_pointer_dereference,address" on:
commit: 1cfe4ba685d9eb6123648a0d9bef2d3d57b078ef ("[PATCH v3 04/10] fscrypt: Drop d_revalidate once the key is added")
url: https://github.com/intel-lab-lkp/linux/commits/Gabriel-Krisman-Bertazi/ovl-Reject-mounting-case-insensitive-filesystems/20240112-070113
base: https://git.kernel.org/cgit/linux/kernel/git/jaegeuk/f2fs.git dev-test
patch link: https://lore.kernel.org/all/[email protected]/
patch subject: [PATCH v3 04/10] fscrypt: Drop d_revalidate once the key is added
in testcase: fxmark
version: fxmark-x86_64-0ce9491-1_20220601
with following parameters:
disk: 1SSD
media: ssd
test: MWRL
fstype: xfs
directio: bufferedio
cpufreq_governor: performance
compiler: gcc-12
test machine: 128 threads 2 sockets Intel(R) Xeon(R) Platinum 8358 CPU @ 2.60GHz (Ice Lake) with 128G memory
(please refer to attached dmesg/kmsg for entire log/backtrace)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <[email protected]>
| Closes: https://lore.kernel.org/oe-lkp/[email protected]
[ 73.173380][ T6828] BUG: kernel NULL pointer dereference, address: 0000000000000000
[ 73.181338][ T6828] #PF: supervisor read access in kernel mode
[ 73.187453][ T6828] #PF: error_code(0x0000) - not-present page
[ 73.193566][ T6828] PGD 11cc47067 P4D 0
[ 73.197762][ T6828] Oops: 0000 [#1] SMP NOPTI
[ 73.202383][ T6828] CPU: 16 PID: 6828 Comm: fxmark Tainted: G S 6.7.0-rc1-00176-g1cfe4ba685d9 #1
[ 73.212818][ T6828] Hardware name: Intel Corporation M50CYP2SB1U/M50CYP2SB1U, BIOS SE5C620.86B.01.01.0003.2104260124 04/26/2021
[ 73.224837][ T6828] RIP: 0010:__d_move (include/linux/fscrypt.h:241 fs/dcache.c:3003)
[ 73.229912][ T6828] Code: c1 00 00 00 08 0f 84 ed 00 00 00 81 e1 3f 10 07 00 0f 84 e1 00 00 00 80 cc 40 89 c1 81 e1 ff ff ff fd 41 89 4d 00 49 8b 4d 60 <48> 81 39 10 21 4e 81 0f 84 66 01 00 00 83 43 04 01 41 83 45 04 01
All code
========
0: c1 00 00 roll $0x0,(%rax)
3: 00 08 add %cl,(%rax)
5: 0f 84 ed 00 00 00 je 0xf8
b: 81 e1 3f 10 07 00 and $0x7103f,%ecx
11: 0f 84 e1 00 00 00 je 0xf8
17: 80 cc 40 or $0x40,%ah
1a: 89 c1 mov %eax,%ecx
1c: 81 e1 ff ff ff fd and $0xfdffffff,%ecx
22: 41 89 4d 00 mov %ecx,0x0(%r13)
26: 49 8b 4d 60 mov 0x60(%r13),%rcx
2a:* 48 81 39 10 21 4e 81 cmpq $0xffffffff814e2110,(%rcx) <-- trapping instruction
31: 0f 84 66 01 00 00 je 0x19d
37: 83 43 04 01 addl $0x1,0x4(%rbx)
3b: 41 83 45 04 01 addl $0x1,0x4(%r13)
Code starting with the faulting instruction
===========================================
0: 48 81 39 10 21 4e 81 cmpq $0xffffffff814e2110,(%rcx)
7: 0f 84 66 01 00 00 je 0x173
d: 83 43 04 01 addl $0x1,0x4(%rbx)
11: 41 83 45 04 01 addl $0x1,0x4(%r13)
[ 73.249920][ T6828] RSP: 0018:ffa000000a99bce8 EFLAGS: 00010206
[ 73.256134][ T6828] RAX: 0000000000480000 RBX: ff1100012cab5380 RCX: 0000000000000000
[ 73.264248][ T6828] RDX: ff1100012cab4609 RSI: 0000000000000000 RDI: ff1100012cab4600
[ 73.272366][ T6828] RBP: 0000000000000000 R08: 0000000000000000 R09: 000000000000020c
[ 73.280473][ T6828] R10: ff110001622ddde0 R11: 0000000000010000 R12: 0000000000000000
[ 73.288584][ T6828] R13: ff1100012cab4600 R14: 0000000000000000 R15: ff1100012cab5200
[ 73.296699][ T6828] FS: 00007f1073011600(0000) GS:ff1100103f600000(0000) knlGS:0000000000000000
[ 73.305766][ T6828] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 73.312488][ T6828] CR2: 0000000000000000 CR3: 000000012af2a006 CR4: 0000000000771ef0
[ 73.320596][ T6828] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 73.328699][ T6828] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 73.336803][ T6828] PKRU: 55555554
[ 73.340485][ T6828] Call Trace:
[ 73.343900][ T6828] <TASK>
[ 73.346960][ T6828] ? __die (arch/x86/kernel/dumpstack.c:421 arch/x86/kernel/dumpstack.c:434)
[ 73.350983][ T6828] ? page_fault_oops (arch/x86/mm/fault.c:707)
[ 73.355957][ T6828] ? exc_page_fault (arch/x86/include/asm/irqflags.h:37 arch/x86/include/asm/irqflags.h:72 arch/x86/mm/fault.c:1513 arch/x86/mm/fault.c:1561)
[ 73.360837][ T6828] ? asm_exc_page_fault (arch/x86/include/asm/idtentry.h:570)
[ 73.365974][ T6828] ? __d_move (include/linux/fscrypt.h:241 fs/dcache.c:3003)
[ 73.370410][ T6828] ? __d_move (arch/x86/include/asm/atomic.h:23 include/linux/atomic/atomic-arch-fallback.h:457 include/linux/atomic/atomic-instrumented.h:33 include/asm-generic/qspinlock.h:57 include/linux/fsnotify_backend.h:580 fs/dcache.c:3002)
[ 73.374846][ T6828] d_move (include/linux/seqlock.h:500 include/linux/seqlock.h:572 include/linux/seqlock.h:910 fs/dcache.c:3032)
[ 73.378757][ T6828] vfs_rename (include/linux/fs.h:807 fs/namei.c:4864)
[ 73.383189][ T6828] ? do_renameat2 (fs/namei.c:4996)
[ 73.387963][ T6828] do_renameat2 (fs/namei.c:4996)
[ 73.392568][ T6828] __x64_sys_rename (fs/namei.c:5040)
[ 73.397336][ T6828] do_syscall_64 (arch/x86/entry/common.c:51 arch/x86/entry/common.c:82)
[ 73.401835][ T6828] entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:129)
[ 73.407817][ T6828] RIP: 0033:0x7f1072e83ed7
[ 73.412325][ T6828] Code: e8 6e 82 09 00 85 c0 0f 95 c0 0f b6 c0 f7 d8 5d c3 66 90 b8 ff ff ff ff 5d c3 66 0f 1f 84 00 00 00 00 00 b8 52 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 01 c3 48 8b 15 89 8f 17 00 f7 d8 64 89 02 b8
All code
========
0: e8 6e 82 09 00 callq 0x98273
5: 85 c0 test %eax,%eax
7: 0f 95 c0 setne %al
a: 0f b6 c0 movzbl %al,%eax
d: f7 d8 neg %eax
f: 5d pop %rbp
10: c3 retq
11: 66 90 xchg %ax,%ax
13: b8 ff ff ff ff mov $0xffffffff,%eax
18: 5d pop %rbp
19: c3 retq
1a: 66 0f 1f 84 00 00 00 nopw 0x0(%rax,%rax,1)
21: 00 00
23: b8 52 00 00 00 mov $0x52,%eax
28: 0f 05 syscall
2a:* 48 3d 00 f0 ff ff cmp $0xfffffffffffff000,%rax <-- trapping instruction
30: 77 01 ja 0x33
32: c3 retq
33: 48 8b 15 89 8f 17 00 mov 0x178f89(%rip),%rdx # 0x178fc3
3a: f7 d8 neg %eax
3c: 64 89 02 mov %eax,%fs:(%rdx)
3f: b8 .byte 0xb8
Code starting with the faulting instruction
===========================================
0: 48 3d 00 f0 ff ff cmp $0xfffffffffffff000,%rax
6: 77 01 ja 0x9
8: c3 retq
9: 48 8b 15 89 8f 17 00 mov 0x178f89(%rip),%rdx # 0x178f99
10: f7 d8 neg %eax
12: 64 89 02 mov %eax,%fs:(%rdx)
15: b8 .byte 0xb8
[ 73.432265][ T6828] RSP: 002b:00007ffe3fb6db98 EFLAGS: 00000202 ORIG_RAX: 0000000000000052
[ 73.440777][ T6828] RAX: ffffffffffffffda RBX: 00007f107300d840 RCX: 00007f1072e83ed7
[ 73.448853][ T6828] RDX: 0000000000000000 RSI: 00007ffe3fb6eba0 RDI: 00007ffe3fb6dba0
[ 73.456921][ T6828] RBP: 0000000000000000 R08: 0000000000000000 R09: 00007ffe3fb6d937
[ 73.464983][ T6828] R10: fffffffffffffa2e R11: 0000000000000202 R12: 000055ef6a56d0a2
[ 73.473046][ T6828] R13: 00007ffe3fb6eba0 R14: 00007ffe3fb6dba0 R15: 00007f1073004000
[ 73.481105][ T6828] </TASK>
[ 73.484219][ T6828] Modules linked in: xfs loop btrfs blake2b_generic xor raid6_pq libcrc32c sd_mod t10_pi crc64_rocksoft_generic crc64_rocksoft crc64 sg intel_rapl_msr intel_rapl_common x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel sha512_ssse3 rapl intel_cstate ast ahci libahci ipmi_ssif mei_me acpi_ipmi ioatdma drm_shmem_helper intel_uncore i2c_i801 dax_hmem ipmi_si libata drm_kms_helper mei joydev i2c_smbus dca intel_pch_thermal wmi ipmi_devintf ipmi_msghandler acpi_pad acpi_power_meter drm fuse ip_tables
[ 73.538153][ T6828] CR2: 0000000000000000
[ 73.542418][ T6828] ---[ end trace 0000000000000000 ]---
[ 73.560053][ T6828] pstore: backend (erst) writing error (-28)
[ 73.566136][ T6828] RIP: 0010:__d_move (include/linux/fscrypt.h:241 fs/dcache.c:3003)
[ 73.571178][ T6828] Code: c1 00 00 00 08 0f 84 ed 00 00 00 81 e1 3f 10 07 00 0f 84 e1 00 00 00 80 cc 40 89 c1 81 e1 ff ff ff fd 41 89 4d 00 49 8b 4d 60 <48> 81 39 10 21 4e 81 0f 84 66 01 00 00 83 43 04 01 41 83 45 04 01
All code
========
0: c1 00 00 roll $0x0,(%rax)
3: 00 08 add %cl,(%rax)
5: 0f 84 ed 00 00 00 je 0xf8
b: 81 e1 3f 10 07 00 and $0x7103f,%ecx
11: 0f 84 e1 00 00 00 je 0xf8
17: 80 cc 40 or $0x40,%ah
1a: 89 c1 mov %eax,%ecx
1c: 81 e1 ff ff ff fd and $0xfdffffff,%ecx
22: 41 89 4d 00 mov %ecx,0x0(%r13)
26: 49 8b 4d 60 mov 0x60(%r13),%rcx
2a:* 48 81 39 10 21 4e 81 cmpq $0xffffffff814e2110,(%rcx) <-- trapping instruction
31: 0f 84 66 01 00 00 je 0x19d
37: 83 43 04 01 addl $0x1,0x4(%rbx)
3b: 41 83 45 04 01 addl $0x1,0x4(%r13)
Code starting with the faulting instruction
===========================================
0: 48 81 39 10 21 4e 81 cmpq $0xffffffff814e2110,(%rcx)
7: 0f 84 66 01 00 00 je 0x173
d: 83 43 04 01 addl $0x1,0x4(%rbx)
11: 41 83 45 04 01 addl $0x1,0x4(%r13)
The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20240118/[email protected]
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
kernel test robot <[email protected]> writes:
> Hello,
>
> kernel test robot noticed "BUG:kernel_NULL_pointer_dereference,address" on:
>
> commit: 1cfe4ba685d9eb6123648a0d9bef2d3d57b078ef ("[PATCH v3 04/10] fscrypt: Drop d_revalidate once the key is added")
> url: https://github.com/intel-lab-lkp/linux/commits/Gabriel-Krisman-Bertazi/ovl-Reject-mounting-case-insensitive-filesystems/20240112-070113
> base: https://git.kernel.org/cgit/linux/kernel/git/jaegeuk/f2fs.git dev-test
> patch link: https://lore.kernel.org/all/[email protected]/
> patch subject: [PATCH v3 04/10] fscrypt: Drop d_revalidate once the key is added
>
> in testcase: fxmark
> version: fxmark-x86_64-0ce9491-1_20220601
> with following parameters:
>
> disk: 1SSD
> media: ssd
> test: MWRL
> fstype: xfs
> directio: bufferedio
> cpufreq_governor: performance
>
>
>
> compiler: gcc-12
> test machine: 128 threads 2 sockets Intel(R) Xeon(R) Platinum 8358 CPU @ 2.60GHz (Ice Lake) with 128G memory
>
> (please refer to attached dmesg/kmsg for entire log/backtrace)
>
>
>
> If you fix the issue in a separate patch/commit (i.e. not just a new version of
> the same patch/commit), kindly add following tags
> | Reported-by: kernel test robot <[email protected]>
> | Closes: https://lore.kernel.org/oe-lkp/[email protected]
>
>
> [ 73.173380][ T6828] BUG: kernel NULL pointer dereference, address: 0000000000000000
> [ 73.181338][ T6828] #PF: supervisor read access in kernel mode
> [ 73.187453][ T6828] #PF: error_code(0x0000) - not-present page
> [ 73.193566][ T6828] PGD 11cc47067 P4D 0
> [ 73.197762][ T6828] Oops: 0000 [#1] SMP NOPTI
> [ 73.202383][ T6828] CPU: 16 PID: 6828 Comm: fxmark Tainted: G S 6.7.0-rc1-00176-g1cfe4ba685d9 #1
> [ 73.212818][ T6828] Hardware name: Intel Corporation M50CYP2SB1U/M50CYP2SB1U, BIOS SE5C620.86B.01.01.0003.2104260124 04/26/2021
> [ 73.224837][ T6828] RIP: 0010:__d_move (include/linux/fscrypt.h:241 fs/dcache.c:3003)
> [ 73.229912][ T6828] Code: c1 00 00 00 08 0f 84 ed 00 00 00 81 e1 3f 10 07 00 0f 84 e1 00 00 00 80 cc 40 89 c1 81 e1 ff ff ff fd 41 89 4d 00 49 8b 4d 60 <48> 81 39 10 21 4e 81 0f 84 66 01 00 00 83 43 04 01 41 83 45 04 01
> All code
> ========
> 0: c1 00 00 roll $0x0,(%rax)
> 3: 00 08 add %cl,(%rax)
> 5: 0f 84 ed 00 00 00 je 0xf8
> b: 81 e1 3f 10 07 00 and $0x7103f,%ecx
> 11: 0f 84 e1 00 00 00 je 0xf8
> 17: 80 cc 40 or $0x40,%ah
> 1a: 89 c1 mov %eax,%ecx
> 1c: 81 e1 ff ff ff fd and $0xfdffffff,%ecx
> 22: 41 89 4d 00 mov %ecx,0x0(%r13)
> 26: 49 8b 4d 60 mov 0x60(%r13),%rcx
> 2a:* 48 81 39 10 21 4e 81 cmpq $0xffffffff814e2110,(%rcx) <-- trapping instruction
> 31: 0f 84 66 01 00 00 je 0x19d
> 37: 83 43 04 01 addl $0x1,0x4(%rbx)
> 3b: 41 83 45 04 01 addl $0x1,0x4(%r13)
>
> Code starting with the faulting instruction
> ===========================================
> 0: 48 81 39 10 21 4e 81 cmpq $0xffffffff814e2110,(%rcx)
> 7: 0f 84 66 01 00 00 je 0x173
> d: 83 43 04 01 addl $0x1,0x4(%rbx)
> 11: 41 83 45 04 01 addl $0x1,0x4(%r13)
> [ 73.249920][ T6828] RSP: 0018:ffa000000a99bce8 EFLAGS: 00010206
> [ 73.256134][ T6828] RAX: 0000000000480000 RBX: ff1100012cab5380 RCX: 0000000000000000
> [ 73.264248][ T6828] RDX: ff1100012cab4609 RSI: 0000000000000000 RDI: ff1100012cab4600
> [ 73.272366][ T6828] RBP: 0000000000000000 R08: 0000000000000000 R09: 000000000000020c
> [ 73.280473][ T6828] R10: ff110001622ddde0 R11: 0000000000010000 R12: 0000000000000000
> [ 73.288584][ T6828] R13: ff1100012cab4600 R14: 0000000000000000 R15: ff1100012cab5200
> [ 73.296699][ T6828] FS: 00007f1073011600(0000) GS:ff1100103f600000(0000) knlGS:0000000000000000
> [ 73.305766][ T6828] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 73.312488][ T6828] CR2: 0000000000000000 CR3: 000000012af2a006 CR4: 0000000000771ef0
> [ 73.320596][ T6828] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [ 73.328699][ T6828] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [ 73.336803][ T6828] PKRU: 55555554
> [ 73.340485][ T6828] Call Trace:
> [ 73.343900][ T6828] <TASK>
> [ 73.346960][ T6828] ? __die (arch/x86/kernel/dumpstack.c:421 arch/x86/kernel/dumpstack.c:434)
> [ 73.350983][ T6828] ? page_fault_oops (arch/x86/mm/fault.c:707)
> [ 73.355957][ T6828] ? exc_page_fault (arch/x86/include/asm/irqflags.h:37 arch/x86/include/asm/irqflags.h:72 arch/x86/mm/fault.c:1513 arch/x86/mm/fault.c:1561)
> [ 73.360837][ T6828] ? asm_exc_page_fault (arch/x86/include/asm/idtentry.h:570)
> [ 73.365974][ T6828] ? __d_move (include/linux/fscrypt.h:241 fs/dcache.c:3003)
> [ 73.370410][ T6828] ? __d_move (arch/x86/include/asm/atomic.h:23 include/linux/atomic/atomic-arch-fallback.h:457 include/linux/atomic/atomic-instrumented.h:33 include/asm-generic/qspinlock.h:57 include/linux/fsnotify_backend.h:580 fs/dcache.c:3002)
> [ 73.374846][ T6828] d_move (include/linux/seqlock.h:500 include/linux/seqlock.h:572 include/linux/seqlock.h:910 fs/dcache.c:3032)
> [ 73.378757][ T6828] vfs_rename (include/linux/fs.h:807 fs/namei.c:4864)
> [ 73.383189][ T6828] ? do_renameat2 (fs/namei.c:4996)
> [ 73.387963][ T6828] do_renameat2 (fs/namei.c:4996)
> [ 73.392568][ T6828] __x64_sys_rename (fs/namei.c:5040)
> [ 73.397336][ T6828] do_syscall_64 (arch/x86/entry/common.c:51 arch/x86/entry/common.c:82)
> [ 73.401835][ T6828] entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:129)
> [ 73.407817][ T6828] RIP: 0033:0x7f1072e83ed7
> [ 73.412325][ T6828] Code: e8 6e 82 09 00 85 c0 0f 95 c0 0f b6 c0 f7 d8 5d c3 66 90 b8 ff ff ff ff 5d c3 66 0f 1f 84 00 00 00 00 00 b8 52 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 01 c3 48 8b 15 89 8f 17 00 f7 d8 64 89 02 b8
> All code
> ========
> 0: e8 6e 82 09 00 callq 0x98273
> 5: 85 c0 test %eax,%eax
> 7: 0f 95 c0 setne %al
> a: 0f b6 c0 movzbl %al,%eax
> d: f7 d8 neg %eax
> f: 5d pop %rbp
> 10: c3 retq
> 11: 66 90 xchg %ax,%ax
> 13: b8 ff ff ff ff mov $0xffffffff,%eax
> 18: 5d pop %rbp
> 19: c3 retq
> 1a: 66 0f 1f 84 00 00 00 nopw 0x0(%rax,%rax,1)
> 21: 00 00
> 23: b8 52 00 00 00 mov $0x52,%eax
> 28: 0f 05 syscall
> 2a:* 48 3d 00 f0 ff ff cmp $0xfffffffffffff000,%rax <-- trapping instruction
> 30: 77 01 ja 0x33
> 32: c3 retq
> 33: 48 8b 15 89 8f 17 00 mov 0x178f89(%rip),%rdx # 0x178fc3
> 3a: f7 d8 neg %eax
> 3c: 64 89 02 mov %eax,%fs:(%rdx)
> 3f: b8 .byte 0xb8
>
Hm. So, the thing I missed here is that fscrypt_handle_d_move will be
called even by filesystems that don't support fscrypt. While we know
fscrypt filesystem dentries must have ->d_op, others might not have
it, and the dereferencing of dentry->d_op causes the oops at:
if (dentry->d_op->d_revalidate == fscrypt_d_revalidate)
causes the Oops.
One fix would be to prevent non-fscrypt filesystems from calling this
function. But since __d_move only touches the dentries, I think I'll
leave it as-is and just do:
if (dentry->d_op && dentry->d_op->d_revalidate)
sorry for the noise.
--
Gabriel Krisman Bertazi