2022-06-01 21:39:20

by Gabriel Krisman Bertazi

[permalink] [raw]
Subject: [PATCH RFC 0/7] Support negative dentries on case-insensitive directories

Hi,

I'm sending this as an RFC given the amount of corner cases that appear
when doing case-insensitive lookups. I'm confident I've covered every
one of them, but I'd appreciate the review to ensure I'm not missing a
case where negative dentries cannot be trusted on case-insensitive
directories.

This survives every corner case I could think of, including those
already tested by fstests. I also ran sanity checks to show it
uses the created negative dentries, and I observed the expected
performance increase of the negative dentry cache hit.

* Background

Negative dentries have always been disabled in case-insensitive
directories because they don't provide enough assurance that every case
variation of a filename doesn't exist in a directory and because there
are known corner cases in file creation where negative dentries can't be
instantiated.

In the trivial case the upstream implementation already works for
negative dentries, even though it is disabled. That is: if the lookup
that caused the dentry to be created was done in a case-insensitive way,
the negative dentry can already be trusted, since it means that no valid
dcache entry exists, *and* that no variation of the file exists on
disk (since the lookup failed). A following lookup will then be executed
with case-insensitive-aware d_hash and d_lookup, it will find the right
negative dentry and can trust it. It has a creation problem, though,
discussed below.

The first problem appears when a case-insensitive directory has negative
dentries that were created when the directory was case-sensitive. A
further lookup would incorrectly trust it:

This sequence demonstrates the problem:

mkdir d
touch d/$1
touch d/$2
unlink d/$1 <- leaves negative dentry.
unlink d/$2 <- leaves negative dentry.
chattr +F d
touch d/$1 <- finds one of the negative dentries, makes it positive
< if d/$1 is d_drop somehow >
access d/$2 <- Might find the other negative dentry, get -ENOENT

There are actually a few problems here. The first is that a preexisting
negative dentry created during a case-sensitive lookup doesn't guarantee
that no other variation of the name exists. This is not a big problem
in the common case, since the directory has to be empty to be converted,
and the d_hash and d_compare are case-insensitive; which means they will
find the same dentry and reuse it most of the time (except for
invalidations and hash collisions). But it means that we are leaving
behind a stalled dentry that shouldn't be there.

The real problem happens if $1 and $2 are two strings where:
(i) casefold($1) == casefold($2)
(ii) hash($1) == hash($2) == hash(casefold($1))

This condition is the worst case. Both negative dentries can
potentially be found during a case-insensitive lookup if the wrong
dentry is invalidated.

In fact, this is a problem even on the current implementation. There
was a bug reported by Al in 2020 [1], where a directory might end up
with dangling negative dentries created during a case-sensitive lookup,
because when the +F attribute is set; even though that code requires an
empty directory, it doesn't check for negative dentries.

Condition (ii) is hard to test, but not impossible. But, even if it
is not present, we still leave negative dentries behind, which shouldn't
currently exist for a case-insensitive directory.

A completely different problem with negative dentries on
case-insensitive directories exist when turning a negative dentry to
positive. If the negative dentry has a different case than what is
currently being looked up, the dentry cannot be reused without changing
its name, because we guarantee filename-preserving semantics. We need
to either change the name or invalidate the dentry. This is currently
done in the upstream kernel by completely stopping negative dentries
from being created in the first place.

* Proposal

The proposed solution is to differentiate negative dentries created from
a case-insensitive context from those created during a case-sensitive
one via a new dentry flag, D_CASEFOLD_LOOKUP, set by the filesystem
during d_lookup. Since a negative dentry created during a
case-insensitive lookup can be trusted (except for the name-preserving
issue), we can check that flag during d_revalidate to quickly accept or
reject the negative dentry.

Another solution for that problem would be to guarantee that no negative
dentry exists during the Case-sensitive to case-insensitive directory
conversion (the other direction is safe). This has the following
problems:

1) It is not trivial to implement a race-free mechanism to ensure
negative dentries won't be recreated immediately after invalidation
while converting the directory.

2) The knowledge whether the negative dentry can be is
valid (i.e. comes from a case-insensitive lookup) is implicit on the
fact that we are correctly invalidating dentries when converting the
directory.

Having a D_CASEFOLD_LOOKUP avoids both issues, and seems to be a cheap
solution to explicitly decide whether to validate a negative dentry.

As explained, in order to maintain the filename preserving semantics, it
is not sufficient to reuse the dentry.

One solution would be to invalidate the negative dentry when it is
decided to turn it positive, instead of reusing it. I implemented that
in the past (2018) but my understanding is that we don't want to incur
costs on the VFS critical path for other filesystems who don't care
about case-insensitiveness. I think there is also a challenge in making
this invalidation race-free, but it might be simpler than I thought.

Instead, I'm suggesting that we only validate negative dentries in
casefold directories during lookups that will instantiate the dentry
when the lookup name is exactly what is cached.

* caveats

1) Encryption

Currently, negative dentries on encrypted directories are also disabled.
No semantic change on encrypted directories is intended in this
patchset; we just bypass the revalidation directly to fscrypt, for
positive dentries. I'm working on this case as future work.

2) revalidate the cached dentry using the name under lookup

This is strange for a cache. the new semantic is implemented on
d_revalidate() to stay out of the critical path of filesystems that
don't care about case-insensitive. But this requires the revalidation
hook to validate based on what name is under lookup, which is odd for a
cache.

* Tests

There are a few tests for the corner cases discussed above in
generic/556. They mainly verify the name-preserving semantics.
The invalidation when converting the directory is harder to test,
because it is hard to force the invalidation of specific cached dentries
that occlude a dangling invalid dentry. I tested it with forcing the
positive dentries to be removed, but I'm not sure how to write an
upstreamable test for it.

This also survives smoke test on ext4 and f2fs.

* patchset

Patch 1 introduces a new version of d_revalidate to provide the
filesystem with the name under lookup; Patch 2 introduces a new dentry
flag to mark dentries as created during a case-insensitive lookup; Patch
3 introduces a libfs helper to validate negative dentries on
case-insensitive directories; Patch 4 deals with encryption; Patch 5
cleans up the now redundant dentry operations for case-insensitive with
and without encryption; Finally, Patch 6 and 7 enable negative dentries
on case-insensitive directories for ext4 and f2fs, respectively.

Gabriel Krisman Bertazi (7):
fs: Expose name under lookup to d_revalidate hook
fs: Add DCACHE_CASEFOLD_LOOKUP flag
libfs: Validate negative dentries in case-insensitive directories
libfs: Support revalidation of encrypted case-insensitive dentries
libfs: Merge encrypted_ci_dentry_ops and ci_dentry_ops
ext4: Enable negative dentries on case-insensitive lookup
f2fs: Enable negative dentries on case-insensitive lookup

fs/dcache.c | 9 +++++-
fs/ext4/namei.c | 35 +++-----------------
fs/f2fs/namei.c | 23 ++------------
fs/libfs.c | 72 ++++++++++++++++++++++++------------------
fs/namei.c | 23 ++++++++------
include/linux/dcache.h | 9 ++++++
6 files changed, 78 insertions(+), 93 deletions(-)

--
2.36.1



2022-06-01 21:39:24

by Gabriel Krisman Bertazi

[permalink] [raw]
Subject: [PATCH RFC 4/7] libfs: Support revalidation of encrypted case-insensitive dentries

Preserve the existing behavior for encrypted directories, by rejecting
negative dentries of encrypted+casefolded directories. This allows
generic_ci_d_revalidate to be used by filesystems with both features
enabled, as long as the directory is either casefolded or encrypted, but
not both at the same time.

Signed-off-by: Gabriel Krisman Bertazi <[email protected]>
---
fs/libfs.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/fs/libfs.c b/fs/libfs.c
index 618a85c08aa7..fe22738291e4 100644
--- a/fs/libfs.c
+++ b/fs/libfs.c
@@ -1461,6 +1461,9 @@ static inline int generic_ci_d_revalidate(struct dentry *dentry,
const struct inode *dir = READ_ONCE(parent->d_inode);

if (dir && needs_casefold(dir)) {
+ if (IS_ENCRYPTED(dir))
+ return 0;
+
if (!d_is_casefold_lookup(dentry))
return 0;

@@ -1470,7 +1473,8 @@ static inline int generic_ci_d_revalidate(struct dentry *dentry,
return 0;
}
}
- return 1;
+
+ return fscrypt_d_revalidate(dentry, flags);
}

static const struct dentry_operations generic_ci_dentry_ops = {
@@ -1490,7 +1494,7 @@ static const struct dentry_operations generic_encrypted_dentry_ops = {
static const struct dentry_operations generic_encrypted_ci_dentry_ops = {
.d_hash = generic_ci_d_hash,
.d_compare = generic_ci_d_compare,
- .d_revalidate = fscrypt_d_revalidate,
+ .d_revalidate_name = generic_ci_d_revalidate,
};
#endif

--
2.36.1


2022-06-01 21:39:25

by Gabriel Krisman Bertazi

[permalink] [raw]
Subject: [PATCH RFC 7/7] f2fs: Enable negative dentries on case-insensitive lookup

Instead of invalidating negative dentries during case-insensitive
lookups, mark them as such and let them be added to the dcache.
d_ci_revalidate is able to properly filter them out if necessary based
on the dentry casefold flag.

Signed-off-by: Gabriel Krisman Bertazi <[email protected]>
---
fs/f2fs/namei.c | 23 ++---------------------
1 file changed, 2 insertions(+), 21 deletions(-)

diff --git a/fs/f2fs/namei.c b/fs/f2fs/namei.c
index 5ed79b29999f..6e970a6c3623 100644
--- a/fs/f2fs/namei.c
+++ b/fs/f2fs/namei.c
@@ -562,17 +562,8 @@ static struct dentry *f2fs_lookup(struct inode *dir, struct dentry *dentry,
goto out_iput;
}
out_splice:
-#if IS_ENABLED(CONFIG_UNICODE)
- if (!inode && IS_CASEFOLDED(dir)) {
- /* Eventually we want to call d_add_ci(dentry, NULL)
- * for negative dentries in the encoding case as
- * well. For now, prevent the negative dentry
- * from being cached.
- */
- trace_f2fs_lookup_end(dir, dentry, ino, err);
- return NULL;
- }
-#endif
+ if (IS_ENABLED(CONFIG_UNICODE) && IS_CASEFOLDED(dir))
+ d_set_casefold_lookup(dentry);
new = d_splice_alias(inode, dentry);
err = PTR_ERR_OR_ZERO(new);
trace_f2fs_lookup_end(dir, dentry, ino, !new ? -ENOENT : err);
@@ -623,16 +614,6 @@ static int f2fs_unlink(struct inode *dir, struct dentry *dentry)
goto fail;
}
f2fs_delete_entry(de, page, dir, inode);
-#if IS_ENABLED(CONFIG_UNICODE)
- /* VFS negative dentries are incompatible with Encoding and
- * Case-insensitiveness. Eventually we'll want avoid
- * invalidating the dentries here, alongside with returning the
- * negative dentries at f2fs_lookup(), when it is better
- * supported by the VFS for the CI case.
- */
- if (IS_CASEFOLDED(dir))
- d_invalidate(dentry);
-#endif
f2fs_unlock_op(sbi);

if (IS_DIRSYNC(dir))
--
2.36.1


2022-06-01 21:39:27

by Gabriel Krisman Bertazi

[permalink] [raw]
Subject: [PATCH RFC 3/7] libfs: Validate negative dentries in case-insensitive directories

Introduce a dentry revalidation helper to be used by case-insensitive
filesystems to check if it is safe to reuse a negative dentry.

A negative dentry is safe to be reused on a case-insensitive lookup if
it was created during a case-insensitive lookup and this is not a lookup
that will instantiate a dentry. If this is a creation lookup, we also
need to make sure the name matches sensitively the name under lookup in
order to assure the name preserving semantics.

Signed-off-by: Gabriel Krisman Bertazi <[email protected]>
---
fs/libfs.c | 24 ++++++++++++++++++++++++
1 file changed, 24 insertions(+)

diff --git a/fs/libfs.c b/fs/libfs.c
index e64bdedef168..618a85c08aa7 100644
--- a/fs/libfs.c
+++ b/fs/libfs.c
@@ -1450,9 +1450,33 @@ static int generic_ci_d_hash(const struct dentry *dentry, struct qstr *str)
return 0;
}

+static inline int generic_ci_d_revalidate(struct dentry *dentry,
+ const struct qstr *name,
+ unsigned int flags)
+{
+ int is_creation = flags & (LOOKUP_CREATE | LOOKUP_RENAME_TARGET);
+
+ if (d_is_negative(dentry)) {
+ const struct dentry *parent = READ_ONCE(dentry->d_parent);
+ const struct inode *dir = READ_ONCE(parent->d_inode);
+
+ if (dir && needs_casefold(dir)) {
+ if (!d_is_casefold_lookup(dentry))
+ return 0;
+
+ if (is_creation &&
+ (dentry->d_name.len != name->len ||
+ memcmp(dentry->d_name.name, name->name, name->len)))
+ return 0;
+ }
+ }
+ return 1;
+}
+
static const struct dentry_operations generic_ci_dentry_ops = {
.d_hash = generic_ci_d_hash,
.d_compare = generic_ci_d_compare,
+ .d_revalidate_name = generic_ci_d_revalidate,
};
#endif

--
2.36.1