2023-07-11 02:34:53

by Krister Johansen

[permalink] [raw]
Subject: [RFC PATCH 0/2] virtiofs submounts forgotten after client memory pressure

Hi,
I recently ran into a situation where a virtiofs client began
encountering EBADF after the client system had an OOM. After
reproducing the issue and debugging, it appears that the problem is
caused by a virtiofsd submount being forgotten once the dentry
referencing that submount is killed by the shrinker. In this particular
case, the submount had been bind mounted into a container's mount
namespace. The reference count on the original dentry was 0, making it
eligible for eviction. However, because this dentry was also the last
reference the client knew it had, it sent a forget message to the
server. This caused all future references to the FUSE node-id from
virtiofsd perspective to become invalid. Subsequent attempts to used
the node-id received an EBADF from the server.

This pair of patches modifies the virtiofs submount code to perform a
lookup on the nodeid that forms the root of the submount. The patch
before this pulls the revalidate lookup code into a helper function that
can be used both in revalidate and submount superblock fill.

I'm not enamored with this approach, but was hard pressed to think of a
more clever idea.

In the meantime, it's been tested via:

- fstests for virtiofs
- fstests for fuse (against passthrough_ll)
- manual testing to watch how refcounts change between client and server
in response to filesytem access, umount, and eviction by the shrinker.

Thanks,

-K

Krister Johansen (2):
fuse: revalidate: move lookup into a separate function
fuse: ensure that submounts lookup their root

fs/fuse/dir.c | 87 +++++++++++++++++++++++++++++++++---------------
fs/fuse/fuse_i.h | 6 ++++
fs/fuse/inode.c | 32 +++++++++++++++---
3 files changed, 94 insertions(+), 31 deletions(-)

--
2.25.1



2023-07-11 02:35:33

by Krister Johansen

[permalink] [raw]
Subject: [RFC PATCH 1/2] fuse: revalidate: move lookup into a separate function

If this refactoring seems cumbersome, it's because the goal is to move
the lookup parts of fuse_dentry_revalidate into a common function. This
function will be used elsewhere in a separate commit. In the meantime,
the new function fuse_dentry_revalidate_lookup is responsible for just
the lookup and validation portions of the revalidate dance. The
fuse_dentry_revalidate function retains the responsibility for
invalidating and mutating any state associated with the origial
fuse_inode and dentry.

Signed-off-by: Krister Johansen <[email protected]>
---
fs/fuse/dir.c | 87 +++++++++++++++++++++++++++++++++++----------------
1 file changed, 60 insertions(+), 27 deletions(-)

diff --git a/fs/fuse/dir.c b/fs/fuse/dir.c
index f67bef9d83c4..bdf5526a0733 100644
--- a/fs/fuse/dir.c
+++ b/fs/fuse/dir.c
@@ -193,6 +193,59 @@ static void fuse_lookup_init(struct fuse_conn *fc, struct fuse_args *args,
args->out_args[0].value = outarg;
}

+static int fuse_dentry_revalidate_lookup(struct fuse_mount *fm,
+ struct dentry *entry,
+ struct inode *inode,
+ struct fuse_entry_out *outarg,
+ bool *lookedup)
+{
+ struct dentry *parent;
+ struct fuse_forget_link *forget;
+ struct fuse_inode *fi;
+ FUSE_ARGS(args);
+ int ret;
+
+ forget = fuse_alloc_forget();
+ ret = -ENOMEM;
+ if (!forget)
+ goto out;
+
+ parent = dget_parent(entry);
+ fuse_lookup_init(fm->fc, &args, get_node_id(d_inode(parent)),
+ &entry->d_name, outarg);
+ ret = fuse_simple_request(fm, &args);
+ dput(parent);
+
+ /* Zero nodeid is same as -ENOENT */
+ if (!ret && !outarg->nodeid)
+ ret = -ENOENT;
+ if (!ret) {
+ fi = get_fuse_inode(inode);
+ if (outarg->nodeid != get_node_id(inode) ||
+ (bool) IS_AUTOMOUNT(inode) != (bool) (outarg->attr.flags & FUSE_ATTR_SUBMOUNT)) {
+ fuse_queue_forget(fm->fc, forget,
+ outarg->nodeid, 1);
+ goto invalid;
+ }
+ *lookedup = true;
+ }
+ kfree(forget);
+ if (ret == -ENOMEM || ret == -EINTR)
+ goto out;
+ if (ret || fuse_invalid_attr(&outarg->attr) ||
+ fuse_stale_inode(inode, outarg->generation, &outarg->attr)) {
+ goto invalid;
+ }
+
+ ret = 1;
+out:
+ return ret;
+
+invalid:
+ ret = 0;
+ goto out;
+}
+
/*
* Check whether the dentry is still valid
*
@@ -216,9 +269,8 @@ static int fuse_dentry_revalidate(struct dentry *entry, unsigned int flags)
else if (time_before64(fuse_dentry_time(entry), get_jiffies_64()) ||
(flags & (LOOKUP_EXCL | LOOKUP_REVAL | LOOKUP_RENAME_TARGET))) {
struct fuse_entry_out outarg;
- FUSE_ARGS(args);
- struct fuse_forget_link *forget;
u64 attr_version;
+ bool lookedup = false;

/* For negative dentries, always do a fresh lookup */
if (!inode)
@@ -230,38 +282,19 @@ static int fuse_dentry_revalidate(struct dentry *entry, unsigned int flags)

fm = get_fuse_mount(inode);

- forget = fuse_alloc_forget();
- ret = -ENOMEM;
- if (!forget)
- goto out;
-
attr_version = fuse_get_attr_version(fm->fc);

- parent = dget_parent(entry);
- fuse_lookup_init(fm->fc, &args, get_node_id(d_inode(parent)),
- &entry->d_name, &outarg);
- ret = fuse_simple_request(fm, &args);
- dput(parent);
- /* Zero nodeid is same as -ENOENT */
- if (!ret && !outarg.nodeid)
- ret = -ENOENT;
- if (!ret) {
+ ret = fuse_dentry_revalidate_lookup(fm, entry, inode, &outarg,
+ &lookedup);
+ if (ret == -ENOMEM || ret == -EINTR)
+ goto out;
+ if (lookedup) {
fi = get_fuse_inode(inode);
- if (outarg.nodeid != get_node_id(inode) ||
- (bool) IS_AUTOMOUNT(inode) != (bool) (outarg.attr.flags & FUSE_ATTR_SUBMOUNT)) {
- fuse_queue_forget(fm->fc, forget,
- outarg.nodeid, 1);
- goto invalid;
- }
spin_lock(&fi->lock);
fi->nlookup++;
spin_unlock(&fi->lock);
}
- kfree(forget);
- if (ret == -ENOMEM || ret == -EINTR)
- goto out;
- if (ret || fuse_invalid_attr(&outarg.attr) ||
- fuse_stale_inode(inode, outarg.generation, &outarg.attr))
+ if (ret <= 0)
goto invalid;

forget_all_cached_acls(inode);
--
2.25.1