v2:
- Simplified the first patch, by removing code cleanups, to reduce churn
and minimize regression potential in stable trees
+ The resulting changes are *identical* to the v1 submission when
patches 1, 2, and 3 are applied
- Added patches 4 and 5 which are additional, minor code cleanups
v1: https://lore.kernel.org/lkml/[email protected]/
The primary motivation for this series is patch #1 which fixes a
refcounting issue in the path walking code of
v9fs_fid_lookup_with_uid(). Userspace can cause fids, which are created
for use during lookup, to not be clunked and make the mount unusable.
The remaining patches are code cleanups to improve readability. They're
not critical.
I tested v1 and v2 by running fstests against 9p inside of QEMU to
ensure that the results were the same as without my changes applied. I
also backported patch #1 of v2 to linux-5.15.y (v5.15.32) and ensured
that the 'fid not clunked' issue did not occur when running against the
Windows Subsystem for Linux (WSL) 9p server.
Tyler
Tyler Hicks (5):
9p: Fix refcounting during full path walks for fid lookups
9p: Track the root fid with its own variable during lookups
9p: Make the path walk logic more clear about when cloning is required
9p: Remove unnecessary variable for old fids while walking from
d_parent
9p: Fix minor typo in code comment
fs/9p/fid.c | 50 ++++++++++++++++++++++----------------------------
1 file changed, 22 insertions(+), 28 deletions(-)
--
2.25.1
Fix s/patch/path/ typo and make it clear that we're talking about
multiple path components.
Signed-off-by: Tyler Hicks <[email protected]>
---
fs/9p/fid.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/9p/fid.c b/fs/9p/fid.c
index d8c70c4cd3c2..60fc62081598 100644
--- a/fs/9p/fid.c
+++ b/fs/9p/fid.c
@@ -219,7 +219,7 @@ static struct p9_fid *v9fs_fid_lookup_with_uid(struct dentry *dentry,
l = min(n - i, P9_MAXWELEM);
/*
* We need to hold rename lock when doing a multipath
- * walk to ensure none of the patch component change
+ * walk to ensure none of the path components change
*/
fid = p9_client_walk(old_fid, l, &wnames[i],
old_fid == root_fid);
--
2.25.1
Remove the ofid variable that's local to the conditional block in favor
of the old_fid variable that's local to the entire function.
Signed-off-by: Tyler Hicks <[email protected]>
---
fs/9p/fid.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/fs/9p/fid.c b/fs/9p/fid.c
index dfe98a308612..d8c70c4cd3c2 100644
--- a/fs/9p/fid.c
+++ b/fs/9p/fid.c
@@ -169,10 +169,10 @@ static struct p9_fid *v9fs_fid_lookup_with_uid(struct dentry *dentry,
fid = v9fs_fid_find(ds, uid, any);
if (fid) {
/* Found the parent fid do a lookup with that */
- struct p9_fid *ofid = fid;
+ old_fid = fid;
- fid = p9_client_walk(ofid, 1, &dentry->d_name.name, 1);
- p9_client_clunk(ofid);
+ fid = p9_client_walk(old_fid, 1, &dentry->d_name.name, 1);
+ p9_client_clunk(old_fid);
goto fid_out;
}
up_read(&v9ses->rename_sem);
--
2.25.1
Cloning during a path component walk is only needed when the old fid
used for the walk operation is the root fid. Make that clear by
comparing the two fids rather than using an additional variable.
Signed-off-by: Tyler Hicks <[email protected]>
---
fs/9p/fid.c | 7 +++----
1 file changed, 3 insertions(+), 4 deletions(-)
diff --git a/fs/9p/fid.c b/fs/9p/fid.c
index dae276ca7f7a..dfe98a308612 100644
--- a/fs/9p/fid.c
+++ b/fs/9p/fid.c
@@ -150,7 +150,7 @@ static struct p9_fid *v9fs_fid_lookup_with_uid(struct dentry *dentry,
{
struct dentry *ds;
const unsigned char **wnames, *uname;
- int i, n, l, clone, access;
+ int i, n, l, access;
struct v9fs_session_info *v9ses;
struct p9_fid *fid, *root_fid, *old_fid;
@@ -214,7 +214,6 @@ static struct p9_fid *v9fs_fid_lookup_with_uid(struct dentry *dentry,
}
fid = root_fid;
old_fid = root_fid;
- clone = 1;
i = 0;
while (i < n) {
l = min(n - i, P9_MAXWELEM);
@@ -222,7 +221,8 @@ static struct p9_fid *v9fs_fid_lookup_with_uid(struct dentry *dentry,
* We need to hold rename lock when doing a multipath
* walk to ensure none of the patch component change
*/
- fid = p9_client_walk(old_fid, l, &wnames[i], clone);
+ fid = p9_client_walk(old_fid, l, &wnames[i],
+ old_fid == root_fid);
p9_client_clunk(old_fid);
if (IS_ERR(fid)) {
kfree(wnames);
@@ -230,7 +230,6 @@ static struct p9_fid *v9fs_fid_lookup_with_uid(struct dentry *dentry,
}
old_fid = fid;
i += l;
- clone = 0;
}
kfree(wnames);
fid_out:
--
2.25.1
Decrement the refcount of the parent dentry's fid after walking
each path component during a full path walk for a lookup. Failure to do
so can lead to fids that are not clunked until the filesystem is
unmounted, as indicated by this warning:
9pnet: found fid 3 not clunked
The improper refcounting after walking resulted in open(2) returning
-EIO on any directories underneath the mount point when using the virtio
transport. When using the fd transport, there's no apparent issue until
the filesytem is unmounted and the warning above is emitted to the logs.
In some cases, the user may not yet be attached to the filesystem and a
new root fid, associated with the user, is created and attached to the
root dentry before the full path walk is performed. Increment the new
root fid's refcount to two in that situation so that it can be safely
decremented to one after it is used for the walk operation. The new fid
will still be attached to the root dentry when
v9fs_fid_lookup_with_uid() returns so a final refcount of one is
correct/expected.
Fixes: 6636b6dcc3db ("9p: add refcount to p9_fid struct")
Cc: [email protected]
Signed-off-by: Tyler Hicks <[email protected]>
---
fs/9p/fid.c | 17 +++++------------
1 file changed, 5 insertions(+), 12 deletions(-)
diff --git a/fs/9p/fid.c b/fs/9p/fid.c
index 79df61fe0e59..5a469b79c1ee 100644
--- a/fs/9p/fid.c
+++ b/fs/9p/fid.c
@@ -152,7 +152,7 @@ static struct p9_fid *v9fs_fid_lookup_with_uid(struct dentry *dentry,
const unsigned char **wnames, *uname;
int i, n, l, clone, access;
struct v9fs_session_info *v9ses;
- struct p9_fid *fid, *old_fid = NULL;
+ struct p9_fid *fid, *old_fid;
v9ses = v9fs_dentry2v9ses(dentry);
access = v9ses->flags & V9FS_ACCESS_MASK;
@@ -194,13 +194,12 @@ static struct p9_fid *v9fs_fid_lookup_with_uid(struct dentry *dentry,
if (IS_ERR(fid))
return fid;
+ refcount_inc(&fid->count);
v9fs_fid_add(dentry->d_sb->s_root, fid);
}
/* If we are root ourself just return that */
- if (dentry->d_sb->s_root == dentry) {
- refcount_inc(&fid->count);
+ if (dentry->d_sb->s_root == dentry)
return fid;
- }
/*
* Do a multipath walk with attached root.
* When walking parent we need to make sure we
@@ -212,6 +211,7 @@ static struct p9_fid *v9fs_fid_lookup_with_uid(struct dentry *dentry,
fid = ERR_PTR(n);
goto err_out;
}
+ old_fid = fid;
clone = 1;
i = 0;
while (i < n) {
@@ -221,15 +221,8 @@ static struct p9_fid *v9fs_fid_lookup_with_uid(struct dentry *dentry,
* walk to ensure none of the patch component change
*/
fid = p9_client_walk(fid, l, &wnames[i], clone);
+ p9_client_clunk(old_fid);
if (IS_ERR(fid)) {
- if (old_fid) {
- /*
- * If we fail, clunk fid which are mapping
- * to path component and not the last component
- * of the path.
- */
- p9_client_clunk(old_fid);
- }
kfree(wnames);
goto err_out;
}
--
2.25.1
Improve readability by using a new variable when dealing with the root
fid. The root fid requires special handling in comparison to other fids
used during fid lookup.
Signed-off-by: Tyler Hicks <[email protected]>
---
fs/9p/fid.c | 26 ++++++++++++++------------
1 file changed, 14 insertions(+), 12 deletions(-)
diff --git a/fs/9p/fid.c b/fs/9p/fid.c
index 5a469b79c1ee..dae276ca7f7a 100644
--- a/fs/9p/fid.c
+++ b/fs/9p/fid.c
@@ -152,7 +152,7 @@ static struct p9_fid *v9fs_fid_lookup_with_uid(struct dentry *dentry,
const unsigned char **wnames, *uname;
int i, n, l, clone, access;
struct v9fs_session_info *v9ses;
- struct p9_fid *fid, *old_fid;
+ struct p9_fid *fid, *root_fid, *old_fid;
v9ses = v9fs_dentry2v9ses(dentry);
access = v9ses->flags & V9FS_ACCESS_MASK;
@@ -178,8 +178,8 @@ static struct p9_fid *v9fs_fid_lookup_with_uid(struct dentry *dentry,
up_read(&v9ses->rename_sem);
/* start from the root and try to do a lookup */
- fid = v9fs_fid_find(dentry->d_sb->s_root, uid, any);
- if (!fid) {
+ root_fid = v9fs_fid_find(dentry->d_sb->s_root, uid, any);
+ if (!root_fid) {
/* the user is not attached to the fs yet */
if (access == V9FS_ACCESS_SINGLE)
return ERR_PTR(-EPERM);
@@ -189,17 +189,18 @@ static struct p9_fid *v9fs_fid_lookup_with_uid(struct dentry *dentry,
else
uname = v9ses->uname;
- fid = p9_client_attach(v9ses->clnt, NULL, uname, uid,
- v9ses->aname);
- if (IS_ERR(fid))
- return fid;
+ root_fid = p9_client_attach(v9ses->clnt, NULL, uname, uid,
+ v9ses->aname);
+ if (IS_ERR(root_fid))
+ return root_fid;
- refcount_inc(&fid->count);
- v9fs_fid_add(dentry->d_sb->s_root, fid);
+ refcount_inc(&root_fid->count);
+ v9fs_fid_add(dentry->d_sb->s_root, root_fid);
}
/* If we are root ourself just return that */
if (dentry->d_sb->s_root == dentry)
- return fid;
+ return root_fid;
+
/*
* Do a multipath walk with attached root.
* When walking parent we need to make sure we
@@ -211,7 +212,8 @@ static struct p9_fid *v9fs_fid_lookup_with_uid(struct dentry *dentry,
fid = ERR_PTR(n);
goto err_out;
}
- old_fid = fid;
+ fid = root_fid;
+ old_fid = root_fid;
clone = 1;
i = 0;
while (i < n) {
@@ -220,7 +222,7 @@ static struct p9_fid *v9fs_fid_lookup_with_uid(struct dentry *dentry,
* We need to hold rename lock when doing a multipath
* walk to ensure none of the patch component change
*/
- fid = p9_client_walk(fid, l, &wnames[i], clone);
+ fid = p9_client_walk(old_fid, l, &wnames[i], clone);
p9_client_clunk(old_fid);
if (IS_ERR(fid)) {
kfree(wnames);
--
2.25.1
On Freitag, 27. Mai 2022 01:59:59 CEST Tyler Hicks wrote:
> Decrement the refcount of the parent dentry's fid after walking
> each path component during a full path walk for a lookup. Failure to do
> so can lead to fids that are not clunked until the filesystem is
> unmounted, as indicated by this warning:
>
> 9pnet: found fid 3 not clunked
That explains why I saw so many fids not being clunked with recent Linux
kernel versions while doing some 9p protocol debugging with QEMU recently.
> The improper refcounting after walking resulted in open(2) returning
> -EIO on any directories underneath the mount point when using the virtio
> transport. When using the fd transport, there's no apparent issue until
> the filesytem is unmounted and the warning above is emitted to the logs.
Actually I never saw that open() = -EIO error. Do you have a reproducer?
> In some cases, the user may not yet be attached to the filesystem and a
> new root fid, associated with the user, is created and attached to the
> root dentry before the full path walk is performed. Increment the new
> root fid's refcount to two in that situation so that it can be safely
> decremented to one after it is used for the walk operation. The new fid
> will still be attached to the root dentry when
> v9fs_fid_lookup_with_uid() returns so a final refcount of one is
> correct/expected.
>
> Fixes: 6636b6dcc3db ("9p: add refcount to p9_fid struct")
> Cc: [email protected]
> Signed-off-by: Tyler Hicks <[email protected]>
> ---
> fs/9p/fid.c | 17 +++++------------
> 1 file changed, 5 insertions(+), 12 deletions(-)
>
> diff --git a/fs/9p/fid.c b/fs/9p/fid.c
> index 79df61fe0e59..5a469b79c1ee 100644
> --- a/fs/9p/fid.c
> +++ b/fs/9p/fid.c
> @@ -152,7 +152,7 @@ static struct p9_fid *v9fs_fid_lookup_with_uid(struct
> dentry *dentry, const unsigned char **wnames, *uname;
> int i, n, l, clone, access;
> struct v9fs_session_info *v9ses;
> - struct p9_fid *fid, *old_fid = NULL;
> + struct p9_fid *fid, *old_fid;
>
> v9ses = v9fs_dentry2v9ses(dentry);
> access = v9ses->flags & V9FS_ACCESS_MASK;
> @@ -194,13 +194,12 @@ static struct p9_fid *v9fs_fid_lookup_with_uid(struct
> dentry *dentry, if (IS_ERR(fid))
> return fid;
>
> + refcount_inc(&fid->count);
> v9fs_fid_add(dentry->d_sb->s_root, fid);
> }
> /* If we are root ourself just return that */
> - if (dentry->d_sb->s_root == dentry) {
> - refcount_inc(&fid->count);
> + if (dentry->d_sb->s_root == dentry)
> return fid;
> - }
Hmm, wouldn't it then be possible that the root fid is returned with refcount
being 2 here?
> /*
> * Do a multipath walk with attached root.
> * When walking parent we need to make sure we
> @@ -212,6 +211,7 @@ static struct p9_fid *v9fs_fid_lookup_with_uid(struct
> dentry *dentry, fid = ERR_PTR(n);
> goto err_out;
> }
> + old_fid = fid;
> clone = 1;
> i = 0;
> while (i < n) {
> @@ -221,15 +221,8 @@ static struct p9_fid *v9fs_fid_lookup_with_uid(struct
> dentry *dentry, * walk to ensure none of the patch component change
> */
> fid = p9_client_walk(fid, l, &wnames[i], clone);
> + p9_client_clunk(old_fid);
> if (IS_ERR(fid)) {
> - if (old_fid) {
> - /*
> - * If we fail, clunk fid which are
mapping
> - * to path component and not the last
component
> - * of the path.
> - */
> - p9_client_clunk(old_fid);
> - }
> kfree(wnames);
> goto err_out;
> }
So this is the actual fix mentioned in the commit log. Makes sense.
Nitpicking: Wouldn't it be a bit cleaner to set old_fid solely within the
while loop and just before overwriting fid? And as we now have bumped to
-std=C11, probably making old_fid a local variable within loop scope only?
Best regards,
Christian Schoenebeck
On 2022-05-30 19:14:43, Christian Schoenebeck wrote:
> On Freitag, 27. Mai 2022 01:59:59 CEST Tyler Hicks wrote:
> > Decrement the refcount of the parent dentry's fid after walking
> > each path component during a full path walk for a lookup. Failure to do
> > so can lead to fids that are not clunked until the filesystem is
> > unmounted, as indicated by this warning:
> >
> > 9pnet: found fid 3 not clunked
>
> That explains why I saw so many fids not being clunked with recent Linux
> kernel versions while doing some 9p protocol debugging with QEMU recently.
In addition to this refcounting bug, there's another one that I noticed
while running fstests. My series does not fix it and I haven't had a
chance to look into it more. The generic/531 test triggers it.
https://git.kernel.org/pub/scm/fs/xfs/xfstests-dev.git/tree/tests/generic/531
>
> > The improper refcounting after walking resulted in open(2) returning
> > -EIO on any directories underneath the mount point when using the virtio
> > transport. When using the fd transport, there's no apparent issue until
> > the filesytem is unmounted and the warning above is emitted to the logs.
>
> Actually I never saw that open() = -EIO error. Do you have a reproducer?
The reproducer that I have is binary only (fairly large and runs a bunch
of different tests) and is used to regression test the Windows Subsystem
for Linux 2 (WSL2) host <-> guest filesystem sharing. Now that I think
about it, I'm not sure if the open() = -EIO error happens with other 9p
servers.
I can try to tease out the exact sequence of filesystem operations from
this test binary but it might take me a bit. It looks like it has to do
with switching UIDs, which could make sense because different users may
not be connected to the filesystem yet (the conditional block that does
p9_client_attach() and v9fs_fid_add()).
>
> > In some cases, the user may not yet be attached to the filesystem and a
> > new root fid, associated with the user, is created and attached to the
> > root dentry before the full path walk is performed. Increment the new
> > root fid's refcount to two in that situation so that it can be safely
> > decremented to one after it is used for the walk operation. The new fid
> > will still be attached to the root dentry when
> > v9fs_fid_lookup_with_uid() returns so a final refcount of one is
> > correct/expected.
> >
> > Fixes: 6636b6dcc3db ("9p: add refcount to p9_fid struct")
> > Cc: [email protected]
> > Signed-off-by: Tyler Hicks <[email protected]>
> > ---
> > fs/9p/fid.c | 17 +++++------------
> > 1 file changed, 5 insertions(+), 12 deletions(-)
> >
> > diff --git a/fs/9p/fid.c b/fs/9p/fid.c
> > index 79df61fe0e59..5a469b79c1ee 100644
> > --- a/fs/9p/fid.c
> > +++ b/fs/9p/fid.c
> > @@ -152,7 +152,7 @@ static struct p9_fid *v9fs_fid_lookup_with_uid(struct
> > dentry *dentry, const unsigned char **wnames, *uname;
> > int i, n, l, clone, access;
> > struct v9fs_session_info *v9ses;
> > - struct p9_fid *fid, *old_fid = NULL;
> > + struct p9_fid *fid, *old_fid;
> >
> > v9ses = v9fs_dentry2v9ses(dentry);
> > access = v9ses->flags & V9FS_ACCESS_MASK;
> > @@ -194,13 +194,12 @@ static struct p9_fid *v9fs_fid_lookup_with_uid(struct
> > dentry *dentry, if (IS_ERR(fid))
> > return fid;
> >
> > + refcount_inc(&fid->count);
> > v9fs_fid_add(dentry->d_sb->s_root, fid);
> > }
> > /* If we are root ourself just return that */
> > - if (dentry->d_sb->s_root == dentry) {
> > - refcount_inc(&fid->count);
> > + if (dentry->d_sb->s_root == dentry)
> > return fid;
> > - }
>
> Hmm, wouldn't it then be possible that the root fid is returned with refcount
> being 2 here?
Yes and I think that's correct. One refcount taken for adding the root
fid to the root dentry and another refcount taken for the original
purpose of the lookup.
Reverting this portion of the change and re-testing with the reproducer
triggers a refcount underflow.
>
> > /*
> > * Do a multipath walk with attached root.
> > * When walking parent we need to make sure we
> > @@ -212,6 +211,7 @@ static struct p9_fid *v9fs_fid_lookup_with_uid(struct
> > dentry *dentry, fid = ERR_PTR(n);
> > goto err_out;
> > }
> > + old_fid = fid;
> > clone = 1;
> > i = 0;
> > while (i < n) {
> > @@ -221,15 +221,8 @@ static struct p9_fid *v9fs_fid_lookup_with_uid(struct
> > dentry *dentry, * walk to ensure none of the patch component change
> > */
> > fid = p9_client_walk(fid, l, &wnames[i], clone);
> > + p9_client_clunk(old_fid);
> > if (IS_ERR(fid)) {
> > - if (old_fid) {
> > - /*
> > - * If we fail, clunk fid which are
> mapping
> > - * to path component and not the last
> component
> > - * of the path.
> > - */
> > - p9_client_clunk(old_fid);
> > - }
> > kfree(wnames);
> > goto err_out;
> > }
>
> So this is the actual fix mentioned in the commit log. Makes sense.
I think the refcount_inc() change for the root fid is an important and
required part of the fix.
> Nitpicking: Wouldn't it be a bit cleaner to set old_fid solely within the
> while loop and just before overwriting fid? And as we now have bumped to
> -std=C11, probably making old_fid a local variable within loop scope only?
You're right that it would be cleaner for the purposes of this single
patch. In a followup patch in this series, I start tracking the root fid
with a root_fid variable and that requires "old_fid = root_fid" before
we enter the loop and then "old_fid = fid" inside of the loop.
Tyler
>
> Best regards,
> Christian Schoenebeck
>
>
>
On Dienstag, 31. Mai 2022 16:28:29 CEST Tyler Hicks wrote:
> On 2022-05-30 19:14:43, Christian Schoenebeck wrote:
> > On Freitag, 27. Mai 2022 01:59:59 CEST Tyler Hicks wrote:
> > > Decrement the refcount of the parent dentry's fid after walking
> > > each path component during a full path walk for a lookup. Failure to do
> > > so can lead to fids that are not clunked until the filesystem is
> > >
> > > unmounted, as indicated by this warning:
> > > 9pnet: found fid 3 not clunked
> >
> > That explains why I saw so many fids not being clunked with recent Linux
> > kernel versions while doing some 9p protocol debugging with QEMU recently.
>
> In addition to this refcounting bug, there's another one that I noticed
> while running fstests. My series does not fix it and I haven't had a
> chance to look into it more. The generic/531 test triggers it.
>
> https://git.kernel.org/pub/scm/fs/xfs/xfstests-dev.git/tree/tests/generic/5
> 31
> > > The improper refcounting after walking resulted in open(2) returning
> > > -EIO on any directories underneath the mount point when using the virtio
> > > transport. When using the fd transport, there's no apparent issue until
> > > the filesytem is unmounted and the warning above is emitted to the logs.
> >
> > Actually I never saw that open() = -EIO error. Do you have a reproducer?
>
> The reproducer that I have is binary only (fairly large and runs a bunch
> of different tests) and is used to regression test the Windows Subsystem
> for Linux 2 (WSL2) host <-> guest filesystem sharing. Now that I think
> about it, I'm not sure if the open() = -EIO error happens with other 9p
> servers.
>
> I can try to tease out the exact sequence of filesystem operations from
> this test binary but it might take me a bit. It looks like it has to do
> with switching UIDs, which could make sense because different users may
> not be connected to the filesystem yet (the conditional block that does
> p9_client_attach() and v9fs_fid_add()).
>
> > > In some cases, the user may not yet be attached to the filesystem and a
> > > new root fid, associated with the user, is created and attached to the
> > > root dentry before the full path walk is performed. Increment the new
> > > root fid's refcount to two in that situation so that it can be safely
> > > decremented to one after it is used for the walk operation. The new fid
> > > will still be attached to the root dentry when
> > > v9fs_fid_lookup_with_uid() returns so a final refcount of one is
> > > correct/expected.
> > >
> > > Fixes: 6636b6dcc3db ("9p: add refcount to p9_fid struct")
> > > Cc: [email protected]
> > > Signed-off-by: Tyler Hicks <[email protected]>
> > > ---
> > >
> > > fs/9p/fid.c | 17 +++++------------
> > > 1 file changed, 5 insertions(+), 12 deletions(-)
> > >
> > > diff --git a/fs/9p/fid.c b/fs/9p/fid.c
> > > index 79df61fe0e59..5a469b79c1ee 100644
> > > --- a/fs/9p/fid.c
> > > +++ b/fs/9p/fid.c
> > > @@ -152,7 +152,7 @@ static struct p9_fid
> > > *v9fs_fid_lookup_with_uid(struct
> > > dentry *dentry, const unsigned char **wnames, *uname;
> > >
> > > int i, n, l, clone, access;
> > > struct v9fs_session_info *v9ses;
> > >
> > > - struct p9_fid *fid, *old_fid = NULL;
> > > + struct p9_fid *fid, *old_fid;
> > >
> > > v9ses = v9fs_dentry2v9ses(dentry);
> > > access = v9ses->flags & V9FS_ACCESS_MASK;
> > >
> > > @@ -194,13 +194,12 @@ static struct p9_fid
> > > *v9fs_fid_lookup_with_uid(struct
> > > dentry *dentry, if (IS_ERR(fid))
> > >
> > > return fid;
> > >
> > > + refcount_inc(&fid->count);
> > >
> > > v9fs_fid_add(dentry->d_sb->s_root, fid);
> > >
> > > }
> > > /* If we are root ourself just return that */
> > >
> > > - if (dentry->d_sb->s_root == dentry) {
> > > - refcount_inc(&fid->count);
> > > + if (dentry->d_sb->s_root == dentry)
> > >
> > > return fid;
> > >
> > > - }
> >
> > Hmm, wouldn't it then be possible that the root fid is returned with
> > refcount being 2 here?
>
> Yes and I think that's correct. One refcount taken for adding the root
> fid to the root dentry and another refcount taken for the original
> purpose of the lookup.
>
> Reverting this portion of the change and re-testing with the reproducer
> triggers a refcount underflow.
Right, I still have some knowledge gaps in the kernel's 9p code base. I was
actually rather confused about p9_client_clunk() which I just realized
actually does the refcount decrement and then conditionally sends out the
'Tclunk' message on refcount zero only.
So yes, it looks fine to me:
Reviewed-by: Christian Schoenebeck <[email protected]>
> > > /*
> > >
> > > * Do a multipath walk with attached root.
> > > * When walking parent we need to make sure we
> > >
> > > @@ -212,6 +211,7 @@ static struct p9_fid
> > > *v9fs_fid_lookup_with_uid(struct
> > > dentry *dentry, fid = ERR_PTR(n);
> > >
> > > goto err_out;
> > >
> > > }
> > >
> > > + old_fid = fid;
> > >
> > > clone = 1;
> > > i = 0;
> > > while (i < n) {
> > >
> > > @@ -221,15 +221,8 @@ static struct p9_fid
> > > *v9fs_fid_lookup_with_uid(struct
> > > dentry *dentry, * walk to ensure none of the patch component change
> > >
> > > */
> > >
> > > fid = p9_client_walk(fid, l, &wnames[i], clone);
> > >
> > > + p9_client_clunk(old_fid);
> > >
> > > if (IS_ERR(fid)) {
> > >
> > > - if (old_fid) {
> > > - /*
> > > - * If we fail, clunk fid which are
> >
> > mapping
> >
> > > - * to path component and not the last
> >
> > component
> >
> > > - * of the path.
> > > - */
> > > - p9_client_clunk(old_fid);
> > > - }
> > >
> > > kfree(wnames);
> > > goto err_out;
> > >
> > > }
> >
> > So this is the actual fix mentioned in the commit log. Makes sense.
>
> I think the refcount_inc() change for the root fid is an important and
> required part of the fix.
>
> > Nitpicking: Wouldn't it be a bit cleaner to set old_fid solely within the
> > while loop and just before overwriting fid? And as we now have bumped to
> > -std=C11, probably making old_fid a local variable within loop scope only?
>
> You're right that it would be cleaner for the purposes of this single
> patch. In a followup patch in this series, I start tracking the root fid
> with a root_fid variable and that requires "old_fid = root_fid" before
> we enter the loop and then "old_fid = fid" inside of the loop.
s/while/for/ would do the trick I guess. Not a big deal though.
Best regards,
Christian Schoenebeck
On 2022-06-01 16:28:49, Christian Schoenebeck wrote:
> On Dienstag, 31. Mai 2022 16:28:29 CEST Tyler Hicks wrote:
> > On 2022-05-30 19:14:43, Christian Schoenebeck wrote:
> > > On Freitag, 27. Mai 2022 01:59:59 CEST Tyler Hicks wrote:
> > > > Decrement the refcount of the parent dentry's fid after walking
> > > > each path component during a full path walk for a lookup. Failure to do
> > > > so can lead to fids that are not clunked until the filesystem is
> > > >
> > > > unmounted, as indicated by this warning:
> > > > 9pnet: found fid 3 not clunked
> > >
> > > That explains why I saw so many fids not being clunked with recent Linux
> > > kernel versions while doing some 9p protocol debugging with QEMU recently.
> >
> > In addition to this refcounting bug, there's another one that I noticed
> > while running fstests. My series does not fix it and I haven't had a
> > chance to look into it more. The generic/531 test triggers it.
> >
> > https://git.kernel.org/pub/scm/fs/xfs/xfstests-dev.git/tree/tests/generic/5
> > 31
> > > > The improper refcounting after walking resulted in open(2) returning
> > > > -EIO on any directories underneath the mount point when using the virtio
> > > > transport. When using the fd transport, there's no apparent issue until
> > > > the filesytem is unmounted and the warning above is emitted to the logs.
> > >
> > > Actually I never saw that open() = -EIO error. Do you have a reproducer?
> >
> > The reproducer that I have is binary only (fairly large and runs a bunch
> > of different tests) and is used to regression test the Windows Subsystem
> > for Linux 2 (WSL2) host <-> guest filesystem sharing. Now that I think
> > about it, I'm not sure if the open() = -EIO error happens with other 9p
> > servers.
This -EIO error looks to be specific to the WSL 9p server. I was unable
to reproduce it with QEMU's 9p server. I just see unclunked fids with
QEMU.
> >
> > I can try to tease out the exact sequence of filesystem operations from
> > this test binary but it might take me a bit. It looks like it has to do
> > with switching UIDs, which could make sense because different users may
> > not be connected to the filesystem yet (the conditional block that does
> > p9_client_attach() and v9fs_fid_add()).
I didn't have much luck here. This issue only reproduces after a
sequence of somewhat unrelated tests running in succession. They each
contain a lot of unnecessary filesystem operations but they each contain
some setuid() calls which makes some sense considering the refcounting
change proposed in this patch.
9p maintainers, is there anything else that I can help with to get this
bug fix reviewed/merged? Thanks!
Tyler
On Dienstag, 7. Juni 2022 05:41:10 CEST Tyler Hicks wrote:
> On 2022-06-01 16:28:49, Christian Schoenebeck wrote:
> > On Dienstag, 31. Mai 2022 16:28:29 CEST Tyler Hicks wrote:
> > > On 2022-05-30 19:14:43, Christian Schoenebeck wrote:
> > > > On Freitag, 27. Mai 2022 01:59:59 CEST Tyler Hicks wrote:
> > > > > Decrement the refcount of the parent dentry's fid after walking
> > > > > each path component during a full path walk for a lookup. Failure to
> > > > > do
> > > > > so can lead to fids that are not clunked until the filesystem is
> > > > >
> > > > > unmounted, as indicated by this warning:
> > > > > 9pnet: found fid 3 not clunked
> > > >
> > > > That explains why I saw so many fids not being clunked with recent
> > > > Linux
> > > > kernel versions while doing some 9p protocol debugging with QEMU
> > > > recently.
[...]
> 9p maintainers, is there anything else that I can help with to get this
> bug fix reviewed/merged? Thanks!
Dominique is the only active 9p maintainer for a while and barely has time for
9p anymore (hint: implied call for volunteers hidden in this sentence).
Dominique, do you have a time slice for this issue? I agree that it makes
sense to bring this issue forward.
Best regards,
Christian Schoenebeck
Christian Schoenebeck wrote on Thu, Jun 09, 2022 at 02:44:04PM +0200:
> Dominique is the only active 9p maintainer for a while and barely has time for
> 9p anymore (hint: implied call for volunteers hidden in this sentence).
>
> Dominique, do you have a time slice for this issue? I agree that it makes
> sense to bring this issue forward.
Sorry for lack of replies, I've updated my mail configuration a few
weeks ago and somehow screwed up 9p mails -- I've received everything,
just never got notified...
As Christian put it so well I don't have much time, but I can take a
couple of hours tomorrow.
Will review and/or take patches then hopefully :)
--
Dominique
Tyler Hicks wrote on Thu, May 26, 2022 at 06:59:59PM -0500:
> Decrement the refcount of the parent dentry's fid after walking
> each path component during a full path walk for a lookup. Failure to do
> so can lead to fids that are not clunked until the filesystem is
> unmounted, as indicated by this warning:
>
> 9pnet: found fid 3 not clunked
>
> The improper refcounting after walking resulted in open(2) returning
> -EIO on any directories underneath the mount point when using the virtio
> transport. When using the fd transport, there's no apparent issue until
> the filesytem is unmounted and the warning above is emitted to the logs.
>
> In some cases, the user may not yet be attached to the filesystem and a
> new root fid, associated with the user, is created and attached to the
> root dentry before the full path walk is performed. Increment the new
> root fid's refcount to two in that situation so that it can be safely
> decremented to one after it is used for the walk operation. The new fid
> will still be attached to the root dentry when
> v9fs_fid_lookup_with_uid() returns so a final refcount of one is
> correct/expected.
>
> Fixes: 6636b6dcc3db ("9p: add refcount to p9_fid struct")
> Cc: [email protected]
> Signed-off-by: Tyler Hicks <[email protected]>
> ---
> fs/9p/fid.c | 17 +++++------------
> 1 file changed, 5 insertions(+), 12 deletions(-)
>
> diff --git a/fs/9p/fid.c b/fs/9p/fid.c
> index 79df61fe0e59..5a469b79c1ee 100644
> --- a/fs/9p/fid.c
> +++ b/fs/9p/fid.c
> @@ -152,7 +152,7 @@ static struct p9_fid *v9fs_fid_lookup_with_uid(struct dentry *dentry,
> const unsigned char **wnames, *uname;
> int i, n, l, clone, access;
> struct v9fs_session_info *v9ses;
> - struct p9_fid *fid, *old_fid = NULL;
> + struct p9_fid *fid, *old_fid;
>
> v9ses = v9fs_dentry2v9ses(dentry);
> access = v9ses->flags & V9FS_ACCESS_MASK;
> @@ -194,13 +194,12 @@ static struct p9_fid *v9fs_fid_lookup_with_uid(struct dentry *dentry,
> if (IS_ERR(fid))
> return fid;
>
> + refcount_inc(&fid->count);
> v9fs_fid_add(dentry->d_sb->s_root, fid);
> }
> /* If we are root ourself just return that */
> - if (dentry->d_sb->s_root == dentry) {
> - refcount_inc(&fid->count);
> + if (dentry->d_sb->s_root == dentry)
> return fid;
> - }
> /*
> * Do a multipath walk with attached root.
> * When walking parent we need to make sure we
> @@ -212,6 +211,7 @@ static struct p9_fid *v9fs_fid_lookup_with_uid(struct dentry *dentry,
> fid = ERR_PTR(n);
> goto err_out;
> }
> + old_fid = fid;
> clone = 1;
> i = 0;
> while (i < n) {
> @@ -221,15 +221,8 @@ static struct p9_fid *v9fs_fid_lookup_with_uid(struct dentry *dentry,
> * walk to ensure none of the patch component change
> */
> fid = p9_client_walk(fid, l, &wnames[i], clone);
> + p9_client_clunk(old_fid);
hmm, if we're not cloning then fid == old_fid and the refcount is not
increased? (I think... I didn't even realize/remember that walk had a
no-clone mode, sorry.)
So we'd only need to clunk if old fid here if we're cloning (old fid is
the initial root fid), but I'm not sure how to test this path as I
couldn't think of any pattern that'd trigger a multi-level lookup,
so I'm not 100% sure; I'll try a bit more.
--
Dominique
On 2022-06-12 08:17:16, Dominique Martinet wrote:
> Tyler Hicks wrote on Thu, May 26, 2022 at 06:59:59PM -0500:
> > Decrement the refcount of the parent dentry's fid after walking
> > each path component during a full path walk for a lookup. Failure to do
> > so can lead to fids that are not clunked until the filesystem is
> > unmounted, as indicated by this warning:
> >
> > 9pnet: found fid 3 not clunked
> >
> > The improper refcounting after walking resulted in open(2) returning
> > -EIO on any directories underneath the mount point when using the virtio
> > transport. When using the fd transport, there's no apparent issue until
> > the filesytem is unmounted and the warning above is emitted to the logs.
> >
> > In some cases, the user may not yet be attached to the filesystem and a
> > new root fid, associated with the user, is created and attached to the
> > root dentry before the full path walk is performed. Increment the new
> > root fid's refcount to two in that situation so that it can be safely
> > decremented to one after it is used for the walk operation. The new fid
> > will still be attached to the root dentry when
> > v9fs_fid_lookup_with_uid() returns so a final refcount of one is
> > correct/expected.
> >
> > Fixes: 6636b6dcc3db ("9p: add refcount to p9_fid struct")
> > Cc: [email protected]
> > Signed-off-by: Tyler Hicks <[email protected]>
> > ---
> > fs/9p/fid.c | 17 +++++------------
> > 1 file changed, 5 insertions(+), 12 deletions(-)
> >
> > diff --git a/fs/9p/fid.c b/fs/9p/fid.c
> > index 79df61fe0e59..5a469b79c1ee 100644
> > --- a/fs/9p/fid.c
> > +++ b/fs/9p/fid.c
> > @@ -152,7 +152,7 @@ static struct p9_fid *v9fs_fid_lookup_with_uid(struct dentry *dentry,
> > const unsigned char **wnames, *uname;
> > int i, n, l, clone, access;
> > struct v9fs_session_info *v9ses;
> > - struct p9_fid *fid, *old_fid = NULL;
> > + struct p9_fid *fid, *old_fid;
> >
> > v9ses = v9fs_dentry2v9ses(dentry);
> > access = v9ses->flags & V9FS_ACCESS_MASK;
> > @@ -194,13 +194,12 @@ static struct p9_fid *v9fs_fid_lookup_with_uid(struct dentry *dentry,
> > if (IS_ERR(fid))
> > return fid;
> >
> > + refcount_inc(&fid->count);
> > v9fs_fid_add(dentry->d_sb->s_root, fid);
> > }
> > /* If we are root ourself just return that */
> > - if (dentry->d_sb->s_root == dentry) {
> > - refcount_inc(&fid->count);
> > + if (dentry->d_sb->s_root == dentry)
> > return fid;
> > - }
> > /*
> > * Do a multipath walk with attached root.
> > * When walking parent we need to make sure we
> > @@ -212,6 +211,7 @@ static struct p9_fid *v9fs_fid_lookup_with_uid(struct dentry *dentry,
> > fid = ERR_PTR(n);
> > goto err_out;
> > }
> > + old_fid = fid;
> > clone = 1;
> > i = 0;
> > while (i < n) {
> > @@ -221,15 +221,8 @@ static struct p9_fid *v9fs_fid_lookup_with_uid(struct dentry *dentry,
> > * walk to ensure none of the patch component change
> > */
> > fid = p9_client_walk(fid, l, &wnames[i], clone);
> > + p9_client_clunk(old_fid);
>
> hmm, if we're not cloning then fid == old_fid and the refcount is not
> increased? (I think... I didn't even realize/remember that walk had a
> no-clone mode, sorry.)
>
> So we'd only need to clunk if old fid here if we're cloning (old fid is
> the initial root fid), but I'm not sure how to test this path as I
> couldn't think of any pattern that'd trigger a multi-level lookup,
> so I'm not 100% sure; I'll try a bit more.
Yes, you're correct. Nice catch!
Tyler
>
> --
> Dominique
>