2024-03-13 08:15:39

by David Howells

[permalink] [raw]
Subject: [PATCH 0/2] afs: Miscellaneous fixes

Hi Marc,

Here are some fixes for afs, if you could look them over?

(1) Fix the caching of preferred address of a fileserver. By doing that, we
stick with whatever address we get a response back from first rather then
obeying any preferences set.

(2) Fix an occasional FetchStatus-after-RemoveDir. The FetchStatus then
fails with VNOVNODE (equivalent to -ENOENT) that confuses parts of the
driver that aren't expecting that.

The patches can be found here:

https://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs.git/log/?h=afs-fixes

Thanks,
David

David Howells (2):
afs: Don't cache preferred address
afs: Fix occasional rmdir-then-VNOVNODE with generic/011

fs/afs/rotate.c | 21 ++++-----------------
fs/afs/validation.c | 16 +++++++++-------
2 files changed, 13 insertions(+), 24 deletions(-)



2024-03-13 08:16:02

by David Howells

[permalink] [raw]
Subject: [PATCH 1/2] afs: Don't cache preferred address

In the AFS fileserver rotation algorithm, don't cache the preferred address
for the server as that will override the explicit preference if a
non-preferred address responds first.

Fixes: 495f2ae9e355 ("afs: Fix fileserver rotation")
Signed-off-by: David Howells <[email protected]>
cc: Marc Dionne <[email protected]>
cc: [email protected]
---
fs/afs/rotate.c | 21 ++++-----------------
1 file changed, 4 insertions(+), 17 deletions(-)

diff --git a/fs/afs/rotate.c b/fs/afs/rotate.c
index 700a27bc8c25..ed04bd1eeae8 100644
--- a/fs/afs/rotate.c
+++ b/fs/afs/rotate.c
@@ -602,6 +602,8 @@ bool afs_select_fileserver(struct afs_operation *op)
goto wait_for_more_probe_results;

alist = op->estate->addresses;
+ best_prio = -1;
+ addr_index = 0;
for (i = 0; i < alist->nr_addrs; i++) {
if (alist->addrs[i].prio > best_prio) {
addr_index = i;
@@ -609,9 +611,7 @@ bool afs_select_fileserver(struct afs_operation *op)
}
}

- addr_index = READ_ONCE(alist->preferred);
- if (!test_bit(addr_index, &set))
- addr_index = __ffs(set);
+ alist->preferred = addr_index;

op->addr_index = addr_index;
set_bit(addr_index, &op->addr_tried);
@@ -656,12 +656,6 @@ bool afs_select_fileserver(struct afs_operation *op)
next_server:
trace_afs_rotate(op, afs_rotate_trace_next_server, 0);
_debug("next");
- ASSERT(op->estate);
- alist = op->estate->addresses;
- if (op->call_responded &&
- op->addr_index != READ_ONCE(alist->preferred) &&
- test_bit(alist->preferred, &op->addr_tried))
- WRITE_ONCE(alist->preferred, op->addr_index);
op->estate = NULL;
goto pick_server;

@@ -690,14 +684,7 @@ bool afs_select_fileserver(struct afs_operation *op)
failed:
trace_afs_rotate(op, afs_rotate_trace_failed, 0);
op->flags |= AFS_OPERATION_STOP;
- if (op->estate) {
- alist = op->estate->addresses;
- if (op->call_responded &&
- op->addr_index != READ_ONCE(alist->preferred) &&
- test_bit(alist->preferred, &op->addr_tried))
- WRITE_ONCE(alist->preferred, op->addr_index);
- op->estate = NULL;
- }
+ op->estate = NULL;
_leave(" = f [failed %d]", afs_op_error(op));
return false;
}


2024-03-13 08:16:08

by David Howells

[permalink] [raw]
Subject: [PATCH 2/2] afs: Fix occasional rmdir-then-VNOVNODE with generic/011

Sometimes generic/011 causes kafs to follow up an FS.RemoveDir RPC call by
spending around a second sending a slew of FS.FetchStatus RPC calls to the
directory just deleted that then abort with VNOVNODE, indicating deletion
of the target directory.

This seems to stem from userspace attempting to stat the directory or
something in it:

afs_select_fileserver+0x46d/0xaa2
afs_wait_for_operation+0x12/0x17e
afs_fetch_status+0x56/0x75
afs_validate+0xfb/0x240
afs_permission+0xef/0x1b0
inode_permission+0x90/0x139
link_path_walk.part.0.constprop.0+0x6f/0x2f0
path_lookupat+0x4c/0xfa
filename_lookup+0x63/0xd7
vfs_statx+0x62/0x13f
vfs_fstatat+0x72/0x8a

The issue appears to be that afs_dir_remove_subdir() marks the callback
promise as being cancelled by setting the expiry time to AFS_NO_CB_PROMISE
- which then confuses afs_validate() which sends the FetchStatus to try and
get a new one before it checks for the AFS_VNODE_DELETED flag which
indicates that we know the directory got deleted.

Fix this by:

(1) Make afs_check_validity() return true if AFS_VNODE_DELETED is set, and
then tweak the return from afs_validate() if the DELETED flag is set.

(2) Move the AFS_VNODE_DELETED check in afs_validate() up above the
expiration check to immediately after we've grabbed the validate_lock.

Fixes: 453924de6212 ("afs: Overhaul invalidation handling to better support RO volumes")
Signed-off-by: David Howells <[email protected]>
cc: Marc Dionne <[email protected]>
cc: [email protected]
---
fs/afs/validation.c | 16 +++++++++-------
1 file changed, 9 insertions(+), 7 deletions(-)

diff --git a/fs/afs/validation.c b/fs/afs/validation.c
index 46b37f2cce7d..32a53fc8dfb2 100644
--- a/fs/afs/validation.c
+++ b/fs/afs/validation.c
@@ -122,6 +122,9 @@ bool afs_check_validity(const struct afs_vnode *vnode)
const struct afs_volume *volume = vnode->volume;
time64_t deadline = ktime_get_real_seconds() + 10;

+ if (test_bit(AFS_VNODE_DELETED, &vnode->flags))
+ return true;
+
if (atomic_read(&volume->cb_v_check) != atomic_read(&volume->cb_v_break) ||
atomic64_read(&vnode->cb_expires_at) <= deadline ||
volume->cb_expires_at <= deadline ||
@@ -389,12 +392,17 @@ int afs_validate(struct afs_vnode *vnode, struct key *key)
key_serial(key));

if (afs_check_validity(vnode))
- return 0;
+ return test_bit(AFS_VNODE_DELETED, &vnode->flags) ? -ESTALE : 0;

ret = down_write_killable(&vnode->validate_lock);
if (ret < 0)
goto error;

+ if (test_bit(AFS_VNODE_DELETED, &vnode->flags)) {
+ ret = -ESTALE;
+ goto error_unlock;
+ }
+
/* Validate a volume after the v_break has changed or the volume
* callback expired. We only want to do this once per volume per
* v_break change. The actual work will be done when parsing the
@@ -448,12 +456,6 @@ int afs_validate(struct afs_vnode *vnode, struct key *key)
vnode->cb_ro_snapshot = cb_ro_snapshot;
vnode->cb_scrub = cb_scrub;

- if (test_bit(AFS_VNODE_DELETED, &vnode->flags)) {
- _debug("file already deleted");
- ret = -ESTALE;
- goto error_unlock;
- }
-
/* if the vnode's data version number changed then its contents are
* different */
zap |= test_and_clear_bit(AFS_VNODE_ZAP_DATA, &vnode->flags);


2024-03-13 13:01:38

by Marc Dionne

[permalink] [raw]
Subject: Re: [PATCH 0/2] afs: Miscellaneous fixes

On Wed, Mar 13, 2024 at 5:15 AM David Howells <[email protected]> wrote:
>
> Hi Marc,
>
> Here are some fixes for afs, if you could look them over?
>
> (1) Fix the caching of preferred address of a fileserver. By doing that, we
> stick with whatever address we get a response back from first rather then
> obeying any preferences set.
>
> (2) Fix an occasional FetchStatus-after-RemoveDir. The FetchStatus then
> fails with VNOVNODE (equivalent to -ENOENT) that confuses parts of the
> driver that aren't expecting that.
>
> The patches can be found here:
>
> https://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fsgit/log/?h=afs-fixes
>
> Thanks,
> David
>
> David Howells (2):
> afs: Don't cache preferred address
> afs: Fix occasional rmdir-then-VNOVNODE with generic/011
>
> fs/afs/rotate.c | 21 ++++-----------------
> fs/afs/validation.c | 16 +++++++++-------
> 2 files changed, 13 insertions(+), 24 deletions(-)

Reviewed-by: Marc Dionne <[email protected]>

Marc

2024-03-14 11:14:00

by Christian Brauner

[permalink] [raw]
Subject: Re: [PATCH 0/2] afs: Miscellaneous fixes

On Wed, 13 Mar 2024 08:15:01 +0000, David Howells wrote:
> Here are some fixes for afs, if you could look them over?
>
> (1) Fix the caching of preferred address of a fileserver. By doing that, we
> stick with whatever address we get a response back from first rather then
> obeying any preferences set.
>
> (2) Fix an occasional FetchStatus-after-RemoveDir. The FetchStatus then
> fails with VNOVNODE (equivalent to -ENOENT) that confuses parts of the
> driver that aren't expecting that.
>
> [...]

Applied to the vfs.fixes branch of the vfs/vfs.git tree.
Patches in the vfs.fixes branch should appear in linux-next soon.

Please report any outstanding bugs that were missed during review in a
new review to the original patch series allowing us to drop it.

It's encouraged to provide Acked-bys and Reviewed-bys even though the
patch has now been applied. If possible patch trailers will be updated.

Note that commit hashes shown below are subject to change due to rebase,
trailer updates or similar. If in doubt, please check the listed branch.

tree: https://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs.git
branch: vfs.fixes

[1/2] afs: Don't cache preferred address
https://git.kernel.org/vfs/vfs/c/b719dcc8b02d
[2/2] afs: Fix occasional rmdir-then-VNOVNODE with generic/011
https://git.kernel.org/vfs/vfs/c/4eed8f8549f4