hi,
I have evidence from a customer of an infinite loop in
pnfs_update_layout(). This has only happened once and I suspect it is
unlikely to recur often. We don't have a lot of tracing data, but I
think we have enough...
The evidence I do have is repeated "BUG: workqueue lockup" errors
with sufficiently many samples that I can determine the code path of
the loop (see below), and a message:
NFSv4: state recovery failed for open file SVC_rapid7_dc33/.bash_history, error = -116
The loop involves the "lookup_again" label and the "goto" on line 2112.
This is the code where NFS_LAYOUT_INVALID_STID was found to be true and
nfs4_select_rw_stateid() returned non-zero.
I deduce that ctx->state is not a valid open stateid. This leads to
nfs4_select_rw_stateid() returned -EIO and
nfs4_schedule_stateid_recovery() doing nothing. This "doing nothing"
is the only explanation I can find for the
nfs4_client_recover_expired_lease() call at the top of the loop not
waiting at all (if it did wait, we wouldn't get a workqueue lockup).
The state being invalid also perfectly matches the "state recovery
failed" error.
So it seems likely that we should test
nfs4_valid_open_stateid(ctx->state)
somewhere in the loop, and return either NULL or and error. I'm not
certain what is best.
My inclination is
diff --git a/fs/nfs/pnfs.c b/fs/nfs/pnfs.c
index 0c0fed1ecd0b..e702ac518205 100644
--- a/fs/nfs/pnfs.c
+++ b/fs/nfs/pnfs.c
@@ -2002,6 +2002,12 @@ pnfs_update_layout(struct inode *ino,
lseg = ERR_PTR(nfs4_client_recover_expired_lease(clp));
if (IS_ERR(lseg))
goto out;
+ if (!nfs4_valid_open_stateid(ctx->state)) {
+ lseq = ERR_PTR(-EIO);
+ trace_pnfs_update_layout(ino, pos, count, iomode, lo, lseg,
+ PNFS_UPDATE_LAYOUT_INVALID_OPEN);
+ goto out;
+ }
first = false;
spin_lock(&ino->i_lock);
lo = pnfs_find_alloc_layout(ino, ctx, gfp_flags);
Does that seem reasonable?
Another possibility would be to check the status from
nfs4_select_rw_stateid() and "goto out_unlock" if it is EIO.
diff --git a/fs/nfs/pnfs.c b/fs/nfs/pnfs.c
index 0c0fed1ecd0b..7cc90ee86882 100644
--- a/fs/nfs/pnfs.c
+++ b/fs/nfs/pnfs.c
@@ -2106,6 +2106,8 @@ pnfs_update_layout(struct inode *ino,
trace_pnfs_update_layout(ino, pos, count,
iomode, lo, lseg,
PNFS_UPDATE_LAYOUT_INVALID_OPEN);
+ if (status == -EIO)
+ goto out_unlock;
nfs4_schedule_stateid_recovery(server, ctx->state);
pnfs_clear_first_layoutget(lo);
pnfs_put_layout_hdr(lo);
Thoughts?
Thanks,
NeilBrown