2024-04-05 14:23:07

by Jeffrey Layton

Subject: server continually returns NFS4ERR_DELAY to CREATE_SESSION

Vladimir reported a problem running a certain set of testcases. These
tests would spin up a number of QEMU/KVM guests that use nfsroot.

After running for a while, the guests eventually reboot, and that boot
hangs while mounting from the server because the server continually
returns NFS4ERR_DELAY to the CREATE_SESSION operation. With the addition of some
new tracepoints that I'll eventually post, I think I have a handle on
what's happening:

First, the server decides to issue a CB_RECALL_ANY to the client, but there
is a fault on the callback channel:

nfsd-42728 [000] ....... 1678.271093: nfsd_fh_verify: xid=0x9109236c fh_hash=0x4394a635 type=0x0 access=BYPASS_GSS
nfsd-42728 [000] ....... 1678.271093: nfsd_compound_status: op=2/3 OP_PUTFH status=0
nfsd-42728 [000] ....... 1678.271093: nfsd_fh_verify: xid=0x9109236c fh_hash=0x4394a635 type=LNK access=
nfsd-42728 [000] ....... 1678.271095: nfsd_compound_status: op=3/3 OP_READLINK status=0
kworker/u8:5-459 [002] ....... 1690.685190: nfsd_cb_recall_any: addr=192.168.50.101:833 client 660ffa56:69849ebb keep=0 bmval0=RDATA_DLG
kworker/u8:5-459 [002] ....... 1690.685191: nfsd_cb_queue: addr=192.168.50.101:0 client 660ffa56:69849ebb cb=000000006e42ee91 (first try)
kworker/u8:5-459 [002] ....... 1690.685194: nfsd_cb_start: addr=192.168.50.101:0 client 660ffa56:69849ebb state=UP
kworker/u8:0-42764 [002] ....... 1699.821526: nfsd_cb_seq_status: task:00000001@00000003 sessionid=660ffa56:69849ebb:0000001e:00000000 tk_status=-5 seq_status=1
kworker/u8:0-42764 [002] ....... 1699.821527: nfsd_cb_new_state: addr=192.168.50.101:0 client 660ffa56:69849ebb state=FAULT
kworker/u8:0-42764 [002] ....... 1699.821528: nfsd_cb_restart: addr=192.168.50.101:0 client 660ffa56:69849ebb cb=000000006e42ee91 (first try)
kworker/u8:0-42764 [002] ....... 1699.821533: nfsd_cb_queue: addr=192.168.50.101:0 client 660ffa56:69849ebb cb=000000006e42ee91 (need restart)
kworker/u8:0-42764 [002] ....... 1699.821538: nfsd_cb_start: addr=192.168.50.101:0 client 660ffa56:69849ebb state=FAULT
kworker/u8:0-42764 [003] ....... 1709.037319: nfsd_cb_seq_status: task:00000002@00000003 sessionid=660ffa56:69849ebb:0000001e:00000000 tk_status=-5 seq_status=1
kworker/u8:0-42764 [003] ....... 1709.037320: nfsd_cb_restart: addr=192.168.50.101:0 client 660ffa56:69849ebb cb=000000006e42ee91 (first try)
kworker/u8:0-42764 [003] ....... 1709.037325: nfsd_cb_queue: addr=192.168.50.101:0 client 660ffa56:69849ebb cb=000000006e42ee91 (need restart)
kworker/u8:0-42764 [003] ....... 1709.037327: nfsd_cb_start: addr=192.168.50.101:0 client 660ffa56:69849ebb state=FAULT

This repeats for a while over several seconds. Eventually the server
tries to probe the callback channel, but it's DOWN. Cue more
nfsd_cb_queue/nfsd_cb_start calls repeating over time:

kworker/u8:3-42 [003] ....... 1773.550044: nfsd_cb_restart: addr=192.168.50.101:0 client 660ffa56:69849ebb cb=000000006e42ee91 (first try)
kworker/u8:3-42 [003] ....... 1773.550049: nfsd_cb_queue: addr=192.168.50.101:0 client 660ffa56:69849ebb cb=000000006e42ee91 (need restart)
kworker/u8:3-42 [003] ....... 1773.550053: nfsd_cb_start: addr=192.168.50.101:0 client 660ffa56:69849ebb state=FAULT
nfsd-42728 [000] ....1.. 1781.401822: nfsd_cb_lost: addr=192.168.50.101:0 client 660ffa56:69849ebb state=FAULT
nfsd-42728 [000] ....2.. 1781.401823: nfsd_cb_probe: addr=192.168.50.101:0 client 660ffa56:69849ebb state=FAULT
nfsd-42728 [000] ....2.. 1781.401823: nfsd_cb_new_state: addr=192.168.50.101:0 client 660ffa56:69849ebb state=UNKNOWN
nfsd-42728 [000] ....2.. 1781.401824: nfsd_cb_queue: addr=192.168.50.101:0 client 660ffa56:69849ebb cb=00000000c4627b33 (first try)
kworker/u8:3-42 [000] ....... 1781.401832: nfsd_cb_start: addr=192.168.50.101:0 client 660ffa56:69849ebb state=UNKNOWN
kworker/u8:5-459 [003] ....... 1781.401833: nfsd_cb_seq_status: task:0000000a@00000003 sessionid=660ffa56:69849ebb:0000001e:00000000 tk_status=-107 seq_status=1
kworker/u8:3-42 [000] ....... 1781.401833: nfsd_cb_bc_update: addr=192.168.50.101:0 client 660ffa56:69849ebb cb=00000000c4627b33 (first try)
kworker/u8:3-42 [000] ....... 1781.401833: nfsd_cb_bc_shutdown: addr=192.168.50.101:0 client 660ffa56:69849ebb cb=00000000c4627b33 (first try)
kworker/u8:5-459 [003] ....... 1781.401833: nfsd_cb_restart: addr=192.168.50.101:0 client 660ffa56:69849ebb cb=000000006e42ee91 (first try)
kworker/u8:5-459 [003] ....... 1781.401836: nfsd_cb_queue: addr=192.168.50.101:0 client 660ffa56:69849ebb cb=000000006e42ee91 (need restart)
kworker/u8:3-42 [000] ....... 1781.401858: nfsd_cb_new_state: addr=192.168.50.101:0 client 660ffa56:69849ebb state=DOWN
kworker/u8:3-42 [000] ....... 1781.401858: nfsd_cb_queue: addr=192.168.50.101:0 client 660ffa56:69849ebb cb=00000000c4627b33 (first try)
kworker/u8:3-42 [000] ....... 1781.401859: nfsd_cb_start: addr=192.168.50.101:0 client 660ffa56:69849ebb state=DOWN

Eventually, the client comes back (after rebooting), but it can't get through:

kworker/u8:3-42 [000] ....... 1782.391951: nfsd_cb_start: addr=192.168.50.101:0 client 660ffa56:69849ebb state=DOWN
kworker/u8:3-42 [000] ....... 1782.391951: nfsd_cb_queue: addr=192.168.50.101:0 client 660ffa56:69849ebb cb=00000000c4627b33 (first try)
kworker/u8:3-42 [000] ....... 1782.391952: nfsd_cb_start: addr=192.168.50.101:0 client 660ffa56:69849ebb state=DOWN
kworker/u8:3-42 [000] ....... 1782.391952: nfsd_cb_queue: addr=192.168.50.101:0 client 660ffa56:69849ebb cb=000000006e42ee91 (need restart)
nfsd-42728 [000] ....... 1782.409010: nfsd_compound: xid=0xa299edc2 opcnt=1 tag=
nfsd-42728 [000] ....1.. 1782.409039: nfsd_clid_verf_mismatch: client 660ffa56:69849ebb verf=0x17c364ee8596f72a, updated=0x17c3650afe9e0a7f from addr=192.168.50.101:926
nfsd-42728 [000] ....... 1782.409041: nfsd_compound_status: op=1/1 OP_EXCHANGE_ID status=0
nfsd-42728 [000] ....... 1782.409241: nfsd_compound: xid=0xa399edc2 opcnt=1 tag=
nfsd-42728 [000] ....1.. 1782.409245: nfsd_clid_verf_mismatch: client 660ffa56:69849ebb verf=0x17c364ee8596f72a, updated=0x17c3650afe9e0a7f from addr=192.168.50.101:926
nfsd-42728 [000] ....... 1782.409245: nfsd_compound_status: op=1/1 OP_EXCHANGE_ID status=0
nfsd-42728 [000] ....... 1782.409309: nfsd_compound: xid=0xa499edc2 opcnt=1 tag=
nfsd-42728 [000] ....1.. 1782.409325: check_slot_seqid: seqid=1 slot_seqid=0 inuse=0 conf=0 ret=0
nfsd-42728 [000] ....1.. 1782.409326: mark_client_expired_locked: addr=192.168.50.101:0 client 660ffa56:69849ebb error=10008
nfsd-42728 [000] ....... 1782.409329: nfsd_compound_status: op=1/1 OP_CREATE_SESSION status=10008
kworker/u8:3-42 [000] ....... 1782.417901: nfsd_cb_start: addr=192.168.50.101:0 client 660ffa56:69849ebb state=DOWN
kworker/u8:3-42 [000] ....... 1782.417901: nfsd_cb_queue: addr=192.168.50.101:0 client 660ffa56:69849ebb cb=000000006e42ee91 (need restart)

The EXCHANGE_ID works, but CREATE_SESSION returns NFS4ERR_DELAY because
the server couldn't unhash the old client due to its refcount
(cl_rpc_users) being elevated. I didn't capture the value of the
refcount at this point, but the CB_RECALL_ANY job holds a reference to
the old client, and that alone would be enough to prevent unhashing it.
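
To make the loop concrete, here's a toy standalone model of what I
think is happening (simplified stand-ins, not the real nfs4state.c
code): the expiry path refuses to unhash a client while its
cl_rpc_users count is nonzero, so as long as the stuck callback work
holds a reference, every CREATE_SESSION retry gets NFS4ERR_DELAY:

#include <stdbool.h>
#include <stdio.h>

#define NFS4ERR_DELAY	10008

struct fake_client {
	int	cl_rpc_users;	/* users of the object, e.g. queued callback work */
	bool	hashed;		/* still reachable by a new CREATE_SESSION */
};

/* Toy analogue of mark_client_expired_locked(): refuse while users remain. */
static int expire_old_client(struct fake_client *clp)
{
	if (clp->cl_rpc_users)
		return NFS4ERR_DELAY;	/* and the client just retries */
	clp->hashed = false;
	return 0;
}

int main(void)
{
	/* The stuck CB_RECALL_ANY work holds one reference and never drops it. */
	struct fake_client old = { .cl_rpc_users = 1, .hashed = true };
	int attempt;

	for (attempt = 1; attempt <= 3; attempt++)
		printf("CREATE_SESSION attempt %d -> %d\n",
		       attempt, expire_old_client(&old));
	return 0;
}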

At a minimum, I think we need to rework the refcounting on the
CB_RECALL_ANY job. It probably shouldn't hold a reference to the client.
Maybe make it store just the clientid and redo the lookup when the job
runs, dropping the work if the client is no longer around? I might try
to spin up a patch along those lines.
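
Roughly the shape I have in mind, with made-up names standing in for
the real nfsd symbols (treat it as a sketch of the idea, not a patch):

#include <stdint.h>

/* Made-up stand-ins for the real clientid/client types. */
struct fake_clientid { uint32_t cl_boot; uint32_t cl_id; };
struct fake_client;

/* Stand-ins for the real lookup and send paths (declarations only). */
struct fake_client *lookup_confirmed_client(const struct fake_clientid *clid);
void send_cb_recall_any(struct fake_client *clp, uint32_t bmval0);

/* The work item records only the clientid, never a counted reference. */
struct recall_any_work {
	struct fake_clientid	clid;	/* copied when the work is queued */
	uint32_t		bmval0;	/* which delegation types to recall */
};

static void recall_any_run(struct recall_any_work *w)
{
	struct fake_client *clp = lookup_confirmed_client(&w->clid);

	if (!clp)
		return;		/* client already expired: just drop the work */

	/* Fire and forget: one attempt, no requeue on callback failure. */
	send_cb_recall_any(clp, w->bmval0);
}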

Longer term, I think we need to take a hard look at why an elevated
cl_rpc_users refcount keeps the client hashed. It seems like we ought to
be able to unhash it even while there are outstanding references, no?
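
That would be the usual split between visibility and lifetime,
something like (toy code again, not a patch):

#include <stdbool.h>
#include <stdlib.h>

struct fake_client {
	int	cl_rpc_users;	/* outstanding users of the object */
	bool	hashed;		/* reachable via the clientid lookup table */
};

/* Remove from the lookup table regardless of outstanding references. */
static void unhash_client(struct fake_client *clp)
{
	clp->hashed = false;	/* a new CREATE_SESSION can now build a new client */
}

/* The object itself is only freed once the last user is done with it. */
static void put_client(struct fake_client *clp)
{
	if (--clp->cl_rpc_users == 0)
		free(clp);
}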

Thoughts on other ways to fix this, or on broader structural fixes?
--
Jeff Layton <[email protected]>


2024-04-05 14:57:21

by Chuck Lever

Subject: Re: server continually returns NFS4ERR_DELAY to CREATE_SESSION



> On Apr 5, 2024, at 10:14 AM, Jeff Layton <[email protected]> wrote:
>
> Vladimir reported a problem running a certain set of testcases. These
> tests would spin up a number of QEMU/KVM guests that use nfsroot.
>
> After running for a while, the guests eventually reboot, and that boot
> hangs while mounting from the server because the server continually
> returns NFS4ERR_DELAY to the CREATE_SESSION operation. With the addition of some
> new tracepoints that I'll eventually post, I think I have a handle on
> what's happening:
>
> First, the server decides to issue a CB_RECALL_ANY to the client, but there
> is a fault on the callback channel:
>
> nfsd-42728 [000] ....... 1678.271093: nfsd_fh_verify: xid=0x9109236c fh_hash=0x4394a635 type=0x0 access=BYPASS_GSS
> nfsd-42728 [000] ....... 1678.271093: nfsd_compound_status: op=2/3 OP_PUTFH status=0
> nfsd-42728 [000] ....... 1678.271093: nfsd_fh_verify: xid=0x9109236c fh_hash=0x4394a635 type=LNK access=
> nfsd-42728 [000] ....... 1678.271095: nfsd_compound_status: op=3/3 OP_READLINK status=0
> kworker/u8:5-459 [002] ....... 1690.685190: nfsd_cb_recall_any: addr=192.168.50.101:833 client 660ffa56:69849ebb keep=0 bmval0=RDATA_DLG
> kworker/u8:5-459 [002] ....... 1690.685191: nfsd_cb_queue: addr=192.168.50.101:0 client 660ffa56:69849ebb cb=000000006e42ee91 (first try)
> kworker/u8:5-459 [002] ....... 1690.685194: nfsd_cb_start: addr=192.168.50.101:0 client 660ffa56:69849ebb state=UP
> kworker/u8:0-42764 [002] ....... 1699.821526: nfsd_cb_seq_status: task:00000001@00000003 sessionid=660ffa56:69849ebb:0000001e:00000000 tk_status=-5 seq_status=1
> kworker/u8:0-42764 [002] ....... 1699.821527: nfsd_cb_new_state: addr=192.168.50.101:0 client 660ffa56:69849ebb state=FAULT
> kworker/u8:0-42764 [002] ....... 1699.821528: nfsd_cb_restart: addr=192.168.50.101:0 client 660ffa56:69849ebb cb=000000006e42ee91 (first try)
> kworker/u8:0-42764 [002] ....... 1699.821533: nfsd_cb_queue: addr=192.168.50.101:0 client 660ffa56:69849ebb cb=000000006e42ee91 (need restart)
> kworker/u8:0-42764 [002] ....... 1699.821538: nfsd_cb_start: addr=192.168.50.101:0 client 660ffa56:69849ebb state=FAULT
> kworker/u8:0-42764 [003] ....... 1709.037319: nfsd_cb_seq_status: task:00000002@00000003 sessionid=660ffa56:69849ebb:0000001e:00000000 tk_status=-5 seq_status=1
> kworker/u8:0-42764 [003] ....... 1709.037320: nfsd_cb_restart: addr=192.168.50.101:0 client 660ffa56:69849ebb cb=000000006e42ee91 (first try)
> kworker/u8:0-42764 [003] ....... 1709.037325: nfsd_cb_queue: addr=192.168.50.101:0 client 660ffa56:69849ebb cb=000000006e42ee91 (need restart)
> kworker/u8:0-42764 [003] ....... 1709.037327: nfsd_cb_start: addr=192.168.50.101:0 client 660ffa56:69849ebb state=FAULT
>
> This repeats for a while over several seconds. Eventually the server
> tries to probe the callback channel, but it's DOWN. Cue more
> nfsd_cb_queue/nfsd_cb_start calls repeating over time:
>
> kworker/u8:3-42 [003] ....... 1773.550044: nfsd_cb_restart: addr=192.168.50.101:0 client 660ffa56:69849ebb cb=000000006e42ee91 (first try)
> kworker/u8:3-42 [003] ....... 1773.550049: nfsd_cb_queue: addr=192.168.50.101:0 client 660ffa56:69849ebb cb=000000006e42ee91 (need restart)
> kworker/u8:3-42 [003] ....... 1773.550053: nfsd_cb_start: addr=192.168.50.101:0 client 660ffa56:69849ebb state=FAULT
> nfsd-42728 [000] ....1.. 1781.401822: nfsd_cb_lost: addr=192.168.50.101:0 client 660ffa56:69849ebb state=FAULT
> nfsd-42728 [000] ....2.. 1781.401823: nfsd_cb_probe: addr=192.168.50.101:0 client 660ffa56:69849ebb state=FAULT
> nfsd-42728 [000] ....2.. 1781.401823: nfsd_cb_new_state: addr=192.168.50.101:0 client 660ffa56:69849ebb state=UNKNOWN
> nfsd-42728 [000] ....2.. 1781.401824: nfsd_cb_queue: addr=192.168.50.101:0 client 660ffa56:69849ebb cb=00000000c4627b33 (first try)
> kworker/u8:3-42 [000] ....... 1781.401832: nfsd_cb_start: addr=192.168.50.101:0 client 660ffa56:69849ebb state=UNKNOWN
> kworker/u8:5-459 [003] ....... 1781.401833: nfsd_cb_seq_status: task:0000000a@00000003 sessionid=660ffa56:69849ebb:0000001e:00000000 tk_status=-107 seq_status=1
> kworker/u8:3-42 [000] ....... 1781.401833: nfsd_cb_bc_update: addr=192.168.50.101:0 client 660ffa56:69849ebb cb=00000000c4627b33 (first try)
> kworker/u8:3-42 [000] ....... 1781.401833: nfsd_cb_bc_shutdown: addr=192.168.50.101:0 client 660ffa56:69849ebb cb=00000000c4627b33 (first try)
> kworker/u8:5-459 [003] ....... 1781.401833: nfsd_cb_restart: addr=192.168.50.101:0 client 660ffa56:69849ebb cb=000000006e42ee91 (first try)
> kworker/u8:5-459 [003] ....... 1781.401836: nfsd_cb_queue: addr=192.168.50.101:0 client 660ffa56:69849ebb cb=000000006e42ee91 (need restart)
> kworker/u8:3-42 [000] ....... 1781.401858: nfsd_cb_new_state: addr=192.168.50.101:0 client 660ffa56:69849ebb state=DOWN
> kworker/u8:3-42 [000] ....... 1781.401858: nfsd_cb_queue: addr=192.168.50.101:0 client 660ffa56:69849ebb cb=00000000c4627b33 (first try)
> kworker/u8:3-42 [000] ....... 1781.401859: nfsd_cb_start: addr=192.168.50.101:0 client 660ffa56:69849ebb state=DOWN
>
> Eventually, the client comes back (after rebooting), but it can't get through:
>
> kworker/u8:3-42 [000] ....... 1782.391951: nfsd_cb_start: addr=192.168.50.101:0 client 660ffa56:69849ebb state=DOWN
> kworker/u8:3-42 [000] ....... 1782.391951: nfsd_cb_queue: addr=192.168.50.101:0 client 660ffa56:69849ebb cb=00000000c4627b33 (first try)
> kworker/u8:3-42 [000] ....... 1782.391952: nfsd_cb_start: addr=192.168.50.101:0 client 660ffa56:69849ebb state=DOWN
> kworker/u8:3-42 [000] ....... 1782.391952: nfsd_cb_queue: addr=192.168.50.101:0 client 660ffa56:69849ebb cb=000000006e42ee91 (need restart)
> nfsd-42728 [000] ....... 1782.409010: nfsd_compound: xid=0xa299edc2 opcnt=1 tag=
> nfsd-42728 [000] ....1.. 1782.409039: nfsd_clid_verf_mismatch: client 660ffa56:69849ebb verf=0x17c364ee8596f72a, updated=0x17c3650afe9e0a7f from addr=192.168.50.101:926
> nfsd-42728 [000] ....... 1782.409041: nfsd_compound_status: op=1/1 OP_EXCHANGE_ID status=0
> nfsd-42728 [000] ....... 1782.409241: nfsd_compound: xid=0xa399edc2 opcnt=1 tag=
> nfsd-42728 [000] ....1.. 1782.409245: nfsd_clid_verf_mismatch: client 660ffa56:69849ebb verf=0x17c364ee8596f72a, updated=0x17c3650afe9e0a7f from addr=192.168.50.101:926
> nfsd-42728 [000] ....... 1782.409245: nfsd_compound_status: op=1/1 OP_EXCHANGE_ID status=0
> nfsd-42728 [000] ....... 1782.409309: nfsd_compound: xid=0xa499edc2 opcnt=1 tag=
> nfsd-42728 [000] ....1.. 1782.409325: check_slot_seqid: seqid=1 slot_seqid=0 inuse=0 conf=0 ret=0
> nfsd-42728 [000] ....1.. 1782.409326: mark_client_expired_locked: addr=192.168.50.101:0 client 660ffa56:69849ebb error=10008
> nfsd-42728 [000] ....... 1782.409329: nfsd_compound_status: op=1/1 OP_CREATE_SESSION status=10008
> kworker/u8:3-42 [000] ....... 1782.417901: nfsd_cb_start: addr=192.168.50.101:0 client 660ffa56:69849ebb state=DOWN
> kworker/u8:3-42 [000] ....... 1782.417901: nfsd_cb_queue: addr=192.168.50.101:0 client 660ffa56:69849ebb cb=000000006e42ee91 (need restart)
>
> The EXCHANGE_ID works, but CREATE_SESSION returns NFS4ERR_DELAY because
> the server couldn't unhash the old client due to its refcount
> (cl_rpc_users) being elevated. I didn't capture the value of the
> refcount at this point, but the CB_RECALL_ANY job holds a reference to
> the old client, and that alone would be enough to prevent unhashing it.
>
> At a minimum, I think we need to rework the refcounting on the
> CB_RECALL_ANY job. It probably shouldn't hold a reference to the client.
> Maybe make it store just the clientid and redo the lookup when the job
> runs, dropping the work if the client is no longer around? I might try
> to spin up a patch along those lines.

There's no reason to retry a CB_RECALL_ANY, IMO. Can we make it
"fire and forget" along the lines you suggested? Sounds like an
appropriate fix for v6.9-rc.


> Longer term, I think we need to take a hard look at why an elevated
> cl_rpc_users refcount keeps the client hashed. It seems like we ought to
> be able to unhash it even while there are outstanding references, no?

I've not had to look at the cl_rpc_users mechanism, but Dai
might have some opinions, like "kill it with fire!" ;-)


--
Chuck Lever