I'm observing an interesting stateid problem on a 2.6.18 based kernel to
an AIX server. I've added some printks to track stateids and a couple
other things, and see the following sequence:
01: _nfs4_do_open name=1GB57
02: decode_open , stateid: 00000001:48295ae2 00000008 004e8042
03: encode_open_confirm , stateid: 00000001:48295ae2 00000008 004e8042
04: decode_open_confirm , stateid: 00000002:48295ae2 00000008 004e8042
05: nfs4_open_confirm_done , stateid: 00000002:48295ae2 00000008 004e8042
06: update_open_stateid , stateid: 00000002:48295ae2 00000008 004e8042
07: nfs4_copy_stateid open , stateid: 00000002:48295ae2 00000008 004e8042
08: encode_setattr , stateid: 00000002:48295ae2 00000008 004e8042
09: encode_stateid , stateid: 00000002:48295ae2 00000008 004e8042
10: nfs4_renew_state now: 627063 last: 547074 lease: 120000 expires: 667074
11: nfs4_renew_done scheduled state recovery (no wait) task->tk_status=NFS4ERR_EXPIRED (-10011) now=627296
12: reclaimer at restart_loop nfs4_renew_done NFS4ERR_EXPIRED
13: encode_stateid skipped same stateid 7510 times
14: decode_open , stateid: 00000001:48295ae2 00000008 00508042
15: encode_stateid not last decoded , stateid: 00000002:48295ae2 00000008 004e8042
16: encode_open_confirm , stateid: 00000001:48295ae2 00000008 00508042
17: encode_stateid not last decoded , stateid: 00000002:48295ae2 00000008 004e8042
18: decode_open_confirm , stateid: 00000002:48295ae2 00000008 00508042
19: nfs4_open_confirm_done , stateid: 00000002:48295ae2 00000008 00508042
20: nfs4_open_recover_helper , stateid: 00000002:48295ae2 00000008 00508042
21: update_open_stateid , stateid: 00000002:48295ae2 00000008 00508042
22: reclaimer at out
23: nfs4_copy_stateid open , stateid: 00000002:48295ae2 00000008 00508042
24: encode_stateid , stateid: 00000002:48295ae2 00000008 00508042
25: encode_stateid skipped same stateid 8510 times
26: encode_close , stateid: 00000002:48295ae2 00000008 00508042
27: nfs4_put_open_state , stateid: 00000002:48295ae2 00000008 00508042
What seems to be happening is that a state requires renewal (at line
10), the renewal fails with NFS4ERR_EXPIRED (line 11) so a new open is
issued as part of recovery (line 14). During the time the open recovery
is in progress, two writes occur (line 15 and line 17) using the old
expired stateid. Finally, the open recovery completes and the new
stateid is used for the rest of the writes (line 24 and line 25).
One question is why the open state is expiring, it appears that is
expires significantly before it is expected to (expires: 667074, now:
627296). Also, should those two writes actually occur while open
recovery is in progress? If so, and if they fail, shouldn't they be
retried?
Any suggestions on additional debug? I'd like to avoid enabling all
debug messages since they would flood the log.
Thanks
Frank
-------------------------------------------------------------------------
Check out the new SourceForge.net Marketplace.
It's the best place to buy or sell services for
just about anything Open Source.
http://sourceforge.net/services/buy/index.php
_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs
_______________________________________________
Please note that [email protected] is being discontinued.
Please subscribe to [email protected] instead.
http://vger.kernel.org/vger-lists.html#linux-nfs