Hi,
I'm trying to track down a strange problem with state ids going bad
(possibly linked to ntp changing the system time on the non-Linux server)
and am still learning about how the state management works.
But I've come across an error where I don't think there should be one.
For whatever reason the client gets a BAD_STATEID on a file that it has a
lock on. The open gets a write delegation so that when it runs
nfs4_reclaim_locks(), nfs4_lock_reclaim aborts early without doing anything
(it doesn't need to because there is a delegation).
But the code below then checks that NFS_LOCK_INITIALIZED is set on all lock
states. But it isn't because nfs4_clear_open_state cleared it and
nfs4_lock_reclaim didn't bother setting it.
So I think the error should only be printed if there is no delegated state,
hence this patch.
Does it look right, or have I misunderstood something?
Thanks,
NeilBrown
diff --git a/fs/nfs/nfs4state.c b/fs/nfs/nfs4state.c
index 1fab140..1876ee7 100644
--- a/fs/nfs/nfs4state.c
+++ b/fs/nfs/nfs4state.c
@@ -1444,14 +1444,16 @@ restart:
if (status >= 0) {
status = nfs4_reclaim_locks(state, ops);
if (status >= 0) {
- spin_lock(&state->state_lock);
- list_for_each_entry(lock, &state->lock_states, ls_locks) {
- if (!test_bit(NFS_LOCK_INITIALIZED, &lock->ls_flags))
- pr_warn_ratelimited("NFS: "
- "%s: Lock reclaim "
- "failed!\n", __func__);
+ if (test_bit(NFS_DELEGATED_STATE, &state->flags) != 0) {
+ spin_lock(&state->state_lock);
+ list_for_each_entry(lock, &state->lock_states, ls_locks) {
+ if (!test_bit(NFS_LOCK_INITIALIZED, &lock->ls_flags))
+ pr_warn_ratelimited("NFS: "
+ "%s: Lock reclaim "
+ "failed!\n", __func__);
+ }
+ spin_unlock(&state->state_lock);
}
- spin_unlock(&state->state_lock);
nfs4_put_open_state(state);
spin_lock(&sp->so_lock);
goto restart;
T24gVGh1LCAyMDEzLTA4LTA4IGF0IDEyOjU5ICsxMDAwLCBOZWlsQnJvd24gd3JvdGU6DQo+IEhp
LA0KPiAgSSdtIHRyeWluZyB0byB0cmFjayBkb3duIGEgc3RyYW5nZSBwcm9ibGVtIHdpdGggc3Rh
dGUgaWRzIGdvaW5nIGJhZA0KPiAgKHBvc3NpYmx5IGxpbmtlZCB0byBudHAgY2hhbmdpbmcgdGhl
IHN5c3RlbSB0aW1lIG9uIHRoZSBub24tTGludXggc2VydmVyKQ0KPiAgYW5kIGFtIHN0aWxsIGxl
YXJuaW5nIGFib3V0IGhvdyB0aGUgc3RhdGUgbWFuYWdlbWVudCB3b3Jrcy4NCj4gDQo+ICBCdXQg
SSd2ZSBjb21lIGFjcm9zcyBhbiBlcnJvciB3aGVyZSBJIGRvbid0IHRoaW5rIHRoZXJlIHNob3Vs
ZCBiZSBvbmUuDQo+IA0KPiAgRm9yIHdoYXRldmVyIHJlYXNvbiB0aGUgY2xpZW50IGdldHMgYSBC
QURfU1RBVEVJRCBvbiBhIGZpbGUgdGhhdCBpdCBoYXMgYQ0KPiAgbG9jayBvbi4gIFRoZSBvcGVu
IGdldHMgYSB3cml0ZSBkZWxlZ2F0aW9uIHNvIHRoYXQgd2hlbiBpdCBydW5zDQo+ICBuZnM0X3Jl
Y2xhaW1fbG9ja3MoKSwgbmZzNF9sb2NrX3JlY2xhaW0gYWJvcnRzIGVhcmx5IHdpdGhvdXQgZG9p
bmcgYW55dGhpbmcNCj4gIChpdCBkb2Vzbid0IG5lZWQgdG8gYmVjYXVzZSB0aGVyZSBpcyBhIGRl
bGVnYXRpb24pLg0KPiAgQnV0IHRoZSBjb2RlIGJlbG93IHRoZW4gY2hlY2tzIHRoYXQgTkZTX0xP
Q0tfSU5JVElBTElaRUQgaXMgc2V0IG9uIGFsbCBsb2NrDQo+ICBzdGF0ZXMuICBCdXQgaXQgaXNu
J3QgYmVjYXVzZSBuZnM0X2NsZWFyX29wZW5fc3RhdGUgY2xlYXJlZCBpdCBhbmQNCj4gIG5mczRf
bG9ja19yZWNsYWltIGRpZG4ndCBib3RoZXIgc2V0dGluZyBpdC4NCj4gDQo+ICBTbyBJIHRoaW5r
IHRoZSBlcnJvciBzaG91bGQgb25seSBiZSBwcmludGVkIGlmIHRoZXJlIGlzIG5vIGRlbGVnYXRl
ZCBzdGF0ZSwNCj4gIGhlbmNlIHRoaXMgcGF0Y2guDQo+IA0KPiAgRG9lcyBpdCBsb29rIHJpZ2h0
LCBvciBoYXZlIEkgbWlzdW5kZXJzdG9vZCBzb21ldGhpbmc/DQo+IA0KDQpIaSBOZWlsLA0KDQpU
aGF0IGFuYWx5c2lzIGxvb2tzIGNvcnJlY3QuIENhbiB5b3UgcmVzZW5kIHRoZSBwYXRjaCB3aXRo
IGFuDQphcHByb3ByaWF0ZSBzaWduZWQtb2ZmLWJ5IGFuZCBjaGFuZ2Vsb2cgZW50cnk/DQoNClRo
YW5rcyENCiAgVHJvbmQNCi0tIA0KVHJvbmQgTXlrbGVidXN0DQpMaW51eCBORlMgY2xpZW50IG1h
aW50YWluZXINCg0KTmV0QXBwDQpUcm9uZC5NeWtsZWJ1c3RAbmV0YXBwLmNvbQ0Kd3d3Lm5ldGFw
cC5jb20NCg==
On Thu, 8 Aug 2013 15:51:30 +0000 "Myklebust, Trond"
<[email protected]> wrote:
> On Thu, 2013-08-08 at 12:59 +1000, NeilBrown wrote:
> > Hi,
> > I'm trying to track down a strange problem with state ids going bad
> > (possibly linked to ntp changing the system time on the non-Linux server)
> > and am still learning about how the state management works.
> >
> > But I've come across an error where I don't think there should be one.
> >
> > For whatever reason the client gets a BAD_STATEID on a file that it has a
> > lock on. The open gets a write delegation so that when it runs
> > nfs4_reclaim_locks(), nfs4_lock_reclaim aborts early without doing anything
> > (it doesn't need to because there is a delegation).
> > But the code below then checks that NFS_LOCK_INITIALIZED is set on all lock
> > states. But it isn't because nfs4_clear_open_state cleared it and
> > nfs4_lock_reclaim didn't bother setting it.
> >
> > So I think the error should only be printed if there is no delegated state,
> > hence this patch.
> >
> > Does it look right, or have I misunderstood something?
> >
>
> Hi Neil,
>
> That analysis looks correct. Can you resend the patch with an
> appropriate signed-off-by and changelog entry?
Thanks. I've resent separately.
NeilBrown