Hey Folks, we've been hunting an NLM bug here for a few days and I
think we've got enough information to bring this one to the community.
This exists, as far as we can tell, in the current tree, and goes back
at least as far as 2.6.22.
Two clients (one process per client): clientA holds a full-file lock
on a file, and clientB is waiting on the lock.
If the process on clientB receives a signal, the lock is cancelled and
then re-requested:
[11269.808195] lockd: server 10.1.230.34 CANCEL (async) status 0 pid 0 (0) type 0x1 0 -> 9223372036854775807 fh(32) 0x003d71ae 0x107ebab1 0x00000020 0xb799de00 0x06c2e532 0x5b0098a0 0x003d71ae 0x007ebab1
[11269.819230] lockd: server 10.1.230.34 LOCK status 3 pid 1 (0) type 0x1 0 -> 9223372036854775807 fh(32) 0x003d71ae 0x107ebab1 0x00000020 0xb799de00 0x06c2e532 0x5b0098a0 0x003d71ae 0x007ebab1
The pid is incremented between requests, as it should be. Now, if
clientA releases its lock, clientB gets the callback properly and
successfully gains the lock:
[11298.158595] lockd: server 10.1.230.34 GRANTED status 0 pid 1 (1) type 0x1 0 -> 9223372036854775807 fh(32) 0x003d71ae 0x107ebab1 0x00000020 0xb799de00 0x06c2e532 0x5b0098a0 0x003d71ae 0x007ebab1
Now, clientB's process dies, releasing the lock:
[11306.465702] lockd: server 10.1.230.34 UNLOCK (async) status 0 pid 2 (0) type 0x2 0 -> 9223372036854775807 fh(32) 0x003d71ae 0x107ebab1 0x00000020 0xb799de00 0x06c2e532 0x5b0098a0 0x003d71ae 0x007ebab1
Note the pid here has incorrectly incremented.
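For what it's worth, the clientB side of the trace above boils down to
roughly the following. This is an untested sketch: the mount path and
the use of SIGALRM are just placeholders, and clientA is assumed to
already hold a full-file write lock on the same file.

/* untested sketch of the clientB sequence; path and signal are placeholders */
#include <fcntl.h>
#include <signal.h>
#include <stdio.h>
#include <unistd.h>

static void handler(int sig)
{
	(void)sig;			/* empty handler: just interrupt F_SETLKW */
}

int main(void)
{
	struct sigaction sa = { .sa_handler = handler };	/* no SA_RESTART */
	struct flock fl = {
		.l_type   = F_WRLCK,
		.l_whence = SEEK_SET,
		.l_start  = 0,
		.l_len    = 0,		/* 0 = lock the whole file */
	};
	int fd = open("/mnt/nfs/lockfile", O_RDWR);	/* hypothetical path */

	if (fd < 0) {
		perror("open");
		return 1;
	}

	sigemptyset(&sa.sa_mask);
	sigaction(SIGALRM, &sa, NULL);
	alarm(5);

	/* blocks behind clientA; the alarm interrupts it, which is what
	 * produces the CANCEL in the trace above */
	if (fcntl(fd, F_SETLKW, &fl) < 0)
		perror("first F_SETLKW (EINTR expected)");

	/* retry, which is the re-requested LOCK with the bumped pid */
	if (fcntl(fd, F_SETLKW, &fl) < 0)
		perror("second F_SETLKW");

	/* once clientA unlocks we get GRANTED; exiting then releases the
	 * lock, which is where the UNLOCK with the wrong pid shows up */
	return 0;
}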
We dug into this a bit and found that __nlm_find_lockowner() did not
correctly match the owner of the lock we were trying to find and
unlock, so it created a new lock "pid". This results in the server
ignoring the unlock, and the file being permanently locked up.
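In case it helps anyone following along: the lookup in question walks
the per-host lockowner list and keys purely on the fl_owner that comes
down with the file_lock. Paraphrased and simplified (this is not the
verbatim kernel code), it behaves like this:

/* simplified paraphrase of the h_lockowners lookup in fs/lockd/clntproc.c;
 * not the verbatim kernel source */
static struct nlm_lockowner *find_lockowner(struct nlm_host *host,
					    fl_owner_t owner)
{
	struct nlm_lockowner *lo;

	list_for_each_entry(lo, &host->h_lockowners, list) {
		if (lo->owner == owner)
			return nlm_get_lockowner(lo);	/* reuse existing pid */
	}
	return NULL;	/* caller allocates a new lockowner, i.e. a new pid */
}

If nothing on the list matches, the fallback path hands out a fresh
pid, which lines up with the bogus increment on the UNLOCK above.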
We're no kernel wizards here, so we're looking for a bit of help. Any
suggestions on where to look from here would be appreciated.
Thanks!
-Aaron
Hi Aaron - I've been looking into this here; it does appear to be a
client issue. I suspect the lockowner is not being properly linked
into the per-server list, in the case where the lock request is
retried. It appears to be related to the use/reuse of the file_lock
struct that comes down with the call.
In kernels prior to 2.6.9, this may have worked fine because the same
lockowner was always used for a given process. It's not yet clear to
me, however, whether the lost-unlock behavior was also introduced there.
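To illustrate what I mean by the attach path (paraphrased from memory,
so treat the names and details as approximate rather than the exact
source for any particular kernel version): the file_lock that comes
down with the call caches a reference to its nlm_lockowner the first
time it is set up against a host, roughly like this:

/* paraphrased sketch of the lockowner attach path; names and details
 * are approximate, not the exact source of any particular kernel */
static void attach_lockowner(struct file_lock *fl, struct nlm_host *host)
{
	fl->fl_u.nfs_fl.state = 0;
	/* find-or-create on host->h_lockowners; this is where the
	 * pid association is made */
	fl->fl_u.nfs_fl.owner = nlm_find_lockowner(host, fl->fl_owner);
	INIT_LIST_HEAD(&fl->fl_u.nfs_fl.list);
	fl->fl_ops = &nlmclnt_lock_ops;	/* release hook drops the owner ref */
}

If a retried request reuses the same struct without coming back
through here, whatever state the cancel left behind is what the later
unlock has to match against, which would fit the new-pid behavior
you're seeing.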
I can reproduce some slightly different behavior using three flock(1)
commands, ^Z and ^C. I haven't been able to leave abandoned
locks on the server with these yet, however. More later...
Tom.
We think we have a fix - however, we definitely need some folks to
review it. It does appear to solve the reproducible issue as we were
seeing it.
From 627028f01dae67882061c533236ac3f1f00c70c3 Mon Sep 17 00:00:00 2001
From: Aaron Wiebe <aaronw-k7QPB+T73Rje9wHmmfpqLFaTQe2KTcn/@public.gmane.org>
Date: Thu, 27 Mar 2008 16:53:04 -0400
Subject: [PATCH] Fix an issue where an NLM CANCEL would desync VFS
When a process waiting on an NLM lock receives a signal,
an NLM CANCEL is sent to the NFS server, but the VFS
keeps the request on its blocked list: it is never told
to release the local lock when the CANCEL is processed.
After the signal handler returns, NLM re-requests the
lock; it blocks again, and when the GRANTED callback is
received, the process ends up blocked on its original
(cancelled) lock.

Tell the VFS to clear the lock request once the CANCEL
has completed.
Signed-off-by: Aaron Wiebe <aaronw-k7QPB+T73Rje9wHmmfpqLFaTQe2KTcn/@public.gmane.org>
Signed-off-by: Brandon Seibel <brandon.seibel-k7QPB+T73Rje9wHmmfpqLFaTQe2KTcn/@public.gmane.org>
---
fs/lockd/clntproc.c | 4 ++++
1 files changed, 4 insertions(+), 0 deletions(-)
diff --git a/fs/lockd/clntproc.c b/fs/lockd/clntproc.c
index b6b74a6..6a1b80c 100644
--- a/fs/lockd/clntproc.c
+++ b/fs/lockd/clntproc.c
@@ -687,6 +687,10 @@ static int nlmclnt_cancel(struct nlm_host *host, int block, struct file_lock *fl
 		return -ENOMEM;
 	req->a_flags = RPC_TASK_ASYNC;
+	/* set the lock type to F_UNLCK for vfs, so once the cancel has
+	 * completed, the vfs blocked list will be updated accordingly
+	 */
+	fl->fl_type = F_UNLCK;
 	nlmclnt_setlockargs(req, fl);
 	req->a_args.block = block;
--
1.5.3.4