2009-02-11 11:23:19

by Frank van Maarseveen

Subject: [NLM] 2.6.27.14 breakage when grace period expires

I'm sorry to inform you but... it seems there is a problem in the NLM
subsystem similar to the one reported previously, but this time it is
triggered when the grace period expires after a reboot.

Client and server run 2.6.27.14 + previous fix, NFSv3.

On the client there are three shells running:

while :; do lck -w /mnt/foo 2; done

The "lck" program is the same as posted before and it obtains an exclusive
write lock then waits 2 seconds in above invocation (there's probably an
"fcntl" command equivalent). After an orderly server reboot + grace time
expiration one of above command loops reports:

lck: fcntl: No locks available

and all three get stuck. After ^C-ing all "lck" loops the server still
shows an entry in /proc/locks, which leaves the file locked
indefinitely. Maybe two loops are sufficient to reproduce the issue or
maybe you need more, I don't know.
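
For reference, a minimal sketch of what the "lck" program presumably does,
reconstructed from the description above (the real program was posted
earlier in the thread; the option handling, file creation and exit behavior
here are assumptions):

/* lck.c -- reconstruction (assumption): take an exclusive write lock on
 * the given file with fcntl(), hold it for N seconds, then exit.
 * Usage: lck -w <file> <seconds>
 */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <fcntl.h>

int main(int argc, char **argv)
{
    struct flock fl;
    int fd;

    if (argc != 4 || strcmp(argv[1], "-w") != 0) {
        fprintf(stderr, "usage: %s -w <file> <seconds>\n", argv[0]);
        return 2;
    }
    fd = open(argv[2], O_RDWR | O_CREAT, 0644);
    if (fd < 0) {
        perror("lck: open");
        return 1;
    }
    memset(&fl, 0, sizeof(fl));
    fl.l_type = F_WRLCK;        /* exclusive write lock */
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0;               /* whole file */

    /* F_SETLKW waits for the lock; over NFSv3 this becomes an NLM LOCK
     * request, and ENOLCK is reported as "No locks available". */
    if (fcntl(fd, F_SETLKW, &fl) < 0) {
        perror("lck: fcntl");
        return 1;
    }
    sleep(atoi(argv[3]));       /* hold the lock for N seconds */
    return 0;                   /* lock released on process exit */
}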

Interestingly, during the grace time at least one of the "lck" processes
should have re-obtained the lock but it didn't show up in /proc/locks
on the server.

Interestingly (#2), after removing the file on the server (i.e. no
sillyrename) the now free inode is still locked according to /proc/locks.
Even stopping/starting /etc/init.d/nfs-kernel-server plus "echo
3 >/proc/sys/vm/drop_caches" did not remove the lock (it did re-enter
grace).

--
Frank


2009-02-11 20:35:52

by J. Bruce Fields

Subject: Re: [NLM] 2.6.27.14 breakage when grace period expires

On Wed, Feb 11, 2009 at 12:23:18PM +0100, Frank van Maarseveen wrote:
> I'm sorry to inform you but... it seems that there is a similar problem
> in the NLM subsystem as reported previously but this time it is triggered
> when the grace time expires after a reboot.
>
> Client and server run 2.6.27.14 + previous fix, NFSv3.
>
> On the client there are three shells running:
>
> while :; do lck -w /mnt/foo 2; done
>
> The "lck" program is the same as posted before and it obtains an exclusive
> write lock then waits 2 seconds in above invocation (there's probably an
> "fcntl" command equivalent). After an orderly server reboot + grace time

How are you rebooting the server?

--b.

> expiration one of above command loops reports:
>
> lck: fcntl: No locks available
>
> and all three get stuck. After ^C-ing all "lck" loops the server still
> shows an entry in /proc/locks which causes the file to be locked
> indefinately. Maybe two loops are sufficient to reproduce the issue or
> maybe you need more, I don't know.
>
> Interestingly, during the grace time at least one of the "lck" processes
> should have re-obtained the lock but it didn't show up in /proc/locks
> on the server.
>
> Interestingly (#2), after removing the file on the server (i.e. no
> sillyrename) the now free inode is still locked according to /proc/locks.
> Even stopping/starting /etc/init.d/nfs-kernel-server plus "echo
> 3 >/proc/sys/vm/drop_caches" did not remove the lock (it did re-enter
> grace).
>
> --
> Frank

2009-02-11 20:37:05

by Frank van Maarseveen

Subject: Re: [NLM] 2.6.27.14 breakage when grace period expires

On Wed, Feb 11, 2009 at 03:35:55PM -0500, J. Bruce Fields wrote:
> On Wed, Feb 11, 2009 at 12:23:18PM +0100, Frank van Maarseveen wrote:
> > I'm sorry to inform you but... it seems that there is a similar problem
> > in the NLM subsystem as reported previously but this time it is triggered
> > when the grace time expires after a reboot.
> >
> > Client and server run 2.6.27.14 + previous fix, NFSv3.
> >
> > On the client there are three shells running:
> >
> > while :; do lck -w /mnt/foo 2; done
> >
> > The "lck" program is the same as posted before and it obtains an exclusive
> > write lock then waits 2 seconds in above invocation (there's probably an
> > "fcntl" command equivalent). After an orderly server reboot + grace time
>
> How are you rebooting the server?

"reboot"

>
> --b.
>
> > expiration one of above command loops reports:
> >
> > lck: fcntl: No locks available
> >
> > and all three get stuck. After ^C-ing all "lck" loops the server still
> > shows an entry in /proc/locks which causes the file to be locked
> > indefinately. Maybe two loops are sufficient to reproduce the issue or
> > maybe you need more, I don't know.
> >
> > Interestingly, during the grace time at least one of the "lck" processes
> > should have re-obtained the lock but it didn't show up in /proc/locks
> > on the server.
> >
> > Interestingly (#2), after removing the file on the server (i.e. no
> > sillyrename) the now free inode is still locked according to /proc/locks.
> > Even stopping/starting /etc/init.d/nfs-kernel-server plus "echo
> > 3 >/proc/sys/vm/drop_caches" did not remove the lock (it did re-enter
> > grace).
> >
> > --
> > Frank

--
Frank

2009-02-11 20:39:40

by J. Bruce Fields

Subject: Re: [NLM] 2.6.27.14 breakage when grace period expires

On Wed, Feb 11, 2009 at 09:37:03PM +0100, Frank van Maarseveen wrote:
> On Wed, Feb 11, 2009 at 03:35:55PM -0500, J. Bruce Fields wrote:
> > On Wed, Feb 11, 2009 at 12:23:18PM +0100, Frank van Maarseveen wrote:
> > > I'm sorry to inform you but... it seems that there is a similar problem
> > > in the NLM subsystem as reported previously but this time it is triggered
> > > when the grace time expires after a reboot.
> > >
> > > Client and server run 2.6.27.14 + previous fix, NFSv3.
> > >
> > > On the client there are three shells running:
> > >
> > > while :; do lck -w /mnt/foo 2; done
> > >
> > > The "lck" program is the same as posted before and it obtains an exclusive
> > > write lock then waits 2 seconds in above invocation (there's probably an
> > > "fcntl" command equivalent). After an orderly server reboot + grace time
> >
> > How are you rebooting the server?
>
> "reboot"

Could you watch the nfs/nlm/nsm traffic on reboot and make sure that the
server is actually sending the reboot notification to the client, and
that the client is trying to reclaim? (Wireshark should make this all
fairly clear. But capture the traffic with tcpdump -s0 -wtmp.pcap and
send it to me if you're having trouble interpreting it.)

--b.

>
> >
> > --b.
> >
> > > expiration one of above command loops reports:
> > >
> > > lck: fcntl: No locks available
> > >
> > > and all three get stuck. After ^C-ing all "lck" loops the server still
> > > shows an entry in /proc/locks which causes the file to be locked
> > > indefinately. Maybe two loops are sufficient to reproduce the issue or
> > > maybe you need more, I don't know.
> > >
> > > Interestingly, during the grace time at least one of the "lck" processes
> > > should have re-obtained the lock but it didn't show up in /proc/locks
> > > on the server.
> > >
> > > Interestingly (#2), after removing the file on the server (i.e. no
> > > sillyrename) the now free inode is still locked according to /proc/locks.
> > > Even stopping/starting /etc/init.d/nfs-kernel-server plus "echo
> > > 3 >/proc/sys/vm/drop_caches" did not remove the lock (it did re-enter
> > > grace).
> > >
> > > --
> > > Frank
>
> --
> Frank

2009-02-11 20:57:10

by Frank van Maarseveen

Subject: Re: [NLM] 2.6.27.14 breakage when grace period expires

On Wed, Feb 11, 2009 at 03:39:48PM -0500, J. Bruce Fields wrote:
> On Wed, Feb 11, 2009 at 09:37:03PM +0100, Frank van Maarseveen wrote:
> > On Wed, Feb 11, 2009 at 03:35:55PM -0500, J. Bruce Fields wrote:
> > > On Wed, Feb 11, 2009 at 12:23:18PM +0100, Frank van Maarseveen wrote:
> > > > I'm sorry to inform you but... it seems that there is a similar problem
> > > > in the NLM subsystem as reported previously but this time it is triggered
> > > > when the grace time expires after a reboot.
> > > >
> > > > Client and server run 2.6.27.14 + previous fix, NFSv3.
> > > >
> > > > On the client there are three shells running:
> > > >
> > > > while :; do lck -w /mnt/foo 2; done
> > > >
> > > > The "lck" program is the same as posted before and it obtains an exclusive
> > > > write lock then waits 2 seconds in above invocation (there's probably an
> > > > "fcntl" command equivalent). After an orderly server reboot + grace time
> > >
> > > How are you rebooting the server?
> >
> > "reboot"
>
> Could you watch the nfs/nlm/nsm traffic on reboot and make sure that the
> server is actually sending the reboot notification to the client, and
> that the client is trying to reclaim? (Wireshark should make this all
> fairly clear. But capture the traffic with tcpdump -s0 -wtmp.pcap and
> send it to me if you're having trouble interpreting it.)

I can't try it right now, but tomorrow I can. However, I'm pretty sure at
least the reboot notification is there, because:

1) The issue also happens in a totally different NFS server setup, one
   which by definition invokes sm-notify in a script. That is the real
   use case.
2) Even if it weren't, I would expect different behavior from what I saw.
   A lost reboot notification is always possible, but in that case the
   client(s) might end up holding more locks than the server, not the
   other way around as it is right now.

I'll make a capture.

--
Frank

2009-02-12 14:28:33

by Frank van Maarseveen

Subject: Re: [NLM] 2.6.27.14 breakage when grace period expires

On Wed, Feb 11, 2009 at 03:39:48PM -0500, J. Bruce Fields wrote:
> On Wed, Feb 11, 2009 at 09:37:03PM +0100, Frank van Maarseveen wrote:
> > On Wed, Feb 11, 2009 at 03:35:55PM -0500, J. Bruce Fields wrote:
> > > On Wed, Feb 11, 2009 at 12:23:18PM +0100, Frank van Maarseveen wrote:
> > > > I'm sorry to inform you but... it seems that there is a similar problem
> > > > in the NLM subsystem as reported previously but this time it is triggered
> > > > when the grace time expires after a reboot.
> > > >
> > > > Client and server run 2.6.27.14 + previous fix, NFSv3.
> > > >
> > > > On the client there are three shells running:
> > > >
> > > > while :; do lck -w /mnt/foo 2; done
> > > >
> > > > The "lck" program is the same as posted before and it obtains an exclusive
> > > > write lock then waits 2 seconds in above invocation (there's probably an
> > > > "fcntl" command equivalent). After an orderly server reboot + grace time
> > >
> > > How are you rebooting the server?
> >
> > "reboot"
>
> Could you watch the nfs/nlm/nsm traffic on reboot and make sure that the
> server is actually sending the reboot notification to the client, and
> that the client is trying to reclaim? (Wireshark should make this all
> fairly clear. But capture the traffic with tcpdump -s0 -wtmp.pcap and
> send it to me if you're having trouble interpreting it.)

I have a capture with comments below. It raised so many questions
that I decided to do some more testing, trying to figure out how
it looks when the locking works. The issue now appears to predate the
fuse changes and is also present when both client and server run
2.6.24.4. I decided to stick with the traffic capture for 2.6.27.14 +
previous fix as discussed earlier. The full capture is available at
http://www.frankvm.com/tmp/2.6.27.14-nlm-grace.pcap. It's about 33k and
was started on the server as part of the initscripts, right after the
reboot, filtered on the client IP address.

Exported by wireshark (filter: nfs or stat or nlm) and condensed:

# time src prot
1 0.000000 client: NFS V3 GETATTR Call (Reply In 42), FH:0x0308030a
2 0.000018 client: NFS [RPC retransmission of #1]V3 GETATTR Call (Reply In 42), FH:0x0308030a
5 0.000583 server: ICMP Destination unreachable (Port unreachable)
6 0.000589 server: ICMP Destination unreachable (Port unreachable)
7 1.891277 client: NFS [RPC retransmission of #1]V3 GETATTR Call (Reply In 42), FH:0x0308030a
8 1.891320 server: ICMP Destination unreachable (Port unreachable)
9 5.827053 client: NFS [RPC retransmission of #1]V3 GETATTR Call (Reply In 42), FH:0x0308030a
10 5.827119 server: ICMP Destination unreachable (Port unreachable)
11 14.626501 client: NFS [RPC retransmission of #1]V3 GETATTR Call (Reply In 42), FH:0x0308030a
12 14.626587 server: ICMP Destination unreachable (Port unreachable)
15 15.726426 client: NFS [RPC retransmission of #1]V3 GETATTR Call (Reply In 42), FH:0x0308030a
16 15.726505 server: ICMP Destination unreachable (Port unreachable)
17 17.926284 client: NFS [RPC retransmission of #1]V3 GETATTR Call (Reply In 42), FH:0x0308030a
18 17.926368 server: ICMP Destination unreachable (Port unreachable)
25 22.326006 client: NFS [RPC retransmission of #1]V3 GETATTR Call (Reply In 42), FH:0x0308030a
26 22.326090 server: ICMP Destination unreachable (Port unreachable)
35 30.022271 client: NLM V4 UNLOCK Call (Reply In 36) FH:0xcafa61cc svid:114 pos:0-0
36 30.029511 server: NLM V4 UNLOCK Reply (Call In 35) NLM_DENIED_GRACE_PERIOD
37 30.029660 client: NLM V4 LOCK Call (Reply In 39) FH:0xcafa61cc svid:116 pos:0-0
38 30.029691 client: NLM V4 LOCK Call (Reply In 40) FH:0xcafa61cc svid:115 pos:0-0
39 30.029884 server: NLM V4 LOCK Reply (Call In 37) NLM_DENIED_GRACE_PERIOD
40 30.029914 server: NLM V4 LOCK Reply (Call In 38) NLM_DENIED_GRACE_PERIOD
41 31.125403 client: NFS [RPC retransmission of #1]V3 GETATTR Call (Reply In 42), FH:0x0308030a
42 31.127499 server: NFS V3 GETATTR Reply (Call In 1) Directory mode:0755 uid:0 gid:0
43 31.127942 client: NFS V3 GETATTR Call (Reply In 45), FH:0x0308030a
45 31.129378 server: NFS V3 GETATTR Reply (Call In 43) Directory mode:0755 uid:0 gid:0
47 31.129958 server: STAT V1 NOTIFY Call (Reply In 48)
48 31.130301 client: STAT V1 NOTIFY Reply (Call In 47)

Reboot notification ok.

51 35.029968 client: NLM V4 UNLOCK Call (Reply In 54) FH:0xcafa61cc svid:114 pos:0-0
52 35.030003 client: NLM V4 LOCK Call (Reply In 55) FH:0xcafa61cc svid:116 pos:0-0
53 35.030016 client: NLM V4 LOCK Call (Reply In 56) FH:0xcafa61cc svid:115 pos:0-0
54 35.030085 server: NLM V4 UNLOCK Reply (Call In 51) NLM_DENIED_GRACE_PERIOD
55 35.030126 server: NLM V4 LOCK Reply (Call In 52) NLM_DENIED_GRACE_PERIOD
56 35.030153 server: NLM V4 LOCK Reply (Call In 53) NLM_DENIED_GRACE_PERIOD

The three contending client processes. I don't see a lock registration for
svid:114, only UNLOCK calls, which fail with NLM_DENIED_GRACE_PERIOD. The
above goes on for a while. Neither the server nor the client shows any lock
in /proc/locks at this point.

166 115.028376 client: NLM V4 LOCK Call (Reply In 168) FH:0xcafa61cc svid:115 pos:0-0
167 115.028394 client: NLM V4 LOCK Call (Reply In 169) FH:0xcafa61cc svid:116 pos:0-0
168 115.028440 server: NLM V4 LOCK Reply (Call In 166) NLM_DENIED_GRACE_PERIOD
169 115.028465 server: NLM V4 LOCK Reply (Call In 167) NLM_DENIED_GRACE_PERIOD
170 120.027233 client: NLM V4 UNLOCK Call (Reply In 171) FH:0xcafa61cc svid:114 pos:0-0
171 120.027337 server: NLM V4 UNLOCK Reply (Call In 170) NLM_DENIED_GRACE_PERIOD
172 120.028234 client: NLM V4 LOCK Call (Reply In 175) FH:0xcafa61cc svid:116 pos:0-0
173 120.028258 client: NLM V4 LOCK Call (Reply In 174) FH:0xcafa61cc svid:115 pos:0-0
174 120.030601 server: NLM V4 LOCK Reply (Call In 173)
175 120.030656 server: NLM V4 LOCK Reply (Call In 172) NLM_BLOCKED

This doesn't add up. There hasn't been a successful unlock for svid:114
yet (that only happens at #213), but one of the locks is still granted.

176 120.030781 client: NLM V4 LOCK Call (Reply In 177) FH:0xcafa61cc svid:115 pos:0-0
177 120.030849 server: NLM V4 LOCK Reply (Call In 176)

Strange: an identical lock request but with a different rpc xid (i.e. no
packet duplication).

178 120.031078 client: NFS V3 GETATTR Call (Reply In 179), FH:0xcafa61cc
179 120.031154 server: NFS V3 GETATTR Reply (Call In 178) Regular File mode:0644 uid:363 gid:1500
180 120.033973 client: NFS V3 ACCESS Call (Reply In 181), FH:0x0308030a
181 120.034030 server: NFS V3 ACCESS Reply (Call In 180)
182 120.034223 client: NFS V3 LOOKUP Call (Reply In 183), DH:0x0308030a/loc
183 120.034285 server: NFS V3 LOOKUP Reply (Call In 182), FH:0x81685ca0
184 120.034472 client: NFS V3 ACCESS Call (Reply In 185), FH:0x0308030c
185 120.034526 server: NFS V3 ACCESS Reply (Call In 184)
186 120.034722 client: NFS V3 ACCESS Call (Reply In 187), FH:0x0308030c
187 120.034776 server: NFS V3 ACCESS Reply (Call In 186)
188 120.034922 client: NFS V3 LOOKUP Call (Reply In 189), DH:0x0308030c/locktest
189 120.034993 server: NFS V3 LOOKUP Reply (Call In 188), FH:0xcafa61cc
190 120.035172 client: NFS V3 ACCESS Call (Reply In 191), FH:0xcafa61cc
191 120.035230 server: NFS V3 ACCESS Reply (Call In 190)
193 122.032218 client: NLM V4 UNLOCK Call (Reply In 195) FH:0xcafa61cc svid:115 pos:0-0
194 122.032253 client: NLM V4 LOCK Call (Reply In 197) FH:0xcafa61cc svid:119 pos:0-0
195 122.032343 server: NLM V4 UNLOCK Reply (Call In 193)
197 122.032794 server: NLM V4 LOCK Reply (Call In 194) NLM_BLOCKED
201 122.033767 server: NLM V4 GRANTED_MSG Call (Reply In 202) FH:0xcafa61cc svid:116 pos:0-0
202 122.034066 client: NLM V4 GRANTED_MSG Reply (Call In 201)
205 122.034665 client: NLM V4 GRANTED_RES Call (Reply In 206) NLM_DENIED
206 122.034753 server: NLM V4 GRANTED_RES Reply (Call In 205)
207 122.036312 client: NFS V3 GETATTR Call (Reply In 208), FH:0xcafa61cc
208 122.036394 server: NFS V3 GETATTR Reply (Call In 207) Regular File mode:0644 uid:363 gid:1500
209 122.036611 client: NLM V4 LOCK Call (Reply In 210) FH:0xcafa61cc svid:120 pos:0-0
210 122.036674 server: NLM V4 LOCK Reply (Call In 209) NLM_BLOCKED
213 125.027091 client: NLM V4 UNLOCK Call (Reply In 214) FH:0xcafa61cc svid:114 pos:0-0
214 125.027194 server: NLM V4 UNLOCK Reply (Call In 213)
215 125.029487 client: NFS V3 GETATTR Call (Reply In 216), FH:0xcafa61cc
216 125.029570 server: NFS V3 GETATTR Reply (Call In 215) Regular File mode:0644 uid:363 gid:1500
217 125.029836 client: NLM V4 LOCK Call (Reply In 218) FH:0xcafa61cc svid:121 pos:0-0
218 125.029895 server: NLM V4 LOCK Reply (Call In 217) NLM_BLOCKED
224 152.032157 client: NLM V4 LOCK Call (Reply In 225) FH:0xcafa61cc svid:119 pos:0-0
225 152.032283 server: NLM V4 LOCK Reply (Call In 224) NLM_BLOCKED
226 152.035103 client: NLM V4 LOCK Call (Reply In 227) FH:0xcafa61cc svid:120 pos:0-0
227 152.035157 server: NLM V4 LOCK Reply (Call In 226) NLM_BLOCKED
230 155.029676 client: NLM V4 LOCK Call (Reply In 231) FH:0xcafa61cc svid:121 pos:0-0
231 155.029761 server: NLM V4 LOCK Reply (Call In 230) NLM_BLOCKED

To recap the problem: one of the fcntl calls to obtain a write lock
returns

lck: fcntl: No locks available

shortly after the grace period expires. After that everything gets stuck,
with the server holding a write lock that has no corresponding client-side
lock.


IMO it looks like the client is to blame, even if the server should or
could have accepted the UNLOCK during grace (I don't know, I'm not an
expert on that one).

--
Frank

2009-02-12 15:16:38

by Trond Myklebust

Subject: Re: [NLM] 2.6.27.14 breakage when grace period expires

On Thu, 2009-02-12 at 15:28 +0100, Frank van Maarseveen wrote:
> On Wed, Feb 11, 2009 at 03:39:48PM -0500, J. Bruce Fields wrote:
> > On Wed, Feb 11, 2009 at 09:37:03PM +0100, Frank van Maarseveen wrote:
> > > On Wed, Feb 11, 2009 at 03:35:55PM -0500, J. Bruce Fields wrote:
> > > > On Wed, Feb 11, 2009 at 12:23:18PM +0100, Frank van Maarseveen wrote:
> > > > > I'm sorry to inform you but... it seems that there is a similar problem
> > > > > in the NLM subsystem as reported previously but this time it is triggered
> > > > > when the grace time expires after a reboot.
> > > > >
> > > > > Client and server run 2.6.27.14 + previous fix, NFSv3.
> > > > >
> > > > > On the client there are three shells running:
> > > > >
> > > > > while :; do lck -w /mnt/foo 2; done
> > > > >
> > > > > The "lck" program is the same as posted before and it obtains an exclusive
> > > > > write lock then waits 2 seconds in above invocation (there's probably an
> > > > > "fcntl" command equivalent). After an orderly server reboot + grace time
> > > >
> > > > How are you rebooting the server?
> > >
> > > "reboot"
> >
> > Could you watch the nfs/nlm/nsm traffic on reboot and make sure that the
> > server is actually sending the reboot notification to the client, and
> > that the client is trying to reclaim? (Wireshark should make this all
> > fairly clear. But capture the traffic with tcpdump -s0 -wtmp.pcap and
> > send it to me if you're having trouble interpreting it.)
>
> I have a capture with comment below. It raised so many questions
> that I decided to do some more testing, trying to figure out how
> it looks when the locking works. This issue now appears to predate the
> fuse changes and is also present when both client and server run
> 2.6.24.4. I decided to stick with the traffic capture for 2.7.27.14 +
> previous fix as discussed earlier. The full capture is available at
> http://www.frankvm.com/tmp/2.6.27.14-nlm-grace.pcap. It's about 33k and
> was started on the server as part of initscripts, right after the reboot
> and filtered on client IP address.
>
> Exported by wireshark (filter: nfs or stat or nlm) and condensed:
>
> # time src prot
> 1 0.000000 client: NFS V3 GETATTR Call (Reply In 42), FH:0x0308030a
> 2 0.000018 client: NFS [RPC retransmission of #1]V3 GETATTR Call (Reply In 42), FH:0x0308030a
> 5 0.000583 server: ICMP Destination unreachable (Port unreachable)
> 6 0.000589 server: ICMP Destination unreachable (Port unreachable)
> 7 1.891277 client: NFS [RPC retransmission of #1]V3 GETATTR Call (Reply In 42), FH:0x0308030a
> 8 1.891320 server: ICMP Destination unreachable (Port unreachable)
> 9 5.827053 client: NFS [RPC retransmission of #1]V3 GETATTR Call (Reply In 42), FH:0x0308030a
> 10 5.827119 server: ICMP Destination unreachable (Port unreachable)
> 11 14.626501 client: NFS [RPC retransmission of #1]V3 GETATTR Call (Reply In 42), FH:0x0308030a
> 12 14.626587 server: ICMP Destination unreachable (Port unreachable)
> 15 15.726426 client: NFS [RPC retransmission of #1]V3 GETATTR Call (Reply In 42), FH:0x0308030a
> 16 15.726505 server: ICMP Destination unreachable (Port unreachable)
> 17 17.926284 client: NFS [RPC retransmission of #1]V3 GETATTR Call (Reply In 42), FH:0x0308030a
> 18 17.926368 server: ICMP Destination unreachable (Port unreachable)
> 25 22.326006 client: NFS [RPC retransmission of #1]V3 GETATTR Call (Reply In 42), FH:0x0308030a
> 26 22.326090 server: ICMP Destination unreachable (Port unreachable)
> 35 30.022271 client: NLM V4 UNLOCK Call (Reply In 36) FH:0xcafa61cc svid:114 pos:0-0
> 36 30.029511 server: NLM V4 UNLOCK Reply (Call In 35) NLM_DENIED_GRACE_PERIOD
> 37 30.029660 client: NLM V4 LOCK Call (Reply In 39) FH:0xcafa61cc svid:116 pos:0-0
> 38 30.029691 client: NLM V4 LOCK Call (Reply In 40) FH:0xcafa61cc svid:115 pos:0-0
> 39 30.029884 server: NLM V4 LOCK Reply (Call In 37) NLM_DENIED_GRACE_PERIOD
> 40 30.029914 server: NLM V4 LOCK Reply (Call In 38) NLM_DENIED_GRACE_PERIOD
> 41 31.125403 client: NFS [RPC retransmission of #1]V3 GETATTR Call (Reply In 42), FH:0x0308030a
> 42 31.127499 server: NFS V3 GETATTR Reply (Call In 1) Directory mode:0755 uid:0 gid:0
> 43 31.127942 client: NFS V3 GETATTR Call (Reply In 45), FH:0x0308030a
> 45 31.129378 server: NFS V3 GETATTR Reply (Call In 43) Directory mode:0755 uid:0 gid:0
> 47 31.129958 server: STAT V1 NOTIFY Call (Reply In 48)
> 48 31.130301 client: STAT V1 NOTIFY Reply (Call In 47)
>
> Reboot notification ok.
>
> 51 35.029968 client: NLM V4 UNLOCK Call (Reply In 54) FH:0xcafa61cc svid:114 pos:0-0
> 52 35.030003 client: NLM V4 LOCK Call (Reply In 55) FH:0xcafa61cc svid:116 pos:0-0
> 53 35.030016 client: NLM V4 LOCK Call (Reply In 56) FH:0xcafa61cc svid:115 pos:0-0
> 54 35.030085 server: NLM V4 UNLOCK Reply (Call In 51) NLM_DENIED_GRACE_PERIOD
> 55 35.030126 server: NLM V4 LOCK Reply (Call In 52) NLM_DENIED_GRACE_PERIOD
> 56 35.030153 server: NLM V4 LOCK Reply (Call In 53) NLM_DENIED_GRACE_PERIOD
>
> The three contending client processes. I don't see a lock registration for
> svid:114, only UNLOCK calls which fail with NLM_DENIED_GRACE_PERIOD. The
> above goes on for a while. Neither the server or client shows any lock
> in /proc/locks at this point.
>
> 166 115.028376 client: NLM V4 LOCK Call (Reply In 168) FH:0xcafa61cc svid:115 pos:0-0
> 167 115.028394 client: NLM V4 LOCK Call (Reply In 169) FH:0xcafa61cc svid:116 pos:0-0
> 168 115.028440 server: NLM V4 LOCK Reply (Call In 166) NLM_DENIED_GRACE_PERIOD
> 169 115.028465 server: NLM V4 LOCK Reply (Call In 167) NLM_DENIED_GRACE_PERIOD
> 170 120.027233 client: NLM V4 UNLOCK Call (Reply In 171) FH:0xcafa61cc svid:114 pos:0-0
> 171 120.027337 server: NLM V4 UNLOCK Reply (Call In 170) NLM_DENIED_GRACE_PERIOD
> 172 120.028234 client: NLM V4 LOCK Call (Reply In 175) FH:0xcafa61cc svid:116 pos:0-0
> 173 120.028258 client: NLM V4 LOCK Call (Reply In 174) FH:0xcafa61cc svid:115 pos:0-0
> 174 120.030601 server: NLM V4 LOCK Reply (Call In 173)
> 175 120.030656 server: NLM V4 LOCK Reply (Call In 172) NLM_BLOCKED
>
> This doesn't add up. There hasn't been a successful unlock for svid:114
> (see #213 for that) but still one of the locks is granted.

Has the client attempted to recover the lock for svid:114? If not, then
the server has no knowledge of that lock.

> 176 120.030781 client: NLM V4 LOCK Call (Reply In 177) FH:0xcafa61cc svid:115 pos:0-0
> 177 120.030849 server: NLM V4 LOCK Reply (Call In 176)
>
> Strange: an identical lock request but with a different rpc xid (i.e. no
> packet duplication).

No. That would be the non-blocking lock that is intended as a 'ping' to
see if the server is still alive. It duplicates the blocking lock in all
details except that the 'block' flag is not set.
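
For context, the 'block' flag lives in the NLM LOCK call arguments. Roughly,
following the published NLM version 4 protocol description (the C rendering
and types here are approximations, not the kernel's internal structures):

#include <stdint.h>
#include <stdbool.h>

struct netobj { uint32_t len; unsigned char *data; };

struct nlm4_lock {
    char            *caller_name;  /* client hostname */
    struct netobj    fh;           /* file handle (0xcafa61cc above) */
    struct netobj    oh;           /* opaque lock owner handle */
    int32_t          svid;         /* the per-process id seen in the trace */
    uint64_t         l_offset;
    uint64_t         l_len;
};

struct nlm4_lockargs {
    struct netobj    cookie;
    bool             block;        /* set on the blocking request; clear on
                                    * the duplicate "ping" described above */
    bool             exclusive;    /* write lock */
    struct nlm4_lock alock;
    bool             reclaim;      /* set when reclaiming a lock in grace */
    int32_t          state;        /* client's NSM state number */
};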

> 178 120.031078 client: NFS V3 GETATTR Call (Reply In 179), FH:0xcafa61cc
> 179 120.031154 server: NFS V3 GETATTR Reply (Call In 178) Regular File mode:0644 uid:363 gid:1500
> 180 120.033973 client: NFS V3 ACCESS Call (Reply In 181), FH:0x0308030a
> 181 120.034030 server: NFS V3 ACCESS Reply (Call In 180)
> 182 120.034223 client: NFS V3 LOOKUP Call (Reply In 183), DH:0x0308030a/loc
> 183 120.034285 server: NFS V3 LOOKUP Reply (Call In 182), FH:0x81685ca0
> 184 120.034472 client: NFS V3 ACCESS Call (Reply In 185), FH:0x0308030c
> 185 120.034526 server: NFS V3 ACCESS Reply (Call In 184)
> 186 120.034722 client: NFS V3 ACCESS Call (Reply In 187), FH:0x0308030c
> 187 120.034776 server: NFS V3 ACCESS Reply (Call In 186)
> 188 120.034922 client: NFS V3 LOOKUP Call (Reply In 189), DH:0x0308030c/locktest
> 189 120.034993 server: NFS V3 LOOKUP Reply (Call In 188), FH:0xcafa61cc
> 190 120.035172 client: NFS V3 ACCESS Call (Reply In 191), FH:0xcafa61cc
> 191 120.035230 server: NFS V3 ACCESS Reply (Call In 190)
> 193 122.032218 client: NLM V4 UNLOCK Call (Reply In 195) FH:0xcafa61cc svid:115 pos:0-0
> 194 122.032253 client: NLM V4 LOCK Call (Reply In 197) FH:0xcafa61cc svid:119 pos:0-0
> 195 122.032343 server: NLM V4 UNLOCK Reply (Call In 193)
> 197 122.032794 server: NLM V4 LOCK Reply (Call In 194) NLM_BLOCKED
> 201 122.033767 server: NLM V4 GRANTED_MSG Call (Reply In 202) FH:0xcafa61cc svid:116 pos:0-0
> 202 122.034066 client: NLM V4 GRANTED_MSG Reply (Call In 201)
> 205 122.034665 client: NLM V4 GRANTED_RES Call (Reply In 206) NLM_DENIED
> 206 122.034753 server: NLM V4 GRANTED_RES Reply (Call In 205)

What happened here? Why did the client refuse the lock for svid 116?

Did the task get signalled? If so, where is the CANCEL request?

> 207 122.036312 client: NFS V3 GETATTR Call (Reply In 208), FH:0xcafa61cc
> 208 122.036394 server: NFS V3 GETATTR Reply (Call In 207) Regular File mode:0644 uid:363 gid:1500
> 209 122.036611 client: NLM V4 LOCK Call (Reply In 210) FH:0xcafa61cc svid:120 pos:0-0
> 210 122.036674 server: NLM V4 LOCK Reply (Call In 209) NLM_BLOCKED
> 213 125.027091 client: NLM V4 UNLOCK Call (Reply In 214) FH:0xcafa61cc svid:114 pos:0-0
> 214 125.027194 server: NLM V4 UNLOCK Reply (Call In 213)
> 215 125.029487 client: NFS V3 GETATTR Call (Reply In 216), FH:0xcafa61cc
> 216 125.029570 server: NFS V3 GETATTR Reply (Call In 215) Regular File mode:0644 uid:363 gid:1500
> 217 125.029836 client: NLM V4 LOCK Call (Reply In 218) FH:0xcafa61cc svid:121 pos:0-0
> 218 125.029895 server: NLM V4 LOCK Reply (Call In 217) NLM_BLOCKED
> 224 152.032157 client: NLM V4 LOCK Call (Reply In 225) FH:0xcafa61cc svid:119 pos:0-0
> 225 152.032283 server: NLM V4 LOCK Reply (Call In 224) NLM_BLOCKED
> 226 152.035103 client: NLM V4 LOCK Call (Reply In 227) FH:0xcafa61cc svid:120 pos:0-0
> 227 152.035157 server: NLM V4 LOCK Reply (Call In 226) NLM_BLOCKED
> 230 155.029676 client: NLM V4 LOCK Call (Reply In 231) FH:0xcafa61cc svid:121 pos:0-0
> 231 155.029761 server: NLM V4 LOCK Reply (Call In 230) NLM_BLOCKED
>
> To recap the problem: one of the fcntl calls to obtain a write lock
> returns
>
> lck: fcntl: No locks available
>
> shortly after the grace period expires. After that everything gets stuck,
> server holding a write lock with no corresponding client side lock.
>
>
> IMO looks like the client is to blame, even if/when the server
> should/could have accepted UNLOCK during grace (I don't know, I'm not
> an expert on that one).

Possibly... It depends entirely on what happened to cause it to deny the
GRANTED callback...

Trond


2009-02-12 15:36:37

by Frank van Maarseveen

Subject: Re: [NLM] 2.6.27.14 breakage when grace period expires

On Thu, Feb 12, 2009 at 10:16:29AM -0500, Trond Myklebust wrote:
> On Thu, 2009-02-12 at 15:28 +0100, Frank van Maarseveen wrote:
> > On Wed, Feb 11, 2009 at 03:39:48PM -0500, J. Bruce Fields wrote:
> > > On Wed, Feb 11, 2009 at 09:37:03PM +0100, Frank van Maarseveen wrote:
> > > > On Wed, Feb 11, 2009 at 03:35:55PM -0500, J. Bruce Fields wrote:
> > > > > On Wed, Feb 11, 2009 at 12:23:18PM +0100, Frank van Maarseveen wrote:
> > > > > > I'm sorry to inform you but... it seems that there is a similar problem
> > > > > > in the NLM subsystem as reported previously but this time it is triggered
> > > > > > when the grace time expires after a reboot.
> > > > > >
> > > > > > Client and server run 2.6.27.14 + previous fix, NFSv3.
> > > > > >
> > > > > > On the client there are three shells running:
> > > > > >
> > > > > > while :; do lck -w /mnt/foo 2; done
> > > > > >
> > > > > > The "lck" program is the same as posted before and it obtains an exclusive
> > > > > > write lock then waits 2 seconds in above invocation (there's probably an
> > > > > > "fcntl" command equivalent). After an orderly server reboot + grace time
> > > > >
> > > > > How are you rebooting the server?
> > > >
> > > > "reboot"
> > >
> > > Could you watch the nfs/nlm/nsm traffic on reboot and make sure that the
> > > server is actually sending the reboot notification to the client, and
> > > that the client is trying to reclaim? (Wireshark should make this all
> > > fairly clear. But capture the traffic with tcpdump -s0 -wtmp.pcap and
> > > send it to me if you're having trouble interpreting it.)
> >
> > I have a capture with comment below. It raised so many questions
> > that I decided to do some more testing, trying to figure out how
> > it looks when the locking works. This issue now appears to predate the
> > fuse changes and is also present when both client and server run
> > 2.6.24.4. I decided to stick with the traffic capture for 2.7.27.14 +
> > previous fix as discussed earlier. The full capture is available at
> > http://www.frankvm.com/tmp/2.6.27.14-nlm-grace.pcap. It's about 33k and
> > was started on the server as part of initscripts, right after the reboot
> > and filtered on client IP address.
> >
> > Exported by wireshark (filter: nfs or stat or nlm) and condensed:
> >
> > # time src prot
> > 1 0.000000 client: NFS V3 GETATTR Call (Reply In 42), FH:0x0308030a
> > 2 0.000018 client: NFS [RPC retransmission of #1]V3 GETATTR Call (Reply In 42), FH:0x0308030a
> > 5 0.000583 server: ICMP Destination unreachable (Port unreachable)
> > 6 0.000589 server: ICMP Destination unreachable (Port unreachable)
> > 7 1.891277 client: NFS [RPC retransmission of #1]V3 GETATTR Call (Reply In 42), FH:0x0308030a
> > 8 1.891320 server: ICMP Destination unreachable (Port unreachable)
> > 9 5.827053 client: NFS [RPC retransmission of #1]V3 GETATTR Call (Reply In 42), FH:0x0308030a
> > 10 5.827119 server: ICMP Destination unreachable (Port unreachable)
> > 11 14.626501 client: NFS [RPC retransmission of #1]V3 GETATTR Call (Reply In 42), FH:0x0308030a
> > 12 14.626587 server: ICMP Destination unreachable (Port unreachable)
> > 15 15.726426 client: NFS [RPC retransmission of #1]V3 GETATTR Call (Reply In 42), FH:0x0308030a
> > 16 15.726505 server: ICMP Destination unreachable (Port unreachable)
> > 17 17.926284 client: NFS [RPC retransmission of #1]V3 GETATTR Call (Reply In 42), FH:0x0308030a
> > 18 17.926368 server: ICMP Destination unreachable (Port unreachable)
> > 25 22.326006 client: NFS [RPC retransmission of #1]V3 GETATTR Call (Reply In 42), FH:0x0308030a
> > 26 22.326090 server: ICMP Destination unreachable (Port unreachable)
> > 35 30.022271 client: NLM V4 UNLOCK Call (Reply In 36) FH:0xcafa61cc svid:114 pos:0-0
> > 36 30.029511 server: NLM V4 UNLOCK Reply (Call In 35) NLM_DENIED_GRACE_PERIOD
> > 37 30.029660 client: NLM V4 LOCK Call (Reply In 39) FH:0xcafa61cc svid:116 pos:0-0
> > 38 30.029691 client: NLM V4 LOCK Call (Reply In 40) FH:0xcafa61cc svid:115 pos:0-0
> > 39 30.029884 server: NLM V4 LOCK Reply (Call In 37) NLM_DENIED_GRACE_PERIOD
> > 40 30.029914 server: NLM V4 LOCK Reply (Call In 38) NLM_DENIED_GRACE_PERIOD
> > 41 31.125403 client: NFS [RPC retransmission of #1]V3 GETATTR Call (Reply In 42), FH:0x0308030a
> > 42 31.127499 server: NFS V3 GETATTR Reply (Call In 1) Directory mode:0755 uid:0 gid:0
> > 43 31.127942 client: NFS V3 GETATTR Call (Reply In 45), FH:0x0308030a
> > 45 31.129378 server: NFS V3 GETATTR Reply (Call In 43) Directory mode:0755 uid:0 gid:0
> > 47 31.129958 server: STAT V1 NOTIFY Call (Reply In 48)
> > 48 31.130301 client: STAT V1 NOTIFY Reply (Call In 47)
> >
> > Reboot notification ok.
> >
> > 51 35.029968 client: NLM V4 UNLOCK Call (Reply In 54) FH:0xcafa61cc svid:114 pos:0-0
> > 52 35.030003 client: NLM V4 LOCK Call (Reply In 55) FH:0xcafa61cc svid:116 pos:0-0
> > 53 35.030016 client: NLM V4 LOCK Call (Reply In 56) FH:0xcafa61cc svid:115 pos:0-0
> > 54 35.030085 server: NLM V4 UNLOCK Reply (Call In 51) NLM_DENIED_GRACE_PERIOD
> > 55 35.030126 server: NLM V4 LOCK Reply (Call In 52) NLM_DENIED_GRACE_PERIOD
> > 56 35.030153 server: NLM V4 LOCK Reply (Call In 53) NLM_DENIED_GRACE_PERIOD
> >
> > The three contending client processes. I don't see a lock registration for
> > svid:114, only UNLOCK calls which fail with NLM_DENIED_GRACE_PERIOD. The
> > above goes on for a while. Neither the server or client shows any lock
> > in /proc/locks at this point.
> >
> > 166 115.028376 client: NLM V4 LOCK Call (Reply In 168) FH:0xcafa61cc svid:115 pos:0-0
> > 167 115.028394 client: NLM V4 LOCK Call (Reply In 169) FH:0xcafa61cc svid:116 pos:0-0
> > 168 115.028440 server: NLM V4 LOCK Reply (Call In 166) NLM_DENIED_GRACE_PERIOD
> > 169 115.028465 server: NLM V4 LOCK Reply (Call In 167) NLM_DENIED_GRACE_PERIOD
> > 170 120.027233 client: NLM V4 UNLOCK Call (Reply In 171) FH:0xcafa61cc svid:114 pos:0-0
> > 171 120.027337 server: NLM V4 UNLOCK Reply (Call In 170) NLM_DENIED_GRACE_PERIOD
> > 172 120.028234 client: NLM V4 LOCK Call (Reply In 175) FH:0xcafa61cc svid:116 pos:0-0
> > 173 120.028258 client: NLM V4 LOCK Call (Reply In 174) FH:0xcafa61cc svid:115 pos:0-0
> > 174 120.030601 server: NLM V4 LOCK Reply (Call In 173)
> > 175 120.030656 server: NLM V4 LOCK Reply (Call In 172) NLM_BLOCKED
> >
> > This doesn't add up. There hasn't been a successful unlock for svid:114
> > (see #213 for that) but still one of the locks is granted.
>
> Has the lock for svid:114 been attempted recovered by the client? If
> not, then the server has no knowledge of that lock.

Exactly. Apparently the client tries to unlock an unrecovered lock.

>
> > 176 120.030781 client: NLM V4 LOCK Call (Reply In 177) FH:0xcafa61cc svid:115 pos:0-0
> > 177 120.030849 server: NLM V4 LOCK Reply (Call In 176)
> >
> > Strange: an identical lock request but with a different rpc xid (i.e. no
> > packet duplication).
>
> No. That would be the non-blocking lock that is intended as a 'ping' to
> see if the server is still alive. It duplicates the blocking lock in all
> details except that the 'block' flag is not set.
>
> > 178 120.031078 client: NFS V3 GETATTR Call (Reply In 179), FH:0xcafa61cc
> > 179 120.031154 server: NFS V3 GETATTR Reply (Call In 178) Regular File mode:0644 uid:363 gid:1500
> > 180 120.033973 client: NFS V3 ACCESS Call (Reply In 181), FH:0x0308030a
> > 181 120.034030 server: NFS V3 ACCESS Reply (Call In 180)
> > 182 120.034223 client: NFS V3 LOOKUP Call (Reply In 183), DH:0x0308030a/loc
> > 183 120.034285 server: NFS V3 LOOKUP Reply (Call In 182), FH:0x81685ca0
> > 184 120.034472 client: NFS V3 ACCESS Call (Reply In 185), FH:0x0308030c
> > 185 120.034526 server: NFS V3 ACCESS Reply (Call In 184)
> > 186 120.034722 client: NFS V3 ACCESS Call (Reply In 187), FH:0x0308030c
> > 187 120.034776 server: NFS V3 ACCESS Reply (Call In 186)
> > 188 120.034922 client: NFS V3 LOOKUP Call (Reply In 189), DH:0x0308030c/locktest
> > 189 120.034993 server: NFS V3 LOOKUP Reply (Call In 188), FH:0xcafa61cc
> > 190 120.035172 client: NFS V3 ACCESS Call (Reply In 191), FH:0xcafa61cc
> > 191 120.035230 server: NFS V3 ACCESS Reply (Call In 190)
> > 193 122.032218 client: NLM V4 UNLOCK Call (Reply In 195) FH:0xcafa61cc svid:115 pos:0-0
> > 194 122.032253 client: NLM V4 LOCK Call (Reply In 197) FH:0xcafa61cc svid:119 pos:0-0
> > 195 122.032343 server: NLM V4 UNLOCK Reply (Call In 193)
> > 197 122.032794 server: NLM V4 LOCK Reply (Call In 194) NLM_BLOCKED
> > 201 122.033767 server: NLM V4 GRANTED_MSG Call (Reply In 202) FH:0xcafa61cc svid:116 pos:0-0
> > 202 122.034066 client: NLM V4 GRANTED_MSG Reply (Call In 201)
> > 205 122.034665 client: NLM V4 GRANTED_RES Call (Reply In 206) NLM_DENIED
> > 206 122.034753 server: NLM V4 GRANTED_RES Reply (Call In 205)
>
> What happened here? Why did the client refuse the lock for svid 116?
>
> Did the task get signalled? If so, where is the CANCEL request?

The task did not get signaled; there is no CANCEL.

>
> > 207 122.036312 client: NFS V3 GETATTR Call (Reply In 208), FH:0xcafa61cc
> > 208 122.036394 server: NFS V3 GETATTR Reply (Call In 207) Regular File mode:0644 uid:363 gid:1500
> > 209 122.036611 client: NLM V4 LOCK Call (Reply In 210) FH:0xcafa61cc svid:120 pos:0-0
> > 210 122.036674 server: NLM V4 LOCK Reply (Call In 209) NLM_BLOCKED
> > 213 125.027091 client: NLM V4 UNLOCK Call (Reply In 214) FH:0xcafa61cc svid:114 pos:0-0
> > 214 125.027194 server: NLM V4 UNLOCK Reply (Call In 213)
> > 215 125.029487 client: NFS V3 GETATTR Call (Reply In 216), FH:0xcafa61cc
> > 216 125.029570 server: NFS V3 GETATTR Reply (Call In 215) Regular File mode:0644 uid:363 gid:1500
> > 217 125.029836 client: NLM V4 LOCK Call (Reply In 218) FH:0xcafa61cc svid:121 pos:0-0
> > 218 125.029895 server: NLM V4 LOCK Reply (Call In 217) NLM_BLOCKED
> > 224 152.032157 client: NLM V4 LOCK Call (Reply In 225) FH:0xcafa61cc svid:119 pos:0-0
> > 225 152.032283 server: NLM V4 LOCK Reply (Call In 224) NLM_BLOCKED
> > 226 152.035103 client: NLM V4 LOCK Call (Reply In 227) FH:0xcafa61cc svid:120 pos:0-0
> > 227 152.035157 server: NLM V4 LOCK Reply (Call In 226) NLM_BLOCKED
> > 230 155.029676 client: NLM V4 LOCK Call (Reply In 231) FH:0xcafa61cc svid:121 pos:0-0
> > 231 155.029761 server: NLM V4 LOCK Reply (Call In 230) NLM_BLOCKED
> >
> > To recap the problem: one of the fcntl calls to obtain a write lock
> > returns
> >
> > lck: fcntl: No locks available
> >
> > shortly after the grace period expires. After that everything gets stuck,
> > server holding a write lock with no corresponding client side lock.
> >
> >
> > IMO looks like the client is to blame, even if/when the server
> > should/could have accepted UNLOCK during grace (I don't know, I'm not
> > an expert on that one).
>
> Possibly... It depends entirely on what happened to cause it to deny the
> GRANTED callback...

A little theorizing: if the unlock of a not-yet-recovered lock has kept
failing up to that point, then the client must still remember the lock
somehow. That might explain the secondary error when a conflicting lock
is granted by the server.

--
Frank

2009-02-12 18:17:33

by Trond Myklebust

[permalink] [raw]
Subject: Re: [NLM] 2.6.27.14 breakage when grace period expires

On Thu, 2009-02-12 at 16:36 +0100, Frank van Maarseveen wrote:
> A little theorizing:
> If the unlock of a yet unrecovered lock has failed up to that point then
> the client sure must remember the lock somehow. That might explain the
> secondary error when a conflicting lock is granted by the server.

Sorry, but that doesn't hold water. The client will release the VFS
'mirror' of the lock before it attempts to unlock. Otherwise, you could
have some nasty races between the unlock thread and the recovery
thread...
Besides, the granted callback handler on the client only checks the list
of blocked locks for a match.

Oh, bugger, I know what this is... It's the same thing that happened to
the NFSv4 callback server. If you compile with CONFIG_IPV6 or
CONFIG_IPV6_MODULE enabled, and also set CONFIG_SUNRPC_REGISTER_V4, then
the NLM server will listen on an IPv6 socket, and so the RPC requests
come in with their IPv4 addresses mapped into the IPv6 namespace.
The client, on the other hand, is using an IPv4 socket, 'cos you
specified an IPv4 address to the mount command.
The result is that the call to nlm_cmp_addr() in nlmclnt_grant() always
fails...

Basically, we need to replace nlm_cmp_addr() with something akin to
nfs_sockaddr_match_ipaddr(), which will compare v4 mapped addresses.
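
A rough userspace illustration of that kind of comparison, treating an IPv4
address and its v4-mapped IPv6 form (::ffff:a.b.c.d) as equal. This is a
sketch only, not the kernel's nlm_cmp_addr() or nfs_sockaddr_match_ipaddr():

#include <string.h>
#include <stdbool.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* Compare two socket addresses; an AF_INET address and the corresponding
 * v4-mapped AF_INET6 address are considered equal. Ports are ignored. */
static bool addr_match(const struct sockaddr *a, const struct sockaddr *b)
{
    if (a->sa_family == AF_INET6 && b->sa_family == AF_INET) {
        const struct sockaddr *tmp = a; a = b; b = tmp;
    }
    if (a->sa_family == AF_INET && b->sa_family == AF_INET6) {
        const struct sockaddr_in *sin = (const struct sockaddr_in *)a;
        const struct sockaddr_in6 *sin6 = (const struct sockaddr_in6 *)b;

        if (!IN6_IS_ADDR_V4MAPPED(&sin6->sin6_addr))
            return false;
        /* the last 4 bytes of a v4-mapped address hold the IPv4 address */
        return memcmp(&sin6->sin6_addr.s6_addr[12], &sin->sin_addr, 4) == 0;
    }
    if (a->sa_family != b->sa_family)
        return false;
    if (a->sa_family == AF_INET) {
        const struct sockaddr_in *x = (const struct sockaddr_in *)a;
        const struct sockaddr_in *y = (const struct sockaddr_in *)b;
        return x->sin_addr.s_addr == y->sin_addr.s_addr;
    }
    if (a->sa_family == AF_INET6) {
        const struct sockaddr_in6 *x = (const struct sockaddr_in6 *)a;
        const struct sockaddr_in6 *y = (const struct sockaddr_in6 *)b;
        return memcmp(&x->sin6_addr, &y->sin6_addr,
                      sizeof(x->sin6_addr)) == 0;
    }
    return false;
}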

The workaround should be simply to turn off CONFIG_SUNRPC_REGISTER_V4 if
you're not planning on ever using NFS-over-IPv6...

Cheers
Trond


2009-02-12 18:29:45

by Frank van Maarseveen

Subject: Re: [NLM] 2.6.27.14 breakage when grace period expires

On Thu, Feb 12, 2009 at 01:17:27PM -0500, Trond Myklebust wrote:
> On Thu, 2009-02-12 at 16:36 +0100, Frank van Maarseveen wrote:
> > A little theorizing:
> > If the unlock of a yet unrecovered lock has failed up to that point then
> > the client sure must remember the lock somehow. That might explain the
> > secondary error when a conflicting lock is granted by the server.
>
> Sorry, but that doesn't hold water. The client will release the VFS
> 'mirror' of the lock before it attempts to unlock. Otherwise, you could
> have some nasty races between the unlock thread and the recovery
> thread...
> Besides, the granted callback handler on the client only checks the list
> of blocked locks for a match.

ok, then we have more than one NLM bug to resolve.

>
> Oh, bugger, I know what this is... It's the same thing that happened to
> the NFSv4 callback server. If you compile with CONFIG_IPV6 or
> CONFIG_IPV6_MODULE enabled, and also set CONFIG_SUNRPC_REGISTER_V4, then
> the NLM server will listen on an IPv6 socket, and so the RPC request
> come in with their IPv4 address mapped into the IPv6 namespace.

Nope:

$ zgrep IPV6 /proc/config.gz
# CONFIG_IPV6 is not set
$ zgrep SUNRPC /proc/config.gz
CONFIG_SUNRPC=y
CONFIG_SUNRPC_GSS=y
# CONFIG_SUNRPC_BIND34 is not set


And remember this is not a recent regression.

--
Frank

2009-02-12 19:10:45

by Trond Myklebust

Subject: Re: [NLM] 2.6.27.14 breakage when grace period expires

On Thu, 2009-02-12 at 19:29 +0100, Frank van Maarseveen wrote:
> On Thu, Feb 12, 2009 at 01:17:27PM -0500, Trond Myklebust wrote:
> > On Thu, 2009-02-12 at 16:36 +0100, Frank van Maarseveen wrote:
> > > A little theorizing:
> > > If the unlock of a yet unrecovered lock has failed up to that point then
> > > the client sure must remember the lock somehow. That might explain the
> > > secondary error when a conflicting lock is granted by the server.
> >
> > Sorry, but that doesn't hold water. The client will release the VFS
> > 'mirror' of the lock before it attempts to unlock. Otherwise, you could
> > have some nasty races between the unlock thread and the recovery
> > thread...
> > Besides, the granted callback handler on the client only checks the list
> > of blocked locks for a match.
>
> ok, then we have more than one NLM bug to resolve.
>
> >
> > Oh, bugger, I know what this is... It's the same thing that happened to
> > the NFSv4 callback server. If you compile with CONFIG_IPV6 or
> > CONFIG_IPV6_MODULE enabled, and also set CONFIG_SUNRPC_REGISTER_V4, then
> > the NLM server will listen on an IPv6 socket, and so the RPC request
> > come in with their IPv4 address mapped into the IPv6 namespace.
>
> Nope:
>
> $ zgrep IPV6 /proc/config.gz
> # CONFIG_IPV6 is not set
> $ zgrep SUNRPC /proc/config.gz
> CONFIG_SUNRPC=y
> CONFIG_SUNRPC_GSS=y
> # CONFIG_SUNRPC_BIND34 is not set

Sorry, yes... 2.6.27.x should be OK. The lockd v4mapped addresses bug is
specific to 2.6.29. Chuck, are you planning on fixing this before
2.6.29-final comes out?

> And remember this is not a recent regression.

It would help if you sent us the full binary tcpdump, instead of just
the summary. That should enable us to figure out which of the tests is
failing in nlmclnt_grant().

Trond


2009-02-12 19:16:11

by Frank van Maarseveen

Subject: Re: [NLM] 2.6.27.14 breakage when grace period expires

On Thu, Feb 12, 2009 at 02:10:37PM -0500, Trond Myklebust wrote:
> On Thu, 2009-02-12 at 19:29 +0100, Frank van Maarseveen wrote:
> > On Thu, Feb 12, 2009 at 01:17:27PM -0500, Trond Myklebust wrote:
> > > On Thu, 2009-02-12 at 16:36 +0100, Frank van Maarseveen wrote:
> > > > A little theorizing:
> > > > If the unlock of a yet unrecovered lock has failed up to that point then
> > > > the client sure must remember the lock somehow. That might explain the
> > > > secondary error when a conflicting lock is granted by the server.
> > >
> > > Sorry, but that doesn't hold water. The client will release the VFS
> > > 'mirror' of the lock before it attempts to unlock. Otherwise, you could
> > > have some nasty races between the unlock thread and the recovery
> > > thread...
> > > Besides, the granted callback handler on the client only checks the list
> > > of blocked locks for a match.
> >
> > ok, then we have more than one NLM bug to resolve.
> >
> > >
> > > Oh, bugger, I know what this is... It's the same thing that happened to
> > > the NFSv4 callback server. If you compile with CONFIG_IPV6 or
> > > CONFIG_IPV6_MODULE enabled, and also set CONFIG_SUNRPC_REGISTER_V4, then
> > > the NLM server will listen on an IPv6 socket, and so the RPC request
> > > come in with their IPv4 address mapped into the IPv6 namespace.
> >
> > Nope:
> >
> > $ zgrep IPV6 /proc/config.gz
> > # CONFIG_IPV6 is not set
> > $ zgrep SUNRPC /proc/config.gz
> > CONFIG_SUNRPC=y
> > CONFIG_SUNRPC_GSS=y
> > # CONFIG_SUNRPC_BIND34 is not set
>
> Sorry, yes... 2.6.27.x should be OK. The lockd v4mapped addresses bug is
> specific to 2.6.29. Chuck, are you planning on fixing this before
> 2.6.29-final comes out?
>
> > And remember this is not a recent regression.
>
> It would help if you sent us the full binary tcpdump, instead of just
> the summary. That should enable us to figure out which of the tests is
> failing in nlmclnt_grant().

I posted the link already. Anyway, see attachment.

--
Frank


Attachments:
2.6.27.14-nlm-grace.pcap (32.92 kB)

2009-02-12 19:35:38

by Chuck Lever

Subject: Re: [NLM] 2.6.27.14 breakage when grace period expires

On Feb 12, 2009, at 2:10 PM, Trond Myklebust wrote:
> On Thu, 2009-02-12 at 19:29 +0100, Frank van Maarseveen wrote:
>> On Thu, Feb 12, 2009 at 01:17:27PM -0500, Trond Myklebust wrote:
>>> On Thu, 2009-02-12 at 16:36 +0100, Frank van Maarseveen wrote:
>>>> A little theorizing:
>>>> If the unlock of a yet unrecovered lock has failed up to that
>>>> point then
>>>> the client sure must remember the lock somehow. That might
>>>> explain the
>>>> secondary error when a conflicting lock is granted by the server.
>>>
>>> Sorry, but that doesn't hold water. The client will release the VFS
>>> 'mirror' of the lock before it attempts to unlock. Otherwise, you
>>> could
>>> have some nasty races between the unlock thread and the recovery
>>> thread...
>>> Besides, the granted callback handler on the client only checks
>>> the list
>>> of blocked locks for a match.
>>
>> ok, then we have more than one NLM bug to resolve.
>>
>>>
>>> Oh, bugger, I know what this is... It's the same thing that
>>> happened to
>>> the NFSv4 callback server. If you compile with CONFIG_IPV6 or
>>> CONFIG_IPV6_MODULE enabled, and also set
>>> CONFIG_SUNRPC_REGISTER_V4, then
>>> the NLM server will listen on an IPv6 socket, and so the RPC request
>>> come in with their IPv4 address mapped into the IPv6 namespace.
>>
>> Nope:
>>
>> $ zgrep IPV6 /proc/config.gz
>> # CONFIG_IPV6 is not set
>> $ zgrep SUNRPC /proc/config.gz
>> CONFIG_SUNRPC=y
>> CONFIG_SUNRPC_GSS=y
>> # CONFIG_SUNRPC_BIND34 is not set
>
> Sorry, yes... 2.6.27.x should be OK. The lockd v4mapped addresses
> bug is
> specific to 2.6.29. Chuck, are you planning on fixing this before
> 2.6.29-final comes out?

I wasn't sure exactly where the compared addresses came from. I had
assumed that they all came through the listener, so we wouldn't need
this kind of translation. It shouldn't be difficult to map addresses
passed in via nlmclnt_init() to AF_INET6.

But this is the kind of thing that makes "falling back" to an AF_INET
listener a little challenging. We will have to record what flavor the
listener is and do a translation depending on what listener family was
actually created.

--
Chuck Lever
chuck[dot]lever[at]oracle[dot]com

2009-02-12 19:43:22

by Trond Myklebust

Subject: Re: [NLM] 2.6.27.14 breakage when grace period expires

On Thu, 2009-02-12 at 14:35 -0500, Chuck Lever wrote:
> I wasn't sure exactly where the compared addresses came from. I had
> assumed that they all came through the listener, so we wouldn't need
> this kind of translation. It shouldn't be difficult to map addresses
> passed in via nlmclnt_init() to AF_INET6.
>
> But this is the kind of thing that makes "falling back" to an AF_INET
> listener a little challenging. We will have to record what flavor the
> listener is and do a translation depending on what listener family was
> actually created.

Why? Should we care whether we're receiving IPv4 addresses or IPv6
v4-mapped addresses? They're the same thing...

We're already doing the mapping for the NFSv4 callback channel. See
nfs_sockaddr_match_ipaddr() in fs/nfs/client.c

Trond


2009-02-12 20:11:58

by Chuck Lever

Subject: Re: [NLM] 2.6.27.14 breakage when grace period expires

On Feb 12, 2009, at 2:43 PM, Trond Myklebust wrote:
> On Thu, 2009-02-12 at 14:35 -0500, Chuck Lever wrote:
>> I wasn't sure exactly where the compared addresses came from. I had
>> assumed that they all came through the listener, so we wouldn't need
>> this kind of translation. It shouldn't be difficult to map addresses
>> passed in via nlmclnt_init() to AF_INET6.
>>
>> But this is the kind of thing that makes "falling back" to an AF_INET
>> listener a little challenging. We will have to record what flavor
>> the
>> listener is and do a translation depending on what listener family
>> was
>> actually created.
>
> Why? Should we care whether we're receiving IPv4 addresses or IPv6
> v4-mapped addresses? They're the same thing...

The problem is that the listener family is now decided at run-time. If an
AF_INET6 listener can't be created, an AF_INET listener is created
instead, even if CONFIG_IPV6 || CONFIG_IPV6_MODULE is enabled. If an
AF_INET listener is created, we get only IPv4 addresses in
svc_rqst->rq_addr.

So we can do it either way. Taking lockd as an example:

1. Have nlmclnt_init() map AF_INET mount addresses to AF_INET6 iff
the lockd listener is AF_INET6, so nlm_cmp_addr() is always dealing
with AF_INET6 in this case, or

2. If CONFIG_IPV6 || CONFIG_IPV6_MODULE, unconditionally map AF_INET
addresses in nlmclnt_init and for incoming NLM requests (when lockd
happens to have fallen back to an AF_INET listener)

Personally I think solution 1. will be less confusing operationally
and less invasive code-wise. I suppose IPv6 purists would prefer
keeping the whole stack in AF_INET6, so they would like solution 2.

Eventually we could map incoming addresses on AF_INET listeners in the
RPC server code, but I prefer to wait until all kernel RPC services
have IPv6 support.
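
A sketch of the mapping step solution 1 describes, converting an AF_INET
mount address into its v4-mapped AF_INET6 equivalent; the function name and
placement are illustrative, not the actual nlmclnt_init() change:

#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* Build the v4-mapped AF_INET6 form (::ffff:a.b.c.d) of an AF_INET
 * address, preserving the port. Illustrative userspace version only. */
static void map_ipv4_to_ipv6(const struct sockaddr_in *sin,
                             struct sockaddr_in6 *sin6)
{
    memset(sin6, 0, sizeof(*sin6));
    sin6->sin6_family = AF_INET6;
    sin6->sin6_port = sin->sin_port;
    /* the ::ffff:0:0/96 prefix, then the IPv4 address */
    sin6->sin6_addr.s6_addr[10] = 0xff;
    sin6->sin6_addr.s6_addr[11] = 0xff;
    memcpy(&sin6->sin6_addr.s6_addr[12], &sin->sin_addr, 4);
}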

Since 2.6.29 has the CONFIG_SUNRPC_REGISTER_V4=N workaround, do we
need to fix 2.6.29, or can this wait until 2.6.30?

--
Chuck Lever
chuck[dot]lever[at]oracle[dot]com

2009-02-12 20:24:18

by Trond Myklebust

Subject: Re: [NLM] 2.6.27.14 breakage when grace period expires

On Thu, 2009-02-12 at 20:16 +0100, Frank van Maarseveen wrote:
> On Thu, Feb 12, 2009 at 02:10:37PM -0500, Trond Myklebust wrote:
> > On Thu, 2009-02-12 at 19:29 +0100, Frank van Maarseveen wrote:
> > > On Thu, Feb 12, 2009 at 01:17:27PM -0500, Trond Myklebust wrote:
> > > > On Thu, 2009-02-12 at 16:36 +0100, Frank van Maarseveen wrote:
> > > > > A little theorizing:
> > > > > If the unlock of a yet unrecovered lock has failed up to that point then
> > > > > the client sure must remember the lock somehow. That might explain the
> > > > > secondary error when a conflicting lock is granted by the server.
> > > >
> > > > Sorry, but that doesn't hold water. The client will release the VFS
> > > > 'mirror' of the lock before it attempts to unlock. Otherwise, you could
> > > > have some nasty races between the unlock thread and the recovery
> > > > thread...
> > > > Besides, the granted callback handler on the client only checks the list
> > > > of blocked locks for a match.
> > >
> > > ok, then we have more than one NLM bug to resolve.
> > >
> > > >
> > > > Oh, bugger, I know what this is... It's the same thing that happened to
> > > > the NFSv4 callback server. If you compile with CONFIG_IPV6 or
> > > > CONFIG_IPV6_MODULE enabled, and also set CONFIG_SUNRPC_REGISTER_V4, then
> > > > the NLM server will listen on an IPv6 socket, and so the RPC request
> > > > come in with their IPv4 address mapped into the IPv6 namespace.
> > >
> > > Nope:
> > >
> > > $ zgrep IPV6 /proc/config.gz
> > > # CONFIG_IPV6 is not set
> > > $ zgrep SUNRPC /proc/config.gz
> > > CONFIG_SUNRPC=y
> > > CONFIG_SUNRPC_GSS=y
> > > # CONFIG_SUNRPC_BIND34 is not set
> >
> > Sorry, yes... 2.6.27.x should be OK. The lockd v4mapped addresses bug is
> > specific to 2.6.29. Chuck, are you planning on fixing this before
> > 2.6.29-final comes out?
> >
> > > And remember this is not a recent regression.
> >
> > It would help if you sent us the full binary tcpdump, instead of just
> > the summary. That should enable us to figure out which of the tests is
> > failing in nlmclnt_grant().
>
> I posted the link already. Anyway, see attachment.

Yeah... It looks alright. The one thing that looks a bit odd is that the
GRANTED lock has a 'caller_name' field set to the name of the server. I'm
pretty sure we don't care about that, though...

Hmm... I wonder if the problem isn't just that we're failing to cancel
the lock request when the process is signalled. Can you try the
following patch?

--------------------------------------------------------------------
From: Trond Myklebust <[email protected]>
NLM/lockd: Always cancel blocked locks when exiting early from nlmclnt_lock

Signed-off-by: Trond Myklebust <[email protected]>
---

fs/lockd/clntproc.c | 9 +++++++--
1 files changed, 7 insertions(+), 2 deletions(-)


diff --git a/fs/lockd/clntproc.c b/fs/lockd/clntproc.c
index 31668b6..f956d1e 100644
--- a/fs/lockd/clntproc.c
+++ b/fs/lockd/clntproc.c
@@ -542,9 +542,14 @@ again:
 		status = nlmclnt_call(cred, req, NLMPROC_LOCK);
 		if (status < 0)
 			break;
-		/* Did a reclaimer thread notify us of a server reboot? */
-		if (resp->status == nlm_lck_denied_grace_period)
+		/* Is the server in a grace period state?
+		 * If so, we need to reset the resp->status, and
+		 * retry...
+		 */
+		if (resp->status == nlm_lck_denied_grace_period) {
+			resp->status = nlm_lck_blocked;
 			continue;
+		}
 		if (resp->status != nlm_lck_blocked)
 			break;
 		/* Wait on an NLM blocking lock */



2009-02-12 20:27:43

by Trond Myklebust

[permalink] [raw]
Subject: Re: [NLM] 2.6.27.14 breakage when grace period expires

On Thu, 2009-02-12 at 15:11 -0500, Chuck Lever wrote:
> On Feb 12, 2009, at 2:43 PM, Trond Myklebust wrote:
> > On Thu, 2009-02-12 at 14:35 -0500, Chuck Lever wrote:
> >> I wasn't sure exactly where the compared addresses came from. I had
> >> assumed that they all came through the listener, so we wouldn't need
> >> this kind of translation. It shouldn't be difficult to map addresses
> >> passed in via nlmclnt_init() to AF_INET6.
> >>
> >> But this is the kind of thing that makes "falling back" to an AF_INET
> >> listener a little challenging. We will have to record what flavor
> >> the
> >> listener is and do a translation depending on what listener family
> >> was
> >> actually created.
> >
> > Why? Should we care whether we're receiving IPv4 addresses or IPv6
> > v4-mapped addresses? They're the same thing...
>
> The problem is the listener family is now decided at run-time. If an
> AF_INET6 listener can't be created, an AF_INET listener is created
> instead, even if CONFIG_IPV6 || CONFIG_IPV6_MODULE is enabled. If an
> AF_INET listener is created, we get only IPv4 addresses in svc_rqst-
> >rq_addr.

You're missing my point. Why should we care if it's one or the other? In
the NFSv4 case, we v4map all IPv4 addresses _unconditionally_ if it
turns out that CONFIG_IPV6 is enabled.

IOW: we always compare IPv6 addresses.
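
For readers following along: a v4-mapped address is just an IPv4 address
embedded in the ::ffff:0:0/96 prefix, so an IPv4 peer and its v4-mapped
form name the same endpoint. A minimal sketch of the conversion, written
against ordinary socket headers rather than the actual kernel helpers
(the function name here is made up for illustration):

#include <string.h>
#include <netinet/in.h>

/* Embed an IPv4 sockaddr_in in a v4-mapped sockaddr_in6 (::ffff:a.b.c.d). */
static void map_ipv4_to_ipv6(const struct sockaddr_in *sin,
                             struct sockaddr_in6 *sin6)
{
        memset(sin6, 0, sizeof(*sin6));
        sin6->sin6_family = AF_INET6;
        sin6->sin6_port = sin->sin_port;
        /* Bytes 0-9 are zero, bytes 10-11 are 0xff, bytes 12-15 hold the IPv4 address. */
        sin6->sin6_addr.s6_addr[10] = 0xff;
        sin6->sin6_addr.s6_addr[11] = 0xff;
        memcpy(&sin6->sin6_addr.s6_addr[12], &sin->sin_addr, 4);
}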

Trond


2009-02-12 20:43:28

by Chuck Lever

[permalink] [raw]
Subject: Re: [NLM] 2.6.27.14 breakage when grace period expires

On Feb 12, 2009, at 3:27 PM, Trond Myklebust wrote:
> On Thu, 2009-02-12 at 15:11 -0500, Chuck Lever wrote:
>> On Feb 12, 2009, at 2:43 PM, Trond Myklebust wrote:
>>> On Thu, 2009-02-12 at 14:35 -0500, Chuck Lever wrote:
>>>> I wasn't sure exactly where the compared addresses came from. I
>>>> had
>>>> assumed that they all came through the listener, so we wouldn't
>>>> need
>>>> this kind of translation. It shouldn't be difficult to map
>>>> addresses
>>>> passed in via nlmclnt_init() to AF_INET6.
>>>>
>>>> But this is the kind of thing that makes "falling back" to an
>>>> AF_INET
>>>> listener a little challenging. We will have to record what flavor
>>>> the
>>>> listener is and do a translation depending on what listener family
>>>> was
>>>> actually created.
>>>
>>> Why? Should we care whether we're receiving IPv4 addresses or IPv6
>>> v4-mapped addresses? They're the same thing...
>>
>> The problem is the listener family is now decided at run-time. If an
>> AF_INET6 listener can't be created, an AF_INET listener is created
>> instead, even if CONFIG_IPV6 || CONFIG_IPV6_MODULE is enabled. If an
>> AF_INET listener is created, we get only IPv4 addresses in svc_rqst-
>>> rq_addr.
>
> You're missing my point. Why should we care if it's one or the
> other? In
> the NFSv4 case, we v4map all IPv4 addresses _unconditionally_ if it
> turns out that CONFIG_IPV6 is enabled.
>
> IOW: we always compare IPv6 addresses.

The reason we might care in this case is that nlm_cmp_addr() is executed
more frequently than nfs_sockaddr_match_ipaddr().

Mapping the server address in nlmclnt_init() means we translate the
server address once and are done with it. We never have to map
incoming AF_INET addresses in NLM requests, and we don't have the
extra conditionals every time we go through nlm_cmp_addr().

This keeps nlm_cmp_addr() as simple as it can be: it compares only two
AF_INET addresses or two AF_INET6 addresses.
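
As a sketch of the same-family comparison described above (illustrative
only; the real nlm_cmp_addr() in fs/lockd may differ in detail):

#include <string.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Compare two addresses, assuming callers only ever pass matching families. */
static int cmp_addr(const struct sockaddr *a, const struct sockaddr *b)
{
        if (a->sa_family != b->sa_family)
                return 0;
        switch (a->sa_family) {
        case AF_INET:
                return ((const struct sockaddr_in *)a)->sin_addr.s_addr ==
                       ((const struct sockaddr_in *)b)->sin_addr.s_addr;
        case AF_INET6:
                return memcmp(&((const struct sockaddr_in6 *)a)->sin6_addr,
                              &((const struct sockaddr_in6 *)b)->sin6_addr,
                              sizeof(struct in6_addr)) == 0;
        }
        return 0;
}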

--
Chuck Lever
chuck[dot]lever[at]oracle[dot]com

2009-02-12 20:54:50

by Trond Myklebust

[permalink] [raw]
Subject: Re: [NLM] 2.6.27.14 breakage when grace period expires

On Thu, 2009-02-12 at 15:43 -0500, Chuck Lever wrote:
> On Feb 12, 2009, at 3:27 PM, Trond Myklebust wrote:
> > On Thu, 2009-02-12 at 15:11 -0500, Chuck Lever wrote:
> >> On Feb 12, 2009, at 2:43 PM, Trond Myklebust wrote:
> >>> On Thu, 2009-02-12 at 14:35 -0500, Chuck Lever wrote:
> >>>> I wasn't sure exactly where the compared addresses came from. I
> >>>> had
> >>>> assumed that they all came through the listener, so we wouldn't
> >>>> need
> >>>> this kind of translation. It shouldn't be difficult to map
> >>>> addresses
> >>>> passed in via nlmclnt_init() to AF_INET6.
> >>>>
> >>>> But this is the kind of thing that makes "falling back" to an
> >>>> AF_INET
> >>>> listener a little challenging. We will have to record what flavor
> >>>> the
> >>>> listener is and do a translation depending on what listener family
> >>>> was
> >>>> actually created.
> >>>
> >>> Why? Should we care whether we're receiving IPv4 addresses or IPv6
> >>> v4-mapped addresses? They're the same thing...
> >>
> >> The problem is the listener family is now decided at run-time. If an
> >> AF_INET6 listener can't be created, an AF_INET listener is created
> >> instead, even if CONFIG_IPV6 || CONFIG_IPV6_MODULE is enabled. If an
> >> AF_INET listener is created, we get only IPv4 addresses in svc_rqst-
> >>> rq_addr.
> >
> > You're missing my point. Why should we care if it's one or the
> > other? In
> > the NFSv4 case, we v4map all IPv4 addresses _unconditionally_ if it
> > turns out that CONFIG_IPV6 is enabled.
> >
> > IOW: we always compare IPv6 addresses.
>
> The reason we might care in this case is nlm_cmp_addr() is executed
> more frequently than nfs_sockaddr_match_ipaddr().
>
> Mapping the server address in nlmclnt_init() means we translate the
> server address once and are done with it. We never have to map
> incoming AF_INET addresses in NLM requests, and we don't have the
> extra conditionals every time we go through nlm_cmp_addr().
>
> This keeps nlm_cmp_addr() as simple as it can be: it compares only two
> AF_INET addresses or two AF_INET6 addresses.

I don't see how that changes the general principle. All it means is that
you should be caching v4-mapped addresses instead of IPv4 addresses.
That would allow you to simplify nlm_cmp_addr() even further...

Trond


2009-02-12 21:44:08

by Chuck Lever

[permalink] [raw]
Subject: Re: [NLM] 2.6.27.14 breakage when grace period expires

On Feb 12, 2009, at 3:54 PM, Trond Myklebust wrote:
> On Thu, 2009-02-12 at 15:43 -0500, Chuck Lever wrote:
>> On Feb 12, 2009, at 3:27 PM, Trond Myklebust wrote:
>>> On Thu, 2009-02-12 at 15:11 -0500, Chuck Lever wrote:
>>>> On Feb 12, 2009, at 2:43 PM, Trond Myklebust wrote:
>>>>> On Thu, 2009-02-12 at 14:35 -0500, Chuck Lever wrote:
>>>>>> I wasn't sure exactly where the compared addresses came from. I
>>>>>> had
>>>>>> assumed that they all came through the listener, so we wouldn't
>>>>>> need
>>>>>> this kind of translation. It shouldn't be difficult to map
>>>>>> addresses
>>>>>> passed in via nlmclnt_init() to AF_INET6.
>>>>>>
>>>>>> But this is the kind of thing that makes "falling back" to an
>>>>>> AF_INET
>>>>>> listener a little challenging. We will have to record what
>>>>>> flavor
>>>>>> the
>>>>>> listener is and do a translation depending on what listener
>>>>>> family
>>>>>> was
>>>>>> actually created.
>>>>>
>>>>> Why? Should we care whether we're receiving IPv4 addresses or IPv6
>>>>> v4-mapped addresses? They're the same thing...
>>>>
>>>> The problem is the listener family is now decided at run-time.
>>>> If an
>>>> AF_INET6 listener can't be created, an AF_INET listener is created
>>>> instead, even if CONFIG_IPV6 || CONFIG_IPV6_MODULE is enabled.
>>>> If an
>>>> AF_INET listener is created, we get only IPv4 addresses in
>>>> svc_rqst-
>>>>> rq_addr.
>>>
>>> You're missing my point. Why should we care if it's one or the
>>> other? In
>>> the NFSv4 case, we v4map all IPv4 addresses _unconditionally_ if it
>>> turns out that CONFIG_IPV6 is enabled.
>>>
>>> IOW: we always compare IPv6 addresses.
>>
>> The reason we might care in this case is nlm_cmp_addr() is executed
>> more frequently than nfs_sockaddr_match_ipaddr().
>>
>> Mapping the server address in nlmclnt_init() means we translate the
>> server address once and are done with it. We never have to map
>> incoming AF_INET addresses in NLM requests, and we don't have the
>> extra conditionals every time we go through nlm_cmp_addr().
>>
>> This keeps nlm_cmp_addr() as simple as it can be: it compares only
>> two
>> AF_INET addresses or two AF_INET6 addresses.
>
> I don't see how that changes the general principle. All it means is
> that
> you should be caching v4 mapped addresses instead of ipv4 addresses.
> That would allow you to simplify nlm_cmp_addr() even further...

Operationally we have to support both AF_INET and AF_INET6 addresses
in the cache, because we don't know what kind of lockd listener can be
created until runtime. So, I can't see how we can eliminate the
AF_INET arm in nlm_cmp_addr() unless we unconditionally convert all
incoming AF_INET addresses from putative PF_INET listeners _and_
convert incoming IPv4 server addresses in NFS mount requests to
AF_INET6.

Doesn't that add computational overhead to a fairly common case?

This goes away if we ensure that the address family of the server
address passed to nlmclnt_lookup_host() always matches the protocol
family of lockd's listener sockets. Then address mapping overhead is
entirely removed from the common cases involving PF_INET listeners.

For PF_INET6 listeners, incoming IPv4 addresses are already mapped by
the underlying network layer. Nothing can be done about that. But we
can make sure the address family of the server address passed to
nlmclnt_lookup_host() matches the incoming mapped addresses to
eliminate the need for nlm_cmp_addr() to do the mapping every time it
wants to compare an address.

It should be fairly simple to record the listener's protocol family,
check it against incoming server addresses in nlmclnt_init(), then map
the address as needed.

Having nlm_cmp_addr() do the mapping solves some problems, but at the
cost of extra CPU time every time it is called: in each loop iteration of
nlm_lookup_host(), for example. Essentially, all I'm doing is hoisting a
loop invariant out of the loop.
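
A sketch of that hoisting, under the assumption that we can ask which
listener family lockd ended up with (lockd_listener_family() and
init_server_addr() are hypothetical names, not the actual lockd code):

#include <string.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Hypothetical: the family of the listener lockd actually created. */
extern sa_family_t lockd_listener_family(void);

/*
 * Translate the server address once, at setup time, so the per-request
 * comparison never has to handle mixed families.  Only needed when the
 * listener is PF_INET6, i.e. when IPv4 peers will show up v4-mapped.
 */
static void init_server_addr(const struct sockaddr *src, socklen_t srclen,
                             struct sockaddr_storage *dst)
{
        if (src->sa_family == AF_INET &&
            lockd_listener_family() == AF_INET6) {
                const struct sockaddr_in *sin = (const struct sockaddr_in *)src;
                struct sockaddr_in6 *sin6 = (struct sockaddr_in6 *)dst;

                /* Build the v4-mapped form ::ffff:a.b.c.d once, here. */
                memset(sin6, 0, sizeof(*sin6));
                sin6->sin6_family = AF_INET6;
                sin6->sin6_port = sin->sin_port;
                sin6->sin6_addr.s6_addr[10] = 0xff;
                sin6->sin6_addr.s6_addr[11] = 0xff;
                memcpy(&sin6->sin6_addr.s6_addr[12], &sin->sin_addr, 4);
                return;
        }
        /* Families already match the listener; keep the address as-is. */
        memcpy(dst, src, srclen);
}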

--
Chuck Lever
chuck[dot]lever[at]oracle[dot]com

2009-02-12 22:02:23

by Trond Myklebust

[permalink] [raw]
Subject: Re: [NLM] 2.6.27.14 breakage when grace period expires

On Thu, 2009-02-12 at 15:43 -0500, Chuck Lever wrote:
> The reason we might care in this case is nlm_cmp_addr() is executed
> more frequently than nfs_sockaddr_match_ipaddr().

Actually, I'm not sure this assertion is correct. The only users of
nlm_cmp_addr() are nlmclnt_grant(), nlm_lookup_host() and
nlmsvc_unlock_all_by_ip().

AFAICS, the only one that needs to be v4-mapped is nlmclnt_grant(),
which is not in a performance-critical path...



2009-02-12 22:03:55

by Trond Myklebust

[permalink] [raw]
Subject: Re: [NLM] 2.6.27.14 breakage when grace period expires

On Thu, 2009-02-12 at 16:43 -0500, Chuck Lever wrote:
> Having nlm_cmp_addr() do the mapping solves some problems, but at the
> cost of extra CPU time every time it is called; each loop iteration in
> nlm_lookup_host() for example. All I'm doing is removing a loop
> invariant, essentially.

nlm_lookup_host() shouldn't need to compare v4-mapped addresses and IPv4
addresses, AFAICS.



2009-02-12 22:11:36

by Chuck Lever

[permalink] [raw]
Subject: Re: [NLM] 2.6.27.14 breakage when grace period expires

On Feb 12, 2009, at 5:02 PM, Trond Myklebust wrote:
> On Thu, 2009-02-12 at 15:43 -0500, Chuck Lever wrote:
>> The reason we might care in this case is nlm_cmp_addr() is executed
>> more frequently than nfs_sockaddr_match_ipaddr().
>
> Actually, I'm not sure this assertion is correct. The only users of
> nlm_cmp_addr() are nlmclnt_grant(), nlm_lookup_host() and
> nlmsvc_unlock_all_by_ip().
>
> AFAICS, the only one that needs to be v4 mapped should be
> nlmclnt_grant,
> which is not in a performance critical path...

So then your proposal is to ensure the two arguments of the
nlm_cmp_addr() callsite in nlmclnt_grant() are both AF_INET6?

That doesn't sound so bad.

--
Chuck Lever
chuck[dot]lever[at]oracle[dot]com

2009-02-12 22:19:25

by Trond Myklebust

[permalink] [raw]
Subject: Re: [NLM] 2.6.27.14 breakage when grace period expires

On Thu, 2009-02-12 at 17:11 -0500, Chuck Lever wrote:
> On Feb 12, 2009, at 5:02 PM, Trond Myklebust wrote:
> > On Thu, 2009-02-12 at 15:43 -0500, Chuck Lever wrote:
> >> The reason we might care in this case is nlm_cmp_addr() is executed
> >> more frequently than nfs_sockaddr_match_ipaddr().
> >
> > Actually, I'm not sure this assertion is correct. The only users of
> > nlm_cmp_addr() are nlmclnt_grant(), nlm_lookup_host() and
> > nlmsvc_unlock_all_by_ip().
> >
> > AFAICS, the only one that needs to be v4 mapped should be
> > nlmclnt_grant,
> > which is not in a performance critical path...
>
> So then your proposal is to ensure the two arguments of the
> nlm_cmp_addr() callsite in nlmclnt_grant() are both AF_INET6?

Yup... I can't see that the other two callsites need anything like that.



2009-02-13 11:04:58

by Frank van Maarseveen

[permalink] [raw]
Subject: Re: [NLM] 2.6.27.14 breakage when grace period expires

On Thu, Feb 12, 2009 at 03:24:11PM -0500, Trond Myklebust wrote:
>
> Hmm... I wonder if the problem isn't just that we're failing to cancel
> the lock request when the process is signalled. Can you try the
> following patch?
>
> --------------------------------------------------------------------
> From: Trond Myklebust <[email protected]>
> NLM/lockd: Always cancel blocked locks when exiting early from nlmclnt_lock
>
> Signed-off-by: Trond Myklebust <[email protected]>
> ---
>
> fs/lockd/clntproc.c | 9 +++++++--
> 1 files changed, 7 insertions(+), 2 deletions(-)
>
>
> diff --git a/fs/lockd/clntproc.c b/fs/lockd/clntproc.c
> index 31668b6..f956d1e 100644
> --- a/fs/lockd/clntproc.c
> +++ b/fs/lockd/clntproc.c
> @@ -542,9 +542,14 @@ again:
> status = nlmclnt_call(cred, req, NLMPROC_LOCK);
> if (status < 0)
> break;
> - /* Did a reclaimer thread notify us of a server reboot? */
> - if (resp->status == nlm_lck_denied_grace_period)
> + /* Is the server in a grace period state?
> + * If so, we need to reset the resp->status, and
> + * retry...
> + */
> + if (resp->status == nlm_lck_denied_grace_period) {
> + resp->status = nlm_lck_blocked;
> continue;
> + }
> if (resp->status != nlm_lck_blocked)
> break;
> /* Wait on an NLM blocking lock */

I tried the patch, but it didn't make any difference. Note that there is
no ^C or any other signal involved. The client runs three loops in the
shell:
while :; do lck -w /mnt/locktest 2; done

and every "lck" opens the file, obtains an exclusive write lock (waits
if necessary), calls sleep(2), closes the fd (releasing the lock) and
goes exit.
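
The original lck source isn't reproduced here, but a minimal program with
the behaviour just described might look like the following sketch (it
ignores the real program's "-w" flag and timeout argument and simply
hard-codes the two-second hold):

#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Grab an exclusive write lock on argv[1], hold it for 2 seconds, then exit. */
int main(int argc, char **argv)
{
        struct flock fl = {
                .l_type   = F_WRLCK,    /* exclusive write lock */
                .l_whence = SEEK_SET,
                .l_start  = 0,
                .l_len    = 0,          /* lock the whole file */
        };
        int fd;

        if (argc < 2) {
                fprintf(stderr, "usage: %s <file>\n", argv[0]);
                return 1;
        }
        fd = open(argv[1], O_RDWR | O_CREAT, 0644);
        if (fd < 0) {
                perror("lck: open");
                return 1;
        }
        if (fcntl(fd, F_SETLKW, &fl) < 0) {     /* block until the lock is granted */
                perror("lck: fcntl");
                return 1;
        }
        sleep(2);
        close(fd);      /* closing the fd releases the lock */
        return 0;
}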

The "lck" which ends up unlocking during grace terminates normally but
one of the others gets a "fcntl: No locks available" when trying to
obtain the lock.


Question: shouldn't the server drop the lock after a sequence like:

201 122.033767 server: NLM V4 GRANTED_MSG Call (Reply In 202) FH:0xcafa61cc svid:116 pos:0-0
202 122.034066 client: NLM V4 GRANTED_MSG Reply (Call In 201)
205 122.034665 client: NLM V4 GRANTED_RES Call (Reply In 206) NLM_DENIED
206 122.034753 server: NLM V4 GRANTED_RES Reply (Call In 205)

?

--
Frank