2005-03-31 13:13:31

by Neil Horman

Subject: should NLM locks send notifications after nfsd crash

Hey all-
I can't quite get my head around something. I've been asked a question
regarding the behavior of NLM locks, and I'm not sure what the right behavior
is. If NFS clients have issued monitor requests for various NLM locks to a
server (rpc.statd), and the NFS server (rpc.nfsd) that serves the files on
which those clients are monitoring locks is stopped and started (some
arbitrary time later), should notifications be sent to the monitoring
clients? I've been reading the Open Group definition for NLM and NSM, and the
section regarding the entrance of grace periods when the "server crashes" is
just unclear to me.

Thanks!
Neil

--
/***************************************************
*Neil Horman
*Software Engineer
*Red Hat, Inc.
*[email protected]
*gpg keyid: 1024D / 0x92A74FA1
*http://pgp.mit.edu
***************************************************/


-------------------------------------------------------
This SF.net email is sponsored by Demarc:
A global provider of Threat Management Solutions.
Download our HomeAdmin security software for free today!
http://www.demarc.com/Info/Sentarus/hamr30
_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs


2005-03-31 16:19:37

by Trond Myklebust

Subject: Re: should NLM locks send notifications after nfsd crash

On Thu, 31.03.2005 at 08:13 (-0500), Neil Horman wrote:
> Hey all-
> I can't quite get my head around something. I've been asked a question
> regarding the behavior of NLM locks, and I'm not sure what the right behavior
> is. If NFS clients have issued monitor requests for various NLM locks to a
> server (rpc.statd), and the NFS server (rpc.nfsd) that serves the files on
> which those clients are monitoring locks is stopped and started (some
> arbitrary time later), should notifications be sent to the monitoring
> clients? I've been reading the Open Group definition for NLM and NSM, and the
> section regarding the entrance of grace periods when the "server crashes" is
> just unclear to me.

Unlike NFSv2/v3, NLM involves the server keeping state in sync with the
client (both client and server have to agree at all times on which locks
are held by the client).
As recovery of state is client-driven, that means that the server is
responsible for notifying the clients upon any event which causes it to
lose track of that state. The clients are then responsible for trying
to recover the state that they believe they hold.

So if stopping and starting the server causes it to lose the NLM lock
information that it held, then it must notify the clients of this so
that they can take whatever action may be necessary to recover their
locks.
In order to ensure that this recovery can happen in an orderly fashion,
the server defines a "grace period", during which clients may not
actually set new locks, but they may reclaim locks which they held
before the event. (In order for this to work, the server has to trust
the clients not to lie about which locks they held before the
"reboot"...)

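The grace-period rule described above can be sketched as a toy model. This
is only an illustration of the behaviour as explained in this message, not
lockd's actual code: the class ToyLockServer, its method names, and the
45-second grace length are all assumptions made up for the example.

```python
from dataclasses import dataclass, field


@dataclass
class ToyLockServer:
    """Toy model of a lock server's grace period (hypothetical, not lockd)."""
    grace_seconds: float = 45.0                 # assumed value, illustration only
    locks: dict = field(default_factory=dict)   # path -> name of holding client
    grace_until: float = 0.0                    # time at which the grace period ends

    def restart(self, now):
        """Model an event that loses lock state: forget locks, start grace."""
        self.locks.clear()
        self.grace_until = now + self.grace_seconds
        # A real server would also notify the monitoring clients at this point.

    def in_grace(self, now):
        return now < self.grace_until

    def lock(self, client, path, now, reclaim=False):
        """Return True if the lock request is granted."""
        if self.in_grace(now) and not reclaim:
            return False   # new locks are refused during the grace period
        holder = self.locks.get(path)
        if holder not in (None, client):
            return False   # conflicting lock held by another client
        # The server cannot verify a reclaim; it trusts the client's claim.
        self.locks[path] = client
        return True


server = ToyLockServer()
server.lock("clientA", "/export/f", now=0.0)                  # granted
server.restart(now=100.0)                                     # state lost
server.lock("clientB", "/export/f", now=110.0)                # refused: in grace
server.lock("clientA", "/export/f", now=110.0, reclaim=True)  # reclaim granted
```

Note that the model accepts any reclaim during the grace period without
checking it, which mirrors the trust in the clients mentioned above.
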
Cheers,
Trond
--
Trond Myklebust <[email protected]>




2005-04-01 01:56:47

by Neil Horman

Subject: Re: should NLM locks send notifications after nfsd crash

On Thu, Mar 31, 2005 at 11:19:13AM -0500, Trond Myklebust wrote:
> On Thu, 31.03.2005 at 08:13 (-0500), Neil Horman wrote:
> > Hey all-
> > I can't quite get my head around something. I've been asked a question
> > regarding the behavior of NLM locks, and I'm not sure what the right behavior
> > is. If NFS clients have issued monitor requests for various NLM locks to a
> > server (rpc.statd), and the NFS server (rpc.nfsd) that serves the files on
> > which those clients are monitoring locks is stopped and started (some
> > arbitrary time later), should notifications be sent to the monitoring
> > clients? I've been reading the Open Group definition for NLM and NSM, and the
> > section regarding the entrance of grace periods when the "server crashes" is
> > just unclear to me.
>
> Unlike NFSv2/v3, NLM involves the server keeping state in sync with the
> client (both client and server have to agree at all times on which locks
> are held by the client).
> As recovery of state is client-driven, that means that the server is
> responsible for notifying the clients upon any event which causes it to
> lose track of that state. The clients are then responsible for trying
> to recover the state that they believe they hold.
>
> So if stopping and starting the server causes it to lose the NLM lock
> information that it held, then it must notify the clients of this so
> that they can take whatever action may be necessary to recover their
> locks.
> In order to ensure that this recovery can happen in an orderly fashion,
> the server defines a "grace period", during which clients may not
> actually set new locks, but they may reclaim locks which they held
> before the event. (In order for this to work, the server has to trust
> the clients not to lie about which locks they held before the
> "reboot"...)
>
> Cheers,
> Trond
> --
> Trond Myklebust <[email protected]>
>

Thank you for the thorough explanation, Trond, it's appreciated. This leaves
only one question in my mind: specifically under the Linux NFS server, does
stopping and starting nfsd (but specifically _not_ rpc.lockd or rpc.statd)
constitute a loss of NLM state?

Thanks
Neil



2005-04-01 02:14:16

by NeilBrown

Subject: Re: should NLM locks send notifications after nfsd crash

On Thursday March 31, [email protected] wrote:
>
> Thank you for the thorough explanation, Trond, it's appreciated. This leaves
> only one question in my mind: specifically under the Linux NFS server, does
> stopping and starting nfsd (but specifically _not_ rpc.lockd or rpc.statd)
> constitute a loss of NLM state?
>

Uhmm... it depends.

Stopping the nfsd server threads does not, in itself, cause a loss of
locking state.
However, unexporting the filesystems does, and this often accompanies
the stopping of the last thread.
To be specific:
If the last thread dies due to SIGHUP, or through an explicit setting
of the number of threads to 0:
    rpc.nfsd 0
then the filesystems aren't unexported, and locking state remains
(unless you explicitly run exportfs -avu).

If the last thread dies due to any other signal (KILL, INT, QUIT),
then the filesystems are explicitly unexported, which will cause
locking state to be lost.
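
The cases above can be summarised as a small lookup. This is only a
restatement of the rules given in this message, not anything nfsd provides:
the function name lock_state_survives and the labels used for the stop
causes are made up for illustration.

```python
def lock_state_survives(stop_cause, explicit_unexport=False):
    """Say whether NLM lock state should survive, per the rules in this thread.

    stop_cause is a hypothetical label for how the last nfsd thread stopped:
    "SIGHUP" or "threads=0" (i.e. "rpc.nfsd 0"), or one of the other signals.
    explicit_unexport models the admin also running "exportfs -avu".
    """
    keeps_exports = {"SIGHUP", "threads=0"}
    unexports = {"SIGKILL", "SIGINT", "SIGQUIT"}

    if stop_cause in keeps_exports:
        # Filesystems stay exported, so locks remain unless unexported by hand.
        return not explicit_unexport
    if stop_cause in unexports:
        # Filesystems are unexported as a side effect, which drops lock state.
        return False
    raise ValueError(f"stop cause not covered by the thread: {stop_cause!r}")
```

For example, lock_state_survives("threads=0") is True, while
lock_state_survives("SIGKILL") is False.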

NeilBrown



2005-04-01 02:28:43

by Trond Myklebust

Subject: Re: should NLM locks send notifications after nfsd crash

On Fri, 01.04.2005 at 12:14 (+1000), Neil Brown wrote:

> Stopping the nfsd server threads does not, in itself, cause a loss of
> locking state.
> However, unexporting the filesystems does, and this often accompanies
> the stopping of the last thread.
> To be specific:
> If the last thread dies due to SIGHUP, or through an explicit setting
> of the number of threads to 0:
>     rpc.nfsd 0
> then the filesystems aren't unexported, and locking state remains
> (unless you explicitly run exportfs -avu).
>
> If the last thread dies due to any other signal (KILL, INT, QUIT),
> then the filesystems are explicitly unexported, which will cause
> locking state to be lost.


Note also that if you do want to trigger a "reboot situation" and have
lockd go into a grace period, then you can do this by signalling the
lockd thread using "kill -9". Rather than causing the thread to
terminate, it will cause it to flush out all locks, and to call the
function set_grace_period().

It goes without saying that knfsd itself will not be affected by signals
to the lockd thread.

Cheers,
Trond

--
Trond Myklebust <[email protected]>




2005-04-01 12:58:29

by Neil Horman

Subject: Re: should NLM locks send notifications after nfsd crash

On Thu, Mar 31, 2005 at 09:28:21PM -0500, Trond Myklebust wrote:
> On Fri, 01.04.2005 at 12:14 (+1000), Neil Brown wrote:
>
> > Stopping the nfsd server threads does not, in itself, cause a loss of
> > locking state.
> > However, unexporting the filesystems does, and this often accompanies
> > the stopping of the last thread.
> > To be specific:
> > If the last thread dies due to SIGHUP, or through an explicit setting
> > of the number of threads to 0:
> >     rpc.nfsd 0
> > then the filesystems aren't unexported, and locking state remains
> > (unless you explicitly run exportfs -avu).
> >
> > If the last thread dies due to any other signal (KILL, INT, QUIT),
> > then the filesystems are explicitly unexported, which will cause
> > locking state to be lost.
>
>
> Note also that if you do want to trigger a "reboot situation" and have
> lockd go into a grace period, then you can do this by signalling the
> lockd thread using "kill -9". Rather than causing the thread to
> terminate, it will cause it to flush out all locks, and to call the
> function set_grace_period().
>
> It goes without saying that knfsd itself will not be affected by signals
> to the lockd thread.
>
> Cheers,
> Trond
>
> --
> Trond Myklebust <[email protected]>
>

Understood. Thank you Neil, Trond!
Neil


