2004-02-02 14:08:55

by Olaf Kirch

Subject: statd simplified

Hi all,

I've never really liked statd for a number of reasons (the following
list is not exhaustive):

- 50% of all bug reports I get concerning "NFS file locking
broken" are due to people forgetting to start statd
(the other half is people trying to do flock over NFS :)
- lockd having to do upcalls to user space is ugly
- user space doing RPC callbacks to the kernel is even uglier
- statd isn't really used by anything but lockd, so the
only reason for making things so complex is probably because
the folks creating NFS locking _liked_ complexity.

So I sat down last week and dusted off some old ideas how to simplify
NLM/NSM interaction quite a bit.

The approach I implemented was to get rid of all the SM_MON/SM_UNMON
calls completely. If lockd wants a remote host to be "monitored", simply
add its IP address to /var/lib/nfs/sm. This used to be statd's job,
but there's no reason why we cannot do that in the kernel.
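
A rough user-space sketch of the bookkeeping described above, not the
kernel code from the patch. It assumes /var/lib/nfs/sm is a directory
holding one empty file per monitored host, named after the host's IP
address; the function names monitor_host and monitored_hosts are made up
for illustration.

```python
import ipaddress
import os


def monitor_host(sm_dir, ip_addr):
    """Mark ip_addr as monitored by creating an empty file named after it."""
    # Parse the address first so a bogus name cannot escape sm_dir.
    # Raises ValueError for anything that is not an IPv4/IPv6 address.
    name = str(ipaddress.ip_address(ip_addr))
    path = os.path.join(sm_dir, name)
    # No O_EXCL: monitoring a host that is already recorded is a no-op.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o600)
    os.close(fd)
    return path


def monitored_hosts(sm_dir):
    """Return the recorded hosts, i.e. who to notify after a reboot."""
    return sorted(os.listdir(sm_dir))
```

Any OSError (missing directory, read-only file system) propagates to the
caller, which can then decide whether the lock request should fail.
Flushing the entry to disk is left out of the sketch.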

In addition, lockd now understands a very limited subset of NSM
procedures: NULL, and SM_NOTIFY. If it receives an SM_NOTIFY message,
it initiates reclaim for all locks we hold on that server (if any)
and ditches all locks this client holds on our box (if any).
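
To illustrate the two effects of an SM_NOTIFY, here is a toy model in
Python. It is only a sketch: the LockTable class, its two dictionaries and
handle_sm_notify are invented for this example and do not correspond to
lockd's real data structures.

```python
from dataclasses import dataclass, field


@dataclass
class LockTable:
    # peer address -> ids of locks WE hold on that peer (we are its client)
    held_on_remote: dict = field(default_factory=dict)
    # peer address -> ids of locks THAT PEER holds on us (we are its server)
    granted_to_remote: dict = field(default_factory=dict)


def handle_sm_notify(table, peer):
    """React to 'peer has rebooted'.

    Returns (to_reclaim, dropped): locks we must re-request from the peer,
    and locks the peer held here that are now discarded.
    """
    # The rebooted server forgot our locks: they stay ours, but each one
    # has to be reclaimed.
    to_reclaim = sorted(table.held_on_remote.get(peer, ()))
    # The rebooted client forgot its locks: drop them so others can proceed.
    dropped = sorted(table.granted_to_remote.pop(peer, ()))
    return to_reclaim, dropped
```

A notification from a host we have no state for simply yields two empty
lists.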

(As a special bonus, the new code actually initializes nsm_local_state
from /var/lib/nfs/state, which we previously didn't).

That is almost all there is to statd. The only piece of the puzzle
missing is a facility to send out NSM notifications when we reboot.
That can be done in user space by running "rpc.statd -N" (notify-only
mode) every time we boot.
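
The boot-time step could look roughly like the following sketch. It rests
on two assumptions that are not spelled out in this mail: that
/var/lib/nfs/state holds a single native-endian 32-bit integer, and that
the usual NSM convention applies (odd state number means "up", even means
"down"). The helper names are hypothetical, and counter wraparound is not
handled.

```python
import os
import struct

STATE_FMT = "=i"  # assumption: one native-endian 32-bit integer
STATE_LEN = struct.calcsize(STATE_FMT)


def read_state(path):
    """Return the stored state number, or 0 if the file is missing/short."""
    try:
        with open(path, "rb") as f:
            data = f.read(STATE_LEN)
    except FileNotFoundError:
        return 0
    if len(data) != STATE_LEN:
        return 0
    return struct.unpack(STATE_FMT, data)[0]


def next_up_state(old):
    """Smallest odd number strictly greater than old (odd = host is up)."""
    return old + 1 if old % 2 == 0 else old + 2


def bump_state(path):
    """Advance the state number on boot and store it atomically."""
    new = next_up_state(read_state(path))
    tmp = path + ".new"
    with open(tmp, "wb") as f:
        f.write(struct.pack(STATE_FMT, new))
        f.flush()
        os.fsync(f.fileno())
    # rename() replaces the old file in one step, so a crash never leaves
    # a half-written state file behind.
    os.replace(tmp, path)
    return new
```

After the bump, the new number is what gets sent in SM_NOTIFY to every
host recorded in /var/lib/nfs/sm.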

Please find attached two kernel patches. The first is from Andreas
Gruenbacher implementing several RPC programs sharing a single server
socket (he wrote the patch for his NFS ACL implementation); the second
is my kernel-statd patch.

Feedback welcome!

Cheers,
Olaf
--
Olaf Kirch | Stop wasting entropy - start using predictable
[email protected] | tempfile names today!
---------------+


Attachments:
(No filename) (1.91 kB)
sunrpc-multiple-programs (2.49 kB)
kernel-statd (15.66 kB)

2004-02-02 18:15:52

by Steve Dickson

Subject: Re: statd simplified

Olaf Kirch wrote:

>So I sat down last week and dusted off some old ideas how to simplify
>NLM/NSM interaction quite a bit.
>
>
Putting statd in the kernel is, I think, a good idea... Probably the only
reason it has not been done is that managing the state file from
the kernel is a real pain...

>The approach I implemented was to get rid of all the SM_MON/SM_UNMON
>calls completely. If lockd wants a remote host to be "monitored", simply
>add its IP address to /var/lib/nfs/sm. This used to be statd's job,
>but there's no reason why we cannot do that in the kernel.
>
>
Letting /var/lib/nfs/sm grow uncontrollably (i.e. not using SM_UNMON to
clean it up) may not be a good idea... Since users can't go in and clean
it up by hand (since they could be removing live state), there probably
should be some way to keep it cleaned up (and up to date).

>In addition, lockd now understands a very limited subset of NSM
>procedures: NULL, and SM_NOTIFY. If it receives an SM_NOTIFY message,
>it initiates reclaim for all locks we hold on that server (if any)
>and ditches all locks this client holds on our box (if any).
>
>(As a special bonus, the new code actually initializes nsm_local_state
>from /var/lib/nfs/state, which we previously didn't).
>
>
Currently when statd cannot open the state file (in change_state()), it
dies, which in turn causes lock requests to fail because the host can't be
monitored. In your kernel implementation, this failure seems to be ignored.
Since the failure would probably be due to the sm directory not existing,
which also means the state file cannot be created, it might make sense
to fail the loading of lockd when this happens.

Also I notice that the new nsm_monitor() will only fail with ENOMEM,
regardless of whether it was able to save state... If it cannot save state,
shouldn't it fail?

SteveD.




-------------------------------------------------------
The SF.Net email is sponsored by EclipseCon 2004
Premiere Conference on Open Tools Development and Integration
See the breadth of Eclipse activity. February 3-5 in Anaheim, CA.
http://www.eclipsecon.org/osdn
_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs

2004-02-03 10:22:56

by Olaf Kirch

Subject: Re: statd simplified

Hi,

On Mon, Feb 02, 2004 at 12:56:59PM -0500, Steve Dickson wrote:
> Letting /var/lib/nfs/sm grow uncontrollably (i.e. not using SM_UNMON to
> clean it up) may not be a good idea... Since users can't go in and clean
> it up by hand (since they could be removing live state), there probably
> should be some way to keep it cleaned up (and up to date).

Okay, I could add that. Calling unlink from kernel space is a bit of
a nuisance, but I'll try :)

> Currently when statd cannot open the state file (in change_state()), it
> dies, which in turn causes lock requests to fail because the host can't be
> monitored. In your kernel implementation, this failure seems to be ignored.
> Since the failure would probably be due to the sm directory not existing,
> which also means the state file cannot be created, it might make sense
> to fail the loading of lockd when this happens.

Agreed. Kernel statd only reads the state file, though; it doesn't update
the state. That's done by rpc.statd -N.

I'll submit an updated patch later today or tomorrow.

Olaf
--
Olaf Kirch | Stop wasting entropy - start using predictable
[email protected] | tempfile names today!
---------------+



2004-02-03 16:51:32

by Steve Dickson

Subject: Re: statd simplified

Hello,

Olaf Kirch wrote:

>On Mon, Feb 02, 2004 at 12:56:59PM -0500, Steve Dickson wrote:
>
>
>>Letting /var/lib/nfs/sm grow uncontrollably (i.e. not using SM_UNMON to
>>clean it up) may not be a good idea... Since users can't go in and clean
>>it up by hand (since they could be removing live state), there probably
>>should be some way to keep it cleaned up (and up to date).
>>
>>
>
>Okay, I could add that. Calling unlink from kernel space is a bit of
>a nuisance, but I'll try :)
>
>
Maybe the user level daemon could be used to do this....
When an SM_UNMON comes in, the kernel could clean up its
state and then forward the message to the daemon that would
cause the file to be removed...
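
What the daemon side of that could look like, as a minimal sketch only:
it assumes one file per monitored host in the sm directory, named after
the host's IP address, and unmonitor_host is a made-up name rather than
anything in statd today.

```python
import ipaddress
import os


def unmonitor_host(sm_dir, ip_addr):
    """Remove the record for ip_addr; return True if one was removed."""
    # Parse the address so a malformed name cannot point outside sm_dir.
    name = str(ipaddress.ip_address(ip_addr))
    try:
        os.unlink(os.path.join(sm_dir, name))
    except FileNotFoundError:
        # Already gone: treat a repeated SM_UNMON as harmless.
        return False
    return True
```

The kernel would have to drop its own state for the host before
forwarding the request, so that a file is never removed while locks
still depend on it.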

SteveD.




2004-02-03 21:28:27

by Peter Astrand

Subject: Re: statd simplified


>I've never really liked statd for a number of reasons (the following
>list is not exhaustive):
>...
>So I sat down last week and dusted off some old ideas how to simplify
>NLM/NSM interaction quite a bit.

Here is something to add to the wishlist: HA support.
http://marc.theaimsgroup.com/?l=linux-nfs&m=106880965906133&w=2 lists some
problems with using statd in a HA environment.


--
Peter Åstrand http://www.thinlinc.com
Cendio http://www.cendio.se
Teknikringen 3 Phone: +46-13-21 46 00
583 30 Linköping







