2007-03-16 08:00:31

by NeilBrown

[permalink] [raw]
Subject: HEADS-UP: nearing nfs-utils 1.1.0 and statd changes.


I'm keen on getting to nfs-utils-1.1.0 relatively soon and so have
been pushing towards it.

To this end I have picked through the patches in current RPMs from SuSE
and Redhat and the deb from Debian.
With few exceptions, everything I have not taken is either
distro-specific or - in my opinion - wrong (or at least imperfect).

A particular exception is fixes for mount.nfs w.r.t. handling of
MS_REMOUNT. I haven't figured out what that is all about yet and it
is getting late. :-)

If anyone has anything else outstanding that is not in the git, please
consider at least telling me about it.

One thing I have been putting thought into is improvements for statd.
Current SLES releases have statd in the kernel and I don't want to
continue that, but instead want to make sure that user-space statd
provides at least equally good service...

I have changed the default so that statd is now compiled with
RESTRICTED_STATD, so that it only listens to lockd for monitor
requests. Everyone else is ignored. There is no-one else who uses
statd anywhere, so this should be perfectly safe and is more secure.
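
As a rough illustration of the kind of check this implies (a sketch
only: it assumes that "only listens to lockd" amounts to accepting
monitor requests only from the local host, and caller_is_loopback is a
hypothetical name, not a function taken from nfs-utils):

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stddef.h>

/*
 * Hypothetical helper, not taken from nfs-utils: returns 1 when the
 * caller's IPv4 source address lies in 127.0.0.0/8, otherwise 0.
 */
static int caller_is_loopback(const struct sockaddr_in *caller)
{
	assert(caller != NULL);
	if (caller->sin_family != AF_INET)
		return 0;
	/* s_addr is in network byte order; compare the top octet. */
	return (ntohl(caller->sin_addr.s_addr) >> 24) == 127;
}
```

A request handler could drop any monitor request for which this returns
0, which is the sense in which everyone other than the local lockd would
be ignored.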

I have also arranged that the new "mount.nfs" will try to start statd
if that seems to be appropriate (via a script so statd options can be
specified). Said script is not currently in .git, but
cat > /usr/sbin/start-statd <<END
#!/bin/sh
PATH=/sbin:/usr/sbin
statd
END
chmod +x /usr/sbin/start-statd

should do it. Comments on this idea and approach are welcome.

Statd currently does several very different things.
1/ It listens for monitor requests from lockd and creates
files in /var/lib/nfs/sm/HOST
2/ It listens for notifications from peers and tells lockd
that those peers have restarted.
3/ It moves files from sm/ to sm.bak/ and then tries to
notify every host listed in sm.bak/

The first is very similar to what NFSv4 needs for state management,
though is somewhat simpler. I would like to create a better
interface for the kernel to ask for state to be stored, and then use
it for NFSv4 and NLM, subsuming this function.

The second is very simple and could reasonably be moved into the
kernel.

The last is a totally independent operation that just needs to run at
boot time until everything is notified and then exit.
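
The first half of that third operation could look roughly like the
following (a sketch under assumptions: move_monitored_hosts is a
hypothetical name, the directory paths are passed in by the caller, and
the notification step itself is left out):

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <dirent.h>
#include <errno.h>
#include <limits.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>

/*
 * Hypothetical helper: move every entry from sm_dir to bak_dir with one
 * rename() per host file, creating bak_dir if it does not exist yet.
 * Returns the number of entries moved, or -1 on error.
 */
static int move_monitored_hosts(const char *sm_dir, const char *bak_dir)
{
	char src[PATH_MAX], dst[PATH_MAX];
	struct dirent *de;
	int moved = 0;
	DIR *dir;

	assert(sm_dir != NULL && bak_dir != NULL);
	if (mkdir(bak_dir, 0700) != 0 && errno != EEXIST)
		return -1;
	dir = opendir(sm_dir);
	if (dir == NULL)
		return -1;

	while ((de = readdir(dir)) != NULL) {
		if (strcmp(de->d_name, ".") == 0 || strcmp(de->d_name, "..") == 0)
			continue;
		/* Refuse names that would not fit rather than truncate them. */
		if ((size_t)snprintf(src, sizeof(src), "%s/%s", sm_dir, de->d_name) >= sizeof(src) ||
		    (size_t)snprintf(dst, sizeof(dst), "%s/%s", bak_dir, de->d_name) >= sizeof(dst) ||
		    rename(src, dst) != 0) {
			closedir(dir);
			return -1;
		}
		moved++;
	}
	closedir(dir);
	return moved;
}
```

Both directories need to be on the same filesystem for rename() to
work, which holds when they both live under /var/lib/nfs.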

I would like to:

A/ create a separate program "statd-notify" that just performs
operation 3. It is always run at boot time, but often exits
very soon. When statd starts it possibly forks and runs
"statd-notify" depending on options.
Then all that code can be removed from statd.
Redhat currently has a patch that moves drop_privs a little later,
presumably related to getting statd-notify functionality
configured properly (hard to be sure - no comments in the patch).

B/ Add functionality to the kernel to register and listen to statd
'notify' requests and act upon them immediately. This would have
to default to off and only be enabled if a new nfs-utils requested
it.

C/ Create a "record state" mechanism that suits NFSv4 and NLM,
incorporate the user-space side of this into statd and have statd
detect if the kernel is new enough and, if it is, tell the kernel
to listen for statd-notify requests, and itself listen to the
kernel for state-storage requests.

I don't know if all of this will be ready for 1.1.0, but I would like
to have 'A' done at least.

Comments or questions on this are also quite welcome.

Thanks,
NeilBrown

-------------------------------------------------------------------------
Take Surveys. Earn Cash. Influence the Future of IT
Join SourceForge.net's Techsay panel and you'll get the chance to share your
opinions on IT & business topics through brief surveys-and earn cash
http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV
_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs


2007-03-19 18:40:52

by Talpey, Thomas

[permalink] [raw]
Subject: Re: HEADS-UP: nearing nfs-utils 1.1.0 and statd changes.

At 07:02 PM 3/18/2007, Neil Brown wrote:
>The part of statd that I particularly want to keep out of the kernel
>is the creating of files in /var/lib/nfs/sm/.

This sounds good. And of course, if the kernel doesn't do the notify
then it doesn't need to read (list) them either.

>SuSE's kstatd seems to cope without resolving mon_name.

I'd be interested in the details. It's been our observation that very few
statd implementations work properly in all cases.

>One option is to just ignore it and use the source IP address. That
>would work fine with the default config which uses IP address for all
>host identification.
>However this doesn't work well for multi-homed hosts and there is a

Yes, multihomed hosts (clients) are the problem. Without the mon_name
lookup, they will be required to send their notifies from each interface
that they originally took nlm locks on. Otherwise, the server won't ever
know, and the locks won't be released. So unless the world comes back
in exactly the same state (and don't forget DHCP) with the same
connectivity, the nlm recovery will fail.

>However if it turns out that name-resolution is really needed in this
>path, then I'll definitely keep that part of statd out of the kernel.

Let me finish digging up some of our testing scenarios and I think you
may be convinced. We did some very interesting research last year
that led to some important changes in the way our Ontap server
performs recovery. There are implications on the client side too,
though most of them were in nfs-utils<=1.0.6 and so fixed.

The big huge gigantic problem with SM_NOTIFY is that it is semantically
void - absolutely nothing can be inferred from its "success". Any peer
with half an idea what an RPC is, will respond whether it thinks it holds
a lock or not. In fact, a peer that doesn't reply gives you more information!

Tom.



2007-03-19 23:02:17

by J. Bruce Fields

[permalink] [raw]
Subject: Re: HEADS-UP: nearing nfs-utils 1.1.0 and statd changes.

On Mon, Mar 19, 2007 at 10:49:17AM +1100, Neil Brown wrote:
> On Friday March 16, [email protected] wrote:
> > NFSv4 needs something like the third as well--knfsd needs to know on
> > startup the list of clients that will be allowed to reclaim state from a
> > previous boot instance. (This is to protect clients that *think*
> > they're still holding locks on the server, but (thanks to a network
> > partition) don't realize that the server has actually rebooted twice.)
>
> Similar... but different...

OK, actually more different than similar. We need to run something at
about the same time--nfsd startup--but the stuff it has to do is pretty
different. (In particular, it just needs to dump some information into
the kernel--it doesn't need to talk to any other hosts.)

> You would want to forget about clients who haven't reclaimed when the
> 'grace period' expires. Yes? So when the grace period starts, you
> move state from "current" to "recovering". Then when a client tries
> to recover, we check in 'recovering' and if we find something, we
> recreate the state in 'current'. Then when the grace period ends, we
> remove everything from 'recovering'. So if the server reboots twice
> without actually completing a grace period, the client would still be
> safe.

Essentially correct--but I'd like one small change to that: when we move
clients to that "current" list (an action that'll have to be recorded to
stable storage) I also want to record a timestamp showing when we did
so. That means that we no longer need to forget those clients that
haven't reclaimed at the end of grace--we *can* if we want to, but it's
not urgent because (as long as we also remember "boot" times), we can
notice at the next boot that their last reclaim was too long ago.

This saves us having to do a bunch of synchronous work at the time the
grace period ends, which is inefficient and complicates the locking.
And it solves one or two extremely obscure corner cases.

(And it's what the rfc recommends, actually--I thought I was being
clever by doing something "simpler". What a loser.)

--b.


2007-03-20 00:30:49

by Talpey, Thomas

[permalink] [raw]
Subject: Re: HEADS-UP: nearing nfs-utils 1.1.0 and statd changes.

At 07:02 PM 3/19/2007, J. Bruce Fields wrote:
>so. That means that we no longer need to forget those clients that
>haven't reclaimed at the end of grace--we *can* if we want to, but it's
>not urgent because (as long as we also remember "boot" times), we can
>notice at the next boot that their last reclaim was too long ago.

In fact, it's highly desirable to keep their state around, until some
conflict arises. Maybe the network is partitioned, etc. It's the Internet
Principle, in addition to being appropriately lazy.

But I have a question - what's "too long ago"? Do you propose
refusing a reclaim after some interval?

Tom.



2007-03-20 01:15:09

by J. Bruce Fields

[permalink] [raw]
Subject: Re: HEADS-UP: nearing nfs-utils 1.1.0 and statd changes.

On Mon, Mar 19, 2007 at 08:30:42PM -0400, Talpey, Thomas wrote:
> At 07:02 PM 3/19/2007, J. Bruce Fields wrote:
> >so. That means that we no longer need to forget those clients that
> >haven't reclaimed at the end of grace--we *can* if we want to, but it's
> >not urgent because (as long as we also remember "boot" times), we can
> >notice at the next boot that their last reclaim was too long ago.
>
> In fact, it's highly desirable to keep their state around, until some
> conflict arises. Maybe the network is partitioned, etc. It's the Internet
> Principle, in addition to being appropriately lazy.

You're getting a little ahead of me here. I'm not talking about trying
to allow reclaims after the grace period ends--I agree that that would
be nice, but I don't see a really simple way to do that. I'm just
talking about how we implement the simplest reboot recovery behavior.

Currently we're *not* doing what the rfc suggests--keeping a record
with timestamp of first open, etc.--instead we're basically remembering
just the one bit per client (is this client known to us or not), which
means we *must* synchronously invalidate every client as we exit the
grace period. That's awkward.

> But I have a question - what's "too long ago"? Do you propose
> refusing a reclaim after some interval?

So by "too long ago" I mean "more than one boot ago".
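
A minimal sketch of that rule, assuming each client record carries the
timestamp described earlier and that the server also remembers when the
previous boot instance started (struct client_record and may_reclaim are
hypothetical names, not existing nfsd code):

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical stable-storage record for one client. */
struct client_record {
	time_t recorded_at;	/* when the client was last put on the "current" list */
};

/*
 * A client may reclaim only if its record was written no earlier than
 * the start of the previous boot instance. Anything older is "more
 * than one boot ago" and is refused.
 */
static int may_reclaim(const struct client_record *rec, time_t prev_boot)
{
	assert(rec != NULL);
	return rec->recorded_at >= prev_boot;
}
```

With this shape nothing has to be invalidated at the moment the grace
period ends; stale records are simply rejected by the comparison at the
next boot.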

--b.


2007-03-20 10:47:55

by Talpey, Thomas

[permalink] [raw]
Subject: Re: HEADS-UP: nearing nfs-utils 1.1.0 and statd changes.

At 09:14 PM 3/19/2007, J. Bruce Fields wrote:
>Currently we're *not* doing what the rfc suggests--keeping a record
>with timestamp of first open, etc.--instead we're basically remembering
>just the one bit per client (is this client known to us or not), which
>means we *must* synchronously invalidate every client as we exit the
>grace period. That's awkward.

Ah, I get it. It has to be invalidated because the state can't be marked
"out of grace"? The timestamp is the right fix of course, but wouldn't
a single bit ("known to us" | "out of grace") kinda sorta do it? Then
invalidation could at least be delayed.

It's a little worse than awkward though. Isn't the server going to return
BAD_STATEID after this (instead of stale/old)? The server goes from
serving no state-granting ops, to dropping everything that didn't make
it back to reclaim in time. The v3 nlm recovery doesn't do that.

Tom.



Subject: Re: HEADS-UP: nearing nfs-utils 1.1.0 and statd changes.

On 3/20/07, Talpey, Thomas <[email protected]> wrote:
>
> At 09:14 PM 3/19/2007, J. Bruce Fields wrote:
> >Currently we're *not* doing what the rfc suggests--keeping a record
> >with timestamp of first open, etc.--instead we're basically remembering
> >just the one bit per client (is this client known to us or not), which
> >means we *must* synchronously invalidate every client as we exit the
> >grace period. That's awkward.
>
> Ah, I get it. It has to be invalidated because the state can't be marked
> "out of grace"? The timestamp is the right fix of course, but wouldn't
> a single bit ("known to us" | "out of grace") kinda sorta do it? Then
> invalidation could at least be delayed.
>
> It's a little worse than awkward though. Isn't the server going to return
> BAD_STATEID after this (instead of stale/old)? The server goes from
> serving no state-granting ops, to dropping everything that didn't make
> it back to reclaim in time. The v3 nlm recovery doesn't do that.


No. The stale stateid check is before the bad stateid check
(fs/nfsd/nfs4state.c:nfs4_preprocess_stateid_op) - stale stateid is
returned, which is the correct behavior past reclaim processing.

-->Andy

> Tom.



2007-03-20 12:31:04

by Steve Dickson

[permalink] [raw]
Subject: Re: HEADS-UP: nearing nfs-utils 1.1.0 and statd changes.

Neil Brown wrote:
> I have changed the default so that statd is now compiled with
> RESTRICTED_STATD, so that it only listens to lockd for monitor
> requests. Everyone else is ignored. There is no-one else who uses
> statd anywhere, so this should be perfectly safe and is more secure.
It just occurred to me that this may not be a good idea... Unless
there have been some changes in this area, turning this on will
break locking... The reason I know this is that I turned it on
at one point and people started to see the following messages
being logged:
nsm_mon_unmon: rpc failed, status=-13
lockd: cannot monitor 192.168.1.202

https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=140385
has the details...

steved.



2007-03-20 14:26:30

by J. Bruce Fields

[permalink] [raw]
Subject: Re: HEADS-UP: nearing nfs-utils 1.1.0 and statd changes.

On Tue, Mar 20, 2007 at 07:24:00AM -0400, William A. (Andy) Adamson wrote:
> On 3/20/07, Talpey, Thomas <[email protected]> wrote:
> >
> >At 09:14 PM 3/19/2007, J. Bruce Fields wrote:
> >>Currently we're *not* doing what the rfc suggests--keeping a record
> >>with timestamp of first open, etc.--instead we're basically remembering
> >>just the one bit per client (is this client known to us or not), which
> >>means we *must* synchronously invalidate every client as we exit the
> >>grace period. That's awkward.
> >
> >Ah, I get it. It has to be invalidated because the state can't be marked
> >"out of grace"? The timestamp is the right fix of course, but wouldn't
> >a single bit ("known to us" | "out of grace") kinda sorta do it? Then
> >invalidation could at least be delayed.
> >
> >It's a little worse than awkward though. Isn't the server going to return
> >BAD_STATEID after this (instead of stale/old)? The server goes from
> >serving no state-granting ops, to dropping everything that didn't make
> >it back to reclaim in time. The v3 nlm recovery doesn't do that.
>
>
> No. The stale stateid check is before the bad stateid check
> (fs/nfsd/nfs4state.c:nfs4_preprocess_stateid_op) - stale stateid is
> returned, which is the correct behavior past reclaim processing.

Yeah, note that the necessary information is in the stateid itself--we
embed the current boot time in every stateid we hand out--so we don't
need to keep around any memory of the client in order to hand out the
correct error.
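
To make the ordering concrete, here is a small sketch (not the code in
fs/nfsd/nfs4state.c; the struct layout, the enum and classify_stateid
are all hypothetical, and state_found stands in for whatever lookup the
server does):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum stateid_status { STATEID_OK, STATEID_SPECIAL, STATEID_STALE, STATEID_BAD };

/* Hypothetical layout: the boot time is embedded in every stateid. */
struct stateid {
	uint32_t boot_time;	/* server boot time when the stateid was issued */
	uint32_t id;		/* opaque per-state identifier */
};

static enum stateid_status classify_stateid(const struct stateid *sid,
					    uint32_t server_boot, int state_found)
{
	assert(sid != NULL);
	/* All-zero and all-one stateids are special and never stale. */
	if ((sid->boot_time == 0 && sid->id == 0) ||
	    (sid->boot_time == UINT32_MAX && sid->id == UINT32_MAX))
		return STATEID_SPECIAL;
	/* The stale check comes first and needs no per-client memory. */
	if (sid->boot_time != server_boot)
		return STATEID_STALE;
	/* Only stateids from this boot instance can be "bad". */
	return state_found ? STATEID_OK : STATEID_BAD;
}
```

Because the comparison uses only the stateid and the current boot time,
a stateid from an earlier boot is reported as stale even if the server
has already forgotten the client.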

--b.


2007-03-20 14:49:29

by Talpey, Thomas

[permalink] [raw]
Subject: Re: HEADS-UP: nearing nfs-utils 1.1.0 and statd changes.

At 10:26 AM 3/20/2007, J. Bruce Fields wrote:
>On Tue, Mar 20, 2007 at 07:24:00AM -0400, William A. (Andy) Adamson wrote:
>> No. The stale stateid check is before the bad stateid check
>> (fs/nfsd/nfs4state.c:nfs4_preprocess_stateid_op) - stale stateid is
>> returned, which is the correct behavior past reclaim processing.
>
>Yeah, note that the necessary information is in the stateid itself--we
>embed the current boot time in every stateid we hand out--so we don't
>need to keep around any memory of the client in order to hand out the
>correct error.

Yeah, after Andy's pointer I did notice that all non-0, non-1,
non-current-server-boot-time stateids from the client will result in
a "stale" return from STALE_STATEID(). ;-)

I might argue on that "correct" adjective, but yes, it does encourage
reclaim/recovery. If the client is truly full of garbage, then the clientid
will probably fail too anyway.

Thanks.

Tom.



2007-03-20 14:57:08

by J. Bruce Fields

[permalink] [raw]
Subject: Re: HEADS-UP: nearing nfs-utils 1.1.0 and statd changes.

On Tue, Mar 20, 2007 at 10:49:15AM -0400, Talpey, Thomas wrote:
> At 10:26 AM 3/20/2007, J. Bruce Fields wrote:
> >On Tue, Mar 20, 2007 at 07:24:00AM -0400, William A. (Andy) Adamson wrote:
> >> No. The stale stateid check is before the bad stateid check
> >> (fs/nfsd/nfs4state.c:nfs4_preprocess_stateid_op) - stale stateid is
> >> returned, which is the correct behavior past reclaim processing.
> >
> >Yeah, note that the necessary information is in the stateid itself--we
> >embed the current boot time in every stateid we hand out--so we don't
> >need to keep around any memory of the client in order to hand out the
> >correct error.
>
> Yeah, after Andy's pointer I did notice that all non-0, non-1,
> non-current-server-boot-time stateids from the client will result in
> a "stale" return from STALE_STATEID(). ;-)
>
> I might argue on that "correct" adjective, but yes, it does encourage
> reclaim/recovery.

Is there a particular case you're worried about?

> If the client is truly full of garbage, then the clientid
> will probably fail too anyway.

Yeah, I don't see any reason to care about a client that hands us random
stateid's that weren't given out by us.

--b.


2007-03-20 15:04:11

by Talpey, Thomas

[permalink] [raw]
Subject: Re: HEADS-UP: nearing nfs-utils 1.1.0 and statd changes.

At 10:57 AM 3/20/2007, J. Bruce Fields wrote:
>Is there a particular case you're worried about?
>
>> If the client is truly full of garbage, then the clientid
>> will probably fail too anyway.
>
>Yeah, I don't see any reason to care about a client that hands us random
>stateid's that weren't given out by us.

Well, maybe just returning them as "stale". But in the absence of
the server persisting it all, it could be the best it can do.

With v4.1 and a session, it will be a lot harder to get there anyway.

Tom.



2007-03-16 13:00:26

by Talpey, Thomas

[permalink] [raw]
Subject: Re: HEADS-UP: nearing nfs-utils 1.1.0 and statd changes.

At 04:00 AM 3/16/2007, Neil Brown wrote:
>I'm keen on getting to nfs-utils-1.1.0 relatively soon and so have
>been pushing towards it.

Can you define "relatively"? Weeks?

I guess I'd be concerned about a good long testing cycle before the
first bump of the second version in, like, forever. Just a perception
thing from the user community.

>One thing I have been putting thought into is improvements for statd.
>Current SLES releases have statd in the kernel and I don't want to
>continue that, but instead want to make sure that user-space statd
>provides at least equally good service...

Does this mean SLES can stop using its own statd if these changes are
done? If so, that's a good goal.

>I have also arranged that the new "mount.nfs" will try to start statd
>if that seems to be appropriate (via a script so statd options can be
>specified).

What conditions make it appropriate? Does it do some kind of nlm ping
for instance?

>Said script is not currently in .git,
>A/ create a separate program "statd-notify" that just performs
> operation 3. It is always run at boot time, but often exits
> very soon. When statd starts it possibly forks and runs
> "statd-notify" depending on options.
> Then all that code can be removed from statd.

Hurray! This is a great approach, especially since the notifies can
take a long, long time and interfere with other statd functions
while they're in progress. We did some research on this process
here as part of other work, btw. I'll look for the list of issues we
found, there were a couple of interesting ones.

>B/ Add functionality to the kernel to register and listen to statd
> 'notify' requests and act upon them immediately. This would have
> to default to off and only be enabled if a new nfs-utils requested
> it.

I assume you mean the client kernel, or both? In either case...

But but but, doesn't this move statd back into the kernel like you
weren't trying to do? Also, how does it deal with the resolving the
mon_name in the sm_notify? Won't it depend on a callout (ouch)?

Tom.



2007-03-16 14:03:51

by Steve Dickson

[permalink] [raw]
Subject: Re: HEADS-UP: nearing nfs-utils 1.1.0 and statd changes.



Neil Brown wrote:
> A particular exception is fixes for mount.nfs w.r.t. handling of
> MS_REMOUNT. I haven't figured out what that is all about yet and it
> is getting late. :-)
Well the main purpose of the patch is to stop ment->mnt_opts
from being re-freed during the remount path:

+	/*
+	 * Note: free(ment->mnt_opts) happens in discard_mntentchn()
+	 * via update_mtab() on remounts
+	 */
+	if (!remount)
+		free(ment->mnt_opts);
+}

The second part of the patch is basically cleaning
up how the default nfs version is set... preparing
for the day when v4 is the default...

steved.


2007-03-16 14:30:35

by Kevin Coffman

[permalink] [raw]
Subject: Re: HEADS-UP: nearing nfs-utils 1.1.0 and statd changes.

Hi Neil,
Is it still the intention to change the default to build mount.nfs by
default for 1.1.0?

K.C.


2007-03-16 18:10:51

by J. Bruce Fields

[permalink] [raw]
Subject: Re: HEADS-UP: nearing nfs-utils 1.1.0 and statd changes.

On Fri, Mar 16, 2007 at 07:00:21PM +1100, Neil Brown wrote:
> Statd currently does several very different things.
> 1/ It listens for monitor requests from lockd and creates
> files in /var/lib/nfs/sm/HOST
> 2/ It listens for notifications from peers and tells lockd
> that those peers have restarted.
> 3/ It moves files from sm/ to sm.bak/ and then tries to
> notify every host listed in sm.bak/
>
> The first is very similar to what NFSv4 needs for state management,
> though is somewhat simpler. I would like to create a better
> interface for the kernel to ask for state to be stored, and then use
> it for NFSv4 and NLM, subsuming this function.

NFSv4 needs something like the third as well--knfsd needs to know on
startup the list of clients that will be allowed to reclaim state from a
previous boot instance. (This is to protect clients that *think*
they're still holding locks on the server, but (thanks to a network
partition) don't realize that the server has actually rebooted twice.)

--b.


2007-03-18 22:52:23

by NeilBrown

[permalink] [raw]
Subject: Re: HEADS-UP: nearing nfs-utils 1.1.0 and statd changes.

On Friday March 16, [email protected] wrote:
> Hi Neil,
> Is it still the intention to change the default to build mount.nfs by
> default for 1.1.0?

Yes. Which means the code needs some careful review as it needs to be
setuid-root. So between -rc1 and -final I'll be looking at it
carefully, and it would be great if others did too....

NeilBrown


2007-03-18 23:02:13

by NeilBrown

[permalink] [raw]
Subject: Re: HEADS-UP: nearing nfs-utils 1.1.0 and statd changes.

On Friday March 16, [email protected] wrote:
> At 04:00 AM 3/16/2007, Neil Brown wrote:
> >I'm keen on getting to nfs-utils-1.1.0 relatively soon and so have
> >been pushing towards it.
>
> Can you define "relatively"? Weeks?

I was deliberately being vague :-)
Soon 'relative' to the typical inter-release time for nfs-utils ??

I'd like to have an -rc1 this week or next week.
I'd like lots of people to test it.
I'd like to develop a test-suite for nfs-utils.
I'd like to make final release about 2 weeks after the -rc.

>
> I guess I'd be concerned about a good long testing cycle before the
> first bump of the second version in, like, forever. Just a perception
> thing from the user community.

Fair enough .... but who runs a .0 anyway? Doesn't everyone wait for
.1??
But yes, some testing time is important.

>
> >One thing I have been putting thought into is improvements for statd.
> >Current SLES releases have statd in the kernel and I don't want to
> >continue that, but instead want to make sure that user-space statd
> >provides at least equally good service...
>
> Does this mean SLES can stop using its own statd if these changes are
> done? If so, that's a good goal.

I hope so. I don't expect OpenSuSE-10.3 to have an in-kernel statd (that is
not an official statement from SuSE or Novell...)

>
> >I have also arranged that the new "mount.nfs" will try to start statd
> >if that seems to be appropriate (via a script so statd options can be
> >specified).
>
> What conditions make it appropriate? Does it do some kind of nlm ping
> for instance?

Currently it just tests /var/run/rpc.statd.pid.
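
As a rough illustration of that kind of test, here is a minimal Python sketch (not the mount.nfs code, which is C). Only the path comes from the mail above; the function names are hypothetical, and the second function is a stricter variant that the mail does not describe, included only to show what the cheap test cannot tell you.

```python
import os

PID_FILE = "/var/run/rpc.statd.pid"  # path as given in the mail


def pid_file_present(path=PID_FILE):
    """Cheap test (assumed meaning of 'tests'): does the pid file exist?"""
    return os.path.exists(path)


def pid_file_names_live_process(path=PID_FILE):
    """Hypothetical stricter variant: the file must name a running process."""
    try:
        with open(path) as f:
            pid = int(f.read().strip())
    except (OSError, ValueError):
        return False  # missing, unreadable, or not a number
    if pid <= 0:
        return False
    try:
        os.kill(pid, 0)  # signal 0 only checks that the process exists
    except ProcessLookupError:
        return False     # stale pid file
    except PermissionError:
        return True      # process exists but belongs to another user
    return True
```

A stale pid file left behind by a crashed statd passes the first test but fails the second, which is the main weakness of testing the file alone.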

>
> Hurray! This is a great approach, especially since the notifies can
> take a long, long time and interfere with other statd functions
> while they're in progress. We did some research on this process
> here as part of other work, btw. I'll look for the list of issues we
> found, there were a couple of interesting ones.

Certainly if there are 'issues' with current statd I'd be very happy
to hear about them.

>
> >B/ Add functionality to the kernel to register and listen to statd
> > 'notify' requests and act upon them immediately. This would have
> > to default to off and only be enabled if a new nfs-utils requested
> > it.
>
> I assume you mean the client kernel, or both? In either case...

Both.

>
> But but but, doesn't this move statd back into the kernel like you
> weren't trying to do? Also, how does it deal with resolving the
> mon_name in the sm_notify? Won't it depend on a callout (ouch)?

The part of statd that I particularly want to keep out of the kernel
is the creating of files in /var/lib/nfs/sm/. Listening to RPC
requests and updating internal state accordingly is not very much
different from what lockd does and would seem to fit quite neatly in
the kernel with lockd.

SuSE's kstatd seems to cope without resolving mon_name.
One option is to just ignore it and use the source IP address. That
would work fine with the default config which uses IP address for all
host identification.
However this doesn't work well for multi-homed hosts and there is a
relatively new sysctl to say "use hostnames". In that case, maybe we
require the sysadmin to arrange things so a strcmp is all that is
needed.
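
The two matching strategies can be sketched as below. This is a Python illustration under stated assumptions, not kernel or kstatd code: the function name, the `monitored` set and the `use_hostnames` flag are all hypothetical stand-ins for whatever lockd actually records and for the sysctl mentioned above.

```python
def find_monitored_peer(monitored, source_ip, mon_name, use_hostnames):
    """Return the monitored identifier a notify refers to, or None.

    monitored      -- set of identifiers exactly as they were recorded
                      (IP address strings, or hostnames in hostname mode)
    source_ip      -- address the notify arrived from
    mon_name       -- name carried inside the notify
    use_hostnames  -- hypothetical stand-in for the "use hostnames" sysctl
    """
    # Default mode ignores mon_name and trusts the source address.
    # Hostname mode does an exact, case-sensitive comparison (the
    # strcmp case), with no name resolution at all.
    key = mon_name if use_hostnames else source_ip
    return key if key in monitored else None


# Default config: peers identified by IP address.
by_ip = {"192.0.2.7"}
print(find_monitored_peer(by_ip, "192.0.2.7", "client.example.com", False))

# Multi-homed peer sending from its other address: no match by IP.
print(find_monitored_peer(by_ip, "192.0.2.8", "client.example.com", False))

# Hostname mode: matches whatever the source address is, provided the
# name is spelled exactly as it was recorded.
by_name = {"client.example.com"}
print(find_monitored_peer(by_name, "192.0.2.8", "client.example.com", True))
```

The exact-match requirement is the cost of avoiding resolution: a peer that announces itself under a different spelling or alias of its name would simply not be found.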

However if it turns out that name-resolution is really needed in this
path, then I'll definitely keep that part of statd out of the kernel.

Thanks for your thoughts.

NeilBrown


2007-03-18 23:49:28

by NeilBrown

Subject: Re: HEADS-UP: nearing nfs-utils 1.1.0 and statd changes.

On Friday March 16, [email protected] wrote:
> On Fri, Mar 16, 2007 at 07:00:21PM +1100, Neil Brown wrote:
> > Statd currently does several very different things.
> > 1/ It listens for monitor requests from lockd and creates
> > files in /var/lib/nfs/sm/HOST
> > 2/ It listens for notifications from peers and tells lockd
> > that those peers have restarted.
> > 3/ It moves files from sm/ to sm.bak/ and then tries to
> > notify every host listed in sm.bak/
> >
> > The first is very similar to what NFSv4 needs for state management,
> > though is somewhat simpler. I would like to create a better
> > interface for the kernel to ask for state to be stored, and then use
> > it for NFSv4 and NLM, subsuming this function.
>
> NFSv4 needs something like the third as well--knfsd needs to know on
> startup the list of clients that will be allowed to reclaim state from a
> previous boot instance. (This is to protect clients that *think*
> they're still holding locks on the server, but (thanks to a network
> partition) don't realize that the server has actually rebooted twice.)

Similar... but different...

You would want to forget about clients who haven't reclaimed when the
'grace period' expires. Yes?
So when the grace period starts, you move state from "current" to
"recovering".
Then when a client tries to recover, we check in 'recovering' and if
we find something, we recreate the state in 'current'.
Then when the grace period ends, we remove everything from
'recovering'.
So if the server reboots twice without actually completing a grace
period, the client would still be safe.
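
The scheme above can be sketched as a small model. This is only an illustration in Python: the class and method names are hypothetical, the two sets stand in for whatever stable storage would really be used, and it assumes that moving state into 'recovering' merges with entries left over from a grace period that never finished.

```python
class ReclaimStore:
    """Toy model of the 'current' / 'recovering' client lists."""

    def __init__(self):
        self.current = set()     # clients holding state in this boot
        self.recovering = set()  # clients allowed to reclaim

    def record(self, client):
        """A client establishes new state during normal operation."""
        self.current.add(client)

    def start_grace(self):
        """Server boot: move 'current' into 'recovering'.

        Merging (rather than replacing) keeps entries from an earlier
        grace period that was cut short by another reboot.
        """
        self.recovering |= self.current
        self.current = set()

    def reclaim(self, client):
        """Allow a reclaim only for clients listed in 'recovering'."""
        if client in self.recovering:
            self.current.add(client)
            return True
        return False

    def end_grace(self):
        """Grace period over: forget everyone who did not reclaim."""
        self.recovering = set()


store = ReclaimStore()
store.record("clientA")
store.record("clientB")

store.start_grace()               # first reboot
assert store.reclaim("clientA")   # A reclaims promptly
store.start_grace()               # second reboot, grace never completed
assert store.reclaim("clientB")   # B is still listed, so still safe
```

If the first grace period had completed before the second reboot, clientB would have been dropped by `end_grace()` and its later reclaim refused, which is the protection asked for in the quoted mail.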

Does that fit your understanding?

Thanks for reminding me of that.

NeilBrown
