2007-03-22 04:54:46

by NeilBrown

Subject: Re: HEADS-UP: nearing nfs-utils 1.1.0 and statd changes.

On Monday March 19, [email protected] wrote:
> At 07:02 PM 3/18/2007, Neil Brown wrote:
> >SuSE's kstatd seems to cope without resolving mon_name.
>
> I'd be interested in the details. It's been our observation that very few
> statd implementations work properly in all cases.

It doesn't do anything particularly clever.
By default it depends on IP address matching, so as long as you don't
have multi-homed hosts everything is easy.
If you do have multi-homed hosts you need to enable host-name matching
(via a sysctl) and ensure that hostnames are used consistently.
i.e. the hostname used when you mount a filesystem must be the same as
that which sm-notify on the server will use in mon_name when sending
notify that it has rebooted.

In the other direction, sm-notify on the NFS client must simply set mon_name
to "uname -n" as that will be the 'caller_name' in the LOCK request.
Not too hard to set up. You just need to make sure when you mount that
you use the name that the server gets from 'uname -n'. Probably easy
to get wrong though, and you never know it is wrong until you reboot.

I guess that to be completely safe, a NOTIFY should be sent from every
IP that the local host has, to every IP that every peer has, with
mon_name set to the source IP address, and then again with mon_name set
to the host name, both simple and FQDN.
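
A minimal sketch of that exhaustive fan-out, only to show how quickly it
grows. The function name, the example addresses and the host names are all
made up for illustration; nothing here is taken from sm-notify itself.

```python
from itertools import product


def notify_plan(local_ips, peer_ips, short_name, fqdn):
    """Return (source_ip, dest_ip, mon_name) tuples for an exhaustive NOTIFY fan-out."""
    plan = []
    for src, dst in product(local_ips, peer_ips):
        # mon_name as the source address, then as each form of the host name.
        for mon_name in (src, short_name, fqdn):
            entry = (src, dst, mon_name)
            if entry not in plan:  # skip repeats, e.g. short name == FQDN
                plan.append(entry)
    return plan


# Hypothetical two-interface server with one single-homed peer.
plan = notify_plan(
    ["192.0.2.1", "198.51.100.1"],
    ["203.0.113.7"],
    "server",
    "server.example.com",
)
for src, dst, mon_name in plan:
    print(f"{src} -> {dst}  mon_name={mon_name}")
```

The number of messages is (local IPs) x (peer IPs) x (name forms), so it
gets large quickly on hosts with many interfaces or many peers.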

And when you receive a NOTIFY, you should do a DNS lookup to find the
FQDN and then all the IP addresses, and then assume a reboot for any
local state that matches any of that.

Which does seem to suggest that DNS lookup is needed on receipt of
NOTIFY, so that part cannot really go in the kernel.
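
A rough user-space sketch of that receive-side expansion, assuming IPv4
only and assuming the monitored peers are kept as plain strings (names or
addresses). The function names and the injectable `resolver` parameter are
hypothetical, added so the example can run without real DNS.

```python
import socket


def names_and_addresses(mon_name, resolver=socket.gethostbyname_ex):
    """Expand mon_name into the set of names and IPv4 addresses it may stand for."""
    identities = {mon_name.lower()}
    try:
        canonical, aliases, addresses = resolver(mon_name)
    except OSError:
        # Lookup failed: fall back to matching the literal mon_name only.
        return identities
    identities.add(canonical.lower())
    identities.update(alias.lower() for alias in aliases)
    identities.update(addresses)
    return identities


def rebooted_peers(mon_name, monitored, resolver=socket.gethostbyname_ex):
    """Return the monitored peer keys that match any identity of mon_name."""
    identities = names_and_addresses(mon_name, resolver)
    return [peer for peer in monitored if peer.lower() in identities]


def fake_resolver(name):
    # Stand-in for DNS: canonical name, aliases, address list.
    return ("server.example.com", ["server"], ["192.0.2.1", "198.51.100.1"])


monitored = ["198.51.100.1", "other.example.com"]
matches = rebooted_peers("server", monitored, resolver=fake_resolver)
print(matches)
```

Here the peer was recorded by its second address, yet a NOTIFY carrying
only the short name still matches it, because the lookup happens first.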

I might need to think about this a bit more...

NeilBrown

-------------------------------------------------------------------------
Take Surveys. Earn Cash. Influence the Future of IT
Join SourceForge.net's Techsay panel and you'll get the chance to share your
opinions on IT & business topics through brief surveys-and earn cash
http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV
_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs


2007-03-22 12:36:19

by Talpey, Thomas

Subject: Re: HEADS-UP: nearing nfs-utils 1.1.0 and statd changes.

At 12:54 AM 3/22/2007, Neil Brown wrote:
>If you do have multi-homed hosts you need to enable host-name matching
>(via a sysctl) and ensure that hostnames are used consistently.
>i.e. the hostname used when you mount a filesystem must be the same as
>that which sm-notify on the server will use in mon_name when sending
>notify that it has rebooted.

Sort of. The server's mon_name needs to map back to the IP address that
the client mounted. It doesn't have to be literally the servername that the
client mounted; in fact the server doesn't know that. Besides, there's
multihoming, DHCP, etc.

The point is: the *hostnames* are the invariant in NLM recovery.

>In the other direction, sm-notify on the NFS client must simply set mon_name
>to "uname -n" as that will be the 'caller_name' in the LOCK request.

Correct.

>I guess that to be completely safe, a NOTIFY should be sent from every
>IP that the local host has, to every IP that every peer has, with
>mon_name set to the source IP address, and then again with mon_name set
>to the host name, both simple and FQDN.

Some hosts go even further: they send a dotted-quad address in another
notify. And, some machines (I won't name names) fail to handle this, because
they pass the mon_name directly to the DNS resolver library without trying
an inet_aton().
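
A small sketch of the check that is being skipped, assuming IPv4 only: try
the address-literal parse before handing mon_name to the resolver. The
function name and the `resolver` parameter are hypothetical; Python's
`socket.inet_aton()` stands in for the C call of the same name.

```python
import socket


def addresses_for(mon_name, resolver=socket.gethostbyname_ex):
    """Return IPv4 addresses for mon_name, accepting an address literal without DNS."""
    try:
        packed = socket.inet_aton(mon_name)
    except OSError:
        # Not an address literal: treat it as a host name.
        return resolver(mon_name)[2]
    # Normalise the literal back to canonical dotted-quad text.
    return [socket.inet_ntoa(packed)]


print(addresses_for("192.0.2.1"))
```

Note that inet_aton() also accepts abbreviated forms such as "127.1", so a
stricter parser may be preferable if only full dotted quads should count.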

>And when you receive a NOTIFY, you should do a DNS lookup to find the
>FQDN and then all the IP addresses, and then assume a reboot for any
>local state that matches any of that.

Yes, you need to go from the provided hostname to all its IP aliases. You
can throw in the source address you got the notify from, but in theory that's
one of them. To be VERY sure, the server should send all these notifies from
each and every one of its interfaces. This ensures that the message gets
routed successfully, and also that clients who don't do the full resolve (or
see a DNS error etc) have a prayer of knowing what to do.

Even then, the server has to guess. A reply to a notify doesn't mean the
notify succeeded, because (as I mentioned in an earlier message), the notify
is a void procedure, so any statd will reply whether or not it cares. A client
that starts reclaiming is a good indication that something was reached, but
since there's no reclaim-done indicator, the server has no clue that the client
is finished.

Bottom line, the server has to do everything it possibly can, and then pray.

Tom.

