Hi!
I'm experiencing two problems, one seems to be related to NFS locking,
and the other seems to be some kind of timeout.
I'm trying to create a CD which boots with its root file system on
NFS. The CD can mount the NFS file system and run programs from
there. Making this the root file system also works fine. But when
init starts, I get error messages on the server and client, related to
lockd and rpc.statd. The system boots and I can log in as a user from
outside via ssh. But becoming super-user takes about 3 minutes. I'm
also running an X server there. Sometimes I can log in there, but
that also takes about 3 minutes to succeed. Other times it just
hangs, or the keyboard stops working. In fact the keyboard always
fails eventually: even if I manage to log in, it is dead after a few
minutes at most.
The files on the server are the extraction of a tarball I created
from a working HD. There are no firewalls running and both the server
and client use pidentd. I used tcpdump to see if there are any network
errors, but I couldn't find any.
The server is a Debian sid machine with an unpatched 2.4.19 kernel and
the official Debian packages for the kernel server. The client is a
Debian 3.0 (stable) machine with the same kernel, but with only the NFS
client compiled into it, not the kernel NFS server.
This is what the client reports (repeatedly):
nsm_mon_unmon: rpc failed, status=-13
lockd: cannot monitor 192.168.254.15
lockd: failed to monitor 192.168.254.15
I tried to trace the command from the rcS.d scripts which would cause
them, and, if there is no delay, the first comes in checkroot.sh, when
"mount -f -o remount /" is given. I've read that status=-13 tells that
statd is missing, but at this time it's running on the server and on
the client (I checked it inserting a ps command into that
script). There seem to be other commands which also trigger these
error messages.
At the same time, the server reports in /var/log/daemon.log:
rpc.statd[9117]: Can't callback 127.0.0.1 (100021,4), giving up.
rpc.statd[9117]: Received erroneous SM_UNMON request from baco \
for 192.168.254.101
rpc.statd[9117]: notify_host: failed to notify 127.0.0.1
192.168.254.15 is the server (baco) and 192.168.254.101 is the client.
The /etc/exports file on the server has:
/nfsvol 192.168.254.101(rw,sync,no_root_squash)
I also tried with no_auth_nlm, but there was no change. Also on the
server, /etc/hosts.allow has:
ALL: 127.0.0.1
ALL: 192.168.254.0/255.255.255.0
I added the first line later because of the error message, but that
didn't help either. /etc/hosts.deny on the server has no uncommented
line. The /etc/exports and /etc/hosts.deny files on the client are also
empty, while /etc/hosts.allow on the client has:
portmap: 192.168.254.15
lockd: 192.168.254.15
rquotad: 192.168.254.15
mountd: 192.168.254.15
statd: 192.168.254.15
This is the server's IP. As soon as I can log in, "rpcinfo -p" on the
client answers:
program vers proto port
100000 2 tcp 111 portmapper
100000 2 udp 111 portmapper
100024 1 udp 1025 status
100024 1 tcp 1024 status
and on the server:
program vers proto port
100000 2 tcp 111 portmapper
100000 2 udp 111 portmapper
100024 1 udp 37824 status
100024 1 tcp 57427 status
100003 2 udp 2049 nfs
100003 3 udp 2049 nfs
100021 1 udp 37825 nlockmgr
100021 3 udp 37825 nlockmgr
100021 4 udp 37825 nlockmgr
100005 1 udp 37826 mountd
100005 1 tcp 57428 mountd
100005 2 udp 37826 mountd
100005 2 tcp 57428 mountd
100005 3 udp 37826 mountd
100005 3 tcp 57428 mountd
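The two listings above can also be compared mechanically. What follows is only a rough sketch: the function names parse_rpcinfo and names are made up for illustration, and the listings are pasted in as strings rather than obtained by actually running rpcinfo. It shows which services are registered on one side but not the other; it does not say which of them ought to be registered on the client.

```python
def parse_rpcinfo(text):
    """Return a set of (service, version, proto) tuples from `rpcinfo -p` output."""
    services = set()
    for line in text.splitlines():
        fields = line.split()
        # Skip the header line and anything that is not a 5-column entry.
        if len(fields) != 5 or not fields[0].isdigit():
            continue
        _program, version, proto, _port, name = fields
        services.add((name, int(version), proto))
    return services


def names(services):
    """Reduce the parsed entries to the set of service names."""
    return {name for name, _version, _proto in services}


# Listings copied from this message.
CLIENT = """\
program vers proto port
100000 2 tcp 111 portmapper
100000 2 udp 111 portmapper
100024 1 udp 1025 status
100024 1 tcp 1024 status
"""

SERVER = """\
program vers proto port
100000 2 tcp 111 portmapper
100000 2 udp 111 portmapper
100024 1 udp 37824 status
100024 1 tcp 57427 status
100003 2 udp 2049 nfs
100003 3 udp 2049 nfs
100021 1 udp 37825 nlockmgr
100021 3 udp 37825 nlockmgr
100021 4 udp 37825 nlockmgr
100005 1 udp 37826 mountd
100005 1 tcp 57428 mountd
100005 2 udp 37826 mountd
100005 2 tcp 57428 mountd
100005 3 udp 37826 mountd
100005 3 tcp 57428 mountd
"""

missing_on_client = names(parse_rpcinfo(SERVER)) - names(parse_rpcinfo(CLIENT))
print(sorted(missing_on_client))  # ['mountd', 'nfs', 'nlockmgr']
```

nfs and mountd belong to the NFS server side, so their absence on the client is expected here; the sketch cannot tell whether the same holds for nlockmgr.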
The client's /etc/fstab is:
192.168.254.15:/nfsvol / nfs rw,hard,intr 0 0
I also tried with nolock, but that didn't change it.
I also found it strange that "cat /proc/mounts" on the client will
always give:
192.168.254.15:/nfsvol / \
nfs rw,v3,rsize=8192,wsize=8192,hard,udp,lock,addr=192.168.254.15
where v3, rsize, wsize, udp and lock are options I never gave. I've
read that the default for rsize and wsize is 1024, so I don't know
where the 8192 comes from. While watching tcpdump, I've seen
fragmentation, but reassembly always seems to have succeeded.
I would be grateful for any hint for how to solve this. Please CC me,
as I am not on this mailing list.
--
Christoph Simon
[email protected]
-------------------------------------------------------
This sf.net email is sponsored by:ThinkGeek
Welcome to geek heaven.
http://thinkgeek.com/sf
_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs
According to Christoph Simon:
> The server is a debian sid machine with an unpatched 2.4.19 kernel and
> the official debian packages for the kernel server. The client is a
> debian 3.0 (stable) machine with the same kernel, but having only NFS
> client compiled into it, not the kernel nfs server.
Caveat: I don't really know what's going wrong.
However, I think this may be part of the problem:
> /etc/hosts.allow on the client has:
> portmap: 192.168.254.15
> lockd: 192.168.254.15
> rquotad: 192.168.254.15
> mountd: 192.168.254.15
> statd: 192.168.254.15
The client needs to be able to communicate with itself. You should
probably have "ALL: 127.0.0.1" in there as well.
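A quick way to see which daemons the client's hosts.allow names for a given address is sketched below. This is a deliberately naive check, not an implementation of the hosts_access(5) matching rules: it only looks for a literal address in each client list, knows nothing about netmasks or wildcards, and does not consult hosts.deny at all. The function name daemons_listing_address is made up for illustration.

```python
def daemons_listing_address(hosts_allow_text, address):
    """Return the daemon names whose client list literally contains `address`.

    Literal string comparison only; patterns, netmasks and hosts.deny
    are ignored.
    """
    daemons = set()
    for line in hosts_allow_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if ":" not in line:
            continue
        daemon_part, client_part = line.split(":", 1)
        # Keep only the client list, ignoring any optional third field.
        client_part = client_part.split(":", 1)[0]
        clients = client_part.replace(",", " ").split()
        if address in clients:
            daemons.update(daemon_part.replace(",", " ").split())
    return daemons


# The client's hosts.allow as quoted above.
CURRENT = """\
portmap: 192.168.254.15
lockd: 192.168.254.15
rquotad: 192.168.254.15
mountd: 192.168.254.15
statd: 192.168.254.15
"""

# The same file with the suggested loopback line added.
SUGGESTED = CURRENT + "ALL: 127.0.0.1\n"

print(daemons_listing_address(CURRENT, "127.0.0.1"))    # set()
print(daemons_listing_address(SUGGESTED, "127.0.0.1"))  # {'ALL'}
```

As quoted, no line in the file mentions 127.0.0.1; with the extra line, the ALL entry does.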
Also, in a possibly unrelated matter, I suggest that the client mount
a ramdisk on /var before /etc/init.d/nfs-common starts statd. Make
sure that /var/lib/nfs exists before statd starts. This is
probably a good idea anyway given that you surely don't want lots of
clients sharing /var.
> nsm_mon_unmon: rpc failed, status=-13
> lockd: cannot monitor 192.168.254.15
> lockd: failed to monitor 192.168.254.15
>
> I tried to trace the command from the rcS.d scripts which would cause
> them, and, if there is no delay, the first comes in checkroot.sh, when
> "mount -f -o remount /" is given. I've read that status=-13 tells that
> statd is missing, but at this time it's running on the server and on
> the client [...]
How is it running on the client? IIRC, checkroot.sh runs long before
'/etc/init.d/nfs-common start'.
> I also found it strange, that "cat /proc/mounts" on the client will
> always give:
>
> 192.168.254.15:/nfsvol / \
> nfs rw,v3,rsize=8192,wsize=8192,hard,udp,lock,addr=192.168.254.15
>
> where v3, rsize, wsize, udp and lock are options I never gave.
Those are the defaults for the given options.
> read that the defaults for rsize,wsize is 1024,
That was long ago. Nowadays, 8K is the default.
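To see exactly which options were filled in rather than requested, the fstab options can be compared with the /proc/mounts line. A minimal sketch, using the two option strings quoted in this thread; parse_options is a made-up helper, not part of any NFS tool:

```python
def parse_options(option_string):
    """Split a comma-separated mount option string into a dict.

    Flags without a value (e.g. "hard") map to an empty string.
    """
    options = {}
    for item in option_string.split(","):
        key, _sep, value = item.partition("=")
        options[key] = value
    return options


# Option strings copied from the fstab entry and the /proc/mounts output.
FSTAB_OPTIONS = "rw,hard,intr"
PROC_MOUNTS_OPTIONS = "rw,v3,rsize=8192,wsize=8192,hard,udp,lock,addr=192.168.254.15"

requested = parse_options(FSTAB_OPTIONS)
effective = parse_options(PROC_MOUNTS_OPTIONS)

# Options reported by the kernel that were never given in fstab.
filled_in = {k: v for k, v in effective.items() if k not in requested}
# Options given in fstab that the kernel does not list.
not_listed = sorted(k for k in requested if k not in effective)

print(sorted(filled_in))  # ['addr', 'lock', 'rsize', 'udp', 'v3', 'wsize']
print(not_listed)         # ['intr']
```

An option that is not listed in the kernel's output is not necessarily inactive; the comparison only shows what the output reports.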
--
Chip Salzenberg - a.k.a. - <[email protected]>
"It furthers one to have somewhere to go."