2008-02-13 10:20:49

by Andrew Morton

Subject: Re: Upgrade to 2.6.24 breaks NFS service

(cc linux-nfs)

On Wed, 13 Feb 2008 08:58:03 +0000 Nix <[email protected]> wrote:

> I upgraded from 2.6.23.10 to 2.6.24.2 yesterday, and found NFS service
> failing.
>
> To be specific, all locks were blocking forever, with an endless flood
> of
>
> Feb 12 22:53:10 loki notice: kernel: statd: server localhost not responding, timed out
> Feb 12 22:53:10 loki notice: kernel: lockd: cannot monitor esperi
> Feb 12 22:53:45 loki notice: kernel: statd: server localhost not responding, timed out
> Feb 12 22:53:45 loki notice: kernel: lockd: cannot monitor esperi
> Feb 12 22:54:20 loki notice: kernel: statd: server localhost not responding, timed out
> Feb 12 22:54:20 loki notice: kernel: lockd: cannot monitor esperi
> Feb 12 22:54:55 loki notice: kernel: statd: server localhost not responding, timed out
> Feb 12 22:54:55 loki notice: kernel: lockd: cannot monitor esperi
> Feb 12 22:55:20 esperi notice: kernel: lockd: server loki.wkstn.nix not responding, still trying
>
> (esperi is a UML instance on the same machine, connected via a bridged
> TUN/TAP interface: the bridge, and network service to esperi and to the
> rest of the local net across that bridge, was fine.)
>
> I'm currently using NFSv3 atop nfs-utils 1.1.0.33-gdd08789, with daemons
> being started in the suggested order (portmap, mountd, statd
> --no-notify, nfsd). This evening I'm going to try to upgrade to
> nfs-utils HEAD and see if this continues.
>



2008-02-13 12:57:53

by Jeff Layton

Subject: Re: Upgrade to 2.6.24 breaks NFS service

On Wed, 13 Feb 2008 02:19:12 -0800
Andrew Morton <[email protected]> wrote:

> (cc linux-nfs)
>
> On Wed, 13 Feb 2008 08:58:03 +0000 Nix <[email protected]> wrote:
>
> > I upgraded from 2.6.23.10 to 2.6.24.2 yesterday, and found NFS
> > service failing.
> >
> > To be specific, all locks were blocking forever, with an endless
> > flood of
> >
> > Feb 12 22:53:10 loki notice: kernel: statd: server localhost not responding, timed out
> > Feb 12 22:53:10 loki notice: kernel: lockd: cannot monitor esperi
> > Feb 12 22:53:45 loki notice: kernel: statd: server localhost not responding, timed out
> > Feb 12 22:53:45 loki notice: kernel: lockd: cannot monitor esperi
> > Feb 12 22:54:20 loki notice: kernel: statd: server localhost not responding, timed out
> > Feb 12 22:54:20 loki notice: kernel: lockd: cannot monitor esperi
> > Feb 12 22:54:55 loki notice: kernel: statd: server localhost not responding, timed out
> > Feb 12 22:54:55 loki notice: kernel: lockd: cannot monitor esperi
> > Feb 12 22:55:20 esperi notice: kernel: lockd: server loki.wkstn.nix not responding, still trying
> >
> > (esperi is a UML instance on the same machine, connected via a
> > bridged TUN/TAP interface: the bridge, and network service to
> > esperi and to the rest of the local net across that bridge, was
> > fine.)
> >
> > I'm currently using NFSv3 atop nfs-utils 1.1.0.33-gdd08789, with
> > daemons being started in the suggested order (portmap, mountd, statd
> > --no-notify, nfsd). This evening I'm going to try to upgrade to
> > nfs-utils HEAD and see if this continues.
> >
>

If upgrading nfs-utils doesn't help, could you run this on the box:

# rpcinfo -p localhost

and send the output? statd expects lockd to always be listening on a
UDP socket, but some recent changes mean that when there are only TCP
mounts it doesn't necessarily do so. That may be the problem here.
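
For anyone who wants to check this mechanically, the following is a
minimal sketch, not a definitive tool. It assumes the usual five-column
`rpcinfo -p` layout (program, vers, proto, port, service) and looks for
RPC program number 100021, the number assigned to the NLM service that
lockd registers. The function names lockd_transports and
check_local_lockd are made up for illustration.

```python
import subprocess

# ONC RPC program number of the NLM service (registered by lockd).
NLOCKMGR_PROGRAM = 100021


def lockd_transports(rpcinfo_output: str) -> set:
    """Return the transports (e.g. "udp", "tcp") on which lockd is registered."""
    transports = set()
    for line in rpcinfo_output.splitlines():
        fields = line.split()
        # Data rows look like: program vers proto port [service].
        # The header row and blank lines do not start with a number.
        if len(fields) < 4 or not fields[0].isdigit():
            continue
        if int(fields[0]) == NLOCKMGR_PROGRAM:
            transports.add(fields[2])
    return transports


def check_local_lockd() -> bool:
    """Run `rpcinfo -p localhost` and report whether lockd is registered on UDP."""
    result = subprocess.run(
        ["rpcinfo", "-p", "localhost"],
        capture_output=True, text=True, check=True,
    )
    return "udp" in lockd_transports(result.stdout)
```

If check_local_lockd() returns False while the "cannot monitor"
messages are appearing, that would be consistent with the missing UDP
listener described above, though it would not prove it.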

--
Jeff Layton <[email protected]>

2008-02-14 00:14:50

by Nix

Subject: Re: Upgrade to 2.6.24 breaks NFS service

On 13 Feb 2008, Jeff Layton told this:

> If upgrading nfs-utils doesn't help, could you run this on the box:
>
> # rpcinfo -p localhost
>
> and send the output? statd expects lockd to always be listening on a
> UDP socket, but some recent changes mean that when there are only TCP
> mounts it doesn't necessarily do so. That may be the problem here.

I rebooted back into 2.6.24.2 again, and everything works now, without
even upgrading nfs-utils (although I did that anyway, to nfs-utils git
head, and it's still happy).

Linux: debugs itself, no human intervention required! :)



(The inconsistency of this screams `port allocation' to me. If it
happens again I'll get some rpcinfo output and packet dumps. I'd
have done it this time if it hadn't been for the plug-the-security-
hole rush and it being 2am.)

--
`The rest is a tale of post and counter-post.' --- Ian Rawlings
describes USENET