2005-04-12 20:55:41

by Edward Hibbert

Subject: Problem with NLM on Fedora

> I've just installed Fedora, and am having problems with NFS.
>
> What I'm seeing is that NLM requests are not issued to the NFS server (which is a SNAP Server). Instead the application hangs for about 7 seconds, and then returns successfully (i.e. the locks are ostensibly granted). The delay happens on both lock and unlock requests. Via tcpdump I see NFS requests/responses, but no RPC calls for NLM.
>
> Here's what uname -a says about my system.
>
> Linux rack10.datcon.co.uk 2.6.9-1.667smp #1 SMP Tue Nov 2 14:59:52 EST 2004 i686 i686 i386 GNU/Linux
>
> Here's the mount point
>
> 172.19.15.109:/mngdata/keepalive on /opt/dcl/keepalive type nfs (rw,rsize=4096,wsize=4096,hard,intr,bg,lock,nfsvers=3,addr=172.19.15.109)
>
> pstack shows I'm hung here:
>
> 0x002447a2: _dl_sysinfo_int80 + 0x2 (b, e, fefe2080, 0, 0, 0) + 1400
>
> I downloaded nfs-utils 1.0.7, but that didn't help. I also saw this:
>
> A. There are permissions on the /var/lib/nfs/sm and /var/lib/nfs/sm.bak directories that must be addressed. Whoever rpc.statd is running as must have ownership of, and rw access to, those directories. The permissions should be set to 700 for both. In addition, etab, rmtab, and xtab must all exist and be writable by root.
>
> So I tried changing the permissions:
>
> /var/lib/nfs:
> total 32
> -rwxrwxrwx 1 root root 0 Apr 12 11:58 etab
> -rwxrwxrwx 1 root root 0 Apr 12 11:58 rmtab
> drwxr-xr-x 7 root root 0 Apr 12 12:29 rpc_pipefs
> drwx------ 2 root root 4096 Apr 12 11:58 sm
> drwx------ 2 root root 4096 Apr 12 11:58 sm.bak
> drwxrwxrwx 4 root root 4096 Apr 12 06:28 statd
> -rwxrwxrwx 1 root root 0 Apr 12 11:58 state
> -rwxrwxrwx 1 root root 0 Apr 12 11:58 xtab
>
> /var/lib/nfs/statd:
> total 12
> drwxrwxrwx 2 root root 4096 Apr 12 12:31 sm
> drwxrwxrwx 2 root root 4096 Apr 12 12:29 sm.bak
> -rwxrwxrwx 1 root root 4 Apr 12 12:29 state
>
> ...and I've rebooted. None of this has helped.
>
> Any suggestions?
>
> Edward.
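
To put a number on the delay described above, something like the following could be run against a file on the affected mount. This is only a rough sketch: the helper names (time_lock_cycle, demo) are made up for illustration, and the demo uses a local temporary file rather than anything under /opt/dcl/keepalive, so it shows the technique, not the problem.

```python
import fcntl
import os
import tempfile
import time


def time_lock_cycle(path):
    """Return (lock_seconds, unlock_seconds) for one exclusive lock/unlock on path."""
    with open(path, "a+b") as handle:
        start = time.monotonic()
        # POSIX byte-range lock; on an NFSv3 mount with locking enabled
        # this is the kind of request that is handed to NLM.
        fcntl.lockf(handle, fcntl.LOCK_EX)
        locked = time.monotonic()
        fcntl.lockf(handle, fcntl.LOCK_UN)
        unlocked = time.monotonic()
    return locked - start, unlocked - locked


def demo():
    """Time one cycle on a local scratch file (hypothetical stand-in for the NFS path)."""
    with tempfile.TemporaryDirectory() as scratch:
        return time_lock_cycle(os.path.join(scratch, "locktest"))


lock_s, unlock_s = demo()
print(f"lock: {lock_s:.3f}s  unlock: {unlock_s:.3f}s")
```

Calling time_lock_cycle() with a path on the NFS mount instead of the scratch file would show whether both the lock and the unlock take the several seconds reported here.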


2005-04-13 23:44:10

by Steve Dickson

Subject: Re: Problem with NLM on Fedora

[email protected] wrote:
> I've just installed Fedora, and am having problems with NFS.
>
> What I'm seeing is that NLM requests are not issued to the NFS server
> (which is a SNAP Server). Instead the application hangs for about 7
> seconds, and then returns successfully (i.e. the locks are ostensibly
> granted). The delay happens on both lock and unlock requests. Via
> tcpdump I see NFS requests/responses, but no RPC calls for NLM.
Could you post a bzip2-ed ethereal trace (meaning
tethereal -w /tmp/dump.pcap; bzip2 /tmp/dump.pcap) of this?

There are some known locking issues with FC3
(see: https://bugzilla.redhat.com/beta/show_bug.cgi?id=150151)
but this appears to be a bit different...

steved.


_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs

2005-04-14 10:06:01

by Edward Hibbert

Subject: RE: Problem with NLM on Fedora

Digging a bit deeper, I was wrong to say there are no NLM packets flowing. Sorry.

What I think is happening is that the NLM requests are being lost, and the delays I'm seeing are due to retransmission. One difference I notice in the tcpdump trace is that Fedora is using TCP connections for the NLM traffic, whereas Red Hat 8 (running successfully against the same filer) is using UDP.

Although I'd expect TCP to be more reliable, it's possible that the filer isn't very good at TCP traffic. TCP support was only recently added to this SNAP model; it used to support only UDP. So maybe it's flaky.

Is it possible to force the NLM traffic over UDP from the client side?

Edward.
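
One way to check which transports a server advertises for NLM is to look at the nlockmgr rows (RPC program 100021) in the output of rpcinfo -p. A minimal sketch of that check follows; the function name nlm_transports and the SAMPLE text are made up for illustration, and the sample is not output from the SNAP Server in this thread.

```python
NLM_PROGRAM = 100021  # ONC RPC program number of the NFS lock manager (nlockmgr)


def nlm_transports(rpcinfo_output):
    """Map each transport ('tcp'/'udp') to the sorted NLM versions registered on it."""
    transports = {}
    for line in rpcinfo_output.splitlines():
        fields = line.split()
        # Data rows look like: program vers proto port [service].
        # Skip the header line and anything too short to be a data row.
        if len(fields) < 4 or not fields[0].isdigit():
            continue
        program, version, proto = int(fields[0]), int(fields[1]), fields[2]
        if program == NLM_PROGRAM:
            transports.setdefault(proto, set()).add(version)
    return {proto: sorted(versions) for proto, versions in transports.items()}


# Made-up sample in the format printed by `rpcinfo -p <server>`.
SAMPLE = """\
   program vers proto   port  service
    100000    2   tcp    111  portmapper
    100000    2   udp    111  portmapper
    100003    3   udp   2049  nfs
    100003    3   tcp   2049  nfs
    100021    1   udp  32768  nlockmgr
    100021    4   udp  32768  nlockmgr
    100021    1   tcp  32769  nlockmgr
    100021    4   tcp  32769  nlockmgr
"""

print(nlm_transports(SAMPLE))  # {'udp': [1, 4], 'tcp': [1, 4]}
```

A server that lists nlockmgr on tcp is one the client may try to reach over TCP; a server with only udp rows leaves the client no choice but UDP.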


2005-04-14 10:51:37

by Edward Hibbert

Subject: RE: Problem with NLM on Fedora

Moving my data onto a filer which doesn't expose TCP (and therefore forces use of UDP) seems to work fine.

So I'm now convinced this isn't a client-side problem. However I need to work with this particular server, so to work around the server-side problem, I'd still like a way to force NLM connections to use UDP rather than TCP.

Anyone know how to do this?

Edward.


2005-04-20 13:40:14

by Neil Horman

Subject: Re: Problem with NLM on Fedora

On Thu, Apr 14, 2005 at 11:51:29AM +0100, [email protected] wrote:
> Moving my data onto a filer which doesn't expose TCP (and therefore forces use of UDP) seems to work fine.
>
> So I'm now convinced this isn't a client-side problem. However I need to work with this particular server, so to work around the server-side problem, I'd still like a way to force NLM connections to use UDP rather than TCP.
>
> Anyone know how to do this?
>
> Edward.
>
I think the NLM client mirrors whatever protocol the NFS client is using, so if
you were to mount the NFS share with proto=udp in the mount options, NLM should
also use udp, rather than tcp.

Regards
Neil
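
As a rough illustration of that suggestion, the option string from the mount line earlier in the thread could be rewritten to carry proto=udp. The helper name force_proto is made up, and the addr= field is left out on the assumption that mount reports it rather than expecting it as input; whether NLM then follows the NFS transport is the untested part.

```python
def force_proto(options, proto="udp"):
    """Return a comma-separated mount option string with the transport set to proto."""
    kept = [
        opt for opt in options.split(",")
        # Drop any existing transport choice so the new one is unambiguous.
        if opt and opt not in ("tcp", "udp") and not opt.startswith("proto=")
    ]
    kept.append(f"proto={proto}")
    return ",".join(kept)


# Options from the original mount line, without addr= (assumed to be filled in by mount).
ORIGINAL = "rw,rsize=4096,wsize=4096,hard,intr,bg,lock,nfsvers=3"

print(force_proto(ORIGINAL))
# rw,rsize=4096,wsize=4096,hard,intr,bg,lock,nfsvers=3,proto=udp
```

The resulting string would go after -o on the mount command line or in the options column of /etc/fstab, followed by a remount and another look at the tcpdump trace to confirm the NLM traffic has moved to UDP.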

--
/***************************************************
*Neil Horman
*Software Engineer
*Red Hat, Inc.
*[email protected]
*gpg keyid: 1024D / 0x92A74FA1
*http://pgp.mit.edu
***************************************************/

