2003-02-09 23:09:31

by David Ford

Subject: Current NFS issues (2.5.59)

OK, here goes. I have two servers that NFS-mount from each other and also
provide exports to other clients.

Server 1 exports A, B, and C to server 2. Server 2 exports D and E back
to server 1 and exports F and G to two other clients. Each of these
(A-G) is a distinct filesystem path, not part of any other.

1. If server 1 is restarted, server 2 will invalidate (make all 'df'
values '1') F and G. This requires an 'exportfs -vra' or similar on
server 2 to fix the client 'df' values. The client doesn't need to do
anything.

2. Repeated nfs system stops and starts (/etc/init.d/nfs restart) will
eventually cause a kernel panic on server 2 (haven't tested on server
1). The number of restarts is variable.

3. Mount point F (/home/david) infrequently loops. ls -la /home/david
will loop forever until all client memory is exhausted and the kernel
kills it via OOM. ls -la /home/david/somefile or /home/david/somedir/
works just fine, as does any subdirectory under /home/david.
Restarts of both systems refuse to fix things.

4. Mounts infrequently get "permission denied" messages on the client
with an "rpc.mountd: getfh failed: Operation not permitted" message on
the server. This is fixable by restarting the NFS service on the server.


Server 1 is a uniprocessor (UP), server 2 is SMP. All servers and clients are stock
2.5.59[1]. NFS is running on top of Reiserfs filesystems on all client
and server machines.

I'll be happy to apply test patches to either clients or servers.

David

[1] One client is 2.5.56, but it rarely accesses the NFS mounts, unlike the
other machines, which use them constantly.



2003-02-11 03:55:57

by NeilBrown

Subject: Re: Current NFS issues (2.5.59)

On Sunday February 9, [email protected] wrote:
> Ok. Here goes. I have two servers that NFS mount from each other and
> provide.

Thank you for using the development kernel and sharing your woes...

>
> Server 1 exports A, B, and C to server 2. Server 2 exports D and E back
> to server 1 and exports F and G to two other clients. Each of these
> (A-G) are distinctly different filesystem paths and not part of each other.
>
> 1. If server 1 is restarted, server 2 will invalidate (make all 'df'
> values '1') F and G. This requires an 'exportfs -vra' or similar on
> server 2 to fix the client 'df' values. The client doesn't need to do
> anything.

This has me completely mystified. If I understand correctly, an event
on server 1 causes a failure to communicate between server 2 and some
third party.

I can only imagine that as server 1 boots it does something to server
2. At the very least it sends a mount request for D and E. I'm not
sure how a mount request for D or E would affect F or G.
Are you using an automounter at all?

>
> 2. Repeated nfs system stops and starts (/etc/init.d/nfs restart) will
> eventually cause a kernel panic on server 2 (haven't tested on server
> 1). The number of restarts is variable.

Can you capture the panic and send it to me please?


>
> 3. Mount point F (/home/david) infrequently loops. ls -la /home/david
> will loop forever until all client memory is exhausted and the kernel
> kills it via OOM. ls -la /home/david/somefile or /home/david/somedir/
> works just fine as well as any sub directory under /home/david.
> Restarts of both systems refuse to fix things.

I think this might be a reiserfs problem. Someone else mentioned that
this started happening when they upgraded from an earlier 2.5 kernel.
If you can capture the NFS traffic
tcpdump -s 1500 -w /tmp/afile host $server and host $client
we could have a look at the directory cookies and see what is
happening.
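(To look at the dump afterwards, tcpdump -r /tmp/afile -vv will decode the
NFS calls; if you have ethereal installed, something like
   tethereal -r /tmp/afile -V -R nfs
gives a fuller decode of the READDIR cookies. I'm quoting those options from
memory, so check the man pages.)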

>
> 4. Mounts infrequently get "permission denied" messages on the client
> with a " rpc.mountd: getfh failed: Operation not permitted" message on
> the server. This is fixable by restarting the nfs system on the server.
>

I've seen this, but it was fixed by the time 2.5.59 came out.
If/when it happens again, could you please check if the IP address of
the client in question is in
/proc/net/rpc/auth.unix.ip/content
and if the name found there is in
/proc/fs/nfs/exports
next to the appropriate filesystem.
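Something like the following, with the client address and name below as
placeholders for the real ones:
   grep 192.168.1.10 /proc/net/rpc/auth.unix.ip/content
   grep clientname /proc/fs/nfs/exports
If the address is there but the name it maps to isn't listed against the
filesystem being mounted, that tells us where the chain is breaking.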

NeilBrown

2003-02-11 06:50:32

by Oleg Drokin

Subject: Re: Current NFS issues (2.5.59)

Hello!

On Tue, Feb 11, 2003 at 03:05:24PM +1100, Neil Brown wrote:

> > 3. Mount point F (/home/david) infrequently loops. ls -la /home/david
> > will loop forever until all client memory is exhausted and the kernel
> > kills it via OOM. ls -la /home/david/somefile or /home/david/somedir/
> > works just fine as well as any sub directory under /home/david.
> > Restarts of both systems refuse to fix things.
> I think this might be a reiserfs problem. Someone else mentioned that

I was not able to reproduce that.

> this started happening when they upgrade from an earlier 2.5 kernel.

I think that earlier report was from David too; this seems to be just a more
detailed version of it.

And while you are listening - I want to share my own NFS problems in 2.5.59 ;)
If I try to mount any NFS-exported filesystem from the same host (e.g.
localhost), the mount process hangs in D state. The server appears to work OK
though, and serves requests from external clients.

Bye,
Oleg

2003-02-11 12:14:25

by Trond Myklebust

Subject: Re: Current NFS issues (2.5.59)

>>>>> " " == Oleg Drokin <[email protected]> writes:

>> this started happening when they upgrade from an earlier 2.5
>> kernel.

> I think that earlier report was from David too. This is just
> more detailed report it seems.

> And while you are listening - I want to share my own NFs
> problems in 2.5.59 ;) If I try to mount any NFS exported
> filesystem from the same host (e.g localhost), mount process
> hangs in D state. Server appears to work ok though and serves
> requests from external clients.

...and I am unable to reproduce this 8-)

I have a feeling that something is still not right in the IPv4 socket
layer. There are odd memory corruptions etc. which I don't see with the
same code in 2.4.x.
Still hunting though, so I may yet find the problems were in my own
code...

Cheers,
Trond

2003-02-11 13:21:34

by Oleg Drokin

Subject: Re: Current NFS issues (2.5.59)

Hello!

On Tue, Feb 11, 2003 at 01:23:51PM +0100, Trond Myklebust wrote:
> >> this started happening when they upgrade from an earlier 2.5
> >> kernel.
>
> > I think that earlier report was from David too. This is just
> > more detailed report it seems.
>
> > And while you are listening - I want to share my own NFs
> > problems in 2.5.59 ;) If I try to mount any NFS exported
> > filesystem from the same host (e.g localhost), mount process
> > hangs in D state. Server appears to work ok though and serves
> > requests from external clients.
> ...and I am unable to reproduce this 8-)

Here is how I achieve this:
2.5.60 kernel from current bk repository.
belka:/usr/src/linux-2.5-bk # grep NFS .config
CONFIG_NFS_FS=y
CONFIG_NFS_V3=y
# CONFIG_NFS_V4 is not set
CONFIG_NFSD=y
CONFIG_NFSD_V3=y
# CONFIG_NFSD_V4 is not set
CONFIG_NFSD_TCP=y

belka:~ # cat /etc/exports
/home (ro)
belka:~ # ps ax | grep portmap
421 ? S 0:00 /sbin/portmap
belka:~ # mount localhost:/home /mnt -t nfs
And it hangs here.
(Actually I found it does not really hang; it just cannot connect to something,
as it emits "nfs: server localhost not responding, still trying"
messages from time to time, but I see no way to interrupt the process.)
This is on an SMP box. The distro is based on SuSE 8.1.
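(Mounting with the intr option might at least make the hung process killable -
I have not actually tried it:
   mount -t nfs -o intr localhost:/home /mnt
It obviously would not explain the hang itself.)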

SysRq-T gives me this:
mount D 00000086 4293700280 761 755 (NOTLB)
Call Trace:
[<c0301760>] call_transmit+0x48/0x9c
[<c0305f72>] __rpc_execute+0x23e/0x3ee
[<c011890a>] default_wake_function+0x0/0x12
[<c0301003>] rpc_call_sync+0xbd/0xf2
[<c030350e>] xprt_timer+0x0/0x15a
[<c03017b4>] call_status+0x0/0xf8
[<c0304fd0>] rpc_run_timer+0x0/0xec
[<c01b9dee>] nfs3_rpc_wrapper+0x3a/0x82
[<c01b9f71>] nfs3_proc_get_root+0x53/0x84
[<c01b0894>] nfs_get_root+0x44/0x8a
[<c016d018>] inode_init_once+0x20/0x17c
[<c020ba61>] vsnprintf+0x20d/0x460
[<c020bcdb>] snprintf+0x27/0x2c
[<c01b09a9>] nfs_sb_init+0xcf/0x4d8
[<c013d47b>] cache_alloc_debugcheck_after+0xcd/0xd8
[<c013c3b9>] kmalloc+0xa1/0xdc
[<c030870c>] unx_create+0x50/0x70
[<c0307b2b>] rpcauth_create+0x35/0x48
[<c0300b80>] rpc_create_client+0x160/0x1f0
[<c0307154>] rpciod_up+0x26/0x120
[<c01b10c3>] nfs_fill_super+0x311/0x3c2
[<c0158e86>] sget+0x162/0x188
[<c01b2f1d>] nfs_get_sb+0x1a9/0x238
[<c0159b39>] do_kern_mount+0x5f/0xd4
[<c01717a9>] do_add_mount+0x9d/0x19e
[<c0171adf>] do_mount+0x155/0x1a2
[<c01721bb>] sys_mount+0x113/0x18c
[<c01096b3>] syscall_call+0x7/0xb

Hope this helps.

Bye,
Oleg

2003-02-11 16:46:06

by David Ford

Subject: Re: Current NFS issues (2.5.59)

Neil Brown wrote:

>On Sunday February 9, [email protected] wrote:
>
>
>>Ok. Here goes. I have two servers that NFS mount from each other and
>>provide.
>>
>>
>
>Thankyou for using the development kernel and sharing your woes...
>

heheheh...

>>Server 1 exports A, B, and C to server 2. Server 2 exports D and E back
>>to server 1 and exports F and G to two other clients. Each of these
>>(A-G) are distinctly different filesystem paths and not part of each other.
>>
>>1. If server 1 is restarted, server 2 will invalidate (make all 'df'
>>values '1') F and G. This requires an 'exportfs -vra' or similar on
>>server 2 to fix the client 'df' values. The client doesn't need to do
>>anything.
>>
>>
>
>This has me completely mystified. If I understand correctly, an event
>on server 1 causes a failure to commuinicate between server 2 and some
>third party..
>
>I can only imagine that as server 1 boots it does something to server
>2. At the very least it sends a mount request for D and E. I'm not
>sure how a mount request for D or E would affect F or G.
>Are you using an automounter at all?
>

No, no automounter of any sort. Server 1 and server 2 share /home and
Apache virtual hosts back and forth (shell server and web server), so they are
mounted at boot.

Server 1 is the shell server, 2 is the web server. When the shell
server is restarted, all the clients that fetch other mounts off the web
server get '1's for the df information in short order. There is some
delay, not sure what the delay is for. During that delay,
/nfsmountpoint access stalls on the clients. Unfortunately my own home
directory comes off that mountpoint and the wonder coding of Raster
causes multiple large explosions and instantaneous destruction of your
graphical session. So I've lost a fair amount of NFS debug notes
unexpectedly :S

If I'm fast on the draw and run exportfs on server 2 quick enough, I
manage to save my desktop before that timeout hits.

>>2. Repeated nfs system stops and starts (/etc/init.d/nfs restart) will
>>eventually cause a kernel panic on server 2 (haven't tested on server
>>1). The number of restarts is variable.
>>
>>
>
>Can you capture the panic and send it to me please?
>

I plan to set up a notebook with serial console capture.

>>3. Mount point F (/home/david) infrequently loops. ls -la /home/david
>>will loop forever until all client memory is exhausted and the kernel
>>kills it via OOM. ls -la /home/david/somefile or /home/david/somedir/
>>works just fine as well as any sub directory under /home/david.
>>Restarts of both systems refuse to fix things.
>>
>>
>
>I think this might be a reiserfs problem. Someone else mentioned that
>this started happening when they upgrade from an earlier 2.5 kernel.
>If you can capture the NFS traffic
> tcpdump -s 1500 -w /tmp/afile host $server and host $client
>we could have a look at the directory cookies and see what is
>happening.
>

Is it important to start the tcpdump before the mount is established?
If I start the tcpdump after I've detected the looping, is that useful?
There's a lot of NFS traffic :)

Another thing I've noted: it looks like a corrupt ->next ring
pointer. [Example only] Given 500 directory entries, at some random
ring index 50 of those entries start looping - say entries 100-150. A
few hours later it might be 350-400. The earlier entries scroll by
normally, then that block of 50 starts looping. It is the same 50 that
will loop even after both machines have been rebooted.

The looping only seems to happen when something abnormal happens on the
mountpoint or the system is rebooted. E.g. Enlightenment fubars when
you try to edit a menu and the entire desktop blows up as usual; that
has a small probability of starting a loop effect - I've had it happen
once. At all other times, the looping has started immediately after bootup.

As a reminder, only globbed access to the top-level mountpoint sees the
looping; any single-entry or sub-level access is fine. ls -l
/nfsmount fails, and ls -l /nfsmount/a* fails, but ls -dl
/nfsmount/file_or_dir works fine, as does ls -l
/nfsmount/anything_below/this/


>>4. Mounts infrequently get "permission denied" messages on the client
>>with a " rpc.mountd: getfh failed: Operation not permitted" message on
>>the server. This is fixable by restarting the nfs system on the server.
>>
>>
>>
>
>I've seen this, but it was fixed by the time 2.5.59 came out.
>If/when it happens again, could you please check if the IP address of
>the client in question is in
> /proc/net/rpc/auth.unix.ip/content
>and if the name found there is in
> /proc/fs/nfs/exports
>next to the appropriate filesystem.
>
>NeilBrown
>

Will do. I saw it every time I booted into 2.5.53 and again when
booting back. If I reboot with the same kernel version it's normally
fine; however, it still happens too often with both systems on 2.5.59.

David


2003-02-11 22:03:28

by NeilBrown

Subject: Re: Current NFS issues (2.5.59)

On Tuesday February 11, [email protected] wrote:
> Hello!
>
> On Tue, Feb 11, 2003 at 01:23:51PM +0100, Trond Myklebust wrote:
> > >> this started happening when they upgrade from an earlier 2.5
> > >> kernel.
> >
> > > I think that earlier report was from David too. This is just
> > > more detailed report it seems.
> >
> > > And while you are listening - I want to share my own NFs
> > > problems in 2.5.59 ;) If I try to mount any NFS exported
> > > filesystem from the same host (e.g localhost), mount process
> > > hangs in D state. Server appears to work ok though and serves
> > > requests from external clients.
> > ...and I am unable to reproduce this 8-)
>
> Here is how I achieve this:
> 2.5.60 kernel from current bk repository.
> belka:/usr/src/linux-2.5-bk # grep NFS .config
> CONFIG_NFS_FS=y
> CONFIG_NFS_V3=y
> # CONFIG_NFS_V4 is not set
> CONFIG_NFSD=y
> CONFIG_NFSD_V3=y
> # CONFIG_NFSD_V4 is not set
> CONFIG_NFSD_TCP=y
>
> belka:~ # cat /etc/exports
> /home (ro)
> belka:~ # ps ax | grep portmap
> 421 ? S 0:00 /sbin/portmap
> belka:~ # mount localhost:/home /mnt -t nfs
> And it hangs here.

Yep, I can reproduce this.
I turn on lots of debugging:

# grep . /proc/sys/sunrpc/*
/proc/sys/sunrpc/nfs_debug:16383
/proc/sys/sunrpc/nfsd_debug:16383
/proc/sys/sunrpc/nlm_debug:0
/proc/sys/sunrpc/rpc_debug:16319
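(For anyone playing along at home, those are just set by echoing the mask
into the files, e.g.
   echo 16383 > /proc/sys/sunrpc/nfs_debug
and similarly for the others.)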


and get :

Feb 12 09:11:35 kamen kernel: RPC: cong 256, cwnd was 256, now 256
Feb 12 09:11:35 kamen kernel: RPC: 411 xprt_timer (pending request)
Feb 12 09:11:35 kamen kernel: RPC: 411 call_status (status -110)
Feb 12 09:11:35 kamen kernel: RPC: 411 call_timeout (minor)
Feb 12 09:11:35 kamen kernel: RPC: 411 call_bind xprt e6219000 is connected
Feb 12 09:11:35 kamen kernel: RPC: 411 call_transmit (status 0)
Feb 12 09:11:35 kamen kernel: RPC: 411 xprt_prepare_transmit
Feb 12 09:11:35 kamen kernel: RPC: 411 xprt_cwnd_limited cong = 0 cwnd = 256
Feb 12 09:11:35 kamen kernel: RPC: 411 call_encode (status 0)
Feb 12 09:11:35 kamen kernel: RPC: 411 marshaling UNIX cred f6044944
Feb 12 09:11:35 kamen kernel: RPC: 411 xprt_transmit(88)
Feb 12 09:11:35 kamen kernel: svc: socket f7775304(inet f6e7f5f8), count=96, busy=0
Feb 12 09:11:35 kamen kernel: svc: socket f6e7f5f8 served by daemon f69ed600
Feb 12 09:11:35 kamen kernel: RPC: xprt_sendmsg(88) = 88
Feb 12 09:11:35 kamen kernel: RPC: 411 xmit complete
Feb 12 09:11:35 kamen kernel: svc: server f69ed600, socket f7775304, inuse=1
Feb 12 09:11:35 kamen kernel: svc: socket f6e7f5f8 served by daemon f653d400
Feb 12 09:11:35 kamen kernel: svc: got len=88
Feb 12 09:11:35 kamen kernel: svc: svc_authenticate (1)
Feb 12 09:11:35 kamen kernel: svc: socket f6e7f5f8 busy, not enqueued
Feb 12 09:11:35 kamen kernel: svc: calling dispatcher
Feb 12 09:11:35 kamen kernel: nfsd_dispatch: vers 3 proc 1
Feb 12 09:11:35 kamen kernel: nfsd: GETATTR(3) 12: 00000001 01000800 0004358c 00000000 00000000 00000000
Feb 12 09:11:35 kamen kernel: nfsd: fh_verify(12: 00000001 01000800 0004358c 00000000 00000000 00000000)
Feb 12 09:11:35 kamen kernel: svc: service f69ed600, releasing skb f6f690a4
Feb 12 09:11:35 kamen kernel: svc: socket f7775304(inet f6e7f5f8), write_space busy=1
Feb 12 09:11:35 kamen kernel: svc: socket f6e7f5f8 busy, not enqueued
Feb 12 09:11:35 kamen kernel: RPC: udp_data_ready...
Feb 12 09:11:35 kamen kernel: RPC: udp_data_ready client e6219000
Feb 12 09:11:35 kamen kernel: svc: socket f7775304 sendto([f69a6000 112... ], 112) = 112 (addr 100007f)
Feb 12 09:11:35 kamen kernel: svc: server f653d400, socket f7775304, inuse=2
Feb 12 09:11:35 kamen kernel: svc: got len=-11
Feb 12 09:11:35 kamen kernel: svc: server f653d400 waiting for data (to = 300000)
Feb 12 09:11:35 kamen kernel: svc: server f69ed600 waiting for data (to = 300000)

which seems to suggest that the reply is coming back from the server
(RPC: udp_data_ready) but there is no sign of the client getting it...

NeilBrown

2003-02-11 22:23:03

by NeilBrown

Subject: Re: Current NFS issues (2.5.59)

On Tuesday February 11, [email protected] wrote:
>
> No, no automount of any sort. Server 1 and server 2 share /home and
> apache virtuals back and forth, shell and web server. So they are
> mounted at boot.
>
> Server 1 is the shell server, 2 is the web server. When the shell
> server is restarted, all the clients that fetch other mounts off the web
> server get '1's for the df information in short order. There is some
> delay, not sure what the delay is for. During that delay,
> /nfsmountpoint access stalls on the clients. Unfortunately my own home
> directory comes off that mountpoint and the wonder coding of Raster
> causes multiple large explosions and instantaneous destruction of your
> graphical session. So I've lost a fair amount of NFS debug notes
> unexpectedly :S
>
> If I'm fast on the draw and run exportfs on server 2 quick enough, I
> manage to save my desktop before that timeout hits.

I think I would need a precise description of everything that is
mounted and exactly where. I don't know what use this would actually
be, but it is very hard to reason about this sort of thing in the
abstract. Maybe there will be something in the details that will ring
a bell.

> >>
> >
> >Can you capture the panic and send it to me please?
> >
>
> I plan to setup a notebook w/ serial console capture.

Thanks.

> >I think this might be a reiserfs problem. Someone else mentioned that
> >this started happening when they upgrade from an earlier 2.5 kernel.
> >If you can capture the NFS traffic
> > tcpdump -s 1500 -w /tmp/afile host $server and host $client
> >we could have a look at the directory cookies and see what is
> >happening.
> >
>
> Is this important to start the tcpdump before the mount is established?
> If I start the tcpdump after I've detected the looping, is that useful?
> There's a lot of NFS traffic :)

Starting the tcpdump once the looping has started would be fine.
However your description of repeated rings makes it sound very much
like a directory cookie problem.

Could you run this program on the server:
--------------------------
#include <sys/types.h>
#include <dirent.h>
#include <stdio.h>

/* print the readdir cookie (d_off), inode number and name of each entry */
int main(void)
{
    DIR *dir = opendir(".");
    struct dirent *de;

    if (!dir)
        return 1;
    while ((de = readdir(dir)) != NULL)
        printf("%10lu %10lu %s\n",
               (unsigned long) de->d_off,
               (unsigned long) de->d_ino,
               de->d_name);
    closedir(dir);
    return 0;
}
----------------------------
Run it in the directory that is causing problems. The first column printed
is the cookie. If it ever repeats, you have simple proof that
reiserfs is doing the wrong thing, and you should report it to the
reiserfs team.
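If the directory is large, a quick way to spot repeats (assuming the program
above is compiled as, say, ./dumpdir) would be
   ./dumpdir | awk '{print $1}' | sort -n | uniq -d
which should print nothing at all if every cookie is unique.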

NeilBrown

2003-02-11 23:55:40

by Trond Myklebust

Subject: Re: Current NFS issues (2.5.59)

>>>>> " " == Neil Brown <[email protected]> writes:


> Feb 12 09:11:35 kamen kernel: RPC: udp_data_ready...
> Feb 12 09:11:35 kamen kernel: RPC: udp_data_ready client e6219000

> which seems to suggest that the reply is coming back from the
> server (RPC: udp_data_ready) but there is no sign of the client
> getting it...

On the contrary. The above shows that the client is indeed receiving
some data, but for some reason it is not accepting that data as a
valid reply. It looks as if either skb_recv_datagram() or
xprt_lookup_rqst() is failing.

BTW: I'm still not seeing any such trouble with these loopback mounts
on my own machine. Could it be an SMP problem?

Cheers,
Trond

2003-02-12 00:11:17

by David Ford

Subject: Re: Current NFS issues (2.5.59)

On my network, Server 2 (web server) is SMP and one client is SMP. The
more problematic client is not SMP. I'll do some debugging in a moment;
I've another fire to put out presently.

-d

Trond Myklebust wrote:

>[...]
>
>On the contrary. The above shows that the client is indeed receiving
>some data, but for some reason it is not accepting that data as a
>valid reply. It looks as if either skb_recv_datagram() or
>xprt_lookup_rqst() is failing.
>
>BTW: I'm still not seeing any such trouble with these loopback mounts
>on my own machine. Could it be an SMP problem?
>
>Cheers,
> Trond
>
>