2003-07-16 12:02:21

by Steve Fosdick

Subject: Hang Problem - Linux Server, HP-UX Client

We have a puzzling problem with NFS hanging when using Linux as the
server and HP-UX as the client. It happens while restoring files from
backup tape onto the NFS filesystem. Initially the files are written
OK, but after a while the writing stops and the HP-UX client reports
"NFS server <hostname> not responding". At this point both client and
server continue to work normally in every other respect, and IP network
connectivity between them is still working, so the problem seems
specific to NFS.

Initially it looked as though the problem occurred after approx. 4GB of
data had been written to the NFS filesystem, but since then it has
happened after rather less data had been copied. The problem occurs
with both the kernel-based NFS server and the user-mode NFS server,
with either NFS V2 or NFS V3, and both before and after installing the
latest NFS patch on the HP-UX client.

The versions I currently have running where we still have the problem
are Linux 2.4.21 with no additional NFS patches (server) and HP-UX
10.20 with PHNE_25234 (client).

Does anyone here have any idea what may be going on, or any useful
tips on how I could gather debug information that would help get to
the bottom of this, please?

Steve.


-------------------------------------------------------
This SF.net email is sponsored by: VM Ware
With VMware you can run multiple operating systems on a single machine.
WITHOUT REBOOTING! Mix Linux / Windows / Novell virtual machines at the
same time. Free trial click here: http://www.vmware.com/wl/offer/345/0
_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs


2003-07-16 18:36:00

by Steve Fosdick

Subject: Re: Hang Problem - Linux Server, HP-UX Client

Further to my previous message, here is part of the output of tcpdump
running on the NFS server and capturing UDP port 2049 at the time the
client complains that the server is no longer responding:

17:40:30.705564 spitfire.nfs > moorland.1438258734: reply ok 112 getattr
DIR 40700 ids 0/3 sz 0x0000001e8 (DF)
17:40:30.705804 moorland.1438258735 > spitfire.nfs: 128 getattr fh
Unknown/1
17:40:30.705859 spitfire.nfs > moorland.1438258735: reply ok 112 getattr
DIR 40700 ids 0/3 sz 0x000000090 (DF)
17:40:30.706106 moorland.1438258736 > spitfire.nfs: 136 lookup fh
Unknown/1 "0205"
17:40:30.706177 spitfire.nfs > moorland.1438258736: reply ok 232 lookup
fh Unknown/1 (DF)
17:40:30.722560 moorland.1438258737 > spitfire.nfs: 152 commit fh
Unknown/1 17367040 bytes @ 0x001fd0000
17:40:30.722632 spitfire.nfs > moorland.1438258737: reply ok 128 commit
(DF)
17:40:30.729689 moorland.1438258738 > spitfire.nfs: 860 write fh
Unknown/1 [|nfs]
17:40:30.729884 spitfire.nfs > moorland.1438258738: reply ok 136 write
[|nfs] (DF)
17:40:30.730182 moorland.1438258739 > spitfire.nfs: 152 commit fh
Unknown/1 0 bytes @ 0x00306a000
17:40:30.730242 spitfire.nfs > moorland.1438258739: reply ok 128 commit
(DF)
17:40:31.194906 moorland.1438257340 > spitfire.nfs: 1472 write fh
Unknown/1 [|nfs] (frag 33609:1480@0+)
17:40:41.194960 moorland.1438257340 > spitfire.nfs: 1472 write fh
Unknown/1 [|nfs] (frag 33725:1480@0+)
17:40:43.694927 moorland.1438257340 > spitfire.nfs: 1472 write fh
Unknown/1 [|nfs] (frag 33767:1480@0+)
17:40:48.694994 moorland.1438257340 > spitfire.nfs: 1472 write fh
Unknown/1 [|nfs] (frag 33832:1480@0+)
17:40:58.695014 moorland.1438257340 > spitfire.nfs: 1472 write fh
Unknown/1 [|nfs] (frag 33986:1480@0+)
17:41:18.695127 moorland.1438257340 > spitfire.nfs: 1472 write fh
Unknown/1 [|nfs] (frag 34252:1480@0+)
17:41:23.695079 moorland.1438257340 > spitfire.nfs: 1472 write fh
Unknown/1 [|nfs] (frag 34299:1480@0+)
17:41:33.695137 moorland.1438257340 > spitfire.nfs: 1472 write fh
Unknown/1 [|nfs] (frag 34399:1480@0+)
17:41:53.695236 moorland.1438257340 > spitfire.nfs: 1472 write fh
Unknown/1 [|nfs] (frag 34723:1480@0+)
17:42:13.695771 moorland.1438257340 > spitfire.nfs: 1472 write fh
Unknown/1 [|nfs] (frag 34928:1480@0+)
17:42:23.695328 moorland.1438257340 > spitfire.nfs: 1472 write fh
Unknown/1 [|nfs] (frag 35028:1480@0+)
17:42:43.695375 moorland.1438257340 > spitfire.nfs: 1472 write fh
Unknown/1 [|nfs] (frag 35260:1480@0+)
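One thing that can be read off the capture above is the retransmission
pattern: the write with XID 1438257340 is sent again and again and no
reply to it appears. The following is only a minimal sketch for
measuring the gaps between those resends, assuming the capture has been
saved as text with each record starting with an HH:MM:SS.ffffff
timestamp. The function name retransmit_gaps and the regular expression
are illustrative and not part of tcpdump or any NFS tool.

```python
import re
from collections import defaultdict
from datetime import datetime

# Matches the start of a client request record as printed in the capture,
# e.g. "17:40:31.194906 moorland.1438257340 > spitfire.nfs: 1472 write fh".
# Server replies ("spitfire.nfs > moorland.<xid>") do not match, because
# the sender field must end in a numeric XID.
REQUEST = re.compile(r"^(\d\d:\d\d:\d\d\.\d{6}) (\S+)\.(\d+) > (\S+)\.nfs:")


def retransmit_gaps(lines):
    """Return {xid: [seconds between successive sends]} for XIDs seen more than once."""
    seen = defaultdict(list)
    for line in lines:
        m = REQUEST.match(line)
        if m:
            stamp = datetime.strptime(m.group(1), "%H:%M:%S.%f")
            seen[m.group(3)].append(stamp)
    return {
        xid: [round((b - a).total_seconds(), 1) for a, b in zip(stamps, stamps[1:])]
        for xid, stamps in seen.items()
        if len(stamps) > 1
    }


# First lines of each request record, copied from the capture above.
sample = [
    "17:40:30.730182 moorland.1438258739 > spitfire.nfs: 152 commit fh",
    "17:40:31.194906 moorland.1438257340 > spitfire.nfs: 1472 write fh",
    "17:40:41.194960 moorland.1438257340 > spitfire.nfs: 1472 write fh",
    "17:40:43.694927 moorland.1438257340 > spitfire.nfs: 1472 write fh",
    "17:40:48.694994 moorland.1438257340 > spitfire.nfs: 1472 write fh",
    "17:40:58.695014 moorland.1438257340 > spitfire.nfs: 1472 write fh",
]
gaps = retransmit_gaps(sample)
print(gaps)
```

For the five resends used here the gaps come out as 10.0, 2.5, 5.0 and
10.0 seconds, all for the same XID, while the commit that precedes them
is sent only once and so is not reported.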

Here is another piece of tcpdump output, this time after the server has
been rebooted and the client should still be trying to write the same
file:

17:59:43.667811 moorland.1438257340 > spitfire.nfs: 1472 write fh
Unknown/1 [|nfs] (frag 47468:1480@0+)
18:00:03.668106 moorland.1438257340 > spitfire.nfs: 1472 write fh
Unknown/1 [|nfs] (frag 47937:1480@0+)
18:00:23.668423 moorland.1438257340 > spitfire.nfs: 1472 write fh
Unknown/1 [|nfs] (frag 48124:1480@0+)
18:00:31.165567 moorland.1438258740 > spitfire.nfs: 116 getattr fh
Unknown/1
18:00:31.165637 spitfire.nfs > moorland.1438258740: reply ok 112 getattr
DIR 40755 ids 0/0 sz 0x000000170 (DF)
18:00:31.165966 moorland.1438258741 > spitfire.nfs: 116 fsinfo fh
Unknown/1
18:00:31.166020 spitfire.nfs > moorland.1438258741: reply ok 80 fsinfo
rtmax 8192 rtpref 8192 wtmax 8192 wtpref 8192 dtpref 4096 (DF)
18:00:31.166245 moorland.1438258742 > spitfire.nfs: 116 fsstat fh
Unknown/1
18:00:31.166316 spitfire.nfs > moorland.1438258742: reply ok 84 fsstat
tbytes 0x5d1de0000 fbytes 0x58ba6f000 abytes 0x58ba6f000 (DF)
18:00:31.166884 moorland.1438258743 > spitfire.nfs: 116 getattr fh
Unknown/1
18:00:31.166940 spitfire.nfs > moorland.1438258743: reply ok 112 getattr
DIR 40755 ids 0/0 sz 0x0000005b0 (DF)
18:00:43.668505 moorland.1438257340 > spitfire.nfs: 1472 write fh
Unknown/1 [|nfs] (frag 48342:1480@0+)
18:01:03.668682 moorland.1438257340 > spitfire.nfs: 1472 write fh
Unknown/1 [|nfs] (frag 48598:1480@0+)

Does either of these shed any light?

Steve.

