I'm getting these errors after about 15d of uptime:
Jan 27 17:43:18 mir kernel: NFS: giant filename in readdir (len 955ae5)!
Jan 27 21:06:14 mir kernel: NFS: giant filename in readdir (len 74000000)!
And doing an ls (ie. readdir()) inside an NFS-mount always produces
an empty directory listing. I can still access files and
subdirectories OK, though.
This seems to be a problem on the client side and only occurs when
using NFSv3. When I unmount and then remount using NFSv3, the problem
persists, but goes away once I remount with nfsvers=2. Also I tried
downgrading the other server's kernel to 2.4.21 and the problem still
persisted until I remounted with NFSv2.
I'll wait and see wether the downgrade helped on the client side, but
that might take a few days.
Both boxes have an almost identical setup of Slackware 9.1 and were
running 2.4.23-pac1+security bugfixes. The boxes are connected to the
same switch and VLAN. They mount filesystems from each other (yeah, I
know cross-mounting with NFS is a bad idea...) and the problem
occurred on both servers simultaineously.
The mounts look like this:
mir:/home on /home type nfs
(rw,rsize=8192,wsize=8192,hard,intr,lock,addr=XXX)
mir:/archive on /archive type nfs
(rw,rsize=8192,wsize=8192,soft,intr,addr=XXX)
sputnik:/var/spool/mail on /var/spool/mail type nfs
(rw,rsize=8192,wsize=8192,hard,intr,lock,nfsvers=2,addr=XXX)
sputnik:/files on /files type nfs
(rw,rsize=8192,wsize=8192,soft,intr,nfsvers=2,addr=XXX)
I tried searching with Google but couldn't find a resolution to this
problem. I did find references of it occurring as far back as 2002.
Any ideas, folks?
--
-=[ Count Zero / TBH - Jussi H?m?l?inen - email [email protected] ]=-
P? ty , 27/01/2004 klokka 21:53, skreiv Jussi Hamalainen:
> Both boxes have an almost identical setup of Slackware 9.1 and were
> running 2.4.23-pac1+security bugfixes. The boxes are connected to the
> same switch and VLAN. They mount filesystems from each other (yeah, I
> know cross-mounting with NFS is a bad idea...) and the problem
> occurred on both servers simultaineously.
>
> The mounts look like this:
>
> mir:/home on /home type nfs
> (rw,rsize=8192,wsize=8192,hard,intr,lock,addr=XXX)
> mir:/archive on /archive type nfs
> (rw,rsize=8192,wsize=8192,soft,intr,addr=XXX)
>
> sputnik:/var/spool/mail on /var/spool/mail type nfs
> (rw,rsize=8192,wsize=8192,hard,intr,lock,nfsvers=2,addr=XXX)
> sputnik:/files on /files type nfs
> (rw,rsize=8192,wsize=8192,soft,intr,nfsvers=2,addr=XXX)
Any info forthcoming on the filesystem you used and/or a binary tcpdump
demonstrating the problem? (remember to use a large snaplen in the
tcpdump - something like "-s 9000").
Does the problem still occur when you change "soft" to "hard"? Note that
the default setting for "retrans" as set by the nfs-utils "mount"
program is way too low for "soft" on UDP.
Cheers,
Trond
BTW: 2.4.23 has no readdir changes at all compared to 2.4.21. I've no
idea WTF 2.4.23-pac1 contains...
On Thu, 29 Jan 2004, Trond Myklebust wrote:
> Any info forthcoming on the filesystem you used and/or a binary tcpdump
> demonstrating the problem?
All filesystems are ext3. I'll try to get you a tcpdump if and when
the phenomenon happens again.
> Does the problem still occur when you change "soft" to "hard"?
Both boxes have two NFS-mounts from each other. One is soft, one is
hard and this happens on both mounts simultaineously.
--
-=[ Count Zero / TBH - Jussi H?m?l?inen - email [email protected] ]=-