2004-01-27 20:53:52

by Jussi Hamalainen

Subject: NFS: giant filename in readdir

I'm getting these errors after about 15 days of uptime:

Jan 27 17:43:18 mir kernel: NFS: giant filename in readdir (len 955ae5)!
Jan 27 21:06:14 mir kernel: NFS: giant filename in readdir (len 74000000)!

And doing an ls (i.e. readdir()) inside an NFS mount always produces
an empty directory listing. I can still access files and
subdirectories OK, though.
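
To put the two reported lengths in perspective, here is a minimal sketch that pulls them out of the log lines and reads them as hexadecimal (the digits a and e in the first value show they are hex). The helper name giant_lengths and the 255-byte limit are assumptions for illustration; 255 is the usual NAME_MAX on Linux, and the limit the client actually enforces is not stated in this thread.

```python
import re

# Assumption: 255 bytes (the usual NAME_MAX on Linux) stands in for the
# client's filename length limit; the thread does not state the real value.
ASSUMED_MAX_NAME_LEN = 255

# Matches the kernel message quoted above and captures the hex length.
LOG_PATTERN = re.compile(r"giant filename in readdir \(len ([0-9a-fA-F]+)\)")


def giant_lengths(log_lines):
    """Return the reported lengths as integers, reading them as hex."""
    lengths = []
    for line in log_lines:
        match = LOG_PATTERN.search(line)
        if match:
            lengths.append(int(match.group(1), 16))
    return lengths


log = [
    "Jan 27 17:43:18 mir kernel: NFS: giant filename in readdir (len 955ae5)!",
    "Jan 27 21:06:14 mir kernel: NFS: giant filename in readdir (len 74000000)!",
]

for length in giant_lengths(log):
    # Print each length in bytes and whether it exceeds the assumed limit.
    print(length, length > ASSUMED_MAX_NAME_LEN)
```

Both values come out in the millions of bytes or more, far beyond any plausible filename, which suggests the client is reading a length field from corrupt or misparsed reply data.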

This seems to be a problem on the client side and only occurs when
using NFSv3. When I unmount and then remount using NFSv3, the problem
persists, but goes away once I remount with nfsvers=2. Also I tried
downgrading the other server's kernel to 2.4.21 and the problem still
persisted until I remounted with NFSv2.

I'll wait and see whether the downgrade helped on the client side, but
that might take a few days.

Both boxes have an almost identical setup of Slackware 9.1 and were
running 2.4.23-pac1+security bugfixes. The boxes are connected to the
same switch and VLAN. They mount filesystems from each other (yeah, I
know cross-mounting with NFS is a bad idea...) and the problem
occurred on both servers simultaneously.

The mounts look like this:

mir:/home on /home type nfs
(rw,rsize=8192,wsize=8192,hard,intr,lock,addr=XXX)
mir:/archive on /archive type nfs
(rw,rsize=8192,wsize=8192,soft,intr,addr=XXX)

sputnik:/var/spool/mail on /var/spool/mail type nfs
(rw,rsize=8192,wsize=8192,hard,intr,lock,nfsvers=2,addr=XXX)
sputnik:/files on /files type nfs
(rw,rsize=8192,wsize=8192,soft,intr,nfsvers=2,addr=XXX)

I tried searching with Google but couldn't find a resolution to this
problem. I did find references to it occurring as far back as 2002.
Any ideas, folks?

--
-=[ Count Zero / TBH - Jussi Hämäläinen - email [email protected] ]=-


2004-01-28 23:06:38

by Trond Myklebust

Subject: Re: NFS: giant filename in readdir

On Tue, 27/01/2004 at 21:53, Jussi Hamalainen wrote:
> Both boxes have an almost identical setup of Slackware 9.1 and were
> running 2.4.23-pac1+security bugfixes. The boxes are connected to the
> same switch and VLAN. They mount filesystems from each other (yeah, I
> know cross-mounting with NFS is a bad idea...) and the problem
> occurred on both servers simultaneously.
>
> The mounts look like this:
>
> mir:/home on /home type nfs
> (rw,rsize=8192,wsize=8192,hard,intr,lock,addr=XXX)
> mir:/archive on /archive type nfs
> (rw,rsize=8192,wsize=8192,soft,intr,addr=XXX)
>
> sputnik:/var/spool/mail on /var/spool/mail type nfs
> (rw,rsize=8192,wsize=8192,hard,intr,lock,nfsvers=2,addr=XXX)
> sputnik:/files on /files type nfs
> (rw,rsize=8192,wsize=8192,soft,intr,nfsvers=2,addr=XXX)

Any info forthcoming on the filesystem you used and/or a binary tcpdump
demonstrating the problem? (remember to use a large snaplen in the
tcpdump - something like "-s 9000").
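
When reading such a capture, it may help to recall that XDR encodes a string as a 4-byte big-endian length followed by the bytes, padded to a multiple of four. The following is a minimal sketch of that encoding only, not a diagnosis of this problem: the helper decode_xdr_string and the sample bytes are made up for illustration. It shows how a decoder that starts reading at the wrong offset can mistake filename bytes for a length and end up with a huge value.

```python
import struct


def decode_xdr_string(buf, offset):
    """Decode one XDR string at `offset`.

    Returns (length, raw bytes, offset of the next item).
    Hypothetical helper for illustration; not taken from any NFS client.
    """
    # XDR lengths are unsigned 32-bit big-endian integers.
    (length,) = struct.unpack_from(">I", buf, offset)
    start = offset + 4
    data = buf[start:start + length]
    # String data is padded with zero bytes up to a multiple of 4.
    padded = (length + 3) & ~3
    return length, data, start + padded


# Made-up sample: the 8-byte name "test.txt" encoded as an XDR string.
sample = struct.pack(">I", 8) + b"test.txt"

# Correct offset: the length field is read as 8.
length, name, next_offset = decode_xdr_string(sample, 0)
print(length, name, next_offset)

# Wrong offset: the ASCII bytes "test" are interpreted as the length.
bad_length, _, _ = decode_xdr_string(sample, 4)
print(hex(bad_length))
```

Comparing the length fields in the capture against the bytes around them should show whether the reply on the wire is already malformed or whether the client is decoding a valid reply incorrectly.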

Does the problem still occur when you change "soft" to "hard"? Note that
the default setting for "retrans" as set by the nfs-utils "mount"
program is way too low for "soft" on UDP.

Cheers,
Trond

BTW: 2.4.23 has no readdir changes at all compared to 2.4.21. I've no
idea WTF 2.4.23-pac1 contains...

2004-01-29 05:40:12

by Jussi Hamalainen

Subject: Re: NFS: giant filename in readdir

On Thu, 29 Jan 2004, Trond Myklebust wrote:

> Any info forthcoming on the filesystem you used and/or a binary tcpdump
> demonstrating the problem?

All filesystems are ext3. I'll try to get you a tcpdump if and when
the phenomenon happens again.

> Does the problem still occur when you change "soft" to "hard"?

Both boxes have two NFS mounts from each other. One is soft, one is
hard, and this happens on both mounts simultaneously.

--
-=[ Count Zero / TBH - Jussi Hämäläinen - email [email protected] ]=-