2005-01-20 19:07:43

by Tobias Diedrich

Subject: NFSv3, 2.6, ext3 and dir_index

Hi,

I recently switched from a local home directory to an NFS-mounted one
and noticed some strange things:

I have a cron job that walks my mp3 directory, and after the
switch find would print "foo/bar/blubb: No such file or directory"
for a few (approx. 2-3 out of 1928) directories, but not always the same
ones. Also, my mp3 player would sometimes stop in the middle of a
song and skip to the next one, because it got ENOENT on a read of
an open file.
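
The find symptom described above can be checked for with a small script. The following is only a minimal sketch, assuming Python 3; the function name find_missing is hypothetical and not part of the original cron job. It walks a tree the way find would and records every directory that fails to open with ENOENT.

```python
import os


def find_missing(root):
    """Walk root and return (path, message) pairs for directories
    that could not be opened because of ENOENT."""
    missing = []
    stack = [root]
    while stack:
        current = stack.pop()
        try:
            with os.scandir(current) as entries:
                for entry in entries:
                    # Do not follow symlinks, so the walk cannot loop.
                    if entry.is_dir(follow_symlinks=False):
                        stack.append(entry.path)
        except FileNotFoundError as exc:
            # This is the case that shows up as
            # "No such file or directory" in find's output.
            missing.append((current, exc.strerror))
    return missing
```

Running it repeatedly against the NFS mount and comparing the returned lists would show whether the affected directories change between runs, as they did with find.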

At first I thought it might just be a problem with my kernel
version or one of the patches I'm using, so I upgraded to a more
recent one (currently 2.6.10-ac8-nfsacl on the client and
2.6.10-ac8-imq-nfsacl on the server; I also had the same problem
without the nfsacl patches).

Then I read an older thread about trouble with ext3/dir_index and
NFS in older 2.6 versions and tried disabling dir_index (which was
enabled on all my ext3 filesystems). With that, the problem
vanished.

Are there any known problems with dir_index and NFS, or is this
maybe a new bug?

Getting a tcpdump of the client<->server traffic proved difficult,
because the bug is quite sporadic and reproducing it involves
a lot of NFS traffic. I can reliably trigger it with my
CD/DVD burning script, which generates an md5sum for each file and
writes it both to the file MD5SUMS in the current directory and
to another file in my home directory (the files being
md5summed also live on another NFS export, so there is a _lot_ of
traffic).
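
As an illustration of the kind of load described above, here is a minimal sketch in Python 3. It is not the original burning script; the names md5_of and write_md5sums and the extra_log parameter are assumptions made for the example. It hashes every regular file in a directory and appends the result both to MD5SUMS in that directory and to a second file elsewhere.

```python
import hashlib
import os


def md5_of(path, chunk_size=4096):
    """Return the hex MD5 digest of a file, read in small chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_md5sums(directory, extra_log):
    """Hash each regular file in directory and append the lines to
    directory/MD5SUMS and to extra_log (hypothetical second copy)."""
    lines = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        # Skip the checksum file itself and anything that is not a file.
        if name == "MD5SUMS" or not os.path.isfile(path):
            continue
        # Same "digest, two spaces, name" layout that md5sum prints.
        lines.append(f"{md5_of(path)}  {name}\n")
    for target in (os.path.join(directory, "MD5SUMS"), extra_log):
        with open(target, "a") as out:
            out.writelines(lines)
    return lines
```

With the source directory and extra_log on different NFS exports, each call mixes reads, directory listings and appends across both mounts, which is roughly the traffic pattern described.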

I _do_ have a traffic capture of ls returning "No such file or
directory" on the current directory, followed by a cd up a level and
back down into the directory, and then a working ls.

Client fstab entry:
nukunuku:/mnt/space1/ranma /home/ranma nfs hard,intr,bg,udp,rsize=4096,wsize=4096 0 0

/proc/mounts entry:
nukunuku:/mnt/space1/ranma /home/ranma nfs rw,v3,rsize=4096,wsize=4096,hard,intr,udp,lock,addr=nukunuku 0 0

Server export file:
/ melchior.yamamaya.is-a-geek.org(sync,rw,no_root_squash)
/mnt/space1 melchior.yamamaya.is-a-geek.org(sync,rw,no_root_squash)
/mnt/space2 melchior.yamamaya.is-a-geek.org(sync,rw,no_root_squash)
/mnt/space3 melchior.yamamaya.is-a-geek.org(sync,rw,no_root_squash)
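
When comparing the requested fstab options with the effective ones in /proc/mounts, it can help to split the option string into a dictionary. A minimal sketch in Python 3 follows; the function name parse_mount_line is hypothetical.

```python
def parse_mount_line(line):
    """Split one /proc/mounts (or fstab) line into its fields and
    turn the comma-separated option string into a dict."""
    device, mountpoint, fstype, options = line.split()[:4]
    parsed = {}
    for option in options.split(","):
        key, sep, value = option.partition("=")
        # Flags such as "hard" or "udp" carry no value; store True.
        parsed[key] = value if sep else True
    return {
        "device": device,
        "mountpoint": mountpoint,
        "fstype": fstype,
        "options": parsed,
    }
```

Parsing both the fstab line and the /proc/mounts line and comparing the two option dicts shows which options the kernel added or dropped (here, for example, v3 and lock only appear in /proc/mounts).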

--
Tobias PGP: http://9ac7e0bc.uguu.de


Attachments:
(No filename) (2.19 kB)
nfs.tcpdump.gz (5.27 kB)

2005-01-20 19:26:33

by Michael Haverkamp

Subject: Re: NFSv3, 2.6, ext3 and dir_index

I have had the same problem with LVM2 and reiserfs using kernel 2.6.9.
Whenever there is a large I/O load on the server, these errors crop up.
Copying everything on /space (/dev/hda5) to /space_bak (/dev/hdc5) with
cp -a always triggers the problem for me. I have not tried
removing LVM2 yet, so I don't know if that is a contributor.

I don't think that this problem is limited to ext3 and dir_index.

--
Michael Haverkamp


_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs

2005-03-20 21:21:18

by Tobias Diedrich

Subject: Re: NFSv3, 2.6, ext3 and dir_index

Sorry for the late reply.

Chip Salzenberg wrote:

> Did you ever get any replies on dir_index vs. nfs?

Not really.

> I'm just setting up a new server and I'm wondering if there's
> something still out there that might make me sorry to use dir_index.

For now I've just disabled dir_index and I haven't had any problems
since, but I'm not completely sure that it really was the cause of
the problem I was seeing. At least disabling dir_index is rather
easy (remove the dir_index feature flag with tune2fs and run e2fsck to
get rid of the remaining hash trees).
Maybe I'll try again to get a proper traffic trace, but I guess I'd
have to set up a test client first to separate out the unrelated NFS traffic.
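
The tune2fs/e2fsck step mentioned above could look roughly like the following sketch in Python 3. It only builds and prints the command lines and does not run them. The device name /dev/sdXN is a placeholder, the helper name is hypothetical, and the choice of e2fsck -fD (force a check and optimize directories) is an assumption about how the leftover hash trees are removed, not something stated in the post. The filesystem should be unmounted first.

```python
import shlex


def disable_dir_index_commands(device):
    """Return the two command lines for turning off dir_index on an
    unmounted ext3 filesystem. Nothing is executed here."""
    return [
        # Clear the dir_index feature flag in the superblock.
        ["tune2fs", "-O", "^dir_index", device],
        # Assumed flags: -f forces a full check, -D optimizes
        # directories, rewriting them without hash trees once the
        # feature is off.
        ["e2fsck", "-fD", device],
    ]


# /dev/sdXN is a placeholder, not a real device name.
for command in disable_dir_index_commands("/dev/sdXN"):
    print(shlex.join(command))
```

Printing the commands rather than executing them leaves the decision to run them, and on which device, with the administrator.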

--
Tobias PGP: http://9ac7e0bc.uguu.de

