I recently patched my 2.4.19 kernel with EXT3 dir_index support and tried
it out on my 80GB EXT3 data partition. This partition is used to cache CVS
and BK checkouts from 17 or so different software projects, some of them
quite large (linux-kernel, GNOME2, and GNU src come to mind) and so it
contains thousands of directories and hundreds of thousands of files. I
serve this via NFS to a couple of clients. Using htrees on this file
system would seem to be a good idea in theory.
(First let me state that this kernel has also been patched with the
"International Kernel Patch" for 2.4.18 but I don't believe this patch
touches any of EXT3 or JBD.)
Setting the dir_index feature flag and running e2fsck -fD (1.30-WIP) went
without a hitch. However, my problems started almost immediately.
First, there is some interaction with knfsd and the nfsfs on the clients
(also 2.4.19) where a large directory can put the client into an endless
loop when iterating directory entries. I have an exported directory
that contains 849 Ogg Vorbis files that would lock 'ls' etc. every time.
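(For anyone trying to reproduce this on a client: a minimal sketch of a check that walks a directory with readdir() and gives up after a caller-chosen number of entries, so a cycling listing is reported instead of hanging. The function name count_dir_entries and the return codes are hypothetical and are not part of the original report.)

```c
#include <dirent.h>
#include <stddef.h>

/* Count the entries in `path` (including "." and ".."), giving up once
 * more than `limit` entries have been seen.
 * Returns the count, -1 if the directory cannot be opened, or -2 if the
 * limit was exceeded, which would suggest the listing never terminates. */
long count_dir_entries(const char *path, long limit)
{
    DIR *dir = opendir(path);
    if (dir == NULL)
        return -1;

    long n = 0;
    struct dirent *ent;
    while ((ent = readdir(dir)) != NULL) {
        if (++n > limit) {
            /* Far more entries than expected: stop instead of looping. */
            closedir(dir);
            return -2;
        }
    }
    closedir(dir);
    return n;
}
```

(For a directory known to hold 849 files, a limit of a few thousand would be enough to tell a normal listing from one that loops.)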
Also, I encountered a problem when building GNOME2 using a script that
first unpacks a tarball of the module, does a CVS update, repacks the
updated module, then does a configure/build/install cycle, then removes
the working sources.
Intermittently this triggers a race where a file is deleted but the
directory metadata is not entirely updated, leading to a condition where a
file partially exists, e.g.
# rm -rf some-large-project
rm: some-large-project/CVS/Entries: Input/Output error.
rm: some-large-project/CVS: Directory not empty.
rm: some-large-project: Directory not empty.
# cd some-large-project/CVS
# ls
Entries: Input/Output error.
I wrote a very simple utility that calls unlink() directly:
# unlink Entries
This succeeds in clearing the bogus entry but EXT3 complains:
kernel: EXT3-fs warning (device ide0(3,65)): ext3-unlink:
Deleting nonexistent file (9012125), 0
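(A utility of the kind described above could be as small as the following sketch. This is not the actual tool used in the report; the function name unlink_entry is hypothetical. It calls unlink(2) on the given path and does nothing else.)

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Remove a single directory entry by calling unlink(2) directly.
 * Returns 0 on success, or the errno value reported by unlink() on failure. */
int unlink_entry(const char *path)
{
    if (unlink(path) != 0) {
        int err = errno;  /* save errno before calling stdio functions */
        fprintf(stderr, "unlink %s: %s\n", path, strerror(err));
        return err;
    }
    return 0;
}
```

(A main() wrapper would simply pass argv[1] to unlink_entry() and use the result as the exit status.)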
Of these two problems the latter is only a nuisance but the former
rendered my NFS exports useless, so I had to revert the filesystem.
Clearing the dir_index feature flag and then running e2fsck did the trick.
If there is any additional information I can provide, please let me know.
Andrew Purtell [email protected]
Network Associates Technologies, Inc. Los Angeles, CA
On Wednesday 09 October 2002 20:29, [email protected] wrote:
> I recently patched my 2.4.19 kernel with EXT3 dir_index support and tried
> it out on my 80GB EXT3 data partition...
Could you please provide a pointer to the patch you used?
--
Daniel
On Oct 10, 2002 15:08 +0200, Daniel Phillips wrote:
> On Wednesday 09 October 2002 20:29, [email protected] wrote:
> > I recently patched my 2.4.19 kernel with EXT3 dir_index support and tried
> > it out on my 80GB EXT3 data partition...
>
> Could you please provide a pointer to the patch you used?
A number of people have been getting this same bug under high load. I
believe they are using the patches from Ted, and/or BK extfs.bkbits.net.
Cheers, Andreas
--
Andreas Dilger
http://www-mddsp.enel.ucalgary.ca/People/adilger/
http://sourceforge.net/projects/ext2resize/
On Thursday 10 October 2002 19:03, Andreas Dilger wrote:
> On Oct 10, 2002 15:08 +0200, Daniel Phillips wrote:
> > On Wednesday 09 October 2002 20:29, [email protected] wrote:
> > > I recently patched my 2.4.19 kernel with EXT3 dir_index support and tried
> > > it out on my 80GB EXT3 data partition...
> >
> > Could you please provide a pointer to the patch you used?
>
> A number of people have been getting this same bug under high load. I
> believe they are using the patches from Ted, and/or BK extfs.bkbits.net.
Does the Chris Lee flavor of the patch (before Ted's cleanups) exhibit the
same bug? I suppose the pre-cleanup patch is incompatible with e2fsck
because of the hash function change, but that would be easy to fix.
--
Daniel
On Thursday 10 October 2002 19:03, Andreas Dilger wrote:
> On Oct 10, 2002 15:08 +0200, Daniel Phillips wrote:
> > On Wednesday 09 October 2002 20:29, [email protected] wrote:
> > > I recently patched my 2.4.19 kernel with EXT3 dir_index support and tried
> > > it out on my 80GB EXT3 data partition...
> >
> > Could you please provide a pointer to the patch you used?
>
> A number of people have been getting this same bug under high load. I
> believe they are using the patches from Ted, and/or BK extfs.bkbits.net.
OK, I've read through the patch and the original thread re this problem.
There are a few obvious things to try:
- Does the problem come up when there is only one rsync running
concurrently? (If so, we have an SMP race.)
- Is the behaviour the same before and after Ted's cleanups? (Get
the old version out of cvs...)
- Does the problem manifest with Ext2? (Somebody - me probably -
has to dust off the Ext2 patch and add the new hash.)
--
Daniel