2002-06-18 16:02:49

by DervishD

Subject: Shrinking ext3 directories

Hi all :))

All of you know that if you create a lot of files or directories
within a directory on ext2/3 and after that you remove them, the
blocks aren't freed (this is the reason behind the lost+found block
preallocation). If you want to 'shrink' the directory now that it
doesn't contain a lot of leafs, the only solution I know is creating
a new directory, move the remaining leafs to it, remove the
'big-unshrinken' directory and after that renaming the new directory:

$ mkdir new-dir
$ mv bigone/* new-dir/
$ rmdir bigone
$ mv new-dir bigone
(Well, sort of)
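
The four commands above skip dotfiles (the shell glob does not match them) and do not carry over the directory's permission bits. A minimal Python sketch of the same move-and-rename idea follows. The helper name `compact_directory` is hypothetical, the procedure is not atomic, ownership is not preserved, and it assumes nothing else is using the directory while it runs.

```python
import os
import shutil
import tempfile


def compact_directory(path):
    """Rebuild `path` as a fresh directory holding the same entries.

    Hypothetical helper: not atomic, and it assumes no other process is
    creating or opening files in `path` while it runs.
    """
    path = os.path.abspath(path)
    parent = os.path.dirname(path)
    # Create the replacement next to the original so that every rename
    # stays on the same filesystem.
    fresh = tempfile.mkdtemp(prefix=".compact-", dir=parent)
    # os.listdir() also returns dotfiles, which "bigone/*" would skip.
    for name in os.listdir(path):
        os.rename(os.path.join(path, name), os.path.join(fresh, name))
    shutil.copymode(path, fresh)   # carry over the permission bits
    os.rmdir(path)                 # the old, oversized directory is now empty
    os.rename(fresh, path)
    return os.stat(path).st_size   # size of the directory file itself
```

Comparing the returned size with the value reported before the call shows whether the directory file actually shrank on a given filesystem.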

Any other way of doing the same without the mess?

Thanks a lot :)
Raúl


2002-06-18 16:10:32

by Austin Gonyou

Subject: Re: Shrinking ext3 directories

Use a volume manager? (LVM or EVMS maybe.) You can grow and shrink their
volumes dynamically. EXT3 must support ioctls for this, but if it does,
'cause I've seen it done with EXT2, then you're good.

On Tue, 2002-06-18 at 11:08, DervishD wrote:
> Hi all :))
>
> All of you know that if you create a lot of files or directories
> within a directory on ext2/3 and after that you remove them, the
> blocks aren't freed (this is the reason behind the lost+found block
> preallocation). If you want to 'shrink' the directory now that it
> doesn't contain a lot of leafs, the only solution I know is creating
> a new directory, move the remaining leafs to it, remove the
> 'big-unshrinken' directory and after that renaming the new directory:
>
> $ mkdir new-dir
> $ mv bigone/* new-dir/
> $ rmdir bigone
> $ mv new-dir bigone
> (Well, sort of)
>
> Any other way of doing the same without the mess?
>
> Thanks a lot :)
> Raúl
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
--
Austin Gonyou <[email protected]>

2002-06-18 16:22:13

by Padraig Brady

Subject: Re: Shrinking ext3 directories

DervishD wrote:
> Hi all :))
>
> All of you know that if you create a lot of files or directories
> within a directory on ext2/3 and after that you remove them, the
> blocks aren't freed (this is the reason behind the lost+found block
> preallocation). If you want to 'shrink' the directory now that it
> doesn't contain a lot of leafs, the only solution I know is creating
> a new directory, move the remaining leafs to it, remove the
> 'big-unshrinken' directory and after that renaming the new directory:
>
> $ mkdir new-dir
> $ mv bigone/* new-dir/
> $ rmdir bigone
> $ mv new-dir bigone
> (Well, sort of)

The zipdir component of fslint does this (while maintaining permissions
etc.).

> Any other way of doing the same without the mess?

Not at present I think. Perhaps we'll get it for free with
the new htree directory indexing?

Padraig.

2002-06-18 16:41:33

by Andreas Dilger

Subject: Re: Shrinking ext3 directories

On Jun 18, 2002 11:10 -0500, Austin Gonyou wrote:
> Use a volume manager? (LVM or EVMS maybe.) You can grow and shrink their
> volumes dynamically. EXT3 must support ioctls for this, but if it does,
> 'cause I've seen it done with EXT2, then you're good.

Totally irrelevant.

> On Tue, 2002-06-18 at 11:08, DervishD wrote:
> > Hi all :))
> >
> > All of you know that if you create a lot of files or directories
> > within a directory on ext2/3 and after that you remove them, the
> > blocks aren't freed (this is the reason behind the lost+found block
> > preallocation). If you want to 'shrink' the directory now that it
> > doesn't contain a lot of leafs, the only solution I know is creating
> > a new directory, move the remaining leafs to it, remove the
> > 'big-unshrinken' directory and after that renaming the new directory:
> >
> > $ mkdir new-dir
> > $ mv bigone/* new-dir/
> > $ rmdir bigone
> > $ mv new-dir bigone
> > (Well, sort of)
> >
> > Any other way of doing the same without the mess?

Not right now. Truncating directories on a mounted filesystem is
probably going to be a big source of strange problems. In the end
it isn't really a problem for most people, because if your directory
has grown big once it is likely to grow big again.

With htree directories we will have to make this work at some point,
because you will be able to create huge directories and not being able
to free directory blocks would be a big pain. It isn't in the current
htree directory code yet, but it has been discussed a bit already.

Cheers, Andreas
--
Andreas Dilger
http://www-mddsp.enel.ucalgary.ca/People/adilger/
http://sourceforge.net/projects/ext2resize/

2002-06-18 16:59:28

by David Lang

Subject: Re: Shrinking ext3 directories

This won't be part of the htree indexing. However, the biggest reason to be
concerned about shrinking the directory is the horrible performance that
ext2/3 have when dealing with large directories. The htree stuff will
hopefully eliminate that performance problem, so the only remaining reason
would be to free up a few K of disk space (which is a MUCH less critical
issue).

David Lang


On Tue, 18 Jun 2002, Padraig Brady wrote:

> Date: Tue, 18 Jun 2002 17:21:18 +0100
> From: Padraig Brady <[email protected]>
> To: DervishD <[email protected]>
> Cc: Linux-kernel <[email protected]>
> Subject: Re: Shrinking ext3 directories
>
> DervishD wrote:
> > Hi all :))
> >
> > All of you know that if you create a lot of files or directories
> > within a directory on ext2/3 and after that you remove them, the
> > blocks aren't freed (this is the reason behind the lost+found block
> > preallocation). If you want to 'shrink' the directory now that it
> > doesn't contain a lot of leafs, the only solution I know is creating
> > a new directory, move the remaining leafs to it, remove the
> > 'big-unshrinken' directory and after that renaming the new directory:
> >
> > $ mkdir new-dir
> > $ mv bigone/* new-dir/
> > $ rmdir bigone
> > $ mv new-dir bigone
> > (Well, sort of)
>
> The zipdir component of fslint does this (while maintaining permissions
> etc.).
>
> > Any other way of doing the same without the mess?
>
> Not at present I think. Perhaps we'll get it for free with
> the new htree directory indexing?
>
> Padraig.

2002-06-18 19:28:37

by DervishD

Subject: Re: Shrinking ext3 directories

Hi Austin :)

>Use a volume manager? (LVM or EVMS maybe.) You can grow and shrink
>their volumes dynamically. EXT3 must support ioctls for this, but if
>it does, 'cause I've seen it done with EXT2, then you're good.

I'm afraid I explained myself badly O:) I wasn't referring to the
directory contents, but to the metadata, or the blocks of the directory itself.

Thanks anyway :))
Raúl

2002-06-18 19:32:25

by DervishD

Subject: Re: Shrinking ext3 directories

Hi Padraig :)

>> Any other way of doing the same without the mess?
>Not at present I think. Perhaps we'll get it for free with
>the new htree directory indexing?

What's that? BTW, thanks a lot, I didn't know about that feature
of fslint.

Raúl

2002-06-18 19:35:18

by DervishD

Subject: Re: Shrinking ext3 directories

Hi Andreas :)

>> > Any other way of doing the same without the mess?
>Not right now. Truncating directories on a mounted filesystem is
>probably going to be a big source of strange problems.

So I'd better go with the move-and-rename alternative...

>In the end it isn't really a problem for most people, because if
>your directory has grown big once it is likely to grow big again.

Well, my problem arose just a couple of times, when by mistake I
extracted a tarball in a directory, or ran a wrong 'touch' command
(related to some bad use of 'seq'), etc... Not a problem at all, just
curiosity about this.

>With htree directories we will have to make this work at some point,

Ok, I'll take a look then :)) Thanks for answering :)
Raúl

2002-06-18 21:50:07

by Stephen C. Tweedie

Subject: Re: Shrinking ext3 directories

Hi,

On Tue, Jun 18, 2002 at 06:08:28PM +0200, DervishD wrote:

> All of you know that if you create a lot of files or directories
> within a directory on ext2/3 and after that you remove them, the
> blocks aren't freed (this is the reason behind the lost+found block
> preallocation). If you want to 'shrink' the directory now that it
> doesn't contain a lot of leafs, the only solution I know is creating
> a new directory, move the remaining leafs to it, remove the
> 'big-unshrinken' directory and after that renaming the new directory

Right. Shrinking directories is not implemented for ext2 or ext3 at
the moment. However, I know that Daniel Phillips has been thinking
about adding that for his HTree extensions which add fast directory
indexing to ext2/3.

Cheers,
Stephen

2002-06-18 22:18:54

by Alexander Viro

Subject: Re: Shrinking ext3 directories



On Tue, 18 Jun 2002, Stephen C. Tweedie wrote:

> Hi,
>
> On Tue, Jun 18, 2002 at 06:08:28PM +0200, DervishD wrote:
>
> > All of you know that if you create a lot of files or directories
> > within a directory on ext2/3 and after that you remove them, the
> > blocks aren't freed (this is the reason behind the lost+found block
> > preallocation). If you want to 'shrink' the directory now that it
> > doesn't contain a lot of leafs, the only solution I know is creating
> > a new directory, move the remaining leafs to it, remove the
> > 'big-unshrinken' directory and after that renaming the new directory
>
> Right. Shrinking directories is not implemented for ext2 or ext3 at
> the moment. However, I know that Daniel Phillips has been thinking
> about adding that for his HTree extensions which add fast directory
> indexing to ext2/3.

<shrug> for ext2 a limited form of "shrinking" is easy to implement.
ext2_delete_entry() can easily notice that it's about to create an
empty entry spanning entire last block. In that case it should
just walk back and check beginnings of previous blocks, as long
as they are empty (inode = 0, len = block size). Then it's vmtruncate()
time - all IO on directories is protected by i_sem, so we are safe.

IOW, making sure that empty blocks in the end of directory get freed
is a matter of 10-20 lines. If you want such patch - just tell, it's
half an hour of work...
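
The walk-back described above can be modelled in user space. This is a toy Python sketch, not kernel code: a directory is a list of blocks, each block a list of `(inode, rec_len)` tuples, and deleting list elements stands in for `vmtruncate()`. The function names and the 4096-byte block size are assumptions made for illustration.

```python
BLOCK_SIZE = 4096  # assumed block size for this toy model


def is_empty_block(block, block_size=BLOCK_SIZE):
    """Empty means a single unused entry (inode 0) spanning the whole block."""
    return len(block) == 1 and block[0] == (0, block_size)


def blocks_to_keep(blocks, block_size=BLOCK_SIZE):
    """Walk back from the last block for as long as the blocks are empty.

    Block 0 is always kept in this sketch, since it holds "." and "..".
    """
    keep = len(blocks)
    while keep > 1 and is_empty_block(blocks[keep - 1], block_size):
        keep -= 1
    return keep


def delete_entry(blocks, b, i, block_size=BLOCK_SIZE):
    """Delete entry i of block b, then trim empty blocks off the tail."""
    block = blocks[b]
    inode, rec_len = block[i]
    if i == 0:
        block[0] = (0, rec_len)          # first entry: just mark it unused
    else:
        prev_inode, prev_len = block[i - 1]
        block[i - 1] = (prev_inode, prev_len + rec_len)  # fold into predecessor
        del block[i]
    # Only a delete that empties the last block can expose a free tail.
    if b == len(blocks) - 1 and is_empty_block(block, block_size):
        del blocks[blocks_to_keep(blocks, block_size):]  # stand-in for vmtruncate()
    return len(blocks)
```

Empty blocks in the middle of the directory are left alone by this scheme; they are only released once everything after them is empty as well.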

2002-06-19 09:33:00

by DervishD

Subject: Re: Shrinking ext3 directories

Hi Alexander :))

>IOW, making sure that empty blocks in the end of directory get freed
>is a matter of 10-20 lines. If you want such patch - just tell, it's
>half an hour of work...

IMHO it would be a great feature to add, and if the cost is as
low as you say... BTW, thanks a lot for your answer and your offer :)

Raúl

2002-06-19 10:37:44

by Stephen C. Tweedie

Subject: Re: Shrinking ext3 directories

Hi,

On Tue, Jun 18, 2002 at 06:18:49PM -0400, Alexander Viro wrote:

> IOW, making sure that empty blocks in the end of directory get freed
> is a matter of 10-20 lines. If you want such patch - just tell, it's
> half an hour of work...

It's certainly easier at the tail, but with htree we may have
genuinely enormous directories and being able to hole-punch arbitrary
coalesced blocks could be a huge win. Also, doing the coalescing
block by block is likely to be far easier for ext3 than truncating
the directory arbitrarily back in one go.

Chopping a large directory at once brings back the truncate()
nightmare of having to make an unbounded disk operation seem atomic,
even if it has to get split over multiple transactions. Incremental
coalescing should allow us to know in advance how many disk blocks we
might end up touching for the operation, so we can guarantee to do it
in one transaction.

Cheers,
Stephen

2002-06-19 19:09:59

by Christopher Li

Subject: Re: [Ext2-devel] Re: Shrinking ext3 directories



On Wed, 19 Jun 2002, Stephen C. Tweedie wrote:

> Hi,
>
> On Tue, Jun 18, 2002 at 06:18:49PM -0400, Alexander Viro wrote:
>
> > IOW, making sure that empty blocks in the end of directory get freed
> > is a matter of 10-20 lines. If you want such patch - just tell, it's
> > half an hour of work...
>
> It's certainly easier at the tail, but with htree we may have
> genuinely enormous directories and being able to hole-punch arbitrary
> coalesced blocks could be a huge win. Also, doing the coalescing

I can contribute to that. I am thinking about it anyway.
Daniel might already have some code there.

I have a silly question: where is that ext3 CVS? Under SourceForge
ext2/ext3 or gkernel?

Chris


2002-06-19 20:11:06

by Stephen C. Tweedie

Subject: Re: [Ext2-devel] Re: Shrinking ext3 directories

Hi,

On Wed, Jun 19, 2002 at 01:03:38PM -0400, Christopher Li wrote:

> On Wed, 19 Jun 2002, Stephen C. Tweedie wrote:
>
> > Hi,
> >
> > On Tue, Jun 18, 2002 at 06:18:49PM -0400, Alexander Viro wrote:
> >
> > > IOW, making sure that empty blocks in the end of directory get freed
> > > is a matter of 10-20 lines. If you want such patch - just tell, it's
> > > half an hour of work...
> >
> > It's certainly easier at the tail, but with htree we may have
> > genuinely enormous directories and being able to hole-punch arbitrary
> > coalesced blocks could be a huge win. Also, doing the coalescing
> I can contribute to that. I am thinking about it anyway.
> Daniel might already have some code there.
>
> I have a silly question: where is that ext3 CVS? Under SourceForge
> ext2/ext3 or gkernel?

cvs -d :ext:[email protected]:/cvsroot/gkernel co ext3

The branches being used are

cvs up -r ext3-1_0-branch # HEAD of ext3 development
cvs up -r features-branch # For htree, ACLs etc

and there are a couple of other branches I use for tracking merges into
Linus's and the -ac trees. The htree stuff is all that's new in the
features-branch right now.

Cheers,
Stephen

2002-06-19 20:34:55

by Stephen C. Tweedie

Subject: Re: [Ext2-devel] Re: Shrinking ext3 directories

Hi,

On Wed, Jun 19, 2002 at 09:10:51PM +0100, Stephen C. Tweedie wrote:

> cvs -d :ext:[email protected]:/cvsroot/gkernel co ext3
> The branches being used are
>
> cvs up -r ext3-1_0-branch # HEAD of ext3 development
> cvs up -r features-branch # For htree, ACLs etc

And one other thing: the subdirectories tools/, testing/ and scripts/
on the cvs trunk contain various tools for testing and stressing the
filesystem and VM.

--Stephen

2002-06-19 22:50:55

by Daniel Phillips

Subject: Re: [Ext2-devel] Re: Shrinking ext3 directories

On Wednesday 19 June 2002 19:03, Christopher Li wrote:
> On Wed, 19 Jun 2002, Stephen C. Tweedie wrote:
> > On Tue, Jun 18, 2002 at 06:18:49PM -0400, Alexander Viro wrote:
> >
> > > IOW, making sure that empty blocks in the end of directory get freed
> > > is a matter of 10-20 lines. If you want such patch - just tell, it's
> > > half an hour of work...
> >
> > It's certainly easier at the tail, but with htree we may have
> > genuinely enormous directories and being able to hole-punch arbitrary
> > coalesced blocks could be a huge win. Also, doing the coalescing
>
> I can contribute to that. I am thinking about it anyway.
> Daniel might already have some code there.

I don't have code, but let me remind you of this post:

http://marc.theaimsgroup.com/?l=ext2-devel&m=102132142032096&w=2

A sketch of the coalescing design is at the end. I'll formalize that.
One issue Stephen touched on that I hadn't settled at the time, is how
to handle deleted blocks. My inclination is to copy the last block of
the directory into the vacated block as opposed to leaving a hole in
the file. The slight extra cost doesn't seem to be worth worrying
about, and it's guaranteed to leave the directory in a compact state
when emptied.

The two competing approaches are the hole-punch idea - which I didn't
consider before - and keeping a list of free blocks somehow. I think
it's best to err on the side of simplicity this time: the copy-down-last
strategy eliminates the need to search for a free block when the
directory needs to be expanded again.
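
A rough Python sketch of the copy-down-last strategy, under simplifying assumptions: `blocks` is the directory file as a list of block payloads, and `index` is a plain dict from a hash key to a block number, standing in for the htree index. Both structures and the function name are hypothetical.

```python
def free_block_copy_down(blocks, index, vacated):
    """Release block `vacated` by moving the last block into its slot."""
    last = len(blocks) - 1
    # Drop any index pointer that still refers to the vacated block.
    for key in [k for k, blk in index.items() if blk == vacated]:
        del index[key]
    if vacated != last:
        blocks[vacated] = blocks[last]       # copy the tail block down
        for key, blk in list(index.items()):
            if blk == last:
                index[key] = vacated         # repoint the index at the new home
    blocks.pop()                             # the file shrinks by exactly one block
    return len(blocks)
```

The directory file never contains a hole in this model, so growing it again is a plain append at the end.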

--
Daniel

2002-06-20 00:27:31

by Andreas Dilger

Subject: Re: [Ext2-devel] Re: Shrinking ext3 directories

On Jun 20, 2002 00:49 +0200, Daniel Phillips wrote:
> My inclination is to copy the last block of the directory into the
> vacated block as opposed to leaving a hole in the file. The slight
> extra cost doesn't seem to be worth worrying about, and it's guaranteed
> to leave the directory in a compact state when emptied.

This also has the benefit of avoiding huge truncates when we are
deleting lots of files. At most it will add a single block into
each transaction.

> I think it's best to err on the side of simplicity this time: the
> copy-down-last strategy eliminates the need to search for a free
> block when the directory needs to be expanded again.

It also keeps compatibility with older code, whereas having holes in
directories can cause problems on older kernels.

Cheers, Andreas
--
Andreas Dilger
http://www-mddsp.enel.ucalgary.ca/People/adilger/
http://sourceforge.net/projects/ext2resize/

2002-06-20 09:35:07

by Stephen C. Tweedie

Subject: Re: [Ext2-devel] Re: Shrinking ext3 directories

Hi,

On Thu, Jun 20, 2002 at 12:49:57AM +0200, Daniel Phillips wrote:

> I don't have code, but let me remind you of this post:
>
> http://marc.theaimsgroup.com/?l=ext2-devel&m=102132142032096&w=2
>
> A sketch of the coalescing design is at the end. I'll formalize that.

One question --- just how stable will this be if we boot into a kernel
that doesn't have the coalescing enabled, and start modifying
directories? We _could_ just teach the current code to clear those
top 8 bits in the parent any time we touch a leaf node, but that's
unnecessarily expensive, so we'd really need to have some way of
either recreating the hint fields from scratch every so often, or of
spotting when they have become badly out-of-date.

Cheers,
Stephen

2002-06-20 10:21:26

by Andreas Dilger

Subject: Re: [Ext2-devel] Re: Shrinking ext3 directories

On Jun 20, 2002 10:34 +0100, Stephen C. Tweedie wrote:
> On Thu, Jun 20, 2002 at 12:49:57AM +0200, Daniel Phillips wrote:
> > I don't have code, but let me remind you of this post:
> >
> > http://marc.theaimsgroup.com/?l=ext2-devel&m=102132142032096&w=2
> >
> > A sketch of the coalescing design is at the end. I'll formalize that.
>
> One question --- just how stable will this be if we boot into a kernel
> that doesn't have the coalescing enabled, and start modifying
> directories? We _could_ just teach the current code to clear those
> top 8 bits in the parent any time we touch a leaf node, but that's
> unnecessarily expensive, so we'd really need to have some way of
> either recreating the hint fields from scratch every so often, or of
> spotting when they have become badly out-of-date.

Three notes:
1) Coalescing isn't necessarily the same as just discarding empty
   blocks. We can do the latter much more easily, and without the
   hint bits at all, but it won't work unless a block is totally
   empty, so you could still approach 0% fullness with huge directories.

2) The hint bits are meant to be intentionally vague (i.e. only a hint)
   so there is no need to keep them 100% up-to-date. If it turns out
   that you modify a directory with a kernel that does not understand
   coalescing it is fairly benign. The worst that would happen is that
   you get an empty leaf block (assuming you don't even have the simple
   support for dropping empty blocks), or you try to coalesce with such
   a block and find it too full to do the merge, so you update the hint
   again with the correct value. Over the normal course of operations
   you would be updating the hints for each block anyways, so invalid
   hints would be cleaned out from the index.

To avoid extra overhead from writing out the parent each time you delete
an entry from the leaf, you could update the values all of the time
(you had to have read the parent to find the correct leaf block), but
not mark the block dirty, so the updated hints are only written to disk
if there is another reason to write out the block (e.g. split/coalesce
of a leaf block).

Having a large number of bits of hint info would not necessarily be
useful. In Daniel's "1-bit hint" example, the actual worst-case fullness
could _approach_ 25%, but you would always drop 100% empty blocks
immediately, so it would never quite get there.

With 2 bits of hint, you would probably only merge if the sum of the two
neighbours was <= 3 (i.e. 75% fullness for a single block), because you
don't necessarily want to be merging blocks to be almost 100% full and
then splitting them again. This would give a worst-case fullness between
37.5% and 75% at any time, which isn't really so bad given the performance
implications of repeated merge+truncate+allocate+split operations.
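
The 2-bit merge rule can be sketched in Python as follows. The encoding is an assumption, not something settled in this thread: the hint is the fullness in quarters of a block, rounded up and clamped to 3. All function names are hypothetical. A stale hint only costs one failed merge attempt, after which the hints are refreshed.

```python
HINT_BITS = 2
HINT_MAX = (1 << HINT_BITS) - 1   # 3 for a 2-bit hint


def fullness_hint(used_bytes, block_size=4096):
    """Quantize fullness to quarters, rounded up and clamped (assumed encoding)."""
    quarters = (used_bytes * 4 + block_size - 1) // block_size
    return min(quarters, HINT_MAX)


def should_try_merge(hint_a, hint_b):
    """Only attempt a merge when the hints sum to at most 3 (about 75%)."""
    return hint_a + hint_b <= HINT_MAX


def try_merge(used_a, used_b, hint_a, hint_b, block_size=4096):
    """Return (merged, new_hint_a, new_hint_b)."""
    if not should_try_merge(hint_a, hint_b):
        return False, hint_a, hint_b
    if used_a + used_b > block_size:
        # The hints were stale: record accurate values so that the same
        # failed merge is not attempted again.
        return (False,
                fullness_hint(used_a, block_size),
                fullness_hint(used_b, block_size))
    # Everything now lives in the first block; the second one is empty.
    return True, fullness_hint(used_a + used_b, block_size), 0
```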

Remember also that each leaf block merge will incur a copy from the tail
block (which may need to be read from disk) and then a truncate to drop
that block. We _could_ leave some number of empty dir blocks at the end
of the directory file if we had some sort of dir prealloc scheme happening.
There would be some amount of hysteresis there to avoid the repeated
alloc/free overhead (i.e. keep no more than 8 free blocks, but allocate
8 blocks at a time if you need more).
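
A toy Python model of that hysteresis, assuming that blocks released by merges go back into a spare pool at the end of the directory and that anything beyond the cap is truncated away. The class name, the counters and the "trim back to the cap" policy are illustrative assumptions.

```python
class DirTail:
    """Spare blocks kept at the end of a directory file (toy model)."""

    def __init__(self, max_spare=8, batch=8):
        self.max_spare = max_spare   # keep no more than this many free blocks
        self.batch = batch           # allocate this many at a time
        self.spare = 0
        self.allocations = 0         # trips to the block allocator
        self.truncations = 0         # trips to truncate

    def need_block(self):
        """A leaf split needs one more directory block."""
        if self.spare == 0:
            self.spare += self.batch
            self.allocations += 1
        self.spare -= 1

    def release_block(self):
        """A leaf merge hands one block back."""
        self.spare += 1
        if self.spare > self.max_spare:
            self.spare = self.max_spare   # assumed policy: trim the excess
            self.truncations += 1
```

A workload that alternates between one split and one merge touches the allocator once in this model, instead of once per split.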

Cheers, Andreas
--
Andreas Dilger
http://www-mddsp.enel.ucalgary.ca/People/adilger/
http://sourceforge.net/projects/ext2resize/

2002-06-20 13:47:00

by Daniel Phillips

Subject: Re: [Ext2-devel] Re: Shrinking ext3 directories

On Thursday 20 June 2002 12:18, Andreas Dilger wrote:
> On Jun 20, 2002 10:34 +0100, Stephen C. Tweedie wrote:
> > One question --- just how stable will this be if we boot into a kernel
> > that doesn't have the coalescing enabled, and start modifying
> > directories? We _could_ just teach the current code to clear those
> > top 8 bits in the parent any time we touch a leaf node, but that's
> > unnecessarily expensive, so we'd really need to have some way of
> > either recreating the hint fields from scratch every so often, or of
> > spotting when they have become badly out-of-date.
>
> Three notes:
> 1) Coalescing isn't necessarily the same as just discarding empty
> blocks. We can do the latter much more easily, and without the
> hint bits at all, but it won't work unless a block is totally
> empty, so you could still approach 0% fullness with huge directories.

And in the accidental-untar case that started this thread, Raul would
have the same complaint: a directory bloats up and never unbloats
until completely emptied.

> 2) The hint bits are meant to be intentionally vague (i.e. only a hint)
> so there is no need to keep them 100% up-to-date. If it turns out
> that you modify a directory with a kernel that does not understand
> coalescing it is fairly benign. The worst that would happen is that
> you get an empty leaf block (assuming you don't even have the simple
> support for dropping empty blocks), or you try to coalesce with such
> a block and find it too full to do the merge, so you update the hint
> again with the correct value. Over the normal course of operations
> you would be updating the hints for each block anyways, so invalid
> hints would be cleaned out from the index.

To state it another way: when the fullness hint is wrong, it's an
underestimate. The self-correcting mechanism you described is exactly
what I had in mind.

> To avoid extra overhead from writing out the parent each time you delete
> an entry from the leaf, you could update the values all of the time
> (you had to have read the parent to find the correct leaf block), but
> not mark the block dirty, so the updated hints are only written to disk
> if there is another reason to write out the block (e.g. split/coalesce
> of a leaf block).

Hmm, so if the VM discards the index block that was updated but never
marked dirty, then on reread the fullness estimate becomes an overestimate
and we could, in rare circumstances, end up with mergeable blocks that
never get merged. Since we're touching the index block (set PG_Referenced),
this might be a very rare occurrence indeed.

Another way to achieve a similar effect is to set the fullness to an
underestimate on delete, by rounding down to some smaller number of
hint bits than we actually have available, and set it to an accurate
estimate only on a failed merge (to prevent repeated unsuccessful
merge attempts). This way, several deletes can take place before we
need to update the index block. Say we round down to 5 bits on delete,
then on a 4K block we can delete 128 bytes, or 6-7 entries, before
having to update the estimate. This gives us 85% of the benefit in
terms of reducing index block dirties while only slightly increasing
the chance of a failed merge.
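
The arithmetic behind those figures can be checked with a few lines of Python. The 20-byte entry size is an assumption: an 8-byte ext2 directory entry header plus a name of 9 to 12 characters padded to a multiple of 4 bytes. Reading the 85% as "index writes avoided" is one plausible interpretation, not something stated in the thread.

```python
def hint_step_bytes(block_size, hint_bits):
    """Bytes of fullness covered by one step of a hint with `hint_bits` bits."""
    return block_size >> hint_bits


def deletes_per_index_update(block_size, hint_bits, entry_size):
    """How many entries can be deleted before the rounded hint changes."""
    return hint_step_bytes(block_size, hint_bits) / entry_size


def index_writes_avoided(block_size, hint_bits, entry_size):
    """Fraction of index updates saved versus updating on every delete."""
    return 1 - 1 / deletes_per_index_update(block_size, hint_bits, entry_size)


# 4K block, 5 hint bits, assumed 20-byte entries
print(hint_step_bytes(4096, 5))
print(deletes_per_index_update(4096, 5, 20))
print(index_writes_avoided(4096, 5, 20))
```

With these assumed sizes, one hint step covers 128 bytes, which is between 6 and 7 entries, and roughly 84% of the index updates are skipped.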

So let me see, the purpose of recording the fullness bits in the
index block in the first place is to save CPU (probing the index
and scanning dirent blocks) and the occasional read. The cost is
extra complexity in the algorithm, and some extra index block writes.
Did we win? I think we did, and significantly, but the analysis
isn't good enough yet to quantify that.

> Having a large number of bits of hint info would not necessarily be
> useful. In Daniel's "1-bit hint" example, the actual worst-case fullness
> could _approach_ 25%, but you would always drop 100% empty blocks
> immediately, so it would never quite get there.
>
> With 2 bits of hint, you would probably only merge if the sum of the two
> neighbours was <= 3 (i.e. 75% fullness for a single block), because you
> don't necessarily want to be merging blocks to be almost 100% full and
> then splitting them again. This would give a worst-case fullness between
> 37.5% and 75% at any time, which isn't really so bad given the performance
> implications of repeated merge+truncate+allocate+split operations.

Yes, provably containing worst case fullness at 50% would be entirely
satisfactory. The current worst case, however rare it may be, is 0%.
I'm hoping for an average steady state fullness in the 70% range.

> Remember also that each leaf block merge will incur a copy from the tail
> block (which may need to be read from disk) and then a truncate to drop
> that block. We _could_ leave some number of empty dir blocks at the end
> of the directory file if we had some sort of dir prealloc scheme happening.
> There would be some amount of hysteresis there to avoid the repeated
> alloc/free overhead (i.e. keep no more than 8 free blocks, but allocate
> 8 blocks at a time if you need more).

We could hope that the block allocation policy eventually improves to
the point where the preallocation is taken care of without the directory
subsystem needing to know about it. On the other hand, if we do want
explicit preallocation at some point then we probably have to anticipate
it now for forward compatibility reasons. I think we will be ok just by
building in enough split-merge hysteresis.

--
Daniel

2002-06-20 16:30:58

by Bill Davidsen

Subject: Re: [Ext2-devel] Re: Shrinking ext3 directories

On Thu, 20 Jun 2002, Andreas Dilger wrote:

> Remember also that each leaf block merge will incur a copy from the tail
> block (which may need to be read from disk) and then a truncate to drop
> that block. We _could_ leave some number of empty dir blocks at the end
> of the directory file if we had some sort of dir prealloc scheme happening.
> There would be some amount of hysteresis there to avoid the repeated
> alloc/free overhead (i.e. keep no more than 8 free blocks, but allocate
> 8 blocks at a time if you need more).

Wouldn't the hysteresis be the frag or block size? Is there benefit to
truncating if it doesn't free any disk space? Actually, there might be
benefit to leaving a few empty blocks at the end of the dir when doing
trunc as a means of reducing alloc.

--
bill davidsen <[email protected]>
CTO, TMR Associates, Inc
Doing interesting things with little computers since 1979.

2002-06-21 14:54:57

by Ville Herva

Subject: Re: [Ext2-devel] Re: Shrinking ext3 directories

On Thu, Jun 20, 2002 at 03:45:51PM +0200, you [Daniel Phillips] wrote:
>
> And in the accidental-untar case that started this thread, Raul would
> have the same complaint: a directory bloats up and never unbloats
> until completely emptied.

Not only accidental untar, but buggy progs as well. Recently, I found out
that named had created tens of thousands of (luckily zero-length) files in a
single dir on ext2. While it only took a couple of hours to delete them with
"find . -name '...'| xargs -n 5000 rm" commands, I can imagine remote DoS
attacks through daemons that create local temp files. Accessing such a
directory quickly becomes slow as molasses on ext2.

Daniel's patch seems great. I also recall someone (Ted Ts'o? Stephen Tweedie?)
had another dir access speed-up patch for ext3... Is that applicable to ext2
or was it already merged?


-- v --

[email protected]

2002-06-21 15:09:12

by Stephen C. Tweedie

Subject: Re: [Ext2-devel] Re: Shrinking ext3 directories

Hi,

On Fri, Jun 21, 2002 at 05:54:51PM +0300, Ville Herva wrote:

> Daniel's patch seems great. I also recall someone (Ted Ts'o? Stephen Tweedie?)
> had another dir access speed-up patch for ext3... Is that applicable to ext2
> or was it already merged?

That was Ted's microoptimisation to start directory lookups at the
point where we last looked in the directory. It's in ext3 already
these days, and it would definitely help for the mass-delete case.

Cheers,
Stephen

2002-06-21 15:38:14

by Ville Herva

Subject: Re: [Ext2-devel] Re: Shrinking ext3 directories

On Fri, Jun 21, 2002 at 04:08:33PM +0100, you [Stephen C. Tweedie] wrote:
>
> That was Ted's microoptimisation to start directory lookups at the
> point where we last looked in the directory.

That exactly. (Micro maybe in size, but I gather it sped up lookups quite a
bit in some cases?)

> It's in ext3 already these days,

So I thought, but not in ext2?

> and it would definitely help for the mass-delete case.

Yep. Anyway, nice to see the large dir case is being addressed.


-- v --

[email protected]

2002-06-21 16:16:20

by Stephen C. Tweedie

Subject: Re: [Ext2-devel] Re: Shrinking ext3 directories

Hi,

On Fri, Jun 21, 2002 at 06:38:01PM +0300, Ville Herva wrote:

> So I thought, but not in ext2?

It's in ext2 --- it's the i_dir_start_lookup field that remembers
where we were last.
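
The effect of remembering the last lookup position can be illustrated with a toy Python model. This is not the kernel's implementation: the directory is a list of blocks holding names, `start_lookup` plays the role of `i_dir_start_lookup`, and `blocks_scanned` counts how many blocks each lookup had to examine. The class and its parameters are hypothetical.

```python
class ToyDir:
    """Linear directory whose lookups can start at the last hit (toy model)."""

    def __init__(self, names, per_block=4):
        self.blocks = [set(names[i:i + per_block])
                       for i in range(0, len(names), per_block)]
        self.start_lookup = 0     # stand-in for i_dir_start_lookup
        self.blocks_scanned = 0

    def lookup(self, name, use_hint=True):
        n = len(self.blocks)
        start = self.start_lookup if use_hint else 0
        for step in range(n):
            b = (start + step) % n        # wrap around the directory
            self.blocks_scanned += 1
            if name in self.blocks[b]:
                self.start_lookup = b     # remember where we found it
                return b
        return None

    def unlink(self, name, use_hint=True):
        b = self.lookup(name, use_hint)
        if b is not None:
            self.blocks[b].discard(name)  # the block itself is never freed
        return b is not None
```

Deleting every name in creation order costs about one block scan per unlink with the hint, while without it each unlink rescans all the emptied blocks at the front of the directory.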

--Stephen

2002-06-21 18:44:08

by Ville Herva

Subject: Re: [Ext2-devel] Re: Shrinking ext3 directories

On Fri, Jun 21, 2002 at 05:15:50PM +0100, you [Stephen C. Tweedie] wrote:
> Hi,
>
> On Fri, Jun 21, 2002 at 06:38:01PM +0300, Ville Herva wrote:
>
> > So I thought, but not in ext2?
>
> It's in ext2 --- it's the i_dir_start_lookup field that remembers
> where we were last.

Great. (Unfortunately the box that experienced the named incident was a
2.2.20 + patches, so obviously the patch didn't help there :/ ).


-- v --

[email protected]