2000-11-30 04:48:17

by Steven Van Acker

Subject: ext2 directory size bug (?)

If this is a known thing, please don't kill me...
Hmm, gonna try to follow the REPORTING-BUGS file here...

[1.] One line summary of the problem:

directory size increases when adding 0-size files,
but doesn't decrease when removing them.

[2.] Full description of the problem/report:

When creating lots of 0-size files, the size of the directory itself (.)
increases, but when you delete those files, the directory size
doesn't decrease. The disk space can only be freed when the
directory in question is removed. I know it's a stupid
bug/feature to report, but imagine someone creating lots of
files in /tmp. On some systems I know, /tmp is a small partition,
which would get filled up by those directory blocks, and the space
can only be reclaimed by removing /tmp.

[3.] Keywords (i.e., modules, networking, kernel):

ext2 directory dir size

[4.] Kernel version (from /proc/version):

Linux version 2.2.17 (root@warp) (gcc version 2.95.2 20000220
(Debian GNU/Linux)) #1 Thu Nov 30 06:16:39 CET 2000

[5.] Output of Oops.. message (if applicable) with symbolic information
resolved (see Documentation/oops-tracing.txt)

No Oops message

[6.] A small shell script or example program which triggers the
problem (if possible)

#!/bin/bash

cd /tmp;
ls -ld .;
for i in `seq 1 3000`; do touch AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA$i; done
ls -ld .;
rm AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA* ;
ls -ld .;

[7.] Environment

[7.1.] Software (add the output of the ver_linux script here)

Linux warp 2.2.17 #1 Thu Nov 30 06:16:39 CET 2000 i686 unknown
Kernel modules 2.3.19
Gnu C 2.95.2
Binutils 2.10.91
Linux C Library 2.1.95
Dynamic linker ldd (GNU libc) 2.1.95
Procps 2.0.6
Mount 2.10o
Net-tools 2.05
Console-tools 0.2.3
Sh-utils 2.0i
Modules Loaded au8820

[7.2.] Processor information (from /proc/cpuinfo):

processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 6
model name : Celeron (Mendocino)
stepping : 5
cpu MHz : 434.325
cache size : 128 KB
fdiv_bug : no
hlt_bug : no
sep_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 2
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 sep mtrr pge mca cmov pat pse36 mmx fxsr
bogomips : 865.08

[7.3.] Module information (from /proc/modules):

au8820 111120 3

[7.4.] SCSI information (from /proc/scsi/scsi)

none

[7.5.] Other information that might be relevant to the problem
(please look in /proc and include all information that you
think to be relevant):

my /tmp is on an ext2 partition

[X.] Other notes, patches, fixes, workarounds:

I wish, but I have 0 kernel programming experience

--
"An ounce of prevention is worth a pound of purge."


2000-11-30 07:51:33

by Andreas Dilger

Subject: Re: ext2 directory size bug (?)

You write:
> Hmm, gonna try to follow the REPORTING-BUGS file here...
>
> [1.] One line summary of the problem:
>
> directory size increases when adding 0-size files,
> but doesn't decrease when removing them.

It may or may not be considered a bug, but in any case it has been like
this for a long time and I doubt it will change. The directory size is
not dependent upon the file size, only the length of the file names.
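
As a rough illustration of that point, here is a minimal sketch that
estimates how much directory space a batch of names needs. It assumes the
classic ext2 on-disk entry layout (an 8-byte header followed by the name,
padded to a multiple of 4 bytes). The function names and the name length
of 68 are made up for the example, and the estimate ignores "." and ".."
as well as the rule that an entry never straddles a block boundary.

```shell
# Bytes one directory entry occupies for a name of the given length:
# 8-byte header plus the name, rounded up to a multiple of 4.
dir_rec_len() {
    echo $(( ($1 + 8 + 3) / 4 * 4 ))
}

# Rough lower bound on directory bytes for COUNT names of length NAMELEN.
# Usage: dir_bytes_estimate COUNT NAMELEN
dir_bytes_estimate() {
    echo $(( $1 * $(dir_rec_len "$2") ))
}

dir_rec_len 68               # prints 76
dir_bytes_estimate 3000 68   # prints 228000
```

The file sizes never enter the calculation, which is why 3000 empty files
with long names still grow the directory by a couple of hundred kilobytes.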

One "reason" why ext2 directories don't shrink when the files are deleted
is that e2fsck relies on this behaviour for the lost+found directory,
so that you don't need to allocate blocks for lost+found on a corrupted
filesystem when doing recovery of unlinked files.

In some cases, you may have a directory entry in the last block, so you
can't free any of the earlier blocks even if they are empty.
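
To make the "entry in the last block" case concrete, here is a small
sketch, not taken from any kernel code. It models a directory as a list
of per-block live-entry counts (a made-up representation) and reports how
many blocks a plain truncate could release, which is only the run of
empty blocks at the very end.

```shell
# Arguments: number of live entries in each directory block, first block
# first. Prints how many trailing blocks are empty and could be cut off.
# The body runs in a subshell so its variables do not leak.
truncatable_blocks() (
    free=0
    for count in "$@"; do
        if [ "$count" -eq 0 ]; then
            free=$((free + 1))
        else
            free=0    # a live entry ends any run of empty blocks so far
        fi
    done
    echo "$free"
)

truncatable_blocks 3 0 0 0 1   # prints 0: last block is still in use
truncatable_blocks 3 1 0 0 0   # prints 3: three empty blocks at the end
```

In the first call three blocks are empty, yet none can be freed by
truncation alone; that would take hole punching or repacking.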

In most cases, if you have created many files in one directory in the past,
you are likely to create many there again - so easier just to keep the
directory blocks until next time.

In most cases, the number of blocks allocated to a directory (but never
to be used again) is very small, and people don't really worry about it.
I think the scenario where you have a large amount of space in directories
that will never be used again is very unusual and should not be a reason
to make the code more complex.

Cheers, Andreas
--
Andreas Dilger \ "If a man ate a pound of pasta and a pound of antipasto,
\ would they cancel out, leaving him still hungry?"
http://www-mddsp.enel.ucalgary.ca/People/adilger/ -- Dogbert

2000-11-30 13:55:15

by Richard B. Johnson

Subject: Re: ext2 directory size bug (?)

On Thu, 30 Nov 2000, Steven Van Acker wrote:

> If this is a known thing, please don't kill me...
> Hmm, gonna try to follow the REPORTING-BUGS file here...

It is not a bug. Directories increase in size as allocation
units are added to hold directory entries. Once allocated, they
are not deallocated until the directory is removed.

This is actually a feature. The directory does not get truncated.

Cheers,
Dick Johnson

Penguin : Linux version 2.4.0 on an i686 machine (799.54 BogoMips).

"Memory is like gasoline. You use it up when you are running. Of
course you get it all back when you reboot..."; Actual explanation
obtained from the Micro$oft help desk.


2000-12-02 05:27:53

by Chris Wedgwood

Subject: Re: ext2 directory size bug (?)

On Thu, Nov 30, 2000 at 08:24:02AM -0500, Richard B. Johnson wrote:

> This is actually a feature. The directory does not get truncated.

Arguably directories could be truncated when objects towards the end
are removed; I believe UFS under Solaris might do this?

An even better heuristic I like would allow repacking of a directory
and truncation if you could safely halve the size -- but I suspect
locking issues might be hideous here.


--cw

2000-12-02 05:45:20

by Alexander Viro

Subject: Re: ext2 directory size bug (?)



On Sat, 2 Dec 2000, Chris Wedgwood wrote:

> On Thu, Nov 30, 2000 at 08:24:02AM -0500, Richard B. Johnson wrote:
>
> > This is actually a feature. The directory does not get truncated.
>
> Arguably directories could be truncated when objects towards the end
> are removed; I believe UFS under Solaris might do this?
>
> An even better heuristic I like would allow repacking of a directory
> and truncation if you could safely halve the size -- but I suspect
> locking issues might be hideous here.

Not really. Anything that modifies directories holds both ->i_sem and
->i_zombie, lookups hold ->i_sem and emptiness checks (i.e. victim in
rmdir and overwriting rename) hold ->i_zombie, readdir holds both.

So we could even play with punching holes in them - very easy to do when
you do ext2_delete_entry(). I've done that on directories-in-pagecache
system, but decided that I don't want to deal with (bogus) warnings from
earlier kernels (they would do the right thing, but they would complain
loudly). Truncating is a piece of cake. Repacking is not a good idea,
though, since you are risking massive corruption in case of dirty shutdown
at the wrong moment.

2000-12-02 06:01:21

by Chris Wedgwood

Subject: Re: ext2 directory size bug (?)

On Sat, Dec 02, 2000 at 12:14:34AM -0500, Alexander Viro wrote:

> Not really. Anything that modifies directories holds both ->i_sem and
> ->i_zombie, lookups hold ->i_sem and emptiness checks (i.e. victim in
> rmdir and overwriting rename) hold ->i_zombie, readdir holds both.

What performance issues does this raise in the case of a directory
with _many_ files in it -- when we are renaming often involving that
directory?

I ask this because certain MTAs do just that; and when you have
10,000 to 100,000 messages queued I imagine you might spend much of
your time waiting for ->i_sem locks?

> Truncating is a piece of cake. Repacking is not a good idea,
> though, since you are risking massive corruption in case of dirty
> shutdown at the wrong moment.

ext2 directories seem somewhat susceptible to corruption on badly
timed shutdowns anyhow; and I don't think there is any way to do
atomic writes to them with most disk hardware, is there?


--cw

2000-12-02 06:13:37

by Alexander Viro

Subject: Re: ext2 directory size bug (?)



On Sat, 2 Dec 2000, Chris Wedgwood wrote:

> On Sat, Dec 02, 2000 at 12:14:34AM -0500, Alexander Viro wrote:
>
> > Not really. Anything that modifies directories holds both ->i_sem and
> > ->i_zombie, lookups hold ->i_sem and emptiness checks (i.e. victim in
> > rmdir and overwriting rename) hold ->i_zombie, readdir holds both.
>
> what performance issues does this raise in the case of a directory
> with _many_ files in it -- when we are renaming often involving that
> directory?
>
> I ask this because certain MTAs do just that; and when you have
> 10,000 to 100,000 messages queued I imagine you might spend much of
> your time waiting for ->i_sem locks?

And where do you get contending processes? 'Cause it takes at least two
to get that...

When you have that size of message queues your best bet is to split them into
many directories, though - all FFS derivatives do linear searches, so locking
or not, you are going to lose.

> > Truncating is a piece of cake. Repacking is not a good idea,
> > though, since you are risking massive corruption in case of dirty
> > shutdown at the wrong moment.
>
> ext2 directories seem somewhat susceptible to corruption on badly
> timed shutdowns anyhow; and I don't think there is any way to do
> atomic writes to them with most disk hardware, is there?

It has to do with the lack of write ordering. Which can be fixed, but not
if you do many updates in one operation. However, even without the ordering
repacking increases the window of corruption. Big way.

2000-12-02 13:11:50

by kaih

Subject: Re: ext2 directory size bug (?)

Alexander Viro wrote on 02.12.00:

> On Sat, 2 Dec 2000, Chris Wedgwood wrote:
>
> > On Sat, Dec 02, 2000 at 12:14:34AM -0500, Alexander Viro wrote:
> >
> > > Not really. Anything that modifies directories holds both ->i_sem and
> > > ->i_zombie, lookups hold ->i_sem and emptiness checks (i.e. victim in
> > > rmdir and overwriting rename) hold ->i_zombie, readdir holds both.
> >
> > what performance issues does this raise in the case of a directory
> > with _many_ files in it -- when we are renaming often involving that
> > directory?
> >
> > I ask this because certain MTAs do just that; and when you have
> > 10,000 to 100,000 messages queued I imagine you might spend much of
> > your time waiting for ->i_sem locks?
>
> And where do you get contending processes? 'Cause it takes at least two
> to get that...

More than one queue worker running, for example. On systems with that much
mail, that's just about essential to have.

But I suspect scanning the directory is much worse than renaming. Scanning
long ext2 (or traditional Unix, for that matter) directories gets *really*
ugly. That's why Exim, for example, has the "split spool directory" code
(it works much like the traditional terminfo split).
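
For readers who have not seen the terminfo layout: entries live in a
subdirectory named after one character of their name, so no single
directory grows very long. The sketch below shows the idea only. The
function name, the spool root and the message id are hypothetical, and it
is not Exim's actual scheme; a real MTA would pick a character that
varies well between messages rather than blindly taking the first one.

```shell
# Hypothetical layout: <spool root>/<first character of id>/<id>
# Usage: spool_path ROOT ID
spool_path() (
    root=$1
    id=$2
    first=$(printf '%s\n' "$id" | cut -c1)
    echo "$root/$first/$id"
)

spool_path /var/spool/demo 14Abc-0001   # prints /var/spool/demo/1/14Abc-0001
```

With N subdirectories and evenly spread ids, each linear directory scan
touches roughly 1/N of the entries it would in a single flat spool.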

> When you have that size of message queues your best bet is to split them
> into many directories, though - all FFS derivatives do linear searches, so
> locking or not, you are going to lose.

Exactly.

> > ext2 directories seem somewhat susceptible to corruption on badly
> > timed shutdowns anyhow; and I don't think there is any way to do
> > atomic writes to them with most disk hardware, is there?

I don't think I've seen that. Possibly if you're doing massive directory
creation just at that moment (unpacking a kernel source tarball, say), but
in that case I'd call it expected on a non-journalling fs.

If anything, I've seen chopped-up regular files (usually stuff like spool
files the MTA was just messing around with).

MfG Kai