2004-06-22 17:42:53

by Michael Kerrisk

Subject: Strange NOTAIL inheritance behaviour in Reiserfs 3.6

Gidday,

Problem summary:
On a Reiserfs 3.6 file system, I create a directory with the NOTAIL
attribute set and create 10000 1-byte files in that directory. lsattr(1)
shows that the NOTAIL attribute is set on (i.e., inherited by) all of the
files. However, the disk space consumption remains small (certainly not
10000 blocks used). Only when I explicitly set the NOTAIL attribute on all
the files does disk consumption rise to what I would expect. In other
words, the files are inheriting the NOTAIL attribute from their parent
directory, but this inheritance has no effect.

Looking at the 2.6.6 (vanilla) kernel sources, AFAICS the code matches my
observations (unpacking is only performed on an explicit ioctl() call).

The question is why are things done like this? It certainly seems to be
misleading, possibly buggy and undesirable behaviour.

This behaviour observed on Reiserfs 3.6.13 (SUSE's 2.6.4 kernel on SUSE
9.1).


Detailed example follows:

Create a file system, with a directory marked NOTAIL:

# mkreiserfs -b 4096 /dev/hda12
mkreiserfs 3.6.13 (2003 http://www.namesys.com)
[...]
Guessing about desired format.. Kernel 2.6.4-52-default is running.
Format 3.6 with standard journal
Count of blocks on the device: 158624
Number of blocks consumed by mkreiserfs formatting process: 8216
Blocksize: 4096
Hash function used to sort names: "r5"
Journal Size 8193 blocks (first block 18)
Journal Max transaction length 1024
inode generation number: 0
UUID: 89f14047-2daf-4707-bce3-bbf9128ace2e
ATTENTION: YOU SHOULD REBOOT AFTER FDISK!
ALL DATA WILL BE LOST ON '/dev/hda12'!
Continue (y/n):y
Initializing journal - 0%....20%....40%....60%....80%....100%
Syncing..ok
ReiserFS is successfully created on /dev/hda12.

# mount -t reiserfs /dev/hda12 /testfs
# mkdir /testfs/t
# chattr +t /testfs/t
# df /dev/hda12
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/hda12 634472 32840 601632 6% /testfs

The 'write_blocks' program creates 10000 files, each 1 byte long:

# time ./write_blocks -s 1 -n 1 -m 10000 /testfs/t/x
real 0m1.142s
user 0m0.056s
sys 0m1.075s
# df /dev/hda12
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/hda12 634472 34080 600392 6% /testfs

Above, we see a change in disc consumption of 1240 1-k blocks -- i.e., those
10000 files are consuming way less than 10000 * 4096 bytes.
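
The source of write_blocks is not included in this message. A rough shell
stand-in is sketched below. It assumes the six-digit numeric suffix seen in
the file listing further down and uses zero bytes as file content; the
function name make_small_files is hypothetical, and the sketch does not try
to reproduce the -s, -n and -m options.

```shell
# Hypothetical stand-in for write_blocks: create COUNT files of SIZE bytes
# each, named PREFIX000000, PREFIX000001, ...
make_small_files() {
    prefix=$1
    count=$2
    size=$3
    i=0
    while [ "$i" -lt "$count" ]; do
        name=$(printf '%s%06d' "$prefix" "$i")
        # Fill the file with SIZE zero bytes (the content is an assumption).
        head -c "$size" /dev/zero > "$name"
        i=$((i + 1))
    done
}

# Example, matching the run above:
#   make_small_files /testfs/t/x 10000 1
```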

# cd /testfs/t

Show that there really are 10000 files, that they are 1 byte long, and that
the NOTAIL attribute is set on them:

# ls | wc
10002 10002 80005
# ls -l | head -8
total 40234
drwxr-xr-x 2 root root 240048 2004-06-22 17:59 .
drwxr-xr-x 5 root root 104 2004-06-22 17:59 ..
-rw-r--r-- 1 root root 1 2004-06-22 17:59 x000000
-rw-r--r-- 1 root root 1 2004-06-22 17:59 x000001
-rw-r--r-- 1 root root 1 2004-06-22 17:59 x000002
-rw-r--r-- 1 root root 1 2004-06-22 17:59 x000003
-rw-r--r-- 1 root root 1 2004-06-22 17:59 x000004
# lsattr | head -5
-----------t- ./x000000
-----------t- ./x000001
-----------t- ./x000002
-----------t- ./x000003
-----------t- ./x000004
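
Rather than sampling the first few lines, the whole directory can be checked
by counting the entries whose flag field contains 't'. This is only a sketch:
the helper name count_notail is made up here, and it assumes lsattr prints
the flag string as the first whitespace-separated field, as in the output
above.

```shell
# Count lsattr-style lines whose flag field (first column) contains 't'.
# Reads from standard input; prints 0 if there are no matching lines.
count_notail() {
    awk '$1 ~ /t/ { n++ } END { print n + 0 }'
}

# Example:
#   lsattr | count_notail
```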

Now explicitly setting the NOTAIL attribute on all of the files causes the
expected disk consumption:

# time chattr +t *

real 0m0.836s
user 0m0.117s
sys 0m0.711s
# df /dev/hda12
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/hda12 634472 74080 560392 12% /testfs

74080-34080 ==> 40000 1-k blocks (i.e., 10000 * 4096 bytes).
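
For reference, the arithmetic behind the two df comparisons can be worked
through as follows. The figures are copied from the df output in this
message; the variable names are only illustrative.

```shell
# "Used" figures from df, in 1K blocks
initial=32840         # empty file system with the NOTAIL directory
after_create=34080    # after creating the 10000 1-byte files
after_chattr=74080    # after the explicit "chattr +t *"

nfiles=10000
blocksize=4096

# Space taken by creating the files (tails apparently still packed)
create_kb=$((after_create - initial))

# Additional space taken once the tails are unpacked
unpack_kb=$((after_chattr - after_create))

# What one full block per file would cost
full_kb=$((nfiles * blocksize / 1024))

echo "create: ${create_kb}K  unpack: ${unpack_kb}K  one block per file: ${full_kb}K"
```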

Best regards,

Michael Kerrisk


2004-06-22 21:33:16

by Hans Reiser

Subject: Re: Strange NOTAIL inheritance behaviour in Reiserfs 3.6

vs and chris, please comment.

Hans


2004-06-23 11:11:28

by Oleg Drokin

Subject: Re: Strange NOTAIL inheritance behaviour in Reiserfs 3.6

Hello!

Michael Kerrisk wrote:

MK> On a Reiserfs 3.6 file system, I create a directory with the NOTAIL
MK> attribute set and create 10000 1-byte files in that directory. lsattr(1)
MK> shows that the NOTAIL attribute is set on (i.e., inherited by) all of the
MK> files. However, the disk space consumption remains small (certainly not
MK> 10000 blocks used). Only when I explicitly set the NOTAIL attribute on all
MK> the files does disk consumption rise to what I would expect. In other
MK> words, the files are inheriting the NOTAIL attribute from their parent
MK> directory, but this inheritance has no effect.

I believe there is user error on your part. Extended inode attributes
are disabled by default on reiserfs.

MK> Detailed example follows:

MK> # mount -t reiserfs /dev/hda12 /testfs

Does it work as expected if you add "-o attrs" to the mount command?
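
A sketch of how the experiment might be repeated with the option enabled is
given below. The device and mount point are the ones from the original
transcript, "-o attrs" is the option named above, and the helper used_kb is
a hypothetical name for reading the "Used" column from df.

```shell
# Print the "Used" column (1K blocks) for the file system holding PATH.
# -P keeps each file system on one output line, -k forces 1K units.
used_kb() {
    df -P -k "$1" | awk 'NR == 2 { print $3 }'
}

# Remount with attribute processing enabled (needs root), then measure:
#   umount /testfs
#   mount -t reiserfs -o attrs /dev/hda12 /testfs
#   before=$(used_kb /testfs)
#   ... create the 10000 files in the NOTAIL directory ...
#   echo "delta: $(( $(used_kb /testfs) - before ))K"
```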

Bye,
Oleg

2004-06-24 15:59:20

by Oleg Drokin

Subject: Re: Strange NOTAIL inheritance behaviour in Reiserfs 3.6

Hello!

On Thu, Jun 24, 2004 at 04:27:44PM +0200, Michael Kerrisk wrote:
> >
> > MK> # mount -t reiserfs /dev/hda12 /testfs
> > Does it work as expected if you add "-o attrs" to the mount command?
> Yes! Thanks. However, it is a little unfortunate that if one fails
> to use this option, then:
> 1. "chattr +t" (and I suppose underlying ioctl()s) can still be used to
> set this attribute on a directory, without any error resulting.
> It would be better if an error is reported.

Well, the initial idea was to allow people to at least reset attributes
when operating with attribute processing disabled.

> 2. The attribute is then inherited by files created in that directory,
> but has no effect.

Yes, attribute inheritance is working. The only part that is disabled
by default is copying from fs-specific attribute storage to actual VFS inode
attributes.

> 3. A later explicit "chattr +t" on the files themselves DOES result in
> unpacking of the tails. Why?

There is a check in the attribute-setting code (and setting/clearing of
attributes is enabled) that tests whether the NOTAIL attribute is set and,
if so, calls tail unpacking. The next time you write to that file, the tail
will be packed back (if possible).

I agree that all of this is not very intuitive, though.

Bye,
Oleg

2004-06-24 16:12:20

by Michael Kerrisk

Subject: Re: Strange NOTAIL inheritance behaviour in Reiserfs 3.6

> On Thu, Jun 24, 2004 at 04:27:44PM +0200, Michael Kerrisk wrote:
> > >
> > > MK> # mount -t reiserfs /dev/hda12 /testfs
> > > Does it work as expected if you add "-o attrs" to the mount command?
> > Yes! Thanks. However, it is a little unfortunate that if one fails
> > to use this option, then:
> > 1. "chattr +t" (and I suppose underlying ioctl()s) can still be used to
> > set this attribute on a directory, without any error resulting.
> > It would be better if an error is reported.
>
> Well, initial idea was to allow people to at least reset attributes
> in case of operating with disabled attributes processing.

This seems to be a bad idea. Why should I be able to reset
attributes if the FS is mounted without "-o attrs"? That
seems to thwart the point of excluding "-o attrs". In any
case, how about at least an error when one tries to *set*
attributes when "-o attrs" was not specified?

> > 2. The attribute is then inherited by files created in that directory,
> > but has no effect.
>
> Yes, attribute inheritance is working. The only part that is disabled
> by default is copying from fs-specific attribute storage to actual VFS
> inode attributes.
>
> > 3. A later explicit "chattr +t" on the files themselves DOES result in
> > unpacking of the tails. Why?
>
> There is a check in attributes setting code (and attributes
> setting/cleaning is enabled), that tests if NOTAIL attribute is set,
> that calls tails unpacking if so. Next time you write to that file it
> will be packed back (if possible).

Strange ;-).

> I agree that all of this is not very intuitive, though.

Okay -- thanks Oleg.

Cheers,

Michael