2014-02-02 16:20:07

by Bastien Traverse

Subject: mke2fs options for large media archive filesystem

Hi everybody,

I'm looking for advice on the mkfs.ext4 parameters to use in
the following use case:

I'm planning to move my media partition (holding
my Documents, Music, Pictures & Downloads folders) from ntfs to ext4 as
well as to a larger partition (380 GiB to 900 GiB) on a new HDD. I'd
like to optimise mkfs.ext4 for this specific use case, since the default
options lead to not-so-optimal values (e.g. quite some wasted space
because of an inappropriate inode_ratio value: accounts of this can be
found here[1] and there[2]).

I currently have between 93128 (df -i) and 111957 (ls -Ra | wc -l) used
inodes in this fs (btw does anybody know why I come up with such a
difference between the two methods? I don't know whether ntfs uses inodes
in a way that is compatible with unix utilities...), for a total size of
369 GiB. Average file size revolves around 4 MiB, and if I extrapolate
those numbers to the new 900G fs I must be able to fit at least between
227141 and 273066 inodes on it to accommodate the same usage
(900/369 = 2.44, so I should be able to put about 2.44 times what I
currently have in it if usage remains stable).
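
The extrapolation above can be reproduced in a few lines. This is only a sketch of the arithmetic: the figures are the ones quoted in this message, and the variable and function names are made up for illustration.

```python
# Scale the current inode usage to the size of the new partition.
# Figures come from the message; all names are illustrative.
CURRENT_SIZE_GIB = 369
NEW_SIZE_GIB = 900
USED_INODES_DF = 93128    # reported by `df -i`
USED_INODES_LS = 111957   # reported by `ls -Ra | wc -l`

scale = NEW_SIZE_GIB / CURRENT_SIZE_GIB


def project(inodes: int) -> int:
    """Inodes needed on the new filesystem if usage grows with its size."""
    return round(inodes * scale)


low, high = project(USED_INODES_DF), project(USED_INODES_LS)
print(f"scale factor: {scale:.2f}")
print(f"projected inodes: {low} to {high}")
```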

The questions I have concern a) inode number setting, b) bigalloc
feature and c) any other tuning I could do.

a) inode number
I ran simulations of mkfs.ext4 on an up-to-date Archlinux (x86_64) to
get the characteristics of the future fs (900 GiB LUKS encrypted logical
volume):
$ mkfs.ext4 -V
mke2fs 1.42.8 (20-Jun-2013)
Using EXT2FS Library version 1.42.8

The standard mkfs.ext4 command (with no reserved space) creates 59 million
inodes:
# mkfs.ext4 -n -m 0 /dev/data/data
mke2fs 1.42.8 (20-Jun-2013)
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
Stride=0 blocks, Stripe width=0 blocks
58982400 inodes, 235929600 blocks
0 blocks (0.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=4294967296
7200 block groups
32768 blocks per group, 32768 fragments per group
8192 inodes per group
Superblock backups stored on blocks:
32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208,
4096000, 7962624, 11239424, 20480000, 23887872, 71663616, 78675968,
102400000, 214990848

59 million is definitely far more inodes than I want or will ever need, so
I tried the largest mke2fs.conf preset pertaining to inode_ratio (-T
largefile4). I get 230400 inodes:
# mkfs.ext4 -n -m 0 -T largefile4 /dev/data/data
mke2fs 1.42.8 (20-Jun-2013)
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
Stride=0 blocks, Stripe width=0 blocks
230400 inodes, 235929600 blocks
0 blocks (0.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=4294967296
7200 block groups
32768 blocks per group, 32768 fragments per group
32 inodes per group
Superblock backups stored on blocks:
32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208,
4096000, 7962624, 11239424, 20480000, 23887872, 71663616, 78675968,
102400000, 214990848

Now this might be a bit short considering my projections, and
largefile is probably still too many. So I think I'd either manually
specify the number of inodes I want with the -N option, at around
400 000, or use the -i option to set the bytes-per-inode ratio to 2 MiB
(-i 2097152), placing it between largefile (1 MiB) and largefile4
(4 MiB). Any hints on which method should be preferred?
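
To compare the candidates, the inode count for a given bytes-per-inode ratio can be estimated by dividing the filesystem size by that ratio. The sketch below is a simplification, not mke2fs's actual algorithm (which also rounds per block group); the 16 KiB default ratio is inferred from the default output above, and the function names are hypothetical.

```python
# Rough estimate of the inode count for a bytes-per-inode ratio (-i),
# and of the ratio equivalent to a requested inode count (-N).
KIB, MIB = 1024, 1024 * 1024
FS_BYTES = 235929600 * 4096  # block count x block size from the dry runs


def inodes_for_ratio(bytes_per_inode: int, fs_bytes: int = FS_BYTES) -> int:
    """Approximate number of inodes created for a given -i value."""
    return fs_bytes // bytes_per_inode


def ratio_for_inodes(inodes: int, fs_bytes: int = FS_BYTES) -> int:
    """Approximate -i value that corresponds to asking for `inodes` with -N."""
    return fs_bytes // inodes


candidates = {
    "default (16 KiB, inferred)": 16 * KIB,
    "-T largefile (1 MiB)": 1 * MIB,
    "-i 2097152 (2 MiB)": 2 * MIB,
    "-T largefile4 (4 MiB)": 4 * MIB,
}
for name, ratio in candidates.items():
    print(f"{name:<28} {inodes_for_ratio(ratio):>10} inodes")

print(f"-N 400000 is roughly -i {ratio_for_inodes(400000)}")
```

With these numbers, -N 400000 and -i 2097152 land within about 15% of each other, so the choice is mostly a matter of which figure is easier to reason about.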

b) bigalloc feature
I discovered this option while going through man mkfs.ext4 and found
more info on it in an LWN article (https://lwn.net/Articles/469805/) as
well as the ext4 wiki (https://ext4.wiki.kernel.org/index.php/Bigalloc).
Although it seems to perfectly fit my use case, I'm a bit wary because
of the warnings displayed in man page and wiki about possible problems.
Moreover I saw that Arch e2fsprogs is currently out-of-date, with version
1.42.9 having been in [Testing] for a month already. The changelog indicates
that a large number of bugs concerning bigalloc have been fixed in this
release... So, can I safely use the bigalloc feature right now with
mkfs.ext4, setting the cluster size to 1 MiB?

A simulation run with those options gave me the following output:
# mkfs.ext4 -n -m 0 -i 2097152 -O bigalloc -C 1M /dev/data/data
mke2fs 1.42.8 (20-Jun-2013)

Warning: the bigalloc feature is still under development
See https://ext4.wiki.kernel.org/index.php/Bigalloc for more information

Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Cluster size=1048576 (log=10)
Stride=0 blocks, Stripe width=0 blocks
461216 inodes, 235929600 blocks
0 blocks (0.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=4294967296
29 block groups
8388608 blocks per group, 32768 clusters per group
15904 inodes per group
Superblock backups stored on blocks:
8388608, 25165824, 41943040, 58720256, 75497472, 209715200, 226492416

Which is far closer to what I want, but not at the expense of stability...
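
The group geometry in that output follows from the cluster size. The sketch below only re-derives the block-group count and the inode total from the values mke2fs printed; the per-group inode count is taken from the output rather than computed, since the exact rounding mke2fs applies is not shown in this thread.

```python
import math

# Values printed by the bigalloc dry run above.
BLOCK_SIZE = 4096
CLUSTER_SIZE = 1024 * 1024
TOTAL_BLOCKS = 235929600
CLUSTERS_PER_GROUP = 32768
INODES_PER_GROUP = 15904

# With bigalloc, each bit of the allocation bitmap covers one cluster.
blocks_per_cluster = CLUSTER_SIZE // BLOCK_SIZE
blocks_per_group = CLUSTERS_PER_GROUP * blocks_per_cluster
block_groups = math.ceil(TOTAL_BLOCKS / blocks_per_group)
total_inodes = block_groups * INODES_PER_GROUP

print(f"blocks per cluster: {blocks_per_cluster}")
print(f"blocks per group:   {blocks_per_group}")
print(f"block groups:       {block_groups}")
print(f"total inodes:       {total_inodes}")
```

So the same volume drops from 7200 block groups to 29 when each bitmap bit covers 1 MiB instead of 4 KiB.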

c) Other options
Are there any other options that could fit my use case? What do people
around here generally use for their media archives?

Regards,
- Bastien


[1] https://forums.gentoo.org/viewtopic-t-906642-start-0.html
[2] http://ubuntuforums.org/showthread.php?t=1758514


Subject: Re: mke2fs options for large media archive filesystem

Hi,

> b) bigalloc feature

As I understand it (someone please correct me if I'm wrong), given that ext4
uses extents, the bigalloc feature is really useful ONLY for shrinking the
block bitmaps, which is probably only useful for (maybe) reducing kernel
memory usage and for (maybe) reducing fragmentation. 'Maybe' means I
personally don't know whether the benefit would be large enough, or even
whether there would be any real benefit at all... I haven't seen any real
bigalloc tests on the web...

So, I personally don't think you'll really benefit from bigalloc on a 1TB
archive partition... But I would like it very much if someone could give an
authoritative verdict. :)

--
With best regards,
Vitaliy Filippov

2014-02-03 01:36:57

by Andreas Dilger

Subject: Re: mke2fs options for large media archive filesystem

I would say that bigalloc is probably not what you want for an archival system. It is relatively new and the space savings is minimal. It is mostly for filesystems that need to allocate and free large files rapidly.

Note that with bigalloc you will also be allocating directories at the bigalloc cluster size, which will be too large in most cases.

Since you already know the average file size, using -i {average size} is what you want. The -N option is just a different way of specifying the same thing, the underlying filesystem ends up being the same.

I would also recommend allocating about 2x the inodes you think you will need. If there are suddenly thumbnails of the photos or key photos for videos then you don't want to run out of inodes.
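
To put a number on the cost of that headroom, here is a rough sketch of the space taken by the inode tables at different inode counts. It assumes 256-byte on-disk inodes (a common ext4 default that is not stated in this thread) and a 900 GiB volume; the function names are hypothetical and this is not how mke2fs itself computes anything.

```python
# Cost of inode headroom on a 900 GiB ext4 volume (sketch only).
INODE_SIZE = 256            # assumption: bytes per on-disk inode
FS_BYTES = 900 * 1024**3    # 900 GiB, as in the thread
MIB = 1024**2


def inode_table_mib(inodes: int, inode_size: int = INODE_SIZE) -> float:
    """Space occupied by the inode tables, in MiB."""
    return inodes * inode_size / MIB


def ratio_for_headroom(expected_inodes: int, factor: int = 2) -> int:
    """Bytes-per-inode value (-i) giving `factor` times the expected inodes."""
    return FS_BYTES // (expected_inodes * factor)


for label, inodes in [
    ("-i 4 MiB (largefile4)", 230400),
    ("-i 2 MiB", 460800),
    ("-i 1 MiB (largefile)", 921600),
    ("default ratio", 58982400),
]:
    print(f"{label:<22} {inodes:>9} inodes {inode_table_mib(inodes):>10.2f} MiB")

# -i value that would double the higher projection of 273066 inodes.
print(ratio_for_headroom(273066))
```

Under these assumptions, going from 230400 to 460800 inodes costs about 56 MiB, whereas the default ratio spends roughly 14 GiB on inode tables.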

Cheers, Andreas

> On Feb 2, 2014, at 9:35, "Vitaliy Filippov" <[email protected]> wrote:
>
> Hi,
>
>> b) bigalloc feature
>
> As I understand (please correct me someone if I'm wrong), given that ext4 uses extents, bigalloc feature is really useful ONLY for minimizing block bitmaps. Which is probably only useful for (maybe) reducing kernel memory usage and for (maybe) reducing fragmentation. 'maybe' means I personally don't know if benefit will be large enough, and I even don't know if there would be any real benefit at all... I didn't see any real bigalloc tests in the web...
>
> So, I personally don't think you'll really benefit from bigalloc on a 1TB archive partition... But I would like it very much if someone can tell an authoritative verdict. :)
>
> --
> With best regards,
> Vitaliy Filippov

Subject: Re: mke2fs options for large media archive filesystem

> I would say that bigalloc is probably not what you want for an archival
> system. It is relatively new and the space savings is minimal. It is
> mostly for filesystems that need to allocate and free large files
> rapidly.

And does it introduce any memory usage benefit? Maybe not on a 1TB, but on
a 32TB partition? :)
I.e. does the kernel keep full block bitmap (or maybe a tree of blocks?)
in memory, or does it read parts of it when they're needed?

--
With best regards,
Vitaliy Filippov

2014-02-03 21:09:20

by Andreas Dilger

Subject: Re: mke2fs options for large media archive filesystem

On Feb 3, 2014, at 12:15 AM, Vitaliy Filippov <[email protected]> wrote:
>> I would say that bigalloc is probably not what you want for an archival system. It is relatively new and the space savings is minimal. It is mostly for filesystems that need to allocate and free large files rapidly.
>
> And does it introduce any memory usage benefit? Maybe not on a 1TB, but on a 32TB partition? :)
> I.e. does the kernel keep full block bitmap (or maybe a tree of blocks?) in memory, or does it read parts of it when they're needed?

The kernel dynamically loads bitmaps when needed.
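
For a sense of scale, the total size of the block bitmaps can be estimated as one bit per allocation unit. The sketch below ignores per-group padding and treats "32TB" as 32 TiB, both of which are simplifying assumptions; because bitmaps are loaded on demand, these figures are an upper bound and not the resident memory.

```python
# Upper bound on allocation-bitmap size: one bit per block or per cluster.
KIB, MIB, GIB, TIB = 1024, 1024**2, 1024**3, 1024**4


def bitmap_bytes(fs_bytes: int, alloc_unit: int) -> int:
    """Total bitmap size in bytes for a filesystem of `fs_bytes`."""
    return fs_bytes // alloc_unit // 8


cases = [
    ("900 GiB, 4 KiB blocks", 900 * GIB, 4 * KIB),
    ("900 GiB, 1 MiB clusters", 900 * GIB, 1 * MIB),
    ("32 TiB, 4 KiB blocks", 32 * TIB, 4 * KIB),
    ("32 TiB, 1 MiB clusters", 32 * TIB, 1 * MIB),
]
for label, size, unit in cases:
    print(f"{label:<26} {bitmap_bytes(size, unit) / MIB:>10.3f} MiB")
```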

Cheers, Andreas







2014-02-05 16:45:38

by Bastien Traverse

Subject: Re: mke2fs options for large media archive filesystem

On 02/02/2014 17:35, Vitaliy Filippov wrote:
> I personally don't think you'll really benefit from bigalloc on a
> 1TB archive partition.

On 03/02/2014 02:30, Andreas Dilger wrote:
> I would say that bigalloc is probably not what you want for an
> archival system.

Thanks for the enlightenment! I'll leave bigalloc aside for this usage,
but I'll definitely keep it around for another use case that I wanted to
submit to the list (in a later message).

> Since you already know the average file size, using -i {average size}
> is what you want. The -N option is just a different way of specifying
> the same thing, the underlying filesystem ends up being the same.
>
> I would also recommend to allocate about 2x the inodes you think you
> will need. If there are suddenly thumbnails of the photos or key
> photos for videos then you don't want to run out of inodes.

Thanks a lot, this is very helpful. It confirms my intent to go for a 2M
bytes-per-inode ratio so as to reach ~460k inodes (about twice as many as
I'll probably need).

Cheers, Bastien