2008-12-23 03:02:13

by Zhang Xiliang

Subject: Problems with the max value for create directory

Hi,

I tried to create 65537 long directories and failed when the block size is 1024.

# mkfs.ext4dev -b 1024 -I 256 /dev/hda3
# tune2fs -E test_fs -O extents /dev/hda3
# mount -t ext4dev /dev/hda3 /mnt
# ./create_long_dirs 65537 /mnt

The source of create_long_dirs.c:

#define _ATFILE_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>

#define NAME_LEN 255
#define NCHARS 62
#define MAX_LEN1 NCHARS
#define MAX_LEN2 (NCHARS * NCHARS)
#define MAX_LEN3 (NCHARS * NCHARS * NCHARS)

/* valid characters for the directory name */
char chars[NCHARS + 1] =
        "0123456789qwertyuiopasdfghjklzxcvbnmQWERTYUIOPASDFGHJKLZXCVBNM";

/* to store the generated directory name (zero-initialized, so NUL-terminated) */
char name[NAME_LEN + 1];
int names;
int parent_fd;

/* fill the whole name with random characters */
void init_name(void)
{
        int i;

        srand(time(NULL));

        for (i = 0; i < NAME_LEN; i++)
                name[i] = chars[rand() % NCHARS];
}

/* create one directory called name under parent_fd, exiting on failure */
void create_dir(void)
{
        if (mkdirat(parent_fd, name, S_IRWXU)) {
                perror("mkdirat");
                exit(1);
        }
}

/*
 * create_dirs - create @n directories with distinct names
 * @n: how many directories to create
 *
 * if n <= 62, we need to modify 1 char of the name
 * if n <= 62*62, we need to modify 2 chars
 * otherwise (n <= 62*62*62), we need to modify 3 chars
 */
void create_dirs(int n)
{
        int i, j, k;
        int depth;

        if (n <= MAX_LEN1)
                depth = 1;
        else if (n <= MAX_LEN2)
                depth = 2;
        else
                depth = 3;

        for (i = 0; i < NCHARS; i++) {
                name[0] = chars[i];
                if (depth == 1) {
                        create_dir();
                        if (--n == 0)
                                return;
                        continue;
                }

                for (j = 0; j < NCHARS; j++) {
                        name[1] = chars[j];
                        if (depth == 2) {
                                create_dir();
                                if (--n == 0)
                                        return;
                                continue;
                        }

                        for (k = 0; k < NCHARS; k++) {
                                name[2] = chars[k];
                                create_dir();
                                if (--n == 0)
                                        return;
                        }
                }
        }
}

void usage(void)
{
        fprintf(stderr, "Usage: create_long_dirs nr_dirs parent_dir\n");
}

int main(int argc, char *argv[])
{
        if (argc != 3) {
                usage();
                return 1;
        }

        names = atoi(argv[1]);
        if (names > MAX_LEN3 || names <= 0) {
                usage();
                return 1;
        }

        parent_fd = open(argv[2], O_RDONLY);
        if (parent_fd == -1) {
                perror("open parent dir failed");
                return 1;
        }

        init_name();

        create_dirs(names);

        return 0;
}


--
Regards
Zhang Xiliang





2008-12-23 04:02:37

by Toshiyuki Okajima

Subject: Re: Problems with the max value for create directory

Hi,

Zhang Xiliang wrote:
> Hi,
>
> I tried to create 65537 long directories and failed when the block size is 1024.
>
> # mkfs.ext4dev -b 1024 -I 256 /dev/hda3
> # tune2fs -E test_fs -O extents /dev/hda3
> # mount -t ext4dev /dev/hda3 /mnt
> # ./create_long_dirs 65537 /mnt
>
> The code of create_long_dirs.c:

An ext4 filesystem cannot hold more than 65000 links to a single file.
(For ext3 the limit is 32000.)
This test creates more than 65000 links to the /mnt directory.
(Each subdirectory's ".." entry is a link to /mnt, so creating 65000
subdirectories adds 65000 links to it.)

static int ext4_mkdir(struct inode *dir, struct dentry *dentry, int mode)
{
        handle_t *handle;
        struct inode *inode;
        struct buffer_head *dir_block;
        struct ext4_dir_entry_2 *de;
        int err, retries = 0;

        if (EXT4_DIR_LINK_MAX(dir))
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^
                return -EMLINK;

This limit is part of ext4's specification.
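
The link accounting described above can be observed from userspace. The following is a minimal sketch, not part of the original message: the function name nlink_delta_after_mkdirs and the scratch path under /tmp are made up for illustration, and it assumes /tmp is writable. On filesystems that count each subdirectory in the parent's link count the result should equal n; some filesystems do not track directory link counts this way and may report 0.

```c
#define _XOPEN_SOURCE 700
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Create a scratch directory under /tmp (hypothetical location), make
 * n subdirectories in it, and return how much the parent's st_nlink grew.
 * Returns -1 on any error. The scratch tree is removed before returning.
 */
long nlink_delta_after_mkdirs(int n)
{
    char parent[] = "/tmp/nlink_demo_XXXXXX";
    char child[64];
    struct stat before, after;
    long delta = -1;
    int i, made = 0;

    if (mkdtemp(parent) == NULL)
        return -1;
    if (stat(parent, &before) != 0)
        goto out;

    for (i = 0; i < n; i++) {
        snprintf(child, sizeof(child), "%s/d%d", parent, i);
        if (mkdir(child, S_IRWXU) != 0)
            goto out;
        made++;
    }

    if (stat(parent, &after) == 0)
        delta = (long)after.st_nlink - (long)before.st_nlink;

out:
    /* remove whatever was created, children first */
    for (i = 0; i < made; i++) {
        snprintf(child, sizeof(child), "%s/d%d", parent, i);
        rmdir(child);
    }
    rmdir(parent);
    return delta;
}
```

Calling nlink_delta_after_mkdirs(3) and printing the result is enough to check how a given filesystem behaves.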

Regards,
Toshiyuki Okajima

2008-12-23 07:49:49

by Andreas Dilger

Subject: Re: Problems with the max value for create directory

On Dec 23, 2008 13:02 +0900, Toshiyuki Okajima wrote:
> Zhang Xiliang wrote:
>> Hi,
>>
>> I tried to create 65537 long directories and failed when the block size is 1024.
>>
>> # mkfs.ext4dev -b 1024 -I 256 /dev/hda3
>> # tune2fs -E test_fs -O extents /dev/hda3
>> # mount -t ext4dev /dev/hda3 /mnt
>> # ./create_long_dirs 65537 /mnt
>>
>> The code of create_long_dirs.c:
>
> ext4 filesystem cannot make over 65000 links toward a file.
> (ext3 filesystem cannot make over 32000 links toward a file.)
> This test makes over 65000 links toward /mnt-directory.
> (Creating 65000 sub-directories makes 65000 links toward /mnt-directory.)

Note that there is a specific reason why it was implemented this way:
- a directory with > 65000 subdirectories can be checked if empty even
if the link count is wrong (in fact link count was ignored even in ext3)
- a file needs to keep accurate link counts or it is impossible to know
when the file needs to be deleted.

We thought about adding a "i_links_count_hi" but it wasn't thought that
many (any) real applications would create so many hard links on the same
file.
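
The first point above, that a directory can be tested for emptiness without trusting its link count, has a simple userspace analogy. This is only a sketch under that analogy: the function name dir_is_empty is hypothetical, and the kernel performs its own walk over the directory blocks rather than calling anything like this.

```c
#include <dirent.h>
#include <string.h>

/*
 * Return 1 if the directory at path holds nothing but "." and "..",
 * 0 if it has at least one other entry, and -1 if it cannot be opened.
 * The answer comes from the entries themselves, not from st_nlink.
 */
int dir_is_empty(const char *path)
{
    DIR *d = opendir(path);
    struct dirent *de;
    int empty = 1;

    if (d == NULL)
        return -1;

    while ((de = readdir(d)) != NULL) {
        if (strcmp(de->d_name, ".") != 0 && strcmp(de->d_name, "..") != 0) {
            empty = 0;
            break;
        }
    }
    closedir(d);
    return empty;
}
```

A regular file has no such fallback: nothing on disk lists all the names that point at it, which is why its link count has to stay accurate.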


Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.


2008-12-23 20:12:58

by Theodore Ts'o

Subject: Re: Problems with the max value for create directory

On Tue, Dec 23, 2008 at 01:02:25PM +0900, Toshiyuki Okajima wrote:
>> I tried to create 65537 long directories and failed when the block size is 1024.
>>
>
> static int ext4_mkdir(struct inode *dir, struct dentry *dentry, int mode)
> {
> handle_t *handle;
> struct inode *inode;
> struct buffer_head *dir_block;
> struct ext4_dir_entry_2 *de;
> int err, retries = 0;
>
> if (EXT4_DIR_LINK_MAX(dir))
>      ^^^^^^^^^^^^^^^^^^^^^^^^^^^    
> return -EMLINK;
>
> This limit is ext4's specification.

The definition of EXT4_DIR_LINK_MAX is here:

#define is_dx(dir) (EXT4_HAS_COMPAT_FEATURE(dir->i_sb, \
                                            EXT4_FEATURE_COMPAT_DIR_INDEX) && \
                    (EXT4_I(dir)->i_flags & EXT4_INDEX_FL))
#define EXT4_DIR_LINK_MAX(dir) (!is_dx(dir) && (dir)->i_nlink >= EXT4_LINK_MAX)

So that's not it. The problem is that indexed directories have a limit
that only allows the trees to be two levels deep. The fanout is
normally big enough that this is effectively not a problem, but if you
use very long filenames and a 1k blocksize, you will run into this
limit much more quickly. So the problem is not the number of
subdirectories, but the maximum depth of the htree allowed in Daniel
Phillips' relatively restricted implementation. Note that with a 4k
block filesystem, the limits get expanded by a factor of 4 cubed, or
64. And most of the time users aren't using maximal-length names for
their directory entries, which further pushes the limit out in the
normal case.
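
A rough model of this limit can be written down as arithmetic. The sketch below is not from the original message and rests on assumed on-disk sizes: 8-byte index entries, a 32-byte header in the root index block, an 8-byte header in each second-level index block, and an 8-byte header per directory entry with names padded to a multiple of 4. The function names are made up for illustration. It gives an upper bound with every block completely full.

```c
#include <assert.h>

/* On-disk sizes assumed by this model (not taken from the thread). */
#define DX_ENTRY_SIZE   8   /* one index entry: 32-bit hash + 32-bit block */
#define DX_ROOT_HEADER 32   /* "." and ".." stubs plus root info */
#define DX_NODE_HEADER  8   /* empty dirent heading a second-level index block */
#define DIRENT_HEADER   8   /* inode, rec_len, name_len, file_type */

/* Bytes one leaf entry takes: header plus name, rounded up to 4. */
static unsigned long dirent_size(unsigned name_len)
{
    return (DIRENT_HEADER + name_len + 3UL) & ~3UL;
}

/* Upper bound on entries in a two-level htree with every block full. */
unsigned long htree_max_entries(unsigned block_size, unsigned name_len)
{
    unsigned long root_fanout, node_fanout, per_leaf;

    assert(block_size >= 1024 && name_len >= 1 && name_len <= 255);

    root_fanout = (block_size - DX_ROOT_HEADER) / DX_ENTRY_SIZE;
    node_fanout = (block_size - DX_NODE_HEADER) / DX_ENTRY_SIZE;
    per_leaf = block_size / dirent_size(name_len);
    return root_fanout * node_fanout * per_leaf;
}
```

Under these assumptions, htree_max_entries(1024, 255) is 124 * 127 * 3 = 47244, which is already below the 65537 directories the test tries to create, while htree_max_entries(4096, 255) is 508 * 511 * 15 = 3893820. Real directories likely hit the limit sooner, since leaf blocks are rarely completely full.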

In theory it would be possible to relax this restriction, using a more
advanced htree implementation and a feature flag to preserve backwards
compatibility with older kernels that only support the current maximum
depth. Andreas has a prototype kernel implementation which in theory
could be added to ext4. It hasn't been high on my priority list to
complete, but if someone else really finds this limit to be annoying,
it is a project they might try to complete.

Were you writing this test program because this is a realistic
situation for your application, or just to explore the limits of ext4?

Regards,

- Ted

2008-12-24 01:18:09

by Zhang Xiliang

Subject: Re: Problems with the max value for create directory

Theodore Tso wrote:

>
> So that's not it. The problem is that indexed directories have a limit
> that only allows the trees to be two levels deep. The fanout is
> normally big enough that this is effectively not a problem, but if you
> use very long filenames, and a 1k blocksize, you will run into this
> limit much more quickly. So the problem is not the number of sub
> directories, but the maximum depth of the htree allowed in Daniel
> Phillips' relatively restricted implementation. Note that with a 4k
> block filesystem, the limits get expanded by a factor of 4 cubed, or
> 64. And most of the time users aren't maximal length named directory
> entries, which further pushes the limit out in the normal case.
>
> It in theory would be possible to relax this restriction, using a more
> advanced htree implementation and a feature flag to allow backwards
> compatibility with older kernels that only support the maximal depth.
> Andreas has a prototype kernel implementation which in theory could be
> added to ext4. It hasn't been high on my priority list to complete,
> but if someone else really finds this limit to be annoying, it is a
> project they might try to complete.
>
> Were you writing this test program because this is a realistic
> situation for your application, or just to explore the limits of ext4?
>

Thanks for the explanation.

I know about the ext4 subdirectory limit; the test program was originally
written to test it. But the test failed, and I found the htree limit instead.

I think this may be annoying. People may be confused by having two limits.
The htree limit should be greater than the ext4 subdirectory limit.

--
Regards
Zhang Xiliang

2008-12-24 23:54:32

by Theodore Ts'o

Subject: Re: Problems with the max value for create directory

On Wed, Dec 24, 2008 at 09:18:53AM +0800, Zhang Xiliang wrote:
>
> Thanks for explanation.
>
> I see the limit of ext4 subdirectory. The test program originally tests it.
> But I fail and find the limit of the htree.

But if that was your goal, why did you use the largest possible
filename length for the subdirectories? And why did you use a 1k
block filesystem? Both of these are totally unnecessary if the goal
is to test the limit of the number of ext4 subdirectories, and are
*ideal* if you are trying to detect the htree limit.

> I think it may be annoying. Somebody may be puzzled for the two limits.
> The limit of the htree should be greater than the limit of ext4 subdirectory.

Well, are you annoyed enough to help to try to solve the problem? In
actual practice very few people have run across the htree limit as far
as I know.

- Ted

2008-12-29 13:47:16

by Peng Tao

Subject: Re: Problems with the max value for create directory

Hi, guys

I have a question about this.
Since the htree is a potential limitation for subdirectories, is there a
reason why EXT4_LINK_MAX is applied when the fs doesn't have dir_index but
ignored when the fs has dir_index (per the following code)?

#define is_dx(dir) (EXT4_HAS_COMPAT_FEATURE(dir->i_sb, \
                                            EXT4_FEATURE_COMPAT_DIR_INDEX) && \
                    (EXT4_I(dir)->i_flags & EXT4_INDEX_FL))
#define EXT4_DIR_LINK_MAX(dir) (!is_dx(dir) && (dir)->i_nlink >= EXT4_LINK_MAX)





--
Cheers,

Bergwolf

................
Rodney Dangerfield - "The way my luck is running, if I was a
politician I would be honest."

2009-01-06 03:16:32

by Andreas Dilger

Subject: Re: Problems with the max value for create directory

On Dec 29, 2008 21:47 +0800, Peng tao wrote:
> I got a question about this.
> Since htree is a potential limitation for subdirectories. Is there a
> reason why EXT4_LINK_MAX is applied when fs hasn't dir_index but
> ignored when fs has dir_index(by the following code)?
>
> #define is_dx(dir) (EXT4_HAS_COMPAT_FEATURE(dir->i_sb, \
> EXT4_FEATURE_COMPAT_DIR_INDEX) && \
> (EXT4_I(dir)->i_flags & EXT4_INDEX_FL))
> #define EXT4_DIR_LINK_MAX(dir) (!is_dx(dir) && (dir)->i_nlink >= EXT4_LINK_MAX)

Because without the dir_index feature the largest usable directory size
is only about 10k-20k files before the O(n^2) insertion performance is
so bad that the directory is useless.
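
The behaviour of the quoted macro can be modelled in a few lines of userspace C. This is a sketch for illustration only: dir_link_max and LINK_MAX are hypothetical stand-ins for EXT4_DIR_LINK_MAX and EXT4_LINK_MAX, with 65000 taken from the limit quoted earlier in the thread.

```c
#include <stdbool.h>

#define LINK_MAX 65000  /* the ext4 limit quoted earlier in the thread */

/*
 * Userspace model of the EXT4_DIR_LINK_MAX(dir) test: mkdir is refused
 * with EMLINK only when the directory is NOT indexed and its link count
 * has already reached the limit.
 */
bool dir_link_max(bool is_indexed, unsigned nlink)
{
    return !is_indexed && nlink >= LINK_MAX;
}
```

So dir_link_max(false, 65000) is true, while any call with is_indexed set to true returns false no matter how large nlink is, which is why the link limit never triggers for an indexed directory.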


Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.