2005-11-13 05:13:05

by Jeff V. Merkey

Subject: Severe VFS Performance Issues 2.6 with > 95000 directory entries


The subject line speaks for itself. This is using standard VFS readdir
and lookup calls through the VFS with ftp. Very poor.
It appears dcache related, since longer file names take proportionately
longer based on the size of the name. My lookup routines use static
pinned tables in memory, and are very fast. VFS performance outside ftp is
reasonable, but there are still problems when the number of entries
in a directory gets above 50,000.

Jeff


2005-11-13 12:46:20

by Nikita Danilov

Subject: Re: Severe VFS Performance Issues 2.6 with > 95000 directory entries

Jeff V. Merkey writes:
>
> The subject line speaks for itself. This is using standard VFS readdir
> and lookup calls through the VFS with ftp. Very poor.

Reiser4 works fine with 100M entries in a directory, so VFS is not a
bottleneck here.

[...]

>
> Jeff

Nikita.

2005-11-13 21:27:29

by Jeff V. Merkey

Subject: Re: Severe VFS Performance Issues 2.6 with > 95000 directory entries

Nikita Danilov wrote:

>Jeff V. Merkey writes:
> >
> > The subject line speaks for itself. This is using standard VFS readdir
> > and lookup calls through the VFS with ftp. Very poor.
>
>Reiser4 works fine with 100M entries in a directory, so VFS is not a
>bottleneck here.
>
>

How about with ftp running on top? Try running FTP in a directory with
100M entries. See how long it takes to return the data to
the remote client for a dir listing.

Jeff

>[...]
>
> >
> > Jeff
>
>Nikita.
>
>
>


2005-11-13 21:34:44

by Nikita Danilov

Subject: Re: Severe VFS Performance Issues 2.6 with > 95000 directory entries

jmerkey writes:
> Nikita Danilov wrote:
>
> >Jeff V. Merkey writes:
> > >
> > > The subject line speaks for itself. This is using standard VFS readdir
> > > > and lookup calls through the VFS with ftp. Very poor.
> >
> >Reiser4 works fine with 100M entries in a directory, so VFS is not a
> >bottleneck here.
> >
> >
>
> how about with ftp running on top? Try running FTP in directory with
> 100M entries. See how long it takes to return the data to
> the remote client for a dir listing.

Why are you thinking that it is VFS that is causing performance
degradation here?

>
> Jeff
>
> >[...]
> >
> > >
> > > Jeff
> >

Nikita.

> >
> >
> >

2005-11-13 21:51:04

by Jeff V. Merkey

Subject: Re: Severe VFS Performance Issues 2.6 with > 95000 directory entries

Nikita Danilov wrote:

>jmerkey writes:
> > Nikita Danilov wrote:
> >
> > >Jeff V. Merkey writes:
> > > >
> > > > The subject line speaks for itself. This is using standard VFS readdir
> > > > > and lookup calls through the VFS with ftp. Very poor.
> > >
> > >Reiser4 works fine with 100M entries in a directory, so VFS is not a
> > >bottleneck here.
> > >
> > >
> >
> > how about with ftp running on top? Try running FTP in directory with
> > 100M entries. See how long it takes to return the data to
> > the remote client for a dir listing.
>
>Why are you thinking that it is VFS that is causing performance
>degradation here?
>
>

Because I see the same degradation local vs. remote. My paths for readdir
and lookup are short. I dynamically (via math) create
the file names on the fly, and lookup simply reads a static table in
memory for the inode number. One thing I can check is the calls
to igetblk inside of lookup.

Jeff

> >
> > Jeff
> >
> > >[...]
> > >
> > > >
> > > > Jeff
> > >
>
>Nikita.
>
> > >
> > >
> > >
>
>
>

2005-11-13 21:51:14

by Douglas McNaught

Subject: Re: Severe VFS Performance Issues 2.6 with > 95000 directory entries

jmerkey writes:

> Nikita Danilov wrote:
>
>>Jeff V. Merkey writes:
>> > The subject line speaks for itself. This is using standard VFS readdir
>> > and lookup calls through the VFS with ftp. Very poor.
>>
>>Reiser4 works fine with 100M entries in a directory, so VFS is not a
>>bottleneck here.
>>
>>
>
> how about with ftp running on top? Try running FTP in directory with
> 100M entries. See how long it takes to return the data to
> the remote client for a dir listing.

What filesystem are you using? If it's ext3 without dirindex turned
on, that would definitely explain it.

-Doug

2005-11-13 21:54:21

by Jeff V. Merkey

Subject: Re: Severe VFS Performance Issues 2.6 with > 95000 directory entries

Douglas McNaught wrote:

>jmerkey writes:
>
>
>
>>Nikita Danilov wrote:
>>
>>
>>
>>>Jeff V. Merkey writes:
>>>
>>>
>>>>The subject line speaks for itself. This is using standard VFS readdir
>>>>and lookup calls through the VFS with ftp. Very poor.
>>>
>>>Reiser4 works fine with 100M entries in a directory, so VFS is not a
>>>bottleneck here.
>>>
>>>
>>>
>>>
>>how about with ftp running on top? Try running FTP in directory with
>>100M entries. See how long it takes to return the data to
>>the remote client for a dir listing.
>>
>>
>
>What filesystem are you using? If it's ext3 without dirindex turned
>on, that would definitely explain it.
>
>-Doug
>
>
>
Thanks. I'll enable dirindex.

Jeff

2005-11-13 22:03:51

by Jeffrey V. Merkey

Subject: Re: Severe VFS Performance Issues 2.6 with > 95000 directory entries

Douglas McNaught wrote:

>jmerkey writes:
>
>
>
>>Nikita Danilov wrote:
>>
>>
>>
>>>Jeff V. Merkey writes:
>>>
>>>
>>>>The subject line speaks for itself. This is using standard VFS readdir
>>>>and lookup calls through the VFS with ftp. Very poor.
>>>
>>>Reiser4 works fine with 100M entries in a directory, so VFS is not a
>>>bottleneck here.
>>>
>>>
>>>
>>>
>>how about with ftp running on top? Try running FTP in directory with
>>100M entries. See how long it takes to return the data to
>>the remote client for a dir listing.
>>
>>
>
>What filesystem are you using? If it's ext3 without dirindex turned
>on, that would definitely explain it.
>
>
>
I just noticed the I_NEW flag for iget, which prevents multiple calls to
refresh the inode. There's another code section where I update the
filesize field after I call iget from lookup. This does not explain it
either, since I use math here to hash and post into the inode. I am still
convinced that, either in userspace or in the kernel VFS, there's a case
where readdir goes linear and starts to exhibit O(log2(N)) behavior as
the directory gets large (above 50,000 entries).

Jeff