I've just uploaded an updated version of Momchil Velikov's patch for a
scalable pagecache using radix trees. The patch can be found at:
ftp://ftp.kernel.org/pub/linux/kernel/people/hch/patches/v2.4/2.4.17/linux-2.4.17-ratpagecache.patch.gz
ftp://ftp.kernel.org/pub/linux/kernel/people/hch/patches/v2.4/2.4.17/linux-2.4.17-ratpagecache.patch.bz2
It contains a number of fixes and improvements by Momchil and me.
The basic advantage over the old version (besides the fixes :)) is that
the radix tree implementation is now independent of struct page /
struct address_space and thus can easily be used in other code.
=== Changelog ===
Momchil Velikov:
- It was possible to return a PG_locked page to the buddy
allocator, with a subsequent oops, if the call to rat_insert in
__add_to_page_cache failed. The function is therefore changed to
avoid modifying the page until rat_insert has
succeeded. Somewhat paranoid, I changed add_page_cache_locked
too.
- shmem_writepage could loop forever, effectively a deadlock, when
several processes were yielding for kswapd, _including kswapd
itself_.
- Initialized swapper_space. On some architectures the spinlock is
initialized to 0, on some to 1, and who knows, maybe there are/will be
others. I have no idea why this didn't break the tests on OSDL's
4- and 8-way boxes.
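[Editor's illustration of the first fix above, not code from the patch.
It is a minimal userspace sketch of the ordering only: try the index
insert first and touch the page only once it has succeeded. struct
fake_page, fake_insert(), add_to_cache() and the PG_LOCKED value are
all hypothetical stand-ins invented for this example.]

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define PG_LOCKED 0x1UL  /* hypothetical flag bit, for this sketch only */

/* Hypothetical stand-in for the few struct page fields that matter here. */
struct fake_page {
    unsigned long flags;
    void *mapping;
    unsigned long index;
};

/* Stand-in for rat_insert(): fails on request, otherwise succeeds. */
static int fake_insert(int should_fail)
{
    return should_fail ? -ENOMEM : 0;
}

/*
 * Insert first, modify afterwards. If the insert fails, the page is
 * returned to the caller exactly as it came in, so it can be freed
 * without a stale locked bit.
 */
int add_to_cache(struct fake_page *page, void *mapping,
                 unsigned long index, int should_fail)
{
    int err;

    assert(page != NULL);

    err = fake_insert(should_fail);
    if (err)
        return err;          /* page untouched on failure */

    page->flags |= PG_LOCKED;
    page->mapping = mapping;
    page->index = index;
    return 0;
}
```

[The buggy ordering would set the flag before calling the insert, so a
failed insert left a locked page on its way back to the allocator.]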
Me:
- moved rat.c from mm/ to lib/.
- new structure: rat_root containing root-node, height and gfp_mask.
- changed rat_* arguments to struct rat_root * and void *.
- changed struct page * arguments to void *.
- moved all declarations in rat.h that are not public to rat.c
- replaced page_cache_init() by ratcache_init() in rat.c.
- rat_node slab handling moved to rat.c
- in swap_state.c removed 0/NULL initializers that aren't needed.
- replaced __find_get_page/__find_lock_page with non-prefixed versions.
- added kdoc-style comments to rat.c.
- fixed up whitespace in function declarations to match Linux style.
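[Editor's illustration, not code from the patch. The changelog above
describes a rat_root holding a root node, a height and a gfp_mask, with
rat_* functions taking struct rat_root * and void *. A minimal userspace
sketch of what such a decoupled interface might look like follows. Only
the names rat_root and rat_insert come from the changelog; the node
layout, RAT_MAP_SHIFT, rat_lookup, the error codes and the use of
calloc() in place of a slab cache are assumptions made for this example.]

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

#define RAT_MAP_SHIFT 6                      /* assumed fan-out: 64 slots */
#define RAT_MAP_SIZE  (1UL << RAT_MAP_SHIFT)
#define RAT_MAP_MASK  (RAT_MAP_SIZE - 1)

struct rat_node {
    void *slots[RAT_MAP_SIZE];   /* child nodes, or items at the leaf level */
};

struct rat_root {
    struct rat_node *rnode;      /* root node, NULL while the tree is empty */
    unsigned int height;         /* number of levels, 0 while empty */
    int gfp_mask;                /* kept for shape only; unused in userspace */
};

/* Largest index a tree of the given height (>= 1) can hold. */
static unsigned long rat_maxindex(unsigned int height)
{
    unsigned int bits = height * RAT_MAP_SHIFT;

    if (bits >= sizeof(unsigned long) * 8)
        return ~0UL;
    return (1UL << bits) - 1;
}

/* Store item at index. Returns 0, -ENOMEM, or -EEXIST if already occupied. */
int rat_insert(struct rat_root *root, unsigned long index, void *item)
{
    struct rat_node *node;
    unsigned int shift;

    assert(root != NULL && item != NULL);

    /* Grow the tree upwards until index fits under the root. */
    while (root->height == 0 || index > rat_maxindex(root->height)) {
        node = calloc(1, sizeof(*node));
        if (!node)
            return -ENOMEM;
        node->slots[0] = root->rnode;
        root->rnode = node;
        root->height++;
    }

    /* Walk down, allocating missing interior nodes on the way. */
    node = root->rnode;
    shift = (root->height - 1) * RAT_MAP_SHIFT;
    while (shift > 0) {
        unsigned long i = (index >> shift) & RAT_MAP_MASK;

        if (!node->slots[i]) {
            node->slots[i] = calloc(1, sizeof(*node));
            if (!node->slots[i])
                return -ENOMEM;
        }
        node = node->slots[i];
        shift -= RAT_MAP_SHIFT;
    }

    if (node->slots[index & RAT_MAP_MASK])
        return -EEXIST;
    node->slots[index & RAT_MAP_MASK] = item;
    return 0;
}

/* Return the item stored at index, or NULL if there is none. */
void *rat_lookup(struct rat_root *root, unsigned long index)
{
    struct rat_node *node;
    unsigned int shift;

    if (root->height == 0 || index > rat_maxindex(root->height))
        return NULL;

    node = root->rnode;
    shift = (root->height - 1) * RAT_MAP_SHIFT;
    while (shift > 0) {
        node = node->slots[(index >> shift) & RAT_MAP_MASK];
        if (!node)
            return NULL;
        shift -= RAT_MAP_SHIFT;
    }
    return node->slots[index & RAT_MAP_MASK];
}
```

[Because the tree only ever sees void *, nothing in it depends on
struct page or struct address_space, which is the point of the
restructuring listed above. A caller would start from an empty root,
e.g. struct rat_root root = { NULL, 0, 0 };]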
Christoph Hellwig wrote:
>
> [please Cc [email protected] and lkml on reply]
>
> I've just uploaded an updated version of Momchil Velikov's patch for a
> scalable pagecache using radix trees. The patch can be found at:
>
> It contains a number of fixes and improvements by Momchil and me.
>
Can you sum up the advantages of this implementation?
I think it scales better on "big systems" where otherwise you end up with many
pages on the same hash?
Is it beneficial for small systems? (I think not)
>>>>> "Peter" == Peter Wächtler <[email protected]> writes:
Peter> Is it beneficial for small systems? (I think not)
Does it hurt performance on small systems? (I think not)
Christoph Hellwig wrote:
>> [please Cc [email protected] and lkml on reply]
>>
>> I've just uploaded an updated version of Momchil Velikov's patch for a
>> scalable pagecache using radix trees. The patch can be found at:
>>
>> It contains a number of fixes and improvements by Momchil and me.
On Mon, Jan 07, 2002 at 11:05:08AM +0100, Peter Wächtler wrote:
> Can you sum up the advantages of this implementation?
> I think it scales better on "big systems" where otherwise you end up
> with many pages on the same hash?
>
> Is it beneficial for small systems? (I think not)
I speculate this would be good for small systems as well as it reduces
the size of struct page by 2*sizeof(unsigned long) bytes, allowing more
incremental allocation of pagecache metadata. I haven't tried it on my
smaller systems yet (due to lack of disk space and needing to build the
cross-toolchains), though I'm now curious as to its exact behavior there.
Has anyone tried to do accounting on the radix tree metadata overhead yet?
Cheers,
Bill
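[Editor's illustration of the arithmetic behind the estimate above. The
2*sizeof(unsigned long) saving per struct page is Bill's figure; the
helper name, the 4 KiB page size and the 4-byte word used in the example
below are assumptions made for this sketch, and the result says nothing
about the radix tree's own node overhead.]

```c
#include <assert.h>
#include <stddef.h>

/*
 * Total bytes saved in the struct page array if every struct page
 * shrinks by two machine words.
 */
unsigned long page_struct_savings(unsigned long mem_bytes,
                                  unsigned long page_size,
                                  size_t word_size)
{
    unsigned long npages;

    assert(page_size != 0);

    npages = mem_bytes / page_size;
    return npages * 2 * word_size;
}
```

[For example, 8 MiB of memory with 4 KiB pages and 4-byte words gives
2048 pages, so page_struct_savings(8UL << 20, 4096, 4) comes to 16384
bytes.]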
On January 7, 2002 12:03 pm, William Lee Irwin III wrote:
> On Mon, Jan 07, 2002 at 11:05:08AM +0100, Peter Wächtler wrote:
> > Can you sum up the advantages of this implementation?
> > I think it scales better on "big systems" where otherwise you end up
> > with many pages on the same hash?
> >
> > Is it beneficial for small systems? (I think not)
>
> I speculate this would be good for small systems as well as it reduces
> the size of struct page by 2*sizeof(unsigned long) bytes, allowing more
> incremental allocation of pagecache metadata. I haven't tried it on my
> smaller systems yet (due to lack of disk space and needing to build the
> cross-toolchains), though I'm now curious as to its exact behavior there.
Benchmark it on UML. In my experience, performance on UML is quite predictive of
performance on native systems.
--
Daniel
On January 7, 2002 12:03 pm, William Lee Irwin III wrote:
> Christoph Hellwig wrote:
> >> [please Cc [email protected] and lkml on reply]
> >>
> >> I've just uploaded an updated version of Momchil Velikov's patch for a
> >> scalable pagecache using radix trees. The patch can be found at:
> >>
> >> It contains a number of fixes and improvements by Momchil and me.
Hi!
> I speculate this would be good for small systems as well as it reduces
> the size of struct page by 2*sizeof(unsigned long) bytes, allowing more
> incremental allocation of pagecache metadata. I haven't tried it on my
> smaller systems yet (due to lack of disk space and needing to build the
> cross-toolchains), though I'm now curious as to its exact behavior there.
Why not mem=8M, nosmp on your "big" system?
Pavel
--
Philips Velo 1: 1"x4"x8", 300gram, 60, 12MB, 40bogomips, linux, mutt,
details at http://atrey.karlin.mff.cuni.cz/~pavel/velo/index.html.