2005-09-23 07:11:55

by Con Kolivas

[permalink] [raw]
Subject: [PATCH] vm - swap_prefetch-11

More optimisations, much much less locking costs, limits on when to prefetch
and bugfixes.

Thanks to all testers!

Incrementals are available
http://ck.kolivas.org/patches/swap-prefetch/

Cheers,
Con


Attachments:
(No filename) (202.00 B)
vm-swap_prefetch-11.patch (21.03 kB)
Download all attachments

2005-09-27 18:56:37

by Jonathan Corbet

[permalink] [raw]
Subject: Re: [PATCH] vm - swap_prefetch-11

Hi, Con,

> This patch implements swap prefetching when the vm is relatively idle and
> there is free ram available.

I'm having a look at it now (better late than never...), and a couple of
questions come to mind...

The more general of the two is: would it make sense to somehow merge
your swapped_entry data structure with Rik's page-remembering scheme for
CLOCK-PRO? Assumed that both are someday destined for inclusion, it
seems it would make sense to add just one "remember info about swapped
pages" data structure, rather than two.

Second question:

> +++ linux-2.6.13-sp/include/linux/fs.h 2005-09-23 16:57:03.000000000 +1000
> @@ -340,6 +340,8 @@ struct address_space {
> struct inode *host; /* owner: inode, block_device */
> struct radix_tree_root page_tree; /* radix tree of all pages */
> rwlock_t tree_lock; /* and rwlock protecting it */
> + struct radix_tree_root swap_tree; /* radix tree of swapped pages */
> + struct list_head swapped_pages; /* list of swapped pages */

It looks like you are adding these fields to every address_space
structure in the system - and there can be a fair number of those. But
further down, when it comes time to remember a swapped page:

> +void add_to_swapped_list(unsigned long index)
> +{
> + struct swapped_entry_t *entry;
> + struct address_space *mapping = &swapper_space;

You're only actually remembering pages associated with a single address
space.

Do you anticipate adding prefetching from other address spaces as well?
If not, it might be worth putting these structures somewhere else to
avoid bloating the address_space structure.

...or am I missing something again...?

Thanks,

jon

2005-09-27 23:07:44

by Con Kolivas

[permalink] [raw]
Subject: Re: [PATCH] vm - swap_prefetch-11

On Wed, 28 Sep 2005 04:56 am, Jonathan Corbet wrote:
> Hi, Con,
>
> > This patch implements swap prefetching when the vm is relatively idle and
> > there is free ram available.
>
> I'm having a look at it now (better late than never...), and a couple of
> questions come to mind...

Hey thanks for looking. It can be quite hard for people to find time to review
patches and I appreciate the effort.

> The more general of the two is: would it make sense to somehow merge
> your swapped_entry data structure with Rik's page-remembering scheme for
> CLOCK-PRO? Assumed that both are someday destined for inclusion, it
> seems it would make sense to add just one "remember info about swapped
> pages" data structure, rather than two.

Indeed it would. I was hoping the time frame for inclusion of swap prefetching
would be much shorter than clock pro given the relatively intrusive nature of
clock pro. It would make sense to update the accounting to that of clock pro
if/when clock pro was pushed. One of the features of the current accounting
is it is extremely cheap and slightly sloppy on purpose since there is no
point to being perfectly accurate and incur the extra overhead. Once that
overhead is considered worthwhile for clock pro algorithms we can just use
that data.

> Second question:
> It looks like you are adding these fields to every address_space
> structure in the system - and there can be a fair number of those. But
>
> further down, when it comes time to remember a swapped page:

> You're only actually remembering pages associated with a single address
> space.
>
> Do you anticipate adding prefetching from other address spaces as well?
> If not, it might be worth putting these structures somewhere else to
> avoid bloating the address_space structure.
>
> ...or am I missing something again...?

Nope you're definitely not missing something. As you're well aware code ends
up often being far from the original attempt. It's the legacy of the code
evolution and it was something I would hopefully have thought about myself
without being prompted by someone else. I'll have to get onto that with the
next version.

Just as a data point there is still a workload that is inappopriately causing
out-of-memory conditions when it shouldn't and only once I nail all known
issues from the -ck users will I push it to -mm.

Thanks again

Cheers,
Con