2007-05-14 00:54:47

by Con Kolivas

[permalink] [raw]
Subject: [PATCH] mm: swap prefetch more improvements

akpm, please queue on top of "mm: swap prefetch improvements"

---
Failed radix_tree_insert wasn't being handled leaving stale kmem.

The list should be iterated over in the reverse order when prefetching.

Make the yield within kprefetchd stronger through the use of cond_resched.

Check that the pos entry hasn't been removed while unlocked.

Signed-off-by: Con Kolivas <[email protected]>

---
mm/swap_prefetch.c | 19 ++++++++++---------
1 file changed, 10 insertions(+), 9 deletions(-)

Index: linux-2.6.21-ck1/mm/swap_prefetch.c
===================================================================
--- linux-2.6.21-ck1.orig/mm/swap_prefetch.c 2007-05-14 09:39:37.000000000 +1000
+++ linux-2.6.21-ck1/mm/swap_prefetch.c 2007-05-14 10:21:47.000000000 +1000
@@ -117,7 +117,8 @@ void add_to_swapped_list(struct page *pa
if (likely(!radix_tree_insert(&swapped.swap_tree, index, entry))) {
list_add(&entry->swapped_list, &swapped.list);
swapped.count++;
- }
+ } else
+ kmem_cache_free(swapped.cache, entry);

out_locked:
spin_unlock_irqrestore(&swapped.lock, flags);
@@ -427,7 +428,7 @@ out:
static enum trickle_return trickle_swap(void)
{
enum trickle_return ret = TRICKLE_DELAY;
- struct list_head *p, *next;
+ struct swapped_entry *pos, *n;
unsigned long flags;

if (!prefetch_enabled())
@@ -440,19 +441,19 @@ static enum trickle_return trickle_swap(
return TRICKLE_FAILED;

spin_lock_irqsave(&swapped.lock, flags);
- list_for_each_safe(p, next, &swapped.list) {
- struct swapped_entry *entry;
+ list_for_each_entry_safe_reverse(pos, n, &swapped.list, swapped_list) {
swp_entry_t swp_entry;
int node;

spin_unlock_irqrestore(&swapped.lock, flags);
- might_sleep();
- if (!prefetch_suitable())
+ /* Yield to anything else running */
+ if (cond_resched() || !prefetch_suitable())
goto out_unlocked;

spin_lock_irqsave(&swapped.lock, flags);
- entry = list_entry(p, struct swapped_entry, swapped_list);
- node = get_swap_entry_node(entry);
+ if (unlikely(!pos))
+ continue;
+ node = get_swap_entry_node(pos);
if (!node_isset(node, sp_stat.prefetch_nodes)) {
/*
* We found an entry that belongs to a node that is
@@ -460,7 +461,7 @@ static enum trickle_return trickle_swap(
*/
continue;
}
- swp_entry = entry->swp_entry;
+ swp_entry = pos->swp_entry;
spin_unlock_irqrestore(&swapped.lock, flags);

if (trickle_swap_cache_async(swp_entry, node) == TRICKLE_DELAY)

--
-ck


2007-05-14 22:01:03

by Andrew Morton

[permalink] [raw]
Subject: Re: [PATCH] mm: swap prefetch more improvements

On Mon, 14 May 2007 10:50:54 +1000
Con Kolivas <[email protected]> wrote:

> akpm, please queue on top of "mm: swap prefetch improvements"
>
> ---
> Failed radix_tree_insert wasn't being handled leaving stale kmem.
>
> The list should be iterated over in the reverse order when prefetching.
>
> Make the yield within kprefetchd stronger through the use of cond_resched.

hm.

>
> - might_sleep();
> - if (!prefetch_suitable())
> + /* Yield to anything else running */
> + if (cond_resched() || !prefetch_suitable())
> goto out_unlocked;

So if cond_resched() happened to schedule away, we terminate this
swap-tricking attempt. It's not possible to determine the reasons for this
from the code or from the changelog (==bad).

How come?

2007-05-14 22:48:11

by Con Kolivas

[permalink] [raw]
Subject: Re: [PATCH] mm: swap prefetch more improvements

On Tuesday 15 May 2007 08:00, Andrew Morton wrote:
> On Mon, 14 May 2007 10:50:54 +1000
>
> Con Kolivas <[email protected]> wrote:
> > akpm, please queue on top of "mm: swap prefetch improvements"
> >
> > ---
> > Failed radix_tree_insert wasn't being handled leaving stale kmem.
> >
> > The list should be iterated over in the reverse order when prefetching.
> >
> > Make the yield within kprefetchd stronger through the use of
> > cond_resched.
>
> hm.
>
> > - might_sleep();
> > - if (!prefetch_suitable())
> > + /* Yield to anything else running */
> > + if (cond_resched() || !prefetch_suitable())
> > goto out_unlocked;
>
> So if cond_resched() happened to schedule away, we terminate this
> swap-tricking attempt. It's not possible to determine the reasons for this
> from the code or from the changelog (==bad).
>
> How come?

Hmm I thought the line above that says "yield to anything else running" was
explicit enough. The idea is kprefetchd shouldn't run if any other real
activity is happening just about anywhere, and a positive cond_resched would
indicate likely activity so we just put kprefetchd back to sleep.

--
-ck

2007-05-14 23:01:44

by Andrew Morton

[permalink] [raw]
Subject: Re: [PATCH] mm: swap prefetch more improvements

On Tue, 15 May 2007 08:43:35 +1000
Con Kolivas <[email protected]> wrote:

> On Tuesday 15 May 2007 08:00, Andrew Morton wrote:
> > On Mon, 14 May 2007 10:50:54 +1000
> >
> > Con Kolivas <[email protected]> wrote:
> > > akpm, please queue on top of "mm: swap prefetch improvements"
> > >
> > > ---
> > > Failed radix_tree_insert wasn't being handled leaving stale kmem.
> > >
> > > The list should be iterated over in the reverse order when prefetching.
> > >
> > > Make the yield within kprefetchd stronger through the use of
> > > cond_resched.
> >
> > hm.
> >
> > > - might_sleep();
> > > - if (!prefetch_suitable())
> > > + /* Yield to anything else running */
> > > + if (cond_resched() || !prefetch_suitable())
> > > goto out_unlocked;
> >
> > So if cond_resched() happened to schedule away, we terminate this
> > swap-tricking attempt. It's not possible to determine the reasons for this
> > from the code or from the changelog (==bad).
> >
> > How come?
>
> Hmm I thought the line above that says "yield to anything else running" was
> explicit enough. The idea is kprefetchd shouldn't run if any other real
> activity is happening just about anywhere, and a positive cond_resched would
> indicate likely activity so we just put kprefetchd back to sleep.

But kprefetchd runs as SCHED_BATCH. Doesn't that mean that some low-prio
background thing (seti?) will disable swap-prefetch?

I mean, if swap-prefetch is actually useful, then it'll still be useful if
the machine happens to be doing some computational work. It's not obvious
to me that there is linkage between "doing CPU work" and "prefetching is
presently undesirable".


2007-05-14 23:28:31

by Con Kolivas

[permalink] [raw]
Subject: Re: [PATCH] mm: swap prefetch more improvements

On Tuesday 15 May 2007 09:01, Andrew Morton wrote:
> On Tue, 15 May 2007 08:43:35 +1000
>
> Con Kolivas <[email protected]> wrote:
> > On Tuesday 15 May 2007 08:00, Andrew Morton wrote:
> > > On Mon, 14 May 2007 10:50:54 +1000
> > >
> > > Con Kolivas <[email protected]> wrote:
> > > > akpm, please queue on top of "mm: swap prefetch improvements"
> > > >
> > > > ---
> > > > Failed radix_tree_insert wasn't being handled leaving stale kmem.
> > > >
> > > > The list should be iterated over in the reverse order when
> > > > prefetching.
> > > >
> > > > Make the yield within kprefetchd stronger through the use of
> > > > cond_resched.
> > >
> > > hm.
> > >
> > > > - might_sleep();
> > > > - if (!prefetch_suitable())
> > > > + /* Yield to anything else running */
> > > > + if (cond_resched() || !prefetch_suitable())
> > > > goto out_unlocked;
> > >
> > > So if cond_resched() happened to schedule away, we terminate this
> > > swap-tricking attempt. It's not possible to determine the reasons for
> > > this from the code or from the changelog (==bad).
> > >
> > > How come?
> >
> > Hmm I thought the line above that says "yield to anything else running"
> > was explicit enough. The idea is kprefetchd shouldn't run if any other
> > real activity is happening just about anywhere, and a positive
> > cond_resched would indicate likely activity so we just put kprefetchd
> > back to sleep.
>
> But kprefetchd runs as SCHED_BATCH. Doesn't that mean that some low-prio
> background thing (seti?) will disable swap-prefetch?
>
> I mean, if swap-prefetch is actually useful, then it'll still be useful if
> the machine happens to be doing some computational work. It's not obvious
> to me that there is linkage between "doing CPU work" and "prefetching is
> presently undesirable".

set_tsk_need_resched which is the trigger for a cond_resched occurring won't
be set just by a cpu bound task constantly running in the background as far
as I can see. It's only if some wakeup has triggered a set_tsk_need_resched.
ie prefetching still happens here with setiathome or equivalent running in my
testing. It might be overkill but from what I can see here it is no
impediment to prefetching occurring. I'll think about it some more and do
more testing but it seems ok to me.

--
-ck

2007-05-14 23:37:21

by Michael Chang

[permalink] [raw]
Subject: Re: [ck] Re: [PATCH] mm: swap prefetch more improvements

On 5/14/07, Andrew Morton <[email protected]> wrote:
> On Tue, 15 May 2007 08:43:35 +1000
> Con Kolivas <[email protected]> wrote:
>
> > On Tuesday 15 May 2007 08:00, Andrew Morton wrote:
> > > On Mon, 14 May 2007 10:50:54 +1000
> > >
> > > Con Kolivas <[email protected]> wrote:
> > > > akpm, please queue on top of "mm: swap prefetch improvements"
> > > >
> > > > ---
> > > > Failed radix_tree_insert wasn't being handled leaving stale kmem.
> > > >
> > > > The list should be iterated over in the reverse order when prefetching.
> > > >
> > > > Make the yield within kprefetchd stronger through the use of
> > > > cond_resched.
> > >
> > > hm.
> > >
> > > > - might_sleep();
> > > > - if (!prefetch_suitable())
> > > > + /* Yield to anything else running */
> > > > + if (cond_resched() || !prefetch_suitable())
> > > > goto out_unlocked;
> > >
> > > So if cond_resched() happened to schedule away, we terminate this
> > > swap-tricking attempt. It's not possible to determine the reasons for this
> > > from the code or from the changelog (==bad).
> > >
> > > How come?
> >
> > Hmm I thought the line above that says "yield to anything else running" was
> > explicit enough. The idea is kprefetchd shouldn't run if any other real
> > activity is happening just about anywhere, and a positive cond_resched would
> > indicate likely activity so we just put kprefetchd back to sleep.
>
> I mean, if swap-prefetch is actually useful, then it'll still be useful if
> the machine happens to be doing some computational work. It's not obvious
> to me that there is linkage between "doing CPU work" and "prefetching is
> presently undesirable".

That may be true, but I believe Con is attempting to err on the side
of caution in saying that swap prefetch should have practically no
negative impact if _anything_ is running. (The whole premise, for now,
anyways, is that swap prefetch should provide... "something for
(almost) nothing", if I'm interpreting this right.)

That said, there are probably some cases (seti) where swap prefetch
during the run of that batch program would help. On the flip side,
there are also some cases where batch processes (various other
seti-like "boinc" apps) which use a good deal of memory, meaning that
performing the prefetch at that time is futile, unhelpful, or
otherwise unwanted.

Would it be better to be less yield-y in this circumstance?

--
Michael Chang

Please avoid sending me Word or PowerPoint attachments.
See http://www.gnu.org/philosophy/no-word-attachments.html
Thank you.

2007-05-15 09:58:22

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH] mm: swap prefetch more improvements

On Mon, 2007-05-14 at 15:00 -0700, Andrew Morton wrote:
> On Mon, 14 May 2007 10:50:54 +1000
> Con Kolivas <[email protected]> wrote:
>
> > akpm, please queue on top of "mm: swap prefetch improvements"
> >
> > ---
> > Failed radix_tree_insert wasn't being handled leaving stale kmem.
> >
> > The list should be iterated over in the reverse order when prefetching.
> >
> > Make the yield within kprefetchd stronger through the use of cond_resched.
>
> hm.
>
> >
> > - might_sleep();
> > - if (!prefetch_suitable())
> > + /* Yield to anything else running */
> > + if (cond_resched() || !prefetch_suitable())
> > goto out_unlocked;
>
> So if cond_resched() happened to schedule away, we terminate this
> swap-tricking attempt. It's not possible to determine the reasons for this
> from the code or from the changelog (==bad).
>
> How come?

I think Con meant need_resched(). That would indicate someone else wants
to use the CPU and and has higher priority than kprefetchd.



2007-05-15 12:41:11

by Con Kolivas

[permalink] [raw]
Subject: Re: [PATCH] mm: swap prefetch more improvements

On Tuesday 15 May 2007 19:58, Peter Zijlstra wrote:
> On Mon, 2007-05-14 at 15:00 -0700, Andrew Morton wrote:
> > On Mon, 14 May 2007 10:50:54 +1000
> >
> > Con Kolivas <[email protected]> wrote:
> > > akpm, please queue on top of "mm: swap prefetch improvements"
> > >
> > > ---
> > > Failed radix_tree_insert wasn't being handled leaving stale kmem.
> > >
> > > The list should be iterated over in the reverse order when prefetching.
> > >
> > > Make the yield within kprefetchd stronger through the use of
> > > cond_resched.
> >
> > hm.
> >
> > > - might_sleep();
> > > - if (!prefetch_suitable())
> > > + /* Yield to anything else running */
> > > + if (cond_resched() || !prefetch_suitable())
> > > goto out_unlocked;
> >
> > So if cond_resched() happened to schedule away, we terminate this
> > swap-tricking attempt. It's not possible to determine the reasons for
> > this from the code or from the changelog (==bad).
> >
> > How come?
>
> I think Con meant need_resched(). That would indicate someone else wants
> to use the CPU and and has higher priority than kprefetchd.

It may well be that need_resched is what I was trying to do... I don't need it
to do the resched and _then_ break out of swap prefetch.

--
-ck