2012-05-23 20:45:38

by Satoru Moriya

Subject: [PATCH RESEND] avoid swapping out with swappiness==0

Hi Andrew,

This patch has been under review for a couple of months.

This patch *only* improves the behavior when the kernel has
enough filebacked pages. It does not change the behavior when
the kernel has only a small number of filebacked pages.

Kosaki-san pointed out that the threshold we use to decide
whether there are enough filebacked pages is not
appropriate(*).

(*) http://www.spinics.net/lists/linux-mm/msg32380.html

As I described in (**), I believe that the threshold discussion
should be held in a separate thread, because the threshold affects
more than the swappiness=0 case, and below the threshold the
kernel behaves the same way with or without this patch.

(**) http://www.spinics.net/lists/linux-mm/msg34317.html

The patch may not be perfect but, at least, it improves the
kernel behavior when there is enough filebacked memory.
I believe it's better than nothing.

Do you have any comments about it?

NOTE: I updated the patch with Acked-by tags

---
Sometimes we'd like to avoid swapping out anonymous memory.
In particular, we'd like to avoid swapping out pages of important
processes or process groups while there is a reasonable amount of
pagecache in RAM, so that we can satisfy our customers' requirements.

OTOH, we can control how aggressively the kernel swaps out memory
pages with /proc/sys/vm/swappiness globally and
/sys/fs/cgroup/memory/memory.swappiness for each memcg.
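
For reference, the current value of either knob can be read from
userspace like any other text file. The following is a minimal
sketch, not part of the patch; read_swappiness() is a hypothetical
helper name.

```c
#include <stdio.h>

/*
 * Read an integer tunable such as /proc/sys/vm/swappiness.
 * Returns the value on success, or -1 if the file cannot be
 * opened or does not start with an integer.
 * read_swappiness() is a hypothetical helper, not a kernel or
 * libc interface.
 */
int read_swappiness(const char *path)
{
	FILE *f = fopen(path, "r");
	int value;

	if (!f)
		return -1;
	if (fscanf(f, "%d", &value) != 1)
		value = -1;
	fclose(f);
	return value;
}
```

Calling read_swappiness("/proc/sys/vm/swappiness") returns the
global setting. The same helper works for a memcg's
memory.swappiness file when given that file's path.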

But with the current reclaim implementation, the kernel may swap out
even if we set swappiness==0 and there is pagecache in RAM.

This patch changes the behavior with swappiness==0. If we set
swappiness==0, the kernel does not swap out at all
(for global reclaim, until the amount of free pages and filebacked
pages in a zone has been reduced to something very small
(nr_free + nr_filebacked < high watermark)).
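
The intended rule can be written down as a small userspace model.
This is only a sketch of the description above, not kernel code:
struct zone_counts and both function names are hypothetical, and
the comparison follows the condition as stated in this message.

```c
#include <stdbool.h>

/* Hypothetical per-zone counters, in pages; not the kernel's struct zone. */
struct zone_counts {
	unsigned long nr_free;
	unsigned long nr_filebacked;
	unsigned long high_watermark;
};

/*
 * True when the zone is so short of free and filebacked pages
 * that anonymous pages may be scanned even with swappiness == 0
 * (global reclaim only, per the description above).
 */
bool file_pages_exhausted(const struct zone_counts *z)
{
	return z->nr_free + z->nr_filebacked < z->high_watermark;
}

/* True when reclaim may scan, and so swap out, anonymous pages. */
bool may_scan_anon(unsigned int swappiness, const struct zone_counts *z)
{
	if (swappiness != 0)
		return true;
	return file_pages_exhausted(z);
}
```

With swappiness==0, may_scan_anon() stays false for as long as free
plus filebacked pages remain at or above the high watermark.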

Any comments are welcome.

Regards,
Satoru Moriya

Signed-off-by: Satoru Moriya <[email protected]>
Acked-by: Minchan Kim <[email protected]>
Acked-by: Rik van Riel <[email protected]>

---
mm/vmscan.c | 6 +++---
1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 33dc256..52d64bf 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1983,10 +1983,10 @@ static void get_scan_count(struct mem_cgroup_zone *mz, struct scan_control *sc,
 	 * proportional to the fraction of recently scanned pages on
 	 * each list that were recently referenced and in active use.
 	 */
-	ap = (anon_prio + 1) * (reclaim_stat->recent_scanned[0] + 1);
+	ap = anon_prio * (reclaim_stat->recent_scanned[0] + 1);
 	ap /= reclaim_stat->recent_rotated[0] + 1;

-	fp = (file_prio + 1) * (reclaim_stat->recent_scanned[1] + 1);
+	fp = file_prio * (reclaim_stat->recent_scanned[1] + 1);
 	fp /= reclaim_stat->recent_rotated[1] + 1;
 	spin_unlock_irq(&mz->zone->lru_lock);

@@ -1999,7 +1999,7 @@ out:
 		unsigned long scan;

 		scan = zone_nr_lru_pages(mz, lru);
-		if (priority || noswap) {
+		if (priority || noswap || !vmscan_swappiness(mz, sc)) {
 			scan >>= priority;
 			if (!scan && force_scan)
 				scan = SWAP_CLUSTER_MAX;
--
1.7.6.5
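
To see why dropping the "+ 1" matters, the arithmetic in the first
hunk can be modelled in userspace. This is a rough sketch, not
kernel code. It assumes anon_prio equals swappiness, file_prio
equals 200 - swappiness, and the anon share is ap / (ap + fp + 1);
those details are recalled from the surrounding function and are not
shown in the patch. struct lru_stat and both function names are
hypothetical. The sketch always applies the fraction, which is what
the second hunk makes the kernel do when swappiness is 0.

```c
#include <stdint.h>

/* Hypothetical stand-in for the per-LRU reclaim statistics. */
struct lru_stat {
	uint64_t recent_scanned;
	uint64_t recent_rotated;
};

/*
 * Scan pressure for one LRU type. With bias = 1 this follows the
 * removed lines of the patch, with bias = 0 the added ones.
 */
uint64_t scan_pressure(uint64_t prio, const struct lru_stat *st,
		       uint64_t bias)
{
	uint64_t p = (prio + bias) * (st->recent_scanned + 1);

	return p / (st->recent_rotated + 1);
}

/*
 * Number of anonymous pages to scan out of nr_anon.
 * Assumption: anon_prio = swappiness, file_prio = 200 - swappiness,
 * denominator = ap + fp + 1. swappiness is expected to be 0..100.
 */
uint64_t anon_scan_target(unsigned int swappiness, uint64_t nr_anon,
			  const struct lru_stat *anon,
			  const struct lru_stat *file, uint64_t bias)
{
	uint64_t ap = scan_pressure(swappiness, anon, bias);
	uint64_t fp = scan_pressure(200 - swappiness, file, bias);

	return nr_anon * ap / (ap + fp + 1);
}
```

With swappiness 0 and bias 1 (the old formula), ap is still positive
because (0 + 1) multiplies the scanned count, so some anonymous pages
are selected for scanning. With bias 0 (the new formula), ap is 0 and
the anon scan target is 0 regardless of the statistics.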


2012-05-23 21:46:30

by Rik van Riel

Subject: Re: [PATCH RESEND] avoid swapping out with swappiness==0

On 05/23/2012 04:41 PM, Satoru Moriya wrote:

> The patch may not be perfect but, at least, it improves the
> kernel behavior when there is enough filebacked memory.
> I believe it's better than nothing.

Agreed.

> Do you have any comments about it?

Only one comment, and it's for Andrew :)

> Signed-off-by: Satoru Moriya<[email protected]>
> Acked-by: Minchan Kim<[email protected]>
> Acked-by: Rik van Riel<[email protected]>

Andrew, you can turn my Acked-by into a

Reviewed-by: Rik van Riel<[email protected]>

This is functionality that many people seem to want, and it
will not break anything current users typically do.

--
All rights reversed

2012-05-24 09:16:14

by Jerome Marchand

Subject: Re: [PATCH RESEND] avoid swapping out with swappiness==0

On 05/23/2012 10:41 PM, Satoru Moriya wrote:
> Signed-off-by: Satoru Moriya <[email protected]>
> Acked-by: Minchan Kim <[email protected]>
> Acked-by: Rik van Riel <[email protected]>
>

Acked-by: Jerome Marchand <[email protected]>