LinuxLists.cc - [PATCH] vmscan: memcg needs may_swap (Re: [patch] vmscan: rename sc.may_swap to may

2009-03-27 06:24:26

Subject: [PATCH] vmscan: memcg needs may_swap (Re: [patch] vmscan: rename sc.may_swap to may_unmap)

Added
Cc: KAMEZAWA Hiroyuki <[email protected]>
Cc: Balbir Singh <[email protected]>

I'm sorry for replying to a very old mail.

> @@ -1713,7 +1713,7 @@ unsigned long try_to_free_mem_cgroup_pag
> {
> struct scan_control sc = {
> .may_writepage = !laptop_mode,
> - .may_swap = 1,
> + .may_unmap = 1,
> .swap_cluster_max = SWAP_CLUSTER_MAX,
> .swappiness = swappiness,
> .order = 0,
> @@ -1723,7 +1723,7 @@ unsigned long try_to_free_mem_cgroup_pag
> struct zonelist *zonelist;
>
> if (noswap)
> - sc.may_swap = 0;
> + sc.may_unmap = 0;
>
> sc.gfp_mask = (gfp_mask & GFP_RECLAIM_MASK) |
> (GFP_HIGHUSER_MOVABLE & ~GFP_RECLAIM_MASK);
IIUC, memcg had used may_swap as a flag for "do we need to use swap?", as the name indicates.

When mem+swap hits the limit, trying to swap out pages is meaningless
because it doesn't change mem+swap usage.
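
To make the accounting argument concrete, here is a minimal sketch in C. The struct and helper names are hypothetical and only model two counters; they are not the kernel's actual charge accounting.

```c
/*
 * Hypothetical, simplified model of a memcg's charges (in pages).
 * These names are illustrative; the kernel's real accounting differs.
 */
struct memcg_model {
	unsigned long mem;	/* pages resident in memory */
	unsigned long swap;	/* pages held in swap */
};

/* The mem+swap limit is checked against the sum of both counters. */
static unsigned long memsw_usage(const struct memcg_model *cg)
{
	return cg->mem + cg->swap;
}

/* Swapping out moves charges from mem to swap; the sum is unchanged. */
static void swap_out_anon(struct memcg_model *cg, unsigned long nr)
{
	cg->mem -= nr;
	cg->swap += nr;
}

/* Dropping clean file cache releases the charge entirely. */
static void drop_file_cache(struct memcg_model *cg, unsigned long nr)
{
	cg->mem -= nr;
}
```

In this model, calling swap_out_anon() leaves memsw_usage() where it was, while drop_file_cache() lowers it, which is why only the latter helps once the mem+swap limit is hit.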

What do you think of this patch?
===
From: Daisuke Nishimura <[email protected]>

vmscan-rename-scmay_swap-to-may_unmap.patch removed the may_swap flag,
but memcg had used it as a flag for "do we need to use swap?", as the
name indicates.

As a result, in the current implementation memcg cannot reclaim mapped
file caches when mem+swap hits the limit: noswap now clears may_unmap,
which makes shrink_page_list() skip every mapped page.

Re-introduce the may_swap flag and handle it in shrink_page_list().

This patch doesn't influence any scan_control users other than memcg.

Signed-off-by: Daisuke Nishimura <[email protected]>
---
mm/vmscan.c | 15 ++++++++++++++-
1 files changed, 14 insertions(+), 1 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index c815653..86118d9 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -64,6 +64,9 @@ struct scan_control {
/* Can mapped pages be reclaimed? */
int may_unmap;

+ /* Can pages be swapped as part of reclaim? */
+ int may_swap;
+
/* This context's SWAP_CLUSTER_MAX. If freeing memory for
* suspend, we effectively ignore SWAP_CLUSTER_MAX.
* In this context, it doesn't matter that we scan the
@@ -616,6 +619,11 @@ static unsigned long shrink_page_list(struct list_head *page_list,
if (!sc->may_unmap && page_mapped(page))
goto keep_locked;

+ if (!sc->may_swap && PageSwapBacked(page)
+ /* SwapCache uses 'swap' already */
+ && !PageSwapCache(page))
+ goto keep_locked;
+
/* Double the slab pressure for mapped and swapcache pages */
if (page_mapped(page) || PageSwapCache(page))
sc->nr_scanned++;
@@ -1696,6 +1704,7 @@ unsigned long try_to_free_pages(struct zonelist *zonelist, int order,
.may_writepage = !laptop_mode,
.swap_cluster_max = SWAP_CLUSTER_MAX,
.may_unmap = 1,
+ .may_swap = 1,
.swappiness = vm_swappiness,
.order = order,
.mem_cgroup = NULL,
@@ -1715,6 +1724,7 @@ unsigned long try_to_free_mem_cgroup_pages(struct mem_cgroup *mem_cont,
struct scan_control sc = {
.may_writepage = !laptop_mode,
.may_unmap = 1,
+ .may_swap = 1,
.swap_cluster_max = SWAP_CLUSTER_MAX,
.swappiness = swappiness,
.order = 0,
@@ -1724,7 +1734,7 @@ unsigned long try_to_free_mem_cgroup_pages(struct mem_cgroup *mem_cont,
struct zonelist *zonelist;

if (noswap)
- sc.may_unmap = 0;
+ sc.may_swap = 0;

sc.gfp_mask = (gfp_mask & GFP_RECLAIM_MASK) |
(GFP_HIGHUSER_MOVABLE & ~GFP_RECLAIM_MASK);
@@ -1764,6 +1774,7 @@ static unsigned long balance_pgdat(pg_data_t *pgdat, int order)
struct scan_control sc = {
.gfp_mask = GFP_KERNEL,
.may_unmap = 1,
+ .may_swap = 1,
.swap_cluster_max = SWAP_CLUSTER_MAX,
.swappiness = vm_swappiness,
.order = order,
@@ -2110,6 +2121,7 @@ unsigned long shrink_all_memory(unsigned long nr_pages)
struct scan_control sc = {
.gfp_mask = GFP_KERNEL,
.may_unmap = 0,
+ .may_swap = 1,
.may_writepage = 1,
.isolate_pages = isolate_pages_global,
};
@@ -2292,6 +2304,7 @@ static int __zone_reclaim(struct zone *zone, gfp_t gfp_mask, unsigned int order)
struct scan_control sc = {
.may_writepage = !!(zone_reclaim_mode & RECLAIM_WRITE),
.may_unmap = !!(zone_reclaim_mode & RECLAIM_SWAP),
+ .may_swap = 1,
.swap_cluster_max = max_t(unsigned long, nr_pages,
SWAP_CLUSTER_MAX),
.gfp_mask = gfp_mask,
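
The check this patch adds to shrink_page_list() can be modelled outside the kernel as a small predicate. In this sketch the structs and the function name are hypothetical; the boolean fields stand in for page_mapped(), PageSwapBacked() and PageSwapCache().

```c
#include <stdbool.h>

/* Hypothetical stand-in for the page state the kernel tests. */
struct page_model {
	bool mapped;		/* page_mapped(page) */
	bool swap_backed;	/* PageSwapBacked(page) */
	bool in_swap_cache;	/* PageSwapCache(page) */
};

/* Hypothetical subset of struct scan_control. */
struct scan_flags {
	bool may_unmap;
	bool may_swap;
};

/* Returns true when reclaim has to skip the page (goto keep_locked). */
static bool must_keep(const struct scan_flags *sc, const struct page_model *p)
{
	/* Existing check: mapped pages are off limits without may_unmap. */
	if (!sc->may_unmap && p->mapped)
		return true;

	/*
	 * New check: without may_swap, skip swap-backed pages unless they
	 * already sit in the swap cache and so already use swap.
	 */
	if (!sc->may_swap && p->swap_backed && !p->in_swap_cache)
		return true;

	return false;
}
```

With may_unmap = 0 (the behaviour after the rename) a mapped file page is kept; with may_unmap = 1 and may_swap = 0 (this patch) the same page can be reclaimed, while anon pages not yet in the swap cache are kept.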

2009-03-27 06:32:21

by Kamezawa Hiroyuki

Subject: Re: [PATCH] vmscan: memcg needs may_swap (Re: [patch] vmscan: rename sc.may_swap to may_unmap)

On Fri, 27 Mar 2009 15:19:26 +0900
Daisuke Nishimura <[email protected]> wrote:

> Added
> Cc: KAMEZAWA Hiroyuki <[email protected]>
> Cc: Balbir Singh <[email protected]>
>
> I'm sorry for replying to a very old mail.
>
> > @@ -1713,7 +1713,7 @@ unsigned long try_to_free_mem_cgroup_pag
> > {
> > struct scan_control sc = {
> > .may_writepage = !laptop_mode,
> > - .may_swap = 1,
> > + .may_unmap = 1,
> > .swap_cluster_max = SWAP_CLUSTER_MAX,
> > .swappiness = swappiness,
> > .order = 0,
> > @@ -1723,7 +1723,7 @@ unsigned long try_to_free_mem_cgroup_pag
> > struct zonelist *zonelist;
> >
> > if (noswap)
> > - sc.may_swap = 0;
> > + sc.may_unmap = 0;
> >
> > sc.gfp_mask = (gfp_mask & GFP_RECLAIM_MASK) |
> > (GFP_HIGHUSER_MOVABLE & ~GFP_RECLAIM_MASK);
> IIUC, memcg had used may_swap as a flag for "we need to use swap?" as the name indicate.
>
> Because, when mem+swap hits the limit, trying to swapout pages is meaningless
> as it doesn't change mem+swap usage.
>
Good catch... sigh, I missed this discussion.

> What do you think of this patch?
> ===
> From: Daisuke Nishimura <[email protected]>
>
> vmscan-rename-scmay_swap-to-may_unmap.patch removed may_swap flag,
> but memcg had used it as a flag for "we need to use swap?", as the
> name indicate.
>
> And in current implementation, memcg cannot reclaim mapped file caches
> when mem+swap hits the limit.
>
When mem+swap hits the limit, swapping out anonymous pages doesn't reduce
mem+swap usage, so swap-out should be avoided.

> re-introduce may_swap flag and handle it at shrink_page_list.
>
> This patch doesn't influence any scan_control users other than memcg.
>

> Signed-off-by: Daisuke Nishimura <[email protected]>

Seems good,
Reviewed-by: KAMEZAWA Hiroyuki <[email protected]>

But hmm... maybe this LRU scan should work the same way as the
!total_swap_pages case (that is, don't scan the anon LRU at all).
I'll revisit this later.

-Kame

> ---
> mm/vmscan.c | 15 ++++++++++++++-
> 1 files changed, 14 insertions(+), 1 deletions(-)
>
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index c815653..86118d9 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -64,6 +64,9 @@ struct scan_control {
> /* Can mapped pages be reclaimed? */
> int may_unmap;
>
> + /* Can pages be swapped as part of reclaim? */
> + int may_swap;
> +
> /* This context's SWAP_CLUSTER_MAX. If freeing memory for
> * suspend, we effectively ignore SWAP_CLUSTER_MAX.
> * In this context, it doesn't matter that we scan the
> @@ -616,6 +619,11 @@ static unsigned long shrink_page_list(struct list_head *page_list,
> if (!sc->may_unmap && page_mapped(page))
> goto keep_locked;
>
> + if (!sc->may_swap && PageSwapBacked(page)
> + /* SwapCache uses 'swap' already */
> + && !PageSwapCache(page))
> + goto keep_locked;
> +
> /* Double the slab pressure for mapped and swapcache pages */
> if (page_mapped(page) || PageSwapCache(page))
> sc->nr_scanned++;
> @@ -1696,6 +1704,7 @@ unsigned long try_to_free_pages(struct zonelist *zonelist, int order,
> .may_writepage = !laptop_mode,
> .swap_cluster_max = SWAP_CLUSTER_MAX,
> .may_unmap = 1,
> + .may_swap = 1,
> .swappiness = vm_swappiness,
> .order = order,
> .mem_cgroup = NULL,
> @@ -1715,6 +1724,7 @@ unsigned long try_to_free_mem_cgroup_pages(struct mem_cgroup *mem_cont,
> struct scan_control sc = {
> .may_writepage = !laptop_mode,
> .may_unmap = 1,
> + .may_swap = 1,
> .swap_cluster_max = SWAP_CLUSTER_MAX,
> .swappiness = swappiness,
> .order = 0,
> @@ -1724,7 +1734,7 @@ unsigned long try_to_free_mem_cgroup_pages(struct mem_cgroup *mem_cont,
> struct zonelist *zonelist;
>
> if (noswap)
> - sc.may_unmap = 0;
> + sc.may_swap = 0;
>
> sc.gfp_mask = (gfp_mask & GFP_RECLAIM_MASK) |
> (GFP_HIGHUSER_MOVABLE & ~GFP_RECLAIM_MASK);
> @@ -1764,6 +1774,7 @@ static unsigned long balance_pgdat(pg_data_t *pgdat, int order)
> struct scan_control sc = {
> .gfp_mask = GFP_KERNEL,
> .may_unmap = 1,
> + .may_swap = 1,
> .swap_cluster_max = SWAP_CLUSTER_MAX,
> .swappiness = vm_swappiness,
> .order = order,
> @@ -2110,6 +2121,7 @@ unsigned long shrink_all_memory(unsigned long nr_pages)
> struct scan_control sc = {
> .gfp_mask = GFP_KERNEL,
> .may_unmap = 0,
> + .may_swap = 1,
> .may_writepage = 1,
> .isolate_pages = isolate_pages_global,
> };
> @@ -2292,6 +2304,7 @@ static int __zone_reclaim(struct zone *zone, gfp_t gfp_mask, unsigned int order)
> struct scan_control sc = {
> .may_writepage = !!(zone_reclaim_mode & RECLAIM_WRITE),
> .may_unmap = !!(zone_reclaim_mode & RECLAIM_SWAP),
> + .may_swap = 1,
> .swap_cluster_max = max_t(unsigned long, nr_pages,
> SWAP_CLUSTER_MAX),
> .gfp_mask = gfp_mask,
>

2009-03-29 23:45:45

by KOSAKI Motohiro

Subject: Re: [PATCH] vmscan: memcg needs may_swap (Re: [patch] vmscan: rename sc.may_swap to may_unmap)

> On Fri, 27 Mar 2009 15:19:26 +0900
> Daisuke Nishimura <[email protected]> wrote:
>
> > Added
> > Cc: KAMEZAWA Hiroyuki <[email protected]>
> > Cc: Balbir Singh <[email protected]>
> >
> > I'm sorry for replying to a very old mail.
> >
> > > @@ -1713,7 +1713,7 @@ unsigned long try_to_free_mem_cgroup_pag
> > > {
> > > struct scan_control sc = {
> > > .may_writepage = !laptop_mode,
> > > - .may_swap = 1,
> > > + .may_unmap = 1,
> > > .swap_cluster_max = SWAP_CLUSTER_MAX,
> > > .swappiness = swappiness,
> > > .order = 0,
> > > @@ -1723,7 +1723,7 @@ unsigned long try_to_free_mem_cgroup_pag
> > > struct zonelist *zonelist;
> > >
> > > if (noswap)
> > > - sc.may_swap = 0;
> > > + sc.may_unmap = 0;
> > >
> > > sc.gfp_mask = (gfp_mask & GFP_RECLAIM_MASK) |
> > > (GFP_HIGHUSER_MOVABLE & ~GFP_RECLAIM_MASK);
> > IIUC, memcg had used may_swap as a flag for "we need to use swap?" as the name indicate.
> >
> > Because, when mem+swap hits the limit, trying to swapout pages is meaningless
> > as it doesn't change mem+swap usage.
> >
> Good catch... sigh, I missed this discussion.
>
>
>
> > What do you think of this patch?
> > ===
> > From: Daisuke Nishimura <[email protected]>
> >
> > vmscan-rename-scmay_swap-to-may_unmap.patch removed may_swap flag,
> > but memcg had used it as a flag for "we need to use swap?", as the
> > name indicate.
> >
> > And in current implementation, memcg cannot reclaim mapped file caches
> > when mem+swap hits the limit.
> >
> When mem+swap hits the limit, swap-out anonymous page doesn't reduce the
> amount of usage of mem+swap, so, swap-out should be avoided.
>
> > re-introduce may_swap flag and handle it at shrink_page_list.
> >
> > This patch doesn't influence any scan_control users other than memcg.
> >
>
>
> > Signed-off-by: Daisuke Nishimura <[email protected]>
>
> Seems good,
> Reviewed-by: KAMEZAWA Hiroyuki <[email protected]>
>
> But hum....Maybe this lru scan work in the same way as the case
> of !total_swap_pages. (means don't scan anon LRU.)
> revisit this later.

Well, how about the following patch?

I have to agree that my judgement about may_unmap was wrong.
You explained that memcg can use may_swap instead of may_unmap, and I think
the other may_unmap users (zone_reclaim and shrink_all_memory) can be
converted from may_unmap to may_swap as well.

IOW, Nishimura-san, you have shown that we can remove the may_unmap branch
from shrink_page_list().
That's a really good job. Thanks!

========
Subject: vmscan: reintroduce sc->may_swap

vmscan-rename-scmay_swap-to-may_unmap.patch removed the may_swap flag,
but memcg had used it as a flag for "do we need to use swap?", as the
name indicates.

As a result, in the current implementation memcg cannot reclaim mapped
file caches when mem+swap hits the limit.

Re-introduce the may_swap flag and handle it in get_scan_ratio().
This patch doesn't influence any scan_control users other than memcg.

Signed-off-by: KOSAKI Motohiro <[email protected]>
Signed-off-by: Daisuke Nishimura <[email protected]>
---
mm/vmscan.c | 12 ++++++++++--
1 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 3be6157..00ea4a1 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -63,6 +63,9 @@ struct scan_control {
/* Can mapped pages be reclaimed? */
int may_unmap;

+ /* Can pages be swapped as part of reclaim? */
+ int may_swap;
+
/* This context's SWAP_CLUSTER_MAX. If freeing memory for
* suspend, we effectively ignore SWAP_CLUSTER_MAX.
* In this context, it doesn't matter that we scan the
@@ -1379,7 +1382,7 @@ static void get_scan_ratio(struct zone *zone, struct scan_control *sc,
struct zone_reclaim_stat *reclaim_stat = get_reclaim_stat(zone, sc);

/* If we have no swap space, do not bother scanning anon pages. */
- if (nr_swap_pages <= 0) {
+ if (!sc->may_swap || (nr_swap_pages <= 0)) {
percent[0] = 0;
percent[1] = 100;
return;
@@ -1695,6 +1698,7 @@ unsigned long try_to_free_pages(struct zonelist *zonelist, int order,
.may_writepage = !laptop_mode,
.swap_cluster_max = SWAP_CLUSTER_MAX,
.may_unmap = 1,
+ .may_swap = 1,
.swappiness = vm_swappiness,
.order = order,
.mem_cgroup = NULL,
@@ -1714,6 +1718,7 @@ unsigned long try_to_free_mem_cgroup_pages(struct mem_cgroup *mem_cont,
struct scan_control sc = {
.may_writepage = !laptop_mode,
.may_unmap = 1,
+ .may_swap = 1,
.swap_cluster_max = SWAP_CLUSTER_MAX,
.swappiness = swappiness,
.order = 0,
@@ -1723,7 +1728,7 @@ unsigned long try_to_free_mem_cgroup_pages(struct mem_cgroup *mem_cont,
struct zonelist *zonelist;

if (noswap)
- sc.may_unmap = 0;
+ sc.may_swap = 0;

sc.gfp_mask = (gfp_mask & GFP_RECLAIM_MASK) |
(GFP_HIGHUSER_MOVABLE & ~GFP_RECLAIM_MASK);
@@ -1763,6 +1768,7 @@ static unsigned long balance_pgdat(pg_data_t *pgdat, int order)
struct scan_control sc = {
.gfp_mask = GFP_KERNEL,
.may_unmap = 1,
+ .may_swap = 1,
.swap_cluster_max = SWAP_CLUSTER_MAX,
.swappiness = vm_swappiness,
.order = order,
@@ -2109,6 +2115,7 @@ unsigned long shrink_all_memory(unsigned long nr_pages)
struct scan_control sc = {
.gfp_mask = GFP_KERNEL,
.may_unmap = 0,
+ .may_swap = 1,
.swap_cluster_max = nr_pages,
.may_writepage = 1,
.isolate_pages = isolate_pages_global,
@@ -2289,6 +2296,7 @@ static int __zone_reclaim(struct zone *zone, gfp_t gfp_mask, unsigned int order)
struct scan_control sc = {
.may_writepage = !!(zone_reclaim_mode & RECLAIM_WRITE),
.may_unmap = !!(zone_reclaim_mode & RECLAIM_SWAP),
+ .may_swap = 1,
.swap_cluster_max = max_t(unsigned long, nr_pages,
SWAP_CLUSTER_MAX),
.gfp_mask = gfp_mask,
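
The effect of the get_scan_ratio() change can be sketched in plain C as follows. The function name and the anon_share parameter are hypothetical; anon_share stands in for the kernel's swappiness-based calculation, which is not reproduced here.

```c
#include <stdbool.h>

/*
 * percent[0] is the anon share and percent[1] the file share, matching
 * the indices used in the patch. anon_share is a hypothetical input
 * replacing the kernel's real ratio calculation.
 */
static void scan_ratio_model(bool may_swap, long nr_swap_pages,
			     unsigned long anon_share,
			     unsigned long percent[2])
{
	/* Swap not allowed, or no swap space left: scan file pages only. */
	if (!may_swap || nr_swap_pages <= 0) {
		percent[0] = 0;
		percent[1] = 100;
		return;
	}

	/* Otherwise split the scan effort between anon and file lists. */
	if (anon_share > 100)
		anon_share = 100;
	percent[0] = anon_share;
	percent[1] = 100 - anon_share;
}
```

Because the decision is taken before any list is scanned, a reclaim pass with may_swap = 0 never touches the anon LRU in this model, regardless of how much swap space is free.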

2009-03-31 00:23:52

by Daisuke Nishimura

Subject: Re: [PATCH] vmscan: memcg needs may_swap (Re: [patch] vmscan: rename sc.may_swap to may_unmap)

On Mon, 30 Mar 2009 08:45:28 +0900 (JST), KOSAKI Motohiro <[email protected]> wrote:
> > On Fri, 27 Mar 2009 15:19:26 +0900
> > Daisuke Nishimura <[email protected]> wrote:
> >
> > > Added
> > > Cc: KAMEZAWA Hiroyuki <[email protected]>
> > > Cc: Balbir Singh <[email protected]>
> > >
> > > I'm sorry for replying to a very old mail.
> > >
> > > > @@ -1713,7 +1713,7 @@ unsigned long try_to_free_mem_cgroup_pag
> > > > {
> > > > struct scan_control sc = {
> > > > .may_writepage = !laptop_mode,
> > > > - .may_swap = 1,
> > > > + .may_unmap = 1,
> > > > .swap_cluster_max = SWAP_CLUSTER_MAX,
> > > > .swappiness = swappiness,
> > > > .order = 0,
> > > > @@ -1723,7 +1723,7 @@ unsigned long try_to_free_mem_cgroup_pag
> > > > struct zonelist *zonelist;
> > > >
> > > > if (noswap)
> > > > - sc.may_swap = 0;
> > > > + sc.may_unmap = 0;
> > > >
> > > > sc.gfp_mask = (gfp_mask & GFP_RECLAIM_MASK) |
> > > > (GFP_HIGHUSER_MOVABLE & ~GFP_RECLAIM_MASK);
> > > IIUC, memcg had used may_swap as a flag for "we need to use swap?" as the name indicate.
> > >
> > > Because, when mem+swap hits the limit, trying to swapout pages is meaningless
> > > as it doesn't change mem+swap usage.
> > >
> > Good catch... sigh, I missed this discussion.
> >
> >
> >
> > > What do you think of this patch?
> > > ===
> > > From: Daisuke Nishimura <[email protected]>
> > >
> > > vmscan-rename-scmay_swap-to-may_unmap.patch removed may_swap flag,
> > > but memcg had used it as a flag for "we need to use swap?", as the
> > > name indicate.
> > >
> > > And in current implementation, memcg cannot reclaim mapped file caches
> > > when mem+swap hits the limit.
> > >
> > When mem+swap hits the limit, swap-out anonymous page doesn't reduce the
> > amount of usage of mem+swap, so, swap-out should be avoided.
> >
> > > re-introduce may_swap flag and handle it at shrink_page_list.
> > >
> > > This patch doesn't influence any scan_control users other than memcg.
> > >
> >
> >
> > > Signed-off-by: Daisuke Nishimura <[email protected]>
> >
> > Seems good,
> > Reviewed-by: KAMEZAWA Hiroyuki <[email protected]>
> >
> > But hum....Maybe this lru scan work in the same way as the case
> > of !total_swap_pages. (means don't scan anon LRU.)
> > revisit this later.
>
> Well, How about following patch?
>
I think your patch looks better because it avoids scanning the anon
list in vain.

Thanks,
Daisuke Nishimura.
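
The difference in scanning work between the two patches can be illustrated with a rough model. Everything here is a simplifying assumption: the names are hypothetical, every page that is allowed to be reclaimed is counted as reclaimed, and swap cache pages are ignored.

```c
#include <stdbool.h>
#include <stddef.h>

struct scan_stats {
	size_t examined;	/* pages taken off an LRU and looked at */
	size_t reclaimed;	/* pages actually freed */
};

/* Per-page filter: anon pages are still examined, then put back. */
static struct scan_stats scan_with_page_filter(size_t nr_anon,
					       size_t nr_file, bool may_swap)
{
	struct scan_stats st = { 0, 0 };
	size_t i;

	for (i = 0; i < nr_anon; i++) {
		st.examined++;
		if (may_swap)
			st.reclaimed++;
	}
	for (i = 0; i < nr_file; i++) {
		st.examined++;
		st.reclaimed++;
	}
	return st;
}

/* Scan-ratio approach: the anon list is not scanned when !may_swap. */
static struct scan_stats scan_with_ratio(size_t nr_anon,
					 size_t nr_file, bool may_swap)
{
	size_t anon_to_scan = may_swap ? nr_anon : 0;
	struct scan_stats st;

	st.examined = anon_to_scan + nr_file;
	st.reclaimed = anon_to_scan + nr_file;
	return st;
}
```

For 100 anon and 50 file pages with may_swap = 0, both variants free the same 50 pages in this model, but the per-page filter examines 150 pages while the scan-ratio variant examines only 50.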

> So, I have to agree my judgement of may_unmap was wrong.
> You explain memcg can use may_swap instead may_unmap. and I think
> other may_unmap user (zone_reclaim and shrink_all_list) can convert
> may_unmap code to may_swap.
>
> IOW, Nishimura-san, you explain we can remove the branch of the may_unmap
> from shrink_page_list().
> it's really good job. thanks!
>
>
> ========
> Subject: vmscan: reintroduce sc->may_swap
>
> vmscan-rename-scmay_swap-to-may_unmap.patch removed may_swap flag,
> but memcg had used it as a flag for "we need to use swap?", as the
> name indicate.
>
> And in current implementation, memcg cannot reclaim mapped file caches
> when mem+swap hits the limit.
>
> re-introduce may_swap flag and handle it at get_scan_ratio().
> This patch doesn't influence any scan_control users other than memcg.
>
> Signed-off-by: KOSAKI Motohiro <[email protected]>
> Signed-off-by: Daisuke Nishimura <[email protected]>
> --
> mm/vmscan.c | 12 ++++++++++--
> 1 files changed, 10 insertions(+), 2 deletions(-)
>
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index 3be6157..00ea4a1 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -63,6 +63,9 @@ struct scan_control {
> /* Can mapped pages be reclaimed? */
> int may_unmap;
>
> + /* Can pages be swapped as part of reclaim? */
> + int may_swap;
> +
> /* This context's SWAP_CLUSTER_MAX. If freeing memory for
> * suspend, we effectively ignore SWAP_CLUSTER_MAX.
> * In this context, it doesn't matter that we scan the
> @@ -1379,7 +1382,7 @@ static void get_scan_ratio(struct zone *zone, struct scan_control *sc,
> struct zone_reclaim_stat *reclaim_stat = get_reclaim_stat(zone, sc);
>
> /* If we have no swap space, do not bother scanning anon pages. */
> - if (nr_swap_pages <= 0) {
> + if (!sc->may_swap || (nr_swap_pages <= 0)) {
> percent[0] = 0;
> percent[1] = 100;
> return;
> @@ -1695,6 +1698,7 @@ unsigned long try_to_free_pages(struct zonelist *zonelist, int order,
> .may_writepage = !laptop_mode,
> .swap_cluster_max = SWAP_CLUSTER_MAX,
> .may_unmap = 1,
> + .may_swap = 1,
> .swappiness = vm_swappiness,
> .order = order,
> .mem_cgroup = NULL,
> @@ -1714,6 +1718,7 @@ unsigned long try_to_free_mem_cgroup_pages(struct mem_cgroup *mem_cont,
> struct scan_control sc = {
> .may_writepage = !laptop_mode,
> .may_unmap = 1,
> + .may_swap = 1,
> .swap_cluster_max = SWAP_CLUSTER_MAX,
> .swappiness = swappiness,
> .order = 0,
> @@ -1723,7 +1728,7 @@ unsigned long try_to_free_mem_cgroup_pages(struct mem_cgroup *mem_cont,
> struct zonelist *zonelist;
>
> if (noswap)
> - sc.may_unmap = 0;
> + sc.may_swap = 0;
>
> sc.gfp_mask = (gfp_mask & GFP_RECLAIM_MASK) |
> (GFP_HIGHUSER_MOVABLE & ~GFP_RECLAIM_MASK);
> @@ -1763,6 +1768,7 @@ static unsigned long balance_pgdat(pg_data_t *pgdat, int order)
> struct scan_control sc = {
> .gfp_mask = GFP_KERNEL,
> .may_unmap = 1,
> + .may_swap = 1,
> .swap_cluster_max = SWAP_CLUSTER_MAX,
> .swappiness = vm_swappiness,
> .order = order,
> @@ -2109,6 +2115,7 @@ unsigned long shrink_all_memory(unsigned long nr_pages)
> struct scan_control sc = {
> .gfp_mask = GFP_KERNEL,
> .may_unmap = 0,
> + .may_swap = 1,
> .swap_cluster_max = nr_pages,
> .may_writepage = 1,
> .isolate_pages = isolate_pages_global,
> @@ -2289,6 +2296,7 @@ static int __zone_reclaim(struct zone *zone, gfp_t gfp_mask, unsigned int order)
> struct scan_control sc = {
> .may_writepage = !!(zone_reclaim_mode & RECLAIM_WRITE),
> .may_unmap = !!(zone_reclaim_mode & RECLAIM_SWAP),
> + .may_swap = 1,
> .swap_cluster_max = max_t(unsigned long, nr_pages,
> SWAP_CLUSTER_MAX),
> .gfp_mask = gfp_mask,
>
>
>
>

2009-03-31 01:26:32

by Minchan Kim

Subject: Re: [PATCH] vmscan: memcg needs may_swap (Re: [patch] vmscan: rename sc.may_swap to may_unmap)

Hi,

On Mon, Mar 30, 2009 at 8:45 AM, KOSAKI Motohiro
<[email protected]> wrote:
>> On Fri, 27 Mar 2009 15:19:26 +0900
>> Daisuke Nishimura <[email protected]> wrote:
>>
>> > Added
>> > Cc: KAMEZAWA Hiroyuki <[email protected]>
>> > Cc: Balbir Singh <[email protected]>
>> >
>> > I'm sorry for replying to a very old mail.
>> >
>> > > @@ -1713,7 +1713,7 @@ unsigned long try_to_free_mem_cgroup_pag
>> > > {
>> > > struct scan_control sc = {
>> > > .may_writepage = !laptop_mode,
>> > > - .may_swap = 1,
>> > > + .may_unmap = 1,
>> > > .swap_cluster_max = SWAP_CLUSTER_MAX,
>> > > .swappiness = swappiness,
>> > > .order = 0,
>> > > @@ -1723,7 +1723,7 @@ unsigned long try_to_free_mem_cgroup_pag
>> > > struct zonelist *zonelist;
>> > >
>> > > if (noswap)
>> > > - sc.may_swap = 0;
>> > > + sc.may_unmap = 0;
>> > >
>> > > sc.gfp_mask = (gfp_mask & GFP_RECLAIM_MASK) |
>> > > (GFP_HIGHUSER_MOVABLE & ~GFP_RECLAIM_MASK);
>> > IIUC, memcg had used may_swap as a flag for "we need to use swap?" as the name indicate.
>> >
>> > Because, when mem+swap hits the limit, trying to swapout pages is meaningless
>> > as it doesn't change mem+swap usage.
>> >
>> Good catch... sigh, I missed this discussion.
>>
>>
>>
>> > What do you think of this patch?
>> > ===
>> > From: Daisuke Nishimura <[email protected]>
>> >
>> > vmscan-rename-scmay_swap-to-may_unmap.patch removed may_swap flag,
>> > but memcg had used it as a flag for "we need to use swap?", as the
>> > name indicate.
>> >
>> > And in current implementation, memcg cannot reclaim mapped file caches
>> > when mem+swap hits the limit.
>> >
>> When mem+swap hits the limit, swap-out anonymous page doesn't reduce the
>> amount of usage of mem+swap, so, swap-out should be avoided.
>>
>> > re-introduce may_swap flag and handle it at shrink_page_list.
>> >
>> > This patch doesn't influence any scan_control users other than memcg.
>> >
>>
>>
>> > Signed-off-by: Daisuke Nishimura <[email protected]>
>>
>> Seems good,
>> Reviewed-by: KAMEZAWA Hiroyuki <[email protected]>
>>
>> But hum....Maybe this lru scan work in the same way as the case
>> of !total_swap_pages. (means don't scan anon LRU.)
>> revisit this later.
>
> Well, How about following patch?
>
> So, I have to agree my judgement of may_unmap was wrong.
> You explain memcg can use may_swap instead may_unmap. and I think
> other may_unmap user (zone_reclaim and shrink_all_list) can convert
> may_unmap code to may_swap.
>
> IOW, Nishimura-san, you explain we can remove the branch of the may_unmap
> from shrink_page_list().
> it's really good job. thanks!
>
>
> ========
> Subject: vmscan: reintroduce sc->may_swap
>
> vmscan-rename-scmay_swap-to-may_unmap.patch removed may_swap flag,
> but memcg had used it as a flag for "we need to use swap?", as the
> name indicate.
>
> And in current implementation, memcg cannot reclaim mapped file caches
> when mem+swap hits the limit.
>
> re-introduce may_swap flag and handle it at get_scan_ratio().
> This patch doesn't influence any scan_control users other than memcg.
>
> Signed-off-by: KOSAKI Motohiro <[email protected]>
> Signed-off-by: Daisuke Nishimura <[email protected]>
> --
> mm/vmscan.c | 12 ++++++++++--
> 1 files changed, 10 insertions(+), 2 deletions(-)
>
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index 3be6157..00ea4a1 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -63,6 +63,9 @@ struct scan_control {
> /* Can mapped pages be reclaimed? */
> int may_unmap;
>
> + /* Can pages be swapped as part of reclaim? */
> + int may_swap;
> +

Sorry for the late response.
I don't know memcg well.

Until now, memcg managed to share may_swap with global page reclaim.
I think that worked because may_swap could represent both meanings.
Do we really need separate variables?

How about using a union?
---

struct scan_control {
/* Incremented by the number of inactive pages that were scanned */
unsigned long nr_scanned;
...
union {
int may_swap; /* memcg: Can pages be swapped as part of reclaim? */
int may_unmap; /* global: Can mapped pages be reclaimed? */
};
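
One consequence of the union is worth spelling out: both names share the same storage, so clearing may_swap for memcg also clears may_unmap. A minimal sketch, assuming a C11 compiler for the anonymous union; the struct and helper names are hypothetical.

```c
/* Hypothetical struct, reduced to the proposed union. */
struct scan_control_model {
	union {
		int may_swap;	/* memcg: can pages be swapped as part of reclaim? */
		int may_unmap;	/* global: can mapped pages be reclaimed? */
	};
};

/* Mirrors the "if (noswap) sc.may_swap = 0;" assignment from the patch. */
static void apply_noswap(struct scan_control_model *sc, int noswap)
{
	if (noswap)
		sc->may_swap = 0;	/* also zeroes may_unmap: same storage */
}
```

If shrink_page_list() keeps its may_unmap check, a memcg reclaim with noswap set would read may_unmap as 0 through the union and would skip mapped file pages again.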

> /* This context's SWAP_CLUSTER_MAX. If freeing memory for
> * suspend, we effectively ignore SWAP_CLUSTER_MAX.
> * In this context, it doesn't matter that we scan the
> @@ -1379,7 +1382,7 @@ static void get_scan_ratio(struct zone *zone, struct scan_control *sc,
> struct zone_reclaim_stat *reclaim_stat = get_reclaim_stat(zone, sc);
>
> /* If we have no swap space, do not bother scanning anon pages. */
> - if (nr_swap_pages <= 0) {
> + if (!sc->may_swap || (nr_swap_pages <= 0)) {
> percent[0] = 0;
> percent[1] = 100;
> return;
> @@ -1695,6 +1698,7 @@ unsigned long try_to_free_pages(struct zonelist *zonelist, int order,
> .may_writepage = !laptop_mode,
> .swap_cluster_max = SWAP_CLUSTER_MAX,
> .may_unmap = 1,
> + .may_swap = 1,
> .swappiness = vm_swappiness,
> .order = order,
> .mem_cgroup = NULL,
> @@ -1714,6 +1718,7 @@ unsigned long try_to_free_mem_cgroup_pages(struct mem_cgroup *mem_cont,
> struct scan_control sc = {
> .may_writepage = !laptop_mode,
> .may_unmap = 1,
> + .may_swap = 1,
> .swap_cluster_max = SWAP_CLUSTER_MAX,
> .swappiness = swappiness,
> .order = 0,
> @@ -1723,7 +1728,7 @@ unsigned long try_to_free_mem_cgroup_pages(struct mem_cgroup *mem_cont,
> struct zonelist *zonelist;
>
> if (noswap)
> - sc.may_unmap = 0;
> + sc.may_swap = 0;
>
> sc.gfp_mask = (gfp_mask & GFP_RECLAIM_MASK) |
> (GFP_HIGHUSER_MOVABLE & ~GFP_RECLAIM_MASK);
> @@ -1763,6 +1768,7 @@ static unsigned long balance_pgdat(pg_data_t *pgdat, int order)
> struct scan_control sc = {
> .gfp_mask = GFP_KERNEL,
> .may_unmap = 1,
> + .may_swap = 1,
> .swap_cluster_max = SWAP_CLUSTER_MAX,
> .swappiness = vm_swappiness,
> .order = order,
> @@ -2109,6 +2115,7 @@ unsigned long shrink_all_memory(unsigned long nr_pages)
> struct scan_control sc = {
> .gfp_mask = GFP_KERNEL,
> .may_unmap = 0,
> + .may_swap = 1,
> .swap_cluster_max = nr_pages,
> .may_writepage = 1,
> .isolate_pages = isolate_pages_global,
> @@ -2289,6 +2296,7 @@ static int __zone_reclaim(struct zone *zone, gfp_t gfp_mask, unsigned int order)
> struct scan_control sc = {
> .may_writepage = !!(zone_reclaim_mode & RECLAIM_WRITE),
> .may_unmap = !!(zone_reclaim_mode & RECLAIM_SWAP),
> + .may_swap = 1,
> .swap_cluster_max = max_t(unsigned long, nr_pages,
> SWAP_CLUSTER_MAX),
> .gfp_mask = gfp_mask,
>

--
Kind regards,
Minchan Kim

2009-03-31 01:44:22

by Kamezawa Hiroyuki

Subject: Re: [PATCH] vmscan: memcg needs may_swap (Re: [patch] vmscan: rename sc.may_swap to may_unmap)

On Tue, 31 Mar 2009 10:26:17 +0900
Minchan Kim <[email protected]> wrote:

> > diff --git a/mm/vmscan.c b/mm/vmscan.c
> > index 3be6157..00ea4a1 100644
> > --- a/mm/vmscan.c
> > +++ b/mm/vmscan.c
> > @@ -63,6 +63,9 @@ struct scan_control {
> > /* Can mapped pages be reclaimed? */
> > int may_unmap;
> >
> > + /* Can pages be swapped as part of reclaim? */
> > + int may_swap;
> > +
>
> Sorry for too late response.
> I don't know memcg well.
>
> The memcg managed to use may_swap well with global page reclaim until now.
> I think that was because may_swap can represent both meaning.
> Do we need each variables really ?
>
> How about using union variable ?

Or just remove one of them?

Thanks,
-Kame

> ---
>
> struct scan_control {
> /* Incremented by the number of inactive pages that were scanned */
> unsigned long nr_scanned;
> ...
> union {
> int may_swap; /* memcg: Can pages be swapped as part of reclaim? */
> int may_unmap; /* global: Can mapped pages be reclaimed? */
> };
>
>
>
> > /* This context's SWAP_CLUSTER_MAX. If freeing memory for
> > * suspend, we effectively ignore SWAP_CLUSTER_MAX.
> > * In this context, it doesn't matter that we scan the
> > @@ -1379,7 +1382,7 @@ static void get_scan_ratio(struct zone *zone, struct scan_control *sc,
> > struct zone_reclaim_stat *reclaim_stat = get_reclaim_stat(zone, sc);
> >
> > /* If we have no swap space, do not bother scanning anon pages. */
> > - if (nr_swap_pages <= 0) {
> > + if (!sc->may_swap || (nr_swap_pages <= 0)) {
> > percent[0] = 0;
> > percent[1] = 100;
> > return;
> > @@ -1695,6 +1698,7 @@ unsigned long try_to_free_pages(struct zonelist *zonelist, int order,
> > .may_writepage = !laptop_mode,
> > .swap_cluster_max = SWAP_CLUSTER_MAX,
> > .may_unmap = 1,
> > + .may_swap = 1,
> > .swappiness = vm_swappiness,
> > .order = order,
> > .mem_cgroup = NULL,
> > @@ -1714,6 +1718,7 @@ unsigned long try_to_free_mem_cgroup_pages(struct mem_cgroup *mem_cont,
> > struct scan_control sc = {
> > .may_writepage = !laptop_mode,
> > .may_unmap = 1,
> > + .may_swap = 1,
> > .swap_cluster_max = SWAP_CLUSTER_MAX,
> > .swappiness = swappiness,
> > .order = 0,
> > @@ -1723,7 +1728,7 @@ unsigned long try_to_free_mem_cgroup_pages(struct mem_cgroup *mem_cont,
> > struct zonelist *zonelist;
> >
> > if (noswap)
> > - sc.may_unmap = 0;
> > + sc.may_swap = 0;
> >
> > sc.gfp_mask = (gfp_mask & GFP_RECLAIM_MASK) |
> > (GFP_HIGHUSER_MOVABLE & ~GFP_RECLAIM_MASK);
> > @@ -1763,6 +1768,7 @@ static unsigned long balance_pgdat(pg_data_t *pgdat, int order)
> > struct scan_control sc = {
> > .gfp_mask = GFP_KERNEL,
> > .may_unmap = 1,
> > + .may_swap = 1,
> > .swap_cluster_max = SWAP_CLUSTER_MAX,
> > .swappiness = vm_swappiness,
> > .order = order,
> > @@ -2109,6 +2115,7 @@ unsigned long shrink_all_memory(unsigned long nr_pages)
> > struct scan_control sc = {
> > .gfp_mask = GFP_KERNEL,
> > .may_unmap = 0,
> > + .may_swap = 1,
> > .swap_cluster_max = nr_pages,
> > .may_writepage = 1,
> > .isolate_pages = isolate_pages_global,
> > @@ -2289,6 +2296,7 @@ static int __zone_reclaim(struct zone *zone, gfp_t gfp_mask, unsigned int order)
> > struct scan_control sc = {
> > .may_writepage = !!(zone_reclaim_mode & RECLAIM_WRITE),
> > .may_unmap = !!(zone_reclaim_mode & RECLAIM_SWAP),
> > + .may_swap = 1,
> > .swap_cluster_max = max_t(unsigned long, nr_pages,
> > SWAP_CLUSTER_MAX),
> > .gfp_mask = gfp_mask,
> >
> >
> >
> >
> >
>
>
>
> --
> Kind regards,
> Minchan Kim
>

2009-03-31 01:48:48

by KOSAKI Motohiro

Subject: Re: [PATCH] vmscan: memcg needs may_swap (Re: [patch] vmscan: rename sc.may_swap to may_unmap)

> > Sorry for too late response.
> > I don't know memcg well.
> >
> > The memcg managed to use may_swap well with global page reclaim until now.
> > I think that was because may_swap can represent both meaning.
> > Do we need each variables really ?
> >
> > How about using union variable ?
>
> or Just removing one of them ?

I hope all may_unmap users will be converted to may_swap.
may_swap is more efficient and has a cleaner meaning.

2009-03-31 02:03:51

by Daisuke Nishimura

Subject: Re: [PATCH] vmscan: memcg needs may_swap (Re: [patch] vmscan: rename sc.may_swap to may_unmap)

Hi,

> > ========
> > Subject: vmscan: reintroduce sc->may_swap
> >
> > vmscan-rename-scmay_swap-to-may_unmap.patch removed may_swap flag,
> > but memcg had used it as a flag for "we need to use swap?", as the
> > name indicate.
> >
> > And in current implementation, memcg cannot reclaim mapped file caches
> > when mem+swap hits the limit.
> >
> > re-introduce may_swap flag and handle it at get_scan_ratio().
> > This patch doesn't influence any scan_control users other than memcg.
> >
> > Signed-off-by: KOSAKI Motohiro <[email protected]>
> > Signed-off-by: Daisuke Nishimura <[email protected]>
> > --
> > mm/vmscan.c | 12 ++++++++++--
> > 1 files changed, 10 insertions(+), 2 deletions(-)
> >
> > diff --git a/mm/vmscan.c b/mm/vmscan.c
> > index 3be6157..00ea4a1 100644
> > --- a/mm/vmscan.c
> > +++ b/mm/vmscan.c
> > @@ -63,6 +63,9 @@ struct scan_control {
> > /* Can mapped pages be reclaimed? */
> > int may_unmap;
> >
> > + /* Can pages be swapped as part of reclaim? */
> > + int may_swap;
> > +
>
> Sorry for too late response.
> I don't know memcg well.
>
> The memcg managed to use may_swap well with global page reclaim until now.
memcg had a bug: it could not reclaim mapped file caches when it hit
the mem+swap limit :(

Thanks,
Daisuke Nishimura.

2009-04-01 04:11:49

by Johannes Weiner

Subject: Re: [PATCH] vmscan: memcg needs may_swap (Re: [patch] vmscan: rename sc.may_swap to may_unmap)

On Tue, Mar 31, 2009 at 10:48:32AM +0900, KOSAKI Motohiro wrote:
> > > Sorry for too late response.
> > > I don't know memcg well.
> > >
> > > The memcg managed to use may_swap well with global page reclaim until now.
> > > I think that was because may_swap can represent both meaning.
> > > Do we need each variables really ?
> > >
> > > How about using union variable ?
> >
> > or Just removing one of them ?
>
> I hope all may_unmap user convert to using may_swap.
> may_swap is more efficient and cleaner meaning.

How about making may_swap mean the following:

@@ -642,6 +639,8 @@ static unsigned long shrink_page_list(st
* Try to allocate it some swap space here.
*/
if (PageAnon(page) && !PageSwapCache(page)) {
+ if (!sc->may_swap)
+ goto keep_locked;
if (!(sc->gfp_mask & __GFP_IO))
goto keep_locked;
if (!add_to_swap(page))

try_to_free_pages() always sets it.

try_to_free_mem_cgroup_pages() sets it depending on whether it really
wants swapping, and only swapping, right? But the above would still
reclaim already swapped anon pages and I don't know the memory
controller.

balance_pgdat() always sets it.

__zone_reclaim() sets it depending on zone_reclaim_mode. The
RECLAIM_SWAP bit of this field and its documentation in
Documentation/sysctl/vm.txt suggest it also really only means swap.

shrink_all_memory() would be the sole user of may_unmap because it
really wants to eat cache first. But this could be figured out on a
different occasion.

Hannes
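
The per-caller settings described in this message can be summarized as a small user-space model. This is only a sketch of the proposal, not kernel code: struct model_scan_control, enum model_caller, model_setup() and MODEL_RECLAIM_SWAP are hypothetical names introduced for illustration, and the shrink_all_memory case only models its initial "eat cache first" setting.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the RECLAIM_SWAP bit of zone_reclaim_mode. */
#define MODEL_RECLAIM_SWAP 0x4

/* Reduced model of the two scan_control flags under discussion. */
struct model_scan_control {
	bool may_unmap;
	bool may_swap;
};

enum model_caller {
	TRY_TO_FREE_PAGES,
	MEMCG_RECLAIM,
	BALANCE_PGDAT,
	ZONE_RECLAIM,
	SHRINK_ALL_MEMORY,
};

/* Flags each caller would start with if may_swap meant "may swap out". */
static struct model_scan_control model_setup(enum model_caller who,
					     bool noswap,
					     int zone_reclaim_mode)
{
	struct model_scan_control sc = { .may_unmap = true, .may_swap = true };

	switch (who) {
	case TRY_TO_FREE_PAGES:
	case BALANCE_PGDAT:
		/* always allowed to swap */
		break;
	case MEMCG_RECLAIM:
		/* depends on whether memcg really wants swapping */
		sc.may_swap = !noswap;
		break;
	case ZONE_RECLAIM:
		/* depends on the RECLAIM_SWAP bit of zone_reclaim_mode */
		sc.may_swap = (zone_reclaim_mode & MODEL_RECLAIM_SWAP) != 0;
		break;
	case SHRINK_ALL_MEMORY:
		/* sole user of may_unmap: wants to eat cache first */
		sc.may_unmap = false;
		break;
	}
	return sc;
}

/* Usage example: only shrink_all_memory ever clears may_unmap. */
static void demo_model_setup(void)
{
	assert(model_setup(MEMCG_RECLAIM, true, 0).may_unmap);
	assert(!model_setup(MEMCG_RECLAIM, true, 0).may_swap);
	assert(!model_setup(SHRINK_ALL_MEMORY, false, 0).may_unmap);
}
```

In this model a memcg reclaim with noswap set can still unmap mapped file cache, which is the behavior the thread is trying to restore.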

2009-04-01 05:22:15

by Daisuke Nishimura

Subject: Re: [PATCH] vmscan: memcg needs may_swap (Re: [patch] vmscan: rename sc.may_swap to may_unmap)

On Wed, 1 Apr 2009 06:09:51 +0200, Johannes Weiner <[email protected]> wrote:
> On Tue, Mar 31, 2009 at 10:48:32AM +0900, KOSAKI Motohiro wrote:
> > > > Sorry for too late response.
> > > > I don't know memcg well.
> > > >
> > > > The memcg managed to use may_swap well with global page reclaim until now.
> > > > I think that was because may_swap can represent both meaning.
> > > > Do we need each variables really ?
> > > >
> > > > How about using union variable ?
> > >
> > > or Just removing one of them ?
> >
> > I hope all may_unmap user convert to using may_swap.
> > may_swap is more efficient and cleaner meaning.
>
> How about making may_swap mean the following:
>
> @@ -642,6 +639,8 @@ static unsigned long shrink_page_list(st
> * Try to allocate it some swap space here.
> */
> if (PageAnon(page) && !PageSwapCache(page)) {
> + if (!sc->may_swap)
> + goto keep_locked;
> if (!(sc->gfp_mask & __GFP_IO))
> goto keep_locked;
> if (!add_to_swap(page))
>
But it doesn't work for shmem/tmpfs, does it?
That is why my first patch did it like this:

@@ -616,6 +619,11 @@ static unsigned long shrink_page_list(struct list_head *page_list,
if (!sc->may_unmap && page_mapped(page))
goto keep_locked;

+ if (!sc->may_swap && PageSwapBacked(page)
+ /* SwapCache uses 'swap' already */
+ && !PageSwapCache(page))
+ goto keep_locked;
+
/* Double the slab pressure for mapped and swapcache pages */
if (page_mapped(page) || PageSwapCache(page))
sc->nr_scanned++;

> try_to_free_pages() always sets it.
>
> try_to_free_mem_cgroup_pages() sets it depending on whether it really
> wants swapping, and only swapping, right?
Right.

> But the above would still reclaim already swapped anon pages
Then it would be better to add a check in shrink_page_list() anyway.

Kosaki-san, what do you think?

Thanks,
Daisuke Nishimura.

> and I don't know the memory
> controller.
>
> balance_pgdat() always sets it.
>
> __zone_reclaim() sets it depending on zone_reclaim_mode. The
> RECLAIM_SWAP bit of this field and its documentation in
> Documentation/sysctl/vm.txt suggests it also really only means swap.
>
> shrink_all_memory() would be the sole user of may_unmap because it
> really wants to eat cache first. But this could be figured out on a
> different occasion.
>
> Hannes
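
The difference between the two checks discussed in this message can be sketched with a user-space model of the page flags. This is an illustration under assumptions, not kernel code: struct model_page and both helper functions are hypothetical, and the model only answers "would this check keep the page on the list when may_swap is 0?".

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-in for the page state tested in shrink_page_list(). */
struct model_page {
	bool anon;          /* models PageAnon() */
	bool swap_backed;   /* models PageSwapBacked(): anon and shmem/tmpfs */
	bool in_swap_cache; /* models PageSwapCache() */
};

/* Check inside the add_to_swap() branch: covers anon pages only. */
static bool kept_by_anon_check(const struct model_page *p, bool may_swap)
{
	return !may_swap && p->anon && !p->in_swap_cache;
}

/* Check from the first patch: covers every swap-backed page. */
static bool kept_by_swapbacked_check(const struct model_page *p, bool may_swap)
{
	return !may_swap && p->swap_backed && !p->in_swap_cache;
}

/* Usage example: a shmem/tmpfs page is swap-backed but not anonymous. */
static void demo_shmem_difference(void)
{
	struct model_page shmem = { .anon = false, .swap_backed = true,
				    .in_swap_cache = false };
	struct model_page anon = { .anon = true, .swap_backed = true,
				   .in_swap_cache = false };

	/* Both checks hold back a plain anon page. */
	assert(kept_by_anon_check(&anon, false));
	assert(kept_by_swapbacked_check(&anon, false));

	/* Only the swap-backed check holds back the shmem page. */
	assert(!kept_by_anon_check(&shmem, false));
	assert(kept_by_swapbacked_check(&shmem, false));
}
```

Neither check holds back a page that is already in the swap cache, which matches the remark that such pages would still be reclaimed.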

2009-04-01 09:06:39

by Kamezawa Hiroyuki

Subject: Re: [PATCH] vmscan: memcg needs may_swap (Re: [patch] vmscan: rename sc.may_swap to may_unmap)

On Wed, 1 Apr 2009 06:09:51 +0200
Johannes Weiner <[email protected]> wrote:

> On Tue, Mar 31, 2009 at 10:48:32AM +0900, KOSAKI Motohiro wrote:
> > > > Sorry for too late response.
> > > > I don't know memcg well.
> > > >
> > > > The memcg managed to use may_swap well with global page reclaim until now.
> > > > I think that was because may_swap can represent both meaning.
> > > > Do we need each variables really ?
> > > >
> > > > How about using union variable ?
> > >
> > > or Just removing one of them ?
> >
> > I hope all may_unmap user convert to using may_swap.
> > may_swap is more efficient and cleaner meaning.
>
> How about making may_swap mean the following:
>
> @@ -642,6 +639,8 @@ static unsigned long shrink_page_list(st
> * Try to allocate it some swap space here.
> */
> if (PageAnon(page) && !PageSwapCache(page)) {
> + if (!sc->may_swap)
> + goto keep_locked;
> if (!(sc->gfp_mask & __GFP_IO))
> goto keep_locked;
> if (!add_to_swap(page))
>
> try_to_free_pages() always sets it.
>
What is the advantage over _not_ scanning the ANON LRU at all?

> try_to_free_mem_cgroup_pages() sets it depending on whether it really
> wants swapping, and only swapping, right? But the above would still
> reclaim already swapped anon pages and I don't know the memory
> controller.
>
The memory cgroup calls into this shrink_zone path in 2 cases:
1. memory usage hits the limit.
2. mem+swap usage hits the limit.

In case 2, swap-out doesn't decrease mem+swap usage, so memcg sets may_swap=0.
That means we want to evict only file caches.
But reclaiming file cache, including "unmap file cache and reclaim it!", is
still necessary even if may_swap=0.

So scanning only the FILE LRU makes sense at may_swap=0 *if* memcg is
the only user of may_swap=0.

Let's look at the other users.

- __zone_reclaim sets may_unmap to 0 when it doesn't want swap-out.
.....this can be replaced with may_swap.

- shrink_all_memory sets may_swap to 0. Is this called by hibernation?
If you don't want to unmap file caches during hibernation, adding may_unmap
as a *new* parameter makes sense, I think.

The change you proposed is for dropping unused SwapCache pages, right?
But those will be dropped by kswapd if necessary.

As far as memcg is concerned, scanning the ANON LRU when may_swap=0 is just
a waste of CPU time.

Thanks,
-Kame
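
The idea of not scanning the ANON LRU at all when may_swap=0 can be sketched as a scan-ratio decision, in the spirit of the get_scan_ratio() handling mentioned earlier in the thread. This is a user-space model under assumptions: struct model_scan_control and model_scan_ratio() are hypothetical names, and the proportional split used when swapping is allowed is a placeholder, not the kernel's actual heuristic.

```c
#include <assert.h>
#include <stdbool.h>

/* Reduced model of the two scan_control flags under discussion. */
struct model_scan_control {
	bool may_unmap;
	bool may_swap;
};

/*
 * percent[0] is the share of scanning given to the ANON LRU,
 * percent[1] the share given to the FILE LRU.
 */
static void model_scan_ratio(const struct model_scan_control *sc,
			     unsigned int anon_pressure,
			     unsigned int file_pressure,
			     unsigned int percent[2])
{
	unsigned int total;

	/* Swap-out would not help: scan only the FILE LRU. */
	if (!sc->may_swap) {
		percent[0] = 0;
		percent[1] = 100;
		return;
	}

	/* Placeholder heuristic: split in proportion to the given pressure. */
	total = anon_pressure + file_pressure;
	if (total == 0) {
		percent[0] = 50;
		percent[1] = 50;
		return;
	}
	percent[0] = 100 * anon_pressure / total;
	percent[1] = 100 - percent[0];
	assert(percent[0] + percent[1] == 100);
}
```

With may_swap=0 the anon share is always 0 regardless of pressure, so no CPU time is spent walking anon pages, while may_unmap is left untouched and mapped file cache can still be unmapped and reclaimed.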

2009-04-01 09:11:51

by KOSAKI Motohiro

Subject: Re: [PATCH] vmscan: memcg needs may_swap (Re: [patch] vmscan: rename sc.may_swap to may_unmap)

> memory cgroup has 2 calls to this shrink_zone.
> 1. memory usage hits the limit.
> 2. mem+swap usage hits the limit.
>
> At "2", swap-out doesn't decrease the usage of mem+swap, then set may_swap=0.
> So, we want to kick out only file caches.
> But, we can reclaim file cache and "unmap file cache and reclaim it!" is
> necessary even if may_swap=0.
>
> Then, scanning only FILE LRU makes sense at may_swap=0 *if* memcg is
> the only user of may_swap=0.
>
> Let's see others.
>
> - __zone_reclaim sets may_unmap to be 0 when they don't want swap-out.
> .....can be replaced with may_swap.
>
> - shrink_all_memory sets may_swap to be 0. Is this called by hibernation ?
> If you don't want to unmap file caches while hibernation, adding may_unmap
> as *new* parameter makes sense, I think.
>
> The change you proposed is for dropping unused SwapCache pages. Right ?
> But this will be dropped by kswapd if necessary.
>
> As far as memcg concerns, scanning ANON LRU even when may_swap=0 is just
> a waste of cpu time.

This sentence explains exactly what I intended.

1. For memcg and zone_reclaim, scanning the ANON LRU is just a waste of CPU.
2. kswapd and normal direct reclaim can reclaim stale swapcache anyway,
so the above trick doesn't cause any system hang-up or performance regression.

2009-04-01 09:51:55

by Johannes Weiner

Subject: Re: [PATCH] vmscan: memcg needs may_swap (Re: [patch] vmscan: rename sc.may_swap to may_unmap)

On Wed, Apr 01, 2009 at 06:04:45PM +0900, KAMEZAWA Hiroyuki wrote:
> On Wed, 1 Apr 2009 06:09:51 +0200
> Johannes Weiner <[email protected]> wrote:
>
> > On Tue, Mar 31, 2009 at 10:48:32AM +0900, KOSAKI Motohiro wrote:
> > > > > Sorry for too late response.
> > > > > I don't know memcg well.
> > > > >
> > > > > The memcg managed to use may_swap well with global page reclaim until now.
> > > > > I think that was because may_swap can represent both meaning.
> > > > > Do we need each variables really ?
> > > > >
> > > > > How about using union variable ?
> > > >
> > > > or Just removing one of them ?
> > >
> > > I hope all may_unmap user convert to using may_swap.
> > > may_swap is more efficient and cleaner meaning.
> >
> > How about making may_swap mean the following:
> >
> > @@ -642,6 +639,8 @@ static unsigned long shrink_page_list(st
> > * Try to allocate it some swap space here.
> > */
> > if (PageAnon(page) && !PageSwapCache(page)) {
> > + if (!sc->may_swap)
> > + goto keep_locked;
> > if (!(sc->gfp_mask & __GFP_IO))
> > goto keep_locked;
> > if (!add_to_swap(page))
> >
> > try_to_free_pages() always sets it.
> >
> What is the advantage than _not_ scanning ANON LRU at all ?

I thought we could collect anon pages that don't need swap io.

> > try_to_free_mem_cgroup_pages() sets it depending on whether it really
> > wants swapping, and only swapping, right? But the above would still
> > reclaim already swapped anon pages and I don't know the memory
> > controller.
> >
> memory cgroup has 2 calls to this shrink_zone.
> 1. memory usage hits the limit.
> 2. mem+swap usage hits the limit.
>
> At "2", swap-out doesn't decrease the usage of mem+swap, then set may_swap=0.
> So, we want to kick out only file caches.
> But, we can reclaim file cache and "unmap file cache and reclaim it!" is
> necessary even if may_swap=0.

Yes.

> Then, scanning only FILE LRU makes sense at may_swap=0 *if* memcg is
> the only user of may_swap=0.
>
> Let's see others.
>
> - __zone_reclaim sets may_unmap to be 0 when they don't want swap-out.
> .....can be replaced with may_swap.
>
> - shrink_all_memory sets may_swap to be 0. Is this called by hibernation ?
> If you don't want to unmap file caches while hibernation, adding may_unmap
> as *new* parameter makes sense, I think.

Yep, that was my idea too. At least for now and then reevaluate
whether it shouldn't just reclaim in lru order without this flag...

> The change you proposed is for dropping unused SwapCache pages. Right ?
> But this will be dropped by kswapd if necessary.
>
> As far as memcg concerns, scanning ANON LRU even when may_swap=0 is just
> a waste of cpu time.

Okay.

2009-04-01 09:56:13

by KOSAKI Motohiro

Subject: Re: [PATCH] vmscan: memcg needs may_swap (Re: [patch] vmscan: rename sc.may_swap to may_unmap)

> > > How about making may_swap mean the following:
> > >
> > > @@ -642,6 +639,8 @@ static unsigned long shrink_page_list(st
> > > * Try to allocate it some swap space here.
> > > */
> > > if (PageAnon(page) && !PageSwapCache(page)) {
> > > + if (!sc->may_swap)
> > > + goto keep_locked;
> > > if (!(sc->gfp_mask & __GFP_IO))
> > > goto keep_locked;
> > > if (!add_to_swap(page))
> > >
> > > try_to_free_pages() always sets it.
> > >
> > What is the advantage than _not_ scanning ANON LRU at all ?
>
> I thought we could collect anon pages that don't need swap io.

Yes, but is this important?
If memcg reclaim doesn't collect stale swapcache, global reclaim can.

Am I missing any viewpoint?

2009-04-01 16:05:40

by Johannes Weiner

Subject: Re: [PATCH] vmscan: memcg needs may_swap (Re: [patch] vmscan: rename sc.may_swap to may_unmap)

On Wed, Apr 01, 2009 at 06:55:45PM +0900, KOSAKI Motohiro wrote:
> > > > How about making may_swap mean the following:
> > > >
> > > > @@ -642,6 +639,8 @@ static unsigned long shrink_page_list(st
> > > > * Try to allocate it some swap space here.
> > > > */
> > > > if (PageAnon(page) && !PageSwapCache(page)) {
> > > > + if (!sc->may_swap)
> > > > + goto keep_locked;
> > > > if (!(sc->gfp_mask & __GFP_IO))
> > > > goto keep_locked;
> > > > if (!add_to_swap(page))
> > > >
> > > > try_to_free_pages() always sets it.
> > > >
> > > What is the advantage than _not_ scanning ANON LRU at all ?
> >
> > I thought we could collect anon pages that don't need swap io.
>
> Yes. but Is this important?
> if memcg reclaim don't collect stale swapcache, other global reclaim can.
>
> Am I missing any viewpoint?

Nothing I am aware of, it should work as you suggest. I just wasn't
sure about the memory controller.