2009-06-08 03:11:53

Subject: [PATCH mmotm] vmscan: fix may_swap handling for memcg

From: Daisuke Nishimura <[email protected]>

Commit 2e2e425989080cc534fc0fca154cae515f971cf5 ("vmscan,memcg: reintroduce
sc->may_swap") added the may_swap flag and handles it in get_scan_ratio().

But the result of get_scan_ratio() is ignored when priority == 0, which
means that when memcg hits the mem+swap limit, anon pages can be swapped
out in vain. Especially when memcg causes an oom by the mem+swap limit,
we can see a great many pages being swapped out.

Instead of skipping the anon lru scan completely when priority == 0, this patch adds
a hook to shrink_page_list() that handles the may_swap flag to avoid useless swap-out,
and calls try_to_free_swap() if needed, because that can reduce
both mem.usage and memsw.usage if the page (SwapCache) is no longer used.

Such unused-but-managed-under-memcg SwapCache can be created on some paths,
for example when trylock_page() fails in free_swap_cache().

Signed-off-by: Daisuke Nishimura <[email protected]>
---
mm/vmscan.c | 19 +++++++++++++++++++
1 files changed, 19 insertions(+), 0 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 2ddcfc8..d9a3f54 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -640,6 +640,25 @@ static unsigned long shrink_page_list(struct list_head *page_list,
 					referenced && page_mapping_inuse(page))
 			goto activate_locked;
 
+		if (!sc->may_swap && PageSwapBacked(page)) {
+			/* SwapCache already uses a swap entry */
+			if (!PageSwapCache(page))
+				goto keep_locked;
+			/*
+			 * From the viewpoint of memcg, may_swap is false when
+			 * memsw.usage hits the limit.
+			 * But swapping out SwapCache to disk doesn't reduce
+			 * memsw.usage, so it is a waste of time.
+			 * Call try_to_free_swap() if the page isn't used,
+			 * because it can reduce both mem.usage and memsw.usage.
+			 */
+			if (!scanning_global_lru(sc)) {
+				if (!page_mapped(page))
+					try_to_free_swap(page);
+				goto keep_locked;
+			}
+		}
+
 		/*
 		 * Anonymous process memory has backing store?
 		 * Try to allocate it some swap space here.

2009-06-08 03:21:07

by KOSAKI Motohiro

Subject: Re: [PATCH mmotm] vmscan: fix may_swap handling for memcg

Hi

> From: Daisuke Nishimura <[email protected]>
>
> Commit 2e2e425989080cc534fc0fca154cae515f971cf5 ("vmscan,memcg: reintroduce
> sc->may_swap) add may_swap flag and handle it at get_scan_ratio().
>
> But the result of get_scan_ratio() is ignored when priority == 0, and this
> means, when memcg hits the mem+swap limit, anon pages can be swapped
> just in vain. Especially when memcg causes oom by mem+swap limit,
> we can see many and many pages are swapped out.
>
> Instead of not scanning anon lru completely when priority == 0, this patch adds
> a hook to handle may_swap flag in shrink_page_list() to avoid using useless swaps,
> and calls try_to_free_swap() if needed because it can reduce
> both mem.usage and memsw.usage if the page(SwapCache) is unused anymore.
>
> Such unused-but-managed-under-memcg SwapCache can be made in some paths,
> for example trylock_page() failure in free_swap_cache().
>
> Signed-off-by: Daisuke Nishimura <[email protected]>

I think the root cause is the following branch, right?
If so, why can't we handle this issue in shrink_zone()?

---------------------------------------------------------------
static void shrink_zone(int priority, struct zone *zone,
				struct scan_control *sc)
{
	get_scan_ratio(zone, sc, percent);

	for_each_evictable_lru(l) {
		int file = is_file_lru(l);
		unsigned long scan;

		scan = zone_nr_pages(zone, sc, l);
		if (priority) {	// !!here!!
			scan >>= priority;
			scan = (scan * percent[file]) / 100;
		}

2009-06-08 06:46:17

by Daisuke Nishimura

Subject: Re: [PATCH mmotm] vmscan: fix may_swap handling for memcg

On Mon, 8 Jun 2009 12:20:54 +0900 (JST), KOSAKI Motohiro <[email protected]> wrote:
> Hi
>
Hi, thank you for your comment.

> > From: Daisuke Nishimura <[email protected]>
> >
> > Commit 2e2e425989080cc534fc0fca154cae515f971cf5 ("vmscan,memcg: reintroduce
> > sc->may_swap) add may_swap flag and handle it at get_scan_ratio().
> >
> > But the result of get_scan_ratio() is ignored when priority == 0, and this
> > means, when memcg hits the mem+swap limit, anon pages can be swapped
> > just in vain. Especially when memcg causes oom by mem+swap limit,
> > we can see many and many pages are swapped out.
> >
> > Instead of not scanning anon lru completely when priority == 0, this patch adds
> > a hook to handle may_swap flag in shrink_page_list() to avoid using useless swaps,
> > and calls try_to_free_swap() if needed because it can reduce
> > both mem.usage and memsw.usage if the page(SwapCache) is unused anymore.
> >
> > Such unused-but-managed-under-memcg SwapCache can be made in some paths,
> > for example trylock_page() failure in free_swap_cache().
> >
> > Signed-off-by: Daisuke Nishimura <[email protected]>
>
> I think root cause is following branch, right?
yes.

> if so, Why can't we handle this issue on shrink_zone()?
>
Just because priority == 0 means oom is about to happen, and I want to
avoid oom if possible.
So I thought it would be better to reclaim as many pages (memsw.usage) as possible
in this case.

>
> ---------------------------------------------------------------
> static void shrink_zone(int priority, struct zone *zone,
> struct scan_control *sc)
> {
> get_scan_ratio(zone, sc, percent);
>
> for_each_evictable_lru(l) {
> int file = is_file_lru(l);
> unsigned long scan;
>
> scan = zone_nr_pages(zone, sc, l);
> if (priority) { // !!here!!
> scan >>= priority;
> scan = (scan * percent[file]) / 100;
> }
>
>
>
>

2009-06-08 06:54:16

by KOSAKI Motohiro

Subject: Re: [PATCH mmotm] vmscan: fix may_swap handling for memcg

> On Mon, 8 Jun 2009 12:20:54 +0900 (JST), KOSAKI Motohiro <[email protected]> wrote:
> > Hi
> >
> Hi, thank you for your comment.
>
> > > From: Daisuke Nishimura <[email protected]>
> > >
> > > Commit 2e2e425989080cc534fc0fca154cae515f971cf5 ("vmscan,memcg: reintroduce
> > > sc->may_swap) add may_swap flag and handle it at get_scan_ratio().
> > >
> > > But the result of get_scan_ratio() is ignored when priority == 0, and this
> > > means, when memcg hits the mem+swap limit, anon pages can be swapped
> > > just in vain. Especially when memcg causes oom by mem+swap limit,
> > > we can see many and many pages are swapped out.
> > >
> > > Instead of not scanning anon lru completely when priority == 0, this patch adds
> > > a hook to handle may_swap flag in shrink_page_list() to avoid using useless swaps,
> > > and calls try_to_free_swap() if needed because it can reduce
> > > both mem.usage and memsw.usage if the page(SwapCache) is unused anymore.
> > >
> > > Such unused-but-managed-under-memcg SwapCache can be made in some paths,
> > > for example trylock_page() failure in free_swap_cache().
> > >
> > > Signed-off-by: Daisuke Nishimura <[email protected]>
> >
> > I think root cause is following branch, right?
> yes.
>
> > if so, Why can't we handle this issue on shrink_zone()?
> >
> Just because priority==0 means oom is about to happen and I don't
> want to see oom if possible.
> So I thought it would be better to reclaim as much pages(memsw.usage) as possible
> in this case.

Hmm...

In general, adding a new branch to shrink_page_list() is not a good idea.
It can cause a performance regression.

Plus, it is not a big problem at all. It happens only when priority == 0,
and priority == 0 definitely doesn't occur normally.
Also, reclaiming too many pages is not only a memcg issue. I don't think this
patch provides a generic solution.

Why does your test environment hit oom so frequently?

2009-06-08 08:08:27

by Daisuke Nishimura

Subject: Re: [PATCH mmotm] vmscan: fix may_swap handling for memcg

On Mon, 8 Jun 2009 15:53:50 +0900 (JST), KOSAKI Motohiro <[email protected]> wrote:
> > On Mon, 8 Jun 2009 12:20:54 +0900 (JST), KOSAKI Motohiro <[email protected]> wrote:
> > > Hi
> > >
> > Hi, thank you for your comment.
> >
> > > > From: Daisuke Nishimura <[email protected]>
> > > >
> > > > Commit 2e2e425989080cc534fc0fca154cae515f971cf5 ("vmscan,memcg: reintroduce
> > > > sc->may_swap) add may_swap flag and handle it at get_scan_ratio().
> > > >
> > > > But the result of get_scan_ratio() is ignored when priority == 0, and this
> > > > means, when memcg hits the mem+swap limit, anon pages can be swapped
> > > > just in vain. Especially when memcg causes oom by mem+swap limit,
> > > > we can see many and many pages are swapped out.
> > > >
> > > > Instead of not scanning anon lru completely when priority == 0, this patch adds
> > > > a hook to handle may_swap flag in shrink_page_list() to avoid using useless swaps,
> > > > and calls try_to_free_swap() if needed because it can reduce
> > > > both mem.usage and memsw.usage if the page(SwapCache) is unused anymore.
> > > >
> > > > Such unused-but-managed-under-memcg SwapCache can be made in some paths,
> > > > for example trylock_page() failure in free_swap_cache().
> > > >
> > > > Signed-off-by: Daisuke Nishimura <[email protected]>
> > >
> > > I think root cause is following branch, right?
> > yes.
> >
> > > if so, Why can't we handle this issue on shrink_zone()?
> > >
> > Just because priority==0 means oom is about to happen and I don't
> > want to see oom if possible.
> > So I thought it would be better to reclaim as much pages(memsw.usage) as possible
> > in this case.
>
> hmmm..
>
> In general, adding a new branch to shrink_page_list() is not a good idea.
> It can cause a performance regression.
>
> Plus, it is not a big problem at all. It happens only when priority == 0,
> and priority == 0 definitely doesn't occur normally.
But it happens under high memory pressure...

> Also, reclaiming too many pages is not only a memcg issue. I don't think this
> patch provides a generic solution.
>
Ah, you're right. It's not only a memcg issue.

>
> Why does your test environment hit oom so frequently?
>
Not so frequently :)
But I can see almost all pages being swapped out when memcg causes an oom
by memsw.limit (a waste of cpu time).
And even after Kamezawa-san's memcg-fix-behavior-under-memorylimit-equals-to-memswlimit.patch,
I can sometimes see swap usage when mem.limit == memsw.limit (also a waste of cpu time).

Thanks,
Daisuke Nishimura.

2009-06-09 07:14:49

by Daisuke Nishimura

Subject: [PATCH mmotm] vmscan: handle may_swap more strictly (Re: [PATCH mmotm] vmscan: fix may_swap handling for memcg)

> > Also, reclaiming too many pages is not only a memcg issue. I don't think this
> > patch provides a generic solution.
> >
> Ah, you're right. It's not only a memcg issue.
>
How about this one?

===
From: Daisuke Nishimura <[email protected]>

Commit 2e2e425989080cc534fc0fca154cae515f971cf5 ("vmscan,memcg: reintroduce
sc->may_swap") added the may_swap flag and handles it in get_scan_ratio().

But the result of get_scan_ratio() is ignored when priority == 0,
so the anon lru is scanned even if may_swap == 0 or nr_swap_pages == 0.
IMHO, this is not the expected behavior.

For memcg in particular, this behavior causes a great many pages to be
swapped out in vain when oom is invoked by the mem+swap limit.

This patch handles the may_swap flag more strictly.

Signed-off-by: Daisuke Nishimura <[email protected]>
---
mm/vmscan.c | 18 +++++++++---------
1 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 2ddcfc8..bacb092 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1407,13 +1407,6 @@ static void get_scan_ratio(struct zone *zone, struct scan_control *sc,
 	unsigned long ap, fp;
 	struct zone_reclaim_stat *reclaim_stat = get_reclaim_stat(zone, sc);
 
-	/* If we have no swap space, do not bother scanning anon pages. */
-	if (!sc->may_swap || (nr_swap_pages <= 0)) {
-		percent[0] = 0;
-		percent[1] = 100;
-		return;
-	}
-
 	anon = zone_nr_pages(zone, sc, LRU_ACTIVE_ANON) +
 		zone_nr_pages(zone, sc, LRU_INACTIVE_ANON);
 	file = zone_nr_pages(zone, sc, LRU_ACTIVE_FILE) +
@@ -1511,15 +1504,22 @@ static void shrink_zone(int priority, struct zone *zone,
 	enum lru_list l;
 	unsigned long nr_reclaimed = sc->nr_reclaimed;
 	unsigned long swap_cluster_max = sc->swap_cluster_max;
+	int noswap = 0;
 
-	get_scan_ratio(zone, sc, percent);
+	/* If we have no swap space, do not bother scanning anon pages. */
+	if (!sc->may_swap || (nr_swap_pages <= 0)) {
+		noswap = 1;
+		percent[0] = 0;
+		percent[1] = 100;
+	} else
+		get_scan_ratio(zone, sc, percent);
 
 	for_each_evictable_lru(l) {
 		int file = is_file_lru(l);
 		unsigned long scan;
 
 		scan = zone_nr_pages(zone, sc, l);
-		if (priority) {
+		if (priority || noswap) {
 			scan >>= priority;
 			scan = (scan * percent[file]) / 100;
 		}

2009-06-09 07:20:49

by KOSAKI Motohiro

Subject: Re: [PATCH mmotm] vmscan: handle may_swap more strictly (Re: [PATCH mmotm] vmscan: fix may_swap handling for memcg)

> > > Also, reclaiming too many pages is not only a memcg issue. I don't think this
> > > patch provides a generic solution.
> > >
> > Ah, you're right. It's not only a memcg issue.
> >
> How about this one?
>
> ===
> From: Daisuke Nishimura <[email protected]>
>
> Commit 2e2e425989080cc534fc0fca154cae515f971cf5 ("vmscan,memcg: reintroduce
> sc->may_swap) add may_swap flag and handle it at get_scan_ratio().
>
> But the result of get_scan_ratio() is ignored when priority == 0,
> so anon lru is scanned even if may_swap == 0 or nr_swap_pages == 0.
> IMHO, this is not an expected behavior.
>
> As for memcg especially, because of this behavior many and many pages are
> swapped-out just in vain when oom is invoked by mem+swap limit.
>
> This patch is for handling may_swap flag more strictly.
>
> Signed-off-by: Daisuke Nishimura <[email protected]>

Looks great.
Your patch improves not only memcg but also no-swap systems.

Thanks.
Reviewed-by: KOSAKI Motohiro <[email protected]>

2009-06-09 07:29:53

by Kamezawa Hiroyuki

Subject: Re: [PATCH mmotm] vmscan: handle may_swap more strictly (Re: [PATCH mmotm] vmscan: fix may_swap handling for memcg)

On Tue, 9 Jun 2009 16:13:30 +0900
Daisuke Nishimura <[email protected]> wrote:

> > > Also, reclaiming too many pages is not only a memcg issue. I don't think this
> > > patch provides a generic solution.
> > >
> > Ah, you're right. It's not only a memcg issue.
> >
> How about this one?
>
> ===
> From: Daisuke Nishimura <[email protected]>
>
> Commit 2e2e425989080cc534fc0fca154cae515f971cf5 ("vmscan,memcg: reintroduce
> sc->may_swap) add may_swap flag and handle it at get_scan_ratio().
>
> But the result of get_scan_ratio() is ignored when priority == 0,
> so anon lru is scanned even if may_swap == 0 or nr_swap_pages == 0.
> IMHO, this is not an expected behavior.
>
> As for memcg especially, because of this behavior many and many pages are
> swapped-out just in vain when oom is invoked by mem+swap limit.
>
> This patch is for handling may_swap flag more strictly.
>
> Signed-off-by: Daisuke Nishimura <[email protected]>

Thanks,
Acked-by: KAMEZAWA Hiroyuki <[email protected]>

2009-06-09 07:48:33

by Minchan Kim

Subject: Re: [PATCH mmotm] vmscan: handle may_swap more strictly (Re: [PATCH mmotm] vmscan: fix may_swap handling for memcg)

Hi, KOSAKI.

As you know, this problem is caused by the if (priority) condition in shrink_zone().
Let me ask a question.

Why do we have to skip the scan value calculation when the priority is zero?
As far as I know, we didn't do that before split-lru.

Is there any specific issue when the priority is zero?

On Tue, Jun 9, 2009 at 4:20 PM, KOSAKI
Motohiro<[email protected]> wrote:
>> > > Also, reclaiming too many pages is not only a memcg issue. I don't think this
>> > > patch provides a generic solution.
>> > >
>> > Ah, you're right. It's not only a memcg issue.
>> >
>> How about this one?
>>
>> ===
>> From: Daisuke Nishimura <[email protected]>
>>
>> Commit 2e2e425989080cc534fc0fca154cae515f971cf5 ("vmscan,memcg: reintroduce
>> sc->may_swap) add may_swap flag and handle it at get_scan_ratio().
>>
>> But the result of get_scan_ratio() is ignored when priority == 0,
>> so anon lru is scanned even if may_swap == 0 or nr_swap_pages == 0.
>> IMHO, this is not an expected behavior.
>>
>> As for memcg especially, because of this behavior many and many pages are
>> swapped-out just in vain when oom is invoked by mem+swap limit.
>>
>> This patch is for handling may_swap flag more strictly.
>>
>> Signed-off-by: Daisuke Nishimura <[email protected]>
>
> Looks great.
> Your patch improves not only memcg but also no-swap systems.
>
> Thanks.
> Reviewed-by: KOSAKI Motohiro <[email protected]>

--
Kind regards,
Minchan Kim

2009-06-09 07:58:56

by KOSAKI Motohiro

Subject: Re: [PATCH mmotm] vmscan: handle may_swap more strictly (Re: [PATCH mmotm] vmscan: fix may_swap handling for memcg)

> Hi, KOSAKI.
>
> As you know, this problem caused by if condition(priority) in shrink_zone.
> Let me have a question.
>
> Why do we have to prevent scan value calculation when the priority is zero ?
> As I know, before split-lru, we didn't do it.
>
> Is there any specific issue in case of the priority is zero ?

Yes.

Example:

get_scan_ratio() returns anon: 80%, file: 20%, and the system has
10000 anon pages and 10000 file pages.

shrink_zone() picks up 8000 anon pages and 2000 file pages.
This means 8000 file pages aren't scanned at all.

Oops, this can invoke the OOM-killer even though the system has droppable file cache.

2009-06-09 08:19:44

by Minchan Kim

Subject: Re: [PATCH mmotm] vmscan: handle may_swap more strictly (Re: [PATCH mmotm] vmscan: fix may_swap handling for memcg)

On Tue, Jun 9, 2009 at 4:58 PM, KOSAKI
Motohiro<[email protected]> wrote:
>> Hi, KOSAKI.
>>
>> As you know, this problem caused by if condition(priority) in shrink_zone.
>> Let me have a question.
>>
>> Why do we have to prevent scan value calculation when the priority is zero ?
>> As I know, before split-lru, we didn't do it.
>>
>> Is there any specific issue in case of the priority is zero ?
>
> Yes.
>
> example:
>
> get_scan_ratio() return anon:80%, file=20%. and the system have
> 10000 anon pages and 10000 file pages.
>
> shrink_zone() picked up 8000 anon pages and 2000 file pages.
> it mean 8000 file pages aren't scanned at all.
>
> Oops, it can makes OOM-killer although system have droppable file cache.
>
Hmm.. Can that problem happen on a real system?
The file ratio being big means that the file lru list scanning is big but
the rotation is small.
It means the file lru has few reclaimable pages.

Isn't that so? I am confused.
Could you elaborate, if you don't mind?

--
Kind regards,
Minchan Kim

2009-06-09 08:24:22

by KOSAKI Motohiro

Subject: Re: [PATCH mmotm] vmscan: handle may_swap more strictly (Re: [PATCH mmotm] vmscan: fix may_swap handling for memcg)

> On Tue, Jun 9, 2009 at 4:58 PM, KOSAKI
> Motohiro<[email protected]> wrote:
> >> Hi, KOSAKI.
> >>
> >> As you know, this problem caused by if condition(priority) in shrink_zone.
> >> Let me have a question.
> >>
> >> Why do we have to prevent scan value calculation when the priority is zero ?
> >> As I know, before split-lru, we didn't do it.
> >>
> >> Is there any specific issue in case of the priority is zero ?
> >
> > Yes.
> >
> > example:
> >
> > get_scan_ratio() return anon:80%, file=20%. and the system have
> > 10000 anon pages and 10000 file pages.
> >
> > shrink_zone() picked up 8000 anon pages and 2000 file pages.
> > it mean 8000 file pages aren't scanned at all.
> >
> > Oops, it can makes OOM-killer although system have droppable file cache.
> >
> Hmm..Can that problem be happen in real system ?
> The file ratio is big means that file lru list scanning is so big but
> rotate is small.
> It means file lru have few reclaimable page.
>
> Isn't it ? I am confusing.
> Could you elaborate, please if you don't mind ?

Hm, OK, my example was wrong.
My intention is that if there are droppable file-backed pages (even only 1 page),
the OOM-killer shouldn't be invoked.

Whether they are many or few is unrelated.

2009-06-09 08:35:19

by Minchan Kim

Subject: Re: [PATCH mmotm] vmscan: handle may_swap more strictly (Re: [PATCH mmotm] vmscan: fix may_swap handling for memcg)

On Tue, Jun 9, 2009 at 5:24 PM, KOSAKI
Motohiro<[email protected]> wrote:
>> On Tue, Jun 9, 2009 at 4:58 PM, KOSAKI
>> Motohiro<[email protected]> wrote:
>> >> Hi, KOSAKI.
>> >>
>> >> As you know, this problem caused by if condition(priority) in shrink_zone.
>> >> Let me have a question.
>> >>
>> >> Why do we have to prevent scan value calculation when the priority is zero ?
>> >> As I know, before split-lru, we didn't do it.
>> >>
>> >> Is there any specific issue in case of the priority is zero ?
>> >
>> > Yes.
>> >
>> > example:
>> >
>> > get_scan_ratio() return anon:80%, file=20%. and the system have
>> > 10000 anon pages and 10000 file pages.
>> >
>> > shrink_zone() picked up 8000 anon pages and 2000 file pages.
>> > it mean 8000 file pages aren't scanned at all.
>> >
>> > Oops, it can makes OOM-killer although system have droppable file cache.
>> >
>> Hmm..Can that problem be happen in real system ?
>> The file ratio is big means that file lru list scanning is so big but
>> rotate is small.
>> It means file lru have few reclaimable page.
>>
>> Isn't it ? I am confusing.
>> Could you elaborate, please if you don't mind ?
>
> hm, ok, my example was wrong.
> My intention is that if there are droppable file-backed pages (even only 1 page),
> the OOM-killer shouldn't be invoked.
>
> many or few is unrelated.
>

I am not sure that is effective.
Have you ever hit this problem in a real situation?

BTW, I have to dive into the code. :)
Thanks for spending your valuable time on commenting.

--
Kind regards,
Minchan Kim

2009-06-09 08:37:31

by KOSAKI Motohiro

Subject: Re: [PATCH mmotm] vmscan: handle may_swap more strictly (Re: [PATCH mmotm] vmscan: fix may_swap handling for memcg)

> On Tue, Jun 9, 2009 at 5:24 PM, KOSAKI
> Motohiro<[email protected]> wrote:
> >> On Tue, Jun 9, 2009 at 4:58 PM, KOSAKI
> >> Motohiro<[email protected]> wrote:
> >> >> Hi, KOSAKI.
> >> >>
> >> >> As you know, this problem caused by if condition(priority) in shrink_zone.
> >> >> Let me have a question.
> >> >>
> >> >> Why do we have to prevent scan value calculation when the priority is zero ?
> >> >> As I know, before split-lru, we didn't do it.
> >> >>
> >> >> Is there any specific issue in case of the priority is zero ?
> >> >
> >> > Yes.
> >> >
> >> > example:
> >> >
> >> > get_scan_ratio() return anon:80%, file=20%. and the system have
> >> > 10000 anon pages and 10000 file pages.
> >> >
> >> > shrink_zone() picked up 8000 anon pages and 2000 file pages.
> >> > it mean 8000 file pages aren't scanned at all.
> >> >
> >> > Oops, it can makes OOM-killer although system have droppable file cache.
> >> >
> >> Hmm..Can that problem be happen in real system ?
> >> The file ratio is big means that file lru list scanning is so big but
> >> rotate is small.
> >> It means file lru have few reclaimable page.
> >>
> >> Isn't it ? I am confusing.
> >> Could you elaborate, please if you don't mind ?
> >
> > hm, ok, my example was wrong.
> > > My intention is that if there are droppable file-backed pages (even only 1 page),
> > > the OOM-killer shouldn't be invoked.
> >
> > many or few is unrelated.
> >
>
> I am not sure that is effective.
> Have you ever met this problem in real situation ?

No.
It's only an issue under stress workloads, but I think the VM subsystem should
work under stress workloads too.

> BTW, I have to dive into code. :)
> Thanks for spending valuable time for commenting