2008-11-28 09:33:08

by Daisuke Nishimura

[permalink] [raw]
Subject: [RFC][PATCH -mmotm 0/2] misc patches for memory cgroup hierarchy

Hi.

I'm writing some patches for memory cgroup hierarchy.

I think KAMEZAWA-san's cgroup-id patches are the most important pathes now,
but I post these patches as RFC before going further.

Patch descriptions:
- [1/2] take account of memsw
mem_cgroup_hierarchical_reclaim checks only mem->res now.
It should also check mem->memsw when do_swap_account.
- [2/2] avoid oom
In previous implementation, mem_cgroup_try_charge checked the return
value of mem_cgroup_try_to_free_pages, and just retried if some pages
had been reclaimed.
But now, try_charge(and mem_cgroup_hierarchical_reclaim called from it)
only checks whether the usage is less than the limit.
I see oom easily in some tests which didn't cause oom before.

Both patches are for memory-cgroup-hierarchical-reclaim-v4 patch series.

My current plan for memory cgroup hierarchy:
- If hierarchy is enabled, limit of child should not exceed that of parent.
- Change other calls for mem_cgroup_try_to_free_page() to
mem_cgroup_hierarchical_reclaim() if possible.


Thanks,
Daisuke Nishimura.


2008-11-28 09:33:35

by Daisuke Nishimura

[permalink] [raw]
Subject: [RFC][PATCH -mmotm 1/2] take account of memsw

mem_cgroup_hierarchical_reclaim checks only mem->res now.
It should also check mem->memsw when do_swap_account.

Signed-off-by: Daisuke Nishimura <[email protected]>
---
mm/memcontrol.c | 17 +++++++++++++++--
1 files changed, 15 insertions(+), 2 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 20e1d90..e7806fc 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -567,6 +567,19 @@ done:
return ret;
}

+static int mem_cgroup_check_under_limit(struct mem_cgroup *mem)
+{
+ if (do_swap_account) {
+ if (res_counter_check_under_limit(&mem->res) &&
+ res_counter_check_under_limit(&mem->memsw))
+ return 1;
+ } else
+ if (res_counter_check_under_limit(&mem->res))
+ return 1;
+
+ return 0;
+}
+
/*
* Dance down the hierarchy if needed to reclaim memory. We remember the
* last child we reclaimed from, so that we don't end up penalizing
@@ -588,7 +601,7 @@ static int mem_cgroup_hierarchical_reclaim(struct mem_cgroup *root_mem,
* have left.
*/
ret = try_to_free_mem_cgroup_pages(root_mem, gfp_mask, noswap);
- if (res_counter_check_under_limit(&root_mem->res))
+ if (mem_cgroup_check_under_limit(root_mem))
return 0;

next_mem = mem_cgroup_get_first_node(root_mem);
@@ -602,7 +615,7 @@ static int mem_cgroup_hierarchical_reclaim(struct mem_cgroup *root_mem,
continue;
}
ret = try_to_free_mem_cgroup_pages(next_mem, gfp_mask, noswap);
- if (res_counter_check_under_limit(&root_mem->res))
+ if (mem_cgroup_check_under_limit(root_mem))
return 0;
cgroup_lock();
next_mem = mem_cgroup_get_next_node(next_mem, root_mem);

2008-11-28 09:34:08

by Daisuke Nishimura

[permalink] [raw]
Subject: [RFC][PATCH -mmotm 2/2] avoid oom

In previous implementation, mem_cgroup_try_charge checked the return
value of mem_cgroup_try_to_free_pages, and just retried if some pages
had been reclaimed.
But now, try_charge(and mem_cgroup_hierarchical_reclaim called from it)
only checks whether the usage is less than the limit.
I see oom easily in some tests which didn't cause oom before.

This patch tries to change the behavior as before.

I've tested this patch with only one (except root) mem cgroup directory,
and a mem cgroup directory(use_hierarchy=1) which has 4 children with running
test programs on itself and each children's directories.

Of course, even after this patch is applied, oom happens if trying to use
too much memory.

Signed-off-by: Daisuke Nishimura <[email protected]>
---

mm/memcontrol.c | 19 ++++++++++++-------
1 files changed, 12 insertions(+), 7 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index e7806fc..ab134b7 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -592,6 +592,7 @@ static int mem_cgroup_hierarchical_reclaim(struct mem_cgroup *root_mem,
{
struct mem_cgroup *next_mem;
int ret = 0;
+ int child = 0;

/*
* Reclaim unconditionally and don't check for return value.
@@ -600,9 +601,9 @@ static int mem_cgroup_hierarchical_reclaim(struct mem_cgroup *root_mem,
* but there might be left over accounting, even after children
* have left.
*/
- ret = try_to_free_mem_cgroup_pages(root_mem, gfp_mask, noswap);
+ ret += try_to_free_mem_cgroup_pages(root_mem, gfp_mask, noswap);
if (mem_cgroup_check_under_limit(root_mem))
- return 0;
+ return 1; /* indicate success of reclaim */

next_mem = mem_cgroup_get_first_node(root_mem);

@@ -614,14 +615,17 @@ static int mem_cgroup_hierarchical_reclaim(struct mem_cgroup *root_mem,
cgroup_unlock();
continue;
}
- ret = try_to_free_mem_cgroup_pages(next_mem, gfp_mask, noswap);
+ child++;
+ ret += try_to_free_mem_cgroup_pages(next_mem, gfp_mask, noswap);
if (mem_cgroup_check_under_limit(root_mem))
- return 0;
+ return 1; /* indicate success of reclaim */
cgroup_lock();
next_mem = mem_cgroup_get_next_node(next_mem, root_mem);
cgroup_unlock();
}
- return ret;
+
+ /* reclaimed at least one page on average from root and each child */
+ return ret > child;
}

/*
@@ -684,8 +688,9 @@ static int __mem_cgroup_try_charge(struct mm_struct *mm,
if (!(gfp_mask & __GFP_WAIT))
goto nomem;

- ret = mem_cgroup_hierarchical_reclaim(mem_over_limit, gfp_mask,
- noswap);
+ if (mem_cgroup_hierarchical_reclaim(mem_over_limit, gfp_mask,
+ noswap))
+ continue;

/*
* try_to_free_mem_cgroup_pages() might not give us a full

2008-11-28 10:50:38

by Kamezawa Hiroyuki

[permalink] [raw]
Subject: Re: [RFC][PATCH -mmotm 0/2] misc patches for memory cgroup hierarchy

On Fri, 28 Nov 2008 18:02:52 +0900
Daisuke Nishimura <[email protected]> wrote:

> Hi.
>
> I'm writing some patches for memory cgroup hierarchy.
>
> I think KAMEZAWA-san's cgroup-id patches are the most important pathes now,
> but I post these patches as RFC before going further.
>
Don't wait me ;) I'll rebase mine.


> Patch descriptions:
> - [1/2] take account of memsw
> mem_cgroup_hierarchical_reclaim checks only mem->res now.
> It should also check mem->memsw when do_swap_account.
> - [2/2] avoid oom
> In previous implementation, mem_cgroup_try_charge checked the return
> value of mem_cgroup_try_to_free_pages, and just retried if some pages
> had been reclaimed.
> But now, try_charge(and mem_cgroup_hierarchical_reclaim called from it)
> only checks whether the usage is less than the limit.
> I see oom easily in some tests which didn't cause oom before.
>
> Both patches are for memory-cgroup-hierarchical-reclaim-v4 patch series.
>
> My current plan for memory cgroup hierarchy:
> - If hierarchy is enabled, limit of child should not exceed that of parent.
limit of a child or
limit of sum of children ?

> - Change other calls for mem_cgroup_try_to_free_page() to
> mem_cgroup_hierarchical_reclaim() if possible.
>
maybe makes sense.

Thanks,
-Kame

2008-11-28 10:52:20

by Kamezawa Hiroyuki

[permalink] [raw]
Subject: Re: [RFC][PATCH -mmotm 1/2] take account of memsw

On Fri, 28 Nov 2008 18:07:37 +0900
Daisuke Nishimura <[email protected]> wrote:

> mem_cgroup_hierarchical_reclaim checks only mem->res now.
> It should also check mem->memsw when do_swap_account.
>
> Signed-off-by: Daisuke Nishimura <[email protected]>

make sense

Acked-by: KAMEZAWA Hiroyuki <[email protected]>

> ---
> mm/memcontrol.c | 17 +++++++++++++++--
> 1 files changed, 15 insertions(+), 2 deletions(-)
>
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 20e1d90..e7806fc 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -567,6 +567,19 @@ done:
> return ret;
> }
>
> +static int mem_cgroup_check_under_limit(struct mem_cgroup *mem)
> +{
> + if (do_swap_account) {
> + if (res_counter_check_under_limit(&mem->res) &&
> + res_counter_check_under_limit(&mem->memsw))
> + return 1;
> + } else
> + if (res_counter_check_under_limit(&mem->res))
> + return 1;
> +
> + return 0;
> +}
> +
> /*
> * Dance down the hierarchy if needed to reclaim memory. We remember the
> * last child we reclaimed from, so that we don't end up penalizing
> @@ -588,7 +601,7 @@ static int mem_cgroup_hierarchical_reclaim(struct mem_cgroup *root_mem,
> * have left.
> */
> ret = try_to_free_mem_cgroup_pages(root_mem, gfp_mask, noswap);
> - if (res_counter_check_under_limit(&root_mem->res))
> + if (mem_cgroup_check_under_limit(root_mem))
> return 0;
>
> next_mem = mem_cgroup_get_first_node(root_mem);
> @@ -602,7 +615,7 @@ static int mem_cgroup_hierarchical_reclaim(struct mem_cgroup *root_mem,
> continue;
> }
> ret = try_to_free_mem_cgroup_pages(next_mem, gfp_mask, noswap);
> - if (res_counter_check_under_limit(&root_mem->res))
> + if (mem_cgroup_check_under_limit(root_mem))
> return 0;
> cgroup_lock();
> next_mem = mem_cgroup_get_next_node(next_mem, root_mem);
>

2008-11-28 11:00:52

by Kamezawa Hiroyuki

[permalink] [raw]
Subject: Re: [RFC][PATCH -mmotm 2/2] avoid oom

On Fri, 28 Nov 2008 18:09:37 +0900
Daisuke Nishimura <[email protected]> wrote:

> In previous implementation, mem_cgroup_try_charge checked the return
> value of mem_cgroup_try_to_free_pages, and just retried if some pages
> had been reclaimed.
> But now, try_charge(and mem_cgroup_hierarchical_reclaim called from it)
> only checks whether the usage is less than the limit.
> I see oom easily in some tests which didn't cause oom before.
>
> This patch tries to change the behavior as before.
>
> I've tested this patch with only one (except root) mem cgroup directory,
> and a mem cgroup directory(use_hierarchy=1) which has 4 children with running
> test programs on itself and each children's directories.
>
> Of course, even after this patch is applied, oom happens if trying to use
> too much memory.
>
> Signed-off-by: Daisuke Nishimura <[email protected]>
> ---
>
> mm/memcontrol.c | 19 ++++++++++++-------
> 1 files changed, 12 insertions(+), 7 deletions(-)
>
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index e7806fc..ab134b7 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -592,6 +592,7 @@ static int mem_cgroup_hierarchical_reclaim(struct mem_cgroup *root_mem,
> {
> struct mem_cgroup *next_mem;
> int ret = 0;
> + int child = 0;
>
> /*
> * Reclaim unconditionally and don't check for return value.
> @@ -600,9 +601,9 @@ static int mem_cgroup_hierarchical_reclaim(struct mem_cgroup *root_mem,
> * but there might be left over accounting, even after children
> * have left.
> */
> - ret = try_to_free_mem_cgroup_pages(root_mem, gfp_mask, noswap);
> + ret += try_to_free_mem_cgroup_pages(root_mem, gfp_mask, noswap);
> if (mem_cgroup_check_under_limit(root_mem))
> - return 0;
> + return 1; /* indicate success of reclaim */
>
> next_mem = mem_cgroup_get_first_node(root_mem);
>
> @@ -614,14 +615,17 @@ static int mem_cgroup_hierarchical_reclaim(struct mem_cgroup *root_mem,
> cgroup_unlock();
> continue;
> }
> - ret = try_to_free_mem_cgroup_pages(next_mem, gfp_mask, noswap);
> + child++;
> + ret += try_to_free_mem_cgroup_pages(next_mem, gfp_mask, noswap);
> if (mem_cgroup_check_under_limit(root_mem))
> - return 0;
> + return 1; /* indicate success of reclaim */
> cgroup_lock();
> next_mem = mem_cgroup_get_next_node(next_mem, root_mem);
> cgroup_unlock();
> }
> - return ret;
> +
> + /* reclaimed at least one page on average from root and each child */
> + return ret > child;
> }
>
I can't understand why this heuristic...

just (ret != 0) is ?

Thanks,
-Kame



> /*
> @@ -684,8 +688,9 @@ static int __mem_cgroup_try_charge(struct mm_struct *mm,
> if (!(gfp_mask & __GFP_WAIT))
> goto nomem;
>
> - ret = mem_cgroup_hierarchical_reclaim(mem_over_limit, gfp_mask,
> - noswap);
> + if (mem_cgroup_hierarchical_reclaim(mem_over_limit, gfp_mask,
> + noswap))
> + continue;
>
> /*
> * try_to_free_mem_cgroup_pages() might not give us a full
>

2008-11-28 12:49:49

by Daisuke Nishimura

[permalink] [raw]
Subject: Re: [RFC][PATCH -mmotm 0/2] misc patches for memory cgroup hierarchy

On Fri, 28 Nov 2008 19:49:38 +0900
KAMEZAWA Hiroyuki <[email protected]> wrote:

> On Fri, 28 Nov 2008 18:02:52 +0900
> Daisuke Nishimura <[email protected]> wrote:
>
> > Hi.
> >
> > I'm writing some patches for memory cgroup hierarchy.
> >
> > I think KAMEZAWA-san's cgroup-id patches are the most important pathes now,
> > but I post these patches as RFC before going further.
> >
> Don't wait me ;) I'll rebase mine.
>
I see :)

>
> > Patch descriptions:
> > - [1/2] take account of memsw
> > mem_cgroup_hierarchical_reclaim checks only mem->res now.
> > It should also check mem->memsw when do_swap_account.
> > - [2/2] avoid oom
> > In previous implementation, mem_cgroup_try_charge checked the return
> > value of mem_cgroup_try_to_free_pages, and just retried if some pages
> > had been reclaimed.
> > But now, try_charge(and mem_cgroup_hierarchical_reclaim called from it)
> > only checks whether the usage is less than the limit.
> > I see oom easily in some tests which didn't cause oom before.
> >
> > Both patches are for memory-cgroup-hierarchical-reclaim-v4 patch series.
> >
> > My current plan for memory cgroup hierarchy:
> > - If hierarchy is enabled, limit of child should not exceed that of parent.
> limit of a child or
> limit of sum of children ?
>
I'm sorry for my poor explanation.
I meant *max* of limits of children.

I think setting limit like

group A (limit=1G)
group A0 (limit=500M)
group A1 (limit=800M)

is not wrong itself.


Thanks,
Daisuke Nishimura.

> > - Change other calls for mem_cgroup_try_to_free_page() to
> > mem_cgroup_hierarchical_reclaim() if possible.
> >
> maybe makes sense.
>
> Thanks,
> -Kame
>
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to [email protected]. For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"[email protected]"> [email protected] </a>
>

2008-11-28 14:09:50

by Daisuke Nishimura

[permalink] [raw]
Subject: Re: [RFC][PATCH -mmotm 2/2] avoid oom

On Fri, 28 Nov 2008 19:59:53 +0900
KAMEZAWA Hiroyuki <[email protected]> wrote:

> On Fri, 28 Nov 2008 18:09:37 +0900
> Daisuke Nishimura <[email protected]> wrote:
>
> > In previous implementation, mem_cgroup_try_charge checked the return
> > value of mem_cgroup_try_to_free_pages, and just retried if some pages
> > had been reclaimed.
> > But now, try_charge(and mem_cgroup_hierarchical_reclaim called from it)
> > only checks whether the usage is less than the limit.
> > I see oom easily in some tests which didn't cause oom before.
> >
> > This patch tries to change the behavior as before.
> >
> > I've tested this patch with only one (except root) mem cgroup directory,
> > and a mem cgroup directory(use_hierarchy=1) which has 4 children with running
> > test programs on itself and each children's directories.
> >
> > Of course, even after this patch is applied, oom happens if trying to use
> > too much memory.
> >
> > Signed-off-by: Daisuke Nishimura <[email protected]>
> > ---
> >
> > mm/memcontrol.c | 19 ++++++++++++-------
> > 1 files changed, 12 insertions(+), 7 deletions(-)
> >
> > diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> > index e7806fc..ab134b7 100644
> > --- a/mm/memcontrol.c
> > +++ b/mm/memcontrol.c
> > @@ -592,6 +592,7 @@ static int mem_cgroup_hierarchical_reclaim(struct mem_cgroup *root_mem,
> > {
> > struct mem_cgroup *next_mem;
> > int ret = 0;
> > + int child = 0;
> >
> > /*
> > * Reclaim unconditionally and don't check for return value.
> > @@ -600,9 +601,9 @@ static int mem_cgroup_hierarchical_reclaim(struct mem_cgroup *root_mem,
> > * but there might be left over accounting, even after children
> > * have left.
> > */
> > - ret = try_to_free_mem_cgroup_pages(root_mem, gfp_mask, noswap);
> > + ret += try_to_free_mem_cgroup_pages(root_mem, gfp_mask, noswap);
> > if (mem_cgroup_check_under_limit(root_mem))
> > - return 0;
> > + return 1; /* indicate success of reclaim */
> >
> > next_mem = mem_cgroup_get_first_node(root_mem);
> >
> > @@ -614,14 +615,17 @@ static int mem_cgroup_hierarchical_reclaim(struct mem_cgroup *root_mem,
> > cgroup_unlock();
> > continue;
> > }
> > - ret = try_to_free_mem_cgroup_pages(next_mem, gfp_mask, noswap);
> > + child++;
> > + ret += try_to_free_mem_cgroup_pages(next_mem, gfp_mask, noswap);
> > if (mem_cgroup_check_under_limit(root_mem))
> > - return 0;
> > + return 1; /* indicate success of reclaim */
> > cgroup_lock();
> > next_mem = mem_cgroup_get_next_node(next_mem, root_mem);
> > cgroup_unlock();
> > }
> > - return ret;
> > +
> > + /* reclaimed at least one page on average from root and each child */
> > + return ret > child;
> > }
> >
> I can't understand why this heuristic...
>
> just (ret != 0) is ?
>
I also thought the same thing first.

I'm worrying about the case there are many and many children.
IMHO, this function rarely fails in such cases(just because there
are many candidates to reclaim from. Ah.. I should check at least
each memcgroup has charges), so I thought it would be better
to take account of the number of children, to prevent a long software
stuck at infinite loop at __mem_cgroup_try_charge.

Actually, I tested "return ret" version in my test (4 children) too,
and I didn't see a big difference(it seemed that in "return ret" version
it took a bit long time to cause oom, but I've not confirmed it in detail).

I must consider more.


Thanks,
Daisuke Nishimura.

> Thanks,
> -Kame
>
>
>
> > /*
> > @@ -684,8 +688,9 @@ static int __mem_cgroup_try_charge(struct mm_struct *mm,
> > if (!(gfp_mask & __GFP_WAIT))
> > goto nomem;
> >
> > - ret = mem_cgroup_hierarchical_reclaim(mem_over_limit, gfp_mask,
> > - noswap);
> > + if (mem_cgroup_hierarchical_reclaim(mem_over_limit, gfp_mask,
> > + noswap))
> > + continue;
> >
> > /*
> > * try_to_free_mem_cgroup_pages() might not give us a full
> >
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
>

2008-11-28 18:18:55

by Balbir Singh

[permalink] [raw]
Subject: Re: [RFC][PATCH -mmotm 0/2] misc patches for memory cgroup hierarchy

* Daisuke Nishimura <[email protected]> [2008-11-28 18:02:52]:

> Hi.
>
> I'm writing some patches for memory cgroup hierarchy.
>
> I think KAMEZAWA-san's cgroup-id patches are the most important pathes now,
> but I post these patches as RFC before going further.
>
> Patch descriptions:
> - [1/2] take account of memsw
> mem_cgroup_hierarchical_reclaim checks only mem->res now.
> It should also check mem->memsw when do_swap_account.
> - [2/2] avoid oom
> In previous implementation, mem_cgroup_try_charge checked the return
> value of mem_cgroup_try_to_free_pages, and just retried if some pages
> had been reclaimed.
> But now, try_charge(and mem_cgroup_hierarchical_reclaim called from it)
> only checks whether the usage is less than the limit.
> I see oom easily in some tests which didn't cause oom before.
>
> Both patches are for memory-cgroup-hierarchical-reclaim-v4 patch series.
>
> My current plan for memory cgroup hierarchy:
> - If hierarchy is enabled, limit of child should not exceed that of parent.
> - Change other calls for mem_cgroup_try_to_free_page() to
> mem_cgroup_hierarchical_reclaim() if possible.
>

Thanks, Daisuke,

I am in a conference and taken a quick look. The patches seem sane,
but I've not reviewed them carefully. I'll revert back next week


--
Balbir