2011-05-12 14:54:48

by Johannes Weiner

Subject: [rfc patch 0/6] mm: memcg naturalization

Hi!

Here is a patch series that is a result of the memcg discussions on
LSF (memcg-aware global reclaim, global lru removal, struct
page_cgroup reduction, soft limit implementation) and the recent
feature discussions on linux-mm.

The long-term idea is to have memcg no longer bolted onto the side of
the mm code, but to integrate it as much as possible, so that there is
a native understanding of containers and the traditional !memcg setup
is just a single group. This series is an approach in that direction.

It is a rather early snapshot, WIP, barely tested etc., but I wanted
to get your opinions before pursuing it further. It is also part of
my counter-argument to the proposals to add memcg-reclaim-related
user interfaces at this point in time, so I wanted to push this out
the door before things are merged into .40.

The patches are quite big; I am still looking for things to factor
and split out, sorry for that. Documentation is on its way as well ;)

#1 and #2 are boring preparatory work. #3 makes traditional reclaim
in vmscan.c memcg-aware, which is a prerequisite both for the removal
of the global lru in #5 and for the way I reimplemented soft limit
reclaim in #6.

The diffstat so far looks like this:

include/linux/memcontrol.h | 84 +++--
include/linux/mm_inline.h | 15 +-
include/linux/mmzone.h | 10 +-
include/linux/page_cgroup.h | 35 --
include/linux/swap.h | 4 -
mm/memcontrol.c | 860 +++++++++++++------------------------------
mm/page_alloc.c | 2 +-
mm/page_cgroup.c | 39 +--
mm/swap.c | 20 +-
mm/vmscan.c | 273 +++++++--------
10 files changed, 452 insertions(+), 890 deletions(-)

It is based on .39-rc7 because of the memcg churn in -mm, but I'll
rebase it in the near future.

Discuss!

Hannes


2011-05-12 14:54:50

by Johannes Weiner

Subject: [rfc patch 1/6] memcg: remove unused retry signal from reclaim

If the memcg reclaim code detects that the target memcg is below its
limit, it exits and returns a guaranteed non-zero value so that the
charge is retried.

Nowadays, the charge side checks the memcg limit itself and does not
rely on this non-zero return value trick.

This patch removes it. The reclaim code will now always return the
true number of pages it reclaimed on its own.

Signed-off-by: Johannes Weiner <[email protected]>
---
mm/memcontrol.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 010f916..bf5ab87 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -1503,7 +1503,7 @@ static int mem_cgroup_hierarchical_reclaim(struct mem_cgroup *root_mem,
if (!res_counter_soft_limit_excess(&root_mem->res))
return total;
} else if (mem_cgroup_margin(root_mem))
- return 1 + total;
+ return total;
}
return total;
}
--
1.7.5.1

2011-05-12 14:54:47

by Johannes Weiner

Subject: [rfc patch 2/6] vmscan: make distinction between memcg reclaim and LRU list selection

The reclaim code currently uses a single predicate both for whether it
reclaims on behalf of a memory cgroup and for whether it is scanning
the global LRU list or a memory cgroup LRU list.

Up to now, the two have always coincided, but subsequent patches will
change things such that global reclaim scans memory cgroup lists.

This patch adds a new predicate that distinguishes global reclaim from
memory cgroup reclaim, and converts all callsites that are actually
about global reclaim heuristics rather than strict LRU list selection.
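
For illustration, this just restates the two helpers added in the hunk
below, with their intended semantics spelled out as comments; once
global reclaim scans memcg lists, they answer different questions:

        /* Was this reclaim invocation triggered globally or by a memcg limit? */
        static bool global_reclaim(struct scan_control *sc)
        {
                return !sc->memcg;
        }

        /* Which LRU list are we walking right now, global or per-memcg? */
        static bool scanning_global_lru(struct scan_control *sc)
        {
                return !sc->current_memcg;
        }

Zone-wide heuristics and vmstat accounting (too_many_isolated(),
ALLOCSTALL, the PGSCAN/PGSTEAL zone events) switch to global_reclaim(),
while per-list lookups and reclaim statistics keep using
scanning_global_lru().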

Signed-off-by: Johannes Weiner <[email protected]>
---
mm/vmscan.c | 96 ++++++++++++++++++++++++++++++++++------------------------
1 files changed, 56 insertions(+), 40 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index f6b435c..ceeb2a5 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -104,8 +104,12 @@ struct scan_control {
*/
reclaim_mode_t reclaim_mode;

- /* Which cgroup do we reclaim from */
- struct mem_cgroup *mem_cgroup;
+ /*
+ * The memory cgroup we reclaim on behalf of, and the one we
+ * are currently reclaiming from.
+ */
+ struct mem_cgroup *memcg;
+ struct mem_cgroup *current_memcg;

/*
* Nodemask of nodes allowed by the caller. If NULL, all nodes
@@ -154,16 +158,24 @@ static LIST_HEAD(shrinker_list);
static DECLARE_RWSEM(shrinker_rwsem);

#ifdef CONFIG_CGROUP_MEM_RES_CTLR
-#define scanning_global_lru(sc) (!(sc)->mem_cgroup)
+static bool global_reclaim(struct scan_control *sc)
+{
+ return !sc->memcg;
+}
+static bool scanning_global_lru(struct scan_control *sc)
+{
+ return !sc->current_memcg;
+}
#else
-#define scanning_global_lru(sc) (1)
+static bool global_reclaim(struct scan_control *sc) { return 1; }
+static bool scanning_global_lru(struct scan_control *sc) { return 1; }
#endif

static struct zone_reclaim_stat *get_reclaim_stat(struct zone *zone,
struct scan_control *sc)
{
if (!scanning_global_lru(sc))
- return mem_cgroup_get_reclaim_stat(sc->mem_cgroup, zone);
+ return mem_cgroup_get_reclaim_stat(sc->current_memcg, zone);

return &zone->reclaim_stat;
}
@@ -172,7 +184,7 @@ static unsigned long zone_nr_lru_pages(struct zone *zone,
struct scan_control *sc, enum lru_list lru)
{
if (!scanning_global_lru(sc))
- return mem_cgroup_zone_nr_pages(sc->mem_cgroup, zone, lru);
+ return mem_cgroup_zone_nr_pages(sc->current_memcg, zone, lru);

return zone_page_state(zone, NR_LRU_BASE + lru);
}
@@ -635,7 +647,7 @@ static enum page_references page_check_references(struct page *page,
int referenced_ptes, referenced_page;
unsigned long vm_flags;

- referenced_ptes = page_referenced(page, 1, sc->mem_cgroup, &vm_flags);
+ referenced_ptes = page_referenced(page, 1, sc->current_memcg, &vm_flags);
referenced_page = TestClearPageReferenced(page);

/* Lumpy reclaim - ignore references */
@@ -1228,7 +1240,7 @@ static int too_many_isolated(struct zone *zone, int file,
if (current_is_kswapd())
return 0;

- if (!scanning_global_lru(sc))
+ if (!global_reclaim(sc))
return 0;

if (file) {
@@ -1397,6 +1409,16 @@ shrink_inactive_list(unsigned long nr_to_scan, struct zone *zone,
sc->reclaim_mode & RECLAIM_MODE_LUMPYRECLAIM ?
ISOLATE_BOTH : ISOLATE_INACTIVE,
zone, 0, file);
+ } else {
+ nr_taken = mem_cgroup_isolate_pages(nr_to_scan,
+ &page_list, &nr_scanned, sc->order,
+ sc->reclaim_mode & RECLAIM_MODE_LUMPYRECLAIM ?
+ ISOLATE_BOTH : ISOLATE_INACTIVE,
+ zone, sc->current_memcg,
+ 0, file);
+ }
+
+ if (global_reclaim(sc)) {
zone->pages_scanned += nr_scanned;
if (current_is_kswapd())
__count_zone_vm_events(PGSCAN_KSWAPD, zone,
@@ -1404,17 +1426,6 @@ shrink_inactive_list(unsigned long nr_to_scan, struct zone *zone,
else
__count_zone_vm_events(PGSCAN_DIRECT, zone,
nr_scanned);
- } else {
- nr_taken = mem_cgroup_isolate_pages(nr_to_scan,
- &page_list, &nr_scanned, sc->order,
- sc->reclaim_mode & RECLAIM_MODE_LUMPYRECLAIM ?
- ISOLATE_BOTH : ISOLATE_INACTIVE,
- zone, sc->mem_cgroup,
- 0, file);
- /*
- * mem_cgroup_isolate_pages() keeps track of
- * scanned pages on its own.
- */
}

if (nr_taken == 0) {
@@ -1435,9 +1446,11 @@ shrink_inactive_list(unsigned long nr_to_scan, struct zone *zone,
}

local_irq_disable();
- if (current_is_kswapd())
- __count_vm_events(KSWAPD_STEAL, nr_reclaimed);
- __count_zone_vm_events(PGSTEAL, zone, nr_reclaimed);
+ if (global_reclaim(sc)) {
+ if (current_is_kswapd())
+ __count_vm_events(KSWAPD_STEAL, nr_reclaimed);
+ __count_zone_vm_events(PGSTEAL, zone, nr_reclaimed);
+ }

putback_lru_pages(zone, sc, nr_anon, nr_file, &page_list);

@@ -1520,18 +1533,16 @@ static void shrink_active_list(unsigned long nr_pages, struct zone *zone,
&pgscanned, sc->order,
ISOLATE_ACTIVE, zone,
1, file);
- zone->pages_scanned += pgscanned;
} else {
nr_taken = mem_cgroup_isolate_pages(nr_pages, &l_hold,
&pgscanned, sc->order,
ISOLATE_ACTIVE, zone,
- sc->mem_cgroup, 1, file);
- /*
- * mem_cgroup_isolate_pages() keeps track of
- * scanned pages on its own.
- */
+ sc->current_memcg, 1, file);
}

+ if (global_reclaim(sc))
+ zone->pages_scanned += pgscanned;
+
reclaim_stat->recent_scanned[file] += nr_taken;

__count_zone_vm_events(PGREFILL, zone, pgscanned);
@@ -1552,7 +1563,7 @@ static void shrink_active_list(unsigned long nr_pages, struct zone *zone,
continue;
}

- if (page_referenced(page, 0, sc->mem_cgroup, &vm_flags)) {
+ if (page_referenced(page, 0, sc->current_memcg, &vm_flags)) {
nr_rotated += hpage_nr_pages(page);
/*
* Identify referenced, file-backed active pages and
@@ -1629,7 +1640,7 @@ static int inactive_anon_is_low(struct zone *zone, struct scan_control *sc)
if (scanning_global_lru(sc))
low = inactive_anon_is_low_global(zone);
else
- low = mem_cgroup_inactive_anon_is_low(sc->mem_cgroup);
+ low = mem_cgroup_inactive_anon_is_low(sc->current_memcg);
return low;
}
#else
@@ -1672,7 +1683,7 @@ static int inactive_file_is_low(struct zone *zone, struct scan_control *sc)
if (scanning_global_lru(sc))
low = inactive_file_is_low_global(zone);
else
- low = mem_cgroup_inactive_file_is_low(sc->mem_cgroup);
+ low = mem_cgroup_inactive_file_is_low(sc->current_memcg);
return low;
}

@@ -1752,7 +1763,7 @@ static void get_scan_count(struct zone *zone, struct scan_control *sc,
file = zone_nr_lru_pages(zone, sc, LRU_ACTIVE_FILE) +
zone_nr_lru_pages(zone, sc, LRU_INACTIVE_FILE);

- if (scanning_global_lru(sc)) {
+ if (global_reclaim(sc)) {
free = zone_page_state(zone, NR_FREE_PAGES);
/* If we have very few page cache pages,
force-scan anon pages. */
@@ -1903,6 +1914,8 @@ restart:
nr_scanned = sc->nr_scanned;
get_scan_count(zone, sc, nr, priority);

+ sc->current_memcg = sc->memcg;
+
while (nr[LRU_INACTIVE_ANON] || nr[LRU_ACTIVE_FILE] ||
nr[LRU_INACTIVE_FILE]) {
for_each_evictable_lru(l) {
@@ -1941,6 +1954,9 @@ restart:
goto restart;

throttle_vm_writeout(sc->gfp_mask);
+
+ /* For good measure, noone higher up the stack should look at it */
+ sc->current_memcg = NULL;
}

/*
@@ -1973,7 +1989,7 @@ static void shrink_zones(int priority, struct zonelist *zonelist,
* Take care memory controller reclaiming has small influence
* to global LRU.
*/
- if (scanning_global_lru(sc)) {
+ if (global_reclaim(sc)) {
if (!cpuset_zone_allowed_hardwall(zone, GFP_KERNEL))
continue;
if (zone->all_unreclaimable && priority != DEF_PRIORITY)
@@ -2038,7 +2054,7 @@ static unsigned long do_try_to_free_pages(struct zonelist *zonelist,
get_mems_allowed();
delayacct_freepages_start();

- if (scanning_global_lru(sc))
+ if (global_reclaim(sc))
count_vm_event(ALLOCSTALL);

for (priority = DEF_PRIORITY; priority >= 0; priority--) {
@@ -2050,7 +2066,7 @@ static unsigned long do_try_to_free_pages(struct zonelist *zonelist,
* Don't shrink slabs when reclaiming memory from
* over limit cgroups
*/
- if (scanning_global_lru(sc)) {
+ if (global_reclaim(sc)) {
unsigned long lru_pages = 0;
for_each_zone_zonelist(zone, z, zonelist,
gfp_zone(sc->gfp_mask)) {
@@ -2111,7 +2127,7 @@ out:
return 0;

/* top priority shrink_zones still had more to do? don't OOM, then */
- if (scanning_global_lru(sc) && !all_unreclaimable(zonelist, sc))
+ if (global_reclaim(sc) && !all_unreclaimable(zonelist, sc))
return 1;

return 0;
@@ -2129,7 +2145,7 @@ unsigned long try_to_free_pages(struct zonelist *zonelist, int order,
.may_swap = 1,
.swappiness = vm_swappiness,
.order = order,
- .mem_cgroup = NULL,
+ .memcg = NULL,
.nodemask = nodemask,
};

@@ -2158,7 +2174,7 @@ unsigned long mem_cgroup_shrink_node_zone(struct mem_cgroup *mem,
.may_swap = !noswap,
.swappiness = swappiness,
.order = 0,
- .mem_cgroup = mem,
+ .memcg = mem,
};
sc.gfp_mask = (gfp_mask & GFP_RECLAIM_MASK) |
(GFP_HIGHUSER_MOVABLE & ~GFP_RECLAIM_MASK);
@@ -2195,7 +2211,7 @@ unsigned long try_to_free_mem_cgroup_pages(struct mem_cgroup *mem_cont,
.nr_to_reclaim = SWAP_CLUSTER_MAX,
.swappiness = swappiness,
.order = 0,
- .mem_cgroup = mem_cont,
+ .memcg = mem_cont,
.nodemask = NULL, /* we don't care the placement */
};

@@ -2333,7 +2349,7 @@ static unsigned long balance_pgdat(pg_data_t *pgdat, int order,
.nr_to_reclaim = ULONG_MAX,
.swappiness = vm_swappiness,
.order = order,
- .mem_cgroup = NULL,
+ .memcg = NULL,
};
loop_again:
total_scanned = 0;
--
1.7.5.1

2011-05-12 14:56:06

by Johannes Weiner

Subject: [rfc patch 3/6] mm: memcg-aware global reclaim

A page charged to a memcg is linked to an LRU list specific to that
memcg. At the same time, traditional global reclaim is oblivious to
memcgs, and all pages are also linked to a global per-zone list.

This patch changes traditional global reclaim to iterate over all
existing memcgs, so that it no longer relies on the global list being
present.

This is one step forward in integrating memcg code better into the
rest of memory management. It is also a prerequisite to get rid of
the global per-zone lru lists.

RFC:

The algorithm implemented in this patch is very naive. For each zone
scanned at each priority level, it iterates over all existing memcgs
and considers them for scanning.

This is just a prototype and I have not optimized it yet, because I am
unsure about the maximum number of memcgs that still constitutes a sane
configuration relative to the machine size.

It is perfectly fair since all memcgs are scanned at each priority
level.

On my 4G quadcore laptop with 1000 memcgs, a significant amount of CPU
time was spent just iterating memcgs during reclaim. But it cannot
really be claimed that the old code was much better, either: global
LRU reclaim could mean that a few hundred memcgs were emptied out
completely while others stayed untouched.

I am open to solutions that trade fairness against CPU time, but I
don't want an extreme in either direction. One idea would be to break
out early once a number of memcgs have been successfully reclaimed
from, and to remember the last one scanned; see the sketch below.
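
A very rough sketch of that idea (hypothetical, not part of this
series; the cut-off constant and the resume bookkeeping are made up
for illustration) on top of the shrink_zone() loop introduced below:

        static void shrink_zone(int priority, struct zone *zone,
                                struct scan_control *sc)
        {
                struct mem_cgroup *root = sc->memcg;
                struct mem_cgroup *mem = NULL;
                int memcgs_reclaimed = 0;

                do {
                        unsigned long reclaimed = sc->nr_reclaimed;

                        mem_cgroup_hierarchy_walk(root, &mem);
                        sc->current_memcg = mem;
                        do_shrink_zone(priority, zone, sc);
                        /*
                         * Hypothetical cut-off: stop after a handful of
                         * successfully reclaimed memcgs.  Breaking out
                         * here would also have to remember 'mem' as the
                         * resume point for the next invocation and drop
                         * the css reference the walk may still hold.
                         */
                        if (sc->nr_reclaimed > reclaimed &&
                            ++memcgs_reclaimed >= MEMCG_RECLAIM_CUTOFF)
                                break;
                } while (mem != root);

                sc->current_memcg = NULL;
        }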

Signed-off-by: Johannes Weiner <[email protected]>
---
include/linux/memcontrol.h | 7 ++
mm/memcontrol.c | 148 +++++++++++++++++++++++++++++---------------
mm/vmscan.c | 21 +++++--
3 files changed, 120 insertions(+), 56 deletions(-)

diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index 5e9840f5..58728c7 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -104,6 +104,7 @@ extern void mem_cgroup_end_migration(struct mem_cgroup *mem,
/*
* For memory reclaim.
*/
+void mem_cgroup_hierarchy_walk(struct mem_cgroup *, struct mem_cgroup **);
int mem_cgroup_inactive_anon_is_low(struct mem_cgroup *memcg);
int mem_cgroup_inactive_file_is_low(struct mem_cgroup *memcg);
unsigned long mem_cgroup_zone_nr_pages(struct mem_cgroup *memcg,
@@ -289,6 +290,12 @@ static inline bool mem_cgroup_disabled(void)
return true;
}

+static inline void mem_cgroup_hierarchy_walk(struct mem_cgroup *start,
+ struct mem_cgroup **iter)
+{
+ *iter = start;
+}
+
static inline int
mem_cgroup_inactive_anon_is_low(struct mem_cgroup *memcg)
{
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index bf5ab87..edcd55a 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -313,7 +313,7 @@ static bool move_file(void)
}

/*
- * Maximum loops in mem_cgroup_hierarchical_reclaim(), used for soft
+ * Maximum loops in mem_cgroup_soft_reclaim(), used for soft
* limit reclaim to prevent infinite loops, if they ever occur.
*/
#define MEM_CGROUP_MAX_RECLAIM_LOOPS (100)
@@ -339,16 +339,6 @@ enum charge_type {
/* Used for OOM nofiier */
#define OOM_CONTROL (0)

-/*
- * Reclaim flags for mem_cgroup_hierarchical_reclaim
- */
-#define MEM_CGROUP_RECLAIM_NOSWAP_BIT 0x0
-#define MEM_CGROUP_RECLAIM_NOSWAP (1 << MEM_CGROUP_RECLAIM_NOSWAP_BIT)
-#define MEM_CGROUP_RECLAIM_SHRINK_BIT 0x1
-#define MEM_CGROUP_RECLAIM_SHRINK (1 << MEM_CGROUP_RECLAIM_SHRINK_BIT)
-#define MEM_CGROUP_RECLAIM_SOFT_BIT 0x2
-#define MEM_CGROUP_RECLAIM_SOFT (1 << MEM_CGROUP_RECLAIM_SOFT_BIT)
-
static void mem_cgroup_get(struct mem_cgroup *mem);
static void mem_cgroup_put(struct mem_cgroup *mem);
static struct mem_cgroup *parent_mem_cgroup(struct mem_cgroup *mem);
@@ -1381,6 +1371,86 @@ u64 mem_cgroup_get_limit(struct mem_cgroup *memcg)
return min(limit, memsw);
}

+void mem_cgroup_hierarchy_walk(struct mem_cgroup *start,
+ struct mem_cgroup **iter)
+{
+ struct mem_cgroup *mem = *iter;
+ int id;
+
+ if (!start)
+ start = root_mem_cgroup;
+ /*
+ * Even without hierarchy explicitely enabled in the root
+ * memcg, it is the ultimate parent of all memcgs.
+ */
+ if (!(start == root_mem_cgroup || start->use_hierarchy)) {
+ *iter = start;
+ return;
+ }
+
+ if (!mem)
+ id = css_id(&start->css);
+ else {
+ id = css_id(&mem->css);
+ css_put(&mem->css);
+ mem = NULL;
+ }
+
+ do {
+ struct cgroup_subsys_state *css;
+
+ rcu_read_lock();
+ css = css_get_next(&mem_cgroup_subsys, id+1, &start->css, &id);
+ /*
+ * The caller must already have a reference to the
+ * starting point of this hierarchy walk, do not grab
+ * another one. This way, the loop can be finished
+ * when the hierarchy root is returned, without any
+ * further cleanup required.
+ */
+ if (css && (css == &start->css || css_tryget(css)))
+ mem = container_of(css, struct mem_cgroup, css);
+ rcu_read_unlock();
+ if (!css)
+ id = 0;
+ } while (!mem);
+
+ if (mem == root_mem_cgroup)
+ mem = NULL;
+
+ *iter = mem;
+}
+
+static unsigned long mem_cgroup_target_reclaim(struct mem_cgroup *mem,
+ gfp_t gfp_mask,
+ bool noswap,
+ bool shrink)
+{
+ unsigned long total = 0;
+ int loop;
+
+ if (mem->memsw_is_minimum)
+ noswap = true;
+
+ for (loop = 0; loop < MEM_CGROUP_MAX_RECLAIM_LOOPS; loop++) {
+ drain_all_stock_async();
+ total += try_to_free_mem_cgroup_pages(mem, gfp_mask, noswap,
+ get_swappiness(mem));
+ if (total && shrink)
+ break;
+ if (mem_cgroup_margin(mem))
+ break;
+ /*
+ * If we have not been able to reclaim anything after
+ * two reclaim attempts, there may be no reclaimable
+ * pages under this hierarchy.
+ */
+ if (loop && !total)
+ break;
+ }
+ return total;
+}
+
/*
* Visit the first child (need not be the first child as per the ordering
* of the cgroup list, since we track last_scanned_child) of @mem and use
@@ -1427,21 +1497,16 @@ mem_cgroup_select_victim(struct mem_cgroup *root_mem)
*
* We give up and return to the caller when we visit root_mem twice.
* (other groups can be removed while we're walking....)
- *
- * If shrink==true, for avoiding to free too much, this returns immedieately.
*/
-static int mem_cgroup_hierarchical_reclaim(struct mem_cgroup *root_mem,
- struct zone *zone,
- gfp_t gfp_mask,
- unsigned long reclaim_options)
+static int mem_cgroup_soft_reclaim(struct mem_cgroup *root_mem,
+ struct zone *zone,
+ gfp_t gfp_mask)
{
struct mem_cgroup *victim;
int ret, total = 0;
int loop = 0;
- bool noswap = reclaim_options & MEM_CGROUP_RECLAIM_NOSWAP;
- bool shrink = reclaim_options & MEM_CGROUP_RECLAIM_SHRINK;
- bool check_soft = reclaim_options & MEM_CGROUP_RECLAIM_SOFT;
unsigned long excess;
+ bool noswap = false;

excess = res_counter_soft_limit_excess(&root_mem->res) >> PAGE_SHIFT;

@@ -1461,7 +1526,7 @@ static int mem_cgroup_hierarchical_reclaim(struct mem_cgroup *root_mem,
* anything, it might because there are
* no reclaimable pages under this hierarchy
*/
- if (!check_soft || !total) {
+ if (!total) {
css_put(&victim->css);
break;
}
@@ -1484,25 +1549,11 @@ static int mem_cgroup_hierarchical_reclaim(struct mem_cgroup *root_mem,
continue;
}
/* we use swappiness of local cgroup */
- if (check_soft)
- ret = mem_cgroup_shrink_node_zone(victim, gfp_mask,
+ ret = mem_cgroup_shrink_node_zone(victim, gfp_mask,
noswap, get_swappiness(victim), zone);
- else
- ret = try_to_free_mem_cgroup_pages(victim, gfp_mask,
- noswap, get_swappiness(victim));
css_put(&victim->css);
- /*
- * At shrinking usage, we can't check we should stop here or
- * reclaim more. It's depends on callers. last_scanned_child
- * will work enough for keeping fairness under tree.
- */
- if (shrink)
- return ret;
total += ret;
- if (check_soft) {
- if (!res_counter_soft_limit_excess(&root_mem->res))
- return total;
- } else if (mem_cgroup_margin(root_mem))
+ if (!res_counter_soft_limit_excess(&root_mem->res))
return total;
}
return total;
@@ -1897,7 +1948,7 @@ static int mem_cgroup_do_charge(struct mem_cgroup *mem, gfp_t gfp_mask,
unsigned long csize = nr_pages * PAGE_SIZE;
struct mem_cgroup *mem_over_limit;
struct res_counter *fail_res;
- unsigned long flags = 0;
+ bool noswap = false;
int ret;

ret = res_counter_charge(&mem->res, csize, &fail_res);
@@ -1911,7 +1962,7 @@ static int mem_cgroup_do_charge(struct mem_cgroup *mem, gfp_t gfp_mask,

res_counter_uncharge(&mem->res, csize);
mem_over_limit = mem_cgroup_from_res_counter(fail_res, memsw);
- flags |= MEM_CGROUP_RECLAIM_NOSWAP;
+ noswap = true;
} else
mem_over_limit = mem_cgroup_from_res_counter(fail_res, res);
/*
@@ -1927,8 +1978,8 @@ static int mem_cgroup_do_charge(struct mem_cgroup *mem, gfp_t gfp_mask,
if (!(gfp_mask & __GFP_WAIT))
return CHARGE_WOULDBLOCK;

- ret = mem_cgroup_hierarchical_reclaim(mem_over_limit, NULL,
- gfp_mask, flags);
+ ret = mem_cgroup_target_reclaim(mem_over_limit, gfp_mask,
+ noswap, false);
if (mem_cgroup_margin(mem_over_limit) >= nr_pages)
return CHARGE_RETRY;
/*
@@ -3085,7 +3136,7 @@ void mem_cgroup_end_migration(struct mem_cgroup *mem,

/*
* A call to try to shrink memory usage on charge failure at shmem's swapin.
- * Calling hierarchical_reclaim is not enough because we should update
+ * Calling target_reclaim is not enough because we should update
* last_oom_jiffies to prevent pagefault_out_of_memory from invoking global OOM.
* Moreover considering hierarchy, we should reclaim from the mem_over_limit,
* not from the memcg which this page would be charged to.
@@ -3167,7 +3218,7 @@ static int mem_cgroup_resize_limit(struct mem_cgroup *memcg,
int enlarge;

/*
- * For keeping hierarchical_reclaim simple, how long we should retry
+ * For keeping target_reclaim simple, how long we should retry
* is depends on callers. We set our retry-count to be function
* of # of children which we should visit in this loop.
*/
@@ -3210,8 +3261,7 @@ static int mem_cgroup_resize_limit(struct mem_cgroup *memcg,
if (!ret)
break;

- mem_cgroup_hierarchical_reclaim(memcg, NULL, GFP_KERNEL,
- MEM_CGROUP_RECLAIM_SHRINK);
+ mem_cgroup_target_reclaim(memcg, GFP_KERNEL, false, false);
curusage = res_counter_read_u64(&memcg->res, RES_USAGE);
/* Usage is reduced ? */
if (curusage >= oldusage)
@@ -3269,9 +3319,7 @@ static int mem_cgroup_resize_memsw_limit(struct mem_cgroup *memcg,
if (!ret)
break;

- mem_cgroup_hierarchical_reclaim(memcg, NULL, GFP_KERNEL,
- MEM_CGROUP_RECLAIM_NOSWAP |
- MEM_CGROUP_RECLAIM_SHRINK);
+ mem_cgroup_target_reclaim(memcg, GFP_KERNEL, true, false);
curusage = res_counter_read_u64(&memcg->memsw, RES_USAGE);
/* Usage is reduced ? */
if (curusage >= oldusage)
@@ -3311,9 +3359,7 @@ unsigned long mem_cgroup_soft_limit_reclaim(struct zone *zone, int order,
if (!mz)
break;

- reclaimed = mem_cgroup_hierarchical_reclaim(mz->mem, zone,
- gfp_mask,
- MEM_CGROUP_RECLAIM_SOFT);
+ reclaimed = mem_cgroup_soft_reclaim(mz->mem, zone, gfp_mask);
nr_reclaimed += reclaimed;
spin_lock(&mctz->lock);

diff --git a/mm/vmscan.c b/mm/vmscan.c
index ceeb2a5..e2a3647 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1900,8 +1900,8 @@ static inline bool should_continue_reclaim(struct zone *zone,
/*
* This is a basic per-zone page freer. Used by both kswapd and direct reclaim.
*/
-static void shrink_zone(int priority, struct zone *zone,
- struct scan_control *sc)
+static void do_shrink_zone(int priority, struct zone *zone,
+ struct scan_control *sc)
{
unsigned long nr[NR_LRU_LISTS];
unsigned long nr_to_scan;
@@ -1914,8 +1914,6 @@ restart:
nr_scanned = sc->nr_scanned;
get_scan_count(zone, sc, nr, priority);

- sc->current_memcg = sc->memcg;
-
while (nr[LRU_INACTIVE_ANON] || nr[LRU_ACTIVE_FILE] ||
nr[LRU_INACTIVE_FILE]) {
for_each_evictable_lru(l) {
@@ -1954,6 +1952,19 @@ restart:
goto restart;

throttle_vm_writeout(sc->gfp_mask);
+}
+
+static void shrink_zone(int priority, struct zone *zone,
+ struct scan_control *sc)
+{
+ struct mem_cgroup *root = sc->memcg;
+ struct mem_cgroup *mem = NULL;
+
+ do {
+ mem_cgroup_hierarchy_walk(root, &mem);
+ sc->current_memcg = mem;
+ do_shrink_zone(priority, zone, sc);
+ } while (mem != root);

/* For good measure, noone higher up the stack should look at it */
sc->current_memcg = NULL;
@@ -2190,7 +2201,7 @@ unsigned long mem_cgroup_shrink_node_zone(struct mem_cgroup *mem,
* will pick up pages from other mem cgroup's as well. We hack
* the priority and make it zero.
*/
- shrink_zone(0, zone, &sc);
+ do_shrink_zone(0, zone, &sc);

trace_mm_vmscan_memcg_softlimit_reclaim_end(sc.nr_reclaimed);

--
1.7.5.1

2011-05-12 14:56:36

by Johannes Weiner

Subject: [rfc patch 4/6] memcg: reclaim statistics

TODO: write proper changelog. Here is an excerpt from
http://lkml.kernel.org/r/[email protected]:

: 1. Limit-triggered direct reclaim
:
: The memory cgroup hits its limit and the task does direct reclaim from
: its own memcg. We probably want statistics for this separately from
: background reclaim to see how successful background reclaim is, the
: same reason we have this separation in the global vmstat as well.
:
: pgscan_direct_limit
: pgfree_direct_limit
:
: 2. Limit-triggered background reclaim
:
: This is the watermark-based asynchronous reclaim that is currently in
: discussion. It's triggered by the memcg breaching its watermark,
: which is relative to its hard limit. I named it kswapd because I
: still think kswapd should do this job, but it is all open for
: discussion, obviously. Treat it as meaning 'background' or
: 'asynchronous'.
:
: pgscan_kswapd_limit
: pgfree_kswapd_limit
:
: 3. Hierarchy-triggered direct reclaim
:
: A condition outside the memcg leads to a task directly reclaiming from
: this memcg. This could be global memory pressure for example, but
: also a parent cgroup hitting its limit. It's probably helpful to
: think of global memory pressure as the root cgroup hitting its
: limit, conceptually. We don't have that yet, but this could be the
: direct softlimit reclaim Ying mentioned above.
:
: pgscan_direct_hierarchy
: pgsteal_direct_hierarchy
:
: 4. Hierarchy-triggered background reclaim
:
: An outside condition leads to kswapd reclaiming from this memcg, like
: kswapd doing softlimit pushback due to global memory pressure.
:
: pgscan_kswapd_hierarchy
: pgsteal_kswapd_hierarchy
:
: ---
:
: With these stats in place, you can see how much pressure there is on
: your memcg hierarchy. This includes machine utilization and whether
: you overcommitted too much on a global level, which shows up as a lot
: of reclaim activity in the hierarchical stats.
:
: With the limit-based stats, you can see the amount of internal
: pressure of memcgs, which shows you if you overcommitted on a local
: level.
:
: And for both cases, you can also see the effectiveness of background
: reclaim by comparing the direct and the kswapd stats.
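
The four cases map onto per-memcg event counters via simple flag
arithmetic in this patch; a condensed illustration of the scheme from
the hunks below:

        #define RECLAIM_RECLAIMED	1  /* pgfree/pgsteal as opposed to pgscan */
        #define RECLAIM_HIERARCHY	2  /* outside pressure as opposed to own limit */
        #define RECLAIM_KSWAPD		4  /* background as opposed to direct reclaim */

        /* Example: kswapd doing hierarchical (softlimit) pushback: */
        unsigned int base = RECLAIM_BASE + RECLAIM_KSWAPD + RECLAIM_HIERARCHY;
        /* base == PGSCAN_KSWAPD_HIERARCHY */
        /* base + RECLAIM_RECLAIMED == PGSTEAL_KSWAPD_HIERARCHY */

mem_cgroup_count_reclaim() then bumps events[base] by the number of
pages scanned and events[base + RECLAIM_RECLAIMED] by the number of
pages reclaimed.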

Signed-off-by: Johannes Weiner <[email protected]>
---
include/linux/memcontrol.h | 9 ++++++
mm/memcontrol.c | 63 ++++++++++++++++++++++++++++++++++++++++++++
mm/vmscan.c | 7 +++++
3 files changed, 79 insertions(+), 0 deletions(-)

diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index 58728c7..a4c84db 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -105,6 +105,8 @@ extern void mem_cgroup_end_migration(struct mem_cgroup *mem,
* For memory reclaim.
*/
void mem_cgroup_hierarchy_walk(struct mem_cgroup *, struct mem_cgroup **);
+void mem_cgroup_count_reclaim(struct mem_cgroup *, bool, bool,
+ unsigned long, unsigned long);
int mem_cgroup_inactive_anon_is_low(struct mem_cgroup *memcg);
int mem_cgroup_inactive_file_is_low(struct mem_cgroup *memcg);
unsigned long mem_cgroup_zone_nr_pages(struct mem_cgroup *memcg,
@@ -296,6 +298,13 @@ static inline void mem_cgroup_hierarchy_walk(struct mem_cgroup *start,
*iter = start;
}

+static inline void mem_cgroup_count_reclaim(struct mem_cgroup *mem,
+ bool kswapd, bool hierarchy,
+ unsigned long scanned,
+ unsigned long reclaimed)
+{
+}
+
static inline int
mem_cgroup_inactive_anon_is_low(struct mem_cgroup *memcg)
{
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index edcd55a..d762706 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -90,10 +90,24 @@ enum mem_cgroup_stat_index {
MEM_CGROUP_STAT_NSTATS,
};

+#define RECLAIM_RECLAIMED 1
+#define RECLAIM_HIERARCHY 2
+#define RECLAIM_KSWAPD 4
+
enum mem_cgroup_events_index {
MEM_CGROUP_EVENTS_PGPGIN, /* # of pages paged in */
MEM_CGROUP_EVENTS_PGPGOUT, /* # of pages paged out */
MEM_CGROUP_EVENTS_COUNT, /* # of pages paged in/out */
+ RECLAIM_BASE,
+ PGSCAN_DIRECT_LIMIT = RECLAIM_BASE,
+ PGFREE_DIRECT_LIMIT = RECLAIM_BASE + RECLAIM_RECLAIMED,
+ PGSCAN_DIRECT_HIERARCHY = RECLAIM_BASE + RECLAIM_HIERARCHY,
+ PGSTEAL_DIRECT_HIERARCHY = RECLAIM_BASE + RECLAIM_HIERARCHY + RECLAIM_RECLAIMED,
+ /* you know the drill... */
+ PGSCAN_KSWAPD_LIMIT,
+ PGFREE_KSWAPD_LIMIT,
+ PGSCAN_KSWAPD_HIERARCHY,
+ PGSTEAL_KSWAPD_HIERARCHY,
MEM_CGROUP_EVENTS_NSTATS,
};
/*
@@ -575,6 +589,23 @@ static void mem_cgroup_swap_statistics(struct mem_cgroup *mem,
this_cpu_add(mem->stat->count[MEM_CGROUP_STAT_SWAPOUT], val);
}

+void mem_cgroup_count_reclaim(struct mem_cgroup *mem,
+ bool kswapd, bool hierarchy,
+ unsigned long scanned, unsigned long reclaimed)
+{
+ unsigned int base = RECLAIM_BASE;
+
+ if (!mem)
+ mem = root_mem_cgroup;
+ if (kswapd)
+ base += RECLAIM_KSWAPD;
+ if (hierarchy)
+ base += RECLAIM_HIERARCHY;
+
+ this_cpu_add(mem->stat->events[base], scanned);
+ this_cpu_add(mem->stat->events[base + RECLAIM_RECLAIMED], reclaimed);
+}
+
static unsigned long mem_cgroup_read_events(struct mem_cgroup *mem,
enum mem_cgroup_events_index idx)
{
@@ -3817,6 +3848,14 @@ enum {
MCS_FILE_MAPPED,
MCS_PGPGIN,
MCS_PGPGOUT,
+ MCS_PGSCAN_DIRECT_LIMIT,
+ MCS_PGFREE_DIRECT_LIMIT,
+ MCS_PGSCAN_DIRECT_HIERARCHY,
+ MCS_PGSTEAL_DIRECT_HIERARCHY,
+ MCS_PGSCAN_KSWAPD_LIMIT,
+ MCS_PGFREE_KSWAPD_LIMIT,
+ MCS_PGSCAN_KSWAPD_HIERARCHY,
+ MCS_PGSTEAL_KSWAPD_HIERARCHY,
MCS_SWAP,
MCS_INACTIVE_ANON,
MCS_ACTIVE_ANON,
@@ -3839,6 +3878,14 @@ struct {
{"mapped_file", "total_mapped_file"},
{"pgpgin", "total_pgpgin"},
{"pgpgout", "total_pgpgout"},
+ {"pgscan_direct_limit", "total_pgscan_direct_limit"},
+ {"pgfree_direct_limit", "total_pgfree_direct_limit"},
+ {"pgscan_direct_hierarchy", "total_pgscan_direct_hierarchy"},
+ {"pgsteal_direct_hierarchy", "total_pgsteal_direct_hierarchy"},
+ {"pgscan_kswapd_limit", "total_pgscan_kswapd_limit"},
+ {"pgfree_kswapd_limit", "total_pgfree_kswapd_limit"},
+ {"pgscan_kswapd_hierarchy", "total_pgscan_kswapd_hierarchy"},
+ {"pgsteal_kswapd_hierarchy", "total_pgsteal_kswapd_hierarchy"},
{"swap", "total_swap"},
{"inactive_anon", "total_inactive_anon"},
{"active_anon", "total_active_anon"},
@@ -3864,6 +3911,22 @@ mem_cgroup_get_local_stat(struct mem_cgroup *mem, struct mcs_total_stat *s)
s->stat[MCS_PGPGIN] += val;
val = mem_cgroup_read_events(mem, MEM_CGROUP_EVENTS_PGPGOUT);
s->stat[MCS_PGPGOUT] += val;
+ val = mem_cgroup_read_events(mem, PGSCAN_DIRECT_LIMIT);
+ s->stat[MCS_PGSCAN_DIRECT_LIMIT] += val;
+ val = mem_cgroup_read_events(mem, PGFREE_DIRECT_LIMIT);
+ s->stat[MCS_PGFREE_DIRECT_LIMIT] += val;
+ val = mem_cgroup_read_events(mem, PGSCAN_DIRECT_HIERARCHY);
+ s->stat[MCS_PGSCAN_DIRECT_HIERARCHY] += val;
+ val = mem_cgroup_read_events(mem, PGSTEAL_DIRECT_HIERARCHY);
+ s->stat[MCS_PGSTEAL_DIRECT_HIERARCHY] += val;
+ val = mem_cgroup_read_events(mem, PGSCAN_KSWAPD_LIMIT);
+ s->stat[MCS_PGSCAN_KSWAPD_LIMIT] += val;
+ val = mem_cgroup_read_events(mem, PGFREE_KSWAPD_LIMIT);
+ s->stat[MCS_PGFREE_KSWAPD_LIMIT] += val;
+ val = mem_cgroup_read_events(mem, PGSCAN_KSWAPD_HIERARCHY);
+ s->stat[MCS_PGSCAN_KSWAPD_HIERARCHY] += val;
+ val = mem_cgroup_read_events(mem, PGSTEAL_KSWAPD_HIERARCHY);
+ s->stat[MCS_PGSTEAL_KSWAPD_HIERARCHY] += val;
if (do_swap_account) {
val = mem_cgroup_read_stat(mem, MEM_CGROUP_STAT_SWAPOUT);
s->stat[MCS_SWAP] += val * PAGE_SIZE;
diff --git a/mm/vmscan.c b/mm/vmscan.c
index e2a3647..0e45ceb 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1961,9 +1961,16 @@ static void shrink_zone(int priority, struct zone *zone,
struct mem_cgroup *mem = NULL;

do {
+ unsigned long reclaimed = sc->nr_reclaimed;
+ unsigned long scanned = sc->nr_scanned;
+
mem_cgroup_hierarchy_walk(root, &mem);
sc->current_memcg = mem;
do_shrink_zone(priority, zone, sc);
+ mem_cgroup_count_reclaim(mem, current_is_kswapd(),
+ mem != root, /* limit or hierarchy? */
+ sc->nr_scanned - scanned,
+ sc->nr_reclaimed - reclaimed);
} while (mem != root);

/* For good measure, noone higher up the stack should look at it */
--
1.7.5.1

2011-05-12 14:55:10

by Johannes Weiner

Subject: [rfc patch 5/6] memcg: remove global LRU list

Since the VM now has means to do global reclaim from the per-memcg lru
lists, the global LRU list is no longer required.

This saves two linked list pointers per page, since all pages are now
on only one list. Also, the memcg LRU lists now link pages directly
instead of page_cgroup descriptors, which gets rid of having to find
the way back from a page_cgroup to its page.

A big change in behaviour is that pages are no longer aged on a global
level. Instead, they are aged with respect to the other pages in the
same memcg, where the aging speed is determined by global memory
pressure and the size of the memcg itself.

[ TO EVALUATE: this should bring more fairness to reclaim in setups
with differently sized memcgs, and distribute pressure proportionally
among memcgs instead of reclaiming only from the one that has the
oldest pages on a global level. There is potential for unfairness if
unused pages are hiding in small memcgs that are never scanned while
reclaim goes only after a single, much bigger memcg. The severity of
this also scales with the number of memcgs relative to the amount of
physical memory, so it again boils down to the question of what the
sane maximum number of memcgs on the system is. ]

The patch introduces an lruvec structure that exists for both global
zones and for each zone per memcg. All lru operations are now done in
generic code, with the memcg lru primitives only doing accounting and
returning the proper lruvec for the currently scanned memcg on
isolation, or for the respective page on putback.

The code that scans and rescues unevictable pages in a specific zone
had to be converted to iterate over all memcgs as well.
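
The resulting pattern for generic LRU code looks roughly like this (a
condensed sketch of the lruvec and add_page_to_lru_list() changes
further down, not additional code):

        struct lruvec {
                struct list_head lists[NR_LRU_LISTS];
        };

        static inline void
        add_page_to_lru_list(struct zone *zone, struct page *page, enum lru_list l)
        {
                struct lruvec *lruvec;

                /* memcg does the accounting and returns the right list set... */
                lruvec = mem_cgroup_lru_add_list(zone, page, l);
                /* ...while the linking itself happens on struct page. */
                list_add(&page->lru, &lruvec->lists[l]);
                __mod_zone_page_state(zone, NR_LRU_BASE + l, hpage_nr_pages(page));
        }

With !CONFIG_CGROUP_MEM_RES_CTLR or memcg disabled,
mem_cgroup_lru_add_list() simply returns &zone->lruvec, so the global
case falls out naturally.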

Signed-off-by: Johannes Weiner <[email protected]>
---
include/linux/memcontrol.h | 52 ++++-----
include/linux/mm_inline.h | 15 ++-
include/linux/mmzone.h | 10 +-
include/linux/page_cgroup.h | 35 ------
mm/memcontrol.c | 251 +++++++++++++++---------------------------
mm/page_alloc.c | 2 +-
mm/page_cgroup.c | 39 +------
mm/swap.c | 20 ++--
mm/vmscan.c | 149 ++++++++++++--------------
9 files changed, 213 insertions(+), 360 deletions(-)

diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index a4c84db..65163c2 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -20,6 +20,7 @@
#ifndef _LINUX_MEMCONTROL_H
#define _LINUX_MEMCONTROL_H
#include <linux/cgroup.h>
+#include <linux/mmzone.h>
struct mem_cgroup;
struct page_cgroup;
struct page;
@@ -30,13 +31,6 @@ enum mem_cgroup_page_stat_item {
MEMCG_NR_FILE_MAPPED, /* # of pages charged as file rss */
};

-extern unsigned long mem_cgroup_isolate_pages(unsigned long nr_to_scan,
- struct list_head *dst,
- unsigned long *scanned, int order,
- int mode, struct zone *z,
- struct mem_cgroup *mem_cont,
- int active, int file);
-
#ifdef CONFIG_CGROUP_MEM_RES_CTLR
/*
* All "charge" functions with gfp_mask should use GFP_KERNEL or
@@ -60,13 +54,13 @@ extern void mem_cgroup_cancel_charge_swapin(struct mem_cgroup *ptr);

extern int mem_cgroup_cache_charge(struct page *page, struct mm_struct *mm,
gfp_t gfp_mask);
-extern void mem_cgroup_add_lru_list(struct page *page, enum lru_list lru);
-extern void mem_cgroup_del_lru_list(struct page *page, enum lru_list lru);
-extern void mem_cgroup_rotate_reclaimable_page(struct page *page);
-extern void mem_cgroup_rotate_lru_list(struct page *page, enum lru_list lru);
-extern void mem_cgroup_del_lru(struct page *page);
-extern void mem_cgroup_move_lists(struct page *page,
- enum lru_list from, enum lru_list to);
+struct lruvec *mem_cgroup_zone_lruvec(struct zone *, struct mem_cgroup *);
+struct lruvec *mem_cgroup_lru_add_list(struct zone *, struct page *,
+ enum lru_list);
+void mem_cgroup_lru_del_list(struct zone *, struct page *, enum lru_list);
+void mem_cgroup_lru_del(struct zone *, struct page *);
+struct lruvec *mem_cgroup_lru_move_lists(struct zone *, struct page *,
+ enum lru_list, enum lru_list);

/* For coalescing uncharge for reducing memcg' overhead*/
extern void mem_cgroup_uncharge_start(void);
@@ -210,33 +204,35 @@ static inline int mem_cgroup_shmem_charge_fallback(struct page *page,
return 0;
}

-static inline void mem_cgroup_add_lru_list(struct page *page, int lru)
+static inline struct lruvec *mem_cgroup_zone_lruvec(struct zone *zone,
+ struct mem_cgroup *mem)
{
+ return &zone->lruvec;
}

-static inline void mem_cgroup_del_lru_list(struct page *page, int lru)
+static inline struct lruvec *mem_cgroup_lru_add_list(struct zone *zone,
+ struct page *page,
+ enum lru_list lru)
{
- return ;
+ return &zone->lruvec;
}

-static inline void mem_cgroup_rotate_reclaimable_page(struct page *page)
+static inline void mem_cgroup_lru_del_list(struct zone *zone,
+ struct page *page,
+ enum lru_list lru)
{
- return ;
}

-static inline void mem_cgroup_rotate_lru_list(struct page *page, int lru)
+static inline void mem_cgroup_lru_del(struct zone *zone, struct page *page)
{
- return ;
}

-static inline void mem_cgroup_del_lru(struct page *page)
-{
- return ;
-}
-
-static inline void
-mem_cgroup_move_lists(struct page *page, enum lru_list from, enum lru_list to)
+static inline struct lruvec *mem_cgroup_lru_move_lists(struct zone *zone,
+ struct page *page,
+ enum lru_list from,
+ enum lru_list to)
{
+ return &zone->lruvec;
}

static inline struct mem_cgroup *try_get_mem_cgroup_from_page(struct page *page)
diff --git a/include/linux/mm_inline.h b/include/linux/mm_inline.h
index 8f7d247..ca794f3 100644
--- a/include/linux/mm_inline.h
+++ b/include/linux/mm_inline.h
@@ -25,23 +25,28 @@ static inline void
__add_page_to_lru_list(struct zone *zone, struct page *page, enum lru_list l,
struct list_head *head)
{
+ /* NOTE! Caller must ensure @head is on the right lruvec! */
+ mem_cgroup_lru_add_list(zone, page, l);
list_add(&page->lru, head);
__mod_zone_page_state(zone, NR_LRU_BASE + l, hpage_nr_pages(page));
- mem_cgroup_add_lru_list(page, l);
}

static inline void
add_page_to_lru_list(struct zone *zone, struct page *page, enum lru_list l)
{
- __add_page_to_lru_list(zone, page, l, &zone->lru[l].list);
+ struct lruvec *lruvec;
+
+ lruvec = mem_cgroup_lru_add_list(zone, page, l);
+ list_add(&page->lru, &lruvec->lists[l]);
+ __mod_zone_page_state(zone, NR_LRU_BASE + l, hpage_nr_pages(page));
}

static inline void
del_page_from_lru_list(struct zone *zone, struct page *page, enum lru_list l)
{
+ mem_cgroup_lru_del_list(zone, page, l);
list_del(&page->lru);
__mod_zone_page_state(zone, NR_LRU_BASE + l, -hpage_nr_pages(page));
- mem_cgroup_del_lru_list(page, l);
}

/**
@@ -64,7 +69,6 @@ del_page_from_lru(struct zone *zone, struct page *page)
{
enum lru_list l;

- list_del(&page->lru);
if (PageUnevictable(page)) {
__ClearPageUnevictable(page);
l = LRU_UNEVICTABLE;
@@ -75,8 +79,9 @@ del_page_from_lru(struct zone *zone, struct page *page)
l += LRU_ACTIVE;
}
}
+ mem_cgroup_lru_del_list(zone, page, l);
+ list_del(&page->lru);
__mod_zone_page_state(zone, NR_LRU_BASE + l, -hpage_nr_pages(page));
- mem_cgroup_del_lru_list(page, l);
}

/**
diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index e56f835..c2ddce5 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -158,6 +158,10 @@ static inline int is_unevictable_lru(enum lru_list l)
return (l == LRU_UNEVICTABLE);
}

+struct lruvec {
+ struct list_head lists[NR_LRU_LISTS];
+};
+
enum zone_watermarks {
WMARK_MIN,
WMARK_LOW,
@@ -344,10 +348,8 @@ struct zone {
ZONE_PADDING(_pad1_)

/* Fields commonly accessed by the page reclaim scanner */
- spinlock_t lru_lock;
- struct zone_lru {
- struct list_head list;
- } lru[NR_LRU_LISTS];
+ spinlock_t lru_lock;
+ struct lruvec lruvec;

struct zone_reclaim_stat reclaim_stat;

diff --git a/include/linux/page_cgroup.h b/include/linux/page_cgroup.h
index 961ecc7..2e7cbc5 100644
--- a/include/linux/page_cgroup.h
+++ b/include/linux/page_cgroup.h
@@ -31,7 +31,6 @@ enum {
struct page_cgroup {
unsigned long flags;
struct mem_cgroup *mem_cgroup;
- struct list_head lru; /* per cgroup LRU list */
};

void __meminit pgdat_page_cgroup_init(struct pglist_data *pgdat);
@@ -49,7 +48,6 @@ static inline void __init page_cgroup_init(void)
#endif

struct page_cgroup *lookup_page_cgroup(struct page *page);
-struct page *lookup_cgroup_page(struct page_cgroup *pc);

#define TESTPCGFLAG(uname, lname) \
static inline int PageCgroup##uname(struct page_cgroup *pc) \
@@ -122,39 +120,6 @@ static inline void move_unlock_page_cgroup(struct page_cgroup *pc,
local_irq_restore(*flags);
}

-#ifdef CONFIG_SPARSEMEM
-#define PCG_ARRAYID_WIDTH SECTIONS_SHIFT
-#else
-#define PCG_ARRAYID_WIDTH NODES_SHIFT
-#endif
-
-#if (PCG_ARRAYID_WIDTH > BITS_PER_LONG - NR_PCG_FLAGS)
-#error Not enough space left in pc->flags to store page_cgroup array IDs
-#endif
-
-/* pc->flags: ARRAY-ID | FLAGS */
-
-#define PCG_ARRAYID_MASK ((1UL << PCG_ARRAYID_WIDTH) - 1)
-
-#define PCG_ARRAYID_OFFSET (BITS_PER_LONG - PCG_ARRAYID_WIDTH)
-/*
- * Zero the shift count for non-existent fields, to prevent compiler
- * warnings and ensure references are optimized away.
- */
-#define PCG_ARRAYID_SHIFT (PCG_ARRAYID_OFFSET * (PCG_ARRAYID_WIDTH != 0))
-
-static inline void set_page_cgroup_array_id(struct page_cgroup *pc,
- unsigned long id)
-{
- pc->flags &= ~(PCG_ARRAYID_MASK << PCG_ARRAYID_SHIFT);
- pc->flags |= (id & PCG_ARRAYID_MASK) << PCG_ARRAYID_SHIFT;
-}
-
-static inline unsigned long page_cgroup_array_id(struct page_cgroup *pc)
-{
- return (pc->flags >> PCG_ARRAYID_SHIFT) & PCG_ARRAYID_MASK;
-}
-
#else /* CONFIG_CGROUP_MEM_RES_CTLR */
struct page_cgroup;

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index d762706..f5d90ba 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -134,10 +134,7 @@ struct mem_cgroup_stat_cpu {
* per-zone information in memory controller.
*/
struct mem_cgroup_per_zone {
- /*
- * spin_lock to protect the per cgroup LRU
- */
- struct list_head lists[NR_LRU_LISTS];
+ struct lruvec lruvec;
unsigned long count[NR_LRU_LISTS];

struct zone_reclaim_stat reclaim_stat;
@@ -834,6 +831,24 @@ static inline bool mem_cgroup_is_root(struct mem_cgroup *mem)
return (mem == root_mem_cgroup);
}

+struct lruvec *mem_cgroup_zone_lruvec(struct zone *zone, struct mem_cgroup *mem)
+{
+ struct mem_cgroup_per_zone *mz;
+ int nid, zid;
+
+ /* Pages are on the zone's own lru lists */
+ if (mem_cgroup_disabled())
+ return &zone->lruvec;
+
+ if (!mem)
+ mem = root_mem_cgroup;
+
+ nid = zone_to_nid(zone);
+ zid = zone_idx(zone);
+ mz = mem_cgroup_zoneinfo(mem, nid, zid);
+ return &mz->lruvec;
+}
+
/*
* Following LRU functions are allowed to be used without PCG_LOCK.
* Operations are called by routine of global LRU independently from memcg.
@@ -848,10 +863,43 @@ static inline bool mem_cgroup_is_root(struct mem_cgroup *mem)
* When moving account, the page is not on LRU. It's isolated.
*/

-void mem_cgroup_del_lru_list(struct page *page, enum lru_list lru)
+struct lruvec *mem_cgroup_lru_add_list(struct zone *zone, struct page *page,
+ enum lru_list lru)
{
+ struct mem_cgroup_per_zone *mz;
struct page_cgroup *pc;
+ struct mem_cgroup *mem;
+
+ if (mem_cgroup_disabled())
+ return &zone->lruvec;
+
+ pc = lookup_page_cgroup(page);
+ VM_BUG_ON(PageCgroupAcctLRU(pc));
+ if (PageCgroupUsed(pc)) {
+ /* Ensure pc->mem_cgroup is visible after reading PCG_USED. */
+ smp_rmb();
+ mem = pc->mem_cgroup;
+ } else {
+ /*
+ * If the page is uncharged, add it to the root's lru.
+ * Either it will be freed soon, or it will get
+ * charged again and the charger will relink it to the
+ * right list.
+ */
+ mem = root_mem_cgroup;
+ }
+ mz = page_cgroup_zoneinfo(mem, page);
+ /* huge page split is done under lru_lock. so, we have no races. */
+ MEM_CGROUP_ZSTAT(mz, lru) += 1 << compound_order(page);
+ SetPageCgroupAcctLRU(pc);
+ return &mz->lruvec;
+}
+
+void mem_cgroup_lru_del_list(struct zone *zone, struct page *page,
+ enum lru_list lru)
+{
struct mem_cgroup_per_zone *mz;
+ struct page_cgroup *pc;

if (mem_cgroup_disabled())
return;
@@ -867,83 +915,21 @@ void mem_cgroup_del_lru_list(struct page *page, enum lru_list lru)
mz = page_cgroup_zoneinfo(pc->mem_cgroup, page);
/* huge page split is done under lru_lock. so, we have no races. */
MEM_CGROUP_ZSTAT(mz, lru) -= 1 << compound_order(page);
- if (mem_cgroup_is_root(pc->mem_cgroup))
- return;
- VM_BUG_ON(list_empty(&pc->lru));
- list_del_init(&pc->lru);
}

-void mem_cgroup_del_lru(struct page *page)
+void mem_cgroup_lru_del(struct zone *zone, struct page *page)
{
- mem_cgroup_del_lru_list(page, page_lru(page));
+ mem_cgroup_lru_del_list(zone, page, page_lru(page));
}

-/*
- * Writeback is about to end against a page which has been marked for immediate
- * reclaim. If it still appears to be reclaimable, move it to the tail of the
- * inactive list.
- */
-void mem_cgroup_rotate_reclaimable_page(struct page *page)
+struct lruvec *mem_cgroup_lru_move_lists(struct zone *zone,
+ struct page *page,
+ enum lru_list from,
+ enum lru_list to)
{
- struct mem_cgroup_per_zone *mz;
- struct page_cgroup *pc;
- enum lru_list lru = page_lru(page);
-
- if (mem_cgroup_disabled())
- return;
-
- pc = lookup_page_cgroup(page);
- /* unused or root page is not rotated. */
- if (!PageCgroupUsed(pc))
- return;
- /* Ensure pc->mem_cgroup is visible after reading PCG_USED. */
- smp_rmb();
- if (mem_cgroup_is_root(pc->mem_cgroup))
- return;
- mz = page_cgroup_zoneinfo(pc->mem_cgroup, page);
- list_move_tail(&pc->lru, &mz->lists[lru]);
-}
-
-void mem_cgroup_rotate_lru_list(struct page *page, enum lru_list lru)
-{
- struct mem_cgroup_per_zone *mz;
- struct page_cgroup *pc;
-
- if (mem_cgroup_disabled())
- return;
-
- pc = lookup_page_cgroup(page);
- /* unused or root page is not rotated. */
- if (!PageCgroupUsed(pc))
- return;
- /* Ensure pc->mem_cgroup is visible after reading PCG_USED. */
- smp_rmb();
- if (mem_cgroup_is_root(pc->mem_cgroup))
- return;
- mz = page_cgroup_zoneinfo(pc->mem_cgroup, page);
- list_move(&pc->lru, &mz->lists[lru]);
-}
-
-void mem_cgroup_add_lru_list(struct page *page, enum lru_list lru)
-{
- struct page_cgroup *pc;
- struct mem_cgroup_per_zone *mz;
-
- if (mem_cgroup_disabled())
- return;
- pc = lookup_page_cgroup(page);
- VM_BUG_ON(PageCgroupAcctLRU(pc));
- if (!PageCgroupUsed(pc))
- return;
- /* Ensure pc->mem_cgroup is visible after reading PCG_USED. */
- smp_rmb();
- mz = page_cgroup_zoneinfo(pc->mem_cgroup, page);
- /* huge page split is done under lru_lock. so, we have no races. */
- MEM_CGROUP_ZSTAT(mz, lru) += 1 << compound_order(page);
- SetPageCgroupAcctLRU(pc);
- if (mem_cgroup_is_root(pc->mem_cgroup))
- return;
- list_add(&pc->lru, &mz->lists[lru]);
+ /* TODO: could be optimized, especially if from == to */
+ mem_cgroup_lru_del_list(zone, page, from);
+ return mem_cgroup_lru_add_list(zone, page, to);
}

/*
@@ -975,7 +961,7 @@ static void mem_cgroup_lru_del_before_commit(struct page *page)
* is guarded by lock_page() because the page is SwapCache.
*/
if (!PageCgroupUsed(pc))
- mem_cgroup_del_lru_list(page, page_lru(page));
+ del_page_from_lru(zone, page);
spin_unlock_irqrestore(&zone->lru_lock, flags);
}

@@ -989,22 +975,11 @@ static void mem_cgroup_lru_add_after_commit(struct page *page)
if (likely(!PageLRU(page)))
return;
spin_lock_irqsave(&zone->lru_lock, flags);
- /* link when the page is linked to LRU but page_cgroup isn't */
if (PageLRU(page) && !PageCgroupAcctLRU(pc))
- mem_cgroup_add_lru_list(page, page_lru(page));
+ add_page_to_lru_list(zone, page, page_lru(page));
spin_unlock_irqrestore(&zone->lru_lock, flags);
}

-
-void mem_cgroup_move_lists(struct page *page,
- enum lru_list from, enum lru_list to)
-{
- if (mem_cgroup_disabled())
- return;
- mem_cgroup_del_lru_list(page, from);
- mem_cgroup_add_lru_list(page, to);
-}
-
int task_in_mem_cgroup(struct task_struct *task, const struct mem_cgroup *mem)
{
int ret;
@@ -1063,6 +1038,9 @@ int mem_cgroup_inactive_anon_is_low(struct mem_cgroup *memcg)
unsigned long present_pages[2];
unsigned long inactive_ratio;

+ if (!memcg)
+ memcg = root_mem_cgroup;
+
inactive_ratio = calc_inactive_ratio(memcg, present_pages);

inactive = present_pages[0];
@@ -1079,6 +1057,9 @@ int mem_cgroup_inactive_file_is_low(struct mem_cgroup *memcg)
unsigned long active;
unsigned long inactive;

+ if (!memcg)
+ memcg = root_mem_cgroup;
+
inactive = mem_cgroup_get_local_zonestat(memcg, LRU_INACTIVE_FILE);
active = mem_cgroup_get_local_zonestat(memcg, LRU_ACTIVE_FILE);

@@ -1091,8 +1072,12 @@ unsigned long mem_cgroup_zone_nr_pages(struct mem_cgroup *memcg,
{
int nid = zone_to_nid(zone);
int zid = zone_idx(zone);
- struct mem_cgroup_per_zone *mz = mem_cgroup_zoneinfo(memcg, nid, zid);
+ struct mem_cgroup_per_zone *mz;
+
+ if (!memcg)
+ memcg = root_mem_cgroup;

+ mz = mem_cgroup_zoneinfo(memcg, nid, zid);
return MEM_CGROUP_ZSTAT(mz, lru);
}

@@ -1101,8 +1086,12 @@ struct zone_reclaim_stat *mem_cgroup_get_reclaim_stat(struct mem_cgroup *memcg,
{
int nid = zone_to_nid(zone);
int zid = zone_idx(zone);
- struct mem_cgroup_per_zone *mz = mem_cgroup_zoneinfo(memcg, nid, zid);
+ struct mem_cgroup_per_zone *mz;
+
+ if (!memcg)
+ memcg = root_mem_cgroup;

+ mz = mem_cgroup_zoneinfo(memcg, nid, zid);
return &mz->reclaim_stat;
}

@@ -1124,67 +1113,6 @@ mem_cgroup_get_reclaim_stat_from_page(struct page *page)
return &mz->reclaim_stat;
}

-unsigned long mem_cgroup_isolate_pages(unsigned long nr_to_scan,
- struct list_head *dst,
- unsigned long *scanned, int order,
- int mode, struct zone *z,
- struct mem_cgroup *mem_cont,
- int active, int file)
-{
- unsigned long nr_taken = 0;
- struct page *page;
- unsigned long scan;
- LIST_HEAD(pc_list);
- struct list_head *src;
- struct page_cgroup *pc, *tmp;
- int nid = zone_to_nid(z);
- int zid = zone_idx(z);
- struct mem_cgroup_per_zone *mz;
- int lru = LRU_FILE * file + active;
- int ret;
-
- BUG_ON(!mem_cont);
- mz = mem_cgroup_zoneinfo(mem_cont, nid, zid);
- src = &mz->lists[lru];
-
- scan = 0;
- list_for_each_entry_safe_reverse(pc, tmp, src, lru) {
- if (scan >= nr_to_scan)
- break;
-
- if (unlikely(!PageCgroupUsed(pc)))
- continue;
-
- page = lookup_cgroup_page(pc);
-
- if (unlikely(!PageLRU(page)))
- continue;
-
- scan++;
- ret = __isolate_lru_page(page, mode, file);
- switch (ret) {
- case 0:
- list_move(&page->lru, dst);
- mem_cgroup_del_lru(page);
- nr_taken += hpage_nr_pages(page);
- break;
- case -EBUSY:
- /* we don't affect global LRU but rotate in our LRU */
- mem_cgroup_rotate_lru_list(page, page_lru(page));
- break;
- default:
- break;
- }
- }
-
- *scanned = scan;
-
- trace_mm_vmscan_memcg_isolate(0, nr_to_scan, scan, nr_taken,
- 0, 0, 0, mode);
-
- return nr_taken;
-}
-
#define mem_cgroup_from_res_counter(counter, member) \
container_of(counter, struct mem_cgroup, member)

@@ -3458,22 +3386,23 @@ unsigned long mem_cgroup_soft_limit_reclaim(struct zone *zone, int order,
static int mem_cgroup_force_empty_list(struct mem_cgroup *mem,
int node, int zid, enum lru_list lru)
{
- struct zone *zone;
struct mem_cgroup_per_zone *mz;
- struct page_cgroup *pc, *busy;
unsigned long flags, loop;
struct list_head *list;
+ struct page *busy;
+ struct zone *zone;
int ret = 0;

zone = &NODE_DATA(node)->node_zones[zid];
mz = mem_cgroup_zoneinfo(mem, node, zid);
- list = &mz->lists[lru];
+ list = &mz->lruvec.lists[lru];

loop = MEM_CGROUP_ZSTAT(mz, lru);
/* give some margin against EBUSY etc...*/
loop += 256;
busy = NULL;
while (loop--) {
+ struct page_cgroup *pc;
struct page *page;

ret = 0;
@@ -3482,16 +3411,16 @@ static int mem_cgroup_force_empty_list(struct mem_cgroup *mem,
spin_unlock_irqrestore(&zone->lru_lock, flags);
break;
}
- pc = list_entry(list->prev, struct page_cgroup, lru);
- if (busy == pc) {
- list_move(&pc->lru, list);
+ page = list_entry(list->prev, struct page, lru);
+ if (busy == page) {
+ list_move(&page->lru, list);
busy = NULL;
spin_unlock_irqrestore(&zone->lru_lock, flags);
continue;
}
spin_unlock_irqrestore(&zone->lru_lock, flags);

- page = lookup_cgroup_page(pc);
+ pc = lookup_page_cgroup(page);

ret = mem_cgroup_move_parent(page, pc, mem, GFP_KERNEL);
if (ret == -ENOMEM)
@@ -3499,7 +3428,7 @@ static int mem_cgroup_force_empty_list(struct mem_cgroup *mem,

if (ret == -EBUSY || ret == -EINVAL) {
/* found lock contention or "pc" is obsolete. */
- busy = pc;
+ busy = page;
cond_resched();
} else
busy = NULL;
@@ -4519,7 +4448,7 @@ static int alloc_mem_cgroup_per_zone_info(struct mem_cgroup *mem, int node)
for (zone = 0; zone < MAX_NR_ZONES; zone++) {
mz = &pn->zoneinfo[zone];
for_each_lru(l)
- INIT_LIST_HEAD(&mz->lists[l]);
+ INIT_LIST_HEAD(&mz->lruvec.lists[l]);
mz->usage_in_excess = 0;
mz->on_tree = false;
mz->mem = mem;
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 9f8a97b..4099e8c 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -4262,7 +4262,7 @@ static void __paginginit free_area_init_core(struct pglist_data *pgdat,

zone_pcp_init(zone);
for_each_lru(l) {
- INIT_LIST_HEAD(&zone->lru[l].list);
+ INIT_LIST_HEAD(&zone->lruvec.lists[l]);
zone->reclaim_stat.nr_saved_scan[l] = 0;
}
zone->reclaim_stat.recent_rotated[0] = 0;
diff --git a/mm/page_cgroup.c b/mm/page_cgroup.c
index 9905501..313e1d7 100644
--- a/mm/page_cgroup.c
+++ b/mm/page_cgroup.c
@@ -11,12 +11,10 @@
#include <linux/swapops.h>
#include <linux/kmemleak.h>

-static void __meminit init_page_cgroup(struct page_cgroup *pc, unsigned long id)
+static void __meminit init_page_cgroup(struct page_cgroup *pc)
{
pc->flags = 0;
- set_page_cgroup_array_id(pc, id);
pc->mem_cgroup = NULL;
- INIT_LIST_HEAD(&pc->lru);
}
static unsigned long total_usage;

@@ -42,19 +40,6 @@ struct page_cgroup *lookup_page_cgroup(struct page *page)
return base + offset;
}

-struct page *lookup_cgroup_page(struct page_cgroup *pc)
-{
- unsigned long pfn;
- struct page *page;
- pg_data_t *pgdat;
-
- pgdat = NODE_DATA(page_cgroup_array_id(pc));
- pfn = pc - pgdat->node_page_cgroup + pgdat->node_start_pfn;
- page = pfn_to_page(pfn);
- VM_BUG_ON(pc != lookup_page_cgroup(page));
- return page;
-}
-
static int __init alloc_node_page_cgroup(int nid)
{
struct page_cgroup *base, *pc;
@@ -75,7 +60,7 @@ static int __init alloc_node_page_cgroup(int nid)
return -ENOMEM;
for (index = 0; index < nr_pages; index++) {
pc = base + index;
- init_page_cgroup(pc, nid);
+ init_page_cgroup(pc);
}
NODE_DATA(nid)->node_page_cgroup = base;
total_usage += table_size;
@@ -117,19 +102,6 @@ struct page_cgroup *lookup_page_cgroup(struct page *page)
return section->page_cgroup + pfn;
}

-struct page *lookup_cgroup_page(struct page_cgroup *pc)
-{
- struct mem_section *section;
- struct page *page;
- unsigned long nr;
-
- nr = page_cgroup_array_id(pc);
- section = __nr_to_section(nr);
- page = pfn_to_page(pc - section->page_cgroup);
- VM_BUG_ON(pc != lookup_page_cgroup(page));
- return page;
-}
-
static void *__init_refok alloc_page_cgroup(size_t size, int nid)
{
void *addr = NULL;
@@ -167,12 +139,9 @@ static int __init_refok init_section_page_cgroup(unsigned long pfn)
struct page_cgroup *base, *pc;
struct mem_section *section;
unsigned long table_size;
- unsigned long nr;
int nid, index;

- nr = pfn_to_section_nr(pfn);
- section = __nr_to_section(nr);
-
+ section = __pfn_to_section(pfn);
if (section->page_cgroup)
return 0;

@@ -194,7 +163,7 @@ static int __init_refok init_section_page_cgroup(unsigned long pfn)

for (index = 0; index < PAGES_PER_SECTION; index++) {
pc = base + index;
- init_page_cgroup(pc, nr);
+ init_page_cgroup(pc);
}

section->page_cgroup = base - pfn;
diff --git a/mm/swap.c b/mm/swap.c
index a448db3..12095a0 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -209,12 +209,14 @@ static void pagevec_lru_move_fn(struct pagevec *pvec,
static void pagevec_move_tail_fn(struct page *page, void *arg)
{
int *pgmoved = arg;
- struct zone *zone = page_zone(page);

if (PageLRU(page) && !PageActive(page) && !PageUnevictable(page)) {
enum lru_list lru = page_lru_base_type(page);
- list_move_tail(&page->lru, &zone->lru[lru].list);
- mem_cgroup_rotate_reclaimable_page(page);
+ struct lruvec *lruvec;
+
+ lruvec = mem_cgroup_lru_move_lists(page_zone(page),
+ page, lru, lru);
+ list_move_tail(&page->lru, &lruvec->lists[lru]);
(*pgmoved)++;
}
}
@@ -417,12 +419,13 @@ static void lru_deactivate_fn(struct page *page, void *arg)
*/
SetPageReclaim(page);
} else {
+ struct lruvec *lruvec;
/*
* The page's writeback ends up during pagevec
* We moves tha page into tail of inactive.
*/
- list_move_tail(&page->lru, &zone->lru[lru].list);
- mem_cgroup_rotate_reclaimable_page(page);
+ lruvec = mem_cgroup_lru_move_lists(zone, page, lru, lru);
+ list_move_tail(&page->lru, &lruvec->lists[lru]);
__count_vm_event(PGROTATED);
}

@@ -594,7 +597,6 @@ void lru_add_page_tail(struct zone* zone,
int active;
enum lru_list lru;
const int file = 0;
- struct list_head *head;

VM_BUG_ON(!PageHead(page));
VM_BUG_ON(PageCompound(page_tail));
@@ -614,10 +616,10 @@ void lru_add_page_tail(struct zone* zone,
}
update_page_reclaim_stat(zone, page_tail, file, active);
if (likely(PageLRU(page)))
- head = page->lru.prev;
+ __add_page_to_lru_list(zone, page_tail, lru,
+ page->lru.prev);
else
- head = &zone->lru[lru].list;
- __add_page_to_lru_list(zone, page_tail, lru, head);
+ add_page_to_lru_list(zone, page_tail, lru);
} else {
SetPageUnevictable(page_tail);
add_page_to_lru_list(zone, page_tail, LRU_UNEVICTABLE);
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 0e45ceb..0381a5d 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -162,34 +162,27 @@ static bool global_reclaim(struct scan_control *sc)
{
return !sc->memcg;
}
-static bool scanning_global_lru(struct scan_control *sc)
-{
- return !sc->current_memcg;
-}
#else
static bool global_reclaim(struct scan_control *sc) { return 1; }
-static bool scanning_global_lru(struct scan_control *sc) { return 1; }
#endif

static struct zone_reclaim_stat *get_reclaim_stat(struct zone *zone,
struct scan_control *sc)
{
- if (!scanning_global_lru(sc))
- return mem_cgroup_get_reclaim_stat(sc->current_memcg, zone);
-
- return &zone->reclaim_stat;
+ if (mem_cgroup_disabled())
+ return &zone->reclaim_stat;
+ return mem_cgroup_get_reclaim_stat(sc->current_memcg, zone);
}

static unsigned long zone_nr_lru_pages(struct zone *zone,
- struct scan_control *sc, enum lru_list lru)
+ struct scan_control *sc,
+ enum lru_list lru)
{
- if (!scanning_global_lru(sc))
- return mem_cgroup_zone_nr_pages(sc->current_memcg, zone, lru);
-
- return zone_page_state(zone, NR_LRU_BASE + lru);
+ if (mem_cgroup_disabled())
+ return zone_page_state(zone, NR_LRU_BASE + lru);
+ return mem_cgroup_zone_nr_pages(sc->current_memcg, zone, lru);
}

-
/*
* Add a shrinker callback to be called from the vm
*/
@@ -1055,15 +1048,14 @@ static unsigned long isolate_lru_pages(unsigned long nr_to_scan,

switch (__isolate_lru_page(page, mode, file)) {
case 0:
+ mem_cgroup_lru_del(page_zone(page), page);
list_move(&page->lru, dst);
- mem_cgroup_del_lru(page);
nr_taken += hpage_nr_pages(page);
break;

case -EBUSY:
/* else it is being freed elsewhere */
list_move(&page->lru, src);
- mem_cgroup_rotate_lru_list(page, page_lru(page));
continue;

default:
@@ -1113,8 +1105,9 @@ static unsigned long isolate_lru_pages(unsigned long nr_to_scan,
break;

if (__isolate_lru_page(cursor_page, mode, file) == 0) {
+ mem_cgroup_lru_del(page_zone(cursor_page),
+ cursor_page);
list_move(&cursor_page->lru, dst);
- mem_cgroup_del_lru(cursor_page);
nr_taken += hpage_nr_pages(page);
nr_lumpy_taken++;
if (PageDirty(cursor_page))
@@ -1143,19 +1136,22 @@ static unsigned long isolate_lru_pages(unsigned long nr_to_scan,
return nr_taken;
}

-static unsigned long isolate_pages_global(unsigned long nr,
- struct list_head *dst,
- unsigned long *scanned, int order,
- int mode, struct zone *z,
- int active, int file)
+static unsigned long isolate_pages(unsigned long nr,
+ struct list_head *dst,
+ unsigned long *scanned, int order,
+ int mode, struct zone *z,
+ int active, int file,
+ struct mem_cgroup *mem)
{
+ struct lruvec *lruvec = mem_cgroup_zone_lruvec(z, mem);
int lru = LRU_BASE;
+
if (active)
lru += LRU_ACTIVE;
if (file)
lru += LRU_FILE;
- return isolate_lru_pages(nr, &z->lru[lru].list, dst, scanned, order,
- mode, file);
+ return isolate_lru_pages(nr, &lruvec->lists[lru], dst,
+ scanned, order, mode, file);
}

/*
@@ -1403,20 +1399,11 @@ shrink_inactive_list(unsigned long nr_to_scan, struct zone *zone,
lru_add_drain();
spin_lock_irq(&zone->lru_lock);

- if (scanning_global_lru(sc)) {
- nr_taken = isolate_pages_global(nr_to_scan,
- &page_list, &nr_scanned, sc->order,
- sc->reclaim_mode & RECLAIM_MODE_LUMPYRECLAIM ?
- ISOLATE_BOTH : ISOLATE_INACTIVE,
- zone, 0, file);
- } else {
- nr_taken = mem_cgroup_isolate_pages(nr_to_scan,
+ nr_taken = isolate_pages(nr_to_scan,
&page_list, &nr_scanned, sc->order,
sc->reclaim_mode & RECLAIM_MODE_LUMPYRECLAIM ?
ISOLATE_BOTH : ISOLATE_INACTIVE,
- zone, sc->current_memcg,
- 0, file);
- }
+ zone, 0, file, sc->current_memcg);

if (global_reclaim(sc)) {
zone->pages_scanned += nr_scanned;
@@ -1491,13 +1478,15 @@ static void move_active_pages_to_lru(struct zone *zone,
pagevec_init(&pvec, 1);

while (!list_empty(list)) {
+ struct lruvec *lruvec;
+
page = lru_to_page(list);

VM_BUG_ON(PageLRU(page));
SetPageLRU(page);

- list_move(&page->lru, &zone->lru[lru].list);
- mem_cgroup_add_lru_list(page, lru);
+ lruvec = mem_cgroup_lru_add_list(zone, page, lru);
+ list_move(&page->lru, &lruvec->lists[lru]);
pgmoved += hpage_nr_pages(page);

if (!pagevec_add(&pvec, page) || list_empty(list)) {
@@ -1528,17 +1517,10 @@ static void shrink_active_list(unsigned long nr_pages, struct zone *zone,

lru_add_drain();
spin_lock_irq(&zone->lru_lock);
- if (scanning_global_lru(sc)) {
- nr_taken = isolate_pages_global(nr_pages, &l_hold,
- &pgscanned, sc->order,
- ISOLATE_ACTIVE, zone,
- 1, file);
- } else {
- nr_taken = mem_cgroup_isolate_pages(nr_pages, &l_hold,
- &pgscanned, sc->order,
- ISOLATE_ACTIVE, zone,
- sc->current_memcg, 1, file);
- }
+ nr_taken = isolate_pages(nr_pages, &l_hold,
+ &pgscanned, sc->order,
+ ISOLATE_ACTIVE, zone,
+ 1, file, sc->current_memcg);

if (global_reclaim(sc))
zone->pages_scanned += pgscanned;
@@ -1628,8 +1610,6 @@ static int inactive_anon_is_low_global(struct zone *zone)
*/
static int inactive_anon_is_low(struct zone *zone, struct scan_control *sc)
{
- int low;
-
/*
* If we don't have swap space, anonymous page deactivation
* is pointless.
@@ -1637,11 +1617,9 @@ static int inactive_anon_is_low(struct zone *zone, struct scan_control *sc)
if (!total_swap_pages)
return 0;

- if (scanning_global_lru(sc))
- low = inactive_anon_is_low_global(zone);
- else
- low = mem_cgroup_inactive_anon_is_low(sc->current_memcg);
- return low;
+ if (mem_cgroup_disabled())
+ return inactive_anon_is_low_global(zone);
+ return mem_cgroup_inactive_anon_is_low(sc->current_memcg);
}
#else
static inline int inactive_anon_is_low(struct zone *zone,
@@ -1678,13 +1656,9 @@ static int inactive_file_is_low_global(struct zone *zone)
*/
static int inactive_file_is_low(struct zone *zone, struct scan_control *sc)
{
- int low;
-
- if (scanning_global_lru(sc))
- low = inactive_file_is_low_global(zone);
- else
- low = mem_cgroup_inactive_file_is_low(sc->current_memcg);
- return low;
+ if (mem_cgroup_disabled())
+ return inactive_file_is_low_global(zone);
+ return mem_cgroup_inactive_file_is_low(sc->current_memcg);
}

static int inactive_list_is_low(struct zone *zone, struct scan_control *sc,
@@ -3161,16 +3135,18 @@ int page_evictable(struct page *page, struct vm_area_struct *vma)
*/
static void check_move_unevictable_page(struct page *page, struct zone *zone)
{
- VM_BUG_ON(PageActive(page));
+ struct lruvec *lruvec;

+ VM_BUG_ON(PageActive(page));
retry:
ClearPageUnevictable(page);
if (page_evictable(page, NULL)) {
enum lru_list l = page_lru_base_type(page);

__dec_zone_state(zone, NR_UNEVICTABLE);
- list_move(&page->lru, &zone->lru[l].list);
- mem_cgroup_move_lists(page, LRU_UNEVICTABLE, l);
+ lruvec = mem_cgroup_lru_move_lists(zone, page,
+ LRU_UNEVICTABLE, l);
+ list_move(&page->lru, &lruvec->lists[l]);
__inc_zone_state(zone, NR_INACTIVE_ANON + l);
__count_vm_event(UNEVICTABLE_PGRESCUED);
} else {
@@ -3178,8 +3154,9 @@ retry:
* rotate unevictable list
*/
SetPageUnevictable(page);
- list_move(&page->lru, &zone->lru[LRU_UNEVICTABLE].list);
- mem_cgroup_rotate_lru_list(page, LRU_UNEVICTABLE);
+ lruvec = mem_cgroup_lru_move_lists(zone, page, LRU_UNEVICTABLE,
+ LRU_UNEVICTABLE);
+ list_move(&page->lru, &lruvec->lists[LRU_UNEVICTABLE]);
if (page_evictable(page, NULL))
goto retry;
}
@@ -3253,29 +3230,37 @@ void scan_mapping_unevictable_pages(struct address_space *mapping)
#define SCAN_UNEVICTABLE_BATCH_SIZE 16UL /* arbitrary lock hold batch size */
static void scan_zone_unevictable_pages(struct zone *zone)
{
- struct list_head *l_unevictable = &zone->lru[LRU_UNEVICTABLE].list;
- unsigned long scan;
unsigned long nr_to_scan = zone_page_state(zone, NR_UNEVICTABLE);

while (nr_to_scan > 0) {
unsigned long batch_size = min(nr_to_scan,
SCAN_UNEVICTABLE_BATCH_SIZE);
+ struct mem_cgroup *mem = NULL;

- spin_lock_irq(&zone->lru_lock);
- for (scan = 0; scan < batch_size; scan++) {
- struct page *page = lru_to_page(l_unevictable);
-
- if (!trylock_page(page))
- continue;
+ do {
+ struct list_head *list;
+ struct lruvec *lruvec;
+ unsigned long scan;

- prefetchw_prev_lru_page(page, l_unevictable, flags);
+ mem_cgroup_hierarchy_walk(NULL, &mem);
+ spin_lock_irq(&zone->lru_lock);
+ lruvec = mem_cgroup_zone_lruvec(zone, mem);
+ list = &lruvec->lists[LRU_UNEVICTABLE];
+ for (scan = 0; scan < batch_size; scan++) {
+ struct page *page = lru_to_page(list);

- if (likely(PageLRU(page) && PageUnevictable(page)))
+ if (!trylock_page(page))
+ continue;
+ prefetchw_prev_lru_page(page, list, flags);
+ if (unlikely(!PageLRU(page)))
+ continue;
+ if (unlikely(!PageUnevictable(page)))
+ continue;
check_move_unevictable_page(page, zone);
-
- unlock_page(page);
- }
- spin_unlock_irq(&zone->lru_lock);
+ unlock_page(page);
+ }
+ spin_unlock_irq(&zone->lru_lock);
+ } while (mem);

nr_to_scan -= batch_size;
}
--
1.7.5.1

2011-05-12 14:54:53

by Johannes Weiner

[permalink] [raw]
Subject: [rfc patch 6/6] memcg: rework soft limit reclaim

The current soft limit reclaim algorithm is entered from kswapd. It
selects the memcg that exceeds its soft limit the most in absolute
bytes and reclaims from it most aggressively (priority 0).

This has several disadvantages:

1. because of the aggressiveness, kswapd can be stalled for a
long time on a memcg that is hard to reclaim before it moves
on to other pages.

2. it only considers the biggest violator (in absolute bytes!)
and does not put extra pressure on other memcgs in excess.

3. it needs a ton of code to quickly find the target

This patch removes all the explicit soft limit target selection and
instead hooks into the hierarchical memcg walk that is done by direct
reclaim and kswapd balancing. If it encounters a memcg that exceeds
its soft limit, or contributes to the soft limit excess in one of its
hierarchy parents, it scans the memcg one priority level below the
current reclaim priority.

1. the primary goal is to reclaim pages, not to punish soft
limit violators at any price

2. increased pressure is applied to all violators, not just
the biggest one

3. the soft limit is no longer only meaningful on global
memory pressure, but considered for any hierarchical reclaim.
This means that even for hard limit reclaim, the children in
excess of their soft limit experience more pressure compared
to their siblings

4. direct reclaim now also applies more pressure on memcgs in
soft limit excess, not only kswapd

5. the implementation is only a few lines of straightforward
code
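
In essence, the hook amounts to this per memcg visited by the
hierarchy walk (condensed from the shrink_zone hunk further down;
the reclaim statistics accounting is omitted):

	int epriority = priority;

	mem_cgroup_hierarchy_walk(root, &mem);
	sc->current_memcg = mem;
	/* memcgs in soft limit excess are scanned one priority level harder */
	if (mem_cgroup_soft_limit_exceeded(root, mem))
		epriority -= 1;
	do_shrink_zone(epriority, zone, sc);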

RFC: since there is no longer a reliable way of counting the pages
reclaimed solely because of an exceeded soft limit, this patch
conflicts with Ying's exporting of exactly this number to userspace.

Signed-off-by: Johannes Weiner <[email protected]>
---
include/linux/memcontrol.h | 16 +-
include/linux/swap.h | 4 -
mm/memcontrol.c | 450 +++-----------------------------------------
mm/vmscan.c | 48 +-----
4 files changed, 34 insertions(+), 484 deletions(-)

diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index 65163c2..b0c7323 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -99,6 +99,7 @@ extern void mem_cgroup_end_migration(struct mem_cgroup *mem,
* For memory reclaim.
*/
void mem_cgroup_hierarchy_walk(struct mem_cgroup *, struct mem_cgroup **);
+bool mem_cgroup_soft_limit_exceeded(struct mem_cgroup *, struct mem_cgroup *);
void mem_cgroup_count_reclaim(struct mem_cgroup *, bool, bool,
unsigned long, unsigned long);
int mem_cgroup_inactive_anon_is_low(struct mem_cgroup *memcg);
@@ -140,8 +141,6 @@ static inline void mem_cgroup_dec_page_stat(struct page *page,
mem_cgroup_update_page_stat(page, idx, -1);
}

-unsigned long mem_cgroup_soft_limit_reclaim(struct zone *zone, int order,
- gfp_t gfp_mask);
u64 mem_cgroup_get_limit(struct mem_cgroup *mem);

#ifdef CONFIG_TRANSPARENT_HUGEPAGE
@@ -294,6 +293,12 @@ static inline void mem_cgroup_hierarchy_walk(struct mem_cgroup *start,
*iter = start;
}

+static inline bool mem_cgroup_soft_limit_exceeded(struct mem_cgroup *root,
+ struct mem_cgroup *mem)
+{
+ return 0;
+}
+
static inline void mem_cgroup_count_reclaim(struct mem_cgroup *mem,
bool kswapd, bool hierarchy,
unsigned long scanned,
@@ -349,13 +354,6 @@ static inline void mem_cgroup_dec_page_stat(struct page *page,
}

static inline
-unsigned long mem_cgroup_soft_limit_reclaim(struct zone *zone, int order,
- gfp_t gfp_mask)
-{
- return 0;
-}
-
-static inline
u64 mem_cgroup_get_limit(struct mem_cgroup *mem)
{
return 0;
diff --git a/include/linux/swap.h b/include/linux/swap.h
index a5c6da5..885cf19 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -254,10 +254,6 @@ extern unsigned long try_to_free_pages(struct zonelist *zonelist, int order,
extern unsigned long try_to_free_mem_cgroup_pages(struct mem_cgroup *mem,
gfp_t gfp_mask, bool noswap,
unsigned int swappiness);
-extern unsigned long mem_cgroup_shrink_node_zone(struct mem_cgroup *mem,
- gfp_t gfp_mask, bool noswap,
- unsigned int swappiness,
- struct zone *zone);
extern int __isolate_lru_page(struct page *page, int mode, int file);
extern unsigned long shrink_all_memory(unsigned long nr_pages);
extern int vm_swappiness;
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index f5d90ba..b0c6dd5 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -34,7 +34,6 @@
#include <linux/rcupdate.h>
#include <linux/limits.h>
#include <linux/mutex.h>
-#include <linux/rbtree.h>
#include <linux/slab.h>
#include <linux/swap.h>
#include <linux/swapops.h>
@@ -138,12 +137,6 @@ struct mem_cgroup_per_zone {
unsigned long count[NR_LRU_LISTS];

struct zone_reclaim_stat reclaim_stat;
- struct rb_node tree_node; /* RB tree node */
- unsigned long long usage_in_excess;/* Set to the value by which */
- /* the soft limit is exceeded*/
- bool on_tree;
- struct mem_cgroup *mem; /* Back pointer, we cannot */
- /* use container_of */
};
/* Macro for accessing counter */
#define MEM_CGROUP_ZSTAT(mz, idx) ((mz)->count[(idx)])
@@ -156,26 +149,6 @@ struct mem_cgroup_lru_info {
struct mem_cgroup_per_node *nodeinfo[MAX_NUMNODES];
};

-/*
- * Cgroups above their limits are maintained in a RB-Tree, independent of
- * their hierarchy representation
- */
-
-struct mem_cgroup_tree_per_zone {
- struct rb_root rb_root;
- spinlock_t lock;
-};
-
-struct mem_cgroup_tree_per_node {
- struct mem_cgroup_tree_per_zone rb_tree_per_zone[MAX_NR_ZONES];
-};
-
-struct mem_cgroup_tree {
- struct mem_cgroup_tree_per_node *rb_tree_per_node[MAX_NUMNODES];
-};
-
-static struct mem_cgroup_tree soft_limit_tree __read_mostly;
-
struct mem_cgroup_threshold {
struct eventfd_ctx *eventfd;
u64 threshold;
@@ -323,12 +296,7 @@ static bool move_file(void)
&mc.to->move_charge_at_immigrate);
}

-/*
- * Maximum loops in mem_cgroup_soft_reclaim(), used for soft
- * limit reclaim to prevent infinite loops, if they ever occur.
- */
#define MEM_CGROUP_MAX_RECLAIM_LOOPS (100)
-#define MEM_CGROUP_MAX_SOFT_LIMIT_RECLAIM_LOOPS (2)

enum charge_type {
MEM_CGROUP_CHARGE_TYPE_CACHE = 0,
@@ -375,164 +343,6 @@ page_cgroup_zoneinfo(struct mem_cgroup *mem, struct page *page)
return mem_cgroup_zoneinfo(mem, nid, zid);
}

-static struct mem_cgroup_tree_per_zone *
-soft_limit_tree_node_zone(int nid, int zid)
-{
- return &soft_limit_tree.rb_tree_per_node[nid]->rb_tree_per_zone[zid];
-}
-
-static struct mem_cgroup_tree_per_zone *
-soft_limit_tree_from_page(struct page *page)
-{
- int nid = page_to_nid(page);
- int zid = page_zonenum(page);
-
- return &soft_limit_tree.rb_tree_per_node[nid]->rb_tree_per_zone[zid];
-}
-
-static void
-__mem_cgroup_insert_exceeded(struct mem_cgroup *mem,
- struct mem_cgroup_per_zone *mz,
- struct mem_cgroup_tree_per_zone *mctz,
- unsigned long long new_usage_in_excess)
-{
- struct rb_node **p = &mctz->rb_root.rb_node;
- struct rb_node *parent = NULL;
- struct mem_cgroup_per_zone *mz_node;
-
- if (mz->on_tree)
- return;
-
- mz->usage_in_excess = new_usage_in_excess;
- if (!mz->usage_in_excess)
- return;
- while (*p) {
- parent = *p;
- mz_node = rb_entry(parent, struct mem_cgroup_per_zone,
- tree_node);
- if (mz->usage_in_excess < mz_node->usage_in_excess)
- p = &(*p)->rb_left;
- /*
- * We can't avoid mem cgroups that are over their soft
- * limit by the same amount
- */
- else if (mz->usage_in_excess >= mz_node->usage_in_excess)
- p = &(*p)->rb_right;
- }
- rb_link_node(&mz->tree_node, parent, p);
- rb_insert_color(&mz->tree_node, &mctz->rb_root);
- mz->on_tree = true;
-}
-
-static void
-__mem_cgroup_remove_exceeded(struct mem_cgroup *mem,
- struct mem_cgroup_per_zone *mz,
- struct mem_cgroup_tree_per_zone *mctz)
-{
- if (!mz->on_tree)
- return;
- rb_erase(&mz->tree_node, &mctz->rb_root);
- mz->on_tree = false;
-}
-
-static void
-mem_cgroup_remove_exceeded(struct mem_cgroup *mem,
- struct mem_cgroup_per_zone *mz,
- struct mem_cgroup_tree_per_zone *mctz)
-{
- spin_lock(&mctz->lock);
- __mem_cgroup_remove_exceeded(mem, mz, mctz);
- spin_unlock(&mctz->lock);
-}
-
-
-static void mem_cgroup_update_tree(struct mem_cgroup *mem, struct page *page)
-{
- unsigned long long excess;
- struct mem_cgroup_per_zone *mz;
- struct mem_cgroup_tree_per_zone *mctz;
- int nid = page_to_nid(page);
- int zid = page_zonenum(page);
- mctz = soft_limit_tree_from_page(page);
-
- /*
- * Necessary to update all ancestors when hierarchy is used.
- * because their event counter is not touched.
- */
- for (; mem; mem = parent_mem_cgroup(mem)) {
- mz = mem_cgroup_zoneinfo(mem, nid, zid);
- excess = res_counter_soft_limit_excess(&mem->res);
- /*
- * We have to update the tree if mz is on RB-tree or
- * mem is over its softlimit.
- */
- if (excess || mz->on_tree) {
- spin_lock(&mctz->lock);
- /* if on-tree, remove it */
- if (mz->on_tree)
- __mem_cgroup_remove_exceeded(mem, mz, mctz);
- /*
- * Insert again. mz->usage_in_excess will be updated.
- * If excess is 0, no tree ops.
- */
- __mem_cgroup_insert_exceeded(mem, mz, mctz, excess);
- spin_unlock(&mctz->lock);
- }
- }
-}
-
-static void mem_cgroup_remove_from_trees(struct mem_cgroup *mem)
-{
- int node, zone;
- struct mem_cgroup_per_zone *mz;
- struct mem_cgroup_tree_per_zone *mctz;
-
- for_each_node_state(node, N_POSSIBLE) {
- for (zone = 0; zone < MAX_NR_ZONES; zone++) {
- mz = mem_cgroup_zoneinfo(mem, node, zone);
- mctz = soft_limit_tree_node_zone(node, zone);
- mem_cgroup_remove_exceeded(mem, mz, mctz);
- }
- }
-}
-
-static struct mem_cgroup_per_zone *
-__mem_cgroup_largest_soft_limit_node(struct mem_cgroup_tree_per_zone *mctz)
-{
- struct rb_node *rightmost = NULL;
- struct mem_cgroup_per_zone *mz;
-
-retry:
- mz = NULL;
- rightmost = rb_last(&mctz->rb_root);
- if (!rightmost)
- goto done; /* Nothing to reclaim from */
-
- mz = rb_entry(rightmost, struct mem_cgroup_per_zone, tree_node);
- /*
- * Remove the node now but someone else can add it back,
- * we will to add it back at the end of reclaim to its correct
- * position in the tree.
- */
- __mem_cgroup_remove_exceeded(mz->mem, mz, mctz);
- if (!res_counter_soft_limit_excess(&mz->mem->res) ||
- !css_tryget(&mz->mem->css))
- goto retry;
-done:
- return mz;
-}
-
-static struct mem_cgroup_per_zone *
-mem_cgroup_largest_soft_limit_node(struct mem_cgroup_tree_per_zone *mctz)
-{
- struct mem_cgroup_per_zone *mz;
-
- spin_lock(&mctz->lock);
- mz = __mem_cgroup_largest_soft_limit_node(mctz);
- spin_unlock(&mctz->lock);
- return mz;
-}
-
/*
* Implementation Note: reading percpu statistics for memcg.
*
@@ -570,15 +380,6 @@ static long mem_cgroup_read_stat(struct mem_cgroup *mem,
return val;
}

-static long mem_cgroup_local_usage(struct mem_cgroup *mem)
-{
- long ret;
-
- ret = mem_cgroup_read_stat(mem, MEM_CGROUP_STAT_RSS);
- ret += mem_cgroup_read_stat(mem, MEM_CGROUP_STAT_CACHE);
- return ret;
-}
-
static void mem_cgroup_swap_statistics(struct mem_cgroup *mem,
bool charge)
{
@@ -699,7 +500,6 @@ static void memcg_check_events(struct mem_cgroup *mem, struct page *page)
__mem_cgroup_target_update(mem, MEM_CGROUP_TARGET_THRESH);
if (unlikely(__memcg_event_check(mem,
MEM_CGROUP_TARGET_SOFTLIMIT))){
- mem_cgroup_update_tree(mem, page);
__mem_cgroup_target_update(mem,
MEM_CGROUP_TARGET_SOFTLIMIT);
}
@@ -1380,6 +1180,29 @@ void mem_cgroup_hierarchy_walk(struct mem_cgroup *start,
*iter = mem;
}

+bool mem_cgroup_soft_limit_exceeded(struct mem_cgroup *root,
+ struct mem_cgroup *mem)
+{
+ /* root_mem_cgroup never exceeds its soft limit */
+ if (!mem)
+ return false;
+ if (!root)
+ root = root_mem_cgroup;
+ /*
+ * See whether the memcg in question exceeds its soft limit
+ * directly, or contributes to the soft limit excess in the
+ * hierarchy below the given root.
+ */
+ while (mem != root) {
+ if (res_counter_soft_limit_excess(&mem->res))
+ return true;
+ if (!mem->use_hierarchy)
+ break;
+ mem = mem_cgroup_from_cont(mem->css.cgroup->parent);
+ }
+ return false;
+}
+
static unsigned long mem_cgroup_target_reclaim(struct mem_cgroup *mem,
gfp_t gfp_mask,
bool noswap,
@@ -1411,114 +1234,6 @@ static unsigned long mem_cgroup_target_reclaim(struct mem_cgroup *mem,
}

/*
- * Visit the first child (need not be the first child as per the ordering
- * of the cgroup list, since we track last_scanned_child) of @mem and use
- * that to reclaim free pages from.
- */
-static struct mem_cgroup *
-mem_cgroup_select_victim(struct mem_cgroup *root_mem)
-{
- struct mem_cgroup *ret = NULL;
- struct cgroup_subsys_state *css;
- int nextid, found;
-
- if (!root_mem->use_hierarchy) {
- css_get(&root_mem->css);
- ret = root_mem;
- }
-
- while (!ret) {
- rcu_read_lock();
- nextid = root_mem->last_scanned_child + 1;
- css = css_get_next(&mem_cgroup_subsys, nextid, &root_mem->css,
- &found);
- if (css && css_tryget(css))
- ret = container_of(css, struct mem_cgroup, css);
-
- rcu_read_unlock();
- /* Updates scanning parameter */
- if (!css) {
- /* this means start scan from ID:1 */
- root_mem->last_scanned_child = 0;
- } else
- root_mem->last_scanned_child = found;
- }
-
- return ret;
-}
-
-/*
- * Scan the hierarchy if needed to reclaim memory. We remember the last child
- * we reclaimed from, so that we don't end up penalizing one child extensively
- * based on its position in the children list.
- *
- * root_mem is the original ancestor that we've been reclaim from.
- *
- * We give up and return to the caller when we visit root_mem twice.
- * (other groups can be removed while we're walking....)
- */
-static int mem_cgroup_soft_reclaim(struct mem_cgroup *root_mem,
- struct zone *zone,
- gfp_t gfp_mask)
-{
- struct mem_cgroup *victim;
- int ret, total = 0;
- int loop = 0;
- unsigned long excess;
- bool noswap = false;
-
- excess = res_counter_soft_limit_excess(&root_mem->res) >> PAGE_SHIFT;
-
- /* If memsw_is_minimum==1, swap-out is of-no-use. */
- if (root_mem->memsw_is_minimum)
- noswap = true;
-
- while (1) {
- victim = mem_cgroup_select_victim(root_mem);
- if (victim == root_mem) {
- loop++;
- if (loop >= 1)
- drain_all_stock_async();
- if (loop >= 2) {
- /*
- * If we have not been able to reclaim
- * anything, it might because there are
- * no reclaimable pages under this hierarchy
- */
- if (!total) {
- css_put(&victim->css);
- break;
- }
- /*
- * We want to do more targeted reclaim.
- * excess >> 2 is not to excessive so as to
- * reclaim too much, nor too less that we keep
- * coming back to reclaim from this cgroup
- */
- if (total >= (excess >> 2) ||
- (loop > MEM_CGROUP_MAX_RECLAIM_LOOPS)) {
- css_put(&victim->css);
- break;
- }
- }
- }
- if (!mem_cgroup_local_usage(victim)) {
- /* this cgroup's local usage == 0 */
- css_put(&victim->css);
- continue;
- }
- /* we use swappiness of local cgroup */
- ret = mem_cgroup_shrink_node_zone(victim, gfp_mask,
- noswap, get_swappiness(victim), zone);
- css_put(&victim->css);
- total += ret;
- if (!res_counter_soft_limit_excess(&root_mem->res))
- return total;
- }
- return total;
-}
-
-/*
* Check OOM-Killer is already running under our hierarchy.
* If someone is running, return false.
*/
@@ -3291,94 +3006,6 @@ static int mem_cgroup_resize_memsw_limit(struct mem_cgroup *memcg,
return ret;
}

-unsigned long mem_cgroup_soft_limit_reclaim(struct zone *zone, int order,
- gfp_t gfp_mask)
-{
- unsigned long nr_reclaimed = 0;
- struct mem_cgroup_per_zone *mz, *next_mz = NULL;
- unsigned long reclaimed;
- int loop = 0;
- struct mem_cgroup_tree_per_zone *mctz;
- unsigned long long excess;
-
- if (order > 0)
- return 0;
-
- mctz = soft_limit_tree_node_zone(zone_to_nid(zone), zone_idx(zone));
- /*
- * This loop can run a while, specially if mem_cgroup's continuously
- * keep exceeding their soft limit and putting the system under
- * pressure
- */
- do {
- if (next_mz)
- mz = next_mz;
- else
- mz = mem_cgroup_largest_soft_limit_node(mctz);
- if (!mz)
- break;
-
- reclaimed = mem_cgroup_soft_reclaim(mz->mem, zone, gfp_mask);
- nr_reclaimed += reclaimed;
- spin_lock(&mctz->lock);
-
- /*
- * If we failed to reclaim anything from this memory cgroup
- * it is time to move on to the next cgroup
- */
- next_mz = NULL;
- if (!reclaimed) {
- do {
- /*
- * Loop until we find yet another one.
- *
- * By the time we get the soft_limit lock
- * again, someone might have aded the
- * group back on the RB tree. Iterate to
- * make sure we get a different mem.
- * mem_cgroup_largest_soft_limit_node returns
- * NULL if no other cgroup is present on
- * the tree
- */
- next_mz =
- __mem_cgroup_largest_soft_limit_node(mctz);
- if (next_mz == mz) {
- css_put(&next_mz->mem->css);
- next_mz = NULL;
- } else /* next_mz == NULL or other memcg */
- break;
- } while (1);
- }
- __mem_cgroup_remove_exceeded(mz->mem, mz, mctz);
- excess = res_counter_soft_limit_excess(&mz->mem->res);
- /*
- * One school of thought says that we should not add
- * back the node to the tree if reclaim returns 0.
- * But our reclaim could return 0, simply because due
- * to priority we are exposing a smaller subset of
- * memory to reclaim from. Consider this as a longer
- * term TODO.
- */
- /* If excess == 0, no tree ops */
- __mem_cgroup_insert_exceeded(mz->mem, mz, mctz, excess);
- spin_unlock(&mctz->lock);
- css_put(&mz->mem->css);
- loop++;
- /*
- * Could not reclaim anything and there are no more
- * mem cgroups to try or we seem to be looping without
- * reclaiming anything.
- */
- if (!nr_reclaimed &&
- (next_mz == NULL ||
- loop > MEM_CGROUP_MAX_SOFT_LIMIT_RECLAIM_LOOPS))
- break;
- } while (!nr_reclaimed);
- if (next_mz)
- css_put(&next_mz->mem->css);
- return nr_reclaimed;
-}
-
/*
* This routine traverse page_cgroup in given list and drop them all.
* *And* this routine doesn't reclaim page itself, just removes page_cgroup.
@@ -4449,9 +4076,6 @@ static int alloc_mem_cgroup_per_zone_info(struct mem_cgroup *mem, int node)
mz = &pn->zoneinfo[zone];
for_each_lru(l)
INIT_LIST_HEAD(&mz->lruvec.lists[l]);
- mz->usage_in_excess = 0;
- mz->on_tree = false;
- mz->mem = mem;
}
return 0;
}
@@ -4504,7 +4128,6 @@ static void __mem_cgroup_free(struct mem_cgroup *mem)
{
int node;

- mem_cgroup_remove_from_trees(mem);
free_css_id(&mem_cgroup_subsys, &mem->css);

for_each_node_state(node, N_POSSIBLE)
@@ -4559,31 +4182,6 @@ static void __init enable_swap_cgroup(void)
}
#endif

-static int mem_cgroup_soft_limit_tree_init(void)
-{
- struct mem_cgroup_tree_per_node *rtpn;
- struct mem_cgroup_tree_per_zone *rtpz;
- int tmp, node, zone;
-
- for_each_node_state(node, N_POSSIBLE) {
- tmp = node;
- if (!node_state(node, N_NORMAL_MEMORY))
- tmp = -1;
- rtpn = kzalloc_node(sizeof(*rtpn), GFP_KERNEL, tmp);
- if (!rtpn)
- return 1;
-
- soft_limit_tree.rb_tree_per_node[node] = rtpn;
-
- for (zone = 0; zone < MAX_NR_ZONES; zone++) {
- rtpz = &rtpn->rb_tree_per_zone[zone];
- rtpz->rb_root = RB_ROOT;
- spin_lock_init(&rtpz->lock);
- }
- }
- return 0;
-}
-
static struct cgroup_subsys_state * __ref
mem_cgroup_create(struct cgroup_subsys *ss, struct cgroup *cont)
{
@@ -4605,8 +4203,6 @@ mem_cgroup_create(struct cgroup_subsys *ss, struct cgroup *cont)
enable_swap_cgroup();
parent = NULL;
root_mem_cgroup = mem;
- if (mem_cgroup_soft_limit_tree_init())
- goto free_out;
for_each_possible_cpu(cpu) {
struct memcg_stock_pcp *stock =
&per_cpu(memcg_stock, cpu);
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 0381a5d..2b701e0 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1937,10 +1937,13 @@ static void shrink_zone(int priority, struct zone *zone,
do {
unsigned long reclaimed = sc->nr_reclaimed;
unsigned long scanned = sc->nr_scanned;
+ int epriority = priority;

mem_cgroup_hierarchy_walk(root, &mem);
sc->current_memcg = mem;
- do_shrink_zone(priority, zone, sc);
+ if (mem_cgroup_soft_limit_exceeded(root, mem))
+ epriority -= 1;
+ do_shrink_zone(epriority, zone, sc);
mem_cgroup_count_reclaim(mem, current_is_kswapd(),
mem != root, /* limit or hierarchy? */
sc->nr_scanned - scanned,
@@ -2153,42 +2156,6 @@ unsigned long try_to_free_pages(struct zonelist *zonelist, int order,
}

#ifdef CONFIG_CGROUP_MEM_RES_CTLR
-
-unsigned long mem_cgroup_shrink_node_zone(struct mem_cgroup *mem,
- gfp_t gfp_mask, bool noswap,
- unsigned int swappiness,
- struct zone *zone)
-{
- struct scan_control sc = {
- .nr_to_reclaim = SWAP_CLUSTER_MAX,
- .may_writepage = !laptop_mode,
- .may_unmap = 1,
- .may_swap = !noswap,
- .swappiness = swappiness,
- .order = 0,
- .memcg = mem,
- };
- sc.gfp_mask = (gfp_mask & GFP_RECLAIM_MASK) |
- (GFP_HIGHUSER_MOVABLE & ~GFP_RECLAIM_MASK);
-
- trace_mm_vmscan_memcg_softlimit_reclaim_begin(0,
- sc.may_writepage,
- sc.gfp_mask);
-
- /*
- * NOTE: Although we can get the priority field, using it
- * here is not a good idea, since it limits the pages we can scan.
- * if we don't reclaim here, the shrink_zone from balance_pgdat
- * will pick up pages from other mem cgroup's as well. We hack
- * the priority and make it zero.
- */
- do_shrink_zone(0, zone, &sc);
-
- trace_mm_vmscan_memcg_softlimit_reclaim_end(sc.nr_reclaimed);
-
- return sc.nr_reclaimed;
-}
-
unsigned long try_to_free_mem_cgroup_pages(struct mem_cgroup *mem_cont,
gfp_t gfp_mask,
bool noswap,
@@ -2418,13 +2385,6 @@ loop_again:
continue;

sc.nr_scanned = 0;
-
- /*
- * Call soft limit reclaim before calling shrink_zone.
- * For now we ignore the return value
- */
- mem_cgroup_soft_limit_reclaim(zone, order, sc.gfp_mask);
-
/*
* We put equal pressure on every zone, unless
* one zone has way too many pages free
--
1.7.5.1

2011-05-12 15:38:20

by Rik van Riel

[permalink] [raw]
Subject: Re: [rfc patch 1/6] memcg: remove unused retry signal from reclaim

On 05/12/2011 10:53 AM, Johannes Weiner wrote:
> If the memcg reclaim code detects the target memcg below its limit it
> exits and returns a guaranteed non-zero value so that the charge is
> retried.
>
> Nowadays, the charge side checks the memcg limit itself and does not
> rely on this non-zero return value trick.
>
> This patch removes it. The reclaim code will now always return the
> true number of pages it reclaimed on its own.
>
> Signed-off-by: Johannes Weiner<[email protected]>

Acked-by: Rik van Riel<[email protected]>

2011-05-12 15:33:50

by Rik van Riel

[permalink] [raw]
Subject: Re: [rfc patch 2/6] vmscan: make distinction between memcg reclaim and LRU list selection

On 05/12/2011 10:53 AM, Johannes Weiner wrote:
> The reclaim code has a single predicate for whether it currently
> reclaims on behalf of a memory cgroup, as well as whether it is
> reclaiming from the global LRU list or a memory cgroup LRU list.
>
> Up to now, both cases always coincide, but subsequent patches will
> change things such that global reclaim will scan memory cgroup lists.
>
> This patch adds a new predicate that tells global reclaim from memory
> cgroup reclaim, and then changes all callsites that are actually about
> global reclaim heuristics rather than strict LRU list selection.
>
> Signed-off-by: Johannes Weiner<[email protected]>
> ---
> mm/vmscan.c | 96 ++++++++++++++++++++++++++++++++++------------------------
> 1 files changed, 56 insertions(+), 40 deletions(-)
>
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index f6b435c..ceeb2a5 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -104,8 +104,12 @@ struct scan_control {
> */
> reclaim_mode_t reclaim_mode;
>
> - /* Which cgroup do we reclaim from */
> - struct mem_cgroup *mem_cgroup;
> + /*
> + * The memory cgroup we reclaim on behalf of, and the one we
> + * are currently reclaiming from.
> + */
> + struct mem_cgroup *memcg;
> + struct mem_cgroup *current_memcg;

I can't say I'm fond of these names. I had to read the
rest of the patch to figure out that the old mem_cgroup
got renamed to current_memcg.

Would it be better to call them my_memcg and reclaim_memcg?

Maybe somebody else has better suggestions...

Other than the naming, no objection.

2011-05-12 16:04:31

by Johannes Weiner

[permalink] [raw]
Subject: Re: [rfc patch 2/6] vmscan: make distinction between memcg reclaim and LRU list selection

On Thu, May 12, 2011 at 11:33:13AM -0400, Rik van Riel wrote:
> On 05/12/2011 10:53 AM, Johannes Weiner wrote:
> >The reclaim code has a single predicate for whether it currently
> >reclaims on behalf of a memory cgroup, as well as whether it is
> >reclaiming from the global LRU list or a memory cgroup LRU list.
> >
> >Up to now, both cases always coincide, but subsequent patches will
> >change things such that global reclaim will scan memory cgroup lists.
> >
> >This patch adds a new predicate that tells global reclaim from memory
> >cgroup reclaim, and then changes all callsites that are actually about
> >global reclaim heuristics rather than strict LRU list selection.
> >
> >Signed-off-by: Johannes Weiner<[email protected]>
> >---
> > mm/vmscan.c | 96 ++++++++++++++++++++++++++++++++++------------------------
> > 1 files changed, 56 insertions(+), 40 deletions(-)
> >
> >diff --git a/mm/vmscan.c b/mm/vmscan.c
> >index f6b435c..ceeb2a5 100644
> >--- a/mm/vmscan.c
> >+++ b/mm/vmscan.c
> >@@ -104,8 +104,12 @@ struct scan_control {
> > */
> > reclaim_mode_t reclaim_mode;
> >
> >- /* Which cgroup do we reclaim from */
> >- struct mem_cgroup *mem_cgroup;
> >+ /*
> >+ * The memory cgroup we reclaim on behalf of, and the one we
> >+ * are currently reclaiming from.
> >+ */
> >+ struct mem_cgroup *memcg;
> >+ struct mem_cgroup *current_memcg;
>
> I can't say I'm fond of these names. I had to read the
> rest of the patch to figure out that the old mem_cgroup
> got renamed to current_memcg.

To clarify: sc->memcg will be the memcg that hit the hard limit and is
the main target of this reclaim invocation. current_memcg is the
iterator over the hierarchy below the target.
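
As a comment sketch (illustration only, not part of the patch):

	struct scan_control {
		/* ... */
		/* reclaim target: the memcg that hit its hard limit,
		 * NULL for global reclaim; fixed for the whole invocation */
		struct mem_cgroup *memcg;
		/* hierarchy-walk cursor: the memcg whose LRU lists are
		 * being scanned right now */
		struct mem_cgroup *current_memcg;
	};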

I realize this change in particular was placed a bit unfortunately in
the series in terms of understanding; I just wanted to keep the
mem_cgroup to current_memcg renaming out of the next patch. There is
probably a better way, I'll fix it up and improve the comment.

> Would it be better to call them my_memcg and reclaim_memcg?
>
> Maybe somebody else has better suggestions...

Yes, suggestions welcome. I'm not too fond of the naming, either.

> Other than the naming, no objection.

Thanks, Rik.

Hannes

2011-05-12 16:05:42

by Rik van Riel

[permalink] [raw]
Subject: Re: [rfc patch 3/6] mm: memcg-aware global reclaim

On 05/12/2011 10:53 AM, Johannes Weiner wrote:

> I am open to solutions that trade fairness against CPU-time but don't
> want to have an extreme in either direction. Maybe break out early if
> a number of memcgs has been successfully reclaimed from and remember
> the last one scanned.

The way we used to deal with this when we did per-process
virtual scanning (before rmap) was to scan the process at
the head of the list.

After we were done with that process, it got moved to the
back of the list. If enough had been scanned, we bailed
out of the scanning code altogether; if more needed to
be scanned, we moved on to the next process.

Doing a list move after scanning a bunch of pages in the
LRU lists of a cgroup isn't nearly as expensive as having
to scan all the cgroups.
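
A minimal sketch of that pattern with the <linux/list.h> helpers
("target", "scan_some" and the list itself are made-up names for
illustration, not anything in this series):

	struct target {
		struct list_head list;
	};

	/* hypothetical: scan a batch of pages from @t, return pages scanned */
	static unsigned long scan_some(struct target *t);

	static unsigned long round_robin_scan(struct list_head *targets,
					      unsigned long nr_to_scan)
	{
		unsigned long nr_scanned = 0;

		while (nr_scanned < nr_to_scan && !list_empty(targets)) {
			struct target *t = list_first_entry(targets,
							    struct target, list);

			/* scan from the head of the list... */
			nr_scanned += scan_some(t);
			/* ...then rotate it to the tail so others get a turn */
			list_move_tail(&t->list, targets);
		}
		return nr_scanned;
	}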

2011-05-12 18:41:42

by Ying Han

[permalink] [raw]
Subject: Re: [rfc patch 6/6] memcg: rework soft limit reclaim

Hi Johannes:

Thank you for the patchset, and I will definitely spend time reading
it through later today.

Also, I have a patchset which implements the round-robin soft_limit
reclaim as we discussed at LSF. Before I read through this set, I
don't know whether we are taking a similar approach or not. My
implementation is only a first step: it replaces the RB-tree based
soft_limit reclaim with a linked-list round-robin. Feel free to
comment on it.

--Ying

> - ? ? ? ? ? ? ? ?* term TODO.
> - ? ? ? ? ? ? ? ?*/
> - ? ? ? ? ? ? ? /* If excess == 0, no tree ops */
> - ? ? ? ? ? ? ? __mem_cgroup_insert_exceeded(mz->mem, mz, mctz, excess);
> - ? ? ? ? ? ? ? spin_unlock(&mctz->lock);
> - ? ? ? ? ? ? ? css_put(&mz->mem->css);
> - ? ? ? ? ? ? ? loop++;
> - ? ? ? ? ? ? ? /*
> - ? ? ? ? ? ? ? ?* Could not reclaim anything and there are no more
> - ? ? ? ? ? ? ? ?* mem cgroups to try or we seem to be looping without
> - ? ? ? ? ? ? ? ?* reclaiming anything.
> - ? ? ? ? ? ? ? ?*/
> - ? ? ? ? ? ? ? if (!nr_reclaimed &&
> - ? ? ? ? ? ? ? ? ? ? ? (next_mz == NULL ||
> - ? ? ? ? ? ? ? ? ? ? ? loop > MEM_CGROUP_MAX_SOFT_LIMIT_RECLAIM_LOOPS))
> - ? ? ? ? ? ? ? ? ? ? ? break;
> - ? ? ? } while (!nr_reclaimed);
> - ? ? ? if (next_mz)
> - ? ? ? ? ? ? ? css_put(&next_mz->mem->css);
> - ? ? ? return nr_reclaimed;
> -}
> -
> ?/*
> ?* This routine traverse page_cgroup in given list and drop them all.
> ?* *And* this routine doesn't reclaim page itself, just removes page_cgroup.
> @@ -4449,9 +4076,6 @@ static int alloc_mem_cgroup_per_zone_info(struct mem_cgroup *mem, int node)
> ? ? ? ? ? ? ? ?mz = &pn->zoneinfo[zone];
> ? ? ? ? ? ? ? ?for_each_lru(l)
> ? ? ? ? ? ? ? ? ? ? ? ?INIT_LIST_HEAD(&mz->lruvec.lists[l]);
> - ? ? ? ? ? ? ? mz->usage_in_excess = 0;
> - ? ? ? ? ? ? ? mz->on_tree = false;
> - ? ? ? ? ? ? ? mz->mem = mem;
> ? ? ? ?}
> ? ? ? ?return 0;
> ?}
> @@ -4504,7 +4128,6 @@ static void __mem_cgroup_free(struct mem_cgroup *mem)
> ?{
> ? ? ? ?int node;
>
> - ? ? ? mem_cgroup_remove_from_trees(mem);
> ? ? ? ?free_css_id(&mem_cgroup_subsys, &mem->css);
>
> ? ? ? ?for_each_node_state(node, N_POSSIBLE)
> @@ -4559,31 +4182,6 @@ static void __init enable_swap_cgroup(void)
> ?}
> ?#endif
>
> -static int mem_cgroup_soft_limit_tree_init(void)
> -{
> - ? ? ? struct mem_cgroup_tree_per_node *rtpn;
> - ? ? ? struct mem_cgroup_tree_per_zone *rtpz;
> - ? ? ? int tmp, node, zone;
> -
> - ? ? ? for_each_node_state(node, N_POSSIBLE) {
> - ? ? ? ? ? ? ? tmp = node;
> - ? ? ? ? ? ? ? if (!node_state(node, N_NORMAL_MEMORY))
> - ? ? ? ? ? ? ? ? ? ? ? tmp = -1;
> - ? ? ? ? ? ? ? rtpn = kzalloc_node(sizeof(*rtpn), GFP_KERNEL, tmp);
> - ? ? ? ? ? ? ? if (!rtpn)
> - ? ? ? ? ? ? ? ? ? ? ? return 1;
> -
> - ? ? ? ? ? ? ? soft_limit_tree.rb_tree_per_node[node] = rtpn;
> -
> - ? ? ? ? ? ? ? for (zone = 0; zone < MAX_NR_ZONES; zone++) {
> - ? ? ? ? ? ? ? ? ? ? ? rtpz = &rtpn->rb_tree_per_zone[zone];
> - ? ? ? ? ? ? ? ? ? ? ? rtpz->rb_root = RB_ROOT;
> - ? ? ? ? ? ? ? ? ? ? ? spin_lock_init(&rtpz->lock);
> - ? ? ? ? ? ? ? }
> - ? ? ? }
> - ? ? ? return 0;
> -}
> -
> ?static struct cgroup_subsys_state * __ref
> ?mem_cgroup_create(struct cgroup_subsys *ss, struct cgroup *cont)
> ?{
> @@ -4605,8 +4203,6 @@ mem_cgroup_create(struct cgroup_subsys *ss, struct cgroup *cont)
> ? ? ? ? ? ? ? ?enable_swap_cgroup();
> ? ? ? ? ? ? ? ?parent = NULL;
> ? ? ? ? ? ? ? ?root_mem_cgroup = mem;
> - ? ? ? ? ? ? ? if (mem_cgroup_soft_limit_tree_init())
> - ? ? ? ? ? ? ? ? ? ? ? goto free_out;
> ? ? ? ? ? ? ? ?for_each_possible_cpu(cpu) {
> ? ? ? ? ? ? ? ? ? ? ? ?struct memcg_stock_pcp *stock =
> ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?&per_cpu(memcg_stock, cpu);
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index 0381a5d..2b701e0 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -1937,10 +1937,13 @@ static void shrink_zone(int priority, struct zone *zone,
> ? ? ? ?do {
> ? ? ? ? ? ? ? ?unsigned long reclaimed = sc->nr_reclaimed;
> ? ? ? ? ? ? ? ?unsigned long scanned = sc->nr_scanned;
> + ? ? ? ? ? ? ? int epriority = priority;
>
> ? ? ? ? ? ? ? ?mem_cgroup_hierarchy_walk(root, &mem);
> ? ? ? ? ? ? ? ?sc->current_memcg = mem;
> - ? ? ? ? ? ? ? do_shrink_zone(priority, zone, sc);
> + ? ? ? ? ? ? ? if (mem_cgroup_soft_limit_exceeded(root, mem))
> + ? ? ? ? ? ? ? ? ? ? ? epriority -= 1;
> + ? ? ? ? ? ? ? do_shrink_zone(epriority, zone, sc);
> ? ? ? ? ? ? ? ?mem_cgroup_count_reclaim(mem, current_is_kswapd(),
> ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? mem != root, /* limit or hierarchy? */
> ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? sc->nr_scanned - scanned,
> @@ -2153,42 +2156,6 @@ unsigned long try_to_free_pages(struct zonelist *zonelist, int order,
> ?}
>
> ?#ifdef CONFIG_CGROUP_MEM_RES_CTLR
> -
> -unsigned long mem_cgroup_shrink_node_zone(struct mem_cgroup *mem,
> - ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? gfp_t gfp_mask, bool noswap,
> - ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? unsigned int swappiness,
> - ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? struct zone *zone)
> -{
> - ? ? ? struct scan_control sc = {
> - ? ? ? ? ? ? ? .nr_to_reclaim = SWAP_CLUSTER_MAX,
> - ? ? ? ? ? ? ? .may_writepage = !laptop_mode,
> - ? ? ? ? ? ? ? .may_unmap = 1,
> - ? ? ? ? ? ? ? .may_swap = !noswap,
> - ? ? ? ? ? ? ? .swappiness = swappiness,
> - ? ? ? ? ? ? ? .order = 0,
> - ? ? ? ? ? ? ? .memcg = mem,
> - ? ? ? };
> - ? ? ? sc.gfp_mask = (gfp_mask & GFP_RECLAIM_MASK) |
> - ? ? ? ? ? ? ? ? ? ? ? (GFP_HIGHUSER_MOVABLE & ~GFP_RECLAIM_MASK);
> -
> - ? ? ? trace_mm_vmscan_memcg_softlimit_reclaim_begin(0,
> - ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? sc.may_writepage,
> - ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? sc.gfp_mask);
> -
> - ? ? ? /*
> - ? ? ? ?* NOTE: Although we can get the priority field, using it
> - ? ? ? ?* here is not a good idea, since it limits the pages we can scan.
> - ? ? ? ?* if we don't reclaim here, the shrink_zone from balance_pgdat
> - ? ? ? ?* will pick up pages from other mem cgroup's as well. We hack
> - ? ? ? ?* the priority and make it zero.
> - ? ? ? ?*/
> - ? ? ? do_shrink_zone(0, zone, &sc);
> -
> - ? ? ? trace_mm_vmscan_memcg_softlimit_reclaim_end(sc.nr_reclaimed);
> -
> - ? ? ? return sc.nr_reclaimed;
> -}
> -
> ?unsigned long try_to_free_mem_cgroup_pages(struct mem_cgroup *mem_cont,
> ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? gfp_t gfp_mask,
> ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? bool noswap,
> @@ -2418,13 +2385,6 @@ loop_again:
> ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?continue;
>
> ? ? ? ? ? ? ? ? ? ? ? ?sc.nr_scanned = 0;
> -
> - ? ? ? ? ? ? ? ? ? ? ? /*
> - ? ? ? ? ? ? ? ? ? ? ? ?* Call soft limit reclaim before calling shrink_zone.
> - ? ? ? ? ? ? ? ? ? ? ? ?* For now we ignore the return value
> - ? ? ? ? ? ? ? ? ? ? ? ?*/
> - ? ? ? ? ? ? ? ? ? ? ? mem_cgroup_soft_limit_reclaim(zone, order, sc.gfp_mask);
> -
> ? ? ? ? ? ? ? ? ? ? ? ?/*
> ? ? ? ? ? ? ? ? ? ? ? ? * We put equal pressure on every zone, unless
> ? ? ? ? ? ? ? ? ? ? ? ? * one zone has way too many pages free
> --
> 1.7.5.1
>
>

2011-05-12 23:51:36

by Kamezawa Hiroyuki

[permalink] [raw]
Subject: Re: [rfc patch 1/6] memcg: remove unused retry signal from reclaim

On Thu, 12 May 2011 16:53:53 +0200
Johannes Weiner <[email protected]> wrote:

> If the memcg reclaim code detects the target memcg below its limit it
> exits and returns a guaranteed non-zero value so that the charge is
> retried.
>
> Nowadays, the charge side checks the memcg limit itself and does not
> rely on this non-zero return value trick.
>
> This patch removes it. The reclaim code will now always return the
> true number of pages it reclaimed on its own.
>
> Signed-off-by: Johannes Weiner <[email protected]>

Acked-by: KAMEZAWA Hiroyuki <[email protected]>

2011-05-12 23:57:13

by Kamezawa Hiroyuki

[permalink] [raw]
Subject: Re: [rfc patch 2/6] vmscan: make distinction between memcg reclaim and LRU list selection

On Thu, 12 May 2011 16:53:54 +0200
Johannes Weiner <[email protected]> wrote:

> The reclaim code has a single predicate for whether it currently
> reclaims on behalf of a memory cgroup, as well as whether it is
> reclaiming from the global LRU list or a memory cgroup LRU list.
>
> Up to now, both cases always coincide, but subsequent patches will
> change things such that global reclaim will scan memory cgroup lists.
>
> This patch adds a new predicate that tells global reclaim from memory
> cgroup reclaim, and then changes all callsites that are actually about
> global reclaim heuristics rather than strict LRU list selection.
>
> Signed-off-by: Johannes Weiner <[email protected]>


Hmm, isn't it better to merge this to patches where the meaning of
new variable gets clearer ?

> ---
> mm/vmscan.c | 96 ++++++++++++++++++++++++++++++++++------------------------
> 1 files changed, 56 insertions(+), 40 deletions(-)
>
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index f6b435c..ceeb2a5 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -104,8 +104,12 @@ struct scan_control {
> */
> reclaim_mode_t reclaim_mode;
>
> - /* Which cgroup do we reclaim from */
> - struct mem_cgroup *mem_cgroup;
> + /*
> + * The memory cgroup we reclaim on behalf of, and the one we
> + * are currently reclaiming from.
> + */
> + struct mem_cgroup *memcg;
> + struct mem_cgroup *current_memcg;
>

I wonder if you avoid renaming the existing one, the patch will
be clearer...



> /*
> * Nodemask of nodes allowed by the caller. If NULL, all nodes
> @@ -154,16 +158,24 @@ static LIST_HEAD(shrinker_list);
> static DECLARE_RWSEM(shrinker_rwsem);
>
> #ifdef CONFIG_CGROUP_MEM_RES_CTLR
> -#define scanning_global_lru(sc) (!(sc)->mem_cgroup)
> +static bool global_reclaim(struct scan_control *sc)
> +{
> + return !sc->memcg;
> +}
> +static bool scanning_global_lru(struct scan_control *sc)
> +{
> + return !sc->current_memcg;
> +}


Could you add comments ?

Thanks,
-Kame

2011-05-13 00:11:35

by Kamezawa Hiroyuki

[permalink] [raw]
Subject: Re: [rfc patch 3/6] mm: memcg-aware global reclaim

On Thu, 12 May 2011 16:53:55 +0200
Johannes Weiner <[email protected]> wrote:

> A page charged to a memcg is linked to a lru list specific to that
> memcg. At the same time, traditional global reclaim is obvlivious to
> memcgs, and all the pages are also linked to a global per-zone list.
>
> This patch changes traditional global reclaim to iterate over all
> existing memcgs, so that it no longer relies on the global list being
> present.
>
> This is one step forward in integrating memcg code better into the
> rest of memory management. It is also a prerequisite to get rid of
> the global per-zone lru lists.
>


As I said, I don't want removing global reclaim until dirty_ratio support and
better softlimit algorithm, at least. Current my concern is dirty_ratio,
if you want to speed up, please help Greg and implement dirty_ratio first.

BTW, could you separate clean up code and your new logic? The 1st half of
the code seems to be just a clean up and seems nice. But, IIUC, someone
changed the arguments from a chunk of params to flags... in some patch.
...
commit 75822b4495b62e8721e9b88e3cf9e653a0c85b73
Author: Balbir Singh <[email protected]>
Date: Wed Sep 23 15:56:38 2009 -0700

memory controller: soft limit refactor reclaim flags

Refactor mem_cgroup_hierarchical_reclaim()

Refactor the arguments passed to mem_cgroup_hierarchical_reclaim() into
flags, so that new parameters don't have to be passed as we make the
reclaim routine more flexible

...

Balbir ? Both are ok to me, please ask him.


And hmm...

+ do {
+ mem_cgroup_hierarchy_walk(root, &mem);
+ sc->current_memcg = mem;
+ do_shrink_zone(priority, zone, sc);
+ } while (mem != root);

This move hierarchy walk from memcontrol.c to vmscan.c ?

About moving hierarchy walk, I may say okay...because my patch does this, too.

But....doesn't this reclaim too much memory if hierarchy is very deep ?
Could you add some 'quit' path ?


Thanks,
-Kame

2011-05-13 00:47:38

by Kamezawa Hiroyuki

[permalink] [raw]
Subject: Re: [rfc patch 3/6] mm: memcg-aware global reclaim

On Thu, 12 May 2011 16:53:55 +0200
Johannes Weiner <[email protected]> wrote:

> A page charged to a memcg is linked to a lru list specific to that
> memcg. At the same time, traditional global reclaim is oblivious to
> memcgs, and all the pages are also linked to a global per-zone list.
>
> This patch changes traditional global reclaim to iterate over all
> existing memcgs, so that it no longer relies on the global list being
> present.
>
> This is one step forward in integrating memcg code better into the
> rest of memory management. It is also a prerequisite to get rid of
> the global per-zone lru lists.
>
> RFC:
>
> The algorithm implemented in this patch is very naive. For each zone
> scanned at each priority level, it iterates over all existing memcgs
> and considers them for scanning.
>
> This is just a prototype and I did not optimize it yet because I am
> unsure about the maximum number of memcgs that still constitute a sane
> configuration in comparison to the machine size.
>
> It is perfectly fair since all memcgs are scanned at each priority
> level.
>
> On my 4G quadcore laptop with 1000 memcgs, a significant amount of CPU
> time was spent just iterating memcgs during reclaim. But it can not
> really be claimed that the old code was much better, either: global
> LRU reclaim could mean that a few hundred memcgs would have been
> emptied out completely, while others stayed untouched.
>
> I am open to solutions that trade fairness against CPU-time but don't
> want to have an extreme in either direction. Maybe break out early if
> a number of memcgs has been successfully reclaimed from and remember
> the last one scanned.
>
> Signed-off-by: Johannes Weiner <[email protected]>
> ---
> include/linux/memcontrol.h | 7 ++
> mm/memcontrol.c | 148 +++++++++++++++++++++++++++++---------------
> mm/vmscan.c | 21 +++++--
> 3 files changed, 120 insertions(+), 56 deletions(-)
>
> diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
> index 5e9840f5..58728c7 100644
> --- a/include/linux/memcontrol.h
> +++ b/include/linux/memcontrol.h
> @@ -104,6 +104,7 @@ extern void mem_cgroup_end_migration(struct mem_cgroup *mem,
> /*
> * For memory reclaim.
> */
> +void mem_cgroup_hierarchy_walk(struct mem_cgroup *, struct mem_cgroup **);
> int mem_cgroup_inactive_anon_is_low(struct mem_cgroup *memcg);
> int mem_cgroup_inactive_file_is_low(struct mem_cgroup *memcg);
> unsigned long mem_cgroup_zone_nr_pages(struct mem_cgroup *memcg,
> @@ -289,6 +290,12 @@ static inline bool mem_cgroup_disabled(void)
> return true;
> }
>
> +static inline void mem_cgroup_hierarchy_walk(struct mem_cgroup *start,
> + struct mem_cgroup **iter)
> +{
> + *iter = start;
> +}
> +
> static inline int
> mem_cgroup_inactive_anon_is_low(struct mem_cgroup *memcg)
> {
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index bf5ab87..edcd55a 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -313,7 +313,7 @@ static bool move_file(void)
> }
>
> /*
> - * Maximum loops in mem_cgroup_hierarchical_reclaim(), used for soft
> + * Maximum loops in mem_cgroup_soft_reclaim(), used for soft
> * limit reclaim to prevent infinite loops, if they ever occur.
> */
> #define MEM_CGROUP_MAX_RECLAIM_LOOPS (100)
> @@ -339,16 +339,6 @@ enum charge_type {
> /* Used for OOM nofiier */
> #define OOM_CONTROL (0)
>
> -/*
> - * Reclaim flags for mem_cgroup_hierarchical_reclaim
> - */
> -#define MEM_CGROUP_RECLAIM_NOSWAP_BIT 0x0
> -#define MEM_CGROUP_RECLAIM_NOSWAP (1 << MEM_CGROUP_RECLAIM_NOSWAP_BIT)
> -#define MEM_CGROUP_RECLAIM_SHRINK_BIT 0x1
> -#define MEM_CGROUP_RECLAIM_SHRINK (1 << MEM_CGROUP_RECLAIM_SHRINK_BIT)
> -#define MEM_CGROUP_RECLAIM_SOFT_BIT 0x2
> -#define MEM_CGROUP_RECLAIM_SOFT (1 << MEM_CGROUP_RECLAIM_SOFT_BIT)
> -
> static void mem_cgroup_get(struct mem_cgroup *mem);
> static void mem_cgroup_put(struct mem_cgroup *mem);
> static struct mem_cgroup *parent_mem_cgroup(struct mem_cgroup *mem);
> @@ -1381,6 +1371,86 @@ u64 mem_cgroup_get_limit(struct mem_cgroup *memcg)
> return min(limit, memsw);
> }
>
> +void mem_cgroup_hierarchy_walk(struct mem_cgroup *start,
> + struct mem_cgroup **iter)
> +{
> + struct mem_cgroup *mem = *iter;
> + int id;
> +
> + if (!start)
> + start = root_mem_cgroup;
> + /*
> + * Even without hierarchy explicitely enabled in the root
> + * memcg, it is the ultimate parent of all memcgs.
> + */
> + if (!(start == root_mem_cgroup || start->use_hierarchy)) {
> + *iter = start;
> + return;
> + }
> +
> + if (!mem)
> + id = css_id(&start->css);
> + else {
> + id = css_id(&mem->css);
> + css_put(&mem->css);
> + mem = NULL;
> + }
> +
> + do {
> + struct cgroup_subsys_state *css;
> +
> + rcu_read_lock();
> + css = css_get_next(&mem_cgroup_subsys, id+1, &start->css, &id);
> + /*
> + * The caller must already have a reference to the
> + * starting point of this hierarchy walk, do not grab
> + * another one. This way, the loop can be finished
> + * when the hierarchy root is returned, without any
> + * further cleanup required.
> + */
> + if (css && (css == &start->css || css_tryget(css)))
> + mem = container_of(css, struct mem_cgroup, css);
> + rcu_read_unlock();
> + if (!css)
> + id = 0;
> + } while (!mem);
> +
> + if (mem == root_mem_cgroup)
> + mem = NULL;
> +
> + *iter = mem;
> +}
> +
> +static unsigned long mem_cgroup_target_reclaim(struct mem_cgroup *mem,
> + gfp_t gfp_mask,
> + bool noswap,
> + bool shrink)
> +{
> + unsigned long total = 0;
> + int loop;
> +
> + if (mem->memsw_is_minimum)
> + noswap = true;
> +
> + for (loop = 0; loop < MEM_CGROUP_MAX_RECLAIM_LOOPS; loop++) {
> + drain_all_stock_async();
> + total += try_to_free_mem_cgroup_pages(mem, gfp_mask, noswap,
> + get_swappiness(mem));
> + if (total && shrink)
> + break;
> + if (mem_cgroup_margin(mem))
> + break;
> + /*
> + * If we have not been able to reclaim anything after
> + * two reclaim attempts, there may be no reclaimable
> + * pages under this hierarchy.
> + */
> + if (loop && !total)
> + break;
> + }
> + return total;
> +}
> +
> /*
> * Visit the first child (need not be the first child as per the ordering
> * of the cgroup list, since we track last_scanned_child) of @mem and use
> @@ -1427,21 +1497,16 @@ mem_cgroup_select_victim(struct mem_cgroup *root_mem)
> *
> * We give up and return to the caller when we visit root_mem twice.
> * (other groups can be removed while we're walking....)
> - *
> - * If shrink==true, for avoiding to free too much, this returns immedieately.
> */
> -static int mem_cgroup_hierarchical_reclaim(struct mem_cgroup *root_mem,
> - struct zone *zone,
> - gfp_t gfp_mask,
> - unsigned long reclaim_options)
> +static int mem_cgroup_soft_reclaim(struct mem_cgroup *root_mem,
> + struct zone *zone,
> + gfp_t gfp_mask)
> {
> struct mem_cgroup *victim;
> int ret, total = 0;
> int loop = 0;
> - bool noswap = reclaim_options & MEM_CGROUP_RECLAIM_NOSWAP;
> - bool shrink = reclaim_options & MEM_CGROUP_RECLAIM_SHRINK;
> - bool check_soft = reclaim_options & MEM_CGROUP_RECLAIM_SOFT;
> unsigned long excess;
> + bool noswap = false;
>
> excess = res_counter_soft_limit_excess(&root_mem->res) >> PAGE_SHIFT;
>
> @@ -1461,7 +1526,7 @@ static int mem_cgroup_hierarchical_reclaim(struct mem_cgroup *root_mem,
> * anything, it might because there are
> * no reclaimable pages under this hierarchy
> */
> - if (!check_soft || !total) {
> + if (!total) {
> css_put(&victim->css);
> break;
> }
> @@ -1484,25 +1549,11 @@ static int mem_cgroup_hierarchical_reclaim(struct mem_cgroup *root_mem,
> continue;
> }
> /* we use swappiness of local cgroup */
> - if (check_soft)
> - ret = mem_cgroup_shrink_node_zone(victim, gfp_mask,
> + ret = mem_cgroup_shrink_node_zone(victim, gfp_mask,
> noswap, get_swappiness(victim), zone);
> - else
> - ret = try_to_free_mem_cgroup_pages(victim, gfp_mask,
> - noswap, get_swappiness(victim));
> css_put(&victim->css);
> - /*
> - * At shrinking usage, we can't check we should stop here or
> - * reclaim more. It's depends on callers. last_scanned_child
> - * will work enough for keeping fairness under tree.
> - */
> - if (shrink)
> - return ret;
> total += ret;
> - if (check_soft) {
> - if (!res_counter_soft_limit_excess(&root_mem->res))
> - return total;
> - } else if (mem_cgroup_margin(root_mem))
> + if (!res_counter_soft_limit_excess(&root_mem->res))
> return total;
> }
> return total;
> @@ -1897,7 +1948,7 @@ static int mem_cgroup_do_charge(struct mem_cgroup *mem, gfp_t gfp_mask,
> unsigned long csize = nr_pages * PAGE_SIZE;
> struct mem_cgroup *mem_over_limit;
> struct res_counter *fail_res;
> - unsigned long flags = 0;
> + bool noswap = false;
> int ret;
>
> ret = res_counter_charge(&mem->res, csize, &fail_res);
> @@ -1911,7 +1962,7 @@ static int mem_cgroup_do_charge(struct mem_cgroup *mem, gfp_t gfp_mask,
>
> res_counter_uncharge(&mem->res, csize);
> mem_over_limit = mem_cgroup_from_res_counter(fail_res, memsw);
> - flags |= MEM_CGROUP_RECLAIM_NOSWAP;
> + noswap = true;
> } else
> mem_over_limit = mem_cgroup_from_res_counter(fail_res, res);
> /*
> @@ -1927,8 +1978,8 @@ static int mem_cgroup_do_charge(struct mem_cgroup *mem, gfp_t gfp_mask,
> if (!(gfp_mask & __GFP_WAIT))
> return CHARGE_WOULDBLOCK;
>
> - ret = mem_cgroup_hierarchical_reclaim(mem_over_limit, NULL,
> - gfp_mask, flags);
> + ret = mem_cgroup_target_reclaim(mem_over_limit, gfp_mask,
> + noswap, false);
> if (mem_cgroup_margin(mem_over_limit) >= nr_pages)
> return CHARGE_RETRY;
> /*
> @@ -3085,7 +3136,7 @@ void mem_cgroup_end_migration(struct mem_cgroup *mem,
>
> /*
> * A call to try to shrink memory usage on charge failure at shmem's swapin.
> - * Calling hierarchical_reclaim is not enough because we should update
> + * Calling target_reclaim is not enough because we should update
> * last_oom_jiffies to prevent pagefault_out_of_memory from invoking global OOM.
> * Moreover considering hierarchy, we should reclaim from the mem_over_limit,
> * not from the memcg which this page would be charged to.
> @@ -3167,7 +3218,7 @@ static int mem_cgroup_resize_limit(struct mem_cgroup *memcg,
> int enlarge;
>
> /*
> - * For keeping hierarchical_reclaim simple, how long we should retry
> + * For keeping target_reclaim simple, how long we should retry
> * is depends on callers. We set our retry-count to be function
> * of # of children which we should visit in this loop.
> */
> @@ -3210,8 +3261,7 @@ static int mem_cgroup_resize_limit(struct mem_cgroup *memcg,
> if (!ret)
> break;
>
> - mem_cgroup_hierarchical_reclaim(memcg, NULL, GFP_KERNEL,
> - MEM_CGROUP_RECLAIM_SHRINK);
> + mem_cgroup_target_reclaim(memcg, GFP_KERNEL, false, false);
> curusage = res_counter_read_u64(&memcg->res, RES_USAGE);
> /* Usage is reduced ? */
> if (curusage >= oldusage)
> @@ -3269,9 +3319,7 @@ static int mem_cgroup_resize_memsw_limit(struct mem_cgroup *memcg,
> if (!ret)
> break;
>
> - mem_cgroup_hierarchical_reclaim(memcg, NULL, GFP_KERNEL,
> - MEM_CGROUP_RECLAIM_NOSWAP |
> - MEM_CGROUP_RECLAIM_SHRINK);
> + mem_cgroup_target_reclaim(memcg, GFP_KERNEL, true, false);
> curusage = res_counter_read_u64(&memcg->memsw, RES_USAGE);
> /* Usage is reduced ? */
> if (curusage >= oldusage)
> @@ -3311,9 +3359,7 @@ unsigned long mem_cgroup_soft_limit_reclaim(struct zone *zone, int order,
> if (!mz)
> break;
>
> - reclaimed = mem_cgroup_hierarchical_reclaim(mz->mem, zone,
> - gfp_mask,
> - MEM_CGROUP_RECLAIM_SOFT);
> + reclaimed = mem_cgroup_soft_reclaim(mz->mem, zone, gfp_mask);
> nr_reclaimed += reclaimed;
> spin_lock(&mctz->lock);
>
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index ceeb2a5..e2a3647 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -1900,8 +1900,8 @@ static inline bool should_continue_reclaim(struct zone *zone,
> /*
> * This is a basic per-zone page freer. Used by both kswapd and direct reclaim.
> */
> -static void shrink_zone(int priority, struct zone *zone,
> - struct scan_control *sc)
> +static void do_shrink_zone(int priority, struct zone *zone,
> + struct scan_control *sc)
> {
> unsigned long nr[NR_LRU_LISTS];
> unsigned long nr_to_scan;
> @@ -1914,8 +1914,6 @@ restart:
> nr_scanned = sc->nr_scanned;
> get_scan_count(zone, sc, nr, priority);
>
> - sc->current_memcg = sc->memcg;
> -
> while (nr[LRU_INACTIVE_ANON] || nr[LRU_ACTIVE_FILE] ||
> nr[LRU_INACTIVE_FILE]) {
> for_each_evictable_lru(l) {
> @@ -1954,6 +1952,19 @@ restart:
> goto restart;
>
> throttle_vm_writeout(sc->gfp_mask);
> +}
> +
> +static void shrink_zone(int priority, struct zone *zone,
> + struct scan_control *sc)
> +{
> + struct mem_cgroup *root = sc->memcg;
> + struct mem_cgroup *mem = NULL;
> +
> + do {
> + mem_cgroup_hierarchy_walk(root, &mem);
> + sc->current_memcg = mem;
> + do_shrink_zone(priority, zone, sc);

If I don't miss something, css_put() against mem->css will be required somewhere.

Thanks,
-Kame

2011-05-13 06:55:27

by Johannes Weiner

[permalink] [raw]
Subject: Re: [rfc patch 3/6] mm: memcg-aware global reclaim

On Fri, May 13, 2011 at 09:40:50AM +0900, KAMEZAWA Hiroyuki wrote:
> > @@ -1954,6 +1952,19 @@ restart:
> > goto restart;
> >
> > throttle_vm_writeout(sc->gfp_mask);
> > +}
> > +
> > +static void shrink_zone(int priority, struct zone *zone,
> > + struct scan_control *sc)
> > +{
> > + struct mem_cgroup *root = sc->memcg;
> > + struct mem_cgroup *mem = NULL;
> > +
> > + do {
> > + mem_cgroup_hierarchy_walk(root, &mem);
> > + sc->current_memcg = mem;
> > + do_shrink_zone(priority, zone, sc);
>
> If I don't miss something, css_put() against mem->css will be required somewhere.

That's a bit of a hack. mem_cgroup_hierarchy_walk() always does
css_put() on *mem before advancing to the next child.

At the last iteration, it returns mem == root. Since the caller must
have a reference on root to begin with, it does not css_get() root.

So when mem == root, there are no outstanding references from the walk
anymore.

This only works since it always does the full hierarchy walk, so it's
going away anyway when the hierarchy walk becomes intermittent.
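
For reference, the caller side of that contract, condensed from the 3/6 hunk
quoted above; the comments spell out the reference ownership and are
annotation, not part of the posted patch:

static void shrink_zone(int priority, struct zone *zone,
                        struct scan_control *sc)
{
        struct mem_cgroup *root = sc->memcg;    /* caller already holds this css */
        struct mem_cgroup *mem = NULL;

        do {
                /*
                 * Drops the reference taken on the previous *mem and
                 * returns the next child with a fresh css_tryget() --
                 * except for the hierarchy root itself (or NULL for
                 * global reclaim), which is returned without taking
                 * another reference.
                 */
                mem_cgroup_hierarchy_walk(root, &mem);
                sc->current_memcg = mem;
                do_shrink_zone(priority, zone, sc);
        } while (mem != root);  /* no walk references left at this point */
}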

2011-05-13 06:59:13

by Johannes Weiner

[permalink] [raw]
Subject: Re: [rfc patch 2/6] vmscan: make distinction between memcg reclaim and LRU list selection

On Fri, May 13, 2011 at 08:50:27AM +0900, KAMEZAWA Hiroyuki wrote:
> On Thu, 12 May 2011 16:53:54 +0200
> Johannes Weiner <[email protected]> wrote:
>
> > The reclaim code has a single predicate for whether it currently
> > reclaims on behalf of a memory cgroup, as well as whether it is
> > reclaiming from the global LRU list or a memory cgroup LRU list.
> >
> > Up to now, both cases always coincide, but subsequent patches will
> > change things such that global reclaim will scan memory cgroup lists.
> >
> > This patch adds a new predicate that tells global reclaim from memory
> > cgroup reclaim, and then changes all callsites that are actually about
> > global reclaim heuristics rather than strict LRU list selection.
> >
> > Signed-off-by: Johannes Weiner <[email protected]>
>
>
> Hmm, isn't it better to merge this to patches where the meaning of
> new variable gets clearer ?

I apologize for the confusing order. I am going to merge them.

> > mm/vmscan.c | 96 ++++++++++++++++++++++++++++++++++------------------------
> > 1 files changed, 56 insertions(+), 40 deletions(-)
> >
> > diff --git a/mm/vmscan.c b/mm/vmscan.c
> > index f6b435c..ceeb2a5 100644
> > --- a/mm/vmscan.c
> > +++ b/mm/vmscan.c
> > @@ -104,8 +104,12 @@ struct scan_control {
> > */
> > reclaim_mode_t reclaim_mode;
> >
> > - /* Which cgroup do we reclaim from */
> > - struct mem_cgroup *mem_cgroup;
> > + /*
> > + * The memory cgroup we reclaim on behalf of, and the one we
> > + * are currently reclaiming from.
> > + */
> > + struct mem_cgroup *memcg;
> > + struct mem_cgroup *current_memcg;
> >
>
> I wonder if you avoid renaming the existing one, the patch will
> be clearer...

I renamed it mostly because I thought current_mem_cgroup too long.
It's probably best if both get more descriptive names.

> > @@ -154,16 +158,24 @@ static LIST_HEAD(shrinker_list);
> > static DECLARE_RWSEM(shrinker_rwsem);
> >
> > #ifdef CONFIG_CGROUP_MEM_RES_CTLR
> > -#define scanning_global_lru(sc) (!(sc)->mem_cgroup)
> > +static bool global_reclaim(struct scan_control *sc)
> > +{
> > + return !sc->memcg;
> > +}
> > +static bool scanning_global_lru(struct scan_control *sc)
> > +{
> > + return !sc->current_memcg;
> > +}
>
>
> Could you add comments ?

Yes, I will.

Thanks for your input!
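
For reference, one way the commented predicates could read; the comments are
suggestions, not taken from the posted patch:

#ifdef CONFIG_CGROUP_MEM_RES_CTLR
/* Is this reclaim on behalf of the whole machine, not a memcg limit? */
static bool global_reclaim(struct scan_control *sc)
{
        return !sc->memcg;
}
/* Are we scanning the global LRU lists, as opposed to a memcg's lists? */
static bool scanning_global_lru(struct scan_control *sc)
{
        return !sc->current_memcg;
}
#endif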

2011-05-13 07:09:00

by Johannes Weiner

[permalink] [raw]
Subject: Re: [rfc patch 3/6] mm: memcg-aware global reclaim

On Thu, May 12, 2011 at 12:19:45PM -0700, Ying Han wrote:
> On Thu, May 12, 2011 at 7:53 AM, Johannes Weiner <[email protected]> wrote:
>
> > A page charged to a memcg is linked to a lru list specific to that
> > memcg. At the same time, traditional global reclaim is oblivious to
> > memcgs, and all the pages are also linked to a global per-zone list.
> >
> > This patch changes traditional global reclaim to iterate over all
> > existing memcgs, so that it no longer relies on the global list being
> > present.
> >
>
> > This is one step forward in integrating memcg code better into the
> > rest of memory management. It is also a prerequisite to get rid
> > of the global per-zone lru lists.
> >
> Sorry If i misunderstood something here. I assume this patch has not
> much to do with the global soft_limit reclaim, but only allow the
> system only scan per-memcg lru under global memory pressure.

I see you found 6/6 in the meantime :) Did it answer your question?

> > The algorithm implemented in this patch is very naive. For each zone
> > scanned at each priority level, it iterates over all existing memcgs
> > and considers them for scanning.
> >
> > This is just a prototype and I did not optimize it yet because I am
> > unsure about the maximum number of memcgs that still constitute a sane
> > configuration in comparison to the machine size.
>
> So we also scan memcg which has no page allocated on this zone? I
> will read the following patch in case i missed something here :)

The old hierarchy walk skipped a memcg if it had no local pages at
all. I thought this was a rather unlikely situation and ripped it
out.

It will not loop persistently over a specific memcg and node
combination, like soft limit reclaim does at the moment.

Since this is much deeper integrated in memory reclaim now, it
benefits from all the existing mechanisms and will calculate the scan
target based on the number of lru pages on memcg->zone->lru, and do
nothing if there are no pages there.
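
Roughly, the per-LRU scan goal is derived from the scanned memcg's own LRU
size in that zone, so an empty memcg/zone combination generates no work. A
minimal sketch of the idea; the argument order of mem_cgroup_zone_nr_pages()
is assumed, since only its first parameter is visible in the posted diff:

/* Illustrative sketch only -- not the patch's get_scan_count(). */
static void memcg_scan_goals(struct mem_cgroup *mem, struct zone *zone,
                             int priority, unsigned long nr[NR_LRU_LISTS])
{
        enum lru_list l;

        for_each_evictable_lru(l) {
                unsigned long size;

                /* assumed signature: (memcg, zone, lru) */
                size = mem_cgroup_zone_nr_pages(mem, zone, l);
                nr[l] = size >> priority;       /* no pages here -> no scan work */
        }
}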

2011-05-13 07:18:52

by Johannes Weiner

[permalink] [raw]
Subject: Re: [rfc patch 3/6] mm: memcg-aware global reclaim

On Fri, May 13, 2011 at 09:04:50AM +0900, KAMEZAWA Hiroyuki wrote:
> On Thu, 12 May 2011 16:53:55 +0200
> Johannes Weiner <[email protected]> wrote:
>
> > A page charged to a memcg is linked to a lru list specific to that
> > memcg. At the same time, traditional global reclaim is oblivious to
> > memcgs, and all the pages are also linked to a global per-zone list.
> >
> > This patch changes traditional global reclaim to iterate over all
> > existing memcgs, so that it no longer relies on the global list being
> > present.
> >
> > This is one step forward in integrating memcg code better into the
> > rest of memory management. It is also a prerequisite to get rid of
> > the global per-zone lru lists.
>
> As I said, I don't want removing global reclaim until dirty_ratio support and
> better softlimit algorithm, at least. Current my concern is dirty_ratio,
> if you want to speed up, please help Greg and implement dirty_ratio first.

As I said, I am not proposing this for integration now. It was more
like asking if people were okay with this direction before we put
things in place that could be in the way of the long-term plan.

Note that 6/6 is an attempt to improve the soft limit algorithm.

> BTW, could you separate clean up code and your new logic? The 1st half of
> the code seems to be just a clean up and seems nice. But, IIUC, someone
> changed the arguments from a chunk of params to flags... in some patch.

Sorry again, I know that the series is pretty unorganized.

> + do {
> + mem_cgroup_hierarchy_walk(root, &mem);
> + sc->current_memcg = mem;
> + do_shrink_zone(priority, zone, sc);
> + } while (mem != root);
>
> This move hierarchy walk from memcontrol.c to vmscan.c ?
>
> About moving hierarchy walk, I may say okay...because my patch does this, too.
>
> But....doesn't this reclaim too much memory if hierarchy is very deep ?
> Could you add some 'quit' path ?

Yes, I think I'll just reinstate the logic from
mem_cgroup_select_victim() to remember the last child, and add an exit
condition based on the number of reclaimed pages.

This was also suggested by Rik in this thread already.
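
A rough sketch of what such a quit path could look like; the cutoff and the
early reference drop are illustrative, not part of the posted series:

static void shrink_zone(int priority, struct zone *zone,
                        struct scan_control *sc)
{
        struct mem_cgroup *root = sc->memcg;
        struct mem_cgroup *mem = NULL;
        unsigned long reclaimed = sc->nr_reclaimed;

        do {
                /* would resume at root's last scanned child */
                mem_cgroup_hierarchy_walk(root, &mem);
                sc->current_memcg = mem;
                do_shrink_zone(priority, zone, sc);
                /* made enough progress in this zone: stop walking children */
                if (sc->nr_reclaimed - reclaimed >= sc->nr_to_reclaim) {
                        /* an intermittent walk must drop its own reference */
                        if (mem != root)
                                css_put(&mem->css);
                        break;
                }
        } while (mem != root);
}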

2011-05-13 07:21:16

by Johannes Weiner

[permalink] [raw]
Subject: Re: [rfc patch 0/6] mm: memcg naturalization

On Thu, May 12, 2011 at 11:53:37AM -0700, Ying Han wrote:
> On Thu, May 12, 2011 at 7:53 AM, Johannes Weiner <[email protected]> wrote:
>
> > Hi!
> >
> > Here is a patch series that is a result of the memcg discussions on
> > LSF (memcg-aware global reclaim, global lru removal, struct
> > page_cgroup reduction, soft limit implementation) and the recent
> > feature discussions on linux-mm.
> >
> > The long-term idea is to have memcgs no longer bolted to the side of
> > the mm code, but integrate it as much as possible such that there is a
> > native understanding of containers, and that the traditional !memcg
> > setup is just a singular group. This series is an approach in that
> > direction.
> >
> > It is a rather early snapshot, WIP, barely tested etc., but I wanted
> > to get your opinions before further pursuing it. It is also part of
> > my counter-argument to the proposals of adding memcg-reclaim-related
> > user interfaces at this point in time, so I wanted to push this out
> > the door before things are merged into .40.
> >
>
> The memcg-reclaim-related user interface I assume was the watermark
> configurable tunable we were talking about in the per-memcg
> background reclaim patch. I think we got some agreement to remove
> the watermark tunable at the first step. But the newly added
> memory.soft_limit_async_reclaim as you proposed seems to be a usable
> interface.

Actually, I meant the soft limit reclaim statistics. There is a
comment about that in the 6/6 changelog.

2011-05-13 09:23:14

by Michal Hocko

[permalink] [raw]
Subject: Re: [rfc patch 1/6] memcg: remove unused retry signal from reclaim

On Thu 12-05-11 16:53:53, Johannes Weiner wrote:
> If the memcg reclaim code detects the target memcg below its limit it
> exits and returns a guaranteed non-zero value so that the charge is
> retried.
>
> Nowadays, the charge side checks the memcg limit itself and does not
> rely on this non-zero return value trick.
>
> This patch removes it. The reclaim code will now always return the
> true number of pages it reclaimed on its own.
>
> Signed-off-by: Johannes Weiner <[email protected]>

Makes sense
Reviewed-by: Michal Hocko <[email protected]>

> ---
> mm/memcontrol.c | 2 +-
> 1 files changed, 1 insertions(+), 1 deletions(-)
>
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 010f916..bf5ab87 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -1503,7 +1503,7 @@ static int mem_cgroup_hierarchical_reclaim(struct mem_cgroup *root_mem,
> if (!res_counter_soft_limit_excess(&root_mem->res))
> return total;
> } else if (mem_cgroup_margin(root_mem))
> - return 1 + total;
> + return total;
> }
> return total;
> }
> --
> 1.7.5.1
>

--
Michal Hocko
SUSE Labs
SUSE LINUX s.r.o.
Lihovarska 1060/12
190 00 Praha 9
Czech Republic

2011-05-13 09:53:12

by Michal Hocko

[permalink] [raw]
Subject: Re: [rfc patch 3/6] mm: memcg-aware global reclaim

On Thu 12-05-11 16:53:55, Johannes Weiner wrote:
> A page charged to a memcg is linked to a lru list specific to that
> memcg. At the same time, traditional global reclaim is oblivious to
> memcgs, and all the pages are also linked to a global per-zone list.
>
> This patch changes traditional global reclaim to iterate over all
> existing memcgs, so that it no longer relies on the global list being
> present.

At LSF we have discussed that we should keep a list of over-(soft)limit
cgroups in a list which would be the first target for reclaiming (in
round-robin fashion). If we are not able to reclaim enough from those
(the list becomes empty) we should fallback to the all groups reclaim
(what you did in this patchset).
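
A minimal sketch of that scheme, in which the list, its lock and the
soft_limit_link member are all hypothetical:

static LIST_HEAD(soft_limit_list);              /* hypothetical */
static DEFINE_SPINLOCK(soft_limit_lock);        /* hypothetical */

/* Pick the next over-soft-limit group in round-robin order. */
static struct mem_cgroup *pick_soft_limit_victim(void)
{
        struct mem_cgroup *mem = NULL;

        spin_lock(&soft_limit_lock);
        if (!list_empty(&soft_limit_list)) {
                /* soft_limit_link would be a new member of struct mem_cgroup */
                mem = list_first_entry(&soft_limit_list,
                                       struct mem_cgroup, soft_limit_link);
                list_move_tail(&mem->soft_limit_link, &soft_limit_list);
                css_get(&mem->css);
        }
        spin_unlock(&soft_limit_lock);
        return mem;     /* NULL: list empty, fall back to all-groups reclaim */
}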

--
Michal Hocko
SUSE Labs
SUSE LINUX s.r.o.
Lihovarska 1060/12
190 00 Praha 9
Czech Republic

2011-05-13 09:53:51

by Michal Hocko

[permalink] [raw]
Subject: Re: [rfc patch 5/6] memcg: remove global LRU list

On Thu 12-05-11 16:53:57, Johannes Weiner wrote:
> Since the VM now has means to do global reclaim from the per-memcg lru
> lists, the global LRU list is no longer required.

Shouldn't this one be at the end of the series?

--
Michal Hocko
SUSE Labs
SUSE LINUX s.r.o.
Lihovarska 1060/12
190 00 Praha 9
Czech Republic

2011-05-13 10:29:20

by Johannes Weiner

[permalink] [raw]
Subject: Re: [rfc patch 3/6] mm: memcg-aware global reclaim

On Fri, May 13, 2011 at 11:53:08AM +0200, Michal Hocko wrote:
> On Thu 12-05-11 16:53:55, Johannes Weiner wrote:
> > A page charged to a memcg is linked to a lru list specific to that
> > memcg. At the same time, traditional global reclaim is oblivious to
> > memcgs, and all the pages are also linked to a global per-zone list.
> >
> > This patch changes traditional global reclaim to iterate over all
> > existing memcgs, so that it no longer relies on the global list being
> > present.
>
> At LSF we have discussed that we should keep a list of over-(soft)limit
> cgroups in a list which would be the first target for reclaiming (in
> round-robin fashion). If we are not able to reclaim enough from those
> (the list becomes empty) we should fallback to the all groups reclaim
> (what you did in this patchset).

This would be on top or instead of 6/6. This, 3/6, is indepent of
soft limit reclaim. It is mainly in preparation to remove the global
LRU.

2011-05-13 10:36:29

by Johannes Weiner

[permalink] [raw]
Subject: Re: [rfc patch 5/6] memcg: remove global LRU list

On Fri, May 13, 2011 at 11:53:48AM +0200, Michal Hocko wrote:
> On Thu 12-05-11 16:53:57, Johannes Weiner wrote:
> > Since the VM now has means to do global reclaim from the per-memcg lru
> > lists, the global LRU list is no longer required.
>
> Shouldn't this one be at the end of the series?

I don't really have an opinion. Why do you think it should?

2011-05-13 11:01:29

by Michal Hocko

[permalink] [raw]
Subject: Re: [rfc patch 5/6] memcg: remove global LRU list

On Fri 13-05-11 12:36:08, Johannes Weiner wrote:
> On Fri, May 13, 2011 at 11:53:48AM +0200, Michal Hocko wrote:
> > On Thu 12-05-11 16:53:57, Johannes Weiner wrote:
> > > Since the VM now has means to do global reclaim from the per-memcg lru
> > > lists, the global LRU list is no longer required.
> >
> > Shouldn't this one be at the end of the series?
>
> I don't really have an opinion. Why do you think it should?

It is the last step in my eyes and maybe we want to keep the global
LRU as a fallback for some time, just to get an impression (with some
tracepoints) of how well the per-cgroup reclaim goes.

--
Michal Hocko
SUSE Labs
SUSE LINUX s.r.o.
Lihovarska 1060/12
190 00 Praha 9
Czech Republic

2011-05-13 11:02:30

by Michal Hocko

[permalink] [raw]
Subject: Re: [rfc patch 3/6] mm: memcg-aware global reclaim

On Fri 13-05-11 12:28:58, Johannes Weiner wrote:
> On Fri, May 13, 2011 at 11:53:08AM +0200, Michal Hocko wrote:
> > On Thu 12-05-11 16:53:55, Johannes Weiner wrote:
> > > A page charged to a memcg is linked to a lru list specific to that
> > > memcg. At the same time, traditional global reclaim is oblivious to
> > > memcgs, and all the pages are also linked to a global per-zone list.
> > >
> > > This patch changes traditional global reclaim to iterate over all
> > > existing memcgs, so that it no longer relies on the global list being
> > > present.
> >
> > At LSF we have discussed that we should keep a list of over-(soft)limit
> > cgroups in a list which would be the first target for reclaiming (in
> > round-robin fashion). If we are not able to reclaim enough from those
> > (the list becomes empty) we should fallback to the all groups reclaim
> > (what you did in this patchset).
>
> This would be on top or instead of 6/6. This, 3/6, is indepent of
> soft limit reclaim. It is mainly in preparation to remove the global
> LRU.

OK.

--
Michal Hocko
SUSE Labs
SUSE LINUX s.r.o.
Lihovarska 1060/12
190 00 Praha 9
Czech Republic

2011-05-16 10:35:50

by Balbir Singh

[permalink] [raw]
Subject: Re: [rfc patch 0/6] mm: memcg naturalization

* Johannes Weiner <[email protected]> [2011-05-12 16:53:52]:

> Hi!
>
> Here is a patch series that is a result of the memcg discussions on
> LSF (memcg-aware global reclaim, global lru removal, struct
> page_cgroup reduction, soft limit implementation) and the recent
> feature discussions on linux-mm.
>
> The long-term idea is to have memcgs no longer bolted to the side of
> the mm code, but integrate it as much as possible such that there is a
> native understanding of containers, and that the traditional !memcg
> setup is just a singular group. This series is an approach in that
> direction.
>
> It is a rather early snapshot, WIP, barely tested etc., but I wanted
> to get your opinions before further pursuing it. It is also part of
> my counter-argument to the proposals of adding memcg-reclaim-related
> user interfaces at this point in time, so I wanted to push this out
> the door before things are merged into .40.
>
> The patches are quite big, I am still looking for things to factor and
> split out, sorry for this. Documentation is on its way as well ;)
>
> #1 and #2 are boring preparational work. #3 makes traditional reclaim
> in vmscan.c memcg-aware, which is a prerequisite for both removal of
> the global lru in #5 and the way I reimplemented soft limit reclaim in
> #6.

A large part of the acceptance would be based on what the test results
for common mm benchmarks show.

--
Three Cheers,
Balbir

2011-05-16 10:58:06

by Johannes Weiner

[permalink] [raw]
Subject: Re: [rfc patch 0/6] mm: memcg naturalization

On Mon, May 16, 2011 at 04:00:34PM +0530, Balbir Singh wrote:
> * Johannes Weiner <[email protected]> [2011-05-12 16:53:52]:
>
> > Hi!
> >
> > Here is a patch series that is a result of the memcg discussions on
> > LSF (memcg-aware global reclaim, global lru removal, struct
> > page_cgroup reduction, soft limit implementation) and the recent
> > feature discussions on linux-mm.
> >
> > The long-term idea is to have memcgs no longer bolted to the side of
> > the mm code, but integrate it as much as possible such that there is a
> > native understanding of containers, and that the traditional !memcg
> > setup is just a singular group. This series is an approach in that
> > direction.
> >
> > It is a rather early snapshot, WIP, barely tested etc., but I wanted
> > to get your opinions before further pursuing it. It is also part of
> > my counter-argument to the proposals of adding memcg-reclaim-related
> > user interfaces at this point in time, so I wanted to push this out
> > the door before things are merged into .40.
> >
> > The patches are quite big, I am still looking for things to factor and
> > split out, sorry for this. Documentation is on its way as well ;)
> >
> > #1 and #2 are boring preparational work. #3 makes traditional reclaim
> > in vmscan.c memcg-aware, which is a prerequisite for both removal of
> > the global lru in #5 and the way I reimplemented soft limit reclaim in
> > #6.
>
> A large part of the acceptance would be based on what the test results
> for common mm benchmarks show.

I will try to ensure the following things:

1. will not degrade performance on !CONFIG_MEMCG kernels

2. will not degrade performance on CONFIG_MEMCG kernels without
configured memcgs. This might be the most important one as most
desktop/server distributions enable the memory controller per default

3. will not degrade overall performance of workloads running
concurrently in separate memory control groups. I expect some shifts,
however, that even out performance differences.

Please let me know what you consider common mm benchmarks.

Thanks!

Hannes

2011-05-16 22:36:39

by Andrew Morton

[permalink] [raw]
Subject: Re: [rfc patch 2/6] vmscan: make distinction between memcg reclaim and LRU list selection

On Fri, 13 May 2011 08:58:54 +0200
Johannes Weiner <[email protected]> wrote:

> > > @@ -154,16 +158,24 @@ static LIST_HEAD(shrinker_list);
> > > static DECLARE_RWSEM(shrinker_rwsem);
> > >
> > > #ifdef CONFIG_CGROUP_MEM_RES_CTLR
> > > -#define scanning_global_lru(sc) (!(sc)->mem_cgroup)
> > > +static bool global_reclaim(struct scan_control *sc)
> > > +{
> > > + return !sc->memcg;
> > > +}
> > > +static bool scanning_global_lru(struct scan_control *sc)
> > > +{
> > > + return !sc->current_memcg;
> > > +}
> >
> >
> > Could you add comments ?

oy, that's my job.

> Yes, I will.

> +static bool global_reclaim(struct scan_control *sc) { return 1; }
> +static bool scanning_global_lru(struct scan_control *sc) { return 1; }

s/1/true/

And we may as well format the functions properly?

And it would be nice for the names of the functions to identify what
subsystem they belong to: memcg_global_reclaim() or such. Although
that's already been a bit messed up in memcg (and in the VM generally).
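
Put together, those suggestions would make the !CONFIG_CGROUP_MEM_RES_CTLR
stubs read roughly as follows (illustrative; whether the names also grow a
memcg_ prefix is left open):

#else   /* !CONFIG_CGROUP_MEM_RES_CTLR */
static bool global_reclaim(struct scan_control *sc)
{
        return true;
}

static bool scanning_global_lru(struct scan_control *sc)
{
        return true;
}
#endif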

2011-05-16 23:11:06

by Johannes Weiner

[permalink] [raw]
Subject: Re: [rfc patch 4/6] memcg: reclaim statistics

On Thu, May 12, 2011 at 12:33:50PM -0700, Ying Han wrote:
> On Thu, May 12, 2011 at 7:53 AM, Johannes Weiner <[email protected]> wrote:
>
> > TODO: write proper changelog. Here is an excerpt from
> > http://lkml.kernel.org/r/[email protected]:
> >
> > : 1. Limit-triggered direct reclaim
> > :
> > : The memory cgroup hits its limit and the task does direct reclaim from
> > : its own memcg. We probably want statistics for this separately from
> > : background reclaim to see how successful background reclaim is, the
> > : same reason we have this separation in the global vmstat as well.
> > :
> > : pgscan_direct_limit
> > : pgfree_direct_limit
> >
>
> Can we use "pgsteal_" instead? Not a big fan of the naming, but I want to make
> them consistent with other stats.

Actually, I thought what KAME-san said made sense. 'Stealing' is a
good fit for reclaim due to outside pressure. But if the memcg is
target-reclaimed from the inside because it hit the limit, is
'stealing' the appropriate term?
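
For illustration only, the two naming variants for the limit-triggered direct
counters side by side; neither exists in any tree:

/*
 * Illustration only.  The question above is whether the second counter of
 * the pair should read "free" (the memcg reclaims from itself) or "steal"
 * (matching the global vmstat wording for reclaim due to outside pressure).
 */
static const char * const memcg_limit_direct_stats[] = {
        "pgscan_direct_limit",
        "pgfree_direct_limit",          /* or "pgsteal_direct_limit" */
};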

> > : 2. Limit-triggered background reclaim
> > :
> > : This is the watermark-based asynchroneous reclaim that is currently in
> > : discussion. It's triggered by the memcg breaching its watermark,
> > : which is relative to its hard-limit. I named it kswapd because I
> > : still think kswapd should do this job, but it is all open for
> > : discussion, obviously. Treat it as meaning 'background' or
> > : 'asynchroneous'.
> > :
> > : pgscan_kswapd_limit
> > : pgfree_kswapd_limit
> >
>
> Kame might have these stats in the per-memcg bg reclaim patch. Just mentioning it
> here since it will make the later merge
> a bit harder.

I'll have a look, thanks for the heads up.

> > : 3. Hierarchy-triggered direct reclaim
> > :
> > : A condition outside the memcg leads to a task directly reclaiming from
> > : this memcg. This could be global memory pressure for example, but
> > : also a parent cgroup hitting its limit. It's probably helpful to
> > : assume global memory pressure meaning that the root cgroup hit its
> > : limit, conceptually. We don't have that yet, but this could be the
> > : direct softlimit reclaim Ying mentioned above.
> > :
> > : pgscan_direct_hierarchy
> > : pgsteal_direct_hierarchy
> >
>
> The stats for soft_limit reclaim from global ttfp have been merged in mmotm
> i believe as the following:
>
> "soft_direct_steal"
> "soft_direct_scan"
>
> I wonder we might want to separate that out from the other case where the
> reclaim is from the parent triggers its limit.

The way I implemented soft limits in 6/6 is to increase pressure on
exceeding children whenever hierarchical reclaim is taking place.

This changes soft limit from

Global memory pressure: reclaim from exceeding memcg(s) first

to

Memory pressure on a memcg: reclaim from all its children,
with increased pressure on those exceeding their soft limit
(where global memory pressure means root_mem_cgroup and all
existing memcgs are considered its children)

which makes the soft limit much more generic and more powerful, as it
allows the admin to prioritize reclaim throughout the hierarchy, not
only for global memory pressure. Consider one memcg with two
subgroups. You can now prioritize reclaim to prefer one subgroup over
another through soft limiting.
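
The mechanism behind this is visible in the 6/6 hunk quoted earlier in the
thread: during any hierarchical reclaim pass, a child over its soft limit is
simply scanned at a more aggressive priority. Expressed as a standalone
helper (the posted patch open-codes this in shrink_zone(); the helper name is
illustrative), the rule is roughly:

static int soft_limit_priority(struct mem_cgroup *root,
                               struct mem_cgroup *mem, int priority)
{
        /* a lower priority value means a larger fraction of the LRUs is scanned */
        if (mem_cgroup_soft_limit_exceeded(root, mem))
                return priority - 1;
        return priority;
}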

This is one reason why I think that the approach of maintaining a
global list of memcgs that exceed their soft limits is an inferior
approach; it does not take the hierarchy into account at all.

This scheme would not provide a natural way of counting pages that
were reclaimed because of the soft limit, and thus I still oppose the
merging of soft limit counters.

Hannes

2011-05-17 06:32:50

by Balbir Singh

[permalink] [raw]
Subject: Re: [rfc patch 0/6] mm: memcg naturalization

* Johannes Weiner <[email protected]> [2011-05-16 12:57:29]:

> On Mon, May 16, 2011 at 04:00:34PM +0530, Balbir Singh wrote:
> > * Johannes Weiner <[email protected]> [2011-05-12 16:53:52]:
> >
> > > Hi!
> > >
> > > Here is a patch series that is a result of the memcg discussions on
> > > LSF (memcg-aware global reclaim, global lru removal, struct
> > > page_cgroup reduction, soft limit implementation) and the recent
> > > feature discussions on linux-mm.
> > >
> > > The long-term idea is to have memcgs no longer bolted to the side of
> > > the mm code, but integrate it as much as possible such that there is a
> > > native understanding of containers, and that the traditional !memcg
> > > setup is just a singular group. This series is an approach in that
> > > direction.
> > >
> > > It is a rather early snapshot, WIP, barely tested etc., but I wanted
> > > to get your opinions before further pursuing it. It is also part of
> > > my counter-argument to the proposals of adding memcg-reclaim-related
> > > user interfaces at this point in time, so I wanted to push this out
> > > the door before things are merged into .40.
> > >
> > > The patches are quite big, I am still looking for things to factor and
> > > split out, sorry for this. Documentation is on its way as well ;)
> > >
> > > #1 and #2 are boring preparational work. #3 makes traditional reclaim
> > > in vmscan.c memcg-aware, which is a prerequisite for both removal of
> > > the global lru in #5 and the way I reimplemented soft limit reclaim in
> > > #6.
> >
> > A large part of the acceptance would be based on what the test results
> > for common mm benchmarks show.
>
> I will try to ensure the following things:
>
> 1. will not degrade performance on !CONFIG_MEMCG kernels
>
> 2. will not degrade performance on CONFIG_MEMCG kernels without
> configured memcgs. This might be the most important one as most
> desktop/server distributions enable the memory controller per default
>
> 3. will not degrade overall performance of workloads running
> concurrently in separate memory control groups. I expect some shifts,
> however, that even out performance differences.
>
> Please let me know what you consider common mm benchmarks.

1, 2 and 3 do sound nice, what workload do you intend to run? We used
reaim, lmbench, page fault rate based tests.

--
Three Cheers,
Balbir

2011-05-17 06:38:14

by Ying Han

[permalink] [raw]
Subject: Re: [rfc patch 2/6] vmscan: make distinction between memcg reclaim and LRU list selection

On Thu, May 12, 2011 at 9:03 AM, Johannes Weiner <[email protected]> wrote:
> On Thu, May 12, 2011 at 11:33:13AM -0400, Rik van Riel wrote:
>> On 05/12/2011 10:53 AM, Johannes Weiner wrote:
>> >The reclaim code has a single predicate for whether it currently
>> >reclaims on behalf of a memory cgroup, as well as whether it is
>> >reclaiming from the global LRU list or a memory cgroup LRU list.
>> >
>> >Up to now, both cases always coincide, but subsequent patches will
>> >change things such that global reclaim will scan memory cgroup lists.
>> >
>> >This patch adds a new predicate that tells global reclaim from memory
>> >cgroup reclaim, and then changes all callsites that are actually about
>> >global reclaim heuristics rather than strict LRU list selection.
>> >
>> >Signed-off-by: Johannes Weiner<[email protected]>
>> >---
>> >  mm/vmscan.c |   96 ++++++++++++++++++++++++++++++++++------------------------
>> >  1 files changed, 56 insertions(+), 40 deletions(-)
>> >
>> >diff --git a/mm/vmscan.c b/mm/vmscan.c
>> >index f6b435c..ceeb2a5 100644
>> >--- a/mm/vmscan.c
>> >+++ b/mm/vmscan.c
>> >@@ -104,8 +104,12 @@ struct scan_control {
>> >         */
>> >        reclaim_mode_t reclaim_mode;
>> >
>> >-       /* Which cgroup do we reclaim from */
>> >-       struct mem_cgroup *mem_cgroup;
>> >+       /*
>> >+        * The memory cgroup we reclaim on behalf of, and the one we
>> >+        * are currently reclaiming from.
>> >+        */
>> >+       struct mem_cgroup *memcg;
>> >+       struct mem_cgroup *current_memcg;
>>
>> I can't say I'm fond of these names.  I had to read the
>> rest of the patch to figure out that the old mem_cgroup
>> got renamed to current_memcg.
>
> To clarify: sc->memcg will be the memcg that hit the hard limit and is
> the main target of this reclaim invocation.  current_memcg is the
> iterator over the hierarchy below the target.

I would assume the new variable memcg is a renaming of the
"mem_cgroup", indicating which cgroup we reclaim on behalf of.
About the "current_memcg", I couldn't find where it is indicated to
be the current cgroup under the hierarchy below the "memcg".

Both mem_cgroup_shrink_node_zone() and try_to_free_mem_cgroup_pages()
are called within mem_cgroup_hierarchical_reclaim(), and sc->memcg is
initialized with the victim passed down, which is already a memcg
within the hierarchy.

--Ying


> I realize this change in particular was placed a bit unfortunately in
> the series in terms of understanding; I just wanted to keep the
> mem_cgroup to current_memcg renaming out of the next patch. There is
> probably a better way, I'll fix it up and improve the comment.
>
>> Would it be better to call them my_memcg and reclaim_memcg?
>>
>> Maybe somebody else has better suggestions...
>
> Yes, suggestions welcome. I'm not too fond of the naming, either.
>
>> Other than the naming, no objection.
>
> Thanks, Rik.
>
>        Hannes
>

2011-05-17 07:42:57

by Johannes Weiner

[permalink] [raw]
Subject: Re: [rfc patch 4/6] memcg: reclaim statistics

On Mon, May 16, 2011 at 05:20:31PM -0700, Ying Han wrote:
> On Mon, May 16, 2011 at 4:10 PM, Johannes Weiner <[email protected]> wrote:
>
> > On Thu, May 12, 2011 at 12:33:50PM -0700, Ying Han wrote:
> > > The stats for soft_limit reclaim from global ttfp have been merged into
> > > mmotm, I believe, as the following:
> > >
> > > "soft_direct_steal"
> > > "soft_direct_scan"
> > >
> > > I wonder whether we might want to separate that out from the other case,
> > > where the reclaim is triggered by the parent hitting its limit.
> >
> > The way I implemented soft limits in 6/6 is to increase pressure on
> > exceeding children whenever hierarchical reclaim is taking place.
> >
> > This changes soft limit from
> >
> > Global memory pressure: reclaim from exceeding memcg(s) first
> >
> > to
> >
> > Memory pressure on a memcg: reclaim from all its children,
> > with increased pressure on those exceeding their soft limit
> > (where global memory pressure means root_mem_cgroup and all
> > existing memcgs are considered its children)
> >
> > which makes the soft limit much more generic and more powerful, as it
> > allows the admin to prioritize reclaim throughout the hierarchy, not
> > only for global memory pressure. Consider one memcg with two
> > subgroups. You can now prioritize reclaim to prefer one subgroup over
> > another through soft limiting.
> >
> > This is one reason why I think that the approach of maintaining a
> > global list of memcgs that exceed their soft limits is an inferior
> > approach; it does not take the hierarchy into account at all.
> >
> > This scheme would not provide a natural way of counting pages that
> > were reclaimed because of the soft limit, and thus I still oppose the
> > merging of soft limit counters.
>
> The proposal we discussed during LSF (implemented in the patch "memcg:
> revisit soft_limit reclaim on contention") takes hierarchical reclaim
> into consideration. The memcg is linked into the list if it exceeds its
> soft_limit, and the per-memcg soft_limit reclaim calls
> mem_cgroup_hierarchical_reclaim().

It does hierarchical soft limit reclaim once triggered, but I meant
that soft limits themselves have no hierarchical meaning. Say you
have the following hierarchy:

        root_mem_cgroup
         /           \
       aaa           bbb
      /   \         /   \
    a1     a2     b1     b2
     |
   a1-1

Consider that aaa and a1 have a soft limit set. If global memory
pressure arose, aaa and all its children would be pushed back with the
current scheme, the one you are proposing, and the one I am proposing.

But now consider aaa hitting its hard limit. Regular target reclaim
will be triggered, and a1, a2, and a1-1 will be scanned equally from
hierarchical reclaim. That a1 is in excess of its soft limit is not
considered at all.

With what I am proposing, a1 and a1-1 would be pushed back more
aggressively than a2, because a1 is in excess of its soft limit and
a1-1 is contributing to that.

It would mean that given a group of siblings, you distribute the
pressure weighted by the soft limit configuration, independent of the
kind of hierarchical/external pressure (global memory scarcity or
parent hit the hard limit).
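
To make that concrete, here is a minimal sketch of such a weighted
hierarchy walk. It is illustrative only: the helpers and fields it uses
(for_each_memcg_child(), soft_limit_excess(), shrink_memcg(),
sc->nr_to_scan) are made up for this example and are not the actual
series code.

        /*
         * Illustrative sketch only; all helpers are hypothetical.
         * Every child gets scanned, but a child in excess of its
         * soft limit gets scanned harder than its siblings.
         */
        static void shrink_hierarchy(struct mem_cgroup *target,
                                     struct scan_control *sc)
        {
                struct mem_cgroup *memcg;

                for_each_memcg_child(memcg, target) {
                        unsigned long nr = sc->nr_to_scan;

                        /* soft limit excess translates into extra pressure */
                        if (soft_limit_excess(memcg))
                                nr *= 2;

                        sc->current_memcg = memcg;
                        shrink_memcg(memcg, nr, sc);
                }
        }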

It's much easier to understand if you think of global memory pressure
to mean that root_mem_cgroup hit its hard limit, and that all existing
memcgs are hierarchically below the root_mem_cgroup. Although it is
technically not implemented that way, that would be the consistent
model.

My proposal is a generic and native way of enforcing soft limits: a
memcg hit its hard limit, reclaim from the hierarchy below it, prefer
those in excess of their soft limit.

While yours is special-cased to immediate descendants of the
root_mem_cgroup.

> The current "soft_steal" and "soft_scan" count pages being stolen/scanned
> inside mem_cgroup_hierarchical_reclaim() with check_soft checking, which
> then counts pages being reclaimed because of the soft_limit and also
> counts the hierarchical reclaim.

Yeah, I understand that. What I am saying is that in my code, every
time a hierarchy of memcgs is scanned (global memory reclaim, target
reclaim, kswapd or direct, it's all the same), a memcg that is in
excess of its soft limit is put under more pressure than its siblings.

There is no stand-alone 'now, go reclaim soft limits' cycle anymore.
As such, it would be impossible to maintain that counter.

2011-05-17 08:11:23

by Johannes Weiner

[permalink] [raw]
Subject: Re: [rfc patch 0/6] mm: memcg naturalization

On Mon, May 16, 2011 at 05:53:04PM -0700, Ying Han wrote:
> On Fri, May 13, 2011 at 12:20 AM, Johannes Weiner <[email protected]>wrote:
>
> > On Thu, May 12, 2011 at 11:53:37AM -0700, Ying Han wrote:
> > > On Thu, May 12, 2011 at 7:53 AM, Johannes Weiner <[email protected]>
> > wrote:
> > >
> > > > Hi!
> > > >
> > > > Here is a patch series that is a result of the memcg discussions on
> > > > LSF (memcg-aware global reclaim, global lru removal, struct
> > > > page_cgroup reduction, soft limit implementation) and the recent
> > > > feature discussions on linux-mm.
> > > >
> > > > The long-term idea is to have memcgs no longer bolted to the side of
> > > > the mm code, but integrate it as much as possible such that there is a
> > > > native understanding of containers, and that the traditional !memcg
> > > > setup is just a singular group. This series is an approach in that
> > > > direction.
> >
>
> This sounds like a good long-term plan. Now I wonder whether we should
> take it step by step by doing:
>
> 1. improving the existing soft_limit reclaim from RB-tree based to
> linked-list based, in a round-robin fashion.
> We can keep the existing APIs and only change the underlying
> implementation of mem_cgroup_soft_limit_reclaim().
>
> 2. removing the global lru list after the first step is proven to be
> efficient.
>
> 3. then integrating memcg reclaim better into the mm code.

I chose to go the other way because it did not seem more complex to me
and fixed many things we had planned anyway: deeper integration, a
better soft limit implementation (including better pressure
distribution and enforcement also from direct reclaim, not just
kswapd), global lru removal, etc.

That ground work was a bit unwieldy and I think quite some confusion
ensued, but I am currently reorganizing, cleaning up, and documenting.
I expect the next version to be much easier to understand.

The three steps are still this:

1. make traditional reclaim memcg-aware.

2. improve soft limit based on 1.

3. remove global lru based on 1.

But 1. already effectively disables the global LRU for memcg-enabled
kernels, so 3. can be deferred until we are comfortable with 1.

Hannes

2011-05-17 08:25:52

by Johannes Weiner

[permalink] [raw]
Subject: Re: [rfc patch 2/6] vmscan: make distinction between memcg reclaim and LRU list selection

On Mon, May 16, 2011 at 11:38:07PM -0700, Ying Han wrote:
> On Thu, May 12, 2011 at 9:03 AM, Johannes Weiner <[email protected]> wrote:
> > On Thu, May 12, 2011 at 11:33:13AM -0400, Rik van Riel wrote:
> >> On 05/12/2011 10:53 AM, Johannes Weiner wrote:
> >> >The reclaim code has a single predicate for whether it currently
> >> >reclaims on behalf of a memory cgroup, as well as whether it is
> >> >reclaiming from the global LRU list or a memory cgroup LRU list.
> >> >
> >> >Up to now, both cases always coincide, but subsequent patches will
> >> >change things such that global reclaim will scan memory cgroup lists.
> >> >
> >> >This patch adds a new predicate that tells global reclaim from memory
> >> >cgroup reclaim, and then changes all callsites that are actually about
> >> >global reclaim heuristics rather than strict LRU list selection.
> >> >
> >> >Signed-off-by: Johannes Weiner<[email protected]>
> >> >---
> >> >  mm/vmscan.c |   96 ++++++++++++++++++++++++++++++++++------------------------
> >> >  1 files changed, 56 insertions(+), 40 deletions(-)
> >> >
> >> >diff --git a/mm/vmscan.c b/mm/vmscan.c
> >> >index f6b435c..ceeb2a5 100644
> >> >--- a/mm/vmscan.c
> >> >+++ b/mm/vmscan.c
> >> >@@ -104,8 +104,12 @@ struct scan_control {
> >> >      */
> >> >     reclaim_mode_t reclaim_mode;
> >> >
> >> >-    /* Which cgroup do we reclaim from */
> >> >-    struct mem_cgroup *mem_cgroup;
> >> >+    /*
> >> >+     * The memory cgroup we reclaim on behalf of, and the one we
> >> >+     * are currently reclaiming from.
> >> >+     */
> >> >+    struct mem_cgroup *memcg;
> >> >+    struct mem_cgroup *current_memcg;
> >>
> >> I can't say I'm fond of these names. I had to read the
> >> rest of the patch to figure out that the old mem_cgroup
> >> got renamed to current_memcg.
> >
> > To clarify: sc->memcg will be the memcg that hit the hard limit and is
> > the main target of this reclaim invocation. current_memcg is the
> > iterator over the hierarchy below the target.
>
> I would assume the new variable memcg is a renaming of "mem_cgroup",
> indicating which cgroup we reclaim on behalf of.

The thing is, mem_cgroup would mean both the group we are reclaiming
on behalf of AND the group we are currently reclaiming from. Because
the hierarchy walk was implemented in memcontrol.c, vmscan.c only ever
saw one cgroup at a time.

> As for "current_memcg", I couldn't find where it is set to the
> current cgroup in the hierarchy below "memcg".

It's codified in shrink_zone().

        for each child of sc->memcg:
                sc->current_memcg = child
                reclaim(sc)
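
Roughly, in C, that loop would look like this (sketch only: the child
iterator mem_cgroup_next_child() is made up for illustration, while
do_shrink_zone() and the sc fields follow the series):

        /* Sketch of the walk above; mem_cgroup_next_child() is hypothetical. */
        static void shrink_zone(int priority, struct zone *zone,
                                struct scan_control *sc)
        {
                struct mem_cgroup *child = NULL;

                while ((child = mem_cgroup_next_child(sc->memcg, child))) {
                        /* the group whose pages are reclaimed right now */
                        sc->current_memcg = child;
                        do_shrink_zone(priority, zone, sc);
                }
        }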

In the new version I named (and documented) them:

        sc->target_mem_cgroup: the entry point into the hierarchy, set
        by the functions that have the scan control structure on their
        stack. That's the one hitting its hard limit.

        sc->mem_cgroup: the current position in the hierarchy below
        sc->target_mem_cgroup. That's the one that actively gets its
        pages reclaimed.
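
In scan_control terms, that amounts to something like this (a sketch of
the intended documentation, not the literal hunk from the next version):

        struct scan_control {
                /* ... other fields ... */

                /*
                 * The memcg that hit its hard limit; the entry point
                 * into the hierarchy for this reclaim invocation.
                 */
                struct mem_cgroup *target_mem_cgroup;

                /*
                 * Current position in the hierarchy below
                 * target_mem_cgroup; the group whose pages actively
                 * get reclaimed.
                 */
                struct mem_cgroup *mem_cgroup;
        };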

> Both mem_cgroup_shrink_node_zone() and try_to_free_mem_cgroup_pages()
> are called within mem_cgroup_hierarchical_reclaim(), and sc->memcg is
> initialized with the victim passed down, which is already a memcg
> within the hierarchy.

I changed mem_cgroup_shrink_node_zone() to use do_shrink_zone(), and
mem_cgroup_hierarchical_reclaim() no longer calls
try_to_free_mem_cgroup_pages().

So there is no hierarchy walk triggered from within a hierarchy walk.

I just noticed that there is, however, a bug in that
mem_cgroup_shrink_node_zone() does not initialize sc->current_memcg.
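
Something along these lines should plug that hole, I think (sketch
only; other initializers omitted, and the argument name mem is assumed):

        /*
         * Sketch of the missing initialization in
         * mem_cgroup_shrink_node_zone(): no hierarchy walk is wanted
         * here, so both fields point at the same group.
         */
        struct scan_control sc = {
                .memcg          = mem,
                .current_memcg  = mem,
                /* ... other fields as before ... */
        };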

2011-05-17 13:56:08

by Rik van Riel

[permalink] [raw]
Subject: Re: [rfc patch 4/6] memcg: reclaim statistics

On 05/17/2011 03:42 AM, Johannes Weiner wrote:

> It does hierarchical soft limit reclaim once triggered, but I meant
> that soft limits themselves have no hierarchical meaning. Say you
> have the following hierarchy:
>
>         root_mem_cgroup
>          /           \
>        aaa           bbb
>       /   \         /   \
>     a1     a2     b1     b2
>      |
>    a1-1
>
> Consider that aaa and a1 have a soft limit set. If global memory
> pressure arose, aaa and all its children would be pushed back with the
> current scheme, the one you are proposing, and the one I am proposing.
>
> But now consider aaa hitting its hard limit. Regular target reclaim
> will be triggered, and a1, a2, and a1-1 will be scanned equally from
> hierarchical reclaim. That a1 is in excess of its soft limit is not
> considered at all.
>
> With what I am proposing, a1 and a1-1 would be pushed back more
> aggressively than a2, because a1 is in excess of its soft limit and
> a1-1 is contributing to that.

Ying, I think Johannes has a good point. I do not see
a way to enforce the limits properly with the scheme we
came up with at LSF, in the hierarchical scenario above.

There may be a way, but until we think of it, I suspect
it will be better to go with Johannes's scheme for now.

--
All rights reversed