2017-07-25 11:41:54

by Roman Gushchin

[permalink] [raw]
Subject: [PATCH] mm, memcg: reset low limit during memcg offlining

A removed memory cgroup with a defined low limit and some belonging
pagecache has very low chances to be freed.

If a cgroup has been removed, there is likely no memory pressure inside
the cgroup, and the pagecache is protected from the external pressure
by the defined low limit. The cgroup will be freed only after
the reclaim of all belonging pages. And it will not happen until
there are any reclaimable memory in the system. That means,
there is a good chance, that a cold pagecache will reside
in the memory for an undefined amount of time, wasting
system resources.

Fix this issue by zeroing memcg->low during memcg offlining.

Signed-off-by: Roman Gushchin <[email protected]>
Cc: Tejun Heo <[email protected]>
Cc: Johannes Weiner <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Vladimir Davydov <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
---
mm/memcontrol.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index aed11b2d0251..2aa204b8f9fd 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -4300,6 +4300,8 @@ static void mem_cgroup_css_offline(struct cgroup_subsys_state *css)
}
spin_unlock(&memcg->event_list_lock);

+ memcg->low = 0;
+
memcg_offline_kmem(memcg);
wb_memcg_offline(memcg);

--
2.13.3


2017-07-25 11:58:14

by Michal Hocko

[permalink] [raw]
Subject: Re: [PATCH] mm, memcg: reset low limit during memcg offlining

On Tue 25-07-17 12:40:47, Roman Gushchin wrote:
> A removed memory cgroup with a defined low limit and some belonging
> pagecache has very low chances to be freed.
>
> If a cgroup has been removed, there is likely no memory pressure inside
> the cgroup, and the pagecache is protected from the external pressure
> by the defined low limit. The cgroup will be freed only after
> the reclaim of all belonging pages. And it will not happen until
> there are any reclaimable memory in the system. That means,
> there is a good chance, that a cold pagecache will reside
> in the memory for an undefined amount of time, wasting
> system resources.
>
> Fix this issue by zeroing memcg->low during memcg offlining.

Very well spotted! This goes all the way down to low limit inclusion
AFAICS. I would be even tempted to mark it for stable because hiding
some memory from reclaim basically indefinitely is not good. We might
have been just lucky nobody has noticed that yet.

Fixes: 241994ed8649 ("mm: memcontrol: default hierarchy interface for memory")

> Signed-off-by: Roman Gushchin <[email protected]>
> Cc: Tejun Heo <[email protected]>
> Cc: Johannes Weiner <[email protected]>
> Cc: Michal Hocko <[email protected]>
> Cc: Vladimir Davydov <[email protected]>
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]

Acked-by: Michal Hocko <[email protected]>

Thanks!

> ---
> mm/memcontrol.c | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index aed11b2d0251..2aa204b8f9fd 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -4300,6 +4300,8 @@ static void mem_cgroup_css_offline(struct cgroup_subsys_state *css)
> }
> spin_unlock(&memcg->event_list_lock);
>
> + memcg->low = 0;
> +
> memcg_offline_kmem(memcg);
> wb_memcg_offline(memcg);
>
> --
> 2.13.3

--
Michal Hocko
SUSE Labs

2017-07-25 12:05:42

by Vladimir Davydov

[permalink] [raw]
Subject: Re: [PATCH] mm, memcg: reset low limit during memcg offlining

On Tue, Jul 25, 2017 at 12:40:47PM +0100, Roman Gushchin wrote:
> A removed memory cgroup with a defined low limit and some belonging
> pagecache has very low chances to be freed.
>
> If a cgroup has been removed, there is likely no memory pressure inside
> the cgroup, and the pagecache is protected from the external pressure
> by the defined low limit. The cgroup will be freed only after
> the reclaim of all belonging pages. And it will not happen until
> there are any reclaimable memory in the system. That means,
> there is a good chance, that a cold pagecache will reside
> in the memory for an undefined amount of time, wasting
> system resources.
>
> Fix this issue by zeroing memcg->low during memcg offlining.
>
> Signed-off-by: Roman Gushchin <[email protected]>
> Cc: Tejun Heo <[email protected]>
> Cc: Johannes Weiner <[email protected]>
> Cc: Michal Hocko <[email protected]>
> Cc: Vladimir Davydov <[email protected]>
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> ---
> mm/memcontrol.c | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index aed11b2d0251..2aa204b8f9fd 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -4300,6 +4300,8 @@ static void mem_cgroup_css_offline(struct cgroup_subsys_state *css)
> }
> spin_unlock(&memcg->event_list_lock);
>
> + memcg->low = 0;
> +
> memcg_offline_kmem(memcg);
> wb_memcg_offline(memcg);
>

We already have that - see mem_cgroup_css_reset().

2017-07-25 12:07:08

by Roman Gushchin

[permalink] [raw]
Subject: Re: [PATCH] mm, memcg: reset low limit during memcg offlining

On Tue, Jul 25, 2017 at 01:58:08PM +0200, Michal Hocko wrote:
> On Tue 25-07-17 12:40:47, Roman Gushchin wrote:
> > A removed memory cgroup with a defined low limit and some belonging
> > pagecache has very low chances to be freed.
> >
> > If a cgroup has been removed, there is likely no memory pressure inside
> > the cgroup, and the pagecache is protected from the external pressure
> > by the defined low limit. The cgroup will be freed only after
> > the reclaim of all belonging pages. And it will not happen until
> > there are any reclaimable memory in the system. That means,
> > there is a good chance, that a cold pagecache will reside
> > in the memory for an undefined amount of time, wasting
> > system resources.
> >
> > Fix this issue by zeroing memcg->low during memcg offlining.
>
> Very well spotted! This goes all the way down to low limit inclusion
> AFAICS. I would be even tempted to mark it for stable because hiding
> some memory from reclaim basically indefinitely is not good. We might
> have been just lucky nobody has noticed that yet.

I believe it's because there are not so many actual low limit users,
and those who do, are using some offstream patches to mitigate this issue.

Thanks!

Roman

2017-07-25 12:31:42

by Roman Gushchin

[permalink] [raw]
Subject: Re: [PATCH] mm, memcg: reset low limit during memcg offlining

On Tue, Jul 25, 2017 at 03:05:37PM +0300, Vladimir Davydov wrote:
> On Tue, Jul 25, 2017 at 12:40:47PM +0100, Roman Gushchin wrote:
> > A removed memory cgroup with a defined low limit and some belonging
> > pagecache has very low chances to be freed.
> >
> > If a cgroup has been removed, there is likely no memory pressure inside
> > the cgroup, and the pagecache is protected from the external pressure
> > by the defined low limit. The cgroup will be freed only after
> > the reclaim of all belonging pages. And it will not happen until
> > there are any reclaimable memory in the system. That means,
> > there is a good chance, that a cold pagecache will reside
> > in the memory for an undefined amount of time, wasting
> > system resources.
> >
> > Fix this issue by zeroing memcg->low during memcg offlining.
> >
> > Signed-off-by: Roman Gushchin <[email protected]>
> > Cc: Tejun Heo <[email protected]>
> > Cc: Johannes Weiner <[email protected]>
> > Cc: Michal Hocko <[email protected]>
> > Cc: Vladimir Davydov <[email protected]>
> > Cc: [email protected]
> > Cc: [email protected]
> > Cc: [email protected]
> > Cc: [email protected]
> > ---
> > mm/memcontrol.c | 2 ++
> > 1 file changed, 2 insertions(+)
> >
> > diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> > index aed11b2d0251..2aa204b8f9fd 100644
> > --- a/mm/memcontrol.c
> > +++ b/mm/memcontrol.c
> > @@ -4300,6 +4300,8 @@ static void mem_cgroup_css_offline(struct cgroup_subsys_state *css)
> > }
> > spin_unlock(&memcg->event_list_lock);
> >
> > + memcg->low = 0;
> > +
> > memcg_offline_kmem(memcg);
> > wb_memcg_offline(memcg);
> >
>
> We already have that - see mem_cgroup_css_reset().

Hm, I see...

But are you sure, that calling mem_cgroup_css_reset() from offlining path
is always a good idea?

As I understand, css_reset() callback is intended to _completely_ disable all
limits, as if there were no cgroup at all. And it's main purpose to be called
when controllers are detached from the hierarhy.

Offlining is different: some limits make perfect sence after offlining
(e.g. we want to limit the writeback speed), and other might be tweaked
(e.g. we can set soft limit to prioritize reclaiming of abandoned cgroups).

So, I'd prefer to move this code to the offlining callback,
and not to call css_reset.

But, anyway, thanks for pointing at the mem_cgroup_css_reset().

Roman

2017-07-25 12:44:29

by Michal Hocko

[permalink] [raw]
Subject: Re: [PATCH] mm, memcg: reset low limit during memcg offlining

On Tue 25-07-17 13:31:13, Roman Gushchin wrote:
> On Tue, Jul 25, 2017 at 03:05:37PM +0300, Vladimir Davydov wrote:
> > On Tue, Jul 25, 2017 at 12:40:47PM +0100, Roman Gushchin wrote:
> > > A removed memory cgroup with a defined low limit and some belonging
> > > pagecache has very low chances to be freed.
> > >
> > > If a cgroup has been removed, there is likely no memory pressure inside
> > > the cgroup, and the pagecache is protected from the external pressure
> > > by the defined low limit. The cgroup will be freed only after
> > > the reclaim of all belonging pages. And it will not happen until
> > > there are any reclaimable memory in the system. That means,
> > > there is a good chance, that a cold pagecache will reside
> > > in the memory for an undefined amount of time, wasting
> > > system resources.
> > >
> > > Fix this issue by zeroing memcg->low during memcg offlining.
> > >
> > > Signed-off-by: Roman Gushchin <[email protected]>
> > > Cc: Tejun Heo <[email protected]>
> > > Cc: Johannes Weiner <[email protected]>
> > > Cc: Michal Hocko <[email protected]>
> > > Cc: Vladimir Davydov <[email protected]>
> > > Cc: [email protected]
> > > Cc: [email protected]
> > > Cc: [email protected]
> > > Cc: [email protected]
> > > ---
> > > mm/memcontrol.c | 2 ++
> > > 1 file changed, 2 insertions(+)
> > >
> > > diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> > > index aed11b2d0251..2aa204b8f9fd 100644
> > > --- a/mm/memcontrol.c
> > > +++ b/mm/memcontrol.c
> > > @@ -4300,6 +4300,8 @@ static void mem_cgroup_css_offline(struct cgroup_subsys_state *css)
> > > }
> > > spin_unlock(&memcg->event_list_lock);
> > >
> > > + memcg->low = 0;
> > > +
> > > memcg_offline_kmem(memcg);
> > > wb_memcg_offline(memcg);
> > >
> >
> > We already have that - see mem_cgroup_css_reset().
>
> Hm, I see...
>
> But are you sure, that calling mem_cgroup_css_reset() from offlining path
> is always a good idea?

Well, originally I wanted to suggest the same but then I asked the very
same question and couldn't answer it myself. memcg_offline_kmem feels
much more generic.

> As I understand, css_reset() callback is intended to _completely_ disable all
> limits, as if there were no cgroup at all. And it's main purpose to be called
> when controllers are detached from the hierarhy.

yes, that is my understanding as well.

> Offlining is different: some limits make perfect sence after offlining
> (e.g. we want to limit the writeback speed), and other might be tweaked
> (e.g. we can set soft limit to prioritize reclaiming of abandoned cgroups).

and the writeback path was exactly the one that triggered my
suspicious...
--
Michal Hocko
SUSE Labs

2017-07-26 08:30:24

by Vladimir Davydov

[permalink] [raw]
Subject: Re: [PATCH] mm, memcg: reset low limit during memcg offlining

On Tue, Jul 25, 2017 at 01:31:13PM +0100, Roman Gushchin wrote:
> On Tue, Jul 25, 2017 at 03:05:37PM +0300, Vladimir Davydov wrote:
> > On Tue, Jul 25, 2017 at 12:40:47PM +0100, Roman Gushchin wrote:
> > > A removed memory cgroup with a defined low limit and some belonging
> > > pagecache has very low chances to be freed.
> > >
> > > If a cgroup has been removed, there is likely no memory pressure inside
> > > the cgroup, and the pagecache is protected from the external pressure
> > > by the defined low limit. The cgroup will be freed only after
> > > the reclaim of all belonging pages. And it will not happen until
> > > there are any reclaimable memory in the system. That means,
> > > there is a good chance, that a cold pagecache will reside
> > > in the memory for an undefined amount of time, wasting
> > > system resources.
> > >
> > > Fix this issue by zeroing memcg->low during memcg offlining.
> > >
> > > Signed-off-by: Roman Gushchin <[email protected]>
> > > Cc: Tejun Heo <[email protected]>
> > > Cc: Johannes Weiner <[email protected]>
> > > Cc: Michal Hocko <[email protected]>
> > > Cc: Vladimir Davydov <[email protected]>
> > > Cc: [email protected]
> > > Cc: [email protected]
> > > Cc: [email protected]
> > > Cc: [email protected]
> > > ---
> > > mm/memcontrol.c | 2 ++
> > > 1 file changed, 2 insertions(+)
> > >
> > > diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> > > index aed11b2d0251..2aa204b8f9fd 100644
> > > --- a/mm/memcontrol.c
> > > +++ b/mm/memcontrol.c
> > > @@ -4300,6 +4300,8 @@ static void mem_cgroup_css_offline(struct cgroup_subsys_state *css)
> > > }
> > > spin_unlock(&memcg->event_list_lock);
> > >
> > > + memcg->low = 0;
> > > +
> > > memcg_offline_kmem(memcg);
> > > wb_memcg_offline(memcg);
> > >
> >
> > We already have that - see mem_cgroup_css_reset().
>
> Hm, I see...
>
> But are you sure, that calling mem_cgroup_css_reset() from offlining path
> is always a good idea?
>
> As I understand, css_reset() callback is intended to _completely_ disable all
> limits, as if there were no cgroup at all.

But that's exactly what cgroup offline is: deletion of a cgroup as if it
never existed. The fact that we leave the zombie dangling until all
pages charged to the cgroup are gone is an implementation detail. IIRC
we would "reparent" those charges and delete the mem_cgroup right away
if it were not inherently racy.

> And it's main purpose to be called
> when controllers are detached from the hierarhy.
>
> Offlining is different: some limits make perfect sence after offlining
> (e.g. we want to limit the writeback speed), and other might be tweaked
> (e.g. we can set soft limit to prioritize reclaiming of abandoned cgroups).

The user can't tweak limits of an offline cgroup, because the cgroup
directory no longer exist. So IMHO resetting all limits is reasonable.
If you want to keep the cgroup limits effective, you shouldn't have
deleted it in the first place, I suppose.

You might also want to check out this:

http://www.spinics.net/lists/linux-mm/msg102995.html

>
> So, I'd prefer to move this code to the offlining callback,
> and not to call css_reset.
>
> But, anyway, thanks for pointing at the mem_cgroup_css_reset().

2017-07-26 12:06:54

by Tejun Heo

[permalink] [raw]
Subject: Re: [PATCH] mm, memcg: reset low limit during memcg offlining

Hello, Vladimir.

On Wed, Jul 26, 2017 at 11:30:17AM +0300, Vladimir Davydov wrote:
> > As I understand, css_reset() callback is intended to _completely_ disable all
> > limits, as if there were no cgroup at all.
>
> But that's exactly what cgroup offline is: deletion of a cgroup as if it
> never existed. The fact that we leave the zombie dangling until all
> pages charged to the cgroup are gone is an implementation detail. IIRC
> we would "reparent" those charges and delete the mem_cgroup right away
> if it were not inherently racy.

That may be true for memcg but not in general. Think about writeback
IOs servicing dirty pages of a removed cgroup. Removing a cgroup
shouldn't grant it more resources than when it was alive and changing
the membership to the parent will break that. For memcg, they seem
the same just because no new major consumption can be generated after
removal.

> The user can't tweak limits of an offline cgroup, because the cgroup
> directory no longer exist. So IMHO resetting all limits is reasonable.
> If you want to keep the cgroup limits effective, you shouldn't have
> deleted it in the first place, I suppose.

I don't think that's the direction we wanna go. Granting more
resources on removal is surprising.

Thanks.

--
tejun

2017-07-27 13:05:11

by Roman Gushchin

[permalink] [raw]
Subject: [PATCH 1/2] mm, memcg: reset memory.low during memcg offlining

A removed memory cgroup with a defined memory.low and some belonging
pagecache has very low chances to be freed.

If a cgroup has been removed, there is likely no memory pressure inside
the cgroup, and the pagecache is protected from the external pressure
by the defined low limit. The cgroup will be freed only after
the reclaim of all belonging pages. And it will not happen until
there are any reclaimable memory in the system. That means,
there is a good chance, that a cold pagecache will reside
in the memory for an undefined amount of time, wasting
system resources.

This problem was fixed earlier by commit fa06235b8eb0
("cgroup: reset css on destruction"), but it's not a best way
to do it, as we can't really reset all limits/counters during
cgroup offlining.

Signed-off-by: Roman Gushchin <[email protected]>
Cc: Vladimir Davydov <[email protected]>
Cc: Tejun Heo <[email protected]>
Cc: Johannes Weiner <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
---
mm/memcontrol.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index d61133e6af99..7b24210596ea 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -4300,6 +4300,8 @@ static void mem_cgroup_css_offline(struct cgroup_subsys_state *css)
}
spin_unlock(&memcg->event_list_lock);

+ memcg->low = 0;
+
memcg_offline_kmem(memcg);
wb_memcg_offline(memcg);

--
2.13.3

2017-07-27 13:05:33

by Roman Gushchin

[permalink] [raw]
Subject: [PATCH 2/2] cgroup: revert fa06235b8eb0 ("cgroup: reset css on destruction")

Commit fa06235b8eb0 ("cgroup: reset css on destruction") caused
css_reset callback to be called from the offlining path. Although
it solves the problem mentioned in the commit description
("For instance, memory cgroup needs to reset memory.low, otherwise
pages charged to a dead cgroup might never get reclaimed."),
generally speaking, it's not correct.

An offline cgroup can still be a resource domain, and we shouldn't
grant it more resources than it had before deletion.

For instance, if an offline memory cgroup has dirty pages, we should
still imply i/o limits during writeback.

The css_reset callback is designed to return the cgroup state
into the original state, that means reset all limits and counters.
It's spomething different from the offlining, and we shouldn't use
it from the offlining path. Instead, we should adjust necessary
settings from the per-controller css_offline callbacks (e.g. reset
memory.low).

Signed-off-by: Roman Gushchin <[email protected]>
Cc: Vladimir Davydov <[email protected]>
Cc: Tejun Heo <[email protected]>
Cc: Johannes Weiner <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
---
kernel/cgroup/cgroup.c | 3 ---
1 file changed, 3 deletions(-)

diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
index 29c36c075249..4e93482e066c 100644
--- a/kernel/cgroup/cgroup.c
+++ b/kernel/cgroup/cgroup.c
@@ -4499,9 +4499,6 @@ static void offline_css(struct cgroup_subsys_state *css)
if (!(css->flags & CSS_ONLINE))
return;

- if (ss->css_reset)
- ss->css_reset(css);
-
if (ss->css_offline)
ss->css_offline(css);

--
2.13.3

2017-07-27 13:52:46

by Tejun Heo

[permalink] [raw]
Subject: Re: [PATCH 2/2] cgroup: revert fa06235b8eb0 ("cgroup: reset css on destruction")

On Thu, Jul 27, 2017 at 02:04:28PM +0100, Roman Gushchin wrote:
> Commit fa06235b8eb0 ("cgroup: reset css on destruction") caused
> css_reset callback to be called from the offlining path. Although
> it solves the problem mentioned in the commit description
> ("For instance, memory cgroup needs to reset memory.low, otherwise
> pages charged to a dead cgroup might never get reclaimed."),
> generally speaking, it's not correct.
>
> An offline cgroup can still be a resource domain, and we shouldn't
> grant it more resources than it had before deletion.
>
> For instance, if an offline memory cgroup has dirty pages, we should
> still imply i/o limits during writeback.
>
> The css_reset callback is designed to return the cgroup state
> into the original state, that means reset all limits and counters.
> It's spomething different from the offlining, and we shouldn't use
> it from the offlining path. Instead, we should adjust necessary
> settings from the per-controller css_offline callbacks (e.g. reset
> memory.low).
>
> Signed-off-by: Roman Gushchin <[email protected]>
> Cc: Vladimir Davydov <[email protected]>
> Cc: Tejun Heo <[email protected]>
> Cc: Johannes Weiner <[email protected]>
> Cc: Michal Hocko <[email protected]>
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]

Acked-by: Tejun Heo <[email protected]>

Please feel free to route with the previous patch through -mm.

Thanks.

--
tejun

2017-07-27 14:35:28

by Johannes Weiner

[permalink] [raw]
Subject: Re: [PATCH 1/2] mm, memcg: reset memory.low during memcg offlining

CC Andrew - can you route these through the -mm tree please?

On Thu, Jul 27, 2017 at 02:04:27PM +0100, Roman Gushchin wrote:
> A removed memory cgroup with a defined memory.low and some belonging
> pagecache has very low chances to be freed.
>
> If a cgroup has been removed, there is likely no memory pressure inside
> the cgroup, and the pagecache is protected from the external pressure
> by the defined low limit. The cgroup will be freed only after
> the reclaim of all belonging pages. And it will not happen until
> there are any reclaimable memory in the system. That means,
> there is a good chance, that a cold pagecache will reside
> in the memory for an undefined amount of time, wasting
> system resources.
>
> This problem was fixed earlier by commit fa06235b8eb0
> ("cgroup: reset css on destruction"), but it's not a best way
> to do it, as we can't really reset all limits/counters during
> cgroup offlining.
>
> Signed-off-by: Roman Gushchin <[email protected]>
> Cc: Vladimir Davydov <[email protected]>
> Cc: Tejun Heo <[email protected]>
> Cc: Johannes Weiner <[email protected]>
> Cc: Michal Hocko <[email protected]>
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]

Acked-by: Johannes Weiner <[email protected]>

2017-07-27 14:36:23

by Johannes Weiner

[permalink] [raw]
Subject: Re: [PATCH 2/2] cgroup: revert fa06235b8eb0 ("cgroup: reset css on destruction")

On Thu, Jul 27, 2017 at 02:04:28PM +0100, Roman Gushchin wrote:
> Commit fa06235b8eb0 ("cgroup: reset css on destruction") caused
> css_reset callback to be called from the offlining path. Although
> it solves the problem mentioned in the commit description
> ("For instance, memory cgroup needs to reset memory.low, otherwise
> pages charged to a dead cgroup might never get reclaimed."),
> generally speaking, it's not correct.
>
> An offline cgroup can still be a resource domain, and we shouldn't
> grant it more resources than it had before deletion.
>
> For instance, if an offline memory cgroup has dirty pages, we should
> still imply i/o limits during writeback.
>
> The css_reset callback is designed to return the cgroup state
> into the original state, that means reset all limits and counters.
> It's spomething different from the offlining, and we shouldn't use
> it from the offlining path. Instead, we should adjust necessary
> settings from the per-controller css_offline callbacks (e.g. reset
> memory.low).
>
> Signed-off-by: Roman Gushchin <[email protected]>
> Cc: Vladimir Davydov <[email protected]>
> Cc: Tejun Heo <[email protected]>
> Cc: Johannes Weiner <[email protected]>
> Cc: Michal Hocko <[email protected]>
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]

Acked-by: Johannes Weiner <[email protected]>

2017-07-27 14:48:00

by Michal Hocko

[permalink] [raw]
Subject: Re: [PATCH 1/2] mm, memcg: reset memory.low during memcg offlining

On Thu 27-07-17 14:04:27, Roman Gushchin wrote:
> A removed memory cgroup with a defined memory.low and some belonging
> pagecache has very low chances to be freed.
>
> If a cgroup has been removed, there is likely no memory pressure inside
> the cgroup, and the pagecache is protected from the external pressure
> by the defined low limit. The cgroup will be freed only after
> the reclaim of all belonging pages. And it will not happen until
> there are any reclaimable memory in the system. That means,
> there is a good chance, that a cold pagecache will reside
> in the memory for an undefined amount of time, wasting
> system resources.
>
> This problem was fixed earlier by commit fa06235b8eb0
> ("cgroup: reset css on destruction"), but it's not a best way
> to do it, as we can't really reset all limits/counters during
> cgroup offlining.
>
> Signed-off-by: Roman Gushchin <[email protected]>
> Cc: Vladimir Davydov <[email protected]>
> Cc: Tejun Heo <[email protected]>
> Cc: Johannes Weiner <[email protected]>
> Cc: Michal Hocko <[email protected]>
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]

my ack for this patch still holds.
Acked-by: Michal Hocko <[email protected]>

> ---
> mm/memcontrol.c | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index d61133e6af99..7b24210596ea 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -4300,6 +4300,8 @@ static void mem_cgroup_css_offline(struct cgroup_subsys_state *css)
> }
> spin_unlock(&memcg->event_list_lock);
>
> + memcg->low = 0;
> +
> memcg_offline_kmem(memcg);
> wb_memcg_offline(memcg);
>
> --
> 2.13.3

--
Michal Hocko
SUSE Labs