2020-04-30 18:29:45

by Shakeel Butt

[permalink] [raw]
Subject: [PATCH] memcg: oom: ignore oom warnings from memory.max

Lowering memory.max can trigger an oom-kill if the reclaim does not
succeed. However if oom-killer does not find a process for killing, it
dumps a lot of warnings.

Deleting a memcg does not reclaim memory from it and the memory can
linger till there is a memory pressure. One normal way to proactively
reclaim such memory is to set memory.max to 0 just before deleting the
memcg. However if some of the memcg's memory is pinned by others, this
operation can trigger an oom-kill without any process and thus can log a
lot un-needed warnings. So, ignore all such warnings from memory.max.

Signed-off-by: Shakeel Butt <[email protected]>
---
include/linux/oom.h | 3 +++
mm/memcontrol.c | 9 +++++----
mm/oom_kill.c | 2 +-
3 files changed, 9 insertions(+), 5 deletions(-)

diff --git a/include/linux/oom.h b/include/linux/oom.h
index c696c265f019..6345dc55df64 100644
--- a/include/linux/oom.h
+++ b/include/linux/oom.h
@@ -52,6 +52,9 @@ struct oom_control {

/* Used to print the constraint info. */
enum oom_constraint constraint;
+
+ /* Do not warn even if there is no process to be killed. */
+ bool no_warn;
};

extern struct mutex oom_lock;
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 317dbbaac603..a1f00d9b9bb0 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -1571,7 +1571,7 @@ unsigned long mem_cgroup_size(struct mem_cgroup *memcg)
}

static bool mem_cgroup_out_of_memory(struct mem_cgroup *memcg, gfp_t gfp_mask,
- int order)
+ int order, bool no_warn)
{
struct oom_control oc = {
.zonelist = NULL,
@@ -1579,6 +1579,7 @@ static bool mem_cgroup_out_of_memory(struct mem_cgroup *memcg, gfp_t gfp_mask,
.memcg = memcg,
.gfp_mask = gfp_mask,
.order = order,
+ .no_warn = no_warn,
};
bool ret;

@@ -1821,7 +1822,7 @@ static enum oom_status mem_cgroup_oom(struct mem_cgroup *memcg, gfp_t mask, int
mem_cgroup_oom_notify(memcg);

mem_cgroup_unmark_under_oom(memcg);
- if (mem_cgroup_out_of_memory(memcg, mask, order))
+ if (mem_cgroup_out_of_memory(memcg, mask, order, false))
ret = OOM_SUCCESS;
else
ret = OOM_FAILED;
@@ -1880,7 +1881,7 @@ bool mem_cgroup_oom_synchronize(bool handle)
mem_cgroup_unmark_under_oom(memcg);
finish_wait(&memcg_oom_waitq, &owait.wait);
mem_cgroup_out_of_memory(memcg, current->memcg_oom_gfp_mask,
- current->memcg_oom_order);
+ current->memcg_oom_order, false);
} else {
schedule();
mem_cgroup_unmark_under_oom(memcg);
@@ -6106,7 +6107,7 @@ static ssize_t memory_max_write(struct kernfs_open_file *of,
}

memcg_memory_event(memcg, MEMCG_OOM);
- if (!mem_cgroup_out_of_memory(memcg, GFP_KERNEL, 0))
+ if (!mem_cgroup_out_of_memory(memcg, GFP_KERNEL, 0, true))
break;
}

diff --git a/mm/oom_kill.c b/mm/oom_kill.c
index 463b3d74a64a..5ace39f6fe1e 100644
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -1098,7 +1098,7 @@ bool out_of_memory(struct oom_control *oc)

select_bad_process(oc);
/* Found nothing?!?! */
- if (!oc->chosen) {
+ if (!oc->chosen && !oc->no_warn) {
dump_header(oc, NULL);
pr_warn("Out of memory and no killable processes...\n");
/*
--
2.26.2.526.g744177e7f7-goog


2020-04-30 19:08:52

by Roman Gushchin

[permalink] [raw]
Subject: Re: [PATCH] memcg: oom: ignore oom warnings from memory.max

Hello, Shakeel!

On Thu, Apr 30, 2020 at 11:27:12AM -0700, Shakeel Butt wrote:
> Lowering memory.max can trigger an oom-kill if the reclaim does not
> succeed. However if oom-killer does not find a process for killing, it
> dumps a lot of warnings.

Makes total sense to me.

>
> Deleting a memcg does not reclaim memory from it and the memory can
> linger till there is a memory pressure. One normal way to proactively
> reclaim such memory is to set memory.max to 0 just before deleting the
> memcg. However if some of the memcg's memory is pinned by others, this
> operation can trigger an oom-kill without any process and thus can log a
> lot un-needed warnings. So, ignore all such warnings from memory.max.
>
> Signed-off-by: Shakeel Butt <[email protected]>
> ---
> include/linux/oom.h | 3 +++
> mm/memcontrol.c | 9 +++++----
> mm/oom_kill.c | 2 +-
> 3 files changed, 9 insertions(+), 5 deletions(-)
>
> diff --git a/include/linux/oom.h b/include/linux/oom.h
> index c696c265f019..6345dc55df64 100644
> --- a/include/linux/oom.h
> +++ b/include/linux/oom.h
> @@ -52,6 +52,9 @@ struct oom_control {
>
> /* Used to print the constraint info. */
> enum oom_constraint constraint;
> +
> + /* Do not warn even if there is no process to be killed. */
> + bool no_warn;

I'd invert it to warn. Or maybe even warn_on_no_proc?

> };
>
> extern struct mutex oom_lock;
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 317dbbaac603..a1f00d9b9bb0 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -1571,7 +1571,7 @@ unsigned long mem_cgroup_size(struct mem_cgroup *memcg)
> }
>
> static bool mem_cgroup_out_of_memory(struct mem_cgroup *memcg, gfp_t gfp_mask,
> - int order)
> + int order, bool no_warn)
> {
> struct oom_control oc = {
> .zonelist = NULL,
> @@ -1579,6 +1579,7 @@ static bool mem_cgroup_out_of_memory(struct mem_cgroup *memcg, gfp_t gfp_mask,
> .memcg = memcg,
> .gfp_mask = gfp_mask,
> .order = order,
> + .no_warn = no_warn,
> };
> bool ret;
>
> @@ -1821,7 +1822,7 @@ static enum oom_status mem_cgroup_oom(struct mem_cgroup *memcg, gfp_t mask, int
> mem_cgroup_oom_notify(memcg);
>
> mem_cgroup_unmark_under_oom(memcg);
> - if (mem_cgroup_out_of_memory(memcg, mask, order))
> + if (mem_cgroup_out_of_memory(memcg, mask, order, false))
> ret = OOM_SUCCESS;
> else
> ret = OOM_FAILED;
> @@ -1880,7 +1881,7 @@ bool mem_cgroup_oom_synchronize(bool handle)
> mem_cgroup_unmark_under_oom(memcg);
> finish_wait(&memcg_oom_waitq, &owait.wait);
> mem_cgroup_out_of_memory(memcg, current->memcg_oom_gfp_mask,
> - current->memcg_oom_order);
> + current->memcg_oom_order, false);
> } else {
> schedule();
> mem_cgroup_unmark_under_oom(memcg);
> @@ -6106,7 +6107,7 @@ static ssize_t memory_max_write(struct kernfs_open_file *of,
> }
>
> memcg_memory_event(memcg, MEMCG_OOM);
> - if (!mem_cgroup_out_of_memory(memcg, GFP_KERNEL, 0))
> + if (!mem_cgroup_out_of_memory(memcg, GFP_KERNEL, 0, true))

I wonder if we can handle it automatically from the oom_killer side?
We can suppress warnings if oc->memcg is set and the cgroup scanning
showed that there are no belonging processes?

Thanks!

2020-04-30 19:31:14

by Johannes Weiner

[permalink] [raw]
Subject: Re: [PATCH] memcg: oom: ignore oom warnings from memory.max

On Thu, Apr 30, 2020 at 11:27:12AM -0700, Shakeel Butt wrote:
> Lowering memory.max can trigger an oom-kill if the reclaim does not
> succeed. However if oom-killer does not find a process for killing, it
> dumps a lot of warnings.
>
> Deleting a memcg does not reclaim memory from it and the memory can
> linger till there is a memory pressure. One normal way to proactively
> reclaim such memory is to set memory.max to 0 just before deleting the
> memcg. However if some of the memcg's memory is pinned by others, this
> operation can trigger an oom-kill without any process and thus can log a
> lot un-needed warnings. So, ignore all such warnings from memory.max.

Can't you set memory.high=0 instead? It does the reclaim portion of
memory.max, without the actual OOM killing that causes you problems.

2020-04-30 19:33:21

by Johannes Weiner

[permalink] [raw]
Subject: Re: [PATCH] memcg: oom: ignore oom warnings from memory.max

On Thu, Apr 30, 2020 at 12:06:10PM -0700, Roman Gushchin wrote:
> On Thu, Apr 30, 2020 at 11:27:12AM -0700, Shakeel Butt wrote:
> > @@ -6106,7 +6107,7 @@ static ssize_t memory_max_write(struct kernfs_open_file *of,
> > }
> >
> > memcg_memory_event(memcg, MEMCG_OOM);
> > - if (!mem_cgroup_out_of_memory(memcg, GFP_KERNEL, 0))
> > + if (!mem_cgroup_out_of_memory(memcg, GFP_KERNEL, 0, true))
>
> I wonder if we can handle it automatically from the oom_killer side?
> We can suppress warnings if oc->memcg is set and the cgroup scanning
> showed that there are no belonging processes?

Note that we do remote charging for certain consumers, where memory
gets charged from the outside of the cgroup.

We would want to know if these cause OOMs on an empty cgroup, rather
than force-charge the allocations quietly.

2020-04-30 19:36:04

by Shakeel Butt

[permalink] [raw]
Subject: Re: [PATCH] memcg: oom: ignore oom warnings from memory.max

On Thu, Apr 30, 2020 at 12:06 PM Roman Gushchin <[email protected]> wrote:
>
> Hello, Shakeel!
>
> On Thu, Apr 30, 2020 at 11:27:12AM -0700, Shakeel Butt wrote:
> > Lowering memory.max can trigger an oom-kill if the reclaim does not
> > succeed. However if oom-killer does not find a process for killing, it
> > dumps a lot of warnings.
>
> Makes total sense to me.
>
> >
> > Deleting a memcg does not reclaim memory from it and the memory can
> > linger till there is a memory pressure. One normal way to proactively
> > reclaim such memory is to set memory.max to 0 just before deleting the
> > memcg. However if some of the memcg's memory is pinned by others, this
> > operation can trigger an oom-kill without any process and thus can log a
> > lot un-needed warnings. So, ignore all such warnings from memory.max.
> >
> > Signed-off-by: Shakeel Butt <[email protected]>
> > ---
> > include/linux/oom.h | 3 +++
> > mm/memcontrol.c | 9 +++++----
> > mm/oom_kill.c | 2 +-
> > 3 files changed, 9 insertions(+), 5 deletions(-)
> >
> > diff --git a/include/linux/oom.h b/include/linux/oom.h
> > index c696c265f019..6345dc55df64 100644
> > --- a/include/linux/oom.h
> > +++ b/include/linux/oom.h
> > @@ -52,6 +52,9 @@ struct oom_control {
> >
> > /* Used to print the constraint info. */
> > enum oom_constraint constraint;
> > +
> > + /* Do not warn even if there is no process to be killed. */
> > + bool no_warn;
>
> I'd invert it to warn. Or maybe even warn_on_no_proc?
>

Sure.

> > };
> >
> > extern struct mutex oom_lock;
> > diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> > index 317dbbaac603..a1f00d9b9bb0 100644
> > --- a/mm/memcontrol.c
> > +++ b/mm/memcontrol.c
> > @@ -1571,7 +1571,7 @@ unsigned long mem_cgroup_size(struct mem_cgroup *memcg)
> > }
> >
> > static bool mem_cgroup_out_of_memory(struct mem_cgroup *memcg, gfp_t gfp_mask,
> > - int order)
> > + int order, bool no_warn)
> > {
> > struct oom_control oc = {
> > .zonelist = NULL,
> > @@ -1579,6 +1579,7 @@ static bool mem_cgroup_out_of_memory(struct mem_cgroup *memcg, gfp_t gfp_mask,
> > .memcg = memcg,
> > .gfp_mask = gfp_mask,
> > .order = order,
> > + .no_warn = no_warn,
> > };
> > bool ret;
> >
> > @@ -1821,7 +1822,7 @@ static enum oom_status mem_cgroup_oom(struct mem_cgroup *memcg, gfp_t mask, int
> > mem_cgroup_oom_notify(memcg);
> >
> > mem_cgroup_unmark_under_oom(memcg);
> > - if (mem_cgroup_out_of_memory(memcg, mask, order))
> > + if (mem_cgroup_out_of_memory(memcg, mask, order, false))
> > ret = OOM_SUCCESS;
> > else
> > ret = OOM_FAILED;
> > @@ -1880,7 +1881,7 @@ bool mem_cgroup_oom_synchronize(bool handle)
> > mem_cgroup_unmark_under_oom(memcg);
> > finish_wait(&memcg_oom_waitq, &owait.wait);
> > mem_cgroup_out_of_memory(memcg, current->memcg_oom_gfp_mask,
> > - current->memcg_oom_order);
> > + current->memcg_oom_order, false);
> > } else {
> > schedule();
> > mem_cgroup_unmark_under_oom(memcg);
> > @@ -6106,7 +6107,7 @@ static ssize_t memory_max_write(struct kernfs_open_file *of,
> > }
> >
> > memcg_memory_event(memcg, MEMCG_OOM);
> > - if (!mem_cgroup_out_of_memory(memcg, GFP_KERNEL, 0))
> > + if (!mem_cgroup_out_of_memory(memcg, GFP_KERNEL, 0, true))
>
> I wonder if we can handle it automatically from the oom_killer side?
> We can suppress warnings if oc->memcg is set and the cgroup scanning
> showed that there are no belonging processes?
>

What about the charging path? Do we not want such warnings from
charging paths? It might be due to some misconfiguration.

2020-04-30 20:23:07

by Shakeel Butt

[permalink] [raw]
Subject: Re: [PATCH] memcg: oom: ignore oom warnings from memory.max

On Thu, Apr 30, 2020 at 12:29 PM Johannes Weiner <[email protected]> wrote:
>
> On Thu, Apr 30, 2020 at 11:27:12AM -0700, Shakeel Butt wrote:
> > Lowering memory.max can trigger an oom-kill if the reclaim does not
> > succeed. However if oom-killer does not find a process for killing, it
> > dumps a lot of warnings.
> >
> > Deleting a memcg does not reclaim memory from it and the memory can
> > linger till there is a memory pressure. One normal way to proactively
> > reclaim such memory is to set memory.max to 0 just before deleting the
> > memcg. However if some of the memcg's memory is pinned by others, this
> > operation can trigger an oom-kill without any process and thus can log a
> > lot un-needed warnings. So, ignore all such warnings from memory.max.
>
> Can't you set memory.high=0 instead? It does the reclaim portion of
> memory.max, without the actual OOM killing that causes you problems.

Yes that would work but remote charging concerns me. Remote charging
can still happen after the memcg is offlined and at the moment, high
reclaim does not work for remote memcg and the usage can go till max
or global pressure. This is most probably a misconfiguration and we
might not receive the warnings in the log ever. Setting memory.max to
0 will definitely give such warnings.

2020-04-30 20:29:21

by Roman Gushchin

[permalink] [raw]
Subject: Re: [PATCH] memcg: oom: ignore oom warnings from memory.max

On Thu, Apr 30, 2020 at 03:30:49PM -0400, Johannes Weiner wrote:
> On Thu, Apr 30, 2020 at 12:06:10PM -0700, Roman Gushchin wrote:
> > On Thu, Apr 30, 2020 at 11:27:12AM -0700, Shakeel Butt wrote:
> > > @@ -6106,7 +6107,7 @@ static ssize_t memory_max_write(struct kernfs_open_file *of,
> > > }
> > >
> > > memcg_memory_event(memcg, MEMCG_OOM);
> > > - if (!mem_cgroup_out_of_memory(memcg, GFP_KERNEL, 0))
> > > + if (!mem_cgroup_out_of_memory(memcg, GFP_KERNEL, 0, true))
> >
> > I wonder if we can handle it automatically from the oom_killer side?
> > We can suppress warnings if oc->memcg is set and the cgroup scanning
> > showed that there are no belonging processes?
>
> Note that we do remote charging for certain consumers, where memory
> gets charged from the outside of the cgroup.
>
> We would want to know if these cause OOMs on an empty cgroup, rather
> than force-charge the allocations quietly.
>

Yeah, good point.

2020-05-01 01:42:09

by Yafang Shao

[permalink] [raw]
Subject: Re: [PATCH] memcg: oom: ignore oom warnings from memory.max

On Fri, May 1, 2020 at 2:27 AM Shakeel Butt <[email protected]> wrote:
>
> Lowering memory.max can trigger an oom-kill if the reclaim does not
> succeed. However if oom-killer does not find a process for killing, it
> dumps a lot of warnings.
>

I have been confused by this behavior for several months and I think
it will confuse more memcg users.
We should keep the memcg oom behavior consistent with system oom - no
oom kill if no process.

What about bellow change ?

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index e28098e13f1c..25fbc37a747f 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -6086,6 +6086,9 @@ static ssize_t memory_max_write(struct
kernfs_open_file *of,
continue;
}

+ if (!cgroup_is_populated(memcg->css.cgroup))
+ break;
+
memcg_memory_event(memcg, MEMCG_OOM);
if (!mem_cgroup_out_of_memory(memcg, GFP_KERNEL, 0))
break;

> Deleting a memcg does not reclaim memory from it and the memory can
> linger till there is a memory pressure. One normal way to proactively
> reclaim such memory is to set memory.max to 0 just before deleting the
> memcg. However if some of the memcg's memory is pinned by others, this
> operation can trigger an oom-kill without any process and thus can log a
> lot un-needed warnings. So, ignore all such warnings from memory.max.
>
> Signed-off-by: Shakeel Butt <[email protected]>
> ---
> include/linux/oom.h | 3 +++
> mm/memcontrol.c | 9 +++++----
> mm/oom_kill.c | 2 +-
> 3 files changed, 9 insertions(+), 5 deletions(-)
>
> diff --git a/include/linux/oom.h b/include/linux/oom.h
> index c696c265f019..6345dc55df64 100644
> --- a/include/linux/oom.h
> +++ b/include/linux/oom.h
> @@ -52,6 +52,9 @@ struct oom_control {
>
> /* Used to print the constraint info. */
> enum oom_constraint constraint;
> +
> + /* Do not warn even if there is no process to be killed. */
> + bool no_warn;
> };
>
> extern struct mutex oom_lock;
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 317dbbaac603..a1f00d9b9bb0 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -1571,7 +1571,7 @@ unsigned long mem_cgroup_size(struct mem_cgroup *memcg)
> }
>
> static bool mem_cgroup_out_of_memory(struct mem_cgroup *memcg, gfp_t gfp_mask,
> - int order)
> + int order, bool no_warn)
> {
> struct oom_control oc = {
> .zonelist = NULL,
> @@ -1579,6 +1579,7 @@ static bool mem_cgroup_out_of_memory(struct mem_cgroup *memcg, gfp_t gfp_mask,
> .memcg = memcg,
> .gfp_mask = gfp_mask,
> .order = order,
> + .no_warn = no_warn,
> };
> bool ret;
>
> @@ -1821,7 +1822,7 @@ static enum oom_status mem_cgroup_oom(struct mem_cgroup *memcg, gfp_t mask, int
> mem_cgroup_oom_notify(memcg);
>
> mem_cgroup_unmark_under_oom(memcg);
> - if (mem_cgroup_out_of_memory(memcg, mask, order))
> + if (mem_cgroup_out_of_memory(memcg, mask, order, false))
> ret = OOM_SUCCESS;
> else
> ret = OOM_FAILED;
> @@ -1880,7 +1881,7 @@ bool mem_cgroup_oom_synchronize(bool handle)
> mem_cgroup_unmark_under_oom(memcg);
> finish_wait(&memcg_oom_waitq, &owait.wait);
> mem_cgroup_out_of_memory(memcg, current->memcg_oom_gfp_mask,
> - current->memcg_oom_order);
> + current->memcg_oom_order, false);
> } else {
> schedule();
> mem_cgroup_unmark_under_oom(memcg);
> @@ -6106,7 +6107,7 @@ static ssize_t memory_max_write(struct kernfs_open_file *of,
> }
>
> memcg_memory_event(memcg, MEMCG_OOM);
> - if (!mem_cgroup_out_of_memory(memcg, GFP_KERNEL, 0))
> + if (!mem_cgroup_out_of_memory(memcg, GFP_KERNEL, 0, true))
> break;
> }
>
> diff --git a/mm/oom_kill.c b/mm/oom_kill.c
> index 463b3d74a64a..5ace39f6fe1e 100644
> --- a/mm/oom_kill.c
> +++ b/mm/oom_kill.c
> @@ -1098,7 +1098,7 @@ bool out_of_memory(struct oom_control *oc)
>
> select_bad_process(oc);
> /* Found nothing?!?! */
> - if (!oc->chosen) {
> + if (!oc->chosen && !oc->no_warn) {
> dump_header(oc, NULL);
> pr_warn("Out of memory and no killable processes...\n");
> /*
> --
> 2.26.2.526.g744177e7f7-goog
>
>


--
Thanks
Yafang

2020-05-01 02:06:45

by Shakeel Butt

[permalink] [raw]
Subject: Re: [PATCH] memcg: oom: ignore oom warnings from memory.max

On Thu, Apr 30, 2020 at 6:39 PM Yafang Shao <[email protected]> wrote:
>
> On Fri, May 1, 2020 at 2:27 AM Shakeel Butt <[email protected]> wrote:
> >
> > Lowering memory.max can trigger an oom-kill if the reclaim does not
> > succeed. However if oom-killer does not find a process for killing, it
> > dumps a lot of warnings.
> >
>
> I have been confused by this behavior for several months and I think
> it will confuse more memcg users.
> We should keep the memcg oom behavior consistent with system oom - no
> oom kill if no process.
>
> What about bellow change ?
>

Seems fine to me. If others are ok with this, please do send a signed-off patch.

> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index e28098e13f1c..25fbc37a747f 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -6086,6 +6086,9 @@ static ssize_t memory_max_write(struct
> kernfs_open_file *of,
> continue;
> }
>
> + if (!cgroup_is_populated(memcg->css.cgroup))
> + break;
> +
> memcg_memory_event(memcg, MEMCG_OOM);
> if (!mem_cgroup_out_of_memory(memcg, GFP_KERNEL, 0))
> break;
>
> > Deleting a memcg does not reclaim memory from it and the memory can
> > linger till there is a memory pressure. One normal way to proactively
> > reclaim such memory is to set memory.max to 0 just before deleting the
> > memcg. However if some of the memcg's memory is pinned by others, this
> > operation can trigger an oom-kill without any process and thus can log a
> > lot un-needed warnings. So, ignore all such warnings from memory.max.
> >
> > Signed-off-by: Shakeel Butt <[email protected]>
> > ---
> > include/linux/oom.h | 3 +++
> > mm/memcontrol.c | 9 +++++----
> > mm/oom_kill.c | 2 +-
> > 3 files changed, 9 insertions(+), 5 deletions(-)
> >
> > diff --git a/include/linux/oom.h b/include/linux/oom.h
> > index c696c265f019..6345dc55df64 100644
> > --- a/include/linux/oom.h
> > +++ b/include/linux/oom.h
> > @@ -52,6 +52,9 @@ struct oom_control {
> >
> > /* Used to print the constraint info. */
> > enum oom_constraint constraint;
> > +
> > + /* Do not warn even if there is no process to be killed. */
> > + bool no_warn;
> > };
> >
> > extern struct mutex oom_lock;
> > diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> > index 317dbbaac603..a1f00d9b9bb0 100644
> > --- a/mm/memcontrol.c
> > +++ b/mm/memcontrol.c
> > @@ -1571,7 +1571,7 @@ unsigned long mem_cgroup_size(struct mem_cgroup *memcg)
> > }
> >
> > static bool mem_cgroup_out_of_memory(struct mem_cgroup *memcg, gfp_t gfp_mask,
> > - int order)
> > + int order, bool no_warn)
> > {
> > struct oom_control oc = {
> > .zonelist = NULL,
> > @@ -1579,6 +1579,7 @@ static bool mem_cgroup_out_of_memory(struct mem_cgroup *memcg, gfp_t gfp_mask,
> > .memcg = memcg,
> > .gfp_mask = gfp_mask,
> > .order = order,
> > + .no_warn = no_warn,
> > };
> > bool ret;
> >
> > @@ -1821,7 +1822,7 @@ static enum oom_status mem_cgroup_oom(struct mem_cgroup *memcg, gfp_t mask, int
> > mem_cgroup_oom_notify(memcg);
> >
> > mem_cgroup_unmark_under_oom(memcg);
> > - if (mem_cgroup_out_of_memory(memcg, mask, order))
> > + if (mem_cgroup_out_of_memory(memcg, mask, order, false))
> > ret = OOM_SUCCESS;
> > else
> > ret = OOM_FAILED;
> > @@ -1880,7 +1881,7 @@ bool mem_cgroup_oom_synchronize(bool handle)
> > mem_cgroup_unmark_under_oom(memcg);
> > finish_wait(&memcg_oom_waitq, &owait.wait);
> > mem_cgroup_out_of_memory(memcg, current->memcg_oom_gfp_mask,
> > - current->memcg_oom_order);
> > + current->memcg_oom_order, false);
> > } else {
> > schedule();
> > mem_cgroup_unmark_under_oom(memcg);
> > @@ -6106,7 +6107,7 @@ static ssize_t memory_max_write(struct kernfs_open_file *of,
> > }
> >
> > memcg_memory_event(memcg, MEMCG_OOM);
> > - if (!mem_cgroup_out_of_memory(memcg, GFP_KERNEL, 0))
> > + if (!mem_cgroup_out_of_memory(memcg, GFP_KERNEL, 0, true))
> > break;
> > }
> >
> > diff --git a/mm/oom_kill.c b/mm/oom_kill.c
> > index 463b3d74a64a..5ace39f6fe1e 100644
> > --- a/mm/oom_kill.c
> > +++ b/mm/oom_kill.c
> > @@ -1098,7 +1098,7 @@ bool out_of_memory(struct oom_control *oc)
> >
> > select_bad_process(oc);
> > /* Found nothing?!?! */
> > - if (!oc->chosen) {
> > + if (!oc->chosen && !oc->no_warn) {
> > dump_header(oc, NULL);
> > pr_warn("Out of memory and no killable processes...\n");
> > /*
> > --
> > 2.26.2.526.g744177e7f7-goog
> >
> >
>
>
> --
> Thanks
> Yafang

2020-05-01 02:15:32

by Yafang Shao

[permalink] [raw]
Subject: Re: [PATCH] memcg: oom: ignore oom warnings from memory.max

On Fri, May 1, 2020 at 10:04 AM Shakeel Butt <[email protected]> wrote:
>
> On Thu, Apr 30, 2020 at 6:39 PM Yafang Shao <[email protected]> wrote:
> >
> > On Fri, May 1, 2020 at 2:27 AM Shakeel Butt <[email protected]> wrote:
> > >
> > > Lowering memory.max can trigger an oom-kill if the reclaim does not
> > > succeed. However if oom-killer does not find a process for killing, it
> > > dumps a lot of warnings.
> > >
> >
> > I have been confused by this behavior for several months and I think
> > it will confuse more memcg users.
> > We should keep the memcg oom behavior consistent with system oom - no
> > oom kill if no process.
> >
> > What about bellow change ?
> >
>
> Seems fine to me. If others are ok with this, please do send a signed-off patch.
>

Sure, I will.

> > diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> > index e28098e13f1c..25fbc37a747f 100644
> > --- a/mm/memcontrol.c
> > +++ b/mm/memcontrol.c
> > @@ -6086,6 +6086,9 @@ static ssize_t memory_max_write(struct
> > kernfs_open_file *of,
> > continue;
> > }
> >
> > + if (!cgroup_is_populated(memcg->css.cgroup))
> > + break;
> > +
> > memcg_memory_event(memcg, MEMCG_OOM);
> > if (!mem_cgroup_out_of_memory(memcg, GFP_KERNEL, 0))
> > break;
> >
> > > Deleting a memcg does not reclaim memory from it and the memory can
> > > linger till there is a memory pressure. One normal way to proactively
> > > reclaim such memory is to set memory.max to 0 just before deleting the
> > > memcg. However if some of the memcg's memory is pinned by others, this
> > > operation can trigger an oom-kill without any process and thus can log a
> > > lot un-needed warnings. So, ignore all such warnings from memory.max.
> > >
> > > Signed-off-by: Shakeel Butt <[email protected]>
> > > ---
> > > include/linux/oom.h | 3 +++
> > > mm/memcontrol.c | 9 +++++----
> > > mm/oom_kill.c | 2 +-
> > > 3 files changed, 9 insertions(+), 5 deletions(-)
> > >
> > > diff --git a/include/linux/oom.h b/include/linux/oom.h
> > > index c696c265f019..6345dc55df64 100644
> > > --- a/include/linux/oom.h
> > > +++ b/include/linux/oom.h
> > > @@ -52,6 +52,9 @@ struct oom_control {
> > >
> > > /* Used to print the constraint info. */
> > > enum oom_constraint constraint;
> > > +
> > > + /* Do not warn even if there is no process to be killed. */
> > > + bool no_warn;
> > > };
> > >
> > > extern struct mutex oom_lock;
> > > diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> > > index 317dbbaac603..a1f00d9b9bb0 100644
> > > --- a/mm/memcontrol.c
> > > +++ b/mm/memcontrol.c
> > > @@ -1571,7 +1571,7 @@ unsigned long mem_cgroup_size(struct mem_cgroup *memcg)
> > > }
> > >
> > > static bool mem_cgroup_out_of_memory(struct mem_cgroup *memcg, gfp_t gfp_mask,
> > > - int order)
> > > + int order, bool no_warn)
> > > {
> > > struct oom_control oc = {
> > > .zonelist = NULL,
> > > @@ -1579,6 +1579,7 @@ static bool mem_cgroup_out_of_memory(struct mem_cgroup *memcg, gfp_t gfp_mask,
> > > .memcg = memcg,
> > > .gfp_mask = gfp_mask,
> > > .order = order,
> > > + .no_warn = no_warn,
> > > };
> > > bool ret;
> > >
> > > @@ -1821,7 +1822,7 @@ static enum oom_status mem_cgroup_oom(struct mem_cgroup *memcg, gfp_t mask, int
> > > mem_cgroup_oom_notify(memcg);
> > >
> > > mem_cgroup_unmark_under_oom(memcg);
> > > - if (mem_cgroup_out_of_memory(memcg, mask, order))
> > > + if (mem_cgroup_out_of_memory(memcg, mask, order, false))
> > > ret = OOM_SUCCESS;
> > > else
> > > ret = OOM_FAILED;
> > > @@ -1880,7 +1881,7 @@ bool mem_cgroup_oom_synchronize(bool handle)
> > > mem_cgroup_unmark_under_oom(memcg);
> > > finish_wait(&memcg_oom_waitq, &owait.wait);
> > > mem_cgroup_out_of_memory(memcg, current->memcg_oom_gfp_mask,
> > > - current->memcg_oom_order);
> > > + current->memcg_oom_order, false);
> > > } else {
> > > schedule();
> > > mem_cgroup_unmark_under_oom(memcg);
> > > @@ -6106,7 +6107,7 @@ static ssize_t memory_max_write(struct kernfs_open_file *of,
> > > }
> > >
> > > memcg_memory_event(memcg, MEMCG_OOM);
> > > - if (!mem_cgroup_out_of_memory(memcg, GFP_KERNEL, 0))
> > > + if (!mem_cgroup_out_of_memory(memcg, GFP_KERNEL, 0, true))
> > > break;
> > > }
> > >
> > > diff --git a/mm/oom_kill.c b/mm/oom_kill.c
> > > index 463b3d74a64a..5ace39f6fe1e 100644
> > > --- a/mm/oom_kill.c
> > > +++ b/mm/oom_kill.c
> > > @@ -1098,7 +1098,7 @@ bool out_of_memory(struct oom_control *oc)
> > >
> > > select_bad_process(oc);
> > > /* Found nothing?!?! */
> > > - if (!oc->chosen) {
> > > + if (!oc->chosen && !oc->no_warn) {
> > > dump_header(oc, NULL);
> > > pr_warn("Out of memory and no killable processes...\n");
> > > /*
> > > --
> > > 2.26.2.526.g744177e7f7-goog
> > >
> > >
> >
> >
> > --
> > Thanks
> > Yafang



--
Thanks
Yafang

2020-05-04 07:04:46

by Michal Hocko

[permalink] [raw]
Subject: Re: [PATCH] memcg: oom: ignore oom warnings from memory.max

On Fri 01-05-20 09:39:24, Yafang Shao wrote:
> On Fri, May 1, 2020 at 2:27 AM Shakeel Butt <[email protected]> wrote:
> >
> > Lowering memory.max can trigger an oom-kill if the reclaim does not
> > succeed. However if oom-killer does not find a process for killing, it
> > dumps a lot of warnings.
> >
>
> I have been confused by this behavior for several months and I think
> it will confuse more memcg users.

Could you be more specific what has caused the confusion?

> We should keep the memcg oom behavior consistent with system oom - no
> oom kill if no process.

This is not the global mmemcg behavior. We do complain loud on no
eligible tasks and actually panic the system. Memcg cannot simply
do the same by default for obvious reasons.

> What about bellow change ?
>
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index e28098e13f1c..25fbc37a747f 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -6086,6 +6086,9 @@ static ssize_t memory_max_write(struct
> kernfs_open_file *of,
> continue;
> }
>
> + if (!cgroup_is_populated(memcg->css.cgroup))
> + break;
> +
> memcg_memory_event(memcg, MEMCG_OOM);
> if (!mem_cgroup_out_of_memory(memcg, GFP_KERNEL, 0))
> break;

I am not a great fan to be honest. The warning might be useful for other
usecases when it is not clear that the memcg is empty.

--
Michal Hocko
SUSE Labs

2020-05-04 07:39:29

by Michal Hocko

[permalink] [raw]
Subject: Re: [PATCH] memcg: oom: ignore oom warnings from memory.max

On Mon 04-05-20 15:26:52, Yafang Shao wrote:
> On Mon, May 4, 2020 at 3:03 PM Michal Hocko <[email protected]> wrote:
> >
> > On Fri 01-05-20 09:39:24, Yafang Shao wrote:
> > > On Fri, May 1, 2020 at 2:27 AM Shakeel Butt <[email protected]> wrote:
> > > >
> > > > Lowering memory.max can trigger an oom-kill if the reclaim does not
> > > > succeed. However if oom-killer does not find a process for killing, it
> > > > dumps a lot of warnings.
> > > >
> > >
> > > I have been confused by this behavior for several months and I think
> > > it will confuse more memcg users.
> >
> > Could you be more specific what has caused the confusion?
> >
>
> No task is different from no eligible task.
> No eligible task means there are some candidates but no one is eligible.
> Whille no task means there is no candidate.

I really fail to see a difference. It is clear the one is subset of the
other but in practical life tasks might come and go at any time and if
you try to reduce the hard limit and there are no tasks that could be
reclaimed then I believe we should complain whether it is only oom
disabled tasks or no tasks at all. It is certainly unexpected situation
in some cases because there are resources which are bound to the memcg
without any task we can act on.

> > > We should keep the memcg oom behavior consistent with system oom - no
> > > oom kill if no process.
> >
> > This is not the global mmemcg behavior. We do complain loud on no
> > eligible tasks and actually panic the system. Memcg cannot simply
> > do the same by default for obvious reasons.
> >
>
> As explianed above, no eligible task is different from no task.
> If there are some candidates but no one is eligible, the system will panic.
> While if there's no task, it is definitely no OOM, because that's an
> improssible thing for the system.

This is very much possible situation when all eligible tasks have been
already killed but they didn't really help to resolve the oom situation
- e.g. in kernel memory leak or unbounded shmem consumption etc...

--
Michal Hocko
SUSE Labs

2020-05-04 07:44:01

by Yafang Shao

[permalink] [raw]
Subject: Re: [PATCH] memcg: oom: ignore oom warnings from memory.max

On Mon, May 4, 2020 at 3:35 PM Michal Hocko <[email protected]> wrote:
>
> On Mon 04-05-20 15:26:52, Yafang Shao wrote:
> > On Mon, May 4, 2020 at 3:03 PM Michal Hocko <[email protected]> wrote:
> > >
> > > On Fri 01-05-20 09:39:24, Yafang Shao wrote:
> > > > On Fri, May 1, 2020 at 2:27 AM Shakeel Butt <[email protected]> wrote:
> > > > >
> > > > > Lowering memory.max can trigger an oom-kill if the reclaim does not
> > > > > succeed. However if oom-killer does not find a process for killing, it
> > > > > dumps a lot of warnings.
> > > > >
> > > >
> > > > I have been confused by this behavior for several months and I think
> > > > it will confuse more memcg users.
> > >
> > > Could you be more specific what has caused the confusion?
> > >
> >
> > No task is different from no eligible task.
> > No eligible task means there are some candidates but no one is eligible.
> > Whille no task means there is no candidate.
>
> I really fail to see a difference. It is clear the one is subset of the
> other but in practical life tasks might come and go at any time and if
> you try to reduce the hard limit and there are no tasks that could be
> reclaimed then I believe we should complain whether it is only oom
> disabled tasks or no tasks at all. It is certainly unexpected situation
> in some cases because there are resources which are bound to the memcg
> without any task we can act on.
>
> > > > We should keep the memcg oom behavior consistent with system oom - no
> > > > oom kill if no process.
> > >
> > > This is not the global mmemcg behavior. We do complain loud on no
> > > eligible tasks and actually panic the system. Memcg cannot simply
> > > do the same by default for obvious reasons.
> > >
> >
> > As explianed above, no eligible task is different from no task.
> > If there are some candidates but no one is eligible, the system will panic.
> > While if there's no task, it is definitely no OOM, because that's an
> > improssible thing for the system.
>
> This is very much possible situation when all eligible tasks have been
> already killed but they didn't really help to resolve the oom situation
> - e.g. in kernel memory leak or unbounded shmem consumption etc...
>

That's still an impossible thing, because many tasks are invisible to
the oom killer.
See oom_unkillable_task().



--
Thanks
Yafang

2020-05-04 08:05:21

by Michal Hocko

[permalink] [raw]
Subject: Re: [PATCH] memcg: oom: ignore oom warnings from memory.max

On Mon 04-05-20 15:40:18, Yafang Shao wrote:
> On Mon, May 4, 2020 at 3:35 PM Michal Hocko <[email protected]> wrote:
> >
> > On Mon 04-05-20 15:26:52, Yafang Shao wrote:
[...]
> > > As explianed above, no eligible task is different from no task.
> > > If there are some candidates but no one is eligible, the system will panic.
> > > While if there's no task, it is definitely no OOM, because that's an
> > > improssible thing for the system.
> >
> > This is very much possible situation when all eligible tasks have been
> > already killed but they didn't really help to resolve the oom situation
> > - e.g. in kernel memory leak or unbounded shmem consumption etc...
> >
>
> That's still an impossible thing, because many tasks are invisible to
> the oom killer.
> See oom_unkillable_task().

I do not follow, really. oom_unkillable_task only says that global init
cannot be killed and that it doesn't make any sense to kill kernel
threads as they do not own any mm normally.

--
Michal Hocko
SUSE Labs

2020-05-04 08:42:00

by Michal Hocko

[permalink] [raw]
Subject: Re: [PATCH] memcg: oom: ignore oom warnings from memory.max

On Thu 30-04-20 13:20:10, Shakeel Butt wrote:
> On Thu, Apr 30, 2020 at 12:29 PM Johannes Weiner <[email protected]> wrote:
> >
> > On Thu, Apr 30, 2020 at 11:27:12AM -0700, Shakeel Butt wrote:
> > > Lowering memory.max can trigger an oom-kill if the reclaim does not
> > > succeed. However if oom-killer does not find a process for killing, it
> > > dumps a lot of warnings.
> > >
> > > Deleting a memcg does not reclaim memory from it and the memory can
> > > linger till there is a memory pressure. One normal way to proactively
> > > reclaim such memory is to set memory.max to 0 just before deleting the
> > > memcg. However if some of the memcg's memory is pinned by others, this
> > > operation can trigger an oom-kill without any process and thus can log a
> > > lot un-needed warnings. So, ignore all such warnings from memory.max.
> >
> > Can't you set memory.high=0 instead? It does the reclaim portion of
> > memory.max, without the actual OOM killing that causes you problems.
>
> Yes that would work but remote charging concerns me. Remote charging
> can still happen after the memcg is offlined and at the moment, high
> reclaim does not work for remote memcg and the usage can go till max
> or global pressure. This is most probably a misconfiguration and we
> might not receive the warnings in the log ever. Setting memory.max to
> 0 will definitely give such warnings.

Can we add a warning for the remote charging on dead memcgs?

--
Michal Hocko
SUSE Labs