2018-08-21 16:44:48

by Johannes Weiner

[permalink] [raw]
Subject: [PATCH] mm: memcontrol: print proper OOM header when no eligible victim left

When the memcg OOM killer runs out of killable tasks, it currently
prints a WARN with no further OOM context. This has caused some user
confusion.

Warnings indicate a kernel problem. In a reported case, however, the
situation was triggered by a non-sensical memcg configuration (hard
limit set to 0). But without any VM context this wasn't obvious from
the report, and it took some back and forth on the mailing list to
identify what is actually a trivial issue.

Handle this OOM condition like we handle it in the global OOM killer:
dump the full OOM context and tell the user we ran out of tasks.

This way the user can identify misconfigurations easily by themselves
and rectify the problem - without having to go through the hassle of
running into an obscure but unsettling warning, finding the
appropriate kernel mailing list and waiting for a kernel developer to
remote-analyze that the memcg configuration caused this.

If users cannot make sense of why the OOM killer was triggered or why
it failed, they will still report it to the mailing list, we know that
from experience. So in case there is an actual kernel bug causing
this, kernel developers will very likely hear about it.

Signed-off-by: Johannes Weiner <[email protected]>
Acked-by: Michal Hocko <[email protected]>
---
mm/memcontrol.c | 2 --
mm/oom_kill.c | 13 ++++++++++---
2 files changed, 10 insertions(+), 5 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 4e3c1315b1de..29d9d1a69b36 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -1701,8 +1701,6 @@ static enum oom_status mem_cgroup_oom(struct mem_cgroup *memcg, gfp_t mask, int
if (mem_cgroup_out_of_memory(memcg, mask, order))
return OOM_SUCCESS;

- WARN(1,"Memory cgroup charge failed because of no reclaimable memory! "
- "This looks like a misconfiguration or a kernel bug.");
return OOM_FAILED;
}

diff --git a/mm/oom_kill.c b/mm/oom_kill.c
index b5b25e4dcbbb..95fbbc46f68f 100644
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -1103,10 +1103,17 @@ bool out_of_memory(struct oom_control *oc)
}

select_bad_process(oc);
- /* Found nothing?!?! Either we hang forever, or we panic. */
- if (!oc->chosen && !is_sysrq_oom(oc) && !is_memcg_oom(oc)) {
+ /* Found nothing?!?! */
+ if (!oc->chosen) {
dump_header(oc, NULL);
- panic("Out of memory and no killable processes...\n");
+ pr_warn("Out of memory and no killable processes...\n");
+ /*
+ * If we got here due to an actual allocation at the
+ * system level, we cannot survive this and will enter
+ * an endless loop in the allocator. Bail out now.
+ */
+ if (!is_sysrq_oom(oc) && !is_memcg_oom(oc))
+ panic("System is deadlocked on memory\n");
}
if (oc->chosen && oc->chosen != (void *)-1UL)
oom_kill_process(oc, !is_memcg_oom(oc) ? "Out of memory" :
--
2.18.0



2018-09-08 13:37:49

by Tetsuo Handa

[permalink] [raw]
Subject: Re: [PATCH] mm: memcontrol: print proper OOM header when no eligible victim left

On 2018/08/22 1:04, Johannes Weiner wrote:
> When the memcg OOM killer runs out of killable tasks, it currently
> prints a WARN with no further OOM context. This has caused some user
> confusion.
>
> Warnings indicate a kernel problem. In a reported case, however, the
> situation was triggered by a non-sensical memcg configuration (hard
> limit set to 0). But without any VM context this wasn't obvious from
> the report, and it took some back and forth on the mailing list to
> identify what is actually a trivial issue.
>
> Handle this OOM condition like we handle it in the global OOM killer:
> dump the full OOM context and tell the user we ran out of tasks.
>
> This way the user can identify misconfigurations easily by themselves
> and rectify the problem - without having to go through the hassle of
> running into an obscure but unsettling warning, finding the
> appropriate kernel mailing list and waiting for a kernel developer to
> remote-analyze that the memcg configuration caused this.
>
> If users cannot make sense of why the OOM killer was triggered or why
> it failed, they will still report it to the mailing list, we know that
> from experience. So in case there is an actual kernel bug causing
> this, kernel developers will very likely hear about it.
>
> Signed-off-by: Johannes Weiner <[email protected]>
> Acked-by: Michal Hocko <[email protected]>
> ---
> mm/memcontrol.c | 2 --
> mm/oom_kill.c | 13 ++++++++++---
> 2 files changed, 10 insertions(+), 5 deletions(-)
>

Now that above patch went to 4.19-rc3, please apply below one.

From eb2bff2ed308da04785bcf541dd3f748286bfa23 Mon Sep 17 00:00:00 2001
From: Tetsuo Handa <[email protected]>
Date: Sat, 8 Sep 2018 22:26:28 +0900
Subject: [PATCH] mm, oom: Don't emit noises for failed SysRq-f.

Due to commit d75da004c708c9fc ("oom: improve oom disable handling") and
commit 3100dab2aa09dc6e ("mm: memcontrol: print proper OOM header when
no eligible victim left"), all

kworker/0:1 invoked oom-killer: gfp_mask=0x6000c0(GFP_KERNEL), nodemask=(null), order=-1, oom_score_adj=0
(...snipped...)
Out of memory and no killable processes...
OOM request ignored. No task eligible

lines are printed.
Let's not emit "invoked oom-killer" lines when SysRq-f failed.

Signed-off-by: Tetsuo Handa <[email protected]>
---
mm/oom_kill.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/mm/oom_kill.c b/mm/oom_kill.c
index f10aa53..92122ef 100644
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -1106,8 +1106,10 @@ bool out_of_memory(struct oom_control *oc)
select_bad_process(oc);
/* Found nothing?!?! */
if (!oc->chosen) {
- dump_header(oc, NULL);
- pr_warn("Out of memory and no killable processes...\n");
+ if (!is_sysrq_oom(oc)) {
+ dump_header(oc, NULL);
+ pr_warn("Out of memory and no killable processes...\n");
+ }
/*
* If we got here due to an actual allocation at the
* system level, we cannot survive this and will enter
--
1.8.3.1



2018-09-08 13:59:31

by Johannes Weiner

[permalink] [raw]
Subject: Re: [PATCH] mm: memcontrol: print proper OOM header when no eligible victim left

On Sat, Sep 08, 2018 at 10:36:06PM +0900, Tetsuo Handa wrote:
> On 2018/08/22 1:04, Johannes Weiner wrote:
> > When the memcg OOM killer runs out of killable tasks, it currently
> > prints a WARN with no further OOM context. This has caused some user
> > confusion.
> >
> > Warnings indicate a kernel problem. In a reported case, however, the
> > situation was triggered by a non-sensical memcg configuration (hard
> > limit set to 0). But without any VM context this wasn't obvious from
> > the report, and it took some back and forth on the mailing list to
> > identify what is actually a trivial issue.
> >
> > Handle this OOM condition like we handle it in the global OOM killer:
> > dump the full OOM context and tell the user we ran out of tasks.
> >
> > This way the user can identify misconfigurations easily by themselves
> > and rectify the problem - without having to go through the hassle of
> > running into an obscure but unsettling warning, finding the
> > appropriate kernel mailing list and waiting for a kernel developer to
> > remote-analyze that the memcg configuration caused this.
> >
> > If users cannot make sense of why the OOM killer was triggered or why
> > it failed, they will still report it to the mailing list, we know that
> > from experience. So in case there is an actual kernel bug causing
> > this, kernel developers will very likely hear about it.
> >
> > Signed-off-by: Johannes Weiner <[email protected]>
> > Acked-by: Michal Hocko <[email protected]>
> > ---
> > mm/memcontrol.c | 2 --
> > mm/oom_kill.c | 13 ++++++++++---
> > 2 files changed, 10 insertions(+), 5 deletions(-)
> >
>
> Now that above patch went to 4.19-rc3, please apply below one.
>
> From eb2bff2ed308da04785bcf541dd3f748286bfa23 Mon Sep 17 00:00:00 2001
> From: Tetsuo Handa <[email protected]>
> Date: Sat, 8 Sep 2018 22:26:28 +0900
> Subject: [PATCH] mm, oom: Don't emit noises for failed SysRq-f.
>
> Due to commit d75da004c708c9fc ("oom: improve oom disable handling") and
> commit 3100dab2aa09dc6e ("mm: memcontrol: print proper OOM header when
> no eligible victim left"), all
>
> kworker/0:1 invoked oom-killer: gfp_mask=0x6000c0(GFP_KERNEL), nodemask=(null), order=-1, oom_score_adj=0
> (...snipped...)
> Out of memory and no killable processes...
> OOM request ignored. No task eligible
>
> lines are printed.

This doesn't explain the context, what you were trying to do here, and
what you expected to happen. Plus, you (...snipped...) the important
part to understand why it failed in the first place.

> Let's not emit "invoked oom-killer" lines when SysRq-f failed.

I disagree. If the user asked for an OOM kill, it makes perfect sense
to dump the memory context and the outcome of the operation - even if
the outcome is "I didn't find anything to kill". I'd argue that the
failure case *in particular* is where I want to know about and have
all the information that could help me understand why it failed.

So NAK on the inferred patch premise, but please include way more
rationale, reproduction scenario etc. in future patches. It's not at
all clear *why* you think it should work the way you propose here.

2018-09-08 14:17:46

by Tetsuo Handa

[permalink] [raw]
Subject: Re: [PATCH] mm: memcontrol: print proper OOM header when no eligible victim left

On 2018/09/08 22:57, Johannes Weiner wrote:
> On Sat, Sep 08, 2018 at 10:36:06PM +0900, Tetsuo Handa wrote:
>> Due to commit d75da004c708c9fc ("oom: improve oom disable handling") and
>> commit 3100dab2aa09dc6e ("mm: memcontrol: print proper OOM header when
>> no eligible victim left"), all
>>
>> kworker/0:1 invoked oom-killer: gfp_mask=0x6000c0(GFP_KERNEL), nodemask=(null), order=-1, oom_score_adj=0
>> (...snipped...)
>> Out of memory and no killable processes...
>> OOM request ignored. No task eligible
>>
>> lines are printed.
>
> This doesn't explain the context, what you were trying to do here, and
> what you expected to happen. Plus, you (...snipped...) the important
> part to understand why it failed in the first place.

I expect:

When SysRq-f did not find killable processes, it does not emit
message other than "OOM request ignored. No task eligible".

There is no point with emitting memory information etc.

>
>> Let's not emit "invoked oom-killer" lines when SysRq-f failed.
>
> I disagree. If the user asked for an OOM kill, it makes perfect sense
> to dump the memory context and the outcome of the operation - even if
> the outcome is "I didn't find anything to kill". I'd argue that the
> failure case *in particular* is where I want to know about and have
> all the information that could help me understand why it failed.

How emitting memory information etc. helps you understand why it failed?
"No task eligible" is sufficient for you to understand why, isn't it?

>
> So NAK on the inferred patch premise, but please include way more
> rationale, reproduction scenario etc. in future patches. It's not at
> all clear *why* you think it should work the way you propose here.
>