From: yuzhoujian <[email protected]>
This patch will make some preparation for the follow-up patch: Refactor
part of the oom report in dump_header. It puts enum oom_constraint in
memcontrol.h and adds an array of const char for each constraint.
Signed-off-by: yuzhoujian <[email protected]>
---
include/linux/memcontrol.h | 14 ++++++++++++++
mm/oom_kill.c | 7 -------
2 files changed, 14 insertions(+), 7 deletions(-)
diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index d99b71bc2c66..57311b6c4d67 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -62,6 +62,20 @@ struct mem_cgroup_reclaim_cookie {
unsigned int generation;
};
+enum oom_constraint {
+ CONSTRAINT_NONE,
+ CONSTRAINT_CPUSET,
+ CONSTRAINT_MEMORY_POLICY,
+ CONSTRAINT_MEMCG,
+};
+
+static const char * const oom_constraint_text[] = {
+ [CONSTRAINT_NONE] = "CONSTRAINT_NONE",
+ [CONSTRAINT_CPUSET] = "CONSTRAINT_CPUSET",
+ [CONSTRAINT_MEMORY_POLICY] = "CONSTRAINT_MEMORY_POLICY",
+ [CONSTRAINT_MEMCG] = "CONSTRAINT_MEMCG",
+};
+
#ifdef CONFIG_MEMCG
#define MEM_CGROUP_ID_SHIFT 16
diff --git a/mm/oom_kill.c b/mm/oom_kill.c
index 8ba6cb88cf58..c806cd656af6 100644
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -237,13 +237,6 @@ unsigned long oom_badness(struct task_struct *p, struct mem_cgroup *memcg,
return points > 0 ? points : 1;
}
-enum oom_constraint {
- CONSTRAINT_NONE,
- CONSTRAINT_CPUSET,
- CONSTRAINT_MEMORY_POLICY,
- CONSTRAINT_MEMCG,
-};
-
/*
* Determine the type of allocation constraint.
*/
--
2.14.1
From: yuzhoujian <[email protected]>
The dump_header does not print the memcg's name when the system
oom happened, so users cannot locate the certain container which
contains the task that has been killed by the oom killer.
I follow the advices of David Rientjes and Michal Hocko, and refactor
part of the oom report in a backwards compatible way. After this patch,
users can get the memcg's path from the oom report and check the certain
container more quickly.
Below is the part of the oom report in the dmesg
...
[ 142.158316] panic cpuset=/ mems_allowed=0-1
[ 142.158983] CPU: 15 PID: 8682 Comm: panic Not tainted 4.17.0-rc6+ #13
[ 142.159659] Hardware name: Inspur SA5212M4/YZMB-00370-107, BIOS 4.1.10 11/14/2016
[ 142.160342] Call Trace:
[ 142.161037] dump_stack+0x78/0xb3
[ 142.161734] dump_header+0x7d/0x334
[ 142.162433] oom_kill_process+0x228/0x490
[ 142.163126] ? oom_badness+0x2a/0x130
[ 142.163821] out_of_memory+0xf0/0x280
[ 142.164532] __alloc_pages_slowpath+0x711/0xa07
[ 142.165241] __alloc_pages_nodemask+0x23f/0x260
[ 142.165947] alloc_pages_vma+0x73/0x180
[ 142.166665] do_anonymous_page+0xed/0x4e0
[ 142.167388] __handle_mm_fault+0xbd2/0xe00
[ 142.168114] handle_mm_fault+0x116/0x250
[ 142.168841] __do_page_fault+0x233/0x4d0
[ 142.169567] do_page_fault+0x32/0x130
[ 142.170303] ? page_fault+0x8/0x30
[ 142.171036] page_fault+0x1e/0x30
[ 142.171764] RIP: 0033:0x7f403000a860
[ 142.172517] RSP: 002b:00007ffc9f745c28 EFLAGS: 00010206
[ 142.173268] RAX: 00007f3f6fd7d000 RBX: 0000000000000000 RCX: 00007f3f7f5cd000
[ 142.174040] RDX: 00007f3fafd7d000 RSI: 0000000000000000 RDI: 00007f3f6fd7d000
[ 142.174806] RBP: 00007ffc9f745c50 R08: ffffffffffffffff R09: 0000000000000000
[ 142.175623] R10: 0000000000000022 R11: 0000000000000246 R12: 0000000000400490
[ 142.176542] R13: 00007ffc9f745d30 R14: 0000000000000000 R15: 0000000000000000
[ 142.177709] oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),origin_memcg=(null),kill_memcg=/test/test1/test2,task=panic,pid= 8622,uid= 0
...
Changes since v6:
- divide the patch v5 into two parts. One part is to add an array of const char and
put enum oom_constraint into the memcontrol.h; the other is will refactor the output
in the dump_header.
- limit the memory usage for the static char array by using NAME_MAX in the mem_cgroup_print_oom_context.
- eliminate the spurious spaces in the oom's output and fix the spelling of "constrain".
Changes since v5:
- add an array of const char for each constraint.
- replace all of the pr_cont with a single line print of the pr_info.
- put enum oom_constraint into the memcontrol.c file for printing oom constraint.
Changes since v4:
- rename the helper's name to mem_cgroup_print_oom_context.
- rename the mem_cgroup_print_oom_info to mem_cgroup_print_oom_meminfo.
- add the constrain info in the dump_header.
Changes since v3:
- rename the helper's name to mem_cgroup_print_oom_memcg_name.
- add the rcu lock held to the helper.
- remove the print info of memcg's name in mem_cgroup_print_oom_info.
Changes since v2:
- add the mem_cgroup_print_memcg_name helper to print the memcg's
name which contains the task that will be killed by the oom-killer.
Changes since v1:
- replace adding mem_cgroup_print_oom_info with printing the memcg's
name only.
Signed-off-by: yuzhoujian <[email protected]>
---
include/linux/memcontrol.h | 15 ++++++++++---
mm/memcontrol.c | 55 ++++++++++++++++++++++++++++++++--------------
mm/oom_kill.c | 5 +++--
3 files changed, 53 insertions(+), 22 deletions(-)
diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index 57311b6c4d67..1c7d5da1c827 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -478,8 +478,11 @@ void mem_cgroup_handle_over_high(void);
unsigned long mem_cgroup_get_limit(struct mem_cgroup *memcg);
-void mem_cgroup_print_oom_info(struct mem_cgroup *memcg,
- struct task_struct *p);
+void mem_cgroup_print_oom_context(struct mem_cgroup *memcg,
+ struct task_struct *p, enum oom_constraint constraint,
+ nodemask_t *nodemask);
+
+void mem_cgroup_print_oom_meminfo(struct mem_cgroup *memcg);
static inline void mem_cgroup_oom_enable(void)
{
@@ -873,7 +876,13 @@ static inline unsigned long mem_cgroup_get_limit(struct mem_cgroup *memcg)
}
static inline void
-mem_cgroup_print_oom_info(struct mem_cgroup *memcg, struct task_struct *p)
+mem_cgroup_print_oom_context(struct mem_cgroup *memcg, struct task_struct *p,
+ enum oom_constraint constraint, nodemask_t *nodemask)
+{
+}
+
+static inline void
+mem_cgroup_print_oom_meminfo(struct mem_cgroup *memcg)
{
}
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 2bd3df3d101a..fd1172938c8e 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -1118,33 +1118,54 @@ static const char *const memcg1_stat_names[] = {
};
#define K(x) ((x) << (PAGE_SHIFT-10))
-/**
- * mem_cgroup_print_oom_info: Print OOM information relevant to memory controller.
- * @memcg: The memory cgroup that went over limit
+/*
+ * mem_cgroup_print_oom_context: Print OOM context information relevant to
+ * memory controller, which includes allocation constraint, nodemask, origin
+ * memcg that has reached its limit, kill memcg that contains the killed
+ * process, killed process's command, pid and uid.
+ * @memcg: The origin memory cgroup that went over limit
* @p: Task that is going to be killed
+ * @constraint: The allocation constraint
+ * @nodemask: The allocation nodemask
*
* NOTE: @memcg and @p's mem_cgroup can be different when hierarchy is
* enabled
*/
-void mem_cgroup_print_oom_info(struct mem_cgroup *memcg, struct task_struct *p)
+void mem_cgroup_print_oom_context(struct mem_cgroup *memcg, struct task_struct *p,
+ enum oom_constraint constraint, nodemask_t *nodemask)
{
- struct mem_cgroup *iter;
- unsigned int i;
+ static char origin_memcg_name[NAME_MAX], kill_memcg_name[NAME_MAX];
+ struct cgroup *origin_cgrp, *kill_cgrp;
rcu_read_lock();
-
- if (p) {
- pr_info("Task in ");
- pr_cont_cgroup_path(task_cgroup(p, memory_cgrp_id));
- pr_cont(" killed as a result of limit of ");
- } else {
- pr_info("Memory limit reached of cgroup ");
+ if (memcg) {
+ origin_cgrp = memcg->css.cgroup;
+ cgroup_path(origin_cgrp, origin_memcg_name, NAME_MAX);
}
-
- pr_cont_cgroup_path(memcg->css.cgroup);
- pr_cont("\n");
-
+ kill_cgrp = task_cgroup(p, memory_cgrp_id);
+ cgroup_path(kill_cgrp, kill_memcg_name, NAME_MAX);
+
+ if (p)
+ pr_info("oom-kill:constraint=%s,nodemask=%*pbl,origin_memcg=%s,kill_memcg=%s,task=%s,pid=%5d,uid=%5d\n",
+ oom_constraint_text[constraint], nodemask_pr_args(nodemask),
+ strlen(origin_memcg_name) ? origin_memcg_name : "(null)",
+ kill_memcg_name, p->comm, p->pid,
+ from_kuid(&init_user_ns, task_uid(p)));
+ else
+ pr_info("oom-kill:constraint=%s,nodemask=%*pbl,origin_memcg=%s,kill_memcg=%s\n",
+ oom_constraint_text[constraint], nodemask_pr_args(nodemask),
+ strlen(origin_memcg_name) ? origin_memcg_name : "(null)", kill_memcg_name);
rcu_read_unlock();
+}
+
+/**
+ * mem_cgroup_print_oom_info: Print OOM memory information relevant to memory controller.
+ * @memcg: The memory cgroup that went over limit
+ */
+void mem_cgroup_print_oom_meminfo(struct mem_cgroup *memcg)
+{
+ struct mem_cgroup *iter;
+ unsigned int i;
pr_info("memory: usage %llukB, limit %llukB, failcnt %lu\n",
K((u64)page_counter_read(&memcg->memory)),
diff --git a/mm/oom_kill.c b/mm/oom_kill.c
index c806cd656af6..af0efab8a9e5 100644
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -16,7 +16,6 @@
* for newbie kernel hackers. It features several pointers to major
* kernel subsystems and hints as to where to find out what things do.
*/
-
#include <linux/oom.h>
#include <linux/mm.h>
#include <linux/err.h>
@@ -414,6 +413,7 @@ static void dump_tasks(struct mem_cgroup *memcg, const nodemask_t *nodemask)
static void dump_header(struct oom_control *oc, struct task_struct *p)
{
+ enum oom_constraint constraint = constrained_alloc(oc);
pr_warn("%s invoked oom-killer: gfp_mask=%#x(%pGg), nodemask=%*pbl, order=%d, oom_score_adj=%hd\n",
current->comm, oc->gfp_mask, &oc->gfp_mask,
nodemask_pr_args(oc->nodemask), oc->order,
@@ -423,8 +423,9 @@ static void dump_header(struct oom_control *oc, struct task_struct *p)
cpuset_print_current_mems_allowed();
dump_stack();
+ mem_cgroup_print_oom_context(oc->memcg, p, constraint, oc->nodemask);
if (is_memcg_oom(oc))
- mem_cgroup_print_oom_info(oc->memcg, p);
+ mem_cgroup_print_oom_meminfo(oc->memcg);
else {
show_mem(SHOW_MEM_FILTER_NODES, oc->nodemask);
if (is_dump_unreclaim_slabs())
--
2.14.1
From: yuzhoujian <[email protected]>
The dump_header does not print the memcg's name when the system
oom happened, so users cannot locate the certain container which
contains the task that has been killed by the oom killer.
I follow the advices of David Rientjes and Michal Hocko, and refactor
part of the oom report in a backwards compatible way. After this patch,
users can get the memcg's path from the oom report and check the certain
container more quickly.
Below is the part of the oom report in the dmesg
...
[ 142.158316] panic cpuset=/ mems_allowed=0-1
[ 142.158983] CPU: 15 PID: 8682 Comm: panic Not tainted 4.17.0-rc6+ #13
[ 142.159659] Hardware name: Inspur SA5212M4/YZMB-00370-107, BIOS 4.1.10 11/14/2016
[ 142.160342] Call Trace:
[ 142.161037] dump_stack+0x78/0xb3
[ 142.161734] dump_header+0x7d/0x334
[ 142.162433] oom_kill_process+0x228/0x490
[ 142.163126] ? oom_badness+0x2a/0x130
[ 142.163821] out_of_memory+0xf0/0x280
[ 142.164532] __alloc_pages_slowpath+0x711/0xa07
[ 142.165241] __alloc_pages_nodemask+0x23f/0x260
[ 142.165947] alloc_pages_vma+0x73/0x180
[ 142.166665] do_anonymous_page+0xed/0x4e0
[ 142.167388] __handle_mm_fault+0xbd2/0xe00
[ 142.168114] handle_mm_fault+0x116/0x250
[ 142.168841] __do_page_fault+0x233/0x4d0
[ 142.169567] do_page_fault+0x32/0x130
[ 142.170303] ? page_fault+0x8/0x30
[ 142.171036] page_fault+0x1e/0x30
[ 142.171764] RIP: 0033:0x7f403000a860
[ 142.172517] RSP: 002b:00007ffc9f745c28 EFLAGS: 00010206
[ 142.173268] RAX: 00007f3f6fd7d000 RBX: 0000000000000000 RCX: 00007f3f7f5cd000
[ 142.174040] RDX: 00007f3fafd7d000 RSI: 0000000000000000 RDI: 00007f3f6fd7d000
[ 142.174806] RBP: 00007ffc9f745c50 R08: ffffffffffffffff R09: 0000000000000000
[ 142.175623] R10: 0000000000000022 R11: 0000000000000246 R12: 0000000000400490
[ 142.176542] R13: 00007ffc9f745d30 R14: 0000000000000000 R15: 0000000000000000
[ 142.177709] oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),origin_memcg=(null),kill_memcg=/test/test1/test2,task=panic,pid= 8622,uid= 0
...
Changes since v6:
- divide the patch v5 into two parts. One part is to add an array of const char and
put enum oom_constraint into the memcontrol.h; the other is will refactor the output
in the dump_header.
- limit the memory usage for the static char array by using NAME_MAX in the mem_cgroup_print_oom_context.
- eliminate the spurious spaces in the oom's output and fix the spelling of "constrain".
Changes since v5:
- add an array of const char for each constraint.
- replace all of the pr_cont with a single line print of the pr_info.
- put enum oom_constraint into the memcontrol.c file for printing oom constraint.
Changes since v4:
- rename the helper's name to mem_cgroup_print_oom_context.
- rename the mem_cgroup_print_oom_info to mem_cgroup_print_oom_meminfo.
- add the constrain info in the dump_header.
Changes since v3:
- rename the helper's name to mem_cgroup_print_oom_memcg_name.
- add the rcu lock held to the helper.
- remove the print info of memcg's name in mem_cgroup_print_oom_info.
Changes since v2:
- add the mem_cgroup_print_memcg_name helper to print the memcg's
name which contains the task that will be killed by the oom-killer.
Changes since v1:
- replace adding mem_cgroup_print_oom_info with printing the memcg's
name only.
Signed-off-by: yuzhoujian <[email protected]>
---
include/linux/memcontrol.h | 15 ++++++++++---
mm/memcontrol.c | 55 ++++++++++++++++++++++++++++++++--------------
mm/oom_kill.c | 5 +++--
3 files changed, 53 insertions(+), 22 deletions(-)
diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index 57311b6c4d67..1c7d5da1c827 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -478,8 +478,11 @@ void mem_cgroup_handle_over_high(void);
unsigned long mem_cgroup_get_limit(struct mem_cgroup *memcg);
-void mem_cgroup_print_oom_info(struct mem_cgroup *memcg,
- struct task_struct *p);
+void mem_cgroup_print_oom_context(struct mem_cgroup *memcg,
+ struct task_struct *p, enum oom_constraint constraint,
+ nodemask_t *nodemask);
+
+void mem_cgroup_print_oom_meminfo(struct mem_cgroup *memcg);
static inline void mem_cgroup_oom_enable(void)
{
@@ -873,7 +876,13 @@ static inline unsigned long mem_cgroup_get_limit(struct mem_cgroup *memcg)
}
static inline void
-mem_cgroup_print_oom_info(struct mem_cgroup *memcg, struct task_struct *p)
+mem_cgroup_print_oom_context(struct mem_cgroup *memcg, struct task_struct *p,
+ enum oom_constraint constraint, nodemask_t *nodemask)
+{
+}
+
+static inline void
+mem_cgroup_print_oom_meminfo(struct mem_cgroup *memcg)
{
}
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 2bd3df3d101a..fd1172938c8e 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -1118,33 +1118,54 @@ static const char *const memcg1_stat_names[] = {
};
#define K(x) ((x) << (PAGE_SHIFT-10))
-/**
- * mem_cgroup_print_oom_info: Print OOM information relevant to memory controller.
- * @memcg: The memory cgroup that went over limit
+/*
+ * mem_cgroup_print_oom_context: Print OOM context information relevant to
+ * memory controller, which includes allocation constraint, nodemask, origin
+ * memcg that has reached its limit, kill memcg that contains the killed
+ * process, killed process's command, pid and uid.
+ * @memcg: The origin memory cgroup that went over limit
* @p: Task that is going to be killed
+ * @constraint: The allocation constraint
+ * @nodemask: The allocation nodemask
*
* NOTE: @memcg and @p's mem_cgroup can be different when hierarchy is
* enabled
*/
-void mem_cgroup_print_oom_info(struct mem_cgroup *memcg, struct task_struct *p)
+void mem_cgroup_print_oom_context(struct mem_cgroup *memcg, struct task_struct *p,
+ enum oom_constraint constraint, nodemask_t *nodemask)
{
- struct mem_cgroup *iter;
- unsigned int i;
+ static char origin_memcg_name[NAME_MAX], kill_memcg_name[NAME_MAX];
+ struct cgroup *origin_cgrp, *kill_cgrp;
rcu_read_lock();
-
- if (p) {
- pr_info("Task in ");
- pr_cont_cgroup_path(task_cgroup(p, memory_cgrp_id));
- pr_cont(" killed as a result of limit of ");
- } else {
- pr_info("Memory limit reached of cgroup ");
+ if (memcg) {
+ origin_cgrp = memcg->css.cgroup;
+ cgroup_path(origin_cgrp, origin_memcg_name, NAME_MAX);
}
-
- pr_cont_cgroup_path(memcg->css.cgroup);
- pr_cont("\n");
-
+ kill_cgrp = task_cgroup(p, memory_cgrp_id);
+ cgroup_path(kill_cgrp, kill_memcg_name, NAME_MAX);
+
+ if (p)
+ pr_info("oom-kill:constraint=%s,nodemask=%*pbl,origin_memcg=%s,kill_memcg=%s,task=%s,pid=%5d,uid=%5d\n",
+ oom_constraint_text[constraint], nodemask_pr_args(nodemask),
+ strlen(origin_memcg_name) ? origin_memcg_name : "(null)",
+ kill_memcg_name, p->comm, p->pid,
+ from_kuid(&init_user_ns, task_uid(p)));
+ else
+ pr_info("oom-kill:constraint=%s,nodemask=%*pbl,origin_memcg=%s,kill_memcg=%s\n",
+ oom_constraint_text[constraint], nodemask_pr_args(nodemask),
+ strlen(origin_memcg_name) ? origin_memcg_name : "(null)", kill_memcg_name);
rcu_read_unlock();
+}
+
+/**
+ * mem_cgroup_print_oom_info: Print OOM memory information relevant to memory controller.
+ * @memcg: The memory cgroup that went over limit
+ */
+void mem_cgroup_print_oom_meminfo(struct mem_cgroup *memcg)
+{
+ struct mem_cgroup *iter;
+ unsigned int i;
pr_info("memory: usage %llukB, limit %llukB, failcnt %lu\n",
K((u64)page_counter_read(&memcg->memory)),
diff --git a/mm/oom_kill.c b/mm/oom_kill.c
index c806cd656af6..af0efab8a9e5 100644
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -16,7 +16,6 @@
* for newbie kernel hackers. It features several pointers to major
* kernel subsystems and hints as to where to find out what things do.
*/
-
#include <linux/oom.h>
#include <linux/mm.h>
#include <linux/err.h>
@@ -414,6 +413,7 @@ static void dump_tasks(struct mem_cgroup *memcg, const nodemask_t *nodemask)
static void dump_header(struct oom_control *oc, struct task_struct *p)
{
+ enum oom_constraint constraint = constrained_alloc(oc);
pr_warn("%s invoked oom-killer: gfp_mask=%#x(%pGg), nodemask=%*pbl, order=%d, oom_score_adj=%hd\n",
current->comm, oc->gfp_mask, &oc->gfp_mask,
nodemask_pr_args(oc->nodemask), oc->order,
@@ -423,8 +423,9 @@ static void dump_header(struct oom_control *oc, struct task_struct *p)
cpuset_print_current_mems_allowed();
dump_stack();
+ mem_cgroup_print_oom_context(oc->memcg, p, constraint, oc->nodemask);
if (is_memcg_oom(oc))
- mem_cgroup_print_oom_info(oc->memcg, p);
+ mem_cgroup_print_oom_meminfo(oc->memcg);
else {
show_mem(SHOW_MEM_FILTER_NODES, oc->nodemask);
if (is_dump_unreclaim_slabs())
--
2.14.1
From: yuzhoujian <[email protected]>
This patch will make some preparation for the follow-up patch: Refactor
part of the oom report in dump_header. It puts enum oom_constraint in
memcontrol.h and adds an array of const char for each constraint.
Signed-off-by: yuzhoujian <[email protected]>
---
include/linux/memcontrol.h | 14 ++++++++++++++
mm/oom_kill.c | 7 -------
2 files changed, 14 insertions(+), 7 deletions(-)
diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index d99b71bc2c66..57311b6c4d67 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -62,6 +62,20 @@ struct mem_cgroup_reclaim_cookie {
unsigned int generation;
};
+enum oom_constraint {
+ CONSTRAINT_NONE,
+ CONSTRAINT_CPUSET,
+ CONSTRAINT_MEMORY_POLICY,
+ CONSTRAINT_MEMCG,
+};
+
+static const char * const oom_constraint_text[] = {
+ [CONSTRAINT_NONE] = "CONSTRAINT_NONE",
+ [CONSTRAINT_CPUSET] = "CONSTRAINT_CPUSET",
+ [CONSTRAINT_MEMORY_POLICY] = "CONSTRAINT_MEMORY_POLICY",
+ [CONSTRAINT_MEMCG] = "CONSTRAINT_MEMCG",
+};
+
#ifdef CONFIG_MEMCG
#define MEM_CGROUP_ID_SHIFT 16
diff --git a/mm/oom_kill.c b/mm/oom_kill.c
index 8ba6cb88cf58..c806cd656af6 100644
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -237,13 +237,6 @@ unsigned long oom_badness(struct task_struct *p, struct mem_cgroup *memcg,
return points > 0 ? points : 1;
}
-enum oom_constraint {
- CONSTRAINT_NONE,
- CONSTRAINT_CPUSET,
- CONSTRAINT_MEMORY_POLICY,
- CONSTRAINT_MEMCG,
-};
-
/*
* Determine the type of allocation constraint.
*/
--
2.14.1
On Sat, Jun 02, 2018 at 07:58:52PM +0800, [email protected] wrote:
> From: yuzhoujian <[email protected]>
>
> The dump_header does not print the memcg's name when the system
> oom happened, so users cannot locate the certain container which
> contains the task that has been killed by the oom killer.
>
> I follow the advices of David Rientjes and Michal Hocko, and refactor
> part of the oom report in a backwards compatible way. After this patch,
> users can get the memcg's path from the oom report and check the certain
> container more quickly.
>
> Below is the part of the oom report in the dmesg
> ...
> [ 142.158316] panic cpuset=/ mems_allowed=0-1
> [ 142.158983] CPU: 15 PID: 8682 Comm: panic Not tainted 4.17.0-rc6+ #13
> [ 142.159659] Hardware name: Inspur SA5212M4/YZMB-00370-107, BIOS 4.1.10 11/14/2016
> [ 142.160342] Call Trace:
> [ 142.161037] dump_stack+0x78/0xb3
> [ 142.161734] dump_header+0x7d/0x334
> [ 142.162433] oom_kill_process+0x228/0x490
> [ 142.163126] ? oom_badness+0x2a/0x130
> [ 142.163821] out_of_memory+0xf0/0x280
> [ 142.164532] __alloc_pages_slowpath+0x711/0xa07
> [ 142.165241] __alloc_pages_nodemask+0x23f/0x260
> [ 142.165947] alloc_pages_vma+0x73/0x180
> [ 142.166665] do_anonymous_page+0xed/0x4e0
> [ 142.167388] __handle_mm_fault+0xbd2/0xe00
> [ 142.168114] handle_mm_fault+0x116/0x250
> [ 142.168841] __do_page_fault+0x233/0x4d0
> [ 142.169567] do_page_fault+0x32/0x130
> [ 142.170303] ? page_fault+0x8/0x30
> [ 142.171036] page_fault+0x1e/0x30
> [ 142.171764] RIP: 0033:0x7f403000a860
> [ 142.172517] RSP: 002b:00007ffc9f745c28 EFLAGS: 00010206
> [ 142.173268] RAX: 00007f3f6fd7d000 RBX: 0000000000000000 RCX: 00007f3f7f5cd000
> [ 142.174040] RDX: 00007f3fafd7d000 RSI: 0000000000000000 RDI: 00007f3f6fd7d000
> [ 142.174806] RBP: 00007ffc9f745c50 R08: ffffffffffffffff R09: 0000000000000000
> [ 142.175623] R10: 0000000000000022 R11: 0000000000000246 R12: 0000000000400490
> [ 142.176542] R13: 00007ffc9f745d30 R14: 0000000000000000 R15: 0000000000000000
> [ 142.177709] oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),origin_memcg=(null),kill_memcg=/test/test1/test2,task=panic,pid= 8622,uid= 0
> ...
>
> Changes since v6:
> - divide the patch v5 into two parts. One part is to add an array of const char and
> put enum oom_constraint into the memcontrol.h; the other is will refactor the output
> in the dump_header.
> - limit the memory usage for the static char array by using NAME_MAX in the mem_cgroup_print_oom_context.
> - eliminate the spurious spaces in the oom's output and fix the spelling of "constrain".
>
> Changes since v5:
> - add an array of const char for each constraint.
> - replace all of the pr_cont with a single line print of the pr_info.
> - put enum oom_constraint into the memcontrol.c file for printing oom constraint.
>
> Changes since v4:
> - rename the helper's name to mem_cgroup_print_oom_context.
> - rename the mem_cgroup_print_oom_info to mem_cgroup_print_oom_meminfo.
> - add the constrain info in the dump_header.
>
> Changes since v3:
> - rename the helper's name to mem_cgroup_print_oom_memcg_name.
> - add the rcu lock held to the helper.
> - remove the print info of memcg's name in mem_cgroup_print_oom_info.
>
> Changes since v2:
> - add the mem_cgroup_print_memcg_name helper to print the memcg's
> name which contains the task that will be killed by the oom-killer.
>
> Changes since v1:
> - replace adding mem_cgroup_print_oom_info with printing the memcg's
> name only.
>
> Signed-off-by: yuzhoujian <[email protected]>
> ---
> include/linux/memcontrol.h | 15 ++++++++++---
> mm/memcontrol.c | 55 ++++++++++++++++++++++++++++++++--------------
> mm/oom_kill.c | 5 +++--
> 3 files changed, 53 insertions(+), 22 deletions(-)
>
> diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
> index 57311b6c4d67..1c7d5da1c827 100644
> --- a/include/linux/memcontrol.h
> +++ b/include/linux/memcontrol.h
> @@ -478,8 +478,11 @@ void mem_cgroup_handle_over_high(void);
>
> unsigned long mem_cgroup_get_limit(struct mem_cgroup *memcg);
>
> -void mem_cgroup_print_oom_info(struct mem_cgroup *memcg,
> - struct task_struct *p);
> +void mem_cgroup_print_oom_context(struct mem_cgroup *memcg,
> + struct task_struct *p, enum oom_constraint constraint,
> + nodemask_t *nodemask);
> +
> +void mem_cgroup_print_oom_meminfo(struct mem_cgroup *memcg);
>
> static inline void mem_cgroup_oom_enable(void)
> {
> @@ -873,7 +876,13 @@ static inline unsigned long mem_cgroup_get_limit(struct mem_cgroup *memcg)
> }
>
> static inline void
> -mem_cgroup_print_oom_info(struct mem_cgroup *memcg, struct task_struct *p)
> +mem_cgroup_print_oom_context(struct mem_cgroup *memcg, struct task_struct *p,
> + enum oom_constraint constraint, nodemask_t *nodemask)
> +{
> +}
> +
> +static inline void
> +mem_cgroup_print_oom_meminfo(struct mem_cgroup *memcg)
> {
> }
>
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 2bd3df3d101a..fd1172938c8e 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -1118,33 +1118,54 @@ static const char *const memcg1_stat_names[] = {
> };
>
> #define K(x) ((x) << (PAGE_SHIFT-10))
> -/**
> - * mem_cgroup_print_oom_info: Print OOM information relevant to memory controller.
> - * @memcg: The memory cgroup that went over limit
> +/*
> + * mem_cgroup_print_oom_context: Print OOM context information relevant to
> + * memory controller, which includes allocation constraint, nodemask, origin
> + * memcg that has reached its limit, kill memcg that contains the killed
> + * process, killed process's command, pid and uid.
Please keep the brief description of the function actually brief and move
the detailed explanation after the parameters description.
> + * @memcg: The origin memory cgroup that went over limit
> * @p: Task that is going to be killed
> + * @constraint: The allocation constraint
> + * @nodemask: The allocation nodemask
> *
> * NOTE: @memcg and @p's mem_cgroup can be different when hierarchy is
> * enabled
> */
> -void mem_cgroup_print_oom_info(struct mem_cgroup *memcg, struct task_struct *p)
> +void mem_cgroup_print_oom_context(struct mem_cgroup *memcg, struct task_struct *p,
> + enum oom_constraint constraint, nodemask_t *nodemask)
> {
> - struct mem_cgroup *iter;
> - unsigned int i;
> + static char origin_memcg_name[NAME_MAX], kill_memcg_name[NAME_MAX];
> + struct cgroup *origin_cgrp, *kill_cgrp;
>
> rcu_read_lock();
> -
> - if (p) {
> - pr_info("Task in ");
> - pr_cont_cgroup_path(task_cgroup(p, memory_cgrp_id));
> - pr_cont(" killed as a result of limit of ");
> - } else {
> - pr_info("Memory limit reached of cgroup ");
> + if (memcg) {
> + origin_cgrp = memcg->css.cgroup;
> + cgroup_path(origin_cgrp, origin_memcg_name, NAME_MAX);
> }
> -
> - pr_cont_cgroup_path(memcg->css.cgroup);
> - pr_cont("\n");
> -
> + kill_cgrp = task_cgroup(p, memory_cgrp_id);
> + cgroup_path(kill_cgrp, kill_memcg_name, NAME_MAX);
> +
> + if (p)
> + pr_info("oom-kill:constraint=%s,nodemask=%*pbl,origin_memcg=%s,kill_memcg=%s,task=%s,pid=%5d,uid=%5d\n",
> + oom_constraint_text[constraint], nodemask_pr_args(nodemask),
> + strlen(origin_memcg_name) ? origin_memcg_name : "(null)",
> + kill_memcg_name, p->comm, p->pid,
> + from_kuid(&init_user_ns, task_uid(p)));
> + else
> + pr_info("oom-kill:constraint=%s,nodemask=%*pbl,origin_memcg=%s,kill_memcg=%s\n",
> + oom_constraint_text[constraint], nodemask_pr_args(nodemask),
> + strlen(origin_memcg_name) ? origin_memcg_name : "(null)", kill_memcg_name);
> rcu_read_unlock();
> +}
> +
> +/**
> + * mem_cgroup_print_oom_info: Print OOM memory information relevant to memory controller.
> + * @memcg: The memory cgroup that went over limit
> + */
> +void mem_cgroup_print_oom_meminfo(struct mem_cgroup *memcg)
> +{
> + struct mem_cgroup *iter;
> + unsigned int i;
>
> pr_info("memory: usage %llukB, limit %llukB, failcnt %lu\n",
> K((u64)page_counter_read(&memcg->memory)),
> diff --git a/mm/oom_kill.c b/mm/oom_kill.c
> index c806cd656af6..af0efab8a9e5 100644
> --- a/mm/oom_kill.c
> +++ b/mm/oom_kill.c
> @@ -16,7 +16,6 @@
> * for newbie kernel hackers. It features several pointers to major
> * kernel subsystems and hints as to where to find out what things do.
> */
> -
> #include <linux/oom.h>
> #include <linux/mm.h>
> #include <linux/err.h>
> @@ -414,6 +413,7 @@ static void dump_tasks(struct mem_cgroup *memcg, const nodemask_t *nodemask)
>
> static void dump_header(struct oom_control *oc, struct task_struct *p)
> {
> + enum oom_constraint constraint = constrained_alloc(oc);
The allocation constraint is detected by the dump_header() callers, why not
just use it here?
> pr_warn("%s invoked oom-killer: gfp_mask=%#x(%pGg), nodemask=%*pbl, order=%d, oom_score_adj=%hd\n",
> current->comm, oc->gfp_mask, &oc->gfp_mask,
> nodemask_pr_args(oc->nodemask), oc->order,
> @@ -423,8 +423,9 @@ static void dump_header(struct oom_control *oc, struct task_struct *p)
>
> cpuset_print_current_mems_allowed();
> dump_stack();
> + mem_cgroup_print_oom_context(oc->memcg, p, constraint, oc->nodemask);
> if (is_memcg_oom(oc))
> - mem_cgroup_print_oom_info(oc->memcg, p);
> + mem_cgroup_print_oom_meminfo(oc->memcg);
> else {
> show_mem(SHOW_MEM_FILTER_NODES, oc->nodemask);
> if (is_dump_unreclaim_slabs())
> --
> 2.14.1
>
--
Sincerely yours,
Mike.
On 2018/06/02 20:58, yuzhoujian wrote:
> -void mem_cgroup_print_oom_info(struct mem_cgroup *memcg, struct task_struct *p)
> +void mem_cgroup_print_oom_context(struct mem_cgroup *memcg, struct task_struct *p,
> + enum oom_constraint constraint, nodemask_t *nodemask)
> {
> - struct mem_cgroup *iter;
> - unsigned int i;
> + static char origin_memcg_name[NAME_MAX], kill_memcg_name[NAME_MAX];
> + struct cgroup *origin_cgrp, *kill_cgrp;
>
> rcu_read_lock();
> -
> - if (p) {
> - pr_info("Task in ");
> - pr_cont_cgroup_path(task_cgroup(p, memory_cgrp_id));
> - pr_cont(" killed as a result of limit of ");
> - } else {
> - pr_info("Memory limit reached of cgroup ");
> + if (memcg) {
> + origin_cgrp = memcg->css.cgroup;
> + cgroup_path(origin_cgrp, origin_memcg_name, NAME_MAX);
> }
> -
> - pr_cont_cgroup_path(memcg->css.cgroup);
> - pr_cont("\n");
> -
> + kill_cgrp = task_cgroup(p, memory_cgrp_id);
> + cgroup_path(kill_cgrp, kill_memcg_name, NAME_MAX);
> +
> + if (p)
> + pr_info("oom-kill:constraint=%s,nodemask=%*pbl,origin_memcg=%s,kill_memcg=%s,task=%s,pid=%5d,uid=%5d\n",
> + oom_constraint_text[constraint], nodemask_pr_args(nodemask),
> + strlen(origin_memcg_name) ? origin_memcg_name : "(null)",
Since origin_memcg_name is printed for both memcg OOM and !memcg OOM,
it is strange that origin_memcg_name is updated only when memcg != NULL.
Have you really tested !memcg OOM case?
Hi Mike
> Please keep the brief description of the function actually brief and move the detailed explanation after the parameters description.
Thanks for your advice.
> The allocation constraint is detected by the dump_header() callers, why not just use it here?
David suggest that constraint need to be printed in the oom report, so
I add the enum variable in this function.
Thanks
Wind
Hi Tetsuo
> Since origin_memcg_name is printed for both memcg OOM and !memcg OOM, it is strange that origin_memcg_name is updated only when memcg != NULL. Have you really tested !memcg OOM case?
if memcg == NULL , origin_memcg_name will also be NULL, so the length
of it is 0. origin_memcg_name will be "(null)". I've tested !memcg OOM
case with CONFIG_MEMCG and !CONFIG_MEMCG, and found nothing wrong.
Thanks
Wind
禹舟键 <[email protected]> 于2018年6月4日周一 上午9:58写道:
>
> Hi Mike
> > Please keep the brief description of the function actually brief and move the detailed explanation after the parameters description.
> Thanks for your advice.
>
> > The allocation constraint is detected by the dump_header() callers, why not just use it here?
> David suggest that constraint need to be printed in the oom report, so
> I add the enum variable in this function.
>
> Thanks
> Wind
On Mon, Jun 04, 2018 at 10:41:10AM +0800, 禹舟键 wrote:
> Hi Tetsuo
> > Since origin_memcg_name is printed for both memcg OOM and !memcg OOM, it is strange that origin_memcg_name is updated only when memcg != NULL. Have you really tested !memcg OOM case?
>
> if memcg == NULL , origin_memcg_name will also be NULL, so the length
> of it is 0. origin_memcg_name will be "(null)". I've tested !memcg OOM
> case with CONFIG_MEMCG and !CONFIG_MEMCG, and found nothing wrong.
>
> Thanks
> Wind
> 禹舟键 <[email protected]> 于2018年6月4日周一 上午9:58写道:
> >
> > Hi Mike
> > > Please keep the brief description of the function actually brief and move the detailed explanation after the parameters description.
> > Thanks for your advice.
> >
> > > The allocation constraint is detected by the dump_header() callers, why not just use it here?
> > David suggest that constraint need to be printed in the oom report, so
> > I add the enum variable in this function.
My question was why do you call to alloc_constrained in the dump_header()
function rather than pass the constraint that was detected a bit earlier to
that function?
Sorry if wasn't clear enough.
> > Thanks
> > Wind
>
--
Sincerely yours,
Mike.
Hi Mike
> My question was why do you call to alloc_constrained in the dump_header()
>function rather than pass the constraint that was detected a bit earlier to
>that function?
Ok, I will add a new parameter in the dump_header.
Thank you.
On Sat 02-06-18 19:58:51, [email protected] wrote:
> From: yuzhoujian <[email protected]>
>
> This patch will make some preparation for the follow-up patch: Refactor
> part of the oom report in dump_header. It puts enum oom_constraint in
> memcontrol.h and adds an array of const char for each constraint.
I do not get why you separate this specific part out.
oom_constraint_text is not used in the patch. It is almost always
preferable to have a user of newly added functionality.
>
> Signed-off-by: yuzhoujian <[email protected]>
> ---
> include/linux/memcontrol.h | 14 ++++++++++++++
> mm/oom_kill.c | 7 -------
> 2 files changed, 14 insertions(+), 7 deletions(-)
>
> diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
> index d99b71bc2c66..57311b6c4d67 100644
> --- a/include/linux/memcontrol.h
> +++ b/include/linux/memcontrol.h
> @@ -62,6 +62,20 @@ struct mem_cgroup_reclaim_cookie {
> unsigned int generation;
> };
>
> +enum oom_constraint {
> + CONSTRAINT_NONE,
> + CONSTRAINT_CPUSET,
> + CONSTRAINT_MEMORY_POLICY,
> + CONSTRAINT_MEMCG,
> +};
> +
> +static const char * const oom_constraint_text[] = {
> + [CONSTRAINT_NONE] = "CONSTRAINT_NONE",
> + [CONSTRAINT_CPUSET] = "CONSTRAINT_CPUSET",
> + [CONSTRAINT_MEMORY_POLICY] = "CONSTRAINT_MEMORY_POLICY",
> + [CONSTRAINT_MEMCG] = "CONSTRAINT_MEMCG",
> +};
> +
> #ifdef CONFIG_MEMCG
>
> #define MEM_CGROUP_ID_SHIFT 16
> diff --git a/mm/oom_kill.c b/mm/oom_kill.c
> index 8ba6cb88cf58..c806cd656af6 100644
> --- a/mm/oom_kill.c
> +++ b/mm/oom_kill.c
> @@ -237,13 +237,6 @@ unsigned long oom_badness(struct task_struct *p, struct mem_cgroup *memcg,
> return points > 0 ? points : 1;
> }
>
> -enum oom_constraint {
> - CONSTRAINT_NONE,
> - CONSTRAINT_CPUSET,
> - CONSTRAINT_MEMORY_POLICY,
> - CONSTRAINT_MEMCG,
> -};
> -
> /*
> * Determine the type of allocation constraint.
> */
> --
> 2.14.1
--
Michal Hocko
SUSE Labs
On Sat 02-06-18 19:58:52, [email protected] wrote:
> From: yuzhoujian <[email protected]>
>
> The dump_header does not print the memcg's name when the system
> oom happened, so users cannot locate the certain container which
> contains the task that has been killed by the oom killer.
>
> I follow the advices of David Rientjes and Michal Hocko, and refactor
> part of the oom report in a backwards compatible way. After this patch,
> users can get the memcg's path from the oom report and check the certain
> container more quickly.
I have earlier suggested that you split this into two parts. One to add
the missing information and the later to convert it to a single printk
output. Reducing the overhead from PATH_MAX to NAME_MAX is a good step
but it still really begs an example why we really insist on a single
printk and that should be in its own changelog.
Sorry if that was not clear previously.
--
Michal Hocko
SUSE Labs
Hi Tetsuo
I know what you mean. Actully I refer to the code for kernel version:
3.10.0-514. I think I can just use an array of type char, rather than
static char. Is it right?
Thanks
Hi Michal
> I have earlier suggested that you split this into two parts. One to add
> the missing information and the later to convert it to a single printk
> output.
I'm sorry I do not get your point. What do you mean the missing information?
> but it still really begs an example why we really insist on a single
> printk and that should be in its own changelog.
Actually , I just know that we should avoid the interleaving messages
in the dmesg.
But I don't know how to reproduce this issue. I think I can just
recount this issue in
the changelog.
Thanks
On Mon 04-06-18 16:57:17, 禹舟键 wrote:
> Hi Michal
>
> > I have earlier suggested that you split this into two parts. One to add
> > the missing information and the later to convert it to a single printk
> > output.
>
> I'm sorry I do not get your point. What do you mean the missing information?
memcg of the killed process
>
> > but it still really begs an example why we really insist on a single
> > printk and that should be in its own changelog.
>
> Actually , I just know that we should avoid the interleaving messages
> in the dmesg.
Yeah, that would be great. But you are increasing the static kernel size
and that is something to weigh in when considering the benefit. How
often those messages get interleaved? Is it worth another 512B of size?
Maybe yes, I am not sure. But this should be its own patch so that we
can revert it easily if the cost turns out to be bigger than the
benefit. You should realize that the OOM is a rare case and spending
resources on it is not really appreciated.
That being said, I am ready to ack a patch which adds the memcg of the
oom victim. I will not ack (nor nack) the patch which turns it into a
single print because I am not sure the benefit is really worth it. Maybe
others will though.
--
Michal Hocko
SUSE Labs
Hi Michal
I will add the missing information in the cover-letter.
> That being said, I am ready to ack a patch which adds the memcg of the
> oom victim. I will not ack (nor nack) the patch which turns it into a
> single print because I am not sure the benefit is really worth it. Maybe
> others will though.
OK, I will use the pr_cont_cgroup_name() to print origin and kill
memcg's name. I hope David will not have other opinions :)
Thanks
On Mon 04-06-18 20:13:44, 禹舟键 wrote:
> Hi Michal
> I will add the missing information in the cover-letter.
I do not really think the cover letter needs much improvements. It is
the patch description that should be as specific as possible. Cover
letter should contain a highlevel description usually.
> > That being said, I am ready to ack a patch which adds the memcg of the
> > oom victim. I will not ack (nor nack) the patch which turns it into a
> > single print because I am not sure the benefit is really worth it. Maybe
> > others will though.
>
> OK, I will use the pr_cont_cgroup_name() to print origin and kill
> memcg's name. I hope David will not have other opinions :)
As I've said this can be always added on top pressuming there is a good
justification.
--
Michal Hocko
SUSE Labs
Hi Mike
> My question was why do you call to alloc_constrained in the dump_header()
> function rather than pass the constraint that was detected a bit earlier to
> that function?
dump_header will be called by three functions: oom_kill_process,
check_panic_on_oom, out_of_memory.
We can get the constraint from the last two
functions(check_panic_on_oom, out_of_memory), but I need to
pass a new parameter(constraint) for oom_kill_process.
Thanks
On Fri, Jun 08, 2018 at 05:53:14PM +0800, 禹舟键 wrote:
> Hi Mike
> > My question was why do you call to alloc_constrained in the dump_header()
> > function rather than pass the constraint that was detected a bit earlier to
> > that function?
>
> dump_header will be called by three functions: oom_kill_process,
> check_panic_on_oom, out_of_memory.
> We can get the constraint from the last two
> functions(check_panic_on_oom, out_of_memory), but I need to
> pass a new parameter(constraint) for oom_kill_process.
Another option is to add the constraint to the oom_control structure.
> Thanks
>
--
Sincerely yours,
Mike.
On Sun 10-06-18 08:12:16, Mike Rapoport wrote:
> On Fri, Jun 08, 2018 at 05:53:14PM +0800, 禹舟键 wrote:
> > Hi Mike
> > > My question was why do you call to alloc_constrained in the dump_header()
> > > function rather than pass the constraint that was detected a bit earlier to
> > > that function?
> >
> > dump_header will be called by three functions: oom_kill_process,
> > check_panic_on_oom, out_of_memory.
> > We can get the constraint from the last two
> > functions(check_panic_on_oom, out_of_memory), but I need to
> > pass a new parameter(constraint) for oom_kill_process.
>
> Another option is to add the constraint to the oom_control structure.
Which would make more sense because oom_control should contain the full
OOM context.
--
Michal Hocko
SUSE Labs