The default page memory unit of OOM task dump events might not be
intuitive for the non-initiated when debugging OOM events. Add
a small printk prior to the task dump informing that the memory
units are actually memory _pages_.
Signed-off-by: Rodrigo Freire <[email protected]>
---
mm/oom_kill.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/mm/oom_kill.c b/mm/oom_kill.c
index 84081e7..b4d9557 100644
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -392,6 +392,7 @@ static void dump_tasks(struct mem_cgroup *memcg, const nodemask_t *nodemask)
 	struct task_struct *p;
 	struct task_struct *task;
 
+	pr_info("Tasks state (memory values in pages):\n");
 	pr_info("[ pid ] uid tgid total_vm rss pgtables_bytes swapents oom_score_adj name\n");
 	rcu_read_lock();
 	for_each_process(p) {
--
1.8.3.1
On Sun 01-07-18 13:09:40, Rodrigo Freire wrote:
> The default page memory unit of OOM task dump events might not be
> intuitive for the non-initiated when debugging OOM events. Add
> a small printk prior to the task dump informing that the memory
> units are actually memory _pages_.
Does this really help? I understand that the OOM report might not be the
easiest thing to grasp, but wouldn't it be much better to actually add
documentation clarifying each part of it?
> Signed-off-by: Rodrigo Freire <[email protected]>
> ---
> mm/oom_kill.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/mm/oom_kill.c b/mm/oom_kill.c
> index 84081e7..b4d9557 100644
> --- a/mm/oom_kill.c
> +++ b/mm/oom_kill.c
> @@ -392,6 +392,7 @@ static void dump_tasks(struct mem_cgroup *memcg, const nodemask_t *nodemask)
> struct task_struct *p;
> struct task_struct *task;
>
> + pr_info("Tasks state (memory values in pages):\n");
> pr_info("[ pid ] uid tgid total_vm rss pgtables_bytes swapents oom_score_adj name\n");
> rcu_read_lock();
> for_each_process(p) {
> --
> 1.8.3.1
--
Michal Hocko
SUSE Labs
Hello Michal,
----- Original Message -----
> From: "Michal Hocko" <[email protected]>
> To: "Rodrigo Freire" <[email protected]>
> Cc: [email protected], [email protected]
> Sent: Monday, July 2, 2018 6:30:43 AM
> Subject: Re: [PATCH] mm: be more informative in OOM task list
>
> On Sun 01-07-18 13:09:40, Rodrigo Freire wrote:
> > The default page memory unit of OOM task dump events might not be
> > intuitive for the non-initiated when debugging OOM events. Add
> > a small printk prior to the task dump informing that the memory
> > units are actually memory _pages_.
>
> Does this really help? I understand that the OOM report might not be the
> easiest thing to grasp, but wouldn't it be much better to actually add
> documentation clarifying each part of it?
That would be great. After a quick grep -ri for oom in Documentation,
I found several files, each describing its own OOM behaviour modifier
settings. But there is indeed no central, canonical Doc file that
documents the OOM killer's behavior and workflows.
However, I still stand by my proposed patch: it is unobtrusive, incurs
no performance cost, and clarifies the output. I recently worked on a
case (full disclosure: I am far from an MM expert) where the sum of the
RSS values made sense when interpreted as kB rather than pages. Reason:
there were processes sharing (a good amount of) memory regions, which
skewed the totals, and that misled not only me but some other
colleagues as well. The unit was only sorted out after actually
inspecting the source code.
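To illustrate the unit confusion, here is a minimal user-space sketch of the conversion a reader has to do by hand today. The helper names pages_to_kib() and system_page_size() are hypothetical and are not part of the patch or of mm/oom_kill.c; the 4096-byte fallback is an assumption that holds on many, but not all, configurations.

```c
#include <stdint.h>
#include <unistd.h>

/*
 * Convert a page count, as printed in the OOM task dump (e.g. the
 * total_vm or rss columns), to KiB. page_size is in bytes and is
 * assumed to be a multiple of 1024.
 */
static uint64_t pages_to_kib(uint64_t pages, uint64_t page_size)
{
	return pages * (page_size / 1024);
}

/*
 * Ask the running system for its page size. Falls back to 4096 bytes
 * (an assumption) if sysconf() cannot report it.
 */
static uint64_t system_page_size(void)
{
	long sz = sysconf(_SC_PAGESIZE);

	return sz > 0 ? (uint64_t)sz : 4096;
}
```

With 4 KiB pages, an rss of 1000 corresponds to 4000 KiB, so reading the column as kB understates usage by a factor of four.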
This patch is user-friendly and can be a great time saver to others in
the community.
I kindly request your Acked-by ;-)
Have a great week,
- RF.
Hello Michal!
----- Original Message -----
> From: "Michal Hocko" <[email protected]>
> To: "Rodrigo Freire" <[email protected]>
> Cc: [email protected], [email protected]
> Sent: Monday, July 2, 2018 8:29:06 AM
> Subject: Re: [PATCH] mm: be more informative in OOM task list
>
> On Mon 02-07-18 07:22:13, Rodrigo Freire wrote:
> > Hello Michal,
> >
> > ----- Original Message -----
> > > From: "Michal Hocko" <[email protected]>
> > > To: "Rodrigo Freire" <[email protected]>
> > > Cc: [email protected], [email protected]
> > > Sent: Monday, July 2, 2018 6:30:43 AM
> > > Subject: Re: [PATCH] mm: be more informative in OOM task list
> > >
> > > On Sun 01-07-18 13:09:40, Rodrigo Freire wrote:
> > > > The default page memory unit of OOM task dump events might not be
> > > > intuitive for the non-initiated when debugging OOM events. Add
> > > > a small printk prior to the task dump informing that the memory
> > > > units are actually memory _pages_.
> > >
> > > Does this really help? I understand that the OOM report might not be the
> > > easiest thing to grasp, but wouldn't it be much better to actually add
> > > documentation clarifying each part of it?
> >
> > That would be great. After a quick grep -ri for oom in Documentation,
> > I found several files, each describing its own OOM behaviour modifier
> > settings. But there is indeed no central, canonical Doc file that
> > documents the OOM killer's behavior and workflows.
> >
> > However, I still stand by my proposed patch: it is unobtrusive, incurs
> > no performance cost, and clarifies the output. I recently worked on a
> > case (full disclosure: I am far from an MM expert) where the sum of the
> > RSS values made sense when interpreted as kB rather than pages. Reason:
> > there were processes sharing (a good amount of) memory regions, which
> > skewed the totals, and that misled not only me but some other
> > colleagues as well. The unit was only sorted out after actually
> > inspecting the source code.
> >
> > This patch is user-friendly and can be a great time saver to others in
> > the community.
>
> Well, all other counters we print are in page units unless explicitly
> marked as kB.
Your statement is correct, and I thought about that too. But then the
doubt arose: maybe someone simply forgot to state that these values
are in kB?
> So I am not sure we really need to do anything but document the
> output better. Maybe others will find it more important though.
The thing is, the output also led some other colleagues (a few!) to the
very same conclusion as me. That raised the flag and made me write the
patch: the output was indeed misleading.
And you may not have an MM- and OOM-versed specialist available all the
time! ;-)
I still ask you to reconsider.
My best regards,
- RF.
On Mon 02-07-18 07:22:13, Rodrigo Freire wrote:
> Hello Michal,
>
> ----- Original Message -----
> > From: "Michal Hocko" <[email protected]>
> > To: "Rodrigo Freire" <[email protected]>
> > Cc: [email protected], [email protected]
> > Sent: Monday, July 2, 2018 6:30:43 AM
> > Subject: Re: [PATCH] mm: be more informative in OOM task list
> >
> > On Sun 01-07-18 13:09:40, Rodrigo Freire wrote:
> > > The default page memory unit of OOM task dump events might not be
> > > intuitive for the non-initiated when debugging OOM events. Add
> > > a small printk prior to the task dump informing that the memory
> > > units are actually memory _pages_.
> >
> > Does this really help? I understand that the OOM report might not be the
> > easiest thing to grasp, but wouldn't it be much better to actually add
> > documentation clarifying each part of it?
>
> That would be great. After a quick grep -ri for oom in Documentation,
> I found several files, each describing its own OOM behaviour modifier
> settings. But there is indeed no central, canonical Doc file that
> documents the OOM killer's behavior and workflows.
>
> However, I still stand by my proposed patch: it is unobtrusive, incurs
> no performance cost, and clarifies the output. I recently worked on a
> case (full disclosure: I am far from an MM expert) where the sum of the
> RSS values made sense when interpreted as kB rather than pages. Reason:
> there were processes sharing (a good amount of) memory regions, which
> skewed the totals, and that misled not only me but some other
> colleagues as well. The unit was only sorted out after actually
> inspecting the source code.
>
> This patch is user-friendly and can be a great time saver to others in
> the community.
Well, all other counters we print are in page units unless explicitly
marked as kB. So I am not sure we really need to do anything but document
the output better. Maybe others will find it more important though.
--
Michal Hocko
SUSE Labs
On Sun, 1 Jul 2018, Rodrigo Freire wrote:
> The default page memory unit of OOM task dump events might not be
> intuitive for the non-initiated when debugging OOM events. Add
> a small printk prior to the task dump informing that the memory
> units are actually memory _pages_.
>
> Signed-off-by: Rodrigo Freire <[email protected]>
> ---
> mm/oom_kill.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/mm/oom_kill.c b/mm/oom_kill.c
> index 84081e7..b4d9557 100644
> --- a/mm/oom_kill.c
> +++ b/mm/oom_kill.c
> @@ -392,6 +392,7 @@ static void dump_tasks(struct mem_cgroup *memcg, const nodemask_t *nodemask)
> struct task_struct *p;
> struct task_struct *task;
>
> + pr_info("Tasks state (memory values in pages):\n");
> pr_info("[ pid ] uid tgid total_vm rss pgtables_bytes swapents oom_score_adj name\n");
> rcu_read_lock();
> for_each_process(p) {
As the author of dump_tasks(), and having seen these values misinterpreted
on more than one occasion, I think this is a valuable addition.
Could you also expand out the "pid" field to allow for seven digits
instead of five? I think everything else is aligned.
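A minimal user-space sketch of why the width matters, assuming a simplified row with only pid, uid and name; format_task_row() is a hypothetical helper, not the kernel's actual pr_info() call, and the real dump has more columns. PIDs on 64-bit systems can reach seven digits when pid_max is raised, which overflows a five-character field and shifts every column after it.

```c
#include <stdio.h>

/*
 * Format one simplified task row, padding the pid to pid_width
 * characters via the "%*d" width argument. Returns the snprintf()
 * result (the length the full row needs, excluding the NUL).
 */
static int format_task_row(char *buf, size_t len, int pid_width,
			   int pid, int uid, const char *name)
{
	return snprintf(buf, len, "[%*d] %5d %s", pid_width, pid, uid, name);
}
```

With pid_width set to 7, rows for pid 42 and pid 1234567 have the same length and stay aligned; with pid_width set to 5, the seven-digit pid pushes the remaining columns two characters to the right.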
Feel free to add
Acked-by: David Rientjes <[email protected]>
to a v2.
On Tue, Jul 03, 2018 at 06:34:48PM -0700, David Rientjes wrote:
> On Sun, 1 Jul 2018, Rodrigo Freire wrote:
>
> > The default page memory unit of OOM task dump events might not be
> > intuitive for the non-initiated when debugging OOM events. Add
> > a small printk prior to the task dump informing that the memory
> > units are actually memory _pages_.
> >
> > Signed-off-by: Rodrigo Freire <[email protected]>
> > ---
> > mm/oom_kill.c | 1 +
> > 1 file changed, 1 insertion(+)
> >
> > diff --git a/mm/oom_kill.c b/mm/oom_kill.c
> > index 84081e7..b4d9557 100644
> > --- a/mm/oom_kill.c
> > +++ b/mm/oom_kill.c
> > @@ -392,6 +392,7 @@ static void dump_tasks(struct mem_cgroup *memcg, const nodemask_t *nodemask)
> > struct task_struct *p;
> > struct task_struct *task;
> >
> > + pr_info("Tasks state (memory values in pages):\n");
> > pr_info("[ pid ] uid tgid total_vm rss pgtables_bytes swapents oom_score_adj name\n");
> > rcu_read_lock();
> > for_each_process(p) {
>
> As the author of dump_tasks(), and having seen these values misinterpreted
> on more than one occasion, I think this is a valuable addition.
>
> Could you also expand out the "pid" field to allow for seven digits
> instead of five? I think everything else is aligned.
>
> Feel free to add
>
> Acked-by: David Rientjes <[email protected]>
>
> to a v2.
>
Same here, for a v2:
Acked-by: Rafael Aquini <[email protected]>