2010-08-19 10:53:01

by KOSAKI Motohiro

[permalink] [raw]
Subject: [PATCH 0/2] oom: Fix two critical bug in 2.6.36-rc1

Currently, System is frequently crash when oom-killer was invoked because
recent changes makes two critical regression.

This patches fix them.


I hope this series will be merged rapidly because they are preventing
almost all oom testing.



2010-08-19 10:53:38

by KOSAKI Motohiro

[permalink] [raw]
Subject: [PATCH 1/2] oom: fix NULL pointer dereference

commit b940fd7035 (oom: remove unnecessary code and cleanup) added
unnecessary NULL pointer dereference. remove it.

Signed-off-by: KOSAKI Motohiro <[email protected]>
---
mm/oom_kill.c | 5 ++---
1 files changed, 2 insertions(+), 3 deletions(-)

diff --git a/mm/oom_kill.c b/mm/oom_kill.c
index 5014e50..17d48a6 100644
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -401,10 +401,9 @@ static void dump_header(struct task_struct *p, gfp_t gfp_mask, int order,
static int oom_kill_task(struct task_struct *p, struct mem_cgroup *mem)
{
p = find_lock_task_mm(p);
- if (!p) {
- task_unlock(p);
+ if (!p)
return 1;
- }
+
pr_err("Killed process %d (%s) total-vm:%lukB, anon-rss:%lukB, file-rss:%lukB\n",
task_pid_nr(p), p->comm, K(p->mm->total_vm),
K(get_mm_counter(p->mm, MM_ANONPAGES)),
--
1.6.5.2


2010-08-19 10:54:12

by KOSAKI Motohiro

[permalink] [raw]
Subject: [PATCH 2/2] oom: fix tasklist_lock leak

commit 0aad4b3124 (oom: fold __out_of_memory into out_of_memory)
introduced tasklist_lock leak. Then it caused following obvious
danger warings and panic.

================================================
[ BUG: lock held when returning to user space! ]
------------------------------------------------
rsyslogd/1422 is leaving the kernel with locks still held!
1 lock held by rsyslogd/1422:
#0: (tasklist_lock){.+.+.+}, at: [<ffffffff810faf64>] out_of_memory+0x164/0x3f0
BUG: scheduling while atomic: rsyslogd/1422/0x00000002
INFO: lockdep is turned off.

This patch fixes it.

Signed-off-by: KOSAKI Motohiro <[email protected]>
---
mm/oom_kill.c | 9 ++++++---
1 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/mm/oom_kill.c b/mm/oom_kill.c
index 17d48a6..c48c5ef 100644
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -646,6 +646,7 @@ void out_of_memory(struct zonelist *zonelist, gfp_t gfp_mask,
unsigned long freed = 0;
unsigned int points;
enum oom_constraint constraint = CONSTRAINT_NONE;
+ int killed = 0;

blocking_notifier_call_chain(&oom_notify_list, 0, &freed);
if (freed > 0)
@@ -683,7 +684,7 @@ void out_of_memory(struct zonelist *zonelist, gfp_t gfp_mask,
if (!oom_kill_process(current, gfp_mask, order, 0, totalpages,
NULL, nodemask,
"Out of memory (oom_kill_allocating_task)"))
- return;
+ goto out;
}

retry:
@@ -691,7 +692,7 @@ retry:
constraint == CONSTRAINT_MEMORY_POLICY ? nodemask :
NULL);
if (PTR_ERR(p) == -1UL)
- return;
+ goto out;

/* Found nothing?!?! Either we hang forever, or we panic. */
if (!p) {
@@ -703,13 +704,15 @@ retry:
if (oom_kill_process(p, gfp_mask, order, points, totalpages, NULL,
nodemask, "Out of memory"))
goto retry;
+ killed = 1;
+out:
read_unlock(&tasklist_lock);

/*
* Give "p" a good chance of killing itself before we
* retry to allocate memory unless "p" is current
*/
- if (!test_thread_flag(TIF_MEMDIE))
+ if (killed && !test_thread_flag(TIF_MEMDIE))
schedule_timeout_uninterruptible(1);
}

--
1.6.5.2


2010-08-19 15:29:48

by Minchan Kim

[permalink] [raw]
Subject: Re: [PATCH 1/2] oom: fix NULL pointer dereference

On Thu, Aug 19, 2010 at 07:53:31PM +0900, KOSAKI Motohiro wrote:
> commit b940fd7035 (oom: remove unnecessary code and cleanup) added
> unnecessary NULL pointer dereference. remove it.
>
> Signed-off-by: KOSAKI Motohiro <[email protected]>
Reviewed-by: Minchan Kim <[email protected]>

--
Kind regards,
Minchan Kim

2010-08-19 16:06:17

by Minchan Kim

[permalink] [raw]
Subject: Re: [PATCH 2/2] oom: fix tasklist_lock leak

On Thu, Aug 19, 2010 at 07:54:06PM +0900, KOSAKI Motohiro wrote:
> commit 0aad4b3124 (oom: fold __out_of_memory into out_of_memory)
> introduced tasklist_lock leak. Then it caused following obvious
> danger warings and panic.
>
> ================================================
> [ BUG: lock held when returning to user space! ]
> ------------------------------------------------
> rsyslogd/1422 is leaving the kernel with locks still held!
> 1 lock held by rsyslogd/1422:
> #0: (tasklist_lock){.+.+.+}, at: [<ffffffff810faf64>] out_of_memory+0x164/0x3f0
> BUG: scheduling while atomic: rsyslogd/1422/0x00000002
> INFO: lockdep is turned off.
>
> This patch fixes it.
>
> Signed-off-by: KOSAKI Motohiro <[email protected]>
Reviewed-by: Minchan Kim <[email protected]>

--
Kind regards,
Minchan Kim

2010-08-19 20:36:06

by David Rientjes

[permalink] [raw]
Subject: Re: [PATCH 1/2] oom: fix NULL pointer dereference

On Thu, 19 Aug 2010, KOSAKI Motohiro wrote:

> commit b940fd7035 (oom: remove unnecessary code and cleanup) added
> unnecessary NULL pointer dereference. remove it.
>
> Signed-off-by: KOSAKI Motohiro <[email protected]>

Acked-by: David Rientjes <[email protected]>

2010-08-19 20:37:26

by David Rientjes

[permalink] [raw]
Subject: Re: [PATCH 2/2] oom: fix tasklist_lock leak

On Thu, 19 Aug 2010, KOSAKI Motohiro wrote:

> commit 0aad4b3124 (oom: fold __out_of_memory into out_of_memory)
> introduced tasklist_lock leak. Then it caused following obvious
> danger warings and panic.
>
> ================================================
> [ BUG: lock held when returning to user space! ]
> ------------------------------------------------
> rsyslogd/1422 is leaving the kernel with locks still held!
> 1 lock held by rsyslogd/1422:
> #0: (tasklist_lock){.+.+.+}, at: [<ffffffff810faf64>] out_of_memory+0x164/0x3f0
> BUG: scheduling while atomic: rsyslogd/1422/0x00000002
> INFO: lockdep is turned off.
>
> This patch fixes it.
>
> Signed-off-by: KOSAKI Motohiro <[email protected]>

Acked-by: David Rientjes <[email protected]>

2010-08-19 23:55:53

by Kamezawa Hiroyuki

[permalink] [raw]
Subject: Re: [PATCH 1/2] oom: fix NULL pointer dereference

On Thu, 19 Aug 2010 19:53:31 +0900 (JST)
KOSAKI Motohiro <[email protected]> wrote:

> commit b940fd7035 (oom: remove unnecessary code and cleanup) added
> unnecessary NULL pointer dereference. remove it.
>
> Signed-off-by: KOSAKI Motohiro <[email protected]>

Reviewed-by: KAMEZAWA Hiroyuki <[email protected]>

2010-08-19 23:57:20

by Kamezawa Hiroyuki

[permalink] [raw]
Subject: Re: [PATCH 2/2] oom: fix tasklist_lock leak

On Thu, 19 Aug 2010 19:54:06 +0900 (JST)
KOSAKI Motohiro <[email protected]> wrote:

> commit 0aad4b3124 (oom: fold __out_of_memory into out_of_memory)
> introduced tasklist_lock leak. Then it caused following obvious
> danger warings and panic.
>
> ================================================
> [ BUG: lock held when returning to user space! ]
> ------------------------------------------------
> rsyslogd/1422 is leaving the kernel with locks still held!
> 1 lock held by rsyslogd/1422:
> #0: (tasklist_lock){.+.+.+}, at: [<ffffffff810faf64>] out_of_memory+0x164/0x3f0
> BUG: scheduling while atomic: rsyslogd/1422/0x00000002
> INFO: lockdep is turned off.
>
> This patch fixes it.
>
> Signed-off-by: KOSAKI Motohiro <[email protected]>
Reviewed-by: KAMEZAWA Hiroyuki <[email protected]>