2012-02-02 13:01:12

by Oleg Nesterov

[permalink] [raw]
Subject: Re: [PATCH] staging: android/lowmemorykiller: Don't grab tasklist_lock

On 02/01, Anton Vorontsov wrote:
>
> @@ -132,7 +133,7 @@ static int lowmem_shrink(struct shrinker *s, struct shrink_control *sc)
> }
> selected_oom_adj = min_adj;
>
> - read_lock(&tasklist_lock);
> + rcu_read_lock();

This has the same problem, force_sig() becomes unsafe.

Why do you need force_? Do you really want to kill /sbin/init (or sub-namespace
init) ?

We could change force_sig_info() to use lock_task_sighand(), but I'd like to
avoid this. Imho, this interface should be cleanuped, and it should be used
for synchronous signals only.

With or without this patch, sig == NULL is not possible but !mm is not right,
there could be other other threads with mm != NULL.

Oleg.


2012-02-02 17:17:07

by Anton Vorontsov

[permalink] [raw]
Subject: Re: [PATCH] staging: android/lowmemorykiller: Don't grab tasklist_lock

On Thu, Feb 02, 2012 at 01:54:41PM +0100, Oleg Nesterov wrote:
> On 02/01, Anton Vorontsov wrote:
> >
> > @@ -132,7 +133,7 @@ static int lowmem_shrink(struct shrinker *s, struct shrink_control *sc)
> > }
> > selected_oom_adj = min_adj;
> >
> > - read_lock(&tasklist_lock);
> > + rcu_read_lock();
>
> This has the same problem, force_sig() becomes unsafe.

Ouch, I think I finally got it. So, lock_task_sighand() is trying to
gracefully grab the lock, checking if the sighand is not NULL (which means,
per __exit_signal(), that the task is halfway into the grave).

Well, it seems that such a behaviour of force_sig() is not quite obvious,
and there are other offenders out there. E.g. in sysrq code I don't see
anything that prevent the same race.

static void send_sig_all(int sig)
{
struct task_struct *p;

for_each_process(p) {
if (p->mm && !is_global_init(p))
/* Not swapper, init nor kernel thread */
force_sig(sig, p);
}
}

Would the following fix work for the sysrq?

- - - -
From: Anton Vorontsov <[email protected]>
Subject: [PATCH] sysrq: Fix unsafe operations on tasks

sysrq should grab the tasklist lock, otherwise calling force_sig() is
not safe, as it might race with exiting task, which ->sighand might be
set to NULL already.

Signed-off-by: Anton Vorontsov <[email protected]>
---
drivers/tty/sysrq.c | 2 ++
1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/drivers/tty/sysrq.c b/drivers/tty/sysrq.c
index 7867b7c..a1bcad7 100644
--- a/drivers/tty/sysrq.c
+++ b/drivers/tty/sysrq.c
@@ -322,11 +322,13 @@ static void send_sig_all(int sig)
{
struct task_struct *p;

+ read_lock(&tasklist_lock);
for_each_process(p) {
if (p->mm && !is_global_init(p))
/* Not swapper, init nor kernel thread */
force_sig(sig, p);
}
+ read_unlock(&tasklist_lock);
}

static void sysrq_handle_term(int key)
- - - -

But for LMK I will use send_signal(), as LMK is special.

Plus, while I'm at it, might want to review a bit closer other offenders,
and fix them as well.

> Why do you need force_? Do you really want to kill /sbin/init (or sub-namespace
> init) ?

Nope.

> We could change force_sig_info() to use lock_task_sighand(), but I'd like to
> avoid this. Imho, this interface should be cleanuped, and it should be used
> for synchronous signals only.

OK. Then we should fix the users?

> With or without this patch, sig == NULL is not possible but !mm is not right,
> there could be other other threads with mm != NULL.

I'm not sure I completely follow. In the current LMK code, we check for !mm
because otherwise the potential victim is useless for us (i.e. killing it
will not free much memory anyway).

Thanks!

--
Anton Vorontsov
Email: [email protected]

2012-02-03 00:03:51

by Anton Vorontsov

[permalink] [raw]
Subject: [PATCH v3] staging: android/lowmemorykiller: Don't grab tasklist_lock

Grabbing tasklist_lock has its disadvantages, i.e. it blocks
process creation and destruction. If there are lots of processes,
blocking doesn't sound as a great idea.

For LMK, it is sufficient to surround tasks list traverse with
rcu_read_{,un}lock().

>From now on using force_sig() is not safe, as it can race with an
already exiting task, so we use send_sig() now. As a downside, it
won't kill PID namespace init processes, but that's not what we
want anyway.

Suggested-by: Oleg Nesterov <[email protected]>
Signed-off-by: Anton Vorontsov <[email protected]>
---
drivers/staging/android/lowmemorykiller.c | 7 ++++---
1 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/staging/android/lowmemorykiller.c b/drivers/staging/android/lowmemorykiller.c
index 2d8d2b7..63da844 100644
--- a/drivers/staging/android/lowmemorykiller.c
+++ b/drivers/staging/android/lowmemorykiller.c
@@ -34,6 +34,7 @@
#include <linux/mm.h>
#include <linux/oom.h>
#include <linux/sched.h>
+#include <linux/rcupdate.h>
#include <linux/profile.h>
#include <linux/notifier.h>

@@ -132,7 +133,7 @@ static int lowmem_shrink(struct shrinker *s, struct shrink_control *sc)
}
selected_oom_adj = min_adj;

- read_lock(&tasklist_lock);
+ rcu_read_lock();
for_each_process(p) {
struct mm_struct *mm;
struct signal_struct *sig;
@@ -180,12 +181,12 @@ static int lowmem_shrink(struct shrinker *s, struct shrink_control *sc)
lowmem_deathpending = selected;
task_handoff_register(&task_nb);
#endif
- force_sig(SIGKILL, selected);
+ send_sig(SIGKILL, selected, 0);
rem -= selected_tasksize;
}
lowmem_print(4, "lowmem_shrink %lu, %x, return %d\n",
sc->nr_to_scan, sc->gfp_mask, rem);
- read_unlock(&tasklist_lock);
+ rcu_read_unlock();
return rem;
}

--
1.7.7.6

2012-02-03 16:37:48

by Oleg Nesterov

[permalink] [raw]
Subject: Re: [PATCH] staging: android/lowmemorykiller: Don't grab tasklist_lock

On 02/02, Anton Vorontsov wrote:
>
> On Thu, Feb 02, 2012 at 01:54:41PM +0100, Oleg Nesterov wrote:
> >
> > This has the same problem, force_sig() becomes unsafe.
>
> Ouch, I think I finally got it. So, lock_task_sighand() is trying to
> gracefully grab the lock, checking if the sighand is not NULL (which means,
> per __exit_signal(), that the task is halfway into the grave).

Yes.

> Would the following fix work for the sysrq?
>
> - - - -
> From: Anton Vorontsov <[email protected]>
> Subject: [PATCH] sysrq: Fix unsafe operations on tasks
>
> sysrq should grab the tasklist lock, otherwise calling force_sig() is
> not safe, as it might race with exiting task, which ->sighand might be
> set to NULL already.

And there are other reasons for tasklist. It is not clear why
for_each_process() itself is safe. OK, currently (afaics) the
caller disables irqs, but in theory this doesn't imply rcu_lock.

And it can race with fork() and miss the task, although mostly
in theory.

> --- a/drivers/tty/sysrq.c
> +++ b/drivers/tty/sysrq.c
> @@ -322,11 +322,13 @@ static void send_sig_all(int sig)
> {
> struct task_struct *p;
>
> + read_lock(&tasklist_lock);
> for_each_process(p) {
> if (p->mm && !is_global_init(p))
> /* Not swapper, init nor kernel thread */
> force_sig(sig, p);
> }
> + read_unlock(&tasklist_lock);

Agreed, but force_sig() should not be used anyway. It sends the
signal to the thread, we need to kill the process. The main thread
can exit.

> > With or without this patch, sig == NULL is not possible but !mm is not right,
> > there could be other other threads with mm != NULL.
>
> I'm not sure I completely follow. In the current LMK code, we check for !mm
> because otherwise the potential victim is useless for us (i.e. killing it
> will not free much memory anyway).

This is the common mistake. Consider this program:

void *thread_func(...)
{
use_a_lot_of_memory();
}

int main(void)
{
pthread_create(..., thread_func, ...);
pthread_exit();
}

lowmem_shrink() will skip this task because p->mm == NULL.

Oleg.

2012-02-03 16:43:42

by Oleg Nesterov

[permalink] [raw]
Subject: Re: [PATCH v3] staging: android/lowmemorykiller: Don't grab tasklist_lock

On 02/03, Anton Vorontsov wrote:
>
> @@ -132,7 +133,7 @@ static int lowmem_shrink(struct shrinker *s, struct shrink_control *sc)
> }
> selected_oom_adj = min_adj;
>
> - read_lock(&tasklist_lock);
> + rcu_read_lock();
> for_each_process(p) {
> struct mm_struct *mm;
> struct signal_struct *sig;
> @@ -180,12 +181,12 @@ static int lowmem_shrink(struct shrinker *s, struct shrink_control *sc)
> lowmem_deathpending = selected;
> task_handoff_register(&task_nb);
> #endif
> - force_sig(SIGKILL, selected);
> + send_sig(SIGKILL, selected, 0);
> rem -= selected_tasksize;
> }
> lowmem_print(4, "lowmem_shrink %lu, %x, return %d\n",
> sc->nr_to_scan, sc->gfp_mask, rem);
> - read_unlock(&tasklist_lock);
> + rcu_read_unlock();

I think this is correct. As for ->mm check please look at
find_lock_task_mm().

You can also remove the !sig check.

And, forgot to mention. There is another reason why mm != NULL
check is wrong (send_sig_all too). A kernel thread can do use_mm().
You should also check PF_KTHREAD.

Oleg.

2012-02-06 16:29:35

by Anton Vorontsov

[permalink] [raw]
Subject: [PATCH 1/6] oom: Make find_lock_task_mm() sparse-aware

This is needed so that callers would not get 'context imbalance'
warnings from the sparse tool.

As a side effect, this patch fixes the following sparse warnings:

CHECK mm/oom_kill.c
mm/oom_kill.c:201:28: warning: context imbalance in 'oom_badness' -
unexpected unlock
include/linux/rcupdate.h:249:30: warning: context imbalance in
'dump_tasks' - unexpected unlock
mm/oom_kill.c:453:9: warning: context imbalance in 'oom_kill_task' -
unexpected unlock
CHECK mm/memcontrol.c
...
mm/memcontrol.c:1130:17: warning: context imbalance in
'task_in_mem_cgroup' - unexpected unlock

Signed-off-by: Anton Vorontsov <[email protected]>
---
include/linux/oom.h | 11 ++++++++++-
mm/oom_kill.c | 2 +-
2 files changed, 11 insertions(+), 2 deletions(-)

diff --git a/include/linux/oom.h b/include/linux/oom.h
index 552fba9..e64bfd2 100644
--- a/include/linux/oom.h
+++ b/include/linux/oom.h
@@ -21,6 +21,7 @@

#ifdef __KERNEL__

+#include <linux/compiler.h>
#include <linux/sched.h>
#include <linux/types.h>
#include <linux/nodemask.h>
@@ -65,7 +66,15 @@ static inline void oom_killer_enable(void)
oom_killer_disabled = false;
}

-extern struct task_struct *find_lock_task_mm(struct task_struct *p);
+extern struct task_struct *__find_lock_task_mm(struct task_struct *p);
+
+#define find_lock_task_mm(p) \
+({ \
+ struct task_struct *__ret; \
+ \
+ __cond_lock(&(p)->alloc_lock, __ret = __find_lock_task_mm(p)); \
+ __ret; \
+})

/* sysctls */
extern int sysctl_oom_dump_tasks;
diff --git a/mm/oom_kill.c b/mm/oom_kill.c
index 2958fd8..0ebb383 100644
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -136,7 +136,7 @@ static bool has_intersects_mems_allowed(struct task_struct *tsk,
* pointer. Return p, or any of its subthreads with a valid ->mm, with
* task_lock() held.
*/
-struct task_struct *find_lock_task_mm(struct task_struct *p)
+struct task_struct *__find_lock_task_mm(struct task_struct *p)
{
struct task_struct *t = p;

--
1.7.7.6

2012-02-06 16:29:41

by Anton Vorontsov

[permalink] [raw]
Subject: [PATCH 2/6] oom: Get rid of sparse warnings

Sparse flood makes it hard to catch newly-introduced warnings. So let's
fix the the sparse warnings in the oom killer:

CHECK mm/oom_kill.c
mm/oom_kill.c:139:20: warning: context imbalance in
'__find_lock_task_mm' - wrong count at exit
mm/oom_kill.c:771:9: warning: context imbalance in 'out_of_memory' -
different lock contexts for basic block

The first one is fixed by assuring sparse that we know that we exit
with the lock held.

The second one is caused by the fact that sparse isn't smart enough
to handle noreturn attribute.

Signed-off-by: Anton Vorontsov <[email protected]>
---
mm/oom_kill.c | 9 ++++++++-
1 files changed, 8 insertions(+), 1 deletions(-)

diff --git a/mm/oom_kill.c b/mm/oom_kill.c
index 0ebb383..49569b6 100644
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -142,8 +142,14 @@ struct task_struct *__find_lock_task_mm(struct task_struct *p)

do {
task_lock(t);
- if (likely(t->mm))
+ if (likely(t->mm)) {
+ /*
+ * Shut up sparse: we do know that we exit w/ the
+ * task locked.
+ */
+ __release(&t->alloc_loc);
return t;
+ }
task_unlock(t);
} while_each_thread(p, t);

@@ -766,6 +772,7 @@ retry:
dump_header(NULL, gfp_mask, order, NULL, mpol_mask);
read_unlock(&tasklist_lock);
panic("Out of memory and no killable processes...\n");
+ return;
}

if (oom_kill_process(p, gfp_mask, order, points, totalpages, NULL,
--
1.7.7.6

2012-02-06 16:29:48

by Anton Vorontsov

[permalink] [raw]
Subject: [PATCH 3/6] staging: android/lowmemorykiller: Don't grab tasklist_lock

Grabbing tasklist_lock has its disadvantages, i.e. it blocks
process creation and destruction. If there are lots of processes,
blocking doesn't sound as a great idea.

For LMK, it is sufficient to surround tasks list traverse with
rcu_read_{,un}lock().

>From now on using force_sig() is not safe, as it can race with an
already exiting task, so we use send_sig() now. As a downside, it
won't kill PID namespace init processes, but that's not what we
want anyway.

Suggested-by: Oleg Nesterov <[email protected]>
Signed-off-by: Anton Vorontsov <[email protected]>
---
drivers/staging/android/lowmemorykiller.c | 7 ++++---
1 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/staging/android/lowmemorykiller.c b/drivers/staging/android/lowmemorykiller.c
index 2d8d2b7..63da844 100644
--- a/drivers/staging/android/lowmemorykiller.c
+++ b/drivers/staging/android/lowmemorykiller.c
@@ -34,6 +34,7 @@
#include <linux/mm.h>
#include <linux/oom.h>
#include <linux/sched.h>
+#include <linux/rcupdate.h>
#include <linux/profile.h>
#include <linux/notifier.h>

@@ -132,7 +133,7 @@ static int lowmem_shrink(struct shrinker *s, struct shrink_control *sc)
}
selected_oom_adj = min_adj;

- read_lock(&tasklist_lock);
+ rcu_read_lock();
for_each_process(p) {
struct mm_struct *mm;
struct signal_struct *sig;
@@ -180,12 +181,12 @@ static int lowmem_shrink(struct shrinker *s, struct shrink_control *sc)
lowmem_deathpending = selected;
task_handoff_register(&task_nb);
#endif
- force_sig(SIGKILL, selected);
+ send_sig(SIGKILL, selected, 0);
rem -= selected_tasksize;
}
lowmem_print(4, "lowmem_shrink %lu, %x, return %d\n",
sc->nr_to_scan, sc->gfp_mask, rem);
- read_unlock(&tasklist_lock);
+ rcu_read_unlock();
return rem;
}

--
1.7.7.6

2012-02-06 16:29:52

by Anton Vorontsov

[permalink] [raw]
Subject: [PATCH 4/6] staging: android/lowmemorykiller: Better mm handling

LMK should not directly check for task->mm. The reason is that the
process' threads may exit or detach its mm via use_mm(), but other
threads may still have a valid mm. To catch this we use
find_lock_task_mm(), which walks up all threads and returns an
appropriate task (with lock held).

Suggested-by: Oleg Nesterov <[email protected]>
Signed-off-by: Anton Vorontsov <[email protected]>
---
drivers/staging/android/lowmemorykiller.c | 16 +++++++++-------
1 files changed, 9 insertions(+), 7 deletions(-)

diff --git a/drivers/staging/android/lowmemorykiller.c b/drivers/staging/android/lowmemorykiller.c
index 63da844..0755e2f 100644
--- a/drivers/staging/android/lowmemorykiller.c
+++ b/drivers/staging/android/lowmemorykiller.c
@@ -82,7 +82,7 @@ task_notify_func(struct notifier_block *self, unsigned long val, void *data)

static int lowmem_shrink(struct shrinker *s, struct shrink_control *sc)
{
- struct task_struct *p;
+ struct task_struct *tsk;
struct task_struct *selected = NULL;
int rem = 0;
int tasksize;
@@ -134,15 +134,17 @@ static int lowmem_shrink(struct shrinker *s, struct shrink_control *sc)
selected_oom_adj = min_adj;

rcu_read_lock();
- for_each_process(p) {
- struct mm_struct *mm;
+ for_each_process(tsk) {
+ struct task_struct *p;
struct signal_struct *sig;
int oom_adj;

- task_lock(p);
- mm = p->mm;
+ p = find_lock_task_mm(tsk);
+ if (!p)
+ continue;
+
sig = p->signal;
- if (!mm || !sig) {
+ if (!sig) {
task_unlock(p);
continue;
}
@@ -151,7 +153,7 @@ static int lowmem_shrink(struct shrinker *s, struct shrink_control *sc)
task_unlock(p);
continue;
}
- tasksize = get_mm_rss(mm);
+ tasksize = get_mm_rss(p->mm);
task_unlock(p);
if (tasksize <= 0)
continue;
--
1.7.7.6

2012-02-06 16:29:59

by Anton Vorontsov

[permalink] [raw]
Subject: [PATCH 5/6] staging: android/lowmemorykiller: No need for task->signal check

task->signal == NULL is not possible, so no need for these checks.

Suggested-by: Oleg Nesterov <[email protected]>
Signed-off-by: Anton Vorontsov <[email protected]>
---
drivers/staging/android/lowmemorykiller.c | 8 +-------
1 files changed, 1 insertions(+), 7 deletions(-)

diff --git a/drivers/staging/android/lowmemorykiller.c b/drivers/staging/android/lowmemorykiller.c
index 0755e2f..6e800d3 100644
--- a/drivers/staging/android/lowmemorykiller.c
+++ b/drivers/staging/android/lowmemorykiller.c
@@ -136,19 +136,13 @@ static int lowmem_shrink(struct shrinker *s, struct shrink_control *sc)
rcu_read_lock();
for_each_process(tsk) {
struct task_struct *p;
- struct signal_struct *sig;
int oom_adj;

p = find_lock_task_mm(tsk);
if (!p)
continue;

- sig = p->signal;
- if (!sig) {
- task_unlock(p);
- continue;
- }
- oom_adj = sig->oom_adj;
+ oom_adj = p->signal->oom_adj;
if (oom_adj < min_adj) {
task_unlock(p);
continue;
--
1.7.7.6

2012-02-06 16:30:08

by Anton Vorontsov

[permalink] [raw]
Subject: [PATCH 6/6] staging: android/lowmemorykiller: Do not kill kernel threads

LMK should not try killing kernel threads.

Suggested-by: Oleg Nesterov <[email protected]>
Signed-off-by: Anton Vorontsov <[email protected]>
---
drivers/staging/android/lowmemorykiller.c | 3 +++
1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/drivers/staging/android/lowmemorykiller.c b/drivers/staging/android/lowmemorykiller.c
index 6e800d3..ae38c39 100644
--- a/drivers/staging/android/lowmemorykiller.c
+++ b/drivers/staging/android/lowmemorykiller.c
@@ -138,6 +138,9 @@ static int lowmem_shrink(struct shrinker *s, struct shrink_control *sc)
struct task_struct *p;
int oom_adj;

+ if (tsk->flags & PF_KTHREAD)
+ continue;
+
p = find_lock_task_mm(tsk);
if (!p)
continue;
--
1.7.7.6

2012-02-06 16:37:11

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH 1/6] oom: Make find_lock_task_mm() sparse-aware

On Mon, Feb 06, 2012 at 08:29:30PM +0400, Anton Vorontsov wrote:
> This is needed so that callers would not get 'context imbalance'
> warnings from the sparse tool.
>
> As a side effect, this patch fixes the following sparse warnings:
>
> CHECK mm/oom_kill.c
> mm/oom_kill.c:201:28: warning: context imbalance in 'oom_badness' -
> unexpected unlock
> include/linux/rcupdate.h:249:30: warning: context imbalance in
> 'dump_tasks' - unexpected unlock
> mm/oom_kill.c:453:9: warning: context imbalance in 'oom_kill_task' -
> unexpected unlock
> CHECK mm/memcontrol.c
> ...
> mm/memcontrol.c:1130:17: warning: context imbalance in
> 'task_in_mem_cgroup' - unexpected unlock
>
> Signed-off-by: Anton Vorontsov <[email protected]>
> ---
> include/linux/oom.h | 11 ++++++++++-
> mm/oom_kill.c | 2 +-
> 2 files changed, 11 insertions(+), 2 deletions(-)
>
> diff --git a/include/linux/oom.h b/include/linux/oom.h
> index 552fba9..e64bfd2 100644
> --- a/include/linux/oom.h
> +++ b/include/linux/oom.h
> @@ -21,6 +21,7 @@
>
> #ifdef __KERNEL__
>
> +#include <linux/compiler.h>
> #include <linux/sched.h>
> #include <linux/types.h>
> #include <linux/nodemask.h>
> @@ -65,7 +66,15 @@ static inline void oom_killer_enable(void)
> oom_killer_disabled = false;
> }
>
> -extern struct task_struct *find_lock_task_mm(struct task_struct *p);
> +extern struct task_struct *__find_lock_task_mm(struct task_struct *p);
> +
> +#define find_lock_task_mm(p) \
> +({ \
> + struct task_struct *__ret; \
> + \
> + __cond_lock(&(p)->alloc_lock, __ret = __find_lock_task_mm(p)); \
> + __ret; \
> +})

Please use the proper "do...while" style thing here for multi-line,
complex #defines like the rest of the kernel does so that you don't end
up debugging horrible problems later.

And I would need acks from the -mm developers before I could apply any
of this.

thanks,

greg k-h

2012-02-06 16:37:29

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH 3/6] staging: android/lowmemorykiller: Don't grab tasklist_lock

On Mon, Feb 06, 2012 at 08:29:41PM +0400, Anton Vorontsov wrote:
> Grabbing tasklist_lock has its disadvantages, i.e. it blocks
> process creation and destruction. If there are lots of processes,
> blocking doesn't sound as a great idea.
>
> For LMK, it is sufficient to surround tasks list traverse with
> rcu_read_{,un}lock().
>
> >From now on using force_sig() is not safe, as it can race with an
> already exiting task, so we use send_sig() now. As a downside, it
> won't kill PID namespace init processes, but that's not what we
> want anyway.
>
> Suggested-by: Oleg Nesterov <[email protected]>
> Signed-off-by: Anton Vorontsov <[email protected]>

Are these last 4 patches independant of the first 2 and can be taken
through the staging tree now?

thanks,

greg k-h

2012-02-06 16:42:21

by Anton Vorontsov

[permalink] [raw]
Subject: Re: [PATCH 3/6] staging: android/lowmemorykiller: Don't grab tasklist_lock

On Mon, Feb 06, 2012 at 08:36:49AM -0800, Greg KH wrote:
> On Mon, Feb 06, 2012 at 08:29:41PM +0400, Anton Vorontsov wrote:
> > Grabbing tasklist_lock has its disadvantages, i.e. it blocks
> > process creation and destruction. If there are lots of processes,
> > blocking doesn't sound as a great idea.
> >
> > For LMK, it is sufficient to surround tasks list traverse with
> > rcu_read_{,un}lock().
> >
> > >From now on using force_sig() is not safe, as it can race with an
> > already exiting task, so we use send_sig() now. As a downside, it
> > won't kill PID namespace init processes, but that's not what we
> > want anyway.
> >
> > Suggested-by: Oleg Nesterov <[email protected]>
> > Signed-off-by: Anton Vorontsov <[email protected]>
>
> Are these last 4 patches independant of the first 2 and can be taken
> through the staging tree now?

Yep, without the first two there is just a bit of sparse warnings.
Not a big deal.

Thanks,

--
Anton Vorontsov
Email: [email protected]

2012-02-06 18:59:14

by Anton Vorontsov

[permalink] [raw]
Subject: Re: [PATCH 1/6] oom: Make find_lock_task_mm() sparse-aware

On Mon, Feb 06, 2012 at 08:35:42AM -0800, Greg KH wrote:
[...]
> > -extern struct task_struct *find_lock_task_mm(struct task_struct *p);
> > +extern struct task_struct *__find_lock_task_mm(struct task_struct *p);
> > +
> > +#define find_lock_task_mm(p) \
> > +({ \
> > + struct task_struct *__ret; \
> > + \
> > + __cond_lock(&(p)->alloc_lock, __ret = __find_lock_task_mm(p)); \
> > + __ret; \
> > +})
>
> Please use the proper "do...while" style thing here for multi-line,
> complex #defines like the rest of the kernel does so that you don't end
> up debugging horrible problems later.

Unfortunately this isn't possible in this case. Unlike '({})' GCC
extension, do-while statement does not evaluate to a value, i.e.
'x = do { 123; } while (0);' is illegal.

Thanks,

--
Anton Vorontsov
Email: [email protected]

2012-02-06 19:20:01

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH 1/6] oom: Make find_lock_task_mm() sparse-aware

On Mon, Feb 06, 2012 at 10:59:09PM +0400, Anton Vorontsov wrote:
> On Mon, Feb 06, 2012 at 08:35:42AM -0800, Greg KH wrote:
> [...]
> > > -extern struct task_struct *find_lock_task_mm(struct task_struct *p);
> > > +extern struct task_struct *__find_lock_task_mm(struct task_struct *p);
> > > +
> > > +#define find_lock_task_mm(p) \
> > > +({ \
> > > + struct task_struct *__ret; \
> > > + \
> > > + __cond_lock(&(p)->alloc_lock, __ret = __find_lock_task_mm(p)); \
> > > + __ret; \
> > > +})
> >
> > Please use the proper "do...while" style thing here for multi-line,
> > complex #defines like the rest of the kernel does so that you don't end
> > up debugging horrible problems later.
>
> Unfortunately this isn't possible in this case. Unlike '({})' GCC
> extension, do-while statement does not evaluate to a value, i.e.
> 'x = do { 123; } while (0);' is illegal.

Ah, you are right, my bad, sorry about that.

greg k-h

2012-02-06 21:27:54

by KOSAKI Motohiro

[permalink] [raw]
Subject: Re: [PATCH 1/6] oom: Make find_lock_task_mm() sparse-aware

2012/2/6 Greg KH <[email protected]>:
> On Mon, Feb 06, 2012 at 10:59:09PM +0400, Anton Vorontsov wrote:
>> On Mon, Feb 06, 2012 at 08:35:42AM -0800, Greg KH wrote:
>> [...]
>> > > -extern struct task_struct *find_lock_task_mm(struct task_struct *p);
>> > > +extern struct task_struct *__find_lock_task_mm(struct task_struct *p);
>> > > +
>> > > +#define find_lock_task_mm(p) ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? \
>> > > +({ ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? \
>> > > + struct task_struct *__ret; ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?\
>> > > + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? \
>> > > + __cond_lock(&(p)->alloc_lock, __ret = __find_lock_task_mm(p)); ?\
>> > > + __ret; ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?\
>> > > +})
>> >
>> > Please use the proper "do...while" style thing here for multi-line,
>> > complex #defines like the rest of the kernel does so that you don't end
>> > up debugging horrible problems later.
>>
>> Unfortunately this isn't possible in this case. Unlike '({})' GCC
>> extension, do-while statement does not evaluate to a value, i.e.
>> 'x = do { 123; } while (0);' is illegal.
>
> Ah, you are right, my bad, sorry about that.
>
> greg k-h

Some __cond_lock() caller are inline functions. Is this bad? inline function
is always recommended than macros.

2012-02-06 21:33:48

by KOSAKI Motohiro

[permalink] [raw]
Subject: Re: [PATCH 2/6] oom: Get rid of sparse warnings

2012/2/6 Anton Vorontsov <[email protected]>:
> Sparse flood makes it hard to catch newly-introduced warnings. So let's
> fix the the sparse warnings in the oom killer:
>
> ?CHECK ? mm/oom_kill.c
> ?mm/oom_kill.c:139:20: warning: context imbalance in
> ? ? ? ? ?'__find_lock_task_mm' - wrong count at exit
> ?mm/oom_kill.c:771:9: warning: context imbalance in 'out_of_memory' -
> ? ? ? ? ?different lock contexts for basic block
>
> The first one is fixed by assuring sparse that we know that we exit
> with the lock held.
>
> The second one is caused by the fact that sparse isn't smart enough
> to handle noreturn attribute.
>
> Signed-off-by: Anton Vorontsov <[email protected]>
> ---
> ?mm/oom_kill.c | ? ?9 ++++++++-
> ?1 files changed, 8 insertions(+), 1 deletions(-)
>
> diff --git a/mm/oom_kill.c b/mm/oom_kill.c
> index 0ebb383..49569b6 100644
> --- a/mm/oom_kill.c
> +++ b/mm/oom_kill.c
> @@ -142,8 +142,14 @@ struct task_struct *__find_lock_task_mm(struct task_struct *p)
>
> ? ? ? ?do {
> ? ? ? ? ? ? ? ?task_lock(t);
> - ? ? ? ? ? ? ? if (likely(t->mm))
> + ? ? ? ? ? ? ? if (likely(t->mm)) {
> + ? ? ? ? ? ? ? ? ? ? ? /*
> + ? ? ? ? ? ? ? ? ? ? ? ?* Shut up sparse: we do know that we exit w/ the
> + ? ? ? ? ? ? ? ? ? ? ? ?* task locked.
> + ? ? ? ? ? ? ? ? ? ? ? ?*/
> + ? ? ? ? ? ? ? ? ? ? ? __release(&t->alloc_loc);

task struct only have allock_lock, not alloc_loc. Moreover we don't release
the lock in this code path. Seems odd.



> ? ? ? ? ? ? ? ? ? ? ? ?return t;
> + ? ? ? ? ? ? ? }
> ? ? ? ? ? ? ? ?task_unlock(t);
> ? ? ? ?} while_each_thread(p, t);
>
> @@ -766,6 +772,7 @@ retry:
> ? ? ? ? ? ? ? ?dump_header(NULL, gfp_mask, order, NULL, mpol_mask);
> ? ? ? ? ? ? ? ?read_unlock(&tasklist_lock);
> ? ? ? ? ? ? ? ?panic("Out of memory and no killable processes...\n");
> + ? ? ? ? ? ? ? return;
> ? ? ? ?}
>
> ? ? ? ?if (oom_kill_process(p, gfp_mask, order, points, totalpages, NULL,
> --
> 1.7.7.6
>

2012-02-06 21:37:00

by KOSAKI Motohiro

[permalink] [raw]
Subject: Re: [PATCH 4/6] staging: android/lowmemorykiller: Better mm handling

> ?static int lowmem_shrink(struct shrinker *s, struct shrink_control *sc)
> ?{
> - ? ? ? struct task_struct *p;
> + ? ? ? struct task_struct *tsk;
> ? ? ? ?struct task_struct *selected = NULL;
> ? ? ? ?int rem = 0;
> ? ? ? ?int tasksize;
> @@ -134,15 +134,17 @@ static int lowmem_shrink(struct shrinker *s, struct shrink_control *sc)
> ? ? ? ?selected_oom_adj = min_adj;
>
> ? ? ? ?rcu_read_lock();
> - ? ? ? for_each_process(p) {
> - ? ? ? ? ? ? ? struct mm_struct *mm;
> + ? ? ? for_each_process(tsk) {
> + ? ? ? ? ? ? ? struct task_struct *p;
> ? ? ? ? ? ? ? ?struct signal_struct *sig;
> ? ? ? ? ? ? ? ?int oom_adj;
>
> - ? ? ? ? ? ? ? task_lock(p);
> - ? ? ? ? ? ? ? mm = p->mm;
> + ? ? ? ? ? ? ? p = find_lock_task_mm(tsk);
> + ? ? ? ? ? ? ? if (!p)
> + ? ? ? ? ? ? ? ? ? ? ? continue;
> +
> ? ? ? ? ? ? ? ?sig = p->signal;
> - ? ? ? ? ? ? ? if (!mm || !sig) {
> + ? ? ? ? ? ? ? if (!sig) {
> ? ? ? ? ? ? ? ? ? ? ? ?task_unlock(p);
> ? ? ? ? ? ? ? ? ? ? ? ?continue;
> ? ? ? ? ? ? ? ?}
> @@ -151,7 +153,7 @@ static int lowmem_shrink(struct shrinker *s, struct shrink_control *sc)
> ? ? ? ? ? ? ? ? ? ? ? ?task_unlock(p);
> ? ? ? ? ? ? ? ? ? ? ? ?continue;
> ? ? ? ? ? ? ? ?}
> - ? ? ? ? ? ? ? tasksize = get_mm_rss(mm);
> + ? ? ? ? ? ? ? tasksize = get_mm_rss(p->mm);
> ? ? ? ? ? ? ? ?task_unlock(p);
> ? ? ? ? ? ? ? ?if (tasksize <= 0)
> ? ? ? ? ? ? ? ? ? ? ? ?continue;

ack.

2012-02-06 21:38:22

by KOSAKI Motohiro

[permalink] [raw]
Subject: Re: [PATCH 5/6] staging: android/lowmemorykiller: No need for task->signal check

> ?drivers/staging/android/lowmemorykiller.c | ? ?8 +-------
> ?1 files changed, 1 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/staging/android/lowmemorykiller.c b/drivers/staging/android/lowmemorykiller.c
> index 0755e2f..6e800d3 100644
> --- a/drivers/staging/android/lowmemorykiller.c
> +++ b/drivers/staging/android/lowmemorykiller.c
> @@ -136,19 +136,13 @@ static int lowmem_shrink(struct shrinker *s, struct shrink_control *sc)
> ? ? ? ?rcu_read_lock();
> ? ? ? ?for_each_process(tsk) {
> ? ? ? ? ? ? ? ?struct task_struct *p;
> - ? ? ? ? ? ? ? struct signal_struct *sig;
> ? ? ? ? ? ? ? ?int oom_adj;
>
> ? ? ? ? ? ? ? ?p = find_lock_task_mm(tsk);
> ? ? ? ? ? ? ? ?if (!p)
> ? ? ? ? ? ? ? ? ? ? ? ?continue;
>
> - ? ? ? ? ? ? ? sig = p->signal;
> - ? ? ? ? ? ? ? if (!sig) {
> - ? ? ? ? ? ? ? ? ? ? ? task_unlock(p);
> - ? ? ? ? ? ? ? ? ? ? ? continue;
> - ? ? ? ? ? ? ? }
> - ? ? ? ? ? ? ? oom_adj = sig->oom_adj;
> + ? ? ? ? ? ? ? oom_adj = p->signal->oom_adj;
> ? ? ? ? ? ? ? ?if (oom_adj < min_adj) {
> ? ? ? ? ? ? ? ? ? ? ? ?task_unlock(p);
> ? ? ? ? ? ? ? ? ? ? ? ?continue;

ack.

2012-02-06 21:38:48

by KOSAKI Motohiro

[permalink] [raw]
Subject: Re: [PATCH 6/6] staging: android/lowmemorykiller: Do not kill kernel threads

2012/2/6 Anton Vorontsov <[email protected]>:
> LMK should not try killing kernel threads.
>
> Suggested-by: Oleg Nesterov <[email protected]>
> Signed-off-by: Anton Vorontsov <[email protected]>
> ---
> ?drivers/staging/android/lowmemorykiller.c | ? ?3 +++
> ?1 files changed, 3 insertions(+), 0 deletions(-)
>
> diff --git a/drivers/staging/android/lowmemorykiller.c b/drivers/staging/android/lowmemorykiller.c
> index 6e800d3..ae38c39 100644
> --- a/drivers/staging/android/lowmemorykiller.c
> +++ b/drivers/staging/android/lowmemorykiller.c
> @@ -138,6 +138,9 @@ static int lowmem_shrink(struct shrinker *s, struct shrink_control *sc)
> ? ? ? ? ? ? ? ?struct task_struct *p;
> ? ? ? ? ? ? ? ?int oom_adj;
>
> + ? ? ? ? ? ? ? if (tsk->flags & PF_KTHREAD)
> + ? ? ? ? ? ? ? ? ? ? ? continue;
> +

ack.

2012-02-07 04:20:53

by Anton Vorontsov

[permalink] [raw]
Subject: [PATCH v2 2/6] oom: Get rid of sparse warnings

Sparse flood makes it hard to catch newly-introduced warnings. So let's
fix the the sparse warnings in the oom killer:

CHECK mm/oom_kill.c
mm/oom_kill.c:139:20: warning: context imbalance in
'__find_lock_task_mm' - wrong count at exit
mm/oom_kill.c:771:9: warning: context imbalance in 'out_of_memory' -
different lock contexts for basic block

The first one is fixed by assuring sparse that we know that we exit
with the lock held.

The second one is caused by the fact that sparse isn't smart enough
to handle noreturn attribute.

Signed-off-by: Anton Vorontsov <[email protected]>
---

On Mon, Feb 06, 2012 at 04:33:26PM -0500, KOSAKI Motohiro wrote:
> 2012/2/6 Anton Vorontsov <[email protected]>:
> > Sparse flood makes it hard to catch newly-introduced warnings. So let's
> > fix the the sparse warnings in the oom killer:
> >
> >  CHECK   mm/oom_kill.c
> >  mm/oom_kill.c:139:20: warning: context imbalance in
> >          '__find_lock_task_mm' - wrong count at exit
> >  mm/oom_kill.c:771:9: warning: context imbalance in 'out_of_memory' -
> >          different lock contexts for basic block
> >
> > The first one is fixed by assuring sparse that we know that we exit
> > with the lock held.
> >
> > The second one is caused by the fact that sparse isn't smart enough
> > to handle noreturn attribute.
> >
> > Signed-off-by: Anton Vorontsov <[email protected]>
> > ---
> >  mm/oom_kill.c |    9 ++++++++-
> >  1 files changed, 8 insertions(+), 1 deletions(-)
> >
> > diff --git a/mm/oom_kill.c b/mm/oom_kill.c
> > index 0ebb383..49569b6 100644
> > --- a/mm/oom_kill.c
> > +++ b/mm/oom_kill.c
> > @@ -142,8 +142,14 @@ struct task_struct *__find_lock_task_mm(struct task_struct *p)
> >
> >        do {
> >                task_lock(t);
> > -               if (likely(t->mm))
> > +               if (likely(t->mm)) {
> > +                       /*
> > +                        * Shut up sparse: we do know that we exit w/ the
> > +                        * task locked.
> > +                        */
> > +                       __release(&t->alloc_loc);
>
> task struct only have allock_lock, not alloc_loc.

Funnily, but sparse does not care. :-) __release(foo) will work as
well. Seems like sparse counts locking balance globally.

This is now fixed in the patch down below, thanks for catching.

> Moreover we don't release
> the lock in this code path. Seems odd.

Indeed. That's exactly what sparse seeing is as well. We exit
without releasing the lock, which is bad (in sparse' eyes). So
we lie to sparse, telling it that we do release, so it shut ups.

Usually, we annotate functions with __releases() and __acquires()
when when functions releases/acquires one of its arguments, but in
this case the function returns a locked value, and it seems that
there's no way to properly annotate this case.

mm/oom_kill.c | 9 ++++++++-
1 files changed, 8 insertions(+), 1 deletions(-)

diff --git a/mm/oom_kill.c b/mm/oom_kill.c
index 0ebb383..1914367 100644
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -142,8 +142,14 @@ struct task_struct *__find_lock_task_mm(struct task_struct *p)

do {
task_lock(t);
- if (likely(t->mm))
+ if (likely(t->mm)) {
+ /*
+ * Shut up sparse: we do know that we exit w/ the
+ * task locked.
+ */
+ __release(&t->alloc_lock);
return t;
+ }
task_unlock(t);
} while_each_thread(p, t);

@@ -766,6 +772,7 @@ retry:
dump_header(NULL, gfp_mask, order, NULL, mpol_mask);
read_unlock(&tasklist_lock);
panic("Out of memory and no killable processes...\n");
+ return;
}

if (oom_kill_process(p, gfp_mask, order, points, totalpages, NULL,
--
1.7.7.6

2012-02-07 04:48:21

by Anton Vorontsov

[permalink] [raw]
Subject: [PATCH v2 1/6] oom: Make find_lock_task_mm() sparse-aware

This is needed so that callers would not get 'context imbalance'
warnings from the sparse tool.

As a side effect, this patch fixes the following sparse warnings:

CHECK mm/oom_kill.c
mm/oom_kill.c:201:28: warning: context imbalance in 'oom_badness' -
unexpected unlock
include/linux/rcupdate.h:249:30: warning: context imbalance in
'dump_tasks' - unexpected unlock
mm/oom_kill.c:453:9: warning: context imbalance in 'oom_kill_task' -
unexpected unlock
CHECK mm/memcontrol.c
...
mm/memcontrol.c:1130:17: warning: context imbalance in
'task_in_mem_cgroup' - unexpected unlock

Signed-off-by: Anton Vorontsov <[email protected]>
---

On Mon, Feb 06, 2012 at 04:27:32PM -0500, KOSAKI Motohiro wrote:
[...]
> >> Unfortunately this isn't possible in this case. Unlike '({})' GCC
> >> extension, do-while statement does not evaluate to a value, i.e.
> >> 'x = do { 123; } while (0);' is illegal.
> >
> > Ah, you are right, my bad, sorry about that.
> >
> > greg k-h
>
> Some __cond_lock() caller are inline functions. Is this bad?

No, that's great, actually. :-) Not obvious, but seems like
sparse understands __cond_lock in inline functions, so I'd
better use it.

Thanks,

include/linux/oom.h | 12 +++++++++++-
mm/oom_kill.c | 2 +-
2 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/include/linux/oom.h b/include/linux/oom.h
index 552fba9..7c8946a 100644
--- a/include/linux/oom.h
+++ b/include/linux/oom.h
@@ -21,6 +21,7 @@

#ifdef __KERNEL__

+#include <linux/compiler.h>
#include <linux/sched.h>
#include <linux/types.h>
#include <linux/nodemask.h>
@@ -65,7 +66,16 @@ static inline void oom_killer_enable(void)
oom_killer_disabled = false;
}

-extern struct task_struct *find_lock_task_mm(struct task_struct *p);
+extern struct task_struct *__find_lock_task_mm(struct task_struct *p);
+
+static inline struct task_struct *find_lock_task_mm(struct task_struct *p)
+{
+ struct task_struct *ret;
+
+ ret = __find_lock_task_mm(p);
+ (void)__cond_lock(&p->alloc_lock, ret);
+ return ret;
+}

/* sysctls */
extern int sysctl_oom_dump_tasks;
diff --git a/mm/oom_kill.c b/mm/oom_kill.c
index 2958fd8..0ebb383 100644
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -136,7 +136,7 @@ static bool has_intersects_mems_allowed(struct task_struct *tsk,
* pointer. Return p, or any of its subthreads with a valid ->mm, with
* task_lock() held.
*/
-struct task_struct *find_lock_task_mm(struct task_struct *p)
+struct task_struct *__find_lock_task_mm(struct task_struct *p)
{
struct task_struct *t = p;

--
1.7.7.6

2012-02-07 05:18:04

by KOSAKI Motohiro

[permalink] [raw]
Subject: Re: [PATCH v2 1/6] oom: Make find_lock_task_mm() sparse-aware

>> Some __cond_lock() caller are inline functions. Is this bad?
>
> No, that's great, actually. :-) Not obvious, but seems like
> sparse understands __cond_lock in inline functions, so I'd
> better use it.
>
> Thanks,

Great. So, to this version

Acked-by: KOSAKI Motohiro <[email protected]>

2012-02-07 05:39:20

by KOSAKI Motohiro

[permalink] [raw]
Subject: Re: [PATCH v2 2/6] oom: Get rid of sparse warnings

>> task struct only have allock_lock, not alloc_loc.
>
> Funnily, but sparse does not care. :-) __release(foo) will work as
> well. Seems like sparse counts locking balance globally.
>
> This is now fixed in the patch down below, thanks for catching.
>
>> Moreover we don't release
>> the lock in this code path. Seems odd.
>
> Indeed. That's exactly what sparse seeing is as well. We exit
> without releasing the lock, which is bad (in sparse' eyes). So
> we lie to sparse, telling it that we do release, so it shut ups.

Hmmm....

To be honest, I really dislike any lie annotation. Why? It is very
fragile and easily
become broken from unrelated but near line changes. Please consider to
enhance sparse at first.

2012-02-07 06:23:59

by Anton Vorontsov

[permalink] [raw]
Subject: Re: [PATCH v2 2/6] oom: Get rid of sparse warnings

On Tue, Feb 07, 2012 at 12:38:58AM -0500, KOSAKI Motohiro wrote:
> >> task struct only have allock_lock, not alloc_loc.
> >
> > Funnily, but sparse does not care. :-) __release(foo) will work as
> > well. Seems like sparse counts locking balance globally.
> >
> > This is now fixed in the patch down below, thanks for catching.
> >
> >> Moreover we don't release
> >> the lock in this code path. Seems odd.
> >
> > Indeed. That's exactly what sparse seeing is as well. We exit
> > without releasing the lock, which is bad (in sparse' eyes). So
> > we lie to sparse, telling it that we do release, so it shut ups.
>
> Hmmm....
>
> To be honest, I really dislike any lie annotation. Why? It is very
> fragile and easily
> become broken from unrelated but near line changes. Please consider to
> enhance sparse at first.

I somewhat doubt that it is possible to "enhance it". We keep the
lock held conditionaly, so we need too place the hint in the
code itself. I believe the best we can do would be something like
this:

diff --git a/include/linux/compiler.h b/include/linux/compiler.h
index 4a24354..61d91f2 100644
--- a/include/linux/compiler.h
+++ b/include/linux/compiler.h
@@ -14,6 +14,7 @@
# define __releases(x) __attribute__((context(x,1,0)))
# define __acquire(x) __context__(x,1)
# define __release(x) __context__(x,-1)
+# define __ret_with_lock(x) __context__(x,-1)
# define __cond_lock(x,c) ((c) ? ({ __acquire(x); 1; }) : 0)
# define __percpu __attribute__((noderef, address_space(3)))
#ifdef CONFIG_SPARSE_RCU_POINTER
@@ -37,6 +38,7 @@ extern void __chk_io_ptr(const volatile void __iomem *);
# define __releases(x)
# define __acquire(x) (void)0
# define __release(x) (void)0
+# define __ret_with_lock(x)
# define __cond_lock(x,c) (c)
# define __percpu
# define __rcu

And then use it instead of __release().

--
Anton Vorontsov
Email: [email protected]

2012-02-08 15:34:29

by Oleg Nesterov

[permalink] [raw]
Subject: Re: [PATCH v2 1/6] oom: Make find_lock_task_mm() sparse-aware

On 02/07, Anton Vorontsov wrote:
>
> On Mon, Feb 06, 2012 at 04:27:32PM -0500, KOSAKI Motohiro wrote:
> >
> > Some __cond_lock() caller are inline functions. Is this bad?
>
> No, that's great, actually. :-) Not obvious, but seems like
> sparse understands __cond_lock in inline functions, so I'd
> better use it.

Hmm, great...

may be you can update lock_task_sighand() too ? (in a separate
patch of course).

Oleg.

2012-02-08 15:38:52

by Oleg Nesterov

[permalink] [raw]
Subject: Re: [PATCH 3/6] staging: android/lowmemorykiller: Don't grab tasklist_lock

On 02/06, Anton Vorontsov wrote:
>
> Grabbing tasklist_lock has its disadvantages, i.e. it blocks
> process creation and destruction. If there are lots of processes,
> blocking doesn't sound as a great idea.
>
> For LMK, it is sufficient to surround tasks list traverse with
> rcu_read_{,un}lock().
>
> From now on using force_sig() is not safe, as it can race with an
> already exiting task, so we use send_sig() now. As a downside, it
> won't kill PID namespace init processes, but that's not what we
> want anyway.

Reviewed-by: Oleg Nesterov <[email protected]>

2012-02-08 15:41:44

by Oleg Nesterov

[permalink] [raw]
Subject: Re: [PATCH 4/6] staging: android/lowmemorykiller: Better mm handling

On 02/06, Anton Vorontsov wrote:
>
> LMK should not directly check for task->mm. The reason is that the
> process' threads may exit or detach its mm via use_mm(), but other
> threads may still have a valid mm. To catch this we use
> find_lock_task_mm(), which walks up all threads and returns an
> appropriate task (with lock held).

Reviewed-by: Oleg Nesterov <[email protected]>

2012-02-08 15:42:08

by Oleg Nesterov

[permalink] [raw]
Subject: Re: [PATCH 5/6] staging: android/lowmemorykiller: No need for task->signal check

On 02/06, Anton Vorontsov wrote:
>
> task->signal == NULL is not possible, so no need for these checks.

Reviewed-by: Oleg Nesterov <[email protected]>

2012-02-08 15:43:08

by Oleg Nesterov

[permalink] [raw]
Subject: Re: [PATCH 6/6] staging: android/lowmemorykiller: Do not kill kernel threads

On 02/06, Anton Vorontsov wrote:
>
> LMK should not try killing kernel threads.

Reviewed-by: Oleg Nesterov <[email protected]>

2012-02-08 18:13:38

by KOSAKI Motohiro

[permalink] [raw]
Subject: Re: [PATCH v2 2/6] oom: Get rid of sparse warnings

(2/7/12 1:23 AM), Anton Vorontsov wrote:
> On Tue, Feb 07, 2012 at 12:38:58AM -0500, KOSAKI Motohiro wrote:
>>>> task struct only have allock_lock, not alloc_loc.
>>>
>>> Funnily, but sparse does not care. :-) __release(foo) will work as
>>> well. Seems like sparse counts locking balance globally.
>>>
>>> This is now fixed in the patch down below, thanks for catching.
>>>
>>>> Moreover we don't release
>>>> the lock in this code path. Seems odd.
>>>
>>> Indeed. That's exactly what sparse seeing is as well. We exit
>>> without releasing the lock, which is bad (in sparse' eyes). So
>>> we lie to sparse, telling it that we do release, so it shut ups.
>>
>> Hmmm....
>>
>> To be honest, I really dislike any lie annotation. Why? It is very
>> fragile and easily
>> become broken from unrelated but near line changes. Please consider to
>> enhance sparse at first.
>
> I somewhat doubt that it is possible to "enhance it". We keep the
> lock held conditionaly, so we need too place the hint in the
> code itself. I believe the best we can do would be something like
> this:
>
> diff --git a/include/linux/compiler.h b/include/linux/compiler.h
> index 4a24354..61d91f2 100644
> --- a/include/linux/compiler.h
> +++ b/include/linux/compiler.h
> @@ -14,6 +14,7 @@
> # define __releases(x) __attribute__((context(x,1,0)))
> # define __acquire(x) __context__(x,1)
> # define __release(x) __context__(x,-1)
> +# define __ret_with_lock(x) __context__(x,-1)
> # define __cond_lock(x,c) ((c) ? ({ __acquire(x); 1; }) : 0)
> # define __percpu __attribute__((noderef, address_space(3)))
> #ifdef CONFIG_SPARSE_RCU_POINTER
> @@ -37,6 +38,7 @@ extern void __chk_io_ptr(const volatile void __iomem *);
> # define __releases(x)
> # define __acquire(x) (void)0
> # define __release(x) (void)0
> +# define __ret_with_lock(x)
> # define __cond_lock(x,c) (c)
> # define __percpu
> # define __rcu
>
> And then use it instead of __release().

Hmmm...

I still dislike this lie annotation. But maybe it's better than nothing
(dunno, just guess). Go ahead.





2012-02-09 00:56:35

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH 3/6] staging: android/lowmemorykiller: Don't grab tasklist_lock

On Mon, Feb 06, 2012 at 08:42:15PM +0400, Anton Vorontsov wrote:
> On Mon, Feb 06, 2012 at 08:36:49AM -0800, Greg KH wrote:
> > On Mon, Feb 06, 2012 at 08:29:41PM +0400, Anton Vorontsov wrote:
> > > Grabbing tasklist_lock has its disadvantages, i.e. it blocks
> > > process creation and destruction. If there are lots of processes,
> > > blocking doesn't sound as a great idea.
> > >
> > > For LMK, it is sufficient to surround tasks list traverse with
> > > rcu_read_{,un}lock().
> > >
> > > >From now on using force_sig() is not safe, as it can race with an
> > > already exiting task, so we use send_sig() now. As a downside, it
> > > won't kill PID namespace init processes, but that's not what we
> > > want anyway.
> > >
> > > Suggested-by: Oleg Nesterov <[email protected]>
> > > Signed-off-by: Anton Vorontsov <[email protected]>
> >
> > Are these last 4 patches independant of the first 2 and can be taken
> > through the staging tree now?
>
> Yep, without the first two there is just a bit of sparse warnings.
> Not a big deal.

Ok, I'll take the last 4, the first 2 needs to go through some other
tree (-mm?)

thanks,

greg k-h

2012-02-09 16:45:25

by Anton Vorontsov

[permalink] [raw]
Subject: [PATCH] sched: Turn lock_task_sighand() into a static inline

It appears that sparse tool understands static inline functions for
context balance checking, so let's turn the macros into an inline func.

This makes the code a little bit more robust.

Suggested-by: Oleg Nesterov <[email protected]>
Signed-off-by: Anton Vorontsov <[email protected]>
---

On Wed, Feb 08, 2012 at 04:27:50PM +0100, Oleg Nesterov wrote:
> On 02/07, Anton Vorontsov wrote:
> >
> > On Mon, Feb 06, 2012 at 04:27:32PM -0500, KOSAKI Motohiro wrote:
> > >
> > > Some __cond_lock() caller are inline functions. Is this bad?
> >
> > No, that's great, actually. :-) Not obvious, but seems like
> > sparse understands __cond_lock in inline functions, so I'd
> > better use it.
>
> Hmm, great...
>
> may be you can update lock_task_sighand() too ? (in a separate
> patch of course).

Sure thing. Here it goes...

include/linux/sched.h | 15 +++++++++------
1 files changed, 9 insertions(+), 6 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index e82f721..22ae10e 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -2394,12 +2394,15 @@ static inline void task_unlock(struct task_struct *p)
extern struct sighand_struct *__lock_task_sighand(struct task_struct *tsk,
unsigned long *flags);

-#define lock_task_sighand(tsk, flags) \
-({ struct sighand_struct *__ss; \
- __cond_lock(&(tsk)->sighand->siglock, \
- (__ss = __lock_task_sighand(tsk, flags))); \
- __ss; \
-}) \
+static inline struct sighand_struct *lock_task_sighand(struct task_struct *tsk,
+ unsigned long *flags)
+{
+ struct sighand_struct *ret;
+
+ ret = __lock_task_sighand(tsk, flags);
+ (void)__cond_lock(&tsk->sighand->siglock, ret);
+ return ret;
+}

static inline void unlock_task_sighand(struct task_struct *tsk,
unsigned long *flags)
--
1.7.8.3

2012-02-11 23:11:15

by Anton Vorontsov

[permalink] [raw]
Subject: [tip:sched/core] sched: Turn lock_task_sighand() into a static inline

Commit-ID: 9388dc3047a88bedfd867e9ba3e1980c815ac524
Gitweb: http://git.kernel.org/tip/9388dc3047a88bedfd867e9ba3e1980c815ac524
Author: Anton Vorontsov <[email protected]>
AuthorDate: Thu, 9 Feb 2012 20:45:19 +0400
Committer: Ingo Molnar <[email protected]>
CommitDate: Sat, 11 Feb 2012 15:41:33 +0100

sched: Turn lock_task_sighand() into a static inline

It appears that sparse tool understands static inline functions
for context balance checking, so let's turn the macros into an
inline func.

This makes the code a little bit more robust.

Suggested-by: Oleg Nesterov <[email protected]>
Signed-off-by: Anton Vorontsov <[email protected]>
Cc: KOSAKI Motohiro <[email protected]>
Cc: Greg KH <[email protected]>
Cc: Arve <[email protected]>
Cc: San Mehat <[email protected]>
Cc: Colin Cross <[email protected]>
Cc: Eric W. Biederman <[email protected]>
Cc: Paul E. McKenney <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
---
include/linux/sched.h | 15 +++++++++------
1 files changed, 9 insertions(+), 6 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 92313a3f..5dba2ad 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -2393,12 +2393,15 @@ static inline void task_unlock(struct task_struct *p)
extern struct sighand_struct *__lock_task_sighand(struct task_struct *tsk,
unsigned long *flags);

-#define lock_task_sighand(tsk, flags) \
-({ struct sighand_struct *__ss; \
- __cond_lock(&(tsk)->sighand->siglock, \
- (__ss = __lock_task_sighand(tsk, flags))); \
- __ss; \
-}) \
+static inline struct sighand_struct *lock_task_sighand(struct task_struct *tsk,
+ unsigned long *flags)
+{
+ struct sighand_struct *ret;
+
+ ret = __lock_task_sighand(tsk, flags);
+ (void)__cond_lock(&tsk->sighand->siglock, ret);
+ return ret;
+}

static inline void unlock_task_sighand(struct task_struct *tsk,
unsigned long *flags)

2012-02-15 13:53:14

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [tip:sched/core] sched: Turn lock_task_sighand() into a static inline

On Sat, 2012-02-11 at 15:10 -0800, tip-bot for Anton Vorontsov wrote:
> +static inline struct sighand_struct *lock_task_sighand(struct task_struct *tsk,
> + unsigned long *flags)
> +{
> + struct sighand_struct *ret;
> +
> + ret = __lock_task_sighand(tsk, flags);
> + (void)__cond_lock(&tsk->sighand->siglock, ret);
> + return ret;
> +}

FWIW, I still really utterly detest __cond_lock(). Ideally someone would
teach sparse the conditional based on return thing and we could do the
function level annotation.