2007-06-05 17:48:56

by Will Schmidt

Subject: [PATCH 1/3] [PATCH i386] during VM oom condition, kill all threads in process group


When we get into a state where VM has run out of memory, and it's time to
thwack a process, we should take out the entire process group, rather than
just one thread.

Tested on i386

Signed-off-by: Will Schmidt <[email protected]>
---

arch/i386/mm/fault.c | 4 +++-
1 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/arch/i386/mm/fault.c b/arch/i386/mm/fault.c
index b8c4e25..82aec0e 100644
--- a/arch/i386/mm/fault.c
+++ b/arch/i386/mm/fault.c
@@ -567,8 +567,10 @@ out_of_memory:
 		goto survive;
 	}
 	printk("VM: killing process %s\n", tsk->comm);
-	if (error_code & 4)
+	if (error_code & 4) {
+		zap_other_threads(tsk);
 		do_exit(SIGKILL);
+	}
 	goto no_context;

 do_sigbus:



2007-06-05 17:49:22

by Will Schmidt

Subject: [PATCH 2/3] [PATCH powerpc] during VM oom condition, kill all threads in process group


When we get into a state where VM has run out of memory, and it's time to
thwack a process, we should take out the entire process group, rather than
just one thread.

Tested on POWER5.

Signed-off-by: Will Schmidt <[email protected]>
---

arch/powerpc/mm/fault.c | 4 +++-
1 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
index 03aeb3a..9afe871 100644
--- a/arch/powerpc/mm/fault.c
+++ b/arch/powerpc/mm/fault.c
@@ -392,8 +392,10 @@ out_of_memory:
 		goto survive;
 	}
 	printk("VM: killing process %s\n", current->comm);
-	if (user_mode(regs))
+	if (user_mode(regs)) {
+		zap_other_threads(current);
 		do_exit(SIGKILL);
+	}
 	return SIGKILL;

 do_sigbus:


2007-06-05 17:49:36

by Will Schmidt

Subject: [PATCH 3/3] [PATCH x86_64] during VM oom condition, kill all threads in process group


When we get into a state where VM has run out of memory, and it's time to
thwack a process, we should take out the entire process group, rather than
just one thread.

Signed-off-by: Will Schmidt <[email protected]>
---

arch/x86_64/mm/fault.c | 4 +++-
1 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/arch/x86_64/mm/fault.c b/arch/x86_64/mm/fault.c
index 6ada723..2a3060e 100644
--- a/arch/x86_64/mm/fault.c
+++ b/arch/x86_64/mm/fault.c
@@ -562,8 +562,10 @@ out_of_memory:
 		goto again;
 	}
 	printk("VM: killing process %s\n", tsk->comm);
-	if (error_code & 4)
+	if (error_code & 4) {
+		zap_other_threads(tsk);
 		do_exit(SIGKILL);
+	}
 	goto no_context;

 do_sigbus:


2007-06-05 18:17:32

by Will Schmidt

Subject: Re: [PATCH 2/3] [PATCH powerpc] during VM oom condition, kill all threads in process group


Whoops.. sorry about any reply bounces, I flubbed the cc to
[email protected] .

-Will

On Tue, 2007-05-06 at 12:48 -0500, Will Schmidt wrote:
> When we get into a state where VM has run out of memory, and it's time to
> thwack a process, we should take out the entire process group, rather than
> just one thread.
>
> Tested on POWER5.
>
> Signed-off-by: Will Schmidt <[email protected]>
> ---
>
> arch/powerpc/mm/fault.c | 4 +++-
> 1 files changed, 3 insertions(+), 1 deletions(-)
>
> diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
> index 03aeb3a..9afe871 100644
> --- a/arch/powerpc/mm/fault.c
> +++ b/arch/powerpc/mm/fault.c
> @@ -392,8 +392,10 @@ out_of_memory:
> goto survive;
> }
> printk("VM: killing process %s\n", current->comm);
> - if (user_mode(regs))
> + if (user_mode(regs)) {
> + zap_other_threads(current);
> do_exit(SIGKILL);
> + }
> return SIGKILL;
>
> do_sigbus:
>
>

2007-06-07 22:35:46

by Andrew Morton

Subject: Re: [PATCH 1/3] [PATCH i386] during VM oom condition, kill all threads in process group

On Tue, 05 Jun 2007 12:48:32 -0500
Will Schmidt <[email protected]> wrote:

>
> When we get into a state where VM has run out of memory, and it's time to
> thwack a process, we should take out the entire process group, rather than
> just one thread.
>
> Tested on i386
>
> Signed-off-by: Will Schmidt <[email protected]>
> ---
>
> arch/i386/mm/fault.c | 4 +++-
> 1 files changed, 3 insertions(+), 1 deletions(-)
>
> diff --git a/arch/i386/mm/fault.c b/arch/i386/mm/fault.c
> index b8c4e25..82aec0e 100644
> --- a/arch/i386/mm/fault.c
> +++ b/arch/i386/mm/fault.c
> @@ -567,8 +567,10 @@ out_of_memory:
> goto survive;
> }
> printk("VM: killing process %s\n", tsk->comm);
> - if (error_code & 4)
> + if (error_code & 4) {
> + zap_other_threads(tsk);
> do_exit(SIGKILL);
> + }
> goto no_context;
>

zap_other_threads() requires tasklist_lock.

If we're going to do this then we should probably create some new function
(with a better name) which takes tasklist_lock and then calls
zap_other_threads().

Does this patch fix any observed-in-the-real-world problem? If so, please
describe it.
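
As a rough illustration of the wrapper suggested above, here is a userspace model, not kernel code. Everything prefixed `model_` is a hypothetical stand-in: a pthread mutex plays the role of tasklist_lock (the model does not distinguish read and write sides), and a boolean flag plays the role of a pending SIGKILL. It only shows the shape "take the lock, zap the other threads, drop the lock".

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-in for a task; NOT the kernel's task_struct. */
struct model_task {
	struct model_task *next_thread;	/* circular list of threads in the group */
	bool sigkill_pending;		/* models a SIGKILL queued on this thread */
};

/* Stand-in for tasklist_lock (assumption: a plain mutex is enough here). */
static pthread_mutex_t model_tasklist_lock = PTHREAD_MUTEX_INITIALIZER;

/* Models zap_other_threads(): flag every thread in the group except p.
 * The caller is expected to hold model_tasklist_lock. */
static void model_zap_other_threads(struct model_task *p)
{
	struct model_task *t;

	for (t = p->next_thread; t != p; t = t->next_thread)
		t->sigkill_pending = true;
}

/* Hypothetical wrapper: takes the lock, zaps the siblings, drops the lock. */
static void model_kill_thread_group(struct model_task *p)
{
	assert(p != NULL);
	pthread_mutex_lock(&model_tasklist_lock);
	model_zap_other_threads(p);
	pthread_mutex_unlock(&model_tasklist_lock);
}
```

With three tasks linked in a ring, calling `model_kill_thread_group()` on one of them flags the other two and leaves the caller alone, since the caller goes on to exit by itself.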

2007-06-07 23:21:12

by Anton Blanchard

Subject: Re: [PATCH 1/3] [PATCH i386] during VM oom condition, kill all threads in process group


Hi,

> zap_other_threads() requires tasklist_lock.
>
> If we're going to do this then we should probably create some new function
> (with a better name) which takes tasklist_lock and then calls
> zap_other_threads().
>
> Does this patch fix any observed-in-the-real-world problem? If so, please
> describe it.

Yeah we have had complaints where threaded apps have only one thread
shot down instead of the entire process. This leaves the application in
a bad state, whereas if it had been killed cleanly the application could
have restarted.

My understanding is that fatal signals should kill all threads in the
group.

Anton

2007-06-08 00:10:41

by Andrew Morton

Subject: Re: [PATCH 1/3] [PATCH i386] during VM oom condition, kill all threads in process group

On Thu, 7 Jun 2007 18:16:21 -0500
Anton Blanchard <[email protected]> wrote:

>
> Hi,
>
> > zap_other_threads() requires tasklist_lock.
> >
> > If we're going to do this then we should probably create some new function
> > (with a better name) which takes tasklist_lock and then calls
> > zap_other_threads().
> >
> > Does this patch fix any observed-in-the-real-world problem? If so, please
> > describe it.
>
> Yeah we have had complaints where threaded apps have only one thread
> shot down instead of the entire process. This leaves the application in
> a bad state, whereas if it had been killed cleanly the application could
> have restarted.
>
> My understanding is that fatal signals should kill all threads in the
> group.
>

OK, well could we please get all that info appropriately captured in #2's
changelog?

Other architectures will probably need to implement this.

2007-06-08 19:19:31

by Will Schmidt

Subject: Re: [PATCH 1/3] [PATCH i386] during VM oom condition, kill all threads in process group

On Thu, 2007-06-07 at 17:10 -0700, Andrew Morton wrote:
> On Thu, 7 Jun 2007 18:16:21 -0500
> Anton Blanchard <[email protected]> wrote:
>
> >
> > Hi,
> >
> > > zap_other_threads() requires tasklist_lock.

Yup, I missed that. Thanks for pointing it out.

> > >
> > > If we're going to do this then we should probably create some new function
> > > (with a better name) which takes tasklist_lock and then calls
> > > zap_other_threads().

I expect this will be a write_lock_irq() since zap_other_threads will be
doing a bit more than just reading the task info.

This will be down in a do-page-fault failure path (see
arch/*/mm/fault.c). I wonder if calling write_lock is going to be safe,
or if it's possible to get into a deadlock? I.e. should I branch back up
to the survive: label if I can't take the lock? Would that even be
sufficient? Or is it not an issue here?

> > >
> > > Does this patch fix any observed-in-the-real-world problem? If so, please
> > > describe it.
> >
> > Yeah we have had complaints where threaded apps have only one thread
> > shot down instead of the entire process. This leaves the application in
> > a bad state, whereas if it had been killed cleanly the application could
> > have restarted.
> >
> > My understanding is that fatal signals should kill all threads in the
> > group.
> >
>
> OK, well could we please get all that info appropriately captured in #2's
> changelog?
Yup, next spin I'll add more to the changelog.

>
> Other architectures will probably need to implement this.

-Will

2007-06-08 19:33:21

by Andrew Morton

Subject: Re: [PATCH 1/3] [PATCH i386] during VM oom condition, kill all threads in process group

On Fri, 08 Jun 2007 14:19:18 -0500
Will Schmidt <[email protected]> wrote:

> > > > zap_other_threads() requires tasklist_lock.
>
> Yup, I missed that. Thanks for pointing it out.
>
> > > >
> > > > If we're going to do this then we should probably create some new function
> > > > (with a better name) which takes tasklist_lock and then calls
> > > > zap_other_threads().
>
> I expect this will be a write_lock_irq() since zap_other_threads will be
> doing a bit more than just reading the task info.

No, I think read_lock() will be sufficient.

In fact, it's probably the case that rcu_read_lock() is now sufficient
locking coverage for zap_other_threads() (cc's people).

It had better be, because do_group_exit() forgot to take tasklist_lock. It
is perhaps relying upon spin_lock()'s hidden rcu_read_lock() properties
without so much as a code comment, which would be somewhat nasty of it.

You could perhaps just call do_group_exit() from within the fault handler,
btw.

> This will be down in a do-page-fault failure path (see
> arch/*/mm/fault.c). I wonder if calling write_lock is going to be safe,
> or if it's possible to get into a deadlock? i.e. should I branch back up
> to the survive: label if I can't take the lock? Would that even be
> sufficient? or is it not an issue here?

You can take the lock in the fault handler. Nobody should be getting
pagefaults while holding tasklist_lock. (Well, a vmalloc fault might, but
that's a special-case which doesn't allocate memory or anything like that).

2007-06-08 21:12:49

by Will Schmidt

Subject: Re: [PATCH 1/3] [PATCH i386] during VM oom condition, kill all threads in process group

On Fri, 2007-06-08 at 12:32 -0700, Andrew Morton wrote:
> On Fri, 08 Jun 2007 14:19:18 -0500
> Will Schmidt <[email protected]> wrote:
>
> > > > > zap_other_threads() requires tasklist_lock.
> >

> In fact, it's probably the case that rcu_read_lock() is now sufficient
> locking coverage for zap_other_threads() (cc's people).
>
> It had better be, because do_group_exit() forgot to take tasklist_lock. It
> is perhaps relying upon spin_lock()'s hidden rcu_read_lock() properties
> without so much as a code comment, which would be somewhat nasty of it.

> You could perhaps just call do_group_exit() from within the fault
> handler,
> btw.

Yup, so looks like I can actually replace the existing do_exit() call
with do_group_exit(). I'll sit on this for a bit to give other folks a
chance to comment on which lock call is sufficient, read_lock() or
rcu_read_lock(), etc; and do_group_exit()'s issue with taking
tasklist_lock.

Thanks,

-Will

2007-06-08 22:50:47

by Eric W. Biederman

Subject: Re: [PATCH 1/3] [PATCH i386] during VM oom condition, kill all threads in process group

Will Schmidt <[email protected]> writes:

> On Fri, 2007-06-08 at 12:32 -0700, Andrew Morton wrote:
>> On Fri, 08 Jun 2007 14:19:18 -0500
>> Will Schmidt <[email protected]> wrote:
>>
>> > > > > zap_other_threads() requires tasklist_lock.
>> >
>
>> In fact, it's probably the case that rcu_read_lock() is now sufficient
>> locking coverage for zap_other_threads() (cc's people).
>>
>> It had better be, because do_group_exit() forgot to take tasklist_lock. It
>> is perhaps relying upon spin_lock()'s hidden rcu_read_lock() properties
>> without so much as a code comment, which would be somewhat nasty of it.
>
>> You could perhaps just call do_group_exit() from within the fault
>> handler,
>> btw.
>
> Yup, so looks like I can actually replace the existing do_exit() call
> with do_group_exit(). I'll sit on this for a bit to give other folks a
> chance to comment on which lock call is sufficient, read_lock() or
> rcu_read_lock(), etc; and do_group_exit()'s issue with taking
> tasklist_lock.

No. The rcu_read_lock is not sufficient.
Yes. sighand->siglock is enough, and we explicitly take it in
do_group_exit before calling zap_other_threads.

Unless I have completely misunderstood this thread.

Eric
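
To make the locking described above concrete, here is a simplified userspace model of the pattern: take the per-group signal lock, check whether another thread already started the group exit, and only otherwise record the exit code and zap the siblings. This is a sketch, not the kernel's do_group_exit(). All `model_` names are hypothetical, a pthread mutex stands in for sighand->siglock, and a boolean stands in for the group-exit flag.

```c
#include <assert.h>
#include <pthread.h>
#include <signal.h>	/* SIGKILL etc. for callers of the model */
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-in for one thread of a group. */
struct model_thread {
	struct model_thread *next;	/* circular list of threads in the group */
	bool sigkill_pending;		/* models a SIGKILL queued on this thread */
};

/* Userspace stand-in for the state shared by the whole thread group. */
struct model_group {
	pthread_mutex_t siglock;	/* stand-in for sighand->siglock */
	bool group_exit;		/* stand-in for a "group exit started" flag */
	int group_exit_code;
};

/* Flag every thread except self; the caller must hold g->siglock. */
static void model_zap_other_threads(struct model_thread *self)
{
	struct model_thread *t;

	for (t = self->next; t != self; t = t->next)
		t->sigkill_pending = true;
}

/* Returns the exit code the calling thread would then exit with. */
static int model_do_group_exit(struct model_group *g,
			       struct model_thread *self, int exit_code)
{
	assert(g != NULL && self != NULL);

	pthread_mutex_lock(&g->siglock);
	if (g->group_exit) {
		/* Another thread got here first: adopt its exit code. */
		exit_code = g->group_exit_code;
	} else {
		g->group_exit = true;
		g->group_exit_code = exit_code;
		model_zap_other_threads(self);
	}
	pthread_mutex_unlock(&g->siglock);

	return exit_code;
}
```

In this model the first caller decides the exit code for the whole group and flags its siblings; a later caller with a different code gets the first caller's code back, and no lock besides the group's own is involved.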

2007-06-13 15:51:32

by Oleg Nesterov

Subject: Re: [PATCH 1/3] [PATCH i386] during VM oom condition, kill all threads in process group

On 06/08, Eric W. Biederman wrote:
>
> Will Schmidt <[email protected]> writes:
>
> > On Fri, 2007-06-08 at 12:32 -0700, Andrew Morton wrote:
> >> On Fri, 08 Jun 2007 14:19:18 -0500
> >> Will Schmidt <[email protected]> wrote:
> >>
> >> > > > > zap_other_threads() requires tasklist_lock.
> >> >
> >
> >> In fact, it's probably the case that rcu_read_lock() is now sufficient
> >> locking coverage for zap_other_threads() (cc's people).
> >>
> >> It had better be, because do_group_exit() forgot to take tasklist_lock. It
> >> is perhaps relying upon spin_lock()'s hidden rcu_read_lock() properties
> >> without so much as a code comment, which would be somewhat nasty of it.
> >
> >> You could perhaps just call do_group_exit() from within the fault
> >> handler,
> >> btw.
> >
> > Yup, so looks like I can actually replace the existing do_exit() call
> > with do_group_exit(). I'll sit on this for a bit to give other folks a
> > chance to comment on which lock call is sufficient, read_lock() or
> > rcu_read_lock(), etc; and do_group_exit()'s issue with taking
> > tasklist_lock.
>
> No. The rcu_read_lock is not sufficient.
> Yes. sighand->siglock is enough, and we explicitly take it in
> do_group_exit before calling zap_other_threads.

Yes, we don't need tasklist_lock (or rcu_read_lock).

de_thread() calls zap_other_threads() under tasklist_lock, but this
is because we can change child_reaper.

Oleg.