2019-12-19 06:32:15

by chenqiwu

Subject: [PATCH v3] kernel/exit: do panic earlier to get coredump if global init task exit

From: chenqiwu <[email protected]>

When the global init task is killed, the kernel panics late in the exit
path, in do_exit()->exit_notify()->forget_original_parent()
->find_child_reaper(), once all init threads have exited.

However, it is hard to extract a coredump of the init task from a kernel
crashdump at that point, since exit_mm() has already released its mm
before the panic. To get the userspace backtrace of the init task, it is
better to panic earlier, at the beginning of the exit path.

Note that we must take care of the multi-threaded init case: the test
for is_global_init() && group_dead is needed to ensure that all threads
are exiting and not just the current thread.

Signed-off-by: chenqiwu <[email protected]>
---
changes in v3:
- move panic into group_dead condition.
- keep exitcode as the original code does.
- fix logic error for comment.
---
kernel/exit.c | 13 +++++++++----
1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/kernel/exit.c b/kernel/exit.c
index bcbd598..7271e13 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -517,10 +517,6 @@ static struct task_struct *find_child_reaper(struct task_struct *father,
 	}
 
 	write_unlock_irq(&tasklist_lock);
-	if (unlikely(pid_ns == &init_pid_ns)) {
-		panic("Attempted to kill init! exitcode=0x%08x\n",
-			father->signal->group_exit_code ?: father->exit_code);
-	}
 
 	list_for_each_entry_safe(p, n, dead, ptrace_entry) {
 		list_del_init(&p->ptrace_entry);
@@ -766,6 +762,15 @@ void __noreturn do_exit(long code)
 	acct_update_integrals(tsk);
 	group_dead = atomic_dec_and_test(&tsk->signal->live);
 	if (group_dead) {
+		/*
+		 * If the last thread of global init exit, do panic
+		 * immeddiately to get the coredump to find any clue
+		 * for init task in userspace.
+		 */
+		if (unlikely(is_global_init(tsk)))
+			panic("Attempted to kill init! exitcode=0x%08x\n",
+				tsk->signal->group_exit_code ?: (int)code);
+
 #ifdef CONFIG_POSIX_TIMERS
 		hrtimer_cancel(&tsk->signal->real_timer);
 		exit_itimers(tsk->signal);
--
1.9.1
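
The group_dead test described in the commit message can be illustrated outside the kernel. The following is a minimal userspace sketch, not kernel code: struct toy_signal, toy_dec_and_test() and toy_exit_would_panic() are hypothetical names, and C11 atomics stand in for the kernel's atomic_dec_and_test(). It only shows that the panic condition holds for the last exiting thread of global init and for no other thread.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for signal_struct; only the live counter is modelled. */
struct toy_signal {
	atomic_int live;	/* threads in the group that have not exited yet */
};

/*
 * Userspace analogue of atomic_dec_and_test(): decrement the counter and
 * report whether it reached zero.
 */
static bool toy_dec_and_test(atomic_int *v)
{
	int old = atomic_fetch_sub(v, 1);

	assert(old > 0);	/* an exiting thread must have been counted as live */
	return old == 1;
}

/*
 * Returns true if this exit would hit the panic added to do_exit():
 * the caller must be the last live thread (group_dead) of global init.
 */
static bool toy_exit_would_panic(struct toy_signal *sig, bool is_global_init)
{
	bool group_dead = toy_dec_and_test(&sig->live);

	return group_dead && is_global_init;
}
```

With a three-thread init, the first two calls return false and only the third returns true, which is the multi-threaded case the changelog is concerned with.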


2019-12-19 10:43:35

by Christian Brauner

Subject: Re: [PATCH v3] kernel/exit: do panic earlier to get coredump if global init task exit

On Thu, Dec 19, 2019 at 02:29:53PM +0800, [email protected] wrote:
> From: chenqiwu <[email protected]>
>
> When the global init task is killed, the kernel panics late in the exit
> path, in do_exit()->exit_notify()->forget_original_parent()
> ->find_child_reaper(), once all init threads have exited.
>
> However, it is hard to extract a coredump of the init task from a kernel
> crashdump at that point, since exit_mm() has already released its mm
> before the panic. To get the userspace backtrace of the init task, it is
> better to panic earlier, at the beginning of the exit path.
>
> Note that we must take care of the multi-threaded init case: the test
> for is_global_init() && group_dead is needed to ensure that all threads
> are exiting and not just the current thread.
>
> Signed-off-by: chenqiwu <[email protected]>

Acked-by: Christian Brauner <[email protected]>

2019-12-20 19:39:41

by Oleg Nesterov

Subject: Re: [PATCH v3] kernel/exit: do panic earlier to get coredump if global init task exit

On 12/19, [email protected] wrote:
>
> @@ -517,10 +517,6 @@ static struct task_struct *find_child_reaper(struct task_struct *father,
> }
>
> write_unlock_irq(&tasklist_lock);
> - if (unlikely(pid_ns == &init_pid_ns)) {
> - panic("Attempted to kill init! exitcode=0x%08x\n",
> - father->signal->group_exit_code ?: father->exit_code);
> - }
>
> list_for_each_entry_safe(p, n, dead, ptrace_entry) {
> list_del_init(&p->ptrace_entry);
> @@ -766,6 +762,15 @@ void __noreturn do_exit(long code)
> acct_update_integrals(tsk);
> group_dead = atomic_dec_and_test(&tsk->signal->live);
> if (group_dead) {
> + /*
> + * If the last thread of global init exit, do panic
> + * immeddiately to get the coredump to find any clue
> + * for init task in userspace.
> + */
> + if (unlikely(is_global_init(tsk)))
> + panic("Attempted to kill init! exitcode=0x%08x\n",
> + tsk->signal->group_exit_code ?: (int)code);
> +

Acked-by: Oleg Nesterov <[email protected]>

2019-12-20 19:54:13

by Christian Brauner

Subject: Re: [PATCH v3] kernel/exit: do panic earlier to get coredump if global init task exit

On December 20, 2019 8:38:00 PM GMT+01:00, Oleg Nesterov <[email protected]> wrote:
>On 12/19, [email protected] wrote:
>>
>> @@ -517,10 +517,6 @@ static struct task_struct *find_child_reaper(struct task_struct *father,
>> }
>>
>> write_unlock_irq(&tasklist_lock);
>> - if (unlikely(pid_ns == &init_pid_ns)) {
>> - panic("Attempted to kill init! exitcode=0x%08x\n",
>> - father->signal->group_exit_code ?: father->exit_code);
>> - }
>>
>> list_for_each_entry_safe(p, n, dead, ptrace_entry) {
>> list_del_init(&p->ptrace_entry);
>> @@ -766,6 +762,15 @@ void __noreturn do_exit(long code)
>> acct_update_integrals(tsk);
>> group_dead = atomic_dec_and_test(&tsk->signal->live);
>> if (group_dead) {
>> + /*
>> + * If the last thread of global init exit, do panic
>> + * immeddiately to get the coredump to find any clue
>> + * for init task in userspace.
>> + */
>> + if (unlikely(is_global_init(tsk)))
>> + panic("Attempted to kill init! exitcode=0x%08x\n",
>> + tsk->signal->group_exit_code ?: (int)code);
>> +
>
>Acked-by: Oleg Nesterov <[email protected]>

Thanks. I'll pick this up unless someone objects.

Christian

2019-12-22 16:57:42

by Christian Brauner

Subject: Applied patch "exit: panic before exit_mm() on global init exit"

I've added the patch:

exit: panic before exit_mm() on global init exit

to
https://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux.git/log/?h=fixes

I've rewritten the commit message and fixed a typo in the comment.

If no one objects, this will be part of the threads-fixes PR early next week.

Thanks!
Christian

From 43cf75d96409a20ef06b756877a2e72b10a026fc Mon Sep 17 00:00:00 2001
From: chenqiwu <[email protected]>
Date: Thu, 19 Dec 2019 14:29:53 +0800
Subject: [PATCH] exit: panic before exit_mm() on global init exit

Currently, when global init and all threads in its thread-group have exited
we panic via:

do_exit()
-> exit_notify()
   -> forget_original_parent()
      -> find_child_reaper()

This makes it hard to extract a useable coredump for global init from a
kernel crashdump because by the time we panic exit_mm() will have already
released global init's mm.

This patch moves the panic further up before exit_mm() is called. As was the
case previously, we only panic when global init and all its threads in the
thread-group have exited.

Signed-off-by: chenqiwu <[email protected]>
Acked-by: Christian Brauner <[email protected]>
Acked-by: Oleg Nesterov <[email protected]>
[[email protected]: fix typo, rewrite commit message]
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Christian Brauner <[email protected]>
---
kernel/exit.c | 12 ++++++++----
1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/kernel/exit.c b/kernel/exit.c
index a46a50d67002..fc364272759d 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -517,10 +517,6 @@ static struct task_struct *find_child_reaper(struct task_struct *father,
 	}
 
 	write_unlock_irq(&tasklist_lock);
-	if (unlikely(pid_ns == &init_pid_ns)) {
-		panic("Attempted to kill init! exitcode=0x%08x\n",
-			father->signal->group_exit_code ?: father->exit_code);
-	}
 
 	list_for_each_entry_safe(p, n, dead, ptrace_entry) {
 		list_del_init(&p->ptrace_entry);
@@ -786,6 +782,14 @@ void __noreturn do_exit(long code)
 	acct_update_integrals(tsk);
 	group_dead = atomic_dec_and_test(&tsk->signal->live);
 	if (group_dead) {
+		/*
+		 * If the last thread of global init has exited, panic
+		 * immediately to get a useable coredump.
+		 */
+		if (unlikely(is_global_init(tsk)))
+			panic("Attempted to kill init! exitcode=0x%08x\n",
+				tsk->signal->group_exit_code ?: (int)code);
+
 #ifdef CONFIG_POSIX_TIMERS
 		hrtimer_cancel(&tsk->signal->real_timer);
 		exit_itimers(tsk->signal);
--
2.24.0
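
The effect of the reordering can be sketched with a small userspace model. This is only an illustration under simplifying assumptions, not the kernel implementation: every toy_* name is hypothetical, the mm is reduced to a pointer that is cleared on release, group_dead is passed in by the caller, and the GNU C "a ?: b" from the patch is spelled out as a portable conditional. The old path releases the mm before deciding to panic, while the new path decides first.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_mm { int dummy; };

struct toy_task {
	struct toy_mm *mm;	/* NULL once the address space is released */
	int group_exit_code;
	bool is_global_init;
};

/* What a crashdump would find at the moment of the simulated panic. */
struct toy_panic {
	bool panicked;
	bool mm_available;	/* was the mm still attached at panic time? */
	int exitcode;
};

static void toy_exit_mm(struct toy_task *tsk)
{
	assert(tsk->mm != NULL);	/* the model releases the mm exactly once */
	tsk->mm = NULL;
}

static struct toy_panic toy_panic_now(const struct toy_task *tsk, int code)
{
	struct toy_panic p = {
		.panicked = true,
		.mm_available = tsk->mm != NULL,
		/* Portable spelling of "group_exit_code ?: (int)code". */
		.exitcode = tsk->group_exit_code ? tsk->group_exit_code : code,
	};
	return p;
}

/* Old ordering: the mm is released first, the panic check comes later. */
static struct toy_panic toy_exit_old(struct toy_task *tsk, int code, bool group_dead)
{
	struct toy_panic none = { false, false, 0 };

	toy_exit_mm(tsk);
	if (group_dead && tsk->is_global_init)
		return toy_panic_now(tsk, code);
	return none;
}

/* New ordering: the panic check runs before the mm is released. */
static struct toy_panic toy_exit_new(struct toy_task *tsk, int code, bool group_dead)
{
	struct toy_panic none = { false, false, 0 };

	if (group_dead && tsk->is_global_init)
		return toy_panic_now(tsk, code);
	toy_exit_mm(tsk);
	return none;
}
```

In this model, toy_exit_old() reports a panic with mm_available == false, while toy_exit_new() reports mm_available == true for the same task, which mirrors why the earlier panic leaves a useable coredump of global init in the crashdump.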