2019-12-12 06:15:10

by chenqiwu

[permalink] [raw]
Subject: [PATCH] kernel/exit: do panic earlier to get coredump if global init task exit

From: chenqiwu <[email protected]>

When global init task get a chance to be killed, panic will happen in
later calling steps by do_exit()->exit_notify()->forget_original_parent()
->find_child_reaper() if all init threads have exited.

However, it's hard to extract the coredump of init task from a kernel
crashdump, since exit_mm() has released its mm before panic. In order
to get the backtrace of init task in userspace, it's better to do panic
earlier at the beginning of exitting route.

It's worth noting that we must take case of a multi-threaded init exitting
issue. We need the test for signal_group_exit() to ensure that it is all
threads exiting and not just the current thread.

Of course testing signal_group_exit() is not sufficient. It is still
possible that this is someone calling exit(2) in an old binary or someone
calling pthread_exit(3) from the last thread of init. To handle that case,
we need the test to be:

(signal_group_exit(tsk->signal) || thread_group_empty(tsk)).

Because exiting the final thread of init should also trigger the panic.

Signed-off-by: chenqiwu <[email protected]>
---
kernel/exit.c | 7 +++----
1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/kernel/exit.c b/kernel/exit.c
index e10de98..ba7d1aa 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -578,10 +578,6 @@ static struct task_struct *find_child_reaper(struct task_struct *father,
}

write_unlock_irq(&tasklist_lock);
- if (unlikely(pid_ns == &init_pid_ns)) {
- panic("Attempted to kill init! exitcode=0x%08x\n",
- father->signal->group_exit_code ?: father->exit_code);
- }

list_for_each_entry_safe(p, n, dead, ptrace_entry) {
list_del_init(&p->ptrace_entry);
@@ -785,6 +781,9 @@ void __noreturn do_exit(long code)
panic("Aiee, killing interrupt handler!");
if (unlikely(!tsk->pid))
panic("Attempted to kill the idle task!");
+ if (unlikely(is_global_init(tsk) &&
+ (signal_group_exit(tsk->signal) || thread_group_empty(tsk))))
+ panic("Attempted to kill init! exitcode=0x%08lx\n", code);

/*
* If do_exit is called because this processes oopsed, it's possible
--
1.9.1


2019-12-12 09:52:51

by Oleg Nesterov

[permalink] [raw]
Subject: Re: [PATCH] kernel/exit: do panic earlier to get coredump if global init task exit

On 12/12, [email protected] wrote:
>
> Of course testing signal_group_exit() is not sufficient. It is still
> possible that this is someone calling exit(2)

Or execve(), so

> @@ -785,6 +781,9 @@ void __noreturn do_exit(long code)
> panic("Aiee, killing interrupt handler!");
> if (unlikely(!tsk->pid))
> panic("Attempted to kill the idle task!");
> + if (unlikely(is_global_init(tsk) &&
> + (signal_group_exit(tsk->signal) || thread_group_empty(tsk))))
> + panic("Attempted to kill init! exitcode=0x%08lx\n", code);

so this can panic() if one of init's threads does does exec.

Oleg.

2019-12-12 10:09:42

by Oleg Nesterov

[permalink] [raw]
Subject: Re: [PATCH] kernel/exit: do panic earlier to get coredump if global init task exit

can't you use is_global_init() && group_dead ?

On 12/12, Oleg Nesterov wrote:
>
> On 12/12, [email protected] wrote:
> >
> > Of course testing signal_group_exit() is not sufficient. It is still
> > possible that this is someone calling exit(2)
>
> Or execve(), so
>
> > @@ -785,6 +781,9 @@ void __noreturn do_exit(long code)
> > panic("Aiee, killing interrupt handler!");
> > if (unlikely(!tsk->pid))
> > panic("Attempted to kill the idle task!");
> > + if (unlikely(is_global_init(tsk) &&
> > + (signal_group_exit(tsk->signal) || thread_group_empty(tsk))))
> > + panic("Attempted to kill init! exitcode=0x%08lx\n", code);
>
> so this can panic() if one of init's threads does does exec.
>
> Oleg.

2019-12-12 11:06:18

by Christian Brauner

[permalink] [raw]
Subject: Re: [PATCH] kernel/exit: do panic earlier to get coredump if global init task exit

On Thu, Dec 12, 2019 at 11:08:38AM +0100, Oleg Nesterov wrote:
> can't you use is_global_init() && group_dead ?

Seems reasonable.
Looks like we can move
group_dead = atomic_dec_and_test(&tsk->signal->live);
further up...

(Ideally I'd like to have a test for this to ensure that this lets you
capture a global init coredump but that might be tricky. But since you've
seem to have run into this case maybe you even have something that could
be turned into a test? (Similar to how we already have a purely opt-in
test for pstore.))

Christian

2019-12-14 06:31:27

by chenqiwu

[permalink] [raw]
Subject: Re: [PATCH] kernel/exit: do panic earlier to get coredump if global init task exit

On Thu, Dec 12, 2019 at 12:05:14PM +0100, Christian Brauner wrote:
> On Thu, Dec 12, 2019 at 11:08:38AM +0100, Oleg Nesterov wrote:
> > can't you use is_global_init() && group_dead ?
>
> Seems reasonable.
> Looks like we can move
> group_dead = atomic_dec_and_test(&tsk->signal->live);
> further up...
>
> (Ideally I'd like to have a test for this to ensure that this lets you
> capture a global init coredump but that might be tricky. But since you've
> seem to have run into this case maybe you even have something that could
> be turned into a test? (Similar to how we already have a purely opt-in
> test for pstore.))
>
> Christian

Hi all,
I agree that using is_global_init() && group_dead is more reasonable.

The crash isuee happened on a Android phone by reboot stress test.
panic log:
[ 84.048521] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
[ 84.048521]
[ 84.048540] CPU: 2 PID: 1 Comm: init Tainted: G S O 4.14.117-perf-g8035d1a #1
[ 84.048544] Hardware name: Qualcomm Technologies, Inc. SM8150 V2 PM8150 RAPHAEL (DT)
[ 84.048550] Call trace:
[ 84.048564] dump_backtrace+0x0/0x268
[ 84.048569] show_stack+0x14/0x20
[ 84.048577] dump_stack+0xc4/0x100
[ 84.048584] panic+0x1f0/0x410
[ 84.048591] complete_and_exit+0x0/0x20
[ 84.048596] do_group_exit+0x8c/0xa0
[ 84.048602] get_signal+0x1c0/0x790
[ 84.048608] do_notify_resume+0x184/0xc30
[ 84.048613] work_pending+0x8/0x10

From the kdump loaded by crash utility, all threads of global init have exited,
the group_dead value of global init has truned to 0 by atomic_dec_and_test().
crash> ps init
PID PPID CPU TASK ST %MEM VSZ RSS COMM
> 1 0 2 ffffffcd77526000 ?? 0.0 0 0 init
534 1 4 ffffffcd6b9a9000 ZO 0.0 0 0 init
535 1 4 ffffffcd6b9aa000 ZO 0.0 0 0 init
crash> ps -g 1
PID: 1 TASK: ffffffcd77526000 CPU: 2 COMMAND: "init"
(no threads)
crash> struct task_struct.signal ffffffcd77526000
signal = 0xffffffcd77530000
crash> struct signal_struct 0xffffffcd77530000
struct signal_struct {
sigcnt = {
counter = 1
},
live = {
counter = 0
},
nr_threads = 1,
thread_head = {
next = 0xffffffcd77526730,
prev = 0xffffffcd77526730
},
group_exit_code = 11,
notify_count = 0,
group_exit_task = 0x0,
group_stop_count = 0,
flags = 4,
...
}

However, as Christian said, the test for this is tricky since we must
make sure all of init threads exited. I make a test for is_global_init()
and send a SIGSEGV signal to global init task in userspace. The phone
crash imeddiately and reboot to collect kdump. Then I extract the coredump
of global init task from kdump successfully.
(gdb) core-file core.1.init
Core was generated by `/system/bin/init second_stage'.
#0 _exit () at bionic/libc/arch-arm64/syscalls/_exit.S:9
9 cmn x0, #(MAX_ERRNO + 1)
(gdb) bt
#0 _exit () at bionic/libc/arch-arm64/syscalls/_exit.S:9
#1 0x00000055606db11c in android::init::InstallRebootSignalHandlers()::$_14::operator()(int) const (this=<optimized out>, signal=11)
at system/core/init/reboot_utils.cpp:141
#2 0x00000055606db100 in android::init::InstallRebootSignalHandlers()::$_14::__invoke(int) (signal=11) at system/core/init/reboot_utils.cpp:138
#3 0x0000007f8de236a0 in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

So from the following test, we have confidence that the following patch can help us
to get the extra coredump of init when global init task do real exit.

diff --git a/kernel/exit.c b/kernel/exit.c
index 8e288e8..9454106 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -476,10 +476,6 @@ static struct task_struct *find_child_reaper(struct task_struct *father,
}

write_unlock_irq(&tasklist_lock);
- if (unlikely(pid_ns == &init_pid_ns)) {
- panic("Attempted to kill init! exitcode=0x%08x\n",
- father->signal->group_exit_code ?: father->exit_code);
- }

list_for_each_entry_safe(p, n, dead, ptrace_entry) {
list_del_init(&p->ptrace_entry);
@@ -687,6 +683,10 @@ void do_exit(long code)
if (unlikely(!tsk->pid))
panic("Attempted to kill the idle task!");

+ group_dead = atomic_dec_and_test(&tsk->signal->live);
+ if (unlikely(is_global_init(tsk) && group_dead))
+ panic("Attempted to kill init! exitcode=0x%08lx\n", code);
+
/*
* If do_exit is called because this processes oopsed, it's possible
* that get_fs() was left as KERNEL_DS, so reset it to USER_DS before
@@ -743,7 +743,6 @@ void do_exit(long code)
if (tsk->mm)
sync_mm_rss(tsk->mm);
acct_update_integrals(tsk);
- group_dead = atomic_dec_and_test(&tsk->signal->live);
if (group_dead) {
hrtimer_cancel(&tsk->signal->real_timer);
exit_itimers(tsk->signal);


2019-12-14 11:52:49

by Christian Brauner

[permalink] [raw]
Subject: Re: [PATCH] kernel/exit: do panic earlier to get coredump if global init task exit

On Sat, Dec 14, 2019 at 02:27:04PM +0800, chenqiwu wrote:
> On Thu, Dec 12, 2019 at 12:05:14PM +0100, Christian Brauner wrote:
> > On Thu, Dec 12, 2019 at 11:08:38AM +0100, Oleg Nesterov wrote:
> > > can't you use is_global_init() && group_dead ?
> >
> > Seems reasonable.
> > Looks like we can move
> > group_dead = atomic_dec_and_test(&tsk->signal->live);
> > further up...
> >
> > (Ideally I'd like to have a test for this to ensure that this lets you
> > capture a global init coredump but that might be tricky. But since you've
> > seem to have run into this case maybe you even have something that could
> > be turned into a test? (Similar to how we already have a purely opt-in
> > test for pstore.))
> >
> > Christian
>
> Hi all,
> I agree that using is_global_init() && group_dead is more reasonable.
>
> The crash isuee happened on a Android phone by reboot stress test.
> panic log:
> [ 84.048521] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
> [ 84.048521]
> [ 84.048540] CPU: 2 PID: 1 Comm: init Tainted: G S O 4.14.117-perf-g8035d1a #1
> [ 84.048544] Hardware name: Qualcomm Technologies, Inc. SM8150 V2 PM8150 RAPHAEL (DT)
> [ 84.048550] Call trace:
> [ 84.048564] dump_backtrace+0x0/0x268
> [ 84.048569] show_stack+0x14/0x20
> [ 84.048577] dump_stack+0xc4/0x100
> [ 84.048584] panic+0x1f0/0x410
> [ 84.048591] complete_and_exit+0x0/0x20
> [ 84.048596] do_group_exit+0x8c/0xa0
> [ 84.048602] get_signal+0x1c0/0x790
> [ 84.048608] do_notify_resume+0x184/0xc30
> [ 84.048613] work_pending+0x8/0x10
>
> From the kdump loaded by crash utility, all threads of global init have exited,
> the group_dead value of global init has truned to 0 by atomic_dec_and_test().
> crash> ps init
> PID PPID CPU TASK ST %MEM VSZ RSS COMM
> > 1 0 2 ffffffcd77526000 ?? 0.0 0 0 init
> 534 1 4 ffffffcd6b9a9000 ZO 0.0 0 0 init
> 535 1 4 ffffffcd6b9aa000 ZO 0.0 0 0 init
> crash> ps -g 1
> PID: 1 TASK: ffffffcd77526000 CPU: 2 COMMAND: "init"
> (no threads)
> crash> struct task_struct.signal ffffffcd77526000
> signal = 0xffffffcd77530000
> crash> struct signal_struct 0xffffffcd77530000
> struct signal_struct {
> sigcnt = {
> counter = 1
> },
> live = {
> counter = 0
> },
> nr_threads = 1,
> thread_head = {
> next = 0xffffffcd77526730,
> prev = 0xffffffcd77526730
> },
> group_exit_code = 11,
> notify_count = 0,
> group_exit_task = 0x0,
> group_stop_count = 0,
> flags = 4,
> ...
> }
>
> However, as Christian said, the test for this is tricky since we must
> make sure all of init threads exited. I make a test for is_global_init()
> and send a SIGSEGV signal to global init task in userspace. The phone
> crash imeddiately and reboot to collect kdump. Then I extract the coredump
> of global init task from kdump successfully.
> (gdb) core-file core.1.init
> Core was generated by `/system/bin/init second_stage'.
> #0 _exit () at bionic/libc/arch-arm64/syscalls/_exit.S:9
> 9 cmn x0, #(MAX_ERRNO + 1)
> (gdb) bt
> #0 _exit () at bionic/libc/arch-arm64/syscalls/_exit.S:9
> #1 0x00000055606db11c in android::init::InstallRebootSignalHandlers()::$_14::operator()(int) const (this=<optimized out>, signal=11)
> at system/core/init/reboot_utils.cpp:141
> #2 0x00000055606db100 in android::init::InstallRebootSignalHandlers()::$_14::__invoke(int) (signal=11) at system/core/init/reboot_utils.cpp:138
> #3 0x0000007f8de236a0 in ?? ()
> Backtrace stopped: previous frame identical to this frame (corrupt stack?)
>
> So from the following test, we have confidence that the following patch can help us

Thanks. Can you please resend this as a proper patch for v2?

Christian