2021-03-10 16:47:37

by Eric W. Biederman

[permalink] [raw]
Subject: Re: [PATCH] exit: trigger panic when init process is set to SIGNAL_GROUP_EXIT

qianli zhao <[email protected]> writes:

> Hi,Oleg
>
> Thanks for your replay.
>
>> To be honest, I don't understand the changelog. It seems that you want
>> to uglify the kernel to simplify the debugging of buggy init? Or what?
>
> My patch is for the following purpose:
> 1. I hope to fix the occurrence of unexpected panic
> - [ 24.705390] kernel BUG at include/linux/pid_namespace.h:98!
> This problem occurs when both two init threads enter the do_exit,
> One of the init thread is syscall sys_exit_group,and set SIGNAL_GROUP_EXIT
> The other init thread perform ret_to_user()->get_signal() and found
> SIGNAL_GROUP_EXIT is set,then do_group_exit()->do_exit()
> Since there are no alive init threads it finally goes to
> zap_pid_ns_processes() and BUG().
> Timing details are in the changelog.

At the start of your changelog and your patch subject you describe what
you are doing but not why. For the next revision of the patch please
lead with the why it makes what you are trying to do much easier to
understand.

> 2. I hope to fix the problem that coredump cannot be parsed after init
> crash Before my patch,ever init sub-thread will finish do_exit(),the last
> thread will trigger panic().
> Except for the thread that triggered the panic,the state(such as
> PF_EXITING etc) of the other threads is already exiting,so we can't
> parse coredump from fulldump
> In fact, when any one init has been set to SIGNAL_GROUP_EXIT, then we
> can trigger panic,and prevent other init threads from continuing to
> exit
>
>> Nor can I understand the patch. I fail to understand the games with
>> SIGNAL_UNKILLABLE and ->siglock.
>
> When first init thread panic,i don't want other init threads to still
> exit,this will destroy the state of other init threads.
> so i use SIGNAL_UNKILLABLE to mark this stat,prevent other init
> threads from continuing to exit
> In addition i use siglock to protect tsk->signal->flags.

It does not work to use SIGNAL_UNKILLABLE for this. Normally init
has SIGNAL_UNKILLABLE set. The only case that clears SIGNAL_UNKILLABLE
is force_sig_info_to_task. If the init process exits with exit(2)
SIGNAL_UNKILLABLE will already be set. Which means testing
SIGNAL_UNKILLABLE as your patch does will prevent the panic.

Further simply calling panic is sufficient to guarantee that the other
threads don't exit, and that whichever thread calls panic first
will be the reporting thread. The rest of the threads will be caught
in panic_smp_self_stop(), if they happen to be running on other cpus.

So I would make the whole thing just be:

/* If global init has exited,
* panic immediately to get a useable coredump.
*/
if (unlikely(is_global_init(tsk) &&
(thread_group_empty(tsk) ||
(tsk->signal->flags & SIGNAL_GROUP_EXIT)))) {
panic("Attempted to kill init! exitcode=0x%08x\n",
tsk->signal->group_exit_code ?: (int)code);
}

The thread_group_empty test is needed to handle single threaded
inits.

Do you think you can respin your patch as something like that?

Eric

>
>> And iiuc with this patch the kernel will crash if init's sub-thread execs,
>> signal_group_exit() returns T in this case.
>
> Oleg Nesterov <[email protected]> 于2021年3月10日周三 上午2:27写道:
>>
>> On 03/09, Qianli Zhao wrote:
>> >
>> > From: Qianli Zhao <[email protected]>
>> >
>> > Once any init thread finds SIGNAL_GROUP_EXIT, trigger panic immediately
>> > instead of last thread of global init has exited, and do not allow other
>> > init threads to exit, protect task/memory state of all sub-threads for
>> > get reliable init coredump
>>
>> To be honest, I don't understand the changelog. It seems that you want
>> to uglify the kernel to simplify the debugging of buggy init? Or what?
>>
>> Nor can I understand the patch. I fail to understand the games with
>> SIGNAL_UNKILLABLE and ->siglock.
>>
>> And iiuc with this patch the kernel will crash if init's sub-thread execs,
>> signal_group_exit() returns T in this case.
>>
>> Oleg.
>>
>> > [ 24.705376] Kernel panic - not syncing: Attempted to kill init! exitcode=0x00007f00
>> > [ 24.705382] CPU: 4 PID: 552 Comm: init Tainted: G S O 4.14.180-perf-g4483caa8ae80-dirty #1
>> > [ 24.705390] kernel BUG at include/linux/pid_namespace.h:98!
>> >
>> > PID: 552 CPU: 4 COMMAND: "init"
>> > PID: 1 CPU: 7 COMMAND: "init"
>> > core4 core7
>> > ... sys_exit_group()
>> > do_group_exit()
>> > - sig->flags = SIGNAL_GROUP_EXIT
>> > - zap_other_threads()
>> > do_exit()
>> > - PF_EXITING is set
>> > ret_to_user()
>> > do_notify_resume()
>> > get_signal()
>> > - signal_group_exit
>> > - goto fatal;
>> > do_group_exit()
>> > do_exit()
>> > - PF_EXITING is set
>> > - panic("Attempted to kill init! exitcode=0x%08x\n")
>> > exit_notify()
>> > find_alive_thread() //no alive sub-threads
>> > zap_pid_ns_processes()//CONFIG_PID_NS is not set
>> > BUG()
>> >
>> > Signed-off-by: Qianli Zhao <[email protected]>
>> > ---
>> > We got an init crash issue, but we can't get init coredump from fulldump, we also
>> > see BUG() triggered which calling in zap_pid_ns_processes().
>> >
>> > From crash dump we can get the following information:
>> > 1. "Attempted to kill init",init process is killed.
>> > - Kernel panic - not syncing: Attempted to kill init! exitcode=0x00007f00
>> > 2. At the same time as init crash, a BUG() triggered in other core.
>> > - [ 24.705390] kernel BUG at include/linux/pid_namespace.h:98!
>> > 3. When init thread calls exit_mm, the corresponding thread task->mm will be empty, which is not conducive to extracting coredump
>> >
>> > To fix the issue and save complete coredump, once find init thread is set to SIGNAL_GROUP_EXIT
>> > trigger panic immediately,and other child threads are not allowed to exit just wait for reboot
>> >
>> > PID: 1 TASK: ffffffc973126900 CPU: 7 COMMAND: "init"
>> > #0 [ffffff800805ba60] perf_trace_kernel_panic_late at ffffff99ac0bcbcc
>> > #1 [ffffff800805bac0] die at ffffff99ac08dc64
>> > #2 [ffffff800805bb10] bug_handler at ffffff99ac08e398
>> > #3 [ffffff800805bbc0] brk_handler at ffffff99ac08529c
>> > #4 [ffffff800805bc80] do_debug_exception at ffffff99ac0814e4
>> > #5 [ffffff800805bdf0] el1_dbg at ffffff99ac083298
>> > ->Exception
>> > /home/work/courbet-r-stable-build/kernel/msm-4.14/include/linux/pid_namespace.h: 98
>> > #6 [ffffff800805be20] do_exit at ffffff99ac0c22e8
>> > #7 [ffffff800805be80] do_group_exit at ffffff99ac0c2658
>> > #8 [ffffff800805beb0] sys_exit_group at ffffff99ac0c266c
>> > #9 [ffffff800805bff0] el0_svc_naked at ffffff99ac083cf
>> > ->SYSCALLNO: 5e (__NR_exit_group)
>> >
>> > PID: 552 TASK: ffffffc9613c8f00 CPU: 4 COMMAND: "init"
>> > #0 [ffffff801455b870] __delay at ffffff99ad32cc14
>> > #1 [ffffff801455b8b0] __const_udelay at ffffff99ad32cd10
>> > #2 [ffffff801455b8c0] msm_trigger_wdog_bite at ffffff99ac5d5be0
>> > #3 [ffffff801455b980] do_msm_restart at ffffff99acccc3f8
>> > #4 [ffffff801455b9b0] machine_restart at ffffff99ac085dd0
>> > #5 [ffffff801455b9d0] emergency_restart at ffffff99ac0eb6dc
>> > #6 [ffffff801455baf0] panic at ffffff99ac0bd008
>> > #7 [ffffff801455bb70] do_exit at ffffff99ac0c257c
>> > /home/work/courbet-r-stable-build/kernel/msm-4.14/kernel/exit.c: 842
>> > #8 [ffffff801455bbd0] do_group_exit at ffffff99ac0c2644
>> > #9 [ffffff801455bcc0] get_signal at ffffff99ac0d1384
>> > #10 [ffffff801455be60] do_notify_resume at ffffff99ac08b2a8
>> > #11 [ffffff801455bff0] work_pending at ffffff99ac083b8c
>> >
>> > ---
>> > kernel/exit.c | 29 +++++++++++++++++++++--------
>> > 1 file changed, 21 insertions(+), 8 deletions(-)
>> >
>> > diff --git a/kernel/exit.c b/kernel/exit.c
>> > index ef2fb929..6b2da22 100644
>> > --- a/kernel/exit.c
>> > +++ b/kernel/exit.c
>> > @@ -758,6 +758,27 @@ void __noreturn do_exit(long code)
>> > validate_creds_for_do_exit(tsk);
>> >
>> > /*
>> > + * Once init group is marked for death,
>> > + * panic immediately to get a useable coredump
>> > + */
>> > + if (unlikely(is_global_init(tsk) &&
>> > + signal_group_exit(tsk->signal))) {
>> > + spin_lock_irq(&tsk->sighand->siglock);
>> > + if (!(tsk->signal->flags & SIGNAL_UNKILLABLE)) {
>> > + tsk->signal->flags |= SIGNAL_UNKILLABLE;
>> > + spin_unlock_irq(&tsk->sighand->siglock);
>> > + panic("Attempted to kill init! exitcode=0x%08x\n",
>> > + tsk->signal->group_exit_code ?: (int)code);
>> > + } else {
>> > + /* init sub-thread is dying, just wait for reboot */
>> > + spin_unlock_irq(&tsk->sighand->siglock);
>> > + futex_exit_recursive(tsk);
>> > + set_current_state(TASK_UNINTERRUPTIBLE);
>> > + schedule();
>> > + }
>> > + }
>> > +
>> > + /*
>> > * We're taking recursive faults here in do_exit. Safest is to just
>> > * leave this task alone and wait for reboot.
>> > */
>> > @@ -776,14 +797,6 @@ void __noreturn do_exit(long code)
>> > acct_update_integrals(tsk);
>> > group_dead = atomic_dec_and_test(&tsk->signal->live);
>> > if (group_dead) {
>> > - /*
>> > - * If the last thread of global init has exited, panic
>> > - * immediately to get a useable coredump.
>> > - */
>> > - if (unlikely(is_global_init(tsk)))
>> > - panic("Attempted to kill init! exitcode=0x%08x\n",
>> > - tsk->signal->group_exit_code ?: (int)code);
>> > -
>> > #ifdef CONFIG_POSIX_TIMERS
>> > hrtimer_cancel(&tsk->signal->real_timer);
>> > exit_itimers(tsk->signal);
>> > --
>> > 1.9.1
>> >
>>


2021-03-10 17:35:15

by Oleg Nesterov

[permalink] [raw]
Subject: Re: [PATCH] exit: trigger panic when init process is set to SIGNAL_GROUP_EXIT

On 03/10, Eric W. Biederman wrote:
>
> /* If global init has exited,
> * panic immediately to get a useable coredump.
> */
> if (unlikely(is_global_init(tsk) &&
> (thread_group_empty(tsk) ||
> (tsk->signal->flags & SIGNAL_GROUP_EXIT)))) {
> panic("Attempted to kill init! exitcode=0x%08x\n",
> tsk->signal->group_exit_code ?: (int)code);
> }
>
> The thread_group_empty test is needed to handle single threaded
> inits.

But we can't rely on thread_group_empty(). Just suppose that the main
thread exit first, then the 2nd (last) thread exits too.

Oleg.

2021-03-10 19:11:28

by Eric W. Biederman

[permalink] [raw]
Subject: Re: [PATCH] exit: trigger panic when init process is set to SIGNAL_GROUP_EXIT

Oleg Nesterov <[email protected]> writes:

> On 03/10, Eric W. Biederman wrote:
>>
>> /* If global init has exited,
>> * panic immediately to get a useable coredump.
>> */
>> if (unlikely(is_global_init(tsk) &&
>> (thread_group_empty(tsk) ||
>> (tsk->signal->flags & SIGNAL_GROUP_EXIT)))) {
>> panic("Attempted to kill init! exitcode=0x%08x\n",
>> tsk->signal->group_exit_code ?: (int)code);
>> }
>>
>> The thread_group_empty test is needed to handle single threaded
>> inits.
>
> But we can't rely on thread_group_empty(). Just suppose that the main
> thread exit first, then the 2nd (last) thread exits too.

My code above is designed so that every thread calls panic.
Only the first thread into panic actually writes the panic (That is in
panic itself).

By testing for thread_group_empty() || SIGNAL_GROUP_EXIT
I am just trying to allow threads of init to exit.

Maybe thread_group_empty isn't the exact test we need to allow those.


Eric

2021-03-10 22:17:40

by Eric W. Biederman

[permalink] [raw]
Subject: Re: [PATCH] exit: trigger panic when init process is set to SIGNAL_GROUP_EXIT

Oleg Nesterov <[email protected]> writes:

> On 03/10, Eric W. Biederman wrote:
>>
>> /* If global init has exited,
>> * panic immediately to get a useable coredump.
>> */
>> if (unlikely(is_global_init(tsk) &&
>> (thread_group_empty(tsk) ||
>> (tsk->signal->flags & SIGNAL_GROUP_EXIT)))) {
>> panic("Attempted to kill init! exitcode=0x%08x\n",
>> tsk->signal->group_exit_code ?: (int)code);
>> }
>>
>> The thread_group_empty test is needed to handle single threaded
>> inits.
>
> But we can't rely on thread_group_empty(). Just suppose that the main
> thread exit first, then the 2nd (last) thread exits too.

It took me a minute. I think you are pointing out that there is a case
where we do not set SIGNAL_GROUP_EXIT and that init actually exits.

The case where all of the threads do pthread_exit() aka do_exit().

I think that implies that to have a comprehensive test would
need to do:

group_dead = atomic_dec_and_test(&tsk->signal->live);

/* If global init has exited,
* panic immediately to get a useable coredump.
*/
if (unlikely(is_global_init(tsk) &&
(group_dead || thread_group_empty(tsk) ||
(tsk->signal->flags & SIGNAL_GROUP_EXIT)))) {
panic("Attempted to kill init! exitcode=0x%08x\n",
tsk->signal->group_exit_code ?: (int)code);
}

Leaving the test where it is. Yes. I think that should work.


Eric

2021-03-11 04:43:53

by qianli zhao

[permalink] [raw]
Subject: Re: [PATCH] exit: trigger panic when init process is set to SIGNAL_GROUP_EXIT

Hi, Eric
Thank you for your suggestion

> At the start of your changelog and your patch subject you describe what
> you are doing but not why. For the next revision of the patch please
> lead with the why it makes what you are trying to do much easier to
> understand.

got it.

>
> It does not work to use SIGNAL_UNKILLABLE for this. Normally init
> has SIGNAL_UNKILLABLE set. The only case that clears SIGNAL_UNKILLABLE
> is force_sig_info_to_task. If the init process exits with exit(2)
> SIGNAL_UNKILLABLE will already be set. Which means testing
> SIGNAL_UNKILLABLE as your patch does will prevent the panic.
>

Ok,using SIGNAL_UNKILLABLE is incorrect.

> Further simply calling panic is sufficient to guarantee that the other
> threads don't exit, and that whichever thread calls panic first
> will be the reporting thread. The rest of the threads will be caught
> in panic_smp_self_stop(), if they happen to be running on other cpus.
>
> So I would make the whole thing just be:
>
> /* If global init has exited,
> * panic immediately to get a useable coredump.
> */
> if (unlikely(is_global_init(tsk) &&
> (thread_group_empty(tsk) ||
> (tsk->signal->flags & SIGNAL_GROUP_EXIT)))) {
> panic("Attempted to kill init! exitcode=0x%08x\n",
> tsk->signal->group_exit_code ?: (int)code);
> }
>
> The thread_group_empty test is needed to handle single threaded
> inits.
>
> Do you think you can respin your patch as something like that?
>

Ok.it's a very good change,other CPUs calls to panic() will be caught
and execute panic_smp_self_stop(),
there is no need to deal with this situation separately when other CPUs exit().