2023-01-20 01:42:07

by Nicholas Piggin

Subject: [PATCH] exit: Detect and fix irq disabled state in oops

If a task oopses with irqs disabled, this can cause various cascading
problems in the oops path such as sleep-from-invalid warnings, and
potentially worse.

Since commit 0258b5fd7c712 ("coredump: Limit coredumps to a single
thread group"), the unconditional irq enable in coredump_task_exit()
will "fix" the irq state to be enabled early in do_exit(), so currently
this may not be triggerable, but that is coincidental and fragile.

Detect and fix the irqs_disabled() condition in the oops path before
calling do_exit(), similarly to the way in_atomic() is handled.

Link: https://lore.kernel.org/lkml/[email protected]/
Reported-by: Michael Ellerman <[email protected]>
Acked-by: "Eric W. Biederman" <[email protected]>
Signed-off-by: Nicholas Piggin <[email protected]>
---
Hi Peter,

Would you consider taking this through the sched tree?

Thanks,
Nick

kernel/exit.c | 7 +++++++
1 file changed, 7 insertions(+)

diff --git a/kernel/exit.c b/kernel/exit.c
index 15dc2ec80c46..bccfa4218356 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -807,6 +807,8 @@ void __noreturn do_exit(long code)
 	struct task_struct *tsk = current;
 	int group_dead;
 
+	WARN_ON(irqs_disabled());
+
 	synchronize_group_exit(tsk, code);
 
 	WARN_ON(tsk->plug);
@@ -938,6 +940,11 @@ void __noreturn make_task_dead(int signr)
 	if (unlikely(!tsk->pid))
 		panic("Attempted to kill the idle task!");
 
+	if (unlikely(irqs_disabled())) {
+		pr_info("note: %s[%d] exited with irqs disabled\n",
+			current->comm, task_pid_nr(current));
+		local_irq_enable();
+	}
 	if (unlikely(in_atomic())) {
 		pr_info("note: %s[%d] exited with preempt_count %d\n",
 			current->comm, task_pid_nr(current),
--
2.37.2
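
For readers who want to try the "detect, report, repair" ordering outside the kernel, here is a minimal userspace sketch of what the patch does in make_task_dead(). It is only a model under stated assumptions: struct sim_cpu and sim_fixup_before_exit() are hypothetical names invented for illustration, the boolean and counter stand in for the real irq and preemption state, and printf() stands in for pr_info(). It is not kernel code and cannot be built into a kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for per-CPU state; not a kernel structure. */
struct sim_cpu {
	bool irqs_off;      /* models what irqs_disabled() would report */
	int preempt_count;  /* models the preemption count seen by in_atomic() */
};

/*
 * Models the fix-ups done in the oops path before do_exit() runs:
 * each bad condition is reported once and then repaired, so the
 * exit path starts from a sane state. Returns how many conditions
 * had to be repaired.
 */
static int sim_fixup_before_exit(struct sim_cpu *cpu, const char *comm, int pid)
{
	int fixed = 0;

	assert(cpu != NULL);

	if (cpu->irqs_off) {
		printf("note: %s[%d] exited with irqs disabled\n", comm, pid);
		cpu->irqs_off = false;	/* models local_irq_enable() */
		fixed++;
	}
	if (cpu->preempt_count != 0) {
		printf("note: %s[%d] exited with preempt_count %d\n",
		       comm, pid, cpu->preempt_count);
		cpu->preempt_count = 0;	/* assumed: models resetting the count */
		fixed++;
	}
	return fixed;
}
```

Calling sim_fixup_before_exit() a second time on the same state repairs nothing and prints nothing, which mirrors why the new WARN_ON(irqs_disabled()) at the top of do_exit() should stay silent once the oops path has done its fix-up.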


2023-01-20 16:11:12

by Peter Zijlstra

Subject: Re: [PATCH] exit: Detect and fix irq disabled state in oops

On Fri, Jan 20, 2023 at 11:18:20AM +1000, Nicholas Piggin wrote:
> If a task oopses with irqs disabled, this can cause various cascading
> problems in the oops path such as sleep-from-invalid warnings, and
> potentially worse.
>
> Since commit 0258b5fd7c712 ("coredump: Limit coredumps to a single
> thread group"), the unconditional irq enable in coredump_task_exit()
> will "fix" the irq state to be enabled early in do_exit(), so currently
> this may not be triggerable, but that is coincidental and fragile.
>
> Detect and fix the irqs_disabled() condition in the oops path before
> calling do_exit(), similarly to the way in_atomic() is handled.
>
> Link: https://lore.kernel.org/lkml/[email protected]/
> Reported-by: Michael Ellerman <[email protected]>
> Acked-by: "Eric W. Biederman" <[email protected]>
> Signed-off-by: Nicholas Piggin <[email protected]>
> ---
> Hi Peter,
>
> Would you consider taking this through the sched tree?

Yep, can do, let me go queue it.