2022-10-04 09:54:54

by Nicholas Piggin

Subject: [PATCH] exit: Detect and fix irq disabled state in oops

If a task oopses with irqs disabled, this can cause various cascading
problems in the oops path such as sleep-from-invalid warnings, and
potentially worse.

Since commit 0258b5fd7c712 ("coredump: Limit coredumps to a single
thread group"), the unconditional irq enable in coredump_task_exit()
will "fix" the irq state to be enabled early in do_exit(), so currently
this may not be triggerable, but that is coincidental and fragile.

Detect and fix the irqs_disabled() condition in the oops path before
calling do_exit(), similarly to the way in_atomic() is handled.

Reported-by: Michael Ellerman <[email protected]>
Signed-off-by: Nicholas Piggin <[email protected]>
---
kernel/exit.c | 6 ++++++
1 file changed, 6 insertions(+)

diff --git a/kernel/exit.c b/kernel/exit.c
index 84021b24f79e..fa696765f694 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -738,6 +738,7 @@ void __noreturn do_exit(long code)
 	struct task_struct *tsk = current;
 	int group_dead;

+	WARN_ON(irqs_disabled());
 	WARN_ON(tsk->plug);

 	kcov_task_exit(tsk);
@@ -865,6 +866,11 @@ void __noreturn make_task_dead(int signr)
 	if (unlikely(!tsk->pid))
 		panic("Attempted to kill the idle task!");

+	if (unlikely(irqs_disabled())) {
+		pr_info("note: %s[%d] exited with irqs disabled\n",
+			current->comm, task_pid_nr(current));
+		local_irq_enable();
+	}
 	if (unlikely(in_atomic())) {
 		pr_info("note: %s[%d] exited with preempt_count %d\n",
 			current->comm, task_pid_nr(current),
--
2.37.2
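
For readers who want to see the detect-and-fix ordering outside the kernel, the following is a minimal user-space sketch, not kernel code. It assumes a hypothetical struct cpu_model whose fields stand in for irqs_disabled() and the preempt count, and hypothetical model_do_exit() and model_make_task_dead() helpers that mirror the two hunks above. It treats any non-zero count as "atomic", which is a simplification of in_atomic().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical user-space stand-ins for kernel state; not kernel APIs. */
struct cpu_model {
	bool irqs_off;		/* models irqs_disabled() */
	int preempt_count;	/* models the preempt count checked by in_atomic() */
	int warnings;		/* counts modelled WARN_ON() hits */
};

/* Models do_exit(): warn if entered with irqs still disabled. */
static void model_do_exit(struct cpu_model *cpu)
{
	if (cpu->irqs_off)
		cpu->warnings++;	/* stands in for WARN_ON(irqs_disabled()) */
}

/* Models make_task_dead(): repair bad state first, then exit. */
static void model_make_task_dead(struct cpu_model *cpu, const char *comm, int pid)
{
	assert(cpu && comm);

	if (cpu->irqs_off) {
		printf("note: %s[%d] exited with irqs disabled\n", comm, pid);
		cpu->irqs_off = false;	/* stands in for local_irq_enable() */
	}
	if (cpu->preempt_count != 0) {
		printf("note: %s[%d] exited with preempt_count %d\n",
		       comm, pid, cpu->preempt_count);
		cpu->preempt_count = 0;	/* models resetting the preempt count */
	}
	model_do_exit(cpu);
}
```

Calling model_make_task_dead() on a model with irqs_off set leaves warnings at zero, because the state is repaired before the exit path runs. Calling model_do_exit() directly on the same state increments warnings, which is the situation the added WARN_ON() is meant to expose.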


2022-12-20 08:18:26

by Nicholas Piggin

Subject: Re: [PATCH] exit: Detect and fix irq disabled state in oops

On Tue Oct 4, 2022 at 7:44 PM AEST, Nicholas Piggin wrote:
> If a task oopses with irqs disabled, this can cause various cascading
> problems in the oops path such as sleep-from-invalid warnings, and
> potentially worse.
>
> Since commit 0258b5fd7c712 ("coredump: Limit coredumps to a single
> thread group"), the unconditional irq enable in coredump_task_exit()
> will "fix" the irq state to be enabled early in do_exit(), so currently
> this may not be triggerable, but that is coincidental and fragile.
>
> Detect and fix the irqs_disabled() condition in the oops path before
> calling do_exit(), similarly to the way in_atomic() is handled.
>
> Reported-by: Michael Ellerman <[email protected]>
> Signed-off-by: Nicholas Piggin <[email protected]>

Hey Eric, did you have any thoughts on this?

Thanks,
Nick


2022-12-24 05:23:56

by Eric W. Biederman

Subject: Re: [PATCH] exit: Detect and fix irq disabled state in oops

"Nicholas Piggin" <[email protected]> writes:

> On Tue Oct 4, 2022 at 7:44 PM AEST, Nicholas Piggin wrote:
>> If a task oopses with irqs disabled, this can cause various cascading
>> problems in the oops path such as sleep-from-invalid warnings, and
>> potentially worse.
>>
>> Since commit 0258b5fd7c712 ("coredump: Limit coredumps to a single
>> thread group"), the unconditional irq enable in coredump_task_exit()
>> will "fix" the irq state to be enabled early in do_exit(), so currently
>> this may not be triggerable, but that is coincidental and fragile.
>>
>> Detect and fix the irqs_disabled() condition in the oops path before
>> calling do_exit(), similarly to the way in_atomic() is handled.
>>
>> Reported-by: Michael Ellerman <[email protected]>
>> Signed-off-by: Nicholas Piggin <[email protected]>
>
> Hey Eric, did you have any thoughts on this?

No strong thoughts.

I agree that unconditionally disabling then enabling irqs in
coredump_task_exit() means there is likely to be little change in real
behavior.

I also agree that this is fragile to depend upon, so making our
assumptions explicit seems good.

Acked-by: "Eric W. Biederman" <[email protected]>
