A deadlock caused by logbuf_lock occurs when panic:
a) Panic CPU is running in non-NMI context
b) Panic CPU sends out shutdown IPI via NMI vector
c) One of the CPUs that we bring down via NMI vector holded logbuf_lock
d) Panic CPU try to hold logbuf_lock, then deadlock occurs.
we try to re-init the logbuf_lock in printk_safe_flush_on_panic()
to avoid deadlock, but it does not work here, because :
Firstly, it is inappropriate to check num_online_cpus() here.
When the CPU bring down via NMI vector, the panic CPU willn't
wait too long for other cores to stop, so when this problem
occurs, num_online_cpus() may be greater than 1.
Secondly, printk_safe_flush_on_panic() is called after panic
notifier callback, so if printk() is called in panic notifier
callback, deadlock will still occurs. Eg, if ftrace_dump_on_oops
is set, we print some debug information, it will try to hold the
logbuf_lock.
To avoid this deadlock, drop the num_online_cpus() check and call
the printk_safe_flush_on_panic() before panic_notifier_list callback,
attempt to re-init logbuf_lock from panic CPU.
Signed-off-by: Cheng Jian <[email protected]>
---
kernel/panic.c | 3 +++
kernel/printk/printk_safe.c | 3 ---
2 files changed, 3 insertions(+), 3 deletions(-)
diff --git a/kernel/panic.c b/kernel/panic.c
index b69ee9e76cb2..8dbcb2227b60 100644
--- a/kernel/panic.c
+++ b/kernel/panic.c
@@ -255,6 +255,9 @@ void panic(const char *fmt, ...)
crash_smp_send_stop();
}
+ /* Call flush even twice. It tries harder with a single online CPU */
+ printk_safe_flush_on_panic();
+
/*
* Run any panic handlers, including those that might need to
* add information to the kmsg dump output.
diff --git a/kernel/printk/printk_safe.c b/kernel/printk/printk_safe.c
index d9a659a686f3..9ebc1723e1a4 100644
--- a/kernel/printk/printk_safe.c
+++ b/kernel/printk/printk_safe.c
@@ -269,9 +269,6 @@ void printk_safe_flush_on_panic(void)
* Do not risk a double release when more CPUs are up.
*/
if (raw_spin_is_locked(&logbuf_lock)) {
- if (num_online_cpus() > 1)
- return;
-
debug_locks_off();
raw_spin_lock_init(&logbuf_lock);
}
--
2.17.1
On (20/06/03 14:19), Cheng Jian wrote:
> A deadlock caused by logbuf_lock occurs when panic:
>
> a) Panic CPU is running in non-NMI context
> b) Panic CPU sends out shutdown IPI via NMI vector
> c) One of the CPUs that we bring down via NMI vector holded logbuf_lock
> d) Panic CPU try to hold logbuf_lock, then deadlock occurs.
>
> we try to re-init the logbuf_lock in printk_safe_flush_on_panic()
> to avoid deadlock, but it does not work here, because :
>
> Firstly, it is inappropriate to check num_online_cpus() here.
> When the CPU bring down via NMI vector, the panic CPU willn't
> wait too long for other cores to stop, so when this problem
> occurs, num_online_cpus() may be greater than 1.
>
> Secondly, printk_safe_flush_on_panic() is called after panic
> notifier callback, so if printk() is called in panic notifier
> callback, deadlock will still occurs. Eg, if ftrace_dump_on_oops
> is set, we print some debug information, it will try to hold the
> logbuf_lock.
>
> To avoid this deadlock, drop the num_online_cpus() check and call
> the printk_safe_flush_on_panic() before panic_notifier_list callback,
> attempt to re-init logbuf_lock from panic CPU.
We hopefully will get rid of some of these locks (around 5.9 kernel
maybe), so the deadlocks (at least in the printk-code) should become
less common.
-ss
On Wed 2020-06-03 14:19:15, Cheng Jian wrote:
> A deadlock caused by logbuf_lock occurs when panic:
>
> a) Panic CPU is running in non-NMI context
> b) Panic CPU sends out shutdown IPI via NMI vector
> c) One of the CPUs that we bring down via NMI vector holded logbuf_lock
> d) Panic CPU try to hold logbuf_lock, then deadlock occurs.
>
> we try to re-init the logbuf_lock in printk_safe_flush_on_panic()
> to avoid deadlock, but it does not work here, because :
>
> Firstly, it is inappropriate to check num_online_cpus() here.
> When the CPU bring down via NMI vector, the panic CPU willn't
> wait too long for other cores to stop, so when this problem
> occurs, num_online_cpus() may be greater than 1.
>
> Secondly, printk_safe_flush_on_panic() is called after panic
> notifier callback, so if printk() is called in panic notifier
> callback, deadlock will still occurs. Eg, if ftrace_dump_on_oops
> is set, we print some debug information, it will try to hold the
> logbuf_lock.
>
> To avoid this deadlock, drop the num_online_cpus() check and call
> the printk_safe_flush_on_panic() before panic_notifier_list callback,
> attempt to re-init logbuf_lock from panic CPU.
It might cause double unlock (deadlock) on architectures that did not
use NMI to stop the CPUs.
I have created a conservative fix for this problem for SLES, see
https://github.com/openSUSE/kernel-source/blob/SLE15-SP2-UPDATE/patches.suse/printk-panic-Avoid-deadlock-in-printk-after-stopping-CPUs-by-NMI.patch
It solves the problem only on x86 architecture.
There are many hacks that try to solve various scenarios but it
is getting too complicated and does not solve all problems.
The only real solution is lockless printk(). First piece is a lockless
ringbuffer. See the last version at
https://lore.kernel.org/r/[email protected]
We prefer to work on the lockless solution instead of adding more
complicated workarounds. This is why I even did not try to upstream
the patch for SLES.
In the meantime, you might also consider removing the offending
message from the panic notifier if it is not really important.
Best Regards,
Petr
Hi, Petr
On 2020/6/4 16:29, Petr Mladek wrote:
> It might cause double unlock (deadlock) on architectures that did not
> use NMI to stop the CPUs.
>
> I have created a conservative fix for this problem for SLES, see
> https://github.com/openSUSE/kernel-source/blob/SLE15-SP2-UPDATE/patches.suse/printk-panic-Avoid-deadlock-in-printk-after-stopping-CPUs-by-NMI.patch
> It solves the problem only on x86 architecture.
>
> There are many hacks that try to solve various scenarios but it
> is getting too complicated and does not solve all problems.
I have read your conservative fix and I have some question,
1-- does the console_sem need to be reinitialized ?
2-- Other architectures without NMI, is there no such problem ?
> The only real solution is lockless printk(). First piece is a lockless
> ringbuffer. See the last version at
> https://lore.kernel.org/r/[email protected]
>
> We prefer to work on the lockless solution instead of adding more
> complicated workarounds. This is why I even did not try to upstream
> the patch for SLES.
>
> In the meantime, you might also consider removing the offending
> message from the panic notifier if it is not really important.
>
> Best Regards,
> Petr
>
> .
Thank you.
Cheng Jian
On Fri 2020-06-05 18:42:57, chengjian (D) wrote:
> Hi, Petr
>
> On 2020/6/4 16:29, Petr Mladek wrote:
>
> > It might cause double unlock (deadlock) on architectures that did not
> > use NMI to stop the CPUs.
> >
> > I have created a conservative fix for this problem for SLES, see
> > https://github.com/openSUSE/kernel-source/blob/SLE15-SP2-UPDATE/patches.suse/printk-panic-Avoid-deadlock-in-printk-after-stopping-CPUs-by-NMI.patch
> > It solves the problem only on x86 architecture.
> >
> > There are many hacks that try to solve various scenarios but it
> > is getting too complicated and does not solve all problems.
>
> I have read your conservative fix and I have some question,
>
> 1-- does the console_sem need to be reinitialized ?
No, it is not needed:
+ printk() itself does try_lock() and skips console handling when the
semaphore is not available.
+ panic() tries to push the messages later in console_flush_on_panic().
It ignores the semaphore. Also most console drivers ignore their
internal locks because oops_in_progress is set by bust_spinlocks().
> 2-- Other architectures without NMI, is there no such problem ?
The situation is more complicated when NMI is not used. Non-stopped
CPUs are in unknown state, most likely in a busy loop. Nobody knows
whether printk() is repeatedly called in the loop. When it was called,
re-initializing any lock would cause double unlock and deadlock.
It would be possible to add some more hacks. One problem is that
there are two groups of users. One prefer to risk a deadlock and
have a chance to see the messages. Others prefer to always
reach emergency_restart() and reboot the machine.
This is one big motivation for working on lockless printk().
Best Regards,
Petr