When one CPU writes heavily to the kernel log (e.g. by writing to
/dev/kmsg in a loop) while another panics, there is currently a high
likelihood of a deadlock (see patch 2 for the full description of this
deadlock).
The principal fix is to disable the optimistic spin once panic_cpu is
set, so the panic CPU doesn't spin waiting for a halted CPU to hand over
the console_sem.
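Patch 1 adds a helper for detecting this state. A minimal sketch of it,
assuming the include/linux/panic.h placement shown in the diffstat
below:

static inline bool panic_in_progress(void)
{
	return unlikely(atomic_read(&panic_cpu) != PANIC_CPU_INVALID);
}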
However, this exposed us to a livelock situation, where the panic CPU
holds the console_sem while another CPU fills the log buffer faster than
the consoles can drain it, preventing the panic from progressing to the
point of halting the other CPUs. To avoid this, patch 3 adds a mechanism
that suppresses printk from non-panic CPUs during panic once a threshold
of dropped messages is reached.
A major goal with all of these patches is to decrease the likelihood
that another CPU is holding the console_sem when we halt it in panic().
This reduces the odds of needing to break locks and potentially
encountering further deadlocks with the console drivers.
To test, I use the following script, kmsg_panic.sh:
#!/bin/bash
date
# 991 chars (based on log buffer size):
chars="$(printf 'a%.0s' {1..991})"
while :; do
    echo $chars > /dev/kmsg
done &
echo c > /proc/sysrq-trigger &
date
exit
I defined a hang as any time the system did not reboot to a login prompt
on the serial console within 60 seconds. Here are the statistics on
hangs using this script, before and after the patches.
before: 776 hangs / 1484 trials - 52.3%
after : 0 hangs / 1238 trials - 0.0%
Stephen Brennan (4):
panic: Add panic_in_progress helper
printk: disable optimistic spin during panic
printk: Avoid livelock with heavy printk during panic
printk: Drop console_sem during panic
include/linux/panic.h | 3 +++
kernel/printk/printk.c | 31 ++++++++++++++++++++++++++++++-
2 files changed, 33 insertions(+), 1 deletion(-)
--
2.30.2
A CPU executing with console lock spinning enabled might be halted
during a panic. Before the panicking CPU calls console_flush_on_panic(),
it may call console_trylock(), which attempts to optimistically spin,
deadlocking the panic CPU:
CPU 0 (panic CPU)                 CPU 1
-----------------                 ------
                                  printk() {
                                    vprintk_func() {
                                      vprintk_default() {
                                        vprintk_emit() {
                                          console_unlock() {
                                            console_lock_spinning_enable();
                                            ... printing to console ...
panic() {
  crash_smp_send_stop() {
    NMI -------------------> HALT
  }
  atomic_notifier_call_chain() {
    printk() {
      ...
      console_trylock_spinning() {
        // optimistic spin infinitely
This hang during panic can be induced when a kdump kernel is loaded and
crash_kexec_post_notifiers=1 is present on the kernel command line. The
following script, which concurrently writes to /dev/kmsg and triggers a
panic, can result in this hang:
#!/bin/bash
date
# 991 chars (based on log buffer size):
chars="$(printf 'a%.0s' {1..991})"
while :; do
    echo $chars > /dev/kmsg
done &
echo c > /proc/sysrq-trigger &
date
exit
To avoid this deadlock, ensure that console_trylock_spinning() does not
allow spinning once a panic has begun.
Fixes: dbdda842fe96 ("printk: Add console owner and waiter logic to load balance console writes")
Suggested-by: Petr Mladek <[email protected]>
Signed-off-by: Stephen Brennan <[email protected]>
---
kernel/printk/printk.c | 10 ++++++++++
1 file changed, 10 insertions(+)
diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index 57b132b658e1..20b4b71a1a07 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -1843,6 +1843,16 @@ static int console_trylock_spinning(void)
 	if (console_trylock())
 		return 1;
 
+	/*
+	 * It's unsafe to spin once a panic has begun. If we are the
+	 * panic CPU, we may have already halted the owner of the
+	 * console_sem. If we are not the panic CPU, then we should
+	 * avoid taking console_sem, so the panic CPU has a better
+	 * chance of cleanly acquiring it later.
+	 */
+	if (panic_in_progress())
+		return 0;
+
 	printk_safe_enter_irqsave(flags);
 
 	raw_spin_lock(&console_owner_lock);
--
2.30.2
During a panic(), if another CPU is writing heavily to the kernel log
(e.g. via /dev/kmsg), then the panic CPU may livelock writing out its
messages to the console. Note when too many messages are dropped during
panic and suppress further printk, except from the panic CPU.
Suggested-by: Petr Mladek <[email protected]>
Signed-off-by: Stephen Brennan <[email protected]>
---
kernel/printk/printk.c | 13 +++++++++++++
1 file changed, 13 insertions(+)
diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index 20b4b71a1a07..ca253ac07615 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -93,6 +93,12 @@ EXPORT_SYMBOL_GPL(console_drivers);
  */
 int __read_mostly suppress_printk;
 
+/*
+ * During panic, heavy printk by other CPUs can delay the
+ * panic and risk deadlock on console resources.
+ */
+int __read_mostly suppress_panic_printk;
+
 #ifdef CONFIG_LOCKDEP
 static struct lockdep_map console_lock_dep_map = {
 	.name = "console_lock"
@@ -2228,6 +2234,10 @@ asmlinkage int vprintk_emit(int facility, int level,
 	if (unlikely(suppress_printk))
 		return 0;
 
+	if (unlikely(suppress_panic_printk) &&
+	    atomic_read(&panic_cpu) != raw_smp_processor_id())
+		return 0;
+
 	if (level == LOGLEVEL_SCHED) {
 		level = LOGLEVEL_DEFAULT;
 		in_sched = true;
@@ -2613,6 +2623,7 @@ void console_unlock(void)
 {
 	static char ext_text[CONSOLE_EXT_LOG_MAX];
 	static char text[CONSOLE_LOG_MAX];
+	static int panic_console_dropped;
 	unsigned long flags;
 	bool do_cond_resched, retry;
 	struct printk_info info;
@@ -2667,6 +2678,8 @@ void console_unlock(void)
 		if (console_seq != r.info->seq) {
 			console_dropped += r.info->seq - console_seq;
 			console_seq = r.info->seq;
+			if (panic_in_progress() && panic_console_dropped++ > 10)
+				suppress_panic_printk = 1;
 		}
 
 		if (suppress_message_printing(r.info->level)) {
--
2.30.2
On Fri 2022-01-21 11:02:20, Stephen Brennan wrote:
> A CPU executing with console lock spinning enabled might be halted
> during a panic. Before the panicking CPU calls console_flush_on_panic(),
> it may call console_trylock(), which attempts to optimistically spin,
> deadlocking the panic CPU:
>
> CPU 0 (panic CPU)                 CPU 1
> -----------------                 ------
>                                   printk() {
>                                     vprintk_func() {
>                                       vprintk_default() {
>                                         vprintk_emit() {
>                                           console_unlock() {
>                                             console_lock_spinning_enable();
>                                             ... printing to console ...
> panic() {
>   crash_smp_send_stop() {
>     NMI -------------------> HALT
>   }
>   atomic_notifier_call_chain() {
>     printk() {
>       ...
>       console_trylock_spinning() {
>         // optimistic spin infinitely
>
> This hang during panic can be induced when a kdump kernel is loaded and
> crash_kexec_post_notifiers=1 is present on the kernel command line. The
> following script, which concurrently writes to /dev/kmsg and triggers a
> panic, can result in this hang:
>
> #!/bin/bash
> date
> # 991 chars (based on log buffer size):
> chars="$(printf 'a%.0s' {1..991})"
> while :; do
>     echo $chars > /dev/kmsg
> done &
> echo c > /proc/sysrq-trigger &
> date
> exit
>
> To avoid this deadlock, ensure that console_trylock_spinning() does not
> allow spinning once a panic has begun.
>
> Fixes: dbdda842fe96 ("printk: Add console owner and waiter logic to load balance console writes")
>
> Suggested-by: Petr Mladek <[email protected]>
> Signed-off-by: Stephen Brennan <[email protected]>
Looks good to me:
Reviewed-by: Petr Mladek <[email protected]>
Best Regards,
Petr
On Fri 2022-01-21 11:02:21, Stephen Brennan wrote:
> During a panic(), if another CPU is writing heavily to the kernel log
> (e.g. via /dev/kmsg), then the panic CPU may livelock writing out its
> messages to the console. Note when too many messages are dropped during
> panic and suppress further printk, except from the panic CPU.
I would add something like:
"It might cause that some useful messages are lost. But messages are
being lost anyway and this at least avoids the possible livelock."
> ---
> kernel/printk/printk.c | 13 +++++++++++++
> 1 file changed, 13 insertions(+)
>
> diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
> index 20b4b71a1a07..ca253ac07615 100644
> --- a/kernel/printk/printk.c
> +++ b/kernel/printk/printk.c
> @@ -2667,6 +2678,8 @@ void console_unlock(void)
>  		if (console_seq != r.info->seq) {
>  			console_dropped += r.info->seq - console_seq;
>  			console_seq = r.info->seq;
> +			if (panic_in_progress() && panic_console_dropped++ > 10)
> +				suppress_panic_printk = 1;
It would be great to let the user know, something like:
pr_warn("Too many dropped messages. Suppress messages on non-panic CPUs to prevent livelock.\n");
>  		}
> 
>  		if (suppress_message_printing(r.info->level)) {
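I.e. roughly (a sketch combining the above, not tested):

	if (console_seq != r.info->seq) {
		console_dropped += r.info->seq - console_seq;
		console_seq = r.info->seq;
		if (panic_in_progress() && panic_console_dropped++ > 10) {
			suppress_panic_printk = 1;
			pr_warn("Too many dropped messages. Suppress messages on non-panic CPUs to prevent livelock.\n");
		}
	}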
Otherwise, it looks good to me.
Best Regards,
Petr
On (22/01/21 11:02), Stephen Brennan wrote:
> A CPU executing with console lock spinning enabled might be halted
> during a panic. Before the panicking CPU calls console_flush_on_panic(),
> it may call console_trylock(), which attempts to optimistically spin,
> deadlocking the panic CPU:
>
> CPU 0 (panic CPU)                 CPU 1
> -----------------                 ------
>                                   printk() {
>                                     vprintk_func() {
>                                       vprintk_default() {
>                                         vprintk_emit() {
>                                           console_unlock() {
>                                             console_lock_spinning_enable();
>                                             ... printing to console ...
> panic() {
>   crash_smp_send_stop() {
>     NMI -------------------> HALT
>   }
>   atomic_notifier_call_chain() {
>     printk() {
>       ...
>       console_trylock_spinning() {
>         // optimistic spin infinitely
[..]
> +++ b/kernel/printk/printk.c
> @@ -1843,6 +1843,16 @@ static int console_trylock_spinning(void)
>  	if (console_trylock())
>  		return 1;
> 
> +	/*
> +	 * It's unsafe to spin once a panic has begun. If we are the
> +	 * panic CPU, we may have already halted the owner of the
> +	 * console_sem. If we are not the panic CPU, then we should
> +	 * avoid taking console_sem, so the panic CPU has a better
> +	 * chance of cleanly acquiring it later.
> +	 */
> +	if (panic_in_progress())
> +		return 0;
Is there something that prevents the panic CPU from NMI-halting a CPU
which is in console_trylock() under raw_spin_lock_irqsave()?

CPU0                                      CPU1
console_trylock_spinning()
  console_trylock()
    down_trylock()
      raw_spin_lock_irqsave(&sem->lock)
                                          panic()
                                            crash_smp_send_stop()
                                              NMI -> HALT
> Is there something that prevents the panic CPU from NMI-halting a CPU
> which is in console_trylock() under raw_spin_lock_irqsave()?
>
> CPU0                                      CPU1
> console_trylock_spinning()
>   console_trylock()
>     down_trylock()
>       raw_spin_lock_irqsave(&sem->lock)
>                                           panic()
>                                             crash_smp_send_stop()
>                                               NMI -> HALT
This is a good point. I wonder if console_flush_on_panic() should
perform a sema_init() before it does console_trylock().
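Something like this, just to illustrate the idea (a sketch, not a
proposed patch):

void console_flush_on_panic(enum con_flush_mode mode)
{
	/*
	 * Hypothetical: throw away console_sem state before trying to
	 * take it, in case a halted CPU was interrupted inside
	 * down_trylock() while holding the semaphore's internal
	 * spinlock. Note this also discards any legitimate owner.
	 */
	sema_init(&console_sem, 1);

	console_trylock();
	console_may_schedule = 0;
	/* ... existing flush logic ... */
	console_unlock();
}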
John
On (22/01/26 10:51), John Ogness wrote:
> > Is there something that prevents the panic CPU from NMI-halting a CPU
> > which is in console_trylock() under raw_spin_lock_irqsave()?
> >
> > CPU0                                      CPU1
> > console_trylock_spinning()
> >   console_trylock()
> >     down_trylock()
> >       raw_spin_lock_irqsave(&sem->lock)
> >                                           panic()
> >                                             crash_smp_send_stop()
> >                                               NMI -> HALT
>
> This is a good point. I wonder if console_flush_on_panic() should
> perform a sema_init() before it does console_trylock().
A long time ago there was a zap_locks() function in printk that used
to re-init the console semaphore and the logbuf spin_lock, but _only_
in case of printk recursion (which was never reliably detected):

https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/kernel/printk/printk.c?h=v4.9.297#n1557

This has been superseded by the printk_safe per-CPU buffers, so we
removed that function.

So it could be that we want to introduce something similar to
zap_locks() again.
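For reference, it looked roughly like this (per the v4.9 source linked
above):

static void zap_locks(void)
{
	static unsigned long oops_timestamp;

	/* Only zap at most once every 30 seconds. */
	if (time_after_eq(jiffies, oops_timestamp) &&
	    !time_after(jiffies, oops_timestamp + 30 * HZ))
		return;

	oops_timestamp = jiffies;

	debug_locks_off();
	/* If a crash is occurring, make sure we can't deadlock. */
	raw_spin_lock_init(&logbuf_lock);
	/* And make sure that we print immediately. */
	sema_init(&console_sem, 1);
}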
All reasonable serial console drivers should take oops_in_progress into
consideration in ->write(), so we probably don't need to worry about
console drivers' spinlocks, etc., but we can potentially do a bit
better on the printk side.
Sergey Senozhatsky <[email protected]> writes:
> On (22/01/26 10:51), John Ogness wrote:
>> > Is there something that prevents the panic CPU from NMI-halting a CPU
>> > which is in console_trylock() under raw_spin_lock_irqsave()?
>> >
>> > CPU0                                      CPU1
>> > console_trylock_spinning()
>> >   console_trylock()
>> >     down_trylock()
>> >       raw_spin_lock_irqsave(&sem->lock)
>> >                                           panic()
>> >                                             crash_smp_send_stop()
>> >                                               NMI -> HALT
>>
>> This is a good point. I wonder if console_flush_on_panic() should
>> perform a sema_init() before it does console_trylock().
>
> A long time ago there was a zap_locks() function in printk that used
> to re-init the console semaphore and the logbuf spin_lock, but _only_
> in case of printk recursion (which was never reliably detected):
>
> https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/kernel/printk/printk.c?h=v4.9.297#n1557
>
> This has been superseded by the printk_safe per-CPU buffers, so we
> removed that function.
>
> So it could be that we want to introduce something similar to
> zap_locks() again.
>
> All reasonable serial console drivers should take oops_in_progress into
> consideration in ->write(), so we probably don't need to worry about
> console drivers' spinlocks, etc., but we can potentially do a bit
> better on the printk side.
I see the concern here. If a CPU is halted while holding
console_sem.lock spinlock, then the very next printk would hang, since
each vprintk_emit() does a trylock.
Now in my thousands of iterations of tests, I haven't been lucky enough
to interrupt a CPU in the middle of this critical section. The critical
section itself is incredibly short and so it's hard to do it. Not
impossible, I'd imagine.
We can't fix it in console_flush_on_panic(), because that is called much
later, after we've called the panic notifiers, which definitely
printk(). If we wanted to re-initialize the console_sem, we'd want it
done earlier in panic(), directly after the NMI was sent.
My understanding was that we can't be too cautious regarding the console
drivers. Sure, they _shouldn't_ have any race conditions, but once we're
in panic we're better off avoiding the console drivers unless it's our
last choice. So, is it worth re-initializing the console_sem early in
panic, which forces all the subsequent printk to go out to the consoles?
I don't know.
One alternative is to do __printk_safe_enter() at the beginning of
panic. This effectively guarantees that no printk will hit the console
drivers or even attempt to grab the console_sem. Then, we can do the
kmsg_dump, do a crash_kexec if configured, and only when all options
have been exhausted would we reinitialize the console_sem and flush to
the console. Maybe this is too cautious, but it is an alternative.
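Roughly (a hypothetical sketch of the ordering in panic(), not a tested
patch):

void panic(const char *fmt, ...)
{
	/* ... set panic_cpu, print the panic message, send the NMI ... */

	/* Defer consoles: printk only stores records from here on. */
	__printk_safe_enter();

	/* ... kmsg_dump(), crash_kexec(), panic notifiers ... */

	/* Last resort: reinit console_sem and flush to the consoles. */
	sema_init(&console_sem, 1);
	console_flush_on_panic(CONSOLE_FLUSH_PENDING);

	/* ... */
}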
Stephen
On (22/01/26 10:15), Stephen Brennan wrote:
[..]
> > On (22/01/26 10:51), John Ogness wrote:
> >> > Is there something that prevents the panic CPU from NMI-halting a CPU
> >> > which is in console_trylock() under raw_spin_lock_irqsave()?
> >> >
> >> > CPU0                                      CPU1
> >> > console_trylock_spinning()
> >> >   console_trylock()
> >> >     down_trylock()
> >> >       raw_spin_lock_irqsave(&sem->lock)
> >> >                                           panic()
> >> >                                             crash_smp_send_stop()
> >> >                                               NMI -> HALT
> >>
> >> This is a good point. I wonder if console_flush_on_panic() should
> >> perform a sema_init() before it does console_trylock().
> >
> > A long time ago there was a zap_locks() function in printk that used
> > to re-init the console semaphore and the logbuf spin_lock, but _only_
> > in case of printk recursion (which was never reliably detected):
> >
> > https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/kernel/printk/printk.c?h=v4.9.297#n1557
> >
> > This has been superseded by the printk_safe per-CPU buffers, so we
> > removed that function.
> >
> > So it could be that we want to introduce something similar to
> > zap_locks() again.
> >
> > All reasonable serial console drivers should take oops_in_progress into
> > consideration in ->write(), so we probably don't need to worry about
> > console drivers' spinlocks, etc., but we can potentially do a bit
> > better on the printk side.
>
> I see the concern here. If a CPU is halted while holding
> console_sem.lock spinlock, then the very next printk would hang, since
> each vprintk_emit() does a trylock.
Right. So I also thought about placing panic_in_progress() somewhere in
console_trylock() and making it fail for any CPU that is not the panic CPU.
> Now in my thousands of iterations of tests, I haven't been lucky enough
> to interrupt a CPU in the middle of this critical section. The critical
> section itself is incredibly short and so it's hard to do it. Not
> impossible, I'd imagine.
I can imagine that the race window is really small, and I'm not insisting
on fixing it right now (or ever for that matter).
Basically, we now have two different "something bad is in progress"
that affect two different ends of the call stack. bust_spinlocks()
sets oops_in_progress and affects console drivers' spinlocks, but has
no meaning to any other printk locks. And then we have panic_in_progress()
which is meaningful to some printk locks, but not to all of them, and is
meaningless to console drivers, because those look at oops_in_progress.
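For reference, bust_spinlocks() is little more than a toggle of
oops_in_progress (lib/bust_spinlocks.c, roughly):

void bust_spinlocks(int yes)
{
	if (yes) {
		++oops_in_progress;
	} else {
#ifdef CONFIG_VT
		unblank_screen();
#endif
		console_unblank();
		if (--oops_in_progress == 0)
			wake_up_klogd();
	}
}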
If printk folks are fine with that then I'm also fine.
> We can't fix it in console_flush_on_panic(), because that is called much
> later, after we've called the panic notifiers, which definitely
> printk(). If we wanted to re-initialize the console_sem, we'd want it
> done earlier in panic(), directly after the NMI was sent.
Right.
> My understanding was that we can't be too cautious regarding the console
> drivers. Sure, they _shouldn't_ have any race conditions, but once we're
> in panic we're better off avoiding the console drivers unless it's our
> last choice. So, is it worth re-initializing the console_sem early in
> panic, which forces all the subsequent printk to go out to the consoles?
> I don't know.
>
> One alternative is to do __printk_safe_enter() at the beginning of
> panic. This effectively guarantees that no printk will hit the console
> drivers or even attempt to grab the console_sem. Then, we can do the
> kmsg_dump, do a crash_kexec if configured, and only when all options
> have been exhausted would we reinitialize the console_sem and flush to
> the console. Maybe this is too cautious, but it is an alternative.
Back in the day we also had this idea of "detaching" non-panic CPUs from
printk() by overwriting their printk function pointers.
On 2022-01-27, Sergey Senozhatsky <[email protected]> wrote:
> Right. So I also thought about placing panic_in_progress() somewhere in
> console_trylock() and making it fail for any CPU that is not the panic
> CPU.
I think this is a good idea and console_trylock() is the correct place
for that.
> Back in the day we also had this idea of "detaching" non-panic CPUs from
> printk() by overwriting their printk function pointers.
We need to keep in mind that printk() is no longer the problem. The
records are stored locklessly. The problem is the
console_trylock()/console_unlock() within vprintk_emit(). IMHO adding a
panic check in console_trylock() should solve that race sufficiently.
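Something like this, I imagine (a sketch, not a patch):

int console_trylock(void)
{
	/* Keep non-panic CPUs away from console_sem once panic begins. */
	if (panic_in_progress() &&
	    atomic_read(&panic_cpu) != raw_smp_processor_id())
		return 0;

	if (down_trylock_console_sem())
		return 0;
	console_locked = 1;
	console_may_schedule = 0;
	return 1;
}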
John
On Thu 2022-01-27 16:11:08, Sergey Senozhatsky wrote:
> On (22/01/26 10:15), Stephen Brennan wrote:
> [..]
> > > On (22/01/26 10:51), John Ogness wrote:
> > >> > Is there something that prevents the panic CPU from NMI-halting a CPU
> > >> > which is in console_trylock() under raw_spin_lock_irqsave()?
> > >> >
> > >> > CPU0                                      CPU1
> > >> > console_trylock_spinning()
> > >> >   console_trylock()
> > >> >     down_trylock()
> > >> >       raw_spin_lock_irqsave(&sem->lock)
> > >> >                                           panic()
> > >> >                                             crash_smp_send_stop()
> > >> >                                               NMI -> HALT
> > >>
> > >> This is a good point. I wonder if console_flush_on_panic() should
> > >> perform a sema_init() before it does console_trylock().
> > >
> > > A long time ago there was a zap_locks() function in printk that used
> > > to re-init the console semaphore and the logbuf spin_lock, but _only_
> > > in case of printk recursion (which was never reliably detected):
> > >
> > > https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/kernel/printk/printk.c?h=v4.9.297#n1557
> > >
> > > This has been superseded by the printk_safe per-CPU buffers, so we
> > > removed that function.
> > >
> > > So it could be that we want to introduce something similar to
> > > zap_locks() again.
> > >
> > > All reasonable serial console drivers should take oops_in_progress into
> > > consideration in ->write(), so we probably don't need to worry about
> > > console drivers' spinlocks, etc., but we can potentially do a bit
> > > better on the printk side.
> >
> > I see the concern here. If a CPU is halted while holding
> > console_sem.lock spinlock, then the very next printk would hang, since
> > each vprintk_emit() does a trylock.
>
> Right. So I also thought about placing panic_in_progress() somewhere in
> console_trylock() and making it fail for any CPU that is not the panic CPU.
>
> > Now in my thousands of iterations of tests, I haven't been lucky enough
> > to interrupt a CPU in the middle of this critical section. The critical
> > section itself is incredibly short and so it's hard to do it. Not
> > impossible, I'd imagine.
>
> I can imagine that the race window is really small, and I'm not insisting
> on fixing it right now (or ever for that matter).
>
> Basically, we now have two different "something bad is in progress"
> that affect two different ends of the call stack. bust_spinlocks()
> sets oops_in_progress and affects console drivers' spinlocks, but has
> no meaning to any other printk locks. And then we have panic_in_progress()
> which is meaningful to some printk locks, but not to all of them, and is
> meaningless to console drivers, because those look at oops_in_progress.
Good point! It looks a bit inconsistent and I am sure that we could
do better.
Well, my view is that there are the following contexts:

1. oops_in_progress is used when printing an Oops report and in panic().
   The important thing is that the system might continue working after
   an Oops when "panic_on_oops" is not set.

   Many console drivers allow entering the locked section even when the
   lock is already taken; they only prevent a double unlock (see the
   ->write() sketch below this list).

   The aim is to show the Oops messages on the console because the
   system might silently die otherwise.
2. The new panic_in_progress() context says that the system is really
   dying.

   The system should try hard to show the messages on the console, but
   still in a safe way. It should not risk breaking the other ways to
   store debugging information: kdump, kmsg_dump, and panic notifiers.

   I mention notifiers because some of them inform the hypervisor about
   the panic. The hypervisor could get some debugging information as
   well.
3. console_flush_on_panic() is the last attempt to show the messages
   on the console.

   It is done after kdump, kmsg_dump, and the notifiers. Consoles are
   the only concern here, so it could be more aggressive and risk more.
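Regarding the drivers in 1.: the usual pattern in a serial console's
->write() callback looks roughly like this (a sketch based on e.g. the
8250 driver, not verbatim):

	if (port->sysrq)
		locked = 0;
	else if (oops_in_progress)
		locked = spin_trylock_irqsave(&port->lock, flags);
	else
		spin_lock_irqsave(&port->lock, flags);

	/* ... emit the characters ... */

	if (locked)
		spin_unlock_irqrestore(&port->lock, flags);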
Honestly, I never thought much about the oops_in_progress context. All
the past discussions were focused on panic.

Also, we focused on lowering the risk by introducing lockless algorithms
or by fixing bugs. I can't remember ever moving a risky operation
earlier. And we were reluctant to change the risk level (algorithm) when
it was not obvious, or at least promising, that it would make things
better.
> If printk folks are fine with that then I'm also fine.
>
> > We can't fix it in console_flush_on_panic(), because that is called much
> > later, after we've called the panic notifiers, which definitely
> > printk(). If we wanted to re-initialize the console_sem, we'd want it
> > done earlier in panic(), directly after the NMI was sent.
I am not sure it is worth the risk. You want to reinitialize the
semaphore because of a small race window in the internal spin lock. But
doing so would allow entering a lot of code that is guarded by
console_sem.

I mean, the chance of a deadlock caused by the internal semaphore spin
lock is super small. In comparison, a lot of tricky code is guarded by
console_sem. It looks like a big risk to ignore the semaphore early in
panic().
A better solution would be to use raw_spin_trylock_irqsave() in
down_trylock().
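I mean something like this (an untested sketch of down_trylock() in
kernel/locking/semaphore.c):

int down_trylock(struct semaphore *sem)
{
	unsigned long flags;
	int count;

	/* Treat contention on the internal lock as "semaphore busy". */
	if (!raw_spin_trylock_irqsave(&sem->lock, flags))
		return 1;

	count = sem->count - 1;
	if (count >= 0)
		sem->count = count;
	raw_spin_unlock_irqrestore(&sem->lock, flags);

	return (count < 0);
}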
Best Regards,
Petr
On 2022-01-27, Petr Mladek <[email protected]> wrote:
> I mean, the chance of a deadlock caused by the internal semaphore spin
> lock is super small. In comparison, a lot of tricky code is guarded by
> console_sem. It looks like a big risk to ignore the semaphore early in
> panic().
Agreed.
> A better solution would be to use raw_spin_trylock_irqsave() in
> down_trylock().
down_trylock() is attempting to decrement a semaphore. It should not
fail just because another CPU is also in the process of
decrementing/incrementing the semaphore.
Maybe a down_trylock_cond() could be introduced where the trylock could
fail if a given condition is not met. The function would need to
implement its own internal trylock spin loop to check the condition. But
then we could pass in a condition for it to abort. For example, when in
panic and we are not the panic CPU.
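For example (hypothetical; down_trylock_cond() does not exist today):

int down_trylock_cond(struct semaphore *sem, bool (*cond)(void))
{
	unsigned long flags;
	int count;

	/* Spin on the internal lock, but give up if the condition fails. */
	while (!raw_spin_trylock_irqsave(&sem->lock, flags)) {
		if (!cond())
			return 1;	/* report "not acquired" */
		cpu_relax();
	}

	count = sem->count - 1;
	if (count >= 0)
		sem->count = count;
	raw_spin_unlock_irqrestore(&sem->lock, flags);

	return (count < 0);
}

The printk code could then pass a condition that returns false when a
panic is in progress and we are not the panic CPU.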
John
On Thu 2022-01-27 13:49:44, John Ogness wrote:
> On 2022-01-27, Petr Mladek <[email protected]> wrote:
> > I mean, the chance of a deadlock caused by the internal semaphore spin
> > lock is super small. In comparison, a lot of tricky code is guarded by
> > console_sem. It looks like a big risk to ignore the semaphore early in
> > panic().
>
> Agreed.
>
> > A better solution would be to use raw_spin_trylock_irqsave() in
> > down_trylock().
>
> down_trylock() is attempting to decrement a semaphore. It should not
> fail just because another CPU is also in the process of
> decrementing/incrementing the semaphore.
IMHO, it does not matter. As you say, raw_spin_trylock_irqsave() fails
only when another process is about to release or take the semaphore.
The semaphore is usually taken for a long time. The tiny window when
the counter is manipulated is negligible.
I mean, if down_trylock() fails because of a raw_spin_trylock_irqsave()
failure, then it was only a few instructions away from failing even with
the lock.
> Maybe a down_trylock_cond() could be introduced where the trylock could
> fail if a given condition is not met. The function would need to
> implement its own internal trylock spin loop to check the condition. But
> then we could pass in a condition for it to abort. For example, when in
> panic and we are not the panic CPU.
This looks too complicated.

Another solution would be to introduce a panic_down_trylock() variant
of down_trylock() that would use raw_spin_trylock_irqsave(). The normal
down_trylock() would still use raw_spin_lock_irqsave().
Well, this should get discussed with the locking people.
Best Regards,
Petr