Hi,
This is v2 of a series to introduce the new non-BKL (nbcon)
consoles. This series is only a subset of the original
v1 [0]. In particular, this series represents patches 5-10 of
the v1 series. For information about the motivation of the
atomic consoles, please read the cover letter of v1.
This series focuses on wiring up the printk subsystem to
be able to use the nbcon consoles and implement their ownership
interfaces and rules. This series does _not_ include threaded
printing, atomic printing regions, or nbcon drivers. Those
features will be added in separate follow-up series.
There is not much that has _not_ changed since v1. Here is an
attempt to list the changes:
- new naming:
OLD NEW
bkl legacy
nobkl nbcon
CON_NO_BKL CON_NBCON
cons_() nbcon_()
- rather than allocating context objects per-cpu, per-prio, and
per-console, require the context object to sit on the stack
- serialize nbcon consoles with the console_lock until there
are no more boot consoles registered
- update @have_boot_console and @have_legacy_console on
unregister_console()
- only use @nbcon_seq for the nbcon sequence counter
- avoid console lock in __pr_flush() if there are only nbcon
consoles
- use only 1 state variable instead of CUR and REQ states
- replace saved states in the context with boolean flags
- use atomic long for nbcon_seq, expanded as needed on 32bit
systems
- instead of the owner performing the handover, now the owner
gives up ownership and the waiter takes ownership
- remove unnecessary state and context fields
- simplify sequence tracking by only allowing incrementing
(positive) updates
- simplify buffer handling by only allowing hostile takeovers
in the single panic context
- remove early buffer handling because there is no early window
- carefully consider individual state bits rather than
performing general set compares
- split the code for various locking strategies based on
complete methods rather than functional pieces
John Ogness
[0] https://lore.kernel.org/lkml/[email protected]
John Ogness (1):
printk: Provide debug_store() for nbcon debugging
Thomas Gleixner (7):
printk: Add non-BKL (nbcon) console basic infrastructure
printk: nbcon: Add acquire/release logic
printk: nbcon: Add buffer management
printk: nbcon: Add sequence handling
printk: nbcon: Add ownership state functions
printk: nbcon: Add emit function and callback function for atomic
printing
printk: nbcon: Add functions for drivers to mark unsafe regions
include/linux/console.h | 132 +++++
kernel/printk/Makefile | 2 +-
kernel/printk/internal.h | 29 ++
kernel/printk/printk.c | 156 ++++--
kernel/printk/printk_nbcon.c | 955 +++++++++++++++++++++++++++++++++++
5 files changed, 1243 insertions(+), 31 deletions(-)
create mode 100644 kernel/printk/printk_nbcon.c
base-commit: 132a90d1527fedba2d95085c951ccf00dbbebe41
--
2.39.2
From: Thomas Gleixner <[email protected]>
Add per console acquire/release functionality. The console 'locked'
state is a combination of multiple state fields:
- The 'prio' field contains the severity of the context that owns
the console. This field is used for decisions whether to attempt
friendly handovers and also prevents takeovers from a less
severe context, e.g. to protect the panic CPU. A value of 0
(NBCON_PRIO_NONE) means the console is not locked.
- The 'cpu' field denotes on which CPU the console is locked.
The acquire mechanism comes with several flavours:
- Straightforward acquire when the console is not contended.
- Friendly handover mechanism based on a request/grant handshake.
The requesting context:
1) Sets its priority into the 'req_prio' field.
2) Waits (with a timeout) for the owning context to unlock the
console.
3) Sets the 'prio' field and clears the 'req_prio' field.
The owning context:
1) Observes the 'req_prio' field set.
2) Gives up console ownership by clearing the 'prio' field.
- Hostile takeover
The new owner takes the console over without 'req_prio'
handshake.
This is required when friendly handovers are not possible,
i.e. the higher priority context interrupted the owning
context on the same CPU or the owning context is not able
to make progress on a remote CPU.
The release is the counterpart, which either releases the console
directly or gracefully hands it over to a requester.
All operations on console::nbcon_state are atomic cmpxchg based to
handle concurrency.
The acquire/release functions implement only minimal policies:
- Preference for higher priority contexts.
- Protection of the panic CPU.
All other policy decisions have to be made at the call sites:
- What is marked as an unsafe section.
- Whether to spinwait if there is already an owner.
- Whether to attempt a hostile takeover when safe.
- Whether to attempt a hostile takeover when unsafe.
The design allows implementing the well-known:
acquire()
output_one_line()
release()
algorithm, but also allows avoiding the per-line acquire/release,
e.g. in panic situations, by doing the acquire once and then relying
on the panic CPU protection for the rest.
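The cmpxchg-based locked state described above can be sketched as a small
user-space model. This is not the kernel code: it is an illustrative toy
using C11 atomics, with the field names taken from the patch but the
widths and helper names chosen here for demonstration only.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Toy model of the nbcon state word: a packed struct mirrored by an
 * integer, manipulated only via compare-and-exchange. */
enum prio { PRIO_NONE, PRIO_NORMAL, PRIO_EMERGENCY, PRIO_PANIC };

struct state {
	union {
		unsigned int atom;
		struct {
			unsigned int prio     : 2;
			unsigned int req_prio : 2;
			unsigned int unsafe   : 1;
			unsigned int cpu      : 24;
		};
	};
};

static _Atomic unsigned int console_state;

/* Uncontended direct acquire: only succeeds if the console is unowned
 * and not marked unsafe (cf. nbcon_context_try_acquire_direct()). */
static bool acquire_direct(unsigned int cpu, enum prio prio)
{
	struct state cur = { .atom = atomic_load(&console_state) };
	struct state new;

	if (cur.prio != PRIO_NONE || cur.unsafe)
		return false;

	new = cur;
	new.prio = prio;
	new.cpu = cpu;
	return atomic_compare_exchange_strong(&console_state,
					      &cur.atom, new.atom);
}

/* Release: clear ownership only if this cpu/prio still owns it. */
static bool release(unsigned int cpu, enum prio prio)
{
	struct state cur = { .atom = atomic_load(&console_state) };
	struct state new;

	if (cur.prio != prio || cur.cpu != cpu)
		return false;	/* ownership was already lost */

	new = cur;
	new.prio = PRIO_NONE;
	return atomic_compare_exchange_strong(&console_state,
					      &cur.atom, new.atom);
}
```

The point of the model is that every transition is a single atomic
compare-and-exchange on the whole state word, so an owner that lost the
console mid-operation simply fails its next cmpxchg instead of corrupting
the state.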
Co-developed-by: John Ogness <[email protected]>
Signed-off-by: John Ogness <[email protected]>
Signed-off-by: Thomas Gleixner (Intel) <[email protected]>
---
include/linux/console.h | 59 +++++
kernel/printk/printk_nbcon.c | 447 +++++++++++++++++++++++++++++++++++
2 files changed, 506 insertions(+)
diff --git a/include/linux/console.h b/include/linux/console.h
index c99265d82b98..e06cd1ce3e82 100644
--- a/include/linux/console.h
+++ b/include/linux/console.h
@@ -175,13 +175,28 @@ enum cons_flags {
* struct nbcon_state - console state for nbcon consoles
* @atom: Compound of the state fields for atomic operations
*
+ * @req_prio: The priority of a handover request
+ * @prio: The priority of the current usage
+ * @unsafe: Console is busy in a non-takeover region
+ * @hostile_unsafe: The @unsafe value before a hostile takeover
+ * @cpu: The CPU on which the owner runs
+ *
* To be used for reading and preparing of the value stored in the nbcon
* state variable @console.nbcon_state.
+ *
+ * The @prio and @req_prio fields are particularly important to allow
+ * spin-waiting to time out and give up without the risk of a waiter being
+ * assigned the lock after giving up.
*/
struct nbcon_state {
union {
unsigned int atom;
struct {
+ unsigned int prio : 2;
+ unsigned int req_prio : 2;
+ unsigned int unsafe : 1;
+ unsigned int hostile_unsafe : 1;
+ unsigned int cpu : 24;
};
};
};
@@ -194,6 +209,50 @@ struct nbcon_state {
*/
static_assert(sizeof(struct nbcon_state) <= sizeof(int));
+/**
+ * nbcon_prio - console owner priority for nbcon consoles
+ * @NBCON_PRIO_NONE: Unused
+ * @NBCON_PRIO_NORMAL: Normal (non-emergency) usage
+ * @NBCON_PRIO_EMERGENCY: Emergency output (WARN/OOPS...)
+ * @NBCON_PRIO_PANIC: Panic output
+ * @NBCON_PRIO_MAX: The number of priority levels
+ *
+ * A context wanting to produce emergency output can carefully take over
+ * the console, even without the consent of the owner. Ideally such a
+ * takeover only occurs when @nbcon_state::unsafe is not set. However, a
+ * context wanting to produce panic output can ignore the unsafe flag as a
+ * last resort. If panic output is active, no takeover is possible until
+ * the panic output releases the console.
+ */
+enum nbcon_prio {
+ NBCON_PRIO_NONE = 0,
+ NBCON_PRIO_NORMAL,
+ NBCON_PRIO_EMERGENCY,
+ NBCON_PRIO_PANIC,
+ NBCON_PRIO_MAX,
+};
+
+struct console;
+
+/**
+ * struct nbcon_context - Context for console acquire/release
+ * @console: The associated console
+ * @spinwait_max_us: Limit for spinwait acquire
+ * @prio: Priority of the context
+ * @unsafe: This context is in an unsafe section
+ * @hostile: Acquire console by hostile takeover
+ * @takeover_unsafe: Acquire console by hostile takeover even if unsafe
+ */
+struct nbcon_context {
+ /* members set by caller */
+ struct console *console;
+ unsigned int spinwait_max_us;
+ enum nbcon_prio prio;
+ unsigned int unsafe : 1;
+ unsigned int hostile : 1;
+ unsigned int takeover_unsafe : 1;
+};
+
/**
* struct console - The console descriptor structure
* @name: The name of the console driver
diff --git a/kernel/printk/printk_nbcon.c b/kernel/printk/printk_nbcon.c
index f9462b088439..b0acde0cb949 100644
--- a/kernel/printk/printk_nbcon.c
+++ b/kernel/printk/printk_nbcon.c
@@ -4,10 +4,38 @@
#include <linux/kernel.h>
#include <linux/console.h>
+#include <linux/delay.h>
#include "internal.h"
/*
* Printk console printing implementation for consoles that do not depend on
* the legacy style console_lock mechanism.
+ *
+ * Console is locked on a CPU when @nbcon_state::prio is set and
+ * @nbcon_state::cpu == current CPU. This is valid for the current execution
+ * context.
+ *
+ * Nesting execution contexts on the same CPU can carefully take over if
+ * the driver allows reentrancy via nbcon_state::unsafe = false. When the
+ * interrupted context resumes it checks the state before entering an unsafe
+ * region and aborts the operation if it detects a takeover.
+ *
+ * In case of panic the nesting context can take over the console forcefully.
+ *
+ * A concurrent writer on a different CPU with a higher priority can request
+ * to take over the console by:
+ *
+ * 1) Carefully writing the desired priority into @nbcon_state::req_prio
+ * if there is no higher priority request pending.
+ *
+ * 2) Carefully spin on @nbcon_state::prio until it is no longer locked.
+ *
+ * 3) Attempt to lock the console by carefully writing the desired
+ * priority to @nbcon_state::prio and carefully removing the desired
+ * priority from @nbcon_state::req_prio.
+ *
+ * In case the owner does not react to the request and does not make
+ * observable progress, the waiter will time out and can then decide to do
+ * a hostile takeover.
*/
/*
@@ -76,6 +104,425 @@ static inline bool nbcon_state_try_cmpxchg(struct console *con, struct nbcon_sta
return atomic_try_cmpxchg(&ACCESS_PRIVATE(con, nbcon_state), &cur->atom, new->atom);
}
+/**
+ * nbcon_context_try_acquire_direct - Try to acquire directly
+ * @ctxt: The context of the caller
+ * @cur: The current console state
+ *
+ * Return: 0 on success and @cur is updated to the new console state.
+ * Otherwise an error code on failure.
+ *
+ * Errors:
+ *
+ *	-EPERM:		A panic is in progress and this is not the panic CPU.
+ * Retrying the acquire will not be immediately useful.
+ *
+ * -EBUSY: The console already has an owner. Retrying the
+ * acquire will not be immediately useful.
+ *
+ * -EAGAIN: An unexpected state change occurred. @cur has been
+ * updated. The caller should retry the acquire.
+ *
+ * The general procedure is to change @prio from unowned to owned.
+ */
+static int nbcon_context_try_acquire_direct(struct nbcon_context *ctxt,
+ struct nbcon_state *cur)
+{
+ unsigned int cpu = smp_processor_id();
+ struct console *con = ctxt->console;
+ struct nbcon_state new = {
+ .cpu = cpu,
+ .prio = ctxt->prio,
+ .unsafe = cur->hostile_unsafe | ctxt->unsafe,
+ .hostile_unsafe = cur->hostile_unsafe,
+ };
+
+ if (other_cpu_in_panic())
+ return -EPERM;
+
+ /*
+ * Direct not possible if owner in unsafe region. If there
+ * was previously a hostile takeover, the console may be left
+ * in an unsafe state, even if it is now unowned.
+ */
+ if (cur->unsafe)
+ return -EBUSY;
+
+ /* Direct is only possible if it is unowned. */
+ if (cur->prio != NBCON_PRIO_NONE)
+ return -EBUSY;
+
+ /* Higher priority waiters are allowed to keep waiting. */
+ if (cur->req_prio > ctxt->prio)
+ new.req_prio = cur->req_prio;
+ else
+ new.req_prio = 0;
+
+ /* All looks good. Try to take ownership. */
+
+ if (!nbcon_state_try_cmpxchg(con, cur, &new))
+ return -EAGAIN;
+
+ cur->atom = new.atom;
+
+ return 0;
+}
+
+/**
+ * nbcon_context_try_acquire_handover - Try to acquire via handover
+ * @ctxt: The context of the caller
+ * @cur: The current console state
+ *
+ * Return: 0 on success and @cur is updated to the new console state.
+ * Otherwise an error code on failure.
+ *
+ * Errors:
+ *
+ *	-EPERM:		A panic is in progress and this is not the panic CPU.
+ * Retrying the acquire will not be immediately useful.
+ *
+ * -EBUSY: The current owner or waiter is such that this context
+ * is not able to execute a handover. Retrying the
+ * acquire will not be immediately useful.
+ *
+ * -EAGAIN: An unexpected state change occurred. @cur has been
+ * updated. The caller should retry the acquire.
+ *
+ * The general procedure is to set @req_prio and wait until unowned. Then
+ * set @prio (claiming ownership) and clear @req_prio.
+ */
+static int nbcon_context_try_acquire_handover(struct nbcon_context *ctxt,
+ struct nbcon_state *cur)
+{
+ unsigned int cpu = smp_processor_id();
+ struct console *con = ctxt->console;
+ struct nbcon_state new;
+ int timeout;
+ int err = 0;
+
+ /* Cannot request handover if owner has same or higher priority. */
+ if (cur->prio >= ctxt->prio)
+ return -EBUSY;
+
+ /* Cannot request handover if a waiter has same or higher priority. */
+ if (cur->req_prio >= ctxt->prio)
+ return -EBUSY;
+
+ /*
+ * If there is no owner, a handover may be in progress and this
+ * context (having a higher priority) could jump in directly.
+ */
+ if (cur->prio == NBCON_PRIO_NONE)
+ return nbcon_context_try_acquire_direct(ctxt, cur);
+
+ /* Cannot handover on same CPU. */
+ if (cur->cpu == cpu)
+ return -EBUSY;
+
+ /* Cannot request handover if caller unwilling to wait. */
+ if (ctxt->spinwait_max_us == 0)
+ return -EBUSY;
+
+ /* Set request for handover. */
+ new.atom = cur->atom;
+ new.req_prio = ctxt->prio;
+ if (!nbcon_state_try_cmpxchg(con, cur, &new))
+ return -EAGAIN;
+
+ cur->atom = new.atom;
+
+ debug_store(1, "handover: cpu%d SPINNING to acquire for prio%d\n", cpu, ctxt->prio);
+
+ /* Wait until there is no owner and then acquire directly. */
+ for (timeout = ctxt->spinwait_max_us; timeout >= 0; timeout--) {
+ /* On successful acquire, this request is cleared. */
+ err = nbcon_context_try_acquire_direct(ctxt, cur);
+ if (!err)
+ return 0;
+
+ /*
+ * If another CPU is in panic, the request must be removed
+ * before returning to the caller.
+ */
+ if (err == -EPERM)
+ break;
+
+ /* Continue spinwaiting on -EAGAIN and -EBUSY. */
+
+ udelay(1);
+
+ /* Re-read the state because some time has passed. */
+ nbcon_state_read(con, cur);
+
+ /*
+ * If the waiter unexpectedly changed, this request is
+ * no longer set. Have the caller try again rather than
+ * guessing what has happened.
+ */
+ if (cur->req_prio != ctxt->prio)
+ return -EAGAIN;
+ }
+
+ /* Timeout. Safely remove handover request and report failure. */
+ do {
+ /* No need to remove request if there is another waiter. */
+ if (cur->req_prio != ctxt->prio) {
+ if (err == -EPERM)
+ return err;
+ return -EBUSY;
+ }
+
+ /* Unset request for handover. */
+ new.atom = cur->atom;
+ new.req_prio = NBCON_PRIO_NONE;
+ if (nbcon_state_try_cmpxchg(con, cur, &new)) {
+ debug_store(1, "handover: cpu%d TIMEOUT to acquire for prio%d\n",
+ cpu, ctxt->prio);
+ /*
+ * Request successfully unset. Report failure of
+ * acquiring via handover.
+ */
+ cur->atom = new.atom;
+ return -EBUSY;
+ }
+
+ /*
+ * Unable to remove request. Try direct acquire. If
+ * successful, this request is also cleared.
+ */
+ err = nbcon_context_try_acquire_direct(ctxt, cur);
+ } while (err);
+
+ /* Direct acquire at timeout succeeded! */
+ return 0;
+}
+
+/**
+ * nbcon_context_try_acquire_hostile - Try to acquire via hostile takeover
+ * @ctxt: The context of the caller
+ * @cur: The current console state
+ *
+ * Return: 0 on success and @cur is updated to the new console state.
+ * Otherwise an error code on failure.
+ *
+ * Errors:
+ *
+ *	-EPERM:		A panic is in progress and this is not the panic CPU
+ * or this context is not the single panic context.
+ * Retrying the acquire will not be immediately useful.
+ *
+ * -EBUSY: The current owner is in an unsafe region and
+ * @takeover_unsafe was not set. Retrying the acquire
+ * is only immediately useful if @takeover_unsafe is
+ * set.
+ *
+ * -EAGAIN: An unexpected state change occurred. @cur has been
+ * updated. The caller should retry the acquire.
+ *
+ * The general procedure is to set @prio (forcing ownership). This method
+ * must only be used as a final attempt during panic.
+ */
+static int nbcon_context_try_acquire_hostile(struct nbcon_context *ctxt,
+ struct nbcon_state *cur)
+{
+ unsigned int cpu = smp_processor_id();
+ struct console *con = ctxt->console;
+ struct nbcon_state new = {
+ .cpu = cpu,
+ .prio = ctxt->prio,
+ .unsafe = cur->hostile_unsafe | ctxt->unsafe,
+ .hostile_unsafe = cur->hostile_unsafe | cur->unsafe,
+ };
+
+ if (other_cpu_in_panic())
+ return -EPERM;
+
+ /* Hostile takeovers must only be in the single panic context! */
+ if (WARN_ON_ONCE(ctxt->prio != NBCON_PRIO_PANIC || cur->prio == NBCON_PRIO_PANIC))
+ return -EPERM;
+
+ /*
+	 * A hostile takeover within an unsafe region must be
+	 * additionally requested via @takeover_unsafe.
+ */
+ if (cur->unsafe && !ctxt->takeover_unsafe)
+ return -EBUSY;
+
+ if (!nbcon_state_try_cmpxchg(con, cur, &new))
+ return -EAGAIN;
+
+ cur->atom = new.atom;
+
+ return 0;
+}
+
+/**
+ * nbcon_context_try_acquire - Try to acquire nbcon console
+ * @ctxt: The context of the caller
+ *
+ * Return: True if the console was acquired. False otherwise.
+ *
+ * The attempts to acquire always begin with the safest methods and only
+ * upon failure move to more aggressive methods.
+ *
+ * If the caller requested a hostile takeover, on success @ctxt is
+ * updated so that the caller can see if indeed a hostile (and
+ * possibly unsafe) takeover occurred.
+ */
+__maybe_unused
+static bool nbcon_context_try_acquire(struct nbcon_context *ctxt)
+{
+ __maybe_unused
+ unsigned int cpu = smp_processor_id();
+ struct console *con = ctxt->console;
+ struct nbcon_state cur;
+ bool hostile;
+ int err;
+
+ /* @takeover_unsafe is only valid when hostile takeover requested. */
+ WARN_ON_ONCE(ctxt->takeover_unsafe && !ctxt->hostile);
+
+ nbcon_state_read(con, &cur);
+
+try_again:
+ hostile = false;
+
+ /* ACQUIRE DIRECT */
+
+ debug_store(ctxt->prio > NBCON_PRIO_NORMAL,
+ "direct: cpu%d trying DIRECT acquire prio%d\n", cpu, ctxt->prio);
+ err = nbcon_context_try_acquire_direct(ctxt, &cur);
+ if (!err) {
+ debug_store(ctxt->prio > NBCON_PRIO_NORMAL,
+ "direct: cpu%d SUCCESS prio%d\n", cpu, ctxt->prio);
+ goto success;
+ } else if (err == -EPERM) {
+ debug_store(1, "direct: cpu%d FAILED prio%d (non-panic cpu)\n",
+ cpu, ctxt->prio);
+ return false;
+ } else if (err == -EAGAIN) {
+ debug_store(1, "direct: cpu%d FAILED prio%d (state race)\n",
+ cpu, ctxt->prio);
+ goto try_again;
+ } else {
+ debug_store(1, "direct: cpu%d FAILED prio%d (already owned by cpu%d)\n",
+ cpu, ctxt->prio, cur.cpu);
+ /* Continue to next method. */
+ }
+
+ /* ACQUIRE VIA HANDOVER */
+
+ debug_store(1, "handover: cpu%d REQUESTING prio%d from cpu%d\n",
+ cpu, ctxt->prio, cur.cpu);
+ err = nbcon_context_try_acquire_handover(ctxt, &cur);
+ if (!err) {
+ debug_store(1, "handover: cpu%d SUCCESS prio%d\n",
+ cpu, ctxt->prio);
+ goto success;
+ } else if (err == -EPERM) {
+ debug_store(1, "handover: cpu%d FAILED requesting prio%d (non-panic cpu)\n",
+ cpu, ctxt->prio);
+ return false;
+ } else if (err == -EAGAIN) {
+ debug_store(1, "handover: cpu%d FAILED requesting prio%d (state race)\n",
+ cpu, ctxt->prio);
+ goto try_again;
+ } else {
+ debug_store(1, "handover: cpu%d FAILED requesting prio%d (cpu%d/prio%d/timeout)\n",
+ cpu, ctxt->prio, cur.cpu, cur.prio);
+ /* Continue to next method. */
+ }
+
+ /* ACQUIRE VIA HOSTILE TAKEOVER */
+
+ /* Only attempt hostile takeover if explicitly requested. */
+ if (!ctxt->hostile)
+ return false;
+
+ debug_store(1,
+ "hostile: cpu%d trying HOSTILE acquire for prio%d from cpu%d (takeover_unsafe=%d)\n",
+ cpu, ctxt->prio, cur.cpu, ctxt->takeover_unsafe);
+ err = nbcon_context_try_acquire_hostile(ctxt, &cur);
+ if (!err) {
+ debug_store(1, "hostile: cpu%d SUCCESS prio%d\n", cpu, ctxt->prio);
+ /* Let caller know if the takeover was unsafe. */
+ ctxt->takeover_unsafe = cur.hostile_unsafe;
+ hostile = true;
+ goto success;
+ } else if (err == -EPERM) {
+ debug_store(1, "hostile: cpu%d FAILED acquire prio%d (non-panic cpu)\n",
+ cpu, ctxt->prio);
+ return false;
+ } else if (err == -EAGAIN) {
+ debug_store(1, "hostile: cpu%d FAILED acquire prio%d (state race)\n",
+ cpu, ctxt->prio);
+ goto try_again;
+ } else {
+ debug_store(1, "hostile: cpu%d FAILED acquire prio%d (unsafe)\n",
+ cpu, ctxt->prio);
+ /* Continue to next method. */
+ }
+
+ /* No methods left to try. */
+ return false;
+success:
+ if (!hostile) {
+ /* Let caller know it was not a hostile takeover. */
+ ctxt->hostile = 0;
+ ctxt->takeover_unsafe = 0;
+ }
+ return true;
+}
+
+static bool nbcon_owner_matches(struct nbcon_state *cur, int expected_cpu,
+ int expected_prio)
+{
+ /*
+ * Since consoles can only be acquired by higher priorities,
+ * owning contexts are uniquely identified by @prio. However,
+ * since contexts can unexpectedly lose ownership, it is
+ * possible that later another owner appears with the same
+ * priority. For this reason @cpu is also needed.
+ */
+
+ if (cur->prio != expected_prio)
+ return false;
+
+ if (cur->cpu != expected_cpu)
+ return false;
+
+ return true;
+}
+
+/**
+ * nbcon_context_release - Release the console
+ * @ctxt: The nbcon context from nbcon_context_try_acquire()
+ */
+__maybe_unused
+static void nbcon_context_release(struct nbcon_context *ctxt)
+{
+ unsigned int cpu = smp_processor_id();
+ struct console *con = ctxt->console;
+ struct nbcon_state cur;
+ struct nbcon_state new;
+
+ nbcon_state_read(con, &cur);
+ do {
+ if (!nbcon_owner_matches(&cur, cpu, ctxt->prio))
+ return;
+
+ new.atom = cur.atom;
+ new.prio = NBCON_PRIO_NONE;
+ new.unsafe = cur.hostile_unsafe;
+
+ /*
+ * If @hostile_unsafe is set, it is kept set so that
+ * the state remains permanently unsafe.
+ */
+
+ } while (!nbcon_state_try_cmpxchg(con, &cur, &new));
+}
+
/**
* nbcon_init - Initialize the nbcon console specific data
* @con: Console to initialize
--
2.39.2
To debug and validate nbcon ownership transitions, provide a new
macro debug_store() (enabled with DEBUG_NBCON) to log transition
information to the ringbuffer. The macro only logs to the
ringbuffer and does not trigger any printing, in order to avoid
recursive printing on the nbcon consoles.
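The conditional varargs-macro pattern can be modeled in user space as
follows. This is a hedged sketch, not the kernel code: fprintf-style
formatting into a buffer stands in for vprintk_store(), which in the
kernel writes directly to the lockless ringbuffer.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* User-space stand-in for the ringbuffer: a plain string buffer. */
static char debug_log[256];

/* Varargs helper standing in for __debug_store()/vprintk_store(). */
static int __debug_store(const char *fmt, ...)
{
	va_list args;
	int r;

	va_start(args, fmt);
	r = vsnprintf(debug_log, sizeof(debug_log), fmt, args);
	va_end(args);

	return r;
}

/* The leading condition lets call sites filter noisy transitions;
 * with the macro compiled out, call sites cost nothing. */
#define debug_store(cond, fmt, ...)					\
	do {								\
		if (cond)						\
			__debug_store("DEBUG: " fmt, ##__VA_ARGS__);	\
	} while (0)
```

Note that the statement inside the macro body must end with a semicolon
before the line continuation, otherwise call sites fail to compile.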
Signed-off-by: John Ogness <[email protected]>
---
kernel/printk/printk_nbcon.c | 29 +++++++++++++++++++++++++++++
1 file changed, 29 insertions(+)
diff --git a/kernel/printk/printk_nbcon.c b/kernel/printk/printk_nbcon.c
index bb379a4f6263..f9462b088439 100644
--- a/kernel/printk/printk_nbcon.c
+++ b/kernel/printk/printk_nbcon.c
@@ -10,6 +10,35 @@
* the legacy style console_lock mechanism.
*/
+/*
+ * Define DEBUG_NBCON to allow for nbcon ownership transitions to be logged
+ * to the ringbuffer. The debug_store() macro only logs to the lockless
+ * ringbuffer and does not trigger any printing.
+ */
+#undef DEBUG_NBCON
+
+#ifdef DEBUG_NBCON
+/* Only write to ringbuffer. */
+int __debug_store(const char *fmt, ...)
+{
+ va_list args;
+ int r;
+
+ va_start(args, fmt);
+ r = vprintk_store(2, 7, NULL, fmt, args);
+ va_end(args);
+
+ return r;
+}
+#define debug_store(cond, fmt, ...) \
+ do { \
+ if (cond) \
+			__debug_store(pr_fmt("DEBUG: " fmt), ##__VA_ARGS__); \
+ } while (0)
+#else
+#define debug_store(cond, fmt, ...)
+#endif
+
/**
* nbcon_state_set - Helper function to set the console state
* @con: Console to update
--
2.39.2
From: Thomas Gleixner <[email protected]>
Provide functions that are related to the safe handover mechanism
and allow console drivers to dynamically specify unsafe regions:
- nbcon_context_can_proceed()
Invoked by a console owner to check whether a handover request
is pending or whether the console was taken over in a hostile
fashion. If a handover request is pending, this function will
also perform the handover, thus cancelling its own ownership.
- nbcon_context_update_unsafe()
Invoked by a console owner to denote that the driver is about
to enter or leave a critical region where a hostile takeover
is unsafe. This function is also a cancellation point where
loss of ownership can occur.
The unsafe state is stored in the console state and allows a
new context to make informed decisions whether to attempt a
takeover of such a console. The unsafe state is also available
to the driver so that it can make informed decisions about the
required actions or take a special emergency path.
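The owner-side check described above can be modeled in user space. This
is an illustrative sketch only, assuming the same packed state word as
the earlier patches; the helper names and field widths are chosen here
for demonstration and do not match the kernel code exactly.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

enum prio { PRIO_NONE, PRIO_NORMAL, PRIO_EMERGENCY, PRIO_PANIC };

struct state {
	union {
		unsigned int atom;
		struct {
			unsigned int prio     : 2;
			unsigned int req_prio : 2;
			unsigned int unsafe   : 1;
			unsigned int cpu      : 24;
		};
	};
};

static _Atomic unsigned int console_state;

/* Test helper: force the console state word to a given value. */
static void state_set(unsigned int cpu, enum prio prio,
		      enum prio req_prio, bool unsafe)
{
	struct state s = { .atom = 0 };

	s.cpu = cpu;
	s.prio = prio;
	s.req_prio = req_prio;
	s.unsafe = unsafe;
	atomic_store(&console_state, s.atom);
}

/* Owner-side check at a safe point (cf. nbcon_context_can_proceed()):
 * returns false when ownership is gone (hostile takeover) or was just
 * handed over to a higher-priority waiter. */
static bool can_proceed(unsigned int cpu, enum prio prio)
{
	struct state cur = { .atom = atomic_load(&console_state) };
	struct state new;

	if (cur.prio != prio || cur.cpu != cpu)
		return false;	/* no longer the owner: back out */
	if (cur.req_prio == PRIO_NONE)
		return true;	/* no waiter: keep printing */
	if (cur.unsafe)
		return true;	/* finish the unsafe region first */

	/* Safe point with a waiter pending: give up ownership so the
	 * waiter can complete its acquire. */
	new = cur;
	new.prio = PRIO_NONE;
	atomic_compare_exchange_strong(&console_state, &cur.atom, new.atom);
	return false;
}
```

The key property is that the handover is a release by the owner, not a
write by the waiter: the waiter still has to win its own acquire cmpxchg
afterwards.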
Co-developed-by: John Ogness <[email protected]>
Signed-off-by: John Ogness <[email protected]>
Signed-off-by: Thomas Gleixner (Intel) <[email protected]>
---
kernel/printk/printk_nbcon.c | 113 ++++++++++++++++++++++++++++++++++-
1 file changed, 112 insertions(+), 1 deletion(-)
diff --git a/kernel/printk/printk_nbcon.c b/kernel/printk/printk_nbcon.c
index 8229a0a00d5b..e41f2eff5ef6 100644
--- a/kernel/printk/printk_nbcon.c
+++ b/kernel/printk/printk_nbcon.c
@@ -623,7 +623,6 @@ static bool nbcon_owner_matches(struct nbcon_state *cur, int expected_cpu,
* nbcon_context_release - Release the console
* @ctxt: The nbcon context from nbcon_context_try_acquire()
*/
-__maybe_unused
static void nbcon_context_release(struct nbcon_context *ctxt)
{
unsigned int cpu = smp_processor_id();
@@ -650,6 +649,118 @@ static void nbcon_context_release(struct nbcon_context *ctxt)
ctxt->pbufs = NULL;
}
+/**
+ * nbcon_context_can_proceed - Check whether ownership can proceed
+ * @ctxt: The nbcon context from nbcon_context_try_acquire()
+ * @cur: The current console state
+ *
+ * Return: True if the state is correct. False if ownership was
+ * handed over or taken.
+ *
+ * Must be invoked after the record was dumped into the assigned buffer
+ * and at appropriate safe places in the driver.
+ *
+ * When this function returns false then the calling context is no longer
+ * the owner and is no longer allowed to go forward. In this case it must
+ * back out immediately and carefully. The buffer content is also no longer
+ * trusted since it no longer belongs to the calling context.
+ */
+static bool nbcon_context_can_proceed(struct nbcon_context *ctxt, struct nbcon_state *cur)
+{
+ unsigned int cpu = smp_processor_id();
+
+ /* Make sure this context is still the owner. */
+ if (!nbcon_owner_matches(cur, cpu, ctxt->prio)) {
+ debug_store(1, "handover: cpu%d DETECTED HOSTILE takeover\n", cpu);
+ return false;
+ }
+
+ /* The console owner can proceed if there is no waiter. */
+ if (cur->req_prio == NBCON_PRIO_NONE)
+ return true;
+
+ /*
+ * A console owner within an unsafe region is always allowed to
+ * proceed, even if there are waiters. It can perform a handover
+ * when exiting the unsafe region. Otherwise the waiter will
+ * need to perform an unsafe hostile takeover.
+ */
+ if (cur->unsafe) {
+ debug_store(cur->req_prio > cur->prio,
+ "handover: cpu%d IGNORING HANDOVER prio%d -> prio%d (unsafe)\n",
+ cpu, cur->prio, cur->req_prio);
+ return true;
+ }
+
+ /* Waiters always have higher priorities than owners. */
+ WARN_ON_ONCE(cur->req_prio <= cur->prio);
+
+ debug_store(1, "handover: cpu%d HANDING OVER prio%d -> prio%d\n",
+ cpu, cur->prio, cur->req_prio);
+
+
+ /*
+ * Having a safe point for take over and eventually a few
+ * duplicated characters or a full line is way better than a
+ * hostile takeover. Post processing can take care of the garbage.
+ * Release and hand over.
+ */
+ nbcon_context_release(ctxt);
+
+ /*
+ * It is not known whether the handover succeeded. The outermost
+ * callsite has to make the final decision whether printing
+ * should proceed or not (via reacquire, possibly hostile). The
+ * console is now unlocked so go back all the way instead of
+ * trying to implement heuristics in tons of places.
+ */
+ return false;
+}
+
+/**
+ * nbcon_context_update_unsafe - Update the unsafe bit in @con->nbcon_state
+ * @ctxt: The nbcon context from nbcon_context_try_acquire()
+ * @unsafe: The new value for the unsafe bit
+ *
+ * Return: True if the state is correct. False if ownership was
+ * handed over or taken.
+ *
+ * Typically the unsafe bit is set during acquire. This function allows
+ * modifying the unsafe status without releasing ownership.
+ *
+ * When this function returns false then the calling context is no longer
+ * the owner and is no longer allowed to go forward. In this case it must
+ * back out immediately and carefully. The buffer content is also no longer
+ * trusted since it no longer belongs to the calling context.
+ */
+__maybe_unused
+static bool nbcon_context_update_unsafe(struct nbcon_context *ctxt, bool unsafe)
+{
+ struct console *con = ctxt->console;
+ struct nbcon_state cur;
+ struct nbcon_state new;
+
+ nbcon_state_read(con, &cur);
+
+ /* The unsafe bit must not be cleared if @hostile_unsafe is set. */
+ if (!unsafe && cur.hostile_unsafe)
+ return nbcon_context_can_proceed(ctxt, &cur);
+
+ do {
+ if (!nbcon_context_can_proceed(ctxt, &cur))
+ return false;
+
+ new.atom = cur.atom;
+ new.unsafe = unsafe;
+ } while (!nbcon_state_try_cmpxchg(con, &cur, &new));
+
+ ctxt->unsafe = unsafe;
+
+ return true;
+}
+
/**
* nbcon_init - Initialize the nbcon console specific data
* @con: Console to initialize
--
2.39.2
From: Thomas Gleixner <[email protected]>
In case of hostile takeovers it must be ensured that the previous
owner cannot scribble over the output buffer of the emergency/panic
context. This is achieved by:
- Adding a global output buffer instance for the panic context.
This is the only situation where hostile takeovers can occur and
there is always at most 1 panic context.
- Allocating an output buffer per console upon console
registration. This buffer is used by the console owner when not
in panic context.
- Choosing the appropriate buffer is handled in the acquire/release
functions.
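The buffer-selection rule above can be sketched as follows. This is a
hedged illustration, not the kernel code: the struct layout and the
`assign_pbufs()` helper are invented here to show the rule that the
single panic context always gets its own static buffer.

```c
#include <assert.h>
#include <stddef.h>

enum prio { PRIO_NONE, PRIO_NORMAL, PRIO_EMERGENCY, PRIO_PANIC };

struct printk_buffers { char text[128]; };

/* Global instance for the (at most one) panic context, so a previous
 * owner scribbling over the per-console buffer after a hostile
 * takeover cannot corrupt the panic output. */
static struct printk_buffers panic_pbufs;

struct console_model {
	struct printk_buffers *pbufs;	/* allocated at registration */
};

static struct printk_buffers con_pbufs;
static struct console_model con = { .pbufs = &con_pbufs };

/* On successful acquire, pick the buffer the new owner may write to. */
static struct printk_buffers *assign_pbufs(struct console_model *c,
					   enum prio prio)
{
	return (prio == PRIO_PANIC) ? &panic_pbufs : c->pbufs;
}
```

Only panic priority switches buffers; emergency and normal contexts share
the per-console buffer because hostile takeovers cannot occur against them.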
Co-developed-by: John Ogness <[email protected]>
Signed-off-by: John Ogness <[email protected]>
Signed-off-by: Thomas Gleixner (Intel) <[email protected]>
---
include/linux/console.h | 7 +++++++
kernel/printk/internal.h | 6 ++++++
kernel/printk/printk.c | 6 ------
kernel/printk/printk_nbcon.c | 26 ++++++++++++++++++++++++--
4 files changed, 37 insertions(+), 8 deletions(-)
diff --git a/include/linux/console.h b/include/linux/console.h
index e06cd1ce3e82..d2bcd2c190a7 100644
--- a/include/linux/console.h
+++ b/include/linux/console.h
@@ -233,6 +233,7 @@ enum nbcon_prio {
};
struct console;
+struct printk_buffers;
/**
* struct nbcon_context - Context for console acquire/release
@@ -242,6 +243,7 @@ struct console;
* @unsafe: This context is in an unsafe section
* @hostile: Acquire console by hostile takeover
* @takeover_unsafe: Acquire console by hostile takeover even if unsafe
+ * @pbufs: Pointer to the text buffer for this context
*/
struct nbcon_context {
/* members set by caller */
@@ -251,6 +253,9 @@ struct nbcon_context {
unsigned int unsafe : 1;
unsigned int hostile : 1;
unsigned int takeover_unsafe : 1;
+
+ /* members set by acquire */
+ struct printk_buffers *pbufs;
};
/**
@@ -274,6 +279,7 @@ struct nbcon_context {
* @node: hlist node for the console list
*
* @nbcon_state: State for nbcon consoles
+ * @pbufs: Pointer to nbcon private buffer
*/
struct console {
char name[16];
@@ -296,6 +302,7 @@ struct console {
/* nbcon console specific members */
atomic_t __private nbcon_state;
+ struct printk_buffers *pbufs;
};
#ifdef CONFIG_LOCKDEP
diff --git a/kernel/printk/internal.h b/kernel/printk/internal.h
index 655810f2976e..0f2be350600e 100644
--- a/kernel/printk/internal.h
+++ b/kernel/printk/internal.h
@@ -13,6 +13,12 @@ int devkmsg_sysctl_set_loglvl(struct ctl_table *table, int write,
#define printk_sysctl_init() do { } while (0)
#endif
+#define con_printk(lvl, con, fmt, ...) \
+ printk(lvl pr_fmt("%s%sconsole [%s%d] " fmt), \
+ (con->flags & CON_NBCON) ? "" : "legacy ", \
+ (con->flags & CON_BOOT) ? "boot" : "", \
+ con->name, con->index, ##__VA_ARGS__)
+
#ifdef CONFIG_PRINTK
#ifdef CONFIG_PRINTK_CALLER
diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index 98b4854c81ea..582552a96c57 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -3359,12 +3359,6 @@ static void try_enable_default_console(struct console *newcon)
newcon->flags |= CON_CONSDEV;
}
-#define con_printk(lvl, con, fmt, ...) \
- printk(lvl pr_fmt("%s%sconsole [%s%d] " fmt), \
- (con->flags & CON_NBCON) ? "" : "legacy ", \
- (con->flags & CON_BOOT) ? "boot" : "", \
- con->name, con->index, ##__VA_ARGS__)
-
static void console_init_seq(struct console *newcon, bool bootcon_registered)
{
struct console *con;
diff --git a/kernel/printk/printk_nbcon.c b/kernel/printk/printk_nbcon.c
index b0acde0cb949..39fa64891ec6 100644
--- a/kernel/printk/printk_nbcon.c
+++ b/kernel/printk/printk_nbcon.c
@@ -5,6 +5,7 @@
#include <linux/kernel.h>
#include <linux/console.h>
#include <linux/delay.h>
+#include <linux/slab.h>
#include "internal.h"
/*
* Printk console printing implementation for consoles that do not depend on
@@ -20,6 +21,9 @@
* region and aborts the operation if it detects a takeover.
*
* In case of panic the nesting context can take over the console forcefully.
+ * If the interrupted context touches the assigned record buffer after
+ * takeover, it does not cause harm because the interrupting single panic
+ * context is assigned its own panic record buffer.
*
* A concurrent writer on a different CPU with a higher priority can request
* to take over the console by:
@@ -356,6 +360,8 @@ static int nbcon_context_try_acquire_hostile(struct nbcon_context *ctxt,
return 0;
}
+static struct printk_buffers panic_nbcon_pbufs;
+
/**
* nbcon_context_try_acquire - Try to acquire nbcon console
* @ctxt: The context of the caller
@@ -372,7 +378,6 @@ static int nbcon_context_try_acquire_hostile(struct nbcon_context *ctxt,
__maybe_unused
static bool nbcon_context_try_acquire(struct nbcon_context *ctxt)
{
- __maybe_unused
unsigned int cpu = smp_processor_id();
struct console *con = ctxt->console;
struct nbcon_state cur;
@@ -471,6 +476,13 @@ static bool nbcon_context_try_acquire(struct nbcon_context *ctxt)
ctxt->hostile = 0;
ctxt->takeover_unsafe = 0;
}
+
+ /* Assign the appropriate buffer for this context. */
+ if (atomic_read(&panic_cpu) == cpu)
+ ctxt->pbufs = &panic_nbcon_pbufs;
+ else
+ ctxt->pbufs = con->pbufs;
+
return true;
}
@@ -509,7 +521,7 @@ static void nbcon_context_release(struct nbcon_context *ctxt)
nbcon_state_read(con, &cur);
do {
if (!nbcon_owner_matches(&cur, cpu, ctxt->prio))
- return;
+ break;
new.atom = cur.atom;
new.prio = NBCON_PRIO_NONE;
@@ -521,6 +533,8 @@ static void nbcon_context_release(struct nbcon_context *ctxt)
*/
} while (!nbcon_state_try_cmpxchg(con, &cur, &new));
+
+ ctxt->pbufs = NULL;
}
/**
@@ -534,6 +548,12 @@ bool nbcon_init(struct console *con)
{
struct nbcon_state state = { };
+ con->pbufs = kmalloc(sizeof(*con->pbufs), GFP_KERNEL);
+ if (!con->pbufs) {
+ con_printk(KERN_ERR, con, "failed to allocate printing buffer\n");
+ return false;
+ }
+
nbcon_state_set(con, &state);
return true;
}
@@ -547,4 +567,6 @@ void nbcon_cleanup(struct console *con)
struct nbcon_state state = { };
nbcon_state_set(con, &state);
+ kfree(con->pbufs);
+ con->pbufs = NULL;
}
--
2.39.2
On 2023-07-28, John Ogness <[email protected]> wrote:
> +/*
> + * Define DEBUG_NBCON to allow for nbcon ownership transitions to be logged
> + * to the ringbuffer. The debug_store() macro only logs to the lockless
> + * ringbuffer and does not trigger any printing.
> + */
> +#undef DEBUG_NBCON
> +
> +#ifdef DEBUG_NBCON
> +/* Only write to ringbuffer. */
> +int __debug_store(const char *fmt, ...)
> +{
> + va_list args;
> + int r;
> +
> + va_start(args, fmt);
> + r = vprintk_store(2, 7, NULL, fmt, args);
> + va_end(args);
> +
> + return r;
> +}
> +#define debug_store(cond, fmt, ...) \
> + do { \
> + if (cond) \
> + __debug_store(pr_fmt("DEBUG: " fmt), ##__VA_ARGS__) \
Missing a semi-colon here. Wrapping this with a do-while was a
last-minute change requested by checkpatch.pl. Probably nobody would
notice because you must manually define DEBUG_NBCON by changing the
source code. Fixed for v3 (assuming Petr allows me to keep this
debugging code in place).
> + } while (0)
> +#else
> +#define debug_store(cond, fmt, ...)
> +#endif
John
On Fri 2023-07-28 02:08:27, John Ogness wrote:
> To debug and validate nbcon ownership transitions, provide a new
> macro debug_store() (enabled with DEBUG_NBCON) to log transition
> information to the ringbuffer. If enabled, this macro only logs
> to the ringbuffer and does not trigger any printing. This is to
> avoid triggering recursive printing in the nbcon consoles.
>
> --- a/kernel/printk/printk_nbcon.c
> +++ b/kernel/printk/printk_nbcon.c
> @@ -10,6 +10,35 @@
> * the legacy style console_lock mechanism.
> */
>
> +/*
> + * Define DEBUG_NBCON to allow for nbcon ownership transitions to be logged
> + * to the ringbuffer. The debug_store() macro only logs to the lockless
> + * ringbuffer and does not trigger any printing.
> + */
> +#undef DEBUG_NBCON
> +
> +#ifdef DEBUG_NBCON
> +/* Only write to ringbuffer. */
> +int __debug_store(const char *fmt, ...)
> +{
> + va_list args;
> + int r;
> +
> + va_start(args, fmt);
> + r = vprintk_store(2, 7, NULL, fmt, args);
I have never used the facility number before. It seems that they are
defined in /usr/include/sys/syslog.h, for example, see
https://sites.uclouvain.be/SystInfo/usr/include/sys/syslog.h.html
They are even somewhat documented in "man 3 syslog", for example, see
https://linux.die.net/man/3/syslog
We should probably use one of the LOG_LOCALX numbers, e.g.
#define LOG_LOCAL0 (16<<3)
Also, please use LOGLEVEL_DEBUG instead of the hard coded "7".
> + va_end(args);
> +
> + return r;
> +}
> +#define debug_store(cond, fmt, ...) \
> + do { \
> + if (cond) \
> + __debug_store(pr_fmt("DEBUG: " fmt), ##__VA_ARGS__) \
> + } while (0)
> +#else
> +#define debug_store(cond, fmt, ...)
> +#endif
> +
> /**
> * nbcon_state_set - Helper function to set the console state
> * @con: Console to update
Best Regards,
Petr
On 2023-07-28, Petr Mladek <[email protected]> wrote:
>> + r = vprintk_store(2, 7, NULL, fmt, args);
>
> We should probably use one of the LOG_LOCALX numbers, e.g.
>
> #define LOG_LOCAL0 (16<<3)
>
> Also, please use LOGLEVEL_DEBUG instead of the hard coded "7".
OK.
I am also open to dropping the function and its use. When developing the
ringbuffer I also had a similar function but never attempted to publish
it. It is useful for developing/testing/debugging, but once everything
works as it should it has no real purpose. I have no problems keeping
the debug stuff hidden in my own private toolbox.
John
On Fri 2023-07-28 23:07:08, John Ogness wrote:
> On 2023-07-28, Petr Mladek <[email protected]> wrote:
> >> + r = vprintk_store(2, 7, NULL, fmt, args);
> >
> > We should probably use one of the LOG_LOCALX numbers, e.g.
> >
> > #define LOG_LOCAL0 (16<<3)
> >
> > Also, please use LOGLEVEL_DEBUG instead of the hard coded "7".
>
> OK.
>
> I am also open to dropping the function and its use. When developing the
> ringbuffer I also had a similar function but never attempted to publish
> it. It is useful for developing/testing/debugging, but once everything
> works as it should it has no real purpose. I have no problems keeping
> the debug stuff hidden in my own private toolbox.
I do not mind adding this patch. It is actually a pretty interesting
function. I wonder if it might inspire anyone to use this approach
for some other purpose.
Best Regards,
Petr
On Fri 2023-07-28 02:08:28, John Ogness wrote:
> From: Thomas Gleixner <[email protected]>
>
> Add per console acquire/release functionality. The console 'locked'
> state is a combination of multiple state fields:
I scratched my head over this patch for many days and I am still not sure
how to deal with it. I propose some changes but ...
My proposal is a kind of inverted logic. It allows only a limited number
of scenarios. It reduces backward loops. And it makes it clearer what
is possible in which part of the code.
I kind of like it this way.
The original approach allows whatever makes sense: restart the entire
try_acquire loop when something has changed.
It might reduce code duplication. But it allows too many combinations
for my taste, so that I am not able to keep them all in my head. From my
POV, it might require more test+debug+fix loops. And the debug part
might be painful.
But it is possible that I just look at it from a wrong angle.
About this mail:
It is long. I have been updating it for several days. I did not know
how to split it because most comments are somehow related.
I do not expect you to answer every detail. The top-level view
is more important. And also the situations where I missed some context.
> --- a/kernel/printk/printk_nbcon.c
> +++ b/kernel/printk/printk_nbcon.c
> @@ -76,6 +104,425 @@ static inline bool nbcon_state_try_cmpxchg(struct console *con, struct nbcon_sta
> return atomic_try_cmpxchg(&ACCESS_PRIVATE(con, nbcon_state), &cur->atom, new->atom);
> }
>
> +/**
> + * nbcon_context_try_acquire_direct - Try to acquire directly
> + * @ctxt: The context of the caller
> + * @cur: The current console state
> + *
> + * Return: 0 on success and @cur is updated to the new console state.
> + * Otherwise an error code on failure.
> + *
> + * Errors:
It might be because I am not a native speaker, but I was a bit
confused by the description below. I think that I know what
it means, but only because I "know" the context.
Especially, using the same wording "Retrying the acquire will not be
immediately useful" for both errors does not help to understand the
difference in what the caller should do.
> + *
> + * -EPERM: A panic is in progress an this is not the panic CPU.
> + * Retrying the acquire will not be immediately useful.
I would write something like:
* -EPERM: A panic is in progress and this is not the panic CPU.
* This context is not allowed to take the lock.
But I would use is also when the priority is not sufficient. In this
case, any waiting does not make sense either. So, I would write:
* -EPERM: The console is already owned by a context with
* the same or higher priority. Or a panic is in
* progress and this is not the panic CPU. Any waiting
* does not make sense in this case.
> + *
> + * -EBUSY: The console already has an owner. Retrying the
> + * acquire will not be immediately useful.
I would write something like this:
* -EBUSY: The console already has an owner with a lower
* priority. The caller might try a handover.
> + *
> + * -EAGAIN: An unexpected state change occurred. @cur has been
> + * updated. The caller should retry the acquire.
The word "unexpected" sounds like it should never happen. It might be
rare, but it is a perfectly fine race. I would write:
"@cur has changed in the meantime. The caller should retry."
> + *
> + * The general procedure is to change @prio from unowned to owned.
> + */
> +static int nbcon_context_try_acquire_direct(struct nbcon_context *ctxt,
> + struct nbcon_state *cur)
> +{
> + unsigned int cpu = smp_processor_id();
> + struct console *con = ctxt->console;
> + struct nbcon_state new = {
> + .cpu = cpu,
> + .prio = ctxt->prio,
> + .unsafe = cur->hostile_unsafe | ctxt->unsafe,
> + .hostile_unsafe = cur->hostile_unsafe,
> + };
> +
> + if (other_cpu_in_panic())
> + return -EPERM;
> +
> + /*
> + * Direct not possible if owner in unsafe region. If there
> + * was previously a hostile takeover, the console may be left
> + * in an unsafe state, even if it is now unowned.
> + */
> + if (cur->unsafe)
> + return -EBUSY;
I would expect that this function is able to take the lock when there
is no owner and no waiter.
The harsh takeover will be used as the last attempt in panic().
But what about messages printed after this flush? They should
be able to reach the console as well.
It might be enough when the information that there was a harsh
takeover in the past will stay stored in nbcon_state and
copied to nbcon_ctxt. If the console driver used a special
code after a harsh takeover, it would still know that it should
use it.
> + /* Direct is only possible if it is unowned. */
> + if (cur->prio != NBCON_PRIO_NONE)
> + return -EBUSY;
> +
> + /* Higher priority waiters are allowed to keep waiting. */
> + if (cur->req_prio > ctxt->prio)
> + new.req_prio = cur->req_prio;
> + else
> + new.req_prio = 0;
I am confused here.
1. I would expect that we bail out when there already is a request
with a higher priority.
It seems like this code tries to steal the lock that was
released for the higher priority waiter. Honestly, I do
not see any good reason for this. Higher priority
request should always get the lock ASAP!
If we do not want an extra delay then I would replace
udelay(1) with cpu_relax(). The lock is supposed to
wait the same way as a spinlock.
Or is this somehow important for RT?
I guess that the kthread might afford rescheduling here. But
will it reschedule also in EMERGENCY context? It will not
reschedule in PANIC for sure.
Well, if we allow rescheduling then we need to guarantee
that the process stays on the same CPU. I am not sure what
the plan is here.
2. I would expect that we bail out even when there already is a request
with the same priority.
OK, the spinning context will detect this because nbcon_state_try_cmpxchg()
will fail and it will restart everything with -EAGAIN. And it will
later bail out in nbcon_context_try_acquire_handover().
So, this likely works but it is pretty hairy to me. It is far from
obvious and not documented. It took me a long time to put
all these pieces together.
> + /* All looks good. Try to take ownership. */
> +
> + if (!nbcon_state_try_cmpxchg(con, cur, &new))
> + return -EAGAIN;
> +
> + cur->atom = new.atom;
> +
> + return 0;
> +}
> +
> +/**
> + * nbcon_context_try_acquire_handover - Try to acquire via handover
> + * @ctxt: The context of the caller
> + * @cur: The current console state
> + *
> + * Return: 0 on success and @cur is updated to the new console state.
> + * Otherwise an error code on failure.
> + *
> + * Errors:
> + *
> + * -EPERM: A panic is in progress an this is not the panic CPU.
> + * Retrying the acquire will not be immediately useful.
> + *
> + * -EBUSY: The current owner or waiter is such that this context
> + * is not able to execute a handover. Retrying the
> + * acquire will not be immediately useful.
> + *
> + * -EAGAIN: An unexpected state change occurred. @cur has been
> + * updated. The caller should retry the acquire.
> + *
I have the same problems to understand the meaning of the error codes
as in nbcon_context_try_acquire_direct(). See an alternative text below.
> + * The general procedure is to set @req_prio and wait until unowned. Then
> + * set @prio (claiming ownership) and clearing @req_prio.
> + */
> +static int nbcon_context_try_acquire_handover(struct nbcon_context *ctxt,
> + struct nbcon_state *cur)
> +{
> + unsigned int cpu = smp_processor_id();
> + struct console *con = ctxt->console;
> + struct nbcon_state new;
> + int timeout;
> + int err = 0;
> +
> + /* Cannot request handover if owner has same or higher priority. */
> + if (cur->prio >= ctxt->prio)
> + return -EBUSY;
> +
I was surprised by the -EBUSY. Insufficient priority sounds more like
a permission problem.
I guess that you use -EBUSY so that it would try the harsh takeover.
It is not a big deal. But it should be explained in a comment.
But I would personally use -EPERM.
> + /* Cannot request handover if a waiter has same or higher priority. */
> + if (cur->req_prio >= ctxt->prio)
> + return -EBUSY;
Same here.
> +
> + /*
> + * If there is no owner, a handover may be in progress and this
> + * context (having a higher priority) could jump in directly.
> + */
> + if (cur->prio == NBCON_PRIO_NONE)
> + return nbcon_context_try_acquire_direct(ctxt, cur);
This looks superfluous. We are here because
nbcon_context_try_acquire_direct() failed.
> + /* Cannot handover on same CPU. */
> + if (cur->cpu == cpu)
> + return -EBUSY;
-EBUSY looks natural in this case.
> + /* Cannot request handover if caller unwilling to wait. */
> + if (ctxt->spinwait_max_us == 0)
> + return -EBUSY;
> +
> + /* Set request for handover. */
> + new.atom = cur->atom;
> + new.req_prio = ctxt->prio;
> + if (!nbcon_state_try_cmpxchg(con, cur, &new))
> + return -EAGAIN;
> +
> + cur->atom = new.atom;
> +
> + debug_store(1, "handover: cpu%d SPINNING to acquire for prio%d\n", cpu, ctxt->prio);
> +
> + /* Wait until there is no owner and then acquire directly. */
> + for (timeout = ctxt->spinwait_max_us; timeout >= 0; timeout--) {
> + /* On successful acquire, this request is cleared. */
> + err = nbcon_context_try_acquire_direct(ctxt, cur);
I guess that this is the reason why nbcon_context_try_acquire_direct()
tries to get the lock even when there is a request with the same
priority. But it means that more CPUs with the same priority
might get the lock via nbcon_context_try_acquire_direct().
Only one might succeed.
This made my head spin a lot. And it took me a long time to understand
whether the race is OK, whether there will be more waiters with the same
priority spinning, ...
OK, callers with the same priority should not spin because
they will bail out at the beginning of this function. So, there should
be only one waiter, except, see below.
> + if (!err)
> + return 0;
> +
> + /*
> + * If another CPU is in panic, the request must be removed
> + * before returning to the caller.
> + */
> + if (err == -EPERM)
> + break;
> +
> + /* Continue spinwaiting on -EAGAIN and -EBUSY. */
> +
> + udelay(1);
> +
> + /* Re-read the state because some time has passed. */
> + nbcon_state_read(con, cur);
> +
> + /*
> + * If the waiter unexpectedly changed, this request is
> + * no longer set. Have the caller try again rather than
> + * guessing what has happened.
> + */
> + if (cur->req_prio != ctxt->prio)
IMHO, the following might happen:
CPU0 CPU1 CPU2
# EMERGENCY PRIO
nbcon_context_try_acquire_handover()
nbcon_context_try_acquire_handover for (timeout...
udelay()
# current owner releases the lock
# another try_acquire in EMERGENCY_PRIO
nbcon_context_try_acquire_direct()
# SUCCESS, the lock was free
# req_prio is clear
# another try_acquire in EMERGENCY_PRIO
nbcon_context_try_acquire_direct()
# -EBUSY, lock owned by CPU1
# nbcon_context_try_acquire_handover()
# see clear req_prio
# sets req_prio and is spinning
nbcon_state_read(con, cur);
if (cur->req_prio != ctxt->prio)
# fails because req_prio is the same even
# when the owner is different
RESULT: Two CPUs are happily spinning.
Maybe, this particular race is acceptable. But is it the only race?
My feeling:
It took me quite some time to build a mental model of the code.
Then I had to do many spins with various variants to find
the above race.
And I am not sure if this is the only one. My brain is hurting.
Update after several days:
I thought about this for many days. (Partly because there was a weekend
and I was busy with other things.) I was not sure whether I wanted to
avoid these races because of my intuition (might be a good reason) or
because I wanted to do it my way (which would be a bad reason).
And I do not feel comfortable with this patch even after many days.
The stealing and the many continue/restart paths create too many
combinations. It might be perfectly fine if everything works.
But it will create many complications when there is a problem.
And the question is not only whether the lock works as expected.
The problem is that the stealing also makes the result less
predictable. It would add more possible "races" also for
the try_acquire() callers.
IMHO, the semantics of this lock would be much easier when it
guarantees that a higher priority request will block any lower
priority one. And that only one EMERGENCY context will wait
for and get the lock.
So, I would prefer to make this more straightforward:
1. nbcon_context_try_acquire_direct() should take the lock only
when the lock is free and there is no pending request
with the same or higher priority.
if (cur->req_prio >= ctxt->prio)
return -EPERM;
2. nbcon_context_try_acquire_handover() might duplicate some code
with the _direct() variant. But it should guarantee, in a
straightforward way, that there will be only one waiter.
I would suggest something like this for the waiting loop:
err = -EBUSY;
for (timeout = ctxt->spinwait_max_us; timeout >= 0; timeout--) {
/*
* We are here because the lock is not free.
*
* In theory, the following race might happen during udelay().
* The lock was taken and released by a higher priority
* context. Then the lock was taken with a lower priority
* context and another CPU with our priority started handover.
* As a result, two CPUs might be cycling now.
*
* It would not be a big deal. Only one CPU
* would succeed in taking over the lock. The other
* would bail out.
*
* But in practice, this could not happen because
* messages with NORMAL prio are serialized either by
* console_lock() or by the kthread. All messages with
* PANIC priority are handled on panic CPU. And panic
* context overrides any other context and the
* priority never goes down.
*
* An ultimate solution would be adding req_cpu to
* check whether someone else started spinning on another
* CPU. But it is not worth it, as described above.
*/
udelay(1); or cpu_relax() ?
/* Re-read the state because some time has passed. */
nbcon_state_read(con, cur);
/* See below how this is implemented. */
err = nbcon_context_try_acquire_request(ctxt, cur);
if (err != -EBUSY)
break;
}
/* Was it timeout? */
if (err != -EBUSY)
return err;
/* Timed out. Carefully remove the request. */
do {
new.atom = cur->atom;
new.req_prio = NBCON_PRIO_NONE;
if (nbcon_state_try_cmpxchg(con, cur, &new)) {
debug_store(1, "handover: cpu%d TIMEOUT to acquire for prio%d\n",
cpu, ctxt->prio);
/*
* Request successfully unset. Report failure of
* acquiring via handover.
*/
cur->atom = new.atom;
return -EBUSY;
}
/*
* Something has changed. Maybe the lock is free or
* taken over by a higher priority task. Or maybe
* just some other meta data has changed.
*
* We would get the lock when it is free and the
* request is still ours. And we will get -EBUSY
* when the request is still ours.
*/
err = nbcon_context_try_acquire_request(ctxt, cur);
} while (err == -EBUSY);
return err;
}
, where
/*
* Try to take the lock when it is reserved for us. It is called
* with a freshly re-read @cur state.
*
* Returns:
*
* 0: successfully acquired the lock.
*
* -EPERM: The console has been taken by a context with a higher
* priority. Or a panic is in progress and this is not
* the panic CPU. Waiting does not make sense any longer.
*
* -EBUSY: The lock is still blocked and reserved for us. It makes
* sense to continue waiting until the timeout.
*/
static int nbcon_context_try_acquire_request(struct nbcon_context *ctxt,
struct nbcon_state *cur)
{
struct nbcon_state new;
if (other_cpu_in_panic())
return -EPERM;
/* A higher priority context took over the request/lock */
if (cur->req_prio != ctxt->prio)
return -EPERM;
/* Request is ours. But is the lock still owned? */
if (cur->prio != NBCON_PRIO_NONE)
return -EBUSY;
/* The lock is available and is reserved for us. Try to get it. */
new.atom = cur->atom;
new.cpu = smp_processor_id();
new.prio = ctxt->prio;
new.req_prio = 0;
new.unsafe = cur->hostile_unsafe | ctxt->unsafe;
if (nbcon_state_try_cmpxchg(ctxt->console, cur, &new))
return 0;
/*
* Someone else manipulated the state. But the lock was free
* and blocked for us. Only a higher priority context was
* allowed to take over it.
*/
return -EPERM;
}
Just for comparison. The _direct() variant might look like:
/*
* Try to get the lock when it is free and not requested by another
* context with the same or higher priority.
*
* Returns:
*
* 0: successfully acquired the lock.
*
* -EPERM: The console is already owned by a context with
* the same or higher priority. Or a panic is in
* progress an this is not the panic CPU. Any waiting
* does not make sense in this case.
*
* -EBUSY: The console already has an owner with a lower
* priority. The caller might try a handover.
*
*/
static int nbcon_context_try_acquire_directly(struct nbcon_context *ctxt,
struct nbcon_state *cur)
{
struct nbcon_state new;
try_again:
if (other_cpu_in_panic())
return -EPERM;
/* Could override only request from a lower priority context. */
if (ctxt->prio <= cur->req_prio)
return -EPERM;
if (cur->prio != NBCON_PRIO_NONE)
return -EBUSY;
/* The lock is available and we are free to take it. */
new.atom = cur->atom;
new.cpu = smp_processor_id();
new.prio = ctxt->prio;
/* Clear request from a lower priority context. */
new.req_prio = 0;
new.unsafe = cur->hostile_unsafe | ctxt->unsafe;
/*
* The state might have changed when anyone else took the lock.
* Check again if we could busy wait or give up.
*/
if (!nbcon_state_try_cmpxchg(ctxt->console, cur, &new))
goto try_again;
return 0;
}
So, there is some code duplication but it is clear what is going on.
And there are only two error codes.
> + }
> +
> + /* Timeout. Safely remove handover request and report failure. */
> + do {
> + /* No need to remove request if there is another waiter. */
> + if (cur->req_prio != ctxt->prio) {
> + if (err == -EPERM)
> + return err;
> + return -EBUSY;
> + }
> +
> + /* Unset request for handover. */
> + new.atom = cur->atom;
> + new.req_prio = NBCON_PRIO_NONE;
> + if (nbcon_state_try_cmpxchg(con, cur, &new)) {
> + debug_store(1, "handover: cpu%d TIMEOUT to acquire for prio%d\n",
> + cpu, ctxt->prio);
> + /*
> + * Request successfully unset. Report failure of
> + * acquiring via handover.
> + */
> + cur->atom = new.atom;
> + return -EBUSY;
> + }
> +
> + /*
> + * Unable to remove request. Try direct acquire. If
> + * successful, this request is also cleared.
> + */
> + err = nbcon_context_try_acquire_direct(ctxt, cur);
> + } while (err);
> +
> + /* Direct acquire at timeout succeeded! */
> + return 0;
> +}
> +
> +/**
> + * nbcon_context_try_acquire_hostile - Try to acquire via hostile takeover
> + * @ctxt: The context of the caller
> + * @cur: The current console state
> + *
> + * Return: 0 on success and @cur is updated to the new console state.
> + * Otherwise an error code on failure.
> + *
> + * Errors:
> + *
> + * -EPERM: A panic is in progress an this is not the panic CPU
> + * or this context is not the single panic context.
> + * Retrying the acquire will not be immediately useful.
> + *
> + * -EBUSY: The current owner is in an unsafe region and
> + * @takeover_unsafe was not set. Retrying the acquire
> + * is only immediately useful if @takeover_unsafe is
> + * set.
> + *
> + * -EAGAIN: An unexpected state change occurred. @cur has been
> + * updated. The caller should retry the acquire.
> + *
> + * The general procedure is to set @prio (forcing ownership). This method
> + * must only be used as a final attempt during panic.
> + */
> +static int nbcon_context_try_acquire_hostile(struct nbcon_context *ctxt,
> + struct nbcon_state *cur)
> +{
> + unsigned int cpu = smp_processor_id();
> + struct console *con = ctxt->console;
> + struct nbcon_state new = {
> + .cpu = cpu,
> + .prio = ctxt->prio,
> + .unsafe = cur->hostile_unsafe | ctxt->unsafe,
> + .hostile_unsafe = cur->hostile_unsafe | cur->unsafe,
We do hostile only when it is unsafe. So, I would expect:
.unsafe = true;
.hostile_unsafe = true;
> + };
> +
> + if (other_cpu_in_panic())
> + return -EPERM;
> +
> + /* Hostile takeovers must only be in the single panic context! */
> + if (WARN_ON_ONCE(ctxt->prio != NBCON_PRIO_PANIC || cur->prio == NBCON_PRIO_PANIC))
> + return -EPERM;
I wondered if we actually might want to take over the lock from
a nested PANIC context on the panic() CPU.
For example, when panic() was in normal process context
and NMI came when flushing a message to the console.
Should the NMI context take over the lock in a harsh way?
But it actually does not make sense. The harsh takeover
is allowed only in the final flush_in_panic() which is
called from the same context as the panic() function.
> +
> + /*
> + * If a hostile takeover in unsafe regions is wanted,
> + * this is additionally requested.
> + */
> + if (cur->unsafe && !ctxt->takeover_unsafe)
> + return -EBUSY;
I would return -EPERM because we simply can't take the lock
in this context now.
> + if (!nbcon_state_try_cmpxchg(con, cur, &new))
> + return -EAGAIN;
Do we really need to start from the very beginning?
If try_cmpxchg fails it means that a non-panic CPU
still owns the lock and is manipulating the state.
I would just repeat the cmpxchg inside this function.
I would just revisit the values of the .unsafe and
.takeover_unsafe flags. The forced takeover might
be safe now.
> +
> + cur->atom = new.atom;
> +
> + return 0;
> +}
> +
> +/**
> + * nbcon_context_try_acquire - Try to acquire nbcon console
> + * @ctxt: The context of the caller
> + *
> + * Return: True if the console was acquired. False otherwise.
> + *
> + * The attempts to acquire always begin with the safest methods and only
> + * upon failure move to more aggressive methods.
> + *
> + * If the caller requested a hostile takeover, on success @ctxt is
> + * updated so that the caller can see if indeed a hostile (and
> + * possibly unsafe) takeover occurred.
> + */
> +__maybe_unused
> +static bool nbcon_context_try_acquire(struct nbcon_context *ctxt)
> +{
> + __maybe_unused
> + unsigned int cpu = smp_processor_id();
> + struct console *con = ctxt->console;
> + struct nbcon_state cur;
> + bool hostile;
> + int err;
> +
> + /* @takeover_unsafe is only valid when hostile takeover requested. */
> + WARN_ON_ONCE(ctxt->takeover_unsafe && !ctxt->hostile);
> +
> + nbcon_state_read(con, &cur);
> +
> +try_again:
> + hostile = false;
> +
> + /* ACQUIRE DIRECT */
> +
> + debug_store(ctxt->prio > NBCON_PRIO_NORMAL,
> + "direct: cpu%d trying DIRECT acquire prio%d\n", cpu, ctxt->prio);
> + err = nbcon_context_try_acquire_direct(ctxt, &cur);
> + if (!err) {
> + debug_store(ctxt->prio > NBCON_PRIO_NORMAL,
> + "direct: cpu%d SUCCESS prio%d\n", cpu, ctxt->prio);
> + goto success;
> + } else if (err == -EPERM) {
> + debug_store(1, "direct: cpu%d FAILED prio%d (non-panic cpu)\n",
> + cpu, ctxt->prio);
> + return false;
> + } else if (err == -EAGAIN) {
> + debug_store(1, "direct: cpu%d FAILED prio%d (state race)\n",
> + cpu, ctxt->prio);
> + goto try_again;
> + } else {
> + debug_store(1, "direct: cpu%d FAILED prio%d (already owned by cpu%d)\n",
> + cpu, ctxt->prio, cur.cpu);
> + /* Continue to next method. */
> + }
Hmm, the many debug_store() calls are a bit disturbing here.
Is the custom message really important? Would it be enough to do:
debug_store(ctxt->prio > NBCON_PRIO_NORMAL,
"direct: cpu%d trying DIRECT acquire prio%d\n", cpu, ctxt->prio);
err = nbcon_context_try_acquire_direct(ctxt, &cur);
debug_store(ctxt->prio > NBCON_PRIO_NORMAL,
"direct: cpu%d DIRECT acquire prio%d returned %pe\n", cpu, ctxt->prio, ERR_PTR(err));
if (!err)
goto success;
if (err == -EPERM)
return false;
WARN_ON_ONCE(err != -EBUSY);
/* ACQUIRE VIA HANDOVER */
...
> + debug_store(1, "handover: cpu%d REQUESTING prio%d from cpu%d\n",
> + cpu, ctxt->prio, cur.cpu);
> + err = nbcon_context_try_acquire_handover(ctxt, &cur);
> + if (!err) {
> + debug_store(1, "handover: cpu%d SUCCESS prio%d\n",
> + cpu, ctxt->prio);
> + goto success;
> + } else if (err == -EPERM) {
> + debug_store(1, "handover: cpu%d FAILED requesting prio%d (non-panic cpu)\n",
> + cpu, ctxt->prio);
> + return false;
> + } else if (err == -EAGAIN) {
> + debug_store(1, "handover: cpu%d FAILED requesting prio%d (state race)\n",
> + cpu, ctxt->prio);
> + goto try_again;
> + } else {
> + debug_store(1, "handover: cpu%d FAILED requesting prio%d (cpu%d/prio%d/timeout)\n",
> + cpu, ctxt->prio, cur.cpu, cur.prio);
> + /* Continue to next method. */
> + }
> +
> + /* ACQUIRE VIA HOSTILE TAKEOVER */
> +
> + /* Only attempt hostile takeover if explicitly requested. */
> + if (!ctxt->hostile)
> + return false;
> +
> + debug_store(1,
> + "hostile: cpu%d trying HOSTILE acquire for prio%d from cpu%d (takeover_unsafe=%d)\n",
> + cpu, ctxt->prio, cur.cpu, ctxt->takeover_unsafe);
> + err = nbcon_context_try_acquire_hostile(ctxt, &cur);
> + if (!err) {
> + debug_store(1, "hostile: cpu%d SUCCESS prio%d\n", cpu, ctxt->prio);
> + /* Let caller know if the takeover was unsafe. */
> + ctxt->takeover_unsafe = cur.hostile_unsafe;
> + hostile = true;
> + goto success;
> + } else if (err == -EPERM) {
> + debug_store(1, "hostile: cpu%d FAILED acquire prio%d (non-panic cpu)\n",
> + cpu, ctxt->prio);
> + return false;
> + } else if (err == -EAGAIN) {
> + debug_store(1, "hostile: cpu%d FAILED acquire prio%d (state race)\n",
> + cpu, ctxt->prio);
> + goto try_again;
> + } else {
> + debug_store(1, "hostile: cpu%d FAILED acquire prio%d (unsafe)\n",
> + cpu, ctxt->prio);
> + /* Continue to next method. */
> + }
> +
> + /* No methods left to try. */
> + return false;
> +success:
> + if (!hostile) {
> + /* Let caller know it was not a hostile takeover. */
> + ctxt->hostile = 0;
> + ctxt->takeover_unsafe = 0;
nbcon_context_try_acquire_direct() sets new.unsafe when
cur->hostile_unsafe is set.
My understanding is that any acquire is unsafe when there
was an unsafe hostile takeover in the past. IMHO, ctxt should
be aware of this and we should not blindly clear these variables.
> + }
> + return true;
> +}
> +/**
> + * nbcon_context_release - Release the console
> + * @ctxt: The nbcon context from nbcon_context_try_acquire()
> + */
> +__maybe_unused
> +static void nbcon_context_release(struct nbcon_context *ctxt)
> +{
> + unsigned int cpu = smp_processor_id();
> + struct console *con = ctxt->console;
> + struct nbcon_state cur;
> + struct nbcon_state new;
> +
> + nbcon_state_read(con, &cur);
> + do {
> + if (!nbcon_owner_matches(&cur, cpu, ctxt->prio))
> + return;
> +
> + new.atom = cur.atom;
> + new.prio = NBCON_PRIO_NONE;
> + new.unsafe = cur.hostile_unsafe;
> +
> + /*
> + * If @hostile_unsafe is set, it is kept set so that
> + * the state remains permanently unsafe.
> + */
This comment should be above the new.unsafe assignment.
> +
> + } while (!nbcon_state_try_cmpxchg(con, &cur, &new));
> +}
> +
> /**
> * nbcon_init - Initialize the nbcon console specific data
> * @con: Console to initialize
I actually forgot one thing.
On Fri 2023-07-28 02:08:28, John Ogness wrote:
> From: Thomas Gleixner <[email protected]>
>
> Add per console acquire/release functionality. The console 'locked'
> state is a combination of multiple state fields:
>
> - Hostile takeover
>
> The new owner takes the console over without 'req_prio'
> handshake.
>
> This is required when friendly handovers are not possible,
> i.e. the higher priority context interrupted the owning
> context on the same CPU or the owning context is not able
> to make progress on a remote CPU.
I always expected that there would be only one hostile takeover.
It would allow taking the lock the harsh way when the friendly
way fails.
> All other policy decisions have to be made at the call sites:
>
> - What is marked as an unsafe section.
> - Whether to spinwait if there is already an owner.
> - Whether to attempt a hostile takeover when safe.
> - Whether to attempt a hostile takeover when unsafe.
But there actually seem to be two variants. How are they
supposed to be used, please?
I would expect that a higher priority context would always
be able to takeover the lock when it is in a safe context.
In other words, "hostile takeover when safe" would be
the standard behavior for context with a higher priority.
And if I get it correctly then nbcon_enter_unsafe() returns
"false" when there is a pending request with a higher priority.
And only requests with a higher priority are allowed.
In other words, the difference between a normal takeover and
"hostile takeover when safe" is that the 1st one has to
wait until the current owner calls nbcon_enter_unsafe().
But the result is the same. The current owner might
prematurely end after calling nbcon_enter_unsafe().
Maybe this is another relic from the initial more generic approach?
> --- a/include/linux/console.h
> +++ b/include/linux/console.h
> @@ -175,13 +175,28 @@ enum cons_flags {
> * struct nbcon_state - console state for nbcon consoles
> * @atom: Compound of the state fields for atomic operations
> *
> + * @req_prio: The priority of a handover request
> + * @prio: The priority of the current usage
> + * @unsafe: Console is busy in a non takeover region
> + * @hostile_unsafe: The @unsafe value before a hostile takeover
> + * @cpu: The CPU on which the owner runs
> + *
> * To be used for reading and preparing of the value stored in the nbcon
> * state variable @console.nbcon_state.
> + *
> + * The @prio and @req_prio fields are particularly important to allow
> + * spin-waiting to timeout and give up without the risk of a waiter being
> + * assigned the lock after giving up.
> */
> struct nbcon_state {
> union {
> unsigned int atom;
> struct {
> + unsigned int prio : 2;
> + unsigned int req_prio : 2;
> + unsigned int unsafe : 1;
> + unsigned int hostile_unsafe : 1;
> + unsigned int cpu : 24;
> };
> };
> };
> @@ -194,6 +209,50 @@ struct nbcon_state {
> */
> static_assert(sizeof(struct nbcon_state) <= sizeof(int));
>
> +/**
> + * nbcon_prio - console owner priority for nbcon consoles
> + * @NBCON_PRIO_NONE: Unused
> + * @NBCON_PRIO_NORMAL: Normal (non-emergency) usage
> + * @NBCON_PRIO_EMERGENCY: Emergency output (WARN/OOPS...)
> + * @NBCON_PRIO_PANIC: Panic output
> + * @NBCON_PRIO_MAX: The number of priority levels
> + *
> + * A context wanting to produce emergency output can carefully takeover the
> + * console, even without consent of the owner. Ideally such a takeover is only
> + * when @nbcon_state::unsafe is not set. However, a context wanting to produce
> + * panic output can ignore the unsafe flag as a last resort. If panic output
> + * is active, no takeover is possible until the panic output releases the
> + * console.
> + */
> +enum nbcon_prio {
> + NBCON_PRIO_NONE = 0,
> + NBCON_PRIO_NORMAL,
> + NBCON_PRIO_EMERGENCY,
> + NBCON_PRIO_PANIC,
> + NBCON_PRIO_MAX,
> +};
> +
> +struct console;
> +
> +/**
> + * struct nbcon_context - Context for console acquire/release
> + * @console: The associated console
> + * @spinwait_max_us: Limit for spinwait acquire
> + * @prio: Priority of the context
> + * @unsafe: This context is in an unsafe section
This seems to be an input value for try_acquire(), which is
controversial.
I guess that it removes the need to call nbcon_enter_unsafe()
right after try_acquire_console().
Hmm, these semantics are problematic:
1. The result would be non-paired enter_unsafe()/exit_unsafe()
calls.
2. I would personally expect that this is an output value
set by try_acquire() so that the context might know
how it got the lock.
3. Strictly speaking, as an input value, it would mean that
try_acquire() is called when the console is in "unsafe" state.
But the caller does not know in which state the console
is until it acquires the lock.
> + * @hostile: Acquire console by hostile takeover
> + * @takeover_unsafe: Acquire console by hostile takeover even if unsafe
> + */
> +struct nbcon_context {
> + /* members set by caller */
> + struct console *console;
> + unsigned int spinwait_max_us;
> + enum nbcon_prio prio;
> + unsigned int unsafe : 1;
> + unsigned int hostile : 1;
> + unsigned int takeover_unsafe : 1;
> +};
The names make sense. But there are 4 names used across struct
nbcon_state and struct nbcon_context, and one is used twice:
state.unsafe
state.hostile_unsafe
ctxt.unsafe
ctxt.hostile
ctxt.takeover_unsafe
For me it is not easy to remember which permutation is used where ;-)
It would be easier if we remove the "hostile when safe" variant.
Then 3 variables might be enough. I suggest something like:
state.unsafe Console is busy and takeover is not safe.
state.unsafe_takeover A hostile takeover in an unsafe state happened
in the past. The console can't be safe until
re-initialized.
ctxt.allow_unsafe_takeover Allow hostile takeover even if unsafe.
Can be used only with PANIC prio. Might cause
a system freeze when the console is used later.
Best Regards,
Petr
On Fri 2023-07-28 02:08:29, John Ogness wrote:
> From: Thomas Gleixner <[email protected]>
>
> In case of hostile takeovers it must be ensured that the previous
> owner cannot scribble over the output buffer of the emergency/panic
> context. This is achieved by:
>
> - Adding a global output buffer instance for the panic context.
> This is the only situation where hostile takeovers can occur and
> there is always at most 1 panic context.
>
> - Allocating an output buffer per console upon console
> registration. This buffer is used by the console owner when not
> in panic context.
>
> - Choosing the appropriate buffer is handled in the acquire/release
> functions.
>
> Co-developed-by: John Ogness <[email protected]>
> Signed-off-by: John Ogness <[email protected]>
> Signed-off-by: Thomas Gleixner (Intel) <[email protected]>
Looks good to me:
Reviewed-by: Petr Mladek <[email protected]>
Best Regards,
Petr
On Fri 2023-07-28 02:08:31, John Ogness wrote:
> From: Thomas Gleixner <[email protected]>
>
> Provide functions that are related to the safe handover mechanism
> and allow console drivers to dynamically specify unsafe regions:
> --- a/kernel/printk/printk_nbcon.c
> +++ b/kernel/printk/printk_nbcon.c
> @@ -650,6 +649,118 @@ static void nbcon_context_release(struct nbcon_context *ctxt)
> ctxt->pbufs = NULL;
> }
>
> +/**
> + * nbcon_context_can_proceed - Check whether ownership can proceed
> + * @ctxt: The nbcon context from nbcon_context_try_acquire()
> + * @cur: The current console state
> + *
> + * Return: True if the state is correct. False if ownership was
> + * handed over or taken.
> + *
> + * Must be invoked after the record was dumped into the assigned buffer
> + * and at appropriate safe places in the driver.
> + *
> + * When this function returns false then the calling context is no longer
> + * the owner and is no longer allowed to go forward. In this case it must
> + * back out immediately and carefully. The buffer content is also no longer
> + * trusted since it no longer belongs to the calling context.
> + */
> +static bool nbcon_context_can_proceed(struct nbcon_context *ctxt, struct nbcon_state *cur)
> +{
[...]
> + /*
> + * A console owner within an unsafe region is always allowed to
> + * proceed, even if there are waiters. It can perform a handover
> + * when exiting the unsafe region. Otherwise the waiter will
> + * need to perform an unsafe hostile takeover.
> + */
> + if (cur->unsafe) {
> + debug_store(cur->req_prio > cur->prio,
> + "handover: cpu%d IGNORING HANDOVER prio%d -> prio%d (unsafe)\n",
> + cpu, cur->prio, cur->req_prio);
> + return true;
> + }
[...]
> +}
> +
> +/**
> + * nbcon_context_update_unsafe - Update the unsafe bit in @con->nbcon_state
> + * @ctxt: The nbcon context from nbcon_context_try_acquire()
> + * @unsafe: The new value for the unsafe bit
> + *
> + * Return: True if the state is correct. False if ownership was
> + * handed over or taken.
> + *
> + * Typically the unsafe bit is set during acquire. This function allows
> + * modifying the unsafe status without releasing ownership.
> + *
> + * When this function returns false then the calling context is no longer
> + * the owner and is no longer allowed to go forward. In this case it must
> + * back out immediately and carefully. The buffer content is also no longer
> + * trusted since it no longer belongs to the calling context.
> + *
> + * Internal helper to avoid duplicated code
> + */
> +__maybe_unused
> +static bool nbcon_context_update_unsafe(struct nbcon_context *ctxt, bool unsafe)
> +{
> + struct console *con = ctxt->console;
> + struct nbcon_state cur;
> + struct nbcon_state new;
> +
> + nbcon_state_read(con, &cur);
> +
> + /* The unsafe bit must not be cleared if @hostile_unsafe is set. */
> + if (!unsafe && cur.hostile_unsafe)
> + return nbcon_context_can_proceed(ctxt, &cur);
> +
> + do {
> + if (!nbcon_context_can_proceed(ctxt, &cur))
> + return false;
nbcon_context_can_proceed() returns "true" even when there
is a pending request. It happens when the current state is "unsafe".
But see below.
> +
> + new.atom = cur.atom;
> + new.unsafe = unsafe;
> + } while (!nbcon_state_try_cmpxchg(con, &cur, &new));
If the new state is "safe" and there is a pending request
then we should release the lock and return false here.
It does not make sense to block the waiter just to realize
that we can't enter "unsafe" state again.
> + ctxt->unsafe = unsafe;
> +
> + return true;
An easy solution would be to do the following here:

	ctxt->unsafe = unsafe;
	return nbcon_context_can_proceed(ctxt, &cur);
> +}
But maybe we can change the logic a bit. Something like:
/**
* nbcon_context_update_unsafe - Update the unsafe bit in @con->nbcon_state
* @ctxt: The nbcon context from nbcon_context_try_acquire()
* @unsafe: The new value for the unsafe bit
*
* Return: True if the state is correct. False if ownership was
* handed over or taken.
*
* When this function returns false then the calling context is no longer
* the owner and is no longer allowed to go forward. In this case it must
* back out immediately and carefully. The buffer content is also no longer
* trusted since it no longer belongs to the calling context.
*
* Internal helper to avoid duplicated code
*/
static bool nbcon_context_update_unsafe(struct nbcon_context *ctxt, bool unsafe)
{
	struct console *con = ctxt->console;
	struct nbcon_state cur;
	struct nbcon_state new;
	bool updated, can_proceed;

	nbcon_state_read(con, &cur);

	if (!nbcon_context_can_proceed(ctxt, &cur))
		return false;

	/* The unsafe bit must not be cleared if @hostile_unsafe is set. */
	if (cur.hostile_unsafe)
		unsafe = true;

	if (cur.unsafe == unsafe)
		return true;

	do {
		new.atom = cur.atom;
		new.unsafe = unsafe;
		updated = nbcon_state_try_cmpxchg(con, &cur, &new);
		/*
		 * The state has changed. Either there is a new
		 * request or there was a hostile takeover.
		 */
		can_proceed = nbcon_context_can_proceed(ctxt, &cur);
	} while (!updated && can_proceed);

	if (updated)
		ctxt->unsafe = unsafe;

	return can_proceed;
}
Best Regards,
Petr
On Fri 2023-07-28 02:08:25, John Ogness wrote:
> Hi,
>
> This is v2 of a series to introduce the new non-BKL (nbcon)
> consoles. This series is only a subset of the original
> v1 [0]. In particular, this series represents patches 5-10 of
> the v1 series. For information about the motivation of the
> atomic consoles, please read the cover letter of v1.
>
> This series focuses on wiring up the printk subsystem to
> be able to use the nbcon consoles and implement their ownership
> interfaces and rules. This series does _not_ include threaded
> printing, atomic printing regions, or nbcon drivers. Those
> features will be added in separate follow-up series.
>
>
> include/linux/console.h | 132 +++++
> kernel/printk/Makefile | 2 +-
> kernel/printk/internal.h | 29 ++
> kernel/printk/printk.c | 156 ++++--
> kernel/printk/printk_nbcon.c | 955 +++++++++++++++++++++++++++++++++++
Nit: Is there still any chance to rename this to kernel/printk/nbcon.c ?
I am sorry that I did not suggest this earlier. I think that
we should have omitted the "printk_" prefix even for
the "ringbuffer.*" files.
I think that it came from "printk_safe.c". But there it made some
sense: "printk_safe_*" was also an API.
But in general, "printk_" prefix is superfluous in "printk" directory.
Best Regards,
Petr