This is v2 of a series to prepare for threaded/atomic
printing. It is a rework of patches 6-12 of the v1 [0]. From
the v1, patches 1-5 are already mainline and a rework of
patches >12 will be posted in a later series.
This series focuses on reducing the scope of the BKL
console_lock. It achieves this by switching to SRCU and a
dedicated mutex for console list iteration and modification,
respectively. The console_lock will no longer offer this
protection and is completely removed from
(un)register_console() and console_stop/start() code.
All users of the console_lock for list iteration have been
modified. For the call sites where the console_lock is still
needed (because of other reasons), I added comments to explain
exactly why the console_lock was needed.
The base commit for this series is from Paul McKenney's RCU tree
and provides an NMI-safe SRCU implementation [1]. Without the
NMI-safe SRCU implementation, this series is not less safe than
mainline. But we will need the NMI-safe SRCU implementation for
atomic consoles anyway, so we might as well get it in
now. Especially since it _does_ increase the reliability for
mainline in the panic path.
Changes since v2:
general:
- introduce console_is_enabled() to document safe data race on
console->flags
- switch all "console->flags & CON_ENABLED" code sites to
console_is_enabled()
- add "for_each_console_srcu" to .clang-format
- cleanup/clarify comments relating to console_lock
coverage/usage
um:
- kmsg_dumper: use srcu instead of console_lock for list
iteration
kgdb/kdb:
- configure_kgdboc: keep console_lock for console->device()
synchronization, use srcu for list iteration
- kgdboc_earlycon_pre_exp_handler: use srcu instead of
documenting unsafety for list iteration
- kgdboc_earlycon_init: use console_list_lock instead of
console_lock to lock list
- kdb_msg_write: use srcu instead of documenting unsafety for
list iteration
tty:
- show_cons_active: keep console_lock for console->device()
synchronization
fbdev:
- xen-fbfront: xenfb_probe: use srcu instead of console_lock
for list iteration, introduce console_force_preferred() to
safely implement hack
proc/consoles:
- show_console_dev: keep console_lock for console->device()
synchronization
- c_next: use hlist_entry_safe() instead of
hlist_for_each_entry_continue()
printk:
- remove console_lock from console_stop/start() and
(un)register_console()
- introduce console_srcu_read_(un)lock() to wrap scru read
(un)lock
- rename cons_first() macro to console_first()
- for_each_console: add lockdep check instead of introducing
new for_each_registered_console()
- console_list_lock: add warning if in read-side critical
section
- release srcu read lock on handover
- console_flush_all: use srcu instead of relying on console
lock for list iteration
- console_unblank: use srcu instead of relying on console_lock
for list iteration
- console_flush_on_panic: use srcu for list iteration and
document console->seq race
- device: keep console_lock for console->device()
synchronization, usr srcu for list iteration
- register_console: split list adding logic into the 3 distinct
scenarios
- register_console: set initial sequence number before adding
to list
- unregister_console: fix ENODEV return value if the console is
not registered
- console_stop: synchronize srcu
- printk_late_init: use _safe variant of iteration
- __pr_flush: use srcu instead of relying on console_lock for
list iteration
John Ogness
[0] https://lore.kernel.org/r/[email protected]
[1] https://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git/log/?h=srcunmisafe.2022.10.18b
John Ogness (37):
printk: Convert console_drivers list to hlist
printk: Prepare for SRCU console list protection
printk: introduce console_is_enabled() wrapper
printk: use console_is_enabled()
tty: nfcon: use console_is_enabled()
um: kmsg_dump: use console_is_enabled()
efi: earlycon: use console_is_enabled()
netconsole: use console_is_enabled()
tty: hvc: use console_is_enabled()
tty: serial: earlycon: use console_is_enabled()
tty: serial: kgdboc: use console_is_enabled()
tty: serial: pic32_uart: use console_is_enabled()
tty: serial: samsung_tty: use console_is_enabled()
tty: serial: serial_core: use console_is_enabled()
tty: serial: xilinx_uartps: use console_is_enabled()
tty: tty_io: use console_is_enabled()
usb: early: xhci-dbc: use console_is_enabled()
kdb: kdb_io: use console_is_enabled()
um: kmsg_dumper: use srcu console list iterator
serial: kgdboc: use srcu console list iterator
serial: kgdboc: document console_lock usage
tty: tty_io: document console_lock usage
xen: fbfront: use srcu console list iterator
proc: consoles: document console_lock usage
kdb: use srcu console list iterator
printk: console_flush_all: use srcu console list iterator
printk: console_unblank: use srcu console list iterator
printk: console_flush_on_panic: use srcu console list iterator
printk: console_device: use srcu console list iterator
printk: register_console: use srcu console list iterator
printk: __pr_flush: use srcu console list iterator
printk: introduce console_list_lock
serial: kgdboc: use console_list_lock instead of console_lock
tty: tty_io: use console_list_lock for list synchronization
proc: consoles: use console_list_lock for list iteration
printk: relieve console_lock of list synchronization duties
printk, xen: fbfront: create/use safe function for forcing preferred
Thomas Gleixner (1):
serial: kgdboc: Lock console list in probe function
.clang-format | 1 +
arch/m68k/emu/nfcon.c | 4 +-
arch/um/kernel/kmsg_dump.c | 15 +-
drivers/firmware/efi/earlycon.c | 4 +-
drivers/net/netconsole.c | 4 +-
drivers/tty/hvc/hvc_console.c | 2 +-
drivers/tty/serial/earlycon.c | 4 +-
drivers/tty/serial/kgdboc.c | 37 ++-
drivers/tty/serial/pic32_uart.c | 2 +-
drivers/tty/serial/samsung_tty.c | 2 +-
drivers/tty/serial/serial_core.c | 2 +-
drivers/tty/serial/xilinx_uartps.c | 2 +-
drivers/tty/tty_io.c | 18 +-
drivers/usb/early/xhci-dbc.c | 2 +-
drivers/video/fbdev/xen-fbfront.c | 16 +-
fs/proc/consoles.c | 20 +-
include/linux/console.h | 75 +++++-
include/linux/serial_core.h | 2 +-
kernel/debug/kdb/kdb_io.c | 7 +-
kernel/printk/printk.c | 373 +++++++++++++++++++++--------
20 files changed, 438 insertions(+), 154 deletions(-)
base-commit: c2d158a284abd63d727dad7402a2eed650dd4233
--
2.30.2
Use srcu console list iteration for console list traversal.
Document why the console_lock is still necessary.
Signed-off-by: John Ogness <[email protected]>
---
kernel/printk/printk.c | 11 ++++++++++-
1 file changed, 10 insertions(+), 1 deletion(-)
diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index e478cb92e7ba..410ad9d5649c 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -3025,15 +3025,24 @@ struct tty_driver *console_device(int *index)
{
struct console *c;
struct tty_driver *driver = NULL;
+ int cookie;
+ /*
+ * Stop console printing because the device() callback may
+ * assume the console is not within its write() callback.
+ */
console_lock();
- for_each_console(c) {
+
+ cookie = console_srcu_read_lock();
+ for_each_console_srcu(c) {
if (!c->device)
continue;
driver = c->device(c, index);
if (driver)
break;
}
+ console_srcu_read_unlock(cookie);
+
console_unlock();
return driver;
}
--
2.30.2
Replace the open coded single linked list with a hlist so a conversion
to SRCU protected list walks can reuse the existing primitives.
Signed-off-by: John Ogness <[email protected]>
---
fs/proc/consoles.c | 3 +-
include/linux/console.h | 8 ++--
kernel/printk/printk.c | 99 +++++++++++++++++++++++------------------
3 files changed, 63 insertions(+), 47 deletions(-)
diff --git a/fs/proc/consoles.c b/fs/proc/consoles.c
index dfe6ce3505ce..cf2e0788f9c7 100644
--- a/fs/proc/consoles.c
+++ b/fs/proc/consoles.c
@@ -74,8 +74,9 @@ static void *c_start(struct seq_file *m, loff_t *pos)
static void *c_next(struct seq_file *m, void *v, loff_t *pos)
{
struct console *con = v;
+
++*pos;
- return con->next;
+ return hlist_entry_safe(con->node.next, struct console, node);
}
static void c_stop(struct seq_file *m, void *v)
diff --git a/include/linux/console.h b/include/linux/console.h
index 8c1686e2c233..7b5f21f9e469 100644
--- a/include/linux/console.h
+++ b/include/linux/console.h
@@ -15,6 +15,7 @@
#define _LINUX_CONSOLE_H_ 1
#include <linux/atomic.h>
+#include <linux/list.h>
#include <linux/types.h>
struct vc_data;
@@ -154,14 +155,16 @@ struct console {
u64 seq;
unsigned long dropped;
void *data;
- struct console *next;
+ struct hlist_node node;
};
+extern struct hlist_head console_list;
+
/*
* for_each_console() allows you to iterate on each console
*/
#define for_each_console(con) \
- for (con = console_drivers; con != NULL; con = con->next)
+ hlist_for_each_entry(con, &console_list, node)
extern int console_set_on_cmdline;
extern struct console *early_console;
@@ -174,7 +177,6 @@ enum con_flush_mode {
extern int add_preferred_console(char *name, int idx, char *options);
extern void register_console(struct console *);
extern int unregister_console(struct console *);
-extern struct console *console_drivers;
extern void console_lock(void);
extern int console_trylock(void);
extern void console_unlock(void);
diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index e4f1e7478b52..867becc40021 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -79,13 +79,12 @@ int oops_in_progress;
EXPORT_SYMBOL(oops_in_progress);
/*
- * console_sem protects the console_drivers list, and also
- * provides serialisation for access to the entire console
- * driver system.
+ * console_sem protects console_list and console->flags updates, and also
+ * provides serialization for access to the entire console driver system.
*/
static DEFINE_SEMAPHORE(console_sem);
-struct console *console_drivers;
-EXPORT_SYMBOL_GPL(console_drivers);
+HLIST_HEAD(console_list);
+EXPORT_SYMBOL_GPL(console_list);
/*
* System may need to suppress printk message under certain
@@ -2556,7 +2555,7 @@ static int console_cpu_notify(unsigned int cpu)
* console_lock - lock the console system for exclusive use.
*
* Acquires a lock which guarantees that the caller has
- * exclusive access to the console system and the console_drivers list.
+ * exclusive access to the console system and console_list.
*
* Can sleep, returns nothing.
*/
@@ -2576,7 +2575,7 @@ EXPORT_SYMBOL(console_lock);
* console_trylock - try to lock the console system for exclusive use.
*
* Try to acquire a lock which guarantees that the caller has exclusive
- * access to the console system and the console_drivers list.
+ * access to the console system and console_list.
*
* returns 1 on success, and 0 on failure to acquire the lock.
*/
@@ -2940,11 +2939,20 @@ void console_flush_on_panic(enum con_flush_mode mode)
console_may_schedule = 0;
if (mode == CONSOLE_REPLAY_ALL) {
+ struct hlist_node *tmp;
struct console *c;
u64 seq;
seq = prb_first_valid_seq(prb);
- for_each_console(c)
+ /*
+ * This cannot use for_each_console() because it's not established
+ * that the current context has console locked and neither there is
+ * a guarantee that there is no concurrency in that case.
+ *
+ * Open code it for documentation purposes and pretend that
+ * it works.
+ */
+ hlist_for_each_entry_safe(c, tmp, &console_list, node)
c->seq = seq;
}
console_unlock();
@@ -3081,6 +3089,9 @@ static void try_enable_default_console(struct console *newcon)
(con->flags & CON_BOOT) ? "boot" : "", \
con->name, con->index, ##__VA_ARGS__)
+#define console_first() \
+ hlist_entry(console_list.first, struct console, node)
+
/*
* The console driver calls this routine during kernel initialization
* to register the console printing procedure with printk() and to
@@ -3140,8 +3151,8 @@ void register_console(struct console *newcon)
* flag set and will be first in the list.
*/
if (preferred_console < 0) {
- if (!console_drivers || !console_drivers->device ||
- console_drivers->flags & CON_BOOT) {
+ if (hlist_empty(&console_list) || !console_first()->device ||
+ console_first()->flags & CON_BOOT) {
try_enable_default_console(newcon);
}
}
@@ -3169,20 +3180,22 @@ void register_console(struct console *newcon)
}
/*
- * Put this console in the list - keep the
- * preferred driver at the head of the list.
+ * Put this console in the list - keep the
+ * preferred driver at the head of the list.
*/
console_lock();
- if ((newcon->flags & CON_CONSDEV) || console_drivers == NULL) {
- newcon->next = console_drivers;
- console_drivers = newcon;
- if (newcon->next)
- newcon->next->flags &= ~CON_CONSDEV;
- /* Ensure this flag is always set for the head of the list */
+ if (hlist_empty(&console_list)) {
+ /* Ensure CON_CONSDEV is always set for the head. */
newcon->flags |= CON_CONSDEV;
+ hlist_add_head(&newcon->node, &console_list);
+
+ } else if (newcon->flags & CON_CONSDEV) {
+ /* Only the new head can have CON_CONSDEV set. */
+ console_first()->flags &= ~CON_CONSDEV;
+ hlist_add_head(&newcon->node, &console_list);
+
} else {
- newcon->next = console_drivers->next;
- console_drivers->next = newcon;
+ hlist_add_behind(&newcon->node, console_list.first);
}
newcon->dropped = 0;
@@ -3209,16 +3222,18 @@ void register_console(struct console *newcon)
if (bootcon_enabled &&
((newcon->flags & (CON_CONSDEV | CON_BOOT)) == CON_CONSDEV) &&
!keep_bootcon) {
- for_each_console(con)
+ struct hlist_node *tmp;
+
+ hlist_for_each_entry_safe(con, tmp, &console_list, node) {
if (con->flags & CON_BOOT)
unregister_console(con);
+ }
}
}
EXPORT_SYMBOL(register_console);
int unregister_console(struct console *console)
{
- struct console *con;
int res;
con_printk(KERN_INFO, console, "disabled\n");
@@ -3229,32 +3244,30 @@ int unregister_console(struct console *console)
if (res > 0)
return 0;
- res = -ENODEV;
console_lock();
- if (console_drivers == console) {
- console_drivers=console->next;
- res = 0;
- } else {
- for_each_console(con) {
- if (con->next == console) {
- con->next = console->next;
- res = 0;
- break;
- }
- }
+
+ /* Disable it unconditionally */
+ console->flags &= ~CON_ENABLED;
+
+ if (hlist_unhashed(&console->node)) {
+ res = -ENODEV;
+ goto out_unlock;
}
- if (res)
- goto out_disable_unlock;
+ hlist_del_init(&console->node);
/*
+ * <HISTORICAL>
* If this isn't the last console and it has CON_CONSDEV set, we
* need to set it on the next preferred console.
+ * </HISTORICAL>
+ *
+ * The above makes no sense as there is no guarantee that the next
+ * console has any device attached. Oh well....
*/
- if (console_drivers != NULL && console->flags & CON_CONSDEV)
- console_drivers->flags |= CON_CONSDEV;
+ if (!hlist_empty(&console_list) && console->flags & CON_CONSDEV)
+ console_first()->flags |= CON_CONSDEV;
- console->flags &= ~CON_ENABLED;
console_unlock();
console_sysfs_notify();
@@ -3263,10 +3276,8 @@ int unregister_console(struct console *console)
return res;
-out_disable_unlock:
- console->flags &= ~CON_ENABLED;
+out_unlock:
console_unlock();
-
return res;
}
EXPORT_SYMBOL(unregister_console);
@@ -3317,10 +3328,11 @@ void __init console_init(void)
*/
static int __init printk_late_init(void)
{
+ struct hlist_node *tmp;
struct console *con;
int ret;
- for_each_console(con) {
+ hlist_for_each_entry_safe(con, tmp, &console_list, node) {
if (!(con->flags & CON_BOOT))
continue;
@@ -3340,6 +3352,7 @@ static int __init printk_late_init(void)
unregister_console(con);
}
}
+
ret = cpuhp_setup_state_nocalls(CPUHP_PRINTK_DEAD, "printk:dead", NULL,
console_cpu_notify);
WARN_ON(ret < 0);
--
2.30.2
show_cons_active() uses the console_lock to gather information
on registered consoles. Since the console_lock is being used for
multiple reasons, explicitly document these reasons. This will
be useful when the console_lock is split into fine-grained
locking.
Signed-off-by: John Ogness <[email protected]>
---
drivers/tty/tty_io.c | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/drivers/tty/tty_io.c b/drivers/tty/tty_io.c
index 2050e63963bb..333579bfa335 100644
--- a/drivers/tty/tty_io.c
+++ b/drivers/tty/tty_io.c
@@ -3526,6 +3526,14 @@ static ssize_t show_cons_active(struct device *dev,
struct console *c;
ssize_t count = 0;
+ /*
+ * Hold the console_lock to guarantee that no consoles are
+ * unregistered until all console processing is complete.
+ * This also allows safe traversal of the console list.
+ *
+ * Stop console printing because the device() callback may
+ * assume the console is not within its write() callback.
+ */
console_lock();
for_each_console(c) {
if (!c->device)
--
2.30.2
Currently there exist races in console_register(), where the types
of registered consoles are checked (without holding the console_lock)
and then after acquiring the console_lock, it is assumed that the list
has not changed. Also, some code that performs console_unregister()
make similar assumptions.
Introduce a console_list_lock to provide full synchronization for any
console list changes. The console_list_lock also provides
synchronization for updates to console->flags.
Note that one of the various responsibilities of the console_lock is
also intended to provide this synchronization. Later changes will
update call sites relying on the console_lock for this purpose. Once
all call sites have been updated, the console_lock will be relieved
of synchronizing console_list and console->flags updates.
Signed-off-by: John Ogness <[email protected]>
---
include/linux/console.h | 20 ++++++++--
kernel/printk/printk.c | 82 +++++++++++++++++++++++++++++++++++------
2 files changed, 88 insertions(+), 14 deletions(-)
diff --git a/include/linux/console.h b/include/linux/console.h
index 60195cd086dc..bf1e8136424a 100644
--- a/include/linux/console.h
+++ b/include/linux/console.h
@@ -159,8 +159,12 @@ struct console {
};
#ifdef CONFIG_LOCKDEP
+extern void lockdep_assert_console_list_lock_held(void);
extern bool console_srcu_read_lock_is_held(void);
#else
+static inline void lockdep_assert_console_list_lock_held(void)
+{
+}
static inline bool console_srcu_read_lock_is_held(void)
{
return 1;
@@ -170,6 +174,9 @@ static inline bool console_srcu_read_lock_is_held(void)
extern int console_srcu_read_lock(void);
extern void console_srcu_read_unlock(int cookie);
+extern void console_list_lock(void) __acquires(console_mutex);
+extern void console_list_unlock(void) __releases(console_mutex);
+
extern struct hlist_head console_list;
/**
@@ -206,10 +213,17 @@ static inline bool console_is_enabled(const struct console *con)
hlist_for_each_entry_srcu(con, &console_list, node, \
console_srcu_read_lock_is_held())
-/*
- * for_each_console() allows you to iterate on each console
+/**
+ * for_each_console() - Iterator over registered consoles
+ * @con: struct console pointer used as loop cursor
+ *
+ * The console list and all struct console fields are immutable while
+ * iterating.
+ *
+ * Requires console_list_lock to be held.
*/
-#define for_each_console(con) \
+#define for_each_console(con) \
+ lockdep_assert_console_list_lock_held(); \
hlist_for_each_entry(con, &console_list, node)
extern int console_set_on_cmdline;
diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index 2faa6e3e2fb8..3615ee6d68fd 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -78,6 +78,9 @@ EXPORT_SYMBOL(ignore_console_lock_warning);
int oops_in_progress;
EXPORT_SYMBOL(oops_in_progress);
+/* console_mutex protects console_list and console->flags updates. */
+static DEFINE_MUTEX(console_mutex);
+
/*
* console_sem protects console_list and console->flags updates, and also
* provides serialization for access to the entire console driver system.
@@ -104,6 +107,11 @@ static struct lockdep_map console_lock_dep_map = {
.name = "console_lock"
};
+void lockdep_assert_console_list_lock_held(void)
+{
+ lockdep_assert_held(&console_mutex);
+}
+
bool console_srcu_read_lock_is_held(void)
{
return srcu_read_lock_held(&console_srcu);
@@ -225,6 +233,40 @@ int devkmsg_sysctl_set_loglvl(struct ctl_table *table, int write,
}
#endif /* CONFIG_PRINTK && CONFIG_SYSCTL */
+/**
+ * console_list_lock - Lock the console list
+ *
+ * For console list or console->flags updates
+ */
+void console_list_lock(void)
+{
+#ifdef CONFIG_DEBUG_LOCK_ALLOC
+ /*
+ * In unregister_console(), synchronize_srcu() is called with the
+ * console_list_lock held. Therefore it is not allowed that the
+ * console_list_lock is taken with the srcu_lock held.
+ *
+ * Whether or not this context is in the read-side critical section
+ * can only be detected if the appropriate debug options are enabled.
+ */
+ WARN_ON_ONCE(debug_lockdep_rcu_enabled() &&
+ srcu_read_lock_held(&console_srcu));
+#endif
+ mutex_lock(&console_mutex);
+}
+EXPORT_SYMBOL(console_list_lock);
+
+/**
+ * console_list_unlock - Unlock the console list
+ *
+ * Counterpart to console_list_lock()
+ */
+void console_list_unlock(void)
+{
+ mutex_unlock(&console_mutex);
+}
+EXPORT_SYMBOL(console_list_unlock);
+
/**
* console_srcu_read_lock - Register a new reader for the
* SRCU-protected console list
@@ -3055,9 +3097,11 @@ struct tty_driver *console_device(int *index)
void console_stop(struct console *console)
{
__pr_flush(console, 1000, true);
+ console_list_lock();
console_lock();
console->flags &= ~CON_ENABLED;
console_unlock();
+ console_list_unlock();
/* Ensure that all SRCU list walks have completed */
synchronize_srcu(&console_srcu);
@@ -3066,9 +3110,11 @@ EXPORT_SYMBOL(console_stop);
void console_start(struct console *console)
{
+ console_list_lock();
console_lock();
console->flags |= CON_ENABLED;
console_unlock();
+ console_list_unlock();
__pr_flush(console, 1000, true);
}
EXPORT_SYMBOL(console_start);
@@ -3164,6 +3210,8 @@ static void try_enable_default_console(struct console *newcon)
#define console_first() \
hlist_entry(console_list.first, struct console, node)
+static int unregister_console_locked(struct console *console);
+
/*
* The console driver calls this routine during kernel initialization
* to register the console printing procedure with printk() and to
@@ -3188,15 +3236,14 @@ void register_console(struct console *newcon)
struct console *con;
bool bootcon_enabled = false;
bool realcon_enabled = false;
- int cookie;
int err;
- cookie = console_srcu_read_lock();
- for_each_console_srcu(con) {
+ console_list_lock();
+
+ for_each_console(con) {
if (WARN(con == newcon, "console '%s%d' already registered\n",
con->name, con->index)) {
- console_srcu_read_unlock(cookie);
- return;
+ goto unlock;
}
if (con->flags & CON_BOOT)
@@ -3204,13 +3251,12 @@ void register_console(struct console *newcon)
else
realcon_enabled = true;
}
- console_srcu_read_unlock(cookie);
/* Do not register boot consoles when there already is a real one. */
if (newcon->flags & CON_BOOT && realcon_enabled) {
pr_info("Too late to register bootconsole %s%d\n",
newcon->name, newcon->index);
- return;
+ goto unlock;
}
/*
@@ -3241,7 +3287,7 @@ void register_console(struct console *newcon)
/* printk() messages are not printed to the Braille console. */
if (err || newcon->flags & CON_BRL)
- return;
+ goto unlock;
/*
* If we have a bootconsole, and are switching to a real console,
@@ -3304,13 +3350,15 @@ void register_console(struct console *newcon)
hlist_for_each_entry_safe(con, tmp, &console_list, node) {
if (con->flags & CON_BOOT)
- unregister_console(con);
+ unregister_console_locked(con);
}
}
+unlock:
+ console_list_unlock();
}
EXPORT_SYMBOL(register_console);
-int unregister_console(struct console *console)
+static int unregister_console_locked(struct console *console)
{
int res;
@@ -3362,6 +3410,16 @@ int unregister_console(struct console *console)
console_unlock();
return res;
}
+
+int unregister_console(struct console *console)
+{
+ int res;
+
+ console_list_lock();
+ res = unregister_console_locked(console);
+ console_list_unlock();
+ return res;
+}
EXPORT_SYMBOL(unregister_console);
/*
@@ -3414,6 +3472,7 @@ static int __init printk_late_init(void)
struct console *con;
int ret;
+ console_list_lock();
hlist_for_each_entry_safe(con, tmp, &console_list, node) {
if (!(con->flags & CON_BOOT))
continue;
@@ -3431,9 +3490,10 @@ static int __init printk_late_init(void)
*/
pr_warn("bootconsole [%s%d] uses init memory and must be disabled even before the real one is ready\n",
con->name, con->index);
- unregister_console(con);
+ unregister_console_locked(con);
}
}
+ console_list_unlock();
ret = cpuhp_setup_state_nocalls(CPUHP_PRINTK_DEAD, "printk:dead", NULL,
console_cpu_notify);
--
2.30.2
Rather than using the console_lock to guarantee safe console list
traversal, use srcu console list iteration.
Signed-off-by: John Ogness <[email protected]>
---
arch/um/kernel/kmsg_dump.c | 11 ++++-------
1 file changed, 4 insertions(+), 7 deletions(-)
diff --git a/arch/um/kernel/kmsg_dump.c b/arch/um/kernel/kmsg_dump.c
index 3a3bbbb22090..44a418dec919 100644
--- a/arch/um/kernel/kmsg_dump.c
+++ b/arch/um/kernel/kmsg_dump.c
@@ -16,20 +16,17 @@ static void kmsg_dumper_stdout(struct kmsg_dumper *dumper,
struct console *con;
unsigned long flags;
size_t len = 0;
+ int cookie;
/* only dump kmsg when no console is available */
- if (!console_trylock())
- return;
-
- for_each_console(con) {
+ cookie = console_srcu_read_lock();
+ for_each_console_srcu(con) {
if (strcmp(con->name, "tty") == 0 &&
(console_is_enabled(con) || (con->flags & CON_CONSDEV))) {
break;
}
}
-
- console_unlock();
-
+ console_srcu_read_unlock(cookie);
if (con)
return;
--
2.30.2
With SRCU it is now safe to traverse the console list, even if
the console_trylock() failed. However, overwriting console->seq
when console_trylock() failed is still an issue.
Switch to SRCU iteration and document remaining issue with
console->seq.
Signed-off-by: John Ogness <[email protected]>
---
kernel/printk/printk.c | 21 +++++++++++----------
1 file changed, 11 insertions(+), 10 deletions(-)
diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index 0c37ab208395..e478cb92e7ba 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -2998,21 +2998,22 @@ void console_flush_on_panic(enum con_flush_mode mode)
console_may_schedule = 0;
if (mode == CONSOLE_REPLAY_ALL) {
- struct hlist_node *tmp;
struct console *c;
+ int cookie;
u64 seq;
seq = prb_first_valid_seq(prb);
- /*
- * This cannot use for_each_console() because it's not established
- * that the current context has console locked and neither there is
- * a guarantee that there is no concurrency in that case.
- *
- * Open code it for documentation purposes and pretend that
- * it works.
- */
- hlist_for_each_entry_safe(c, tmp, &console_list, node)
+
+ cookie = console_srcu_read_lock();
+ for_each_console_srcu(c) {
+ /*
+ * If the above console_trylock() failed, this is an
+ * unsynchronized assignment. But in that case, the
+ * kernel is in "hope and pray" mode anyway.
+ */
c->seq = seq;
+ }
+ console_srcu_read_unlock(cookie);
}
console_unlock();
}
--
2.30.2
With commit 9e124fe16ff2("xen: Enable console tty by default in domU
if it's not a dummy") a hack was implemented to make sure that the
tty console remains the console behind the /dev/console device. The
main problem with the hack is that, after getting the console pointer
to the tty console, it is assumed the pointer is still valid after
releasing the console_sem. This assumption is incorrect and unsafe.
Make the hack safe by introducing a new function
console_force_preferred() to perform the full operation under
the console_list_lock.
Signed-off-by: John Ogness <[email protected]>
---
drivers/video/fbdev/xen-fbfront.c | 8 +---
include/linux/console.h | 1 +
kernel/printk/printk.c | 69 +++++++++++++++++++------------
3 files changed, 46 insertions(+), 32 deletions(-)
diff --git a/drivers/video/fbdev/xen-fbfront.c b/drivers/video/fbdev/xen-fbfront.c
index 2552c853c6c2..aa362b25a60f 100644
--- a/drivers/video/fbdev/xen-fbfront.c
+++ b/drivers/video/fbdev/xen-fbfront.c
@@ -512,12 +512,8 @@ static void xenfb_make_preferred_console(void)
}
console_srcu_read_unlock(cookie);
- if (c) {
- unregister_console(c);
- c->flags |= CON_CONSDEV;
- c->flags &= ~CON_PRINTBUFFER; /* don't print again */
- register_console(c);
- }
+ if (c)
+ console_force_preferred(c);
}
static int xenfb_resume(struct xenbus_device *dev)
diff --git a/include/linux/console.h b/include/linux/console.h
index bf1e8136424a..41378b00bbdd 100644
--- a/include/linux/console.h
+++ b/include/linux/console.h
@@ -235,6 +235,7 @@ enum con_flush_mode {
};
extern int add_preferred_console(char *name, int idx, char *options);
+extern void console_force_preferred(struct console *c);
extern void register_console(struct console *);
extern int unregister_console(struct console *);
extern void console_lock(void);
diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index 840d581c4b23..9a056a42b8d8 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -3207,38 +3207,17 @@ static void try_enable_default_console(struct console *newcon)
static int unregister_console_locked(struct console *console);
-/*
- * The console driver calls this routine during kernel initialization
- * to register the console printing procedure with printk() and to
- * print any messages that were printed by the kernel before the
- * console driver was initialized.
- *
- * This can happen pretty early during the boot process (because of
- * early_printk) - sometimes before setup_arch() completes - be careful
- * of what kernel features are used - they may not be initialised yet.
- *
- * There are two types of consoles - bootconsoles (early_printk) and
- * "real" consoles (everything which is not a bootconsole) which are
- * handled differently.
- * - Any number of bootconsoles can be registered at any time.
- * - As soon as a "real" console is registered, all bootconsoles
- * will be unregistered automatically.
- * - Once a "real" console is registered, any attempt to register a
- * bootconsoles will be rejected
- */
-void register_console(struct console *newcon)
+static void register_console_locked(struct console *newcon)
{
struct console *con;
bool bootcon_enabled = false;
bool realcon_enabled = false;
int err;
- console_list_lock();
-
for_each_console(con) {
if (WARN(con == newcon, "console '%s%d' already registered\n",
con->name, con->index)) {
- goto unlock;
+ return;
}
if (con->flags & CON_BOOT)
@@ -3251,7 +3230,7 @@ void register_console(struct console *newcon)
if (newcon->flags & CON_BOOT && realcon_enabled) {
pr_info("Too late to register bootconsole %s%d\n",
newcon->name, newcon->index);
- goto unlock;
+ return;
}
/*
@@ -3282,7 +3261,7 @@ void register_console(struct console *newcon)
/* printk() messages are not printed to the Braille console. */
if (err || newcon->flags & CON_BRL)
- goto unlock;
+ return;
/*
* If we have a bootconsole, and are switching to a real console,
@@ -3346,7 +3325,31 @@ void register_console(struct console *newcon)
unregister_console_locked(con);
}
}
-unlock:
+}
+
+/*
+ * The console driver calls this routine during kernel initialization
+ * to register the console printing procedure with printk() and to
+ * print any messages that were printed by the kernel before the
+ * console driver was initialized.
+ *
+ * This can happen pretty early during the boot process (because of
+ * early_printk) - sometimes before setup_arch() completes - be careful
+ * of what kernel features are used - they may not be initialised yet.
+ *
+ * There are two types of consoles - bootconsoles (early_printk) and
+ * "real" consoles (everything which is not a bootconsole) which are
+ * handled differently.
+ * - Any number of bootconsoles can be registered at any time.
+ * - As soon as a "real" console is registered, all bootconsoles
+ * will be unregistered automatically.
+ * - Once a "real" console is registered, any attempt to register a
+ * bootconsoles will be rejected
+ */
+void register_console(struct console *newcon)
+{
+ console_list_lock();
+ register_console_locked(newcon);
console_list_unlock();
}
EXPORT_SYMBOL(register_console);
@@ -3411,6 +3414,20 @@ int unregister_console(struct console *console)
}
EXPORT_SYMBOL(unregister_console);
+void console_force_preferred(struct console *c)
+{
+ console_list_lock();
+
+ if (unregister_console_locked(c) == 0) {
+ c->flags |= CON_CONSDEV;
+ c->flags &= ~CON_PRINTBUFFER; /* don't print again */
+ register_console_locked(c);
+ }
+
+ console_list_unlock();
+}
+EXPORT_SYMBOL(console_force_preferred);
+
/*
* Initialize the console device. This is called *early*, so
* we can't necessarily depend on lots of kernel help here.
--
2.30.2
On Wed, Oct 19, 2022 at 05:01:24PM +0206, John Ogness wrote:
> Replace the open coded single linked list with a hlist so a conversion
> to SRCU protected list walks can reuse the existing primitives.
>
> Signed-off-by: John Ogness <[email protected]>
> ---
> fs/proc/consoles.c | 3 +-
> include/linux/console.h | 8 ++--
> kernel/printk/printk.c | 99 +++++++++++++++++++++++------------------
> 3 files changed, 63 insertions(+), 47 deletions(-)
>
> diff --git a/fs/proc/consoles.c b/fs/proc/consoles.c
> index dfe6ce3505ce..cf2e0788f9c7 100644
> --- a/fs/proc/consoles.c
> +++ b/fs/proc/consoles.c
> @@ -74,8 +74,9 @@ static void *c_start(struct seq_file *m, loff_t *pos)
> static void *c_next(struct seq_file *m, void *v, loff_t *pos)
> {
> struct console *con = v;
> +
> ++*pos;
> - return con->next;
> + return hlist_entry_safe(con->node.next, struct console, node);
> }
>
> static void c_stop(struct seq_file *m, void *v)
> diff --git a/include/linux/console.h b/include/linux/console.h
> index 8c1686e2c233..7b5f21f9e469 100644
> --- a/include/linux/console.h
> +++ b/include/linux/console.h
> @@ -15,6 +15,7 @@
> #define _LINUX_CONSOLE_H_ 1
>
> #include <linux/atomic.h>
> +#include <linux/list.h>
> #include <linux/types.h>
>
> struct vc_data;
> @@ -154,14 +155,16 @@ struct console {
> u64 seq;
> unsigned long dropped;
> void *data;
> - struct console *next;
> + struct hlist_node node;
> };
>
> +extern struct hlist_head console_list;
> +
> /*
> * for_each_console() allows you to iterate on each console
> */
> #define for_each_console(con) \
> - for (con = console_drivers; con != NULL; con = con->next)
> + hlist_for_each_entry(con, &console_list, node)
>
> extern int console_set_on_cmdline;
> extern struct console *early_console;
> @@ -174,7 +177,6 @@ enum con_flush_mode {
> extern int add_preferred_console(char *name, int idx, char *options);
> extern void register_console(struct console *);
> extern int unregister_console(struct console *);
> -extern struct console *console_drivers;
> extern void console_lock(void);
> extern int console_trylock(void);
> extern void console_unlock(void);
> diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
> index e4f1e7478b52..867becc40021 100644
> --- a/kernel/printk/printk.c
> +++ b/kernel/printk/printk.c
> @@ -79,13 +79,12 @@ int oops_in_progress;
> EXPORT_SYMBOL(oops_in_progress);
>
> /*
> - * console_sem protects the console_drivers list, and also
> - * provides serialisation for access to the entire console
> - * driver system.
> + * console_sem protects console_list and console->flags updates, and also
> + * provides serialization for access to the entire console driver system.
> */
> static DEFINE_SEMAPHORE(console_sem);
> -struct console *console_drivers;
> -EXPORT_SYMBOL_GPL(console_drivers);
> +HLIST_HEAD(console_list);
> +EXPORT_SYMBOL_GPL(console_list);
>
> /*
> * System may need to suppress printk message under certain
> @@ -2556,7 +2555,7 @@ static int console_cpu_notify(unsigned int cpu)
> * console_lock - lock the console system for exclusive use.
> *
> * Acquires a lock which guarantees that the caller has
> - * exclusive access to the console system and the console_drivers list.
> + * exclusive access to the console system and console_list.
> *
> * Can sleep, returns nothing.
> */
> @@ -2576,7 +2575,7 @@ EXPORT_SYMBOL(console_lock);
> * console_trylock - try to lock the console system for exclusive use.
> *
> * Try to acquire a lock which guarantees that the caller has exclusive
> - * access to the console system and the console_drivers list.
> + * access to the console system and console_list.
> *
> * returns 1 on success, and 0 on failure to acquire the lock.
> */
> @@ -2940,11 +2939,20 @@ void console_flush_on_panic(enum con_flush_mode mode)
> console_may_schedule = 0;
>
> if (mode == CONSOLE_REPLAY_ALL) {
> + struct hlist_node *tmp;
> struct console *c;
> u64 seq;
>
> seq = prb_first_valid_seq(prb);
> - for_each_console(c)
> + /*
> + * This cannot use for_each_console() because it's not established
> + * that the current context has console locked and neither there is
> + * a guarantee that there is no concurrency in that case.
> + *
> + * Open code it for documentation purposes and pretend that
> + * it works.
> + */
> + hlist_for_each_entry_safe(c, tmp, &console_list, node)
> c->seq = seq;
> }
> console_unlock();
> @@ -3081,6 +3089,9 @@ static void try_enable_default_console(struct console *newcon)
> (con->flags & CON_BOOT) ? "boot" : "", \
> con->name, con->index, ##__VA_ARGS__)
>
> +#define console_first() \
> + hlist_entry(console_list.first, struct console, node)
> +
> /*
> * The console driver calls this routine during kernel initialization
> * to register the console printing procedure with printk() and to
> @@ -3140,8 +3151,8 @@ void register_console(struct console *newcon)
> * flag set and will be first in the list.
> */
> if (preferred_console < 0) {
> - if (!console_drivers || !console_drivers->device ||
> - console_drivers->flags & CON_BOOT) {
> + if (hlist_empty(&console_list) || !console_first()->device ||
> + console_first()->flags & CON_BOOT) {
> try_enable_default_console(newcon);
> }
> }
> @@ -3169,20 +3180,22 @@ void register_console(struct console *newcon)
> }
>
> /*
> - * Put this console in the list - keep the
> - * preferred driver at the head of the list.
> + * Put this console in the list - keep the
> + * preferred driver at the head of the list.
> */
> console_lock();
> - if ((newcon->flags & CON_CONSDEV) || console_drivers == NULL) {
> - newcon->next = console_drivers;
> - console_drivers = newcon;
> - if (newcon->next)
> - newcon->next->flags &= ~CON_CONSDEV;
> - /* Ensure this flag is always set for the head of the list */
> + if (hlist_empty(&console_list)) {
> + /* Ensure CON_CONSDEV is always set for the head. */
> newcon->flags |= CON_CONSDEV;
> + hlist_add_head(&newcon->node, &console_list);
> +
> + } else if (newcon->flags & CON_CONSDEV) {
> + /* Only the new head can have CON_CONSDEV set. */
> + console_first()->flags &= ~CON_CONSDEV;
> + hlist_add_head(&newcon->node, &console_list);
> +
> } else {
> - newcon->next = console_drivers->next;
> - console_drivers->next = newcon;
> + hlist_add_behind(&newcon->node, console_list.first);
> }
>
> newcon->dropped = 0;
> @@ -3209,16 +3222,18 @@ void register_console(struct console *newcon)
> if (bootcon_enabled &&
> ((newcon->flags & (CON_CONSDEV | CON_BOOT)) == CON_CONSDEV) &&
> !keep_bootcon) {
> - for_each_console(con)
> + struct hlist_node *tmp;
> +
> + hlist_for_each_entry_safe(con, tmp, &console_list, node) {
> if (con->flags & CON_BOOT)
> unregister_console(con);
> + }
> }
> }
> EXPORT_SYMBOL(register_console);
>
> int unregister_console(struct console *console)
> {
> - struct console *con;
> int res;
>
> con_printk(KERN_INFO, console, "disabled\n");
> @@ -3229,32 +3244,30 @@ int unregister_console(struct console *console)
> if (res > 0)
> return 0;
>
> - res = -ENODEV;
> console_lock();
> - if (console_drivers == console) {
> - console_drivers=console->next;
> - res = 0;
> - } else {
> - for_each_console(con) {
> - if (con->next == console) {
> - con->next = console->next;
> - res = 0;
> - break;
> - }
> - }
> +
> + /* Disable it unconditionally */
> + console->flags &= ~CON_ENABLED;
> +
> + if (hlist_unhashed(&console->node)) {
How can this ever be hit? The console lock is held, so it shouldn't
have gone away already. Or am I missing something else here?
Other than that minor question, looks good to me!
Reviewed-by: Greg Kroah-Hartman <[email protected]>
On 2022-10-19, Greg Kroah-Hartman <[email protected]> wrote:
>> diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
>> index e4f1e7478b52..867becc40021 100644
>> --- a/kernel/printk/printk.c
>> +++ b/kernel/printk/printk.c
>> @@ -3229,32 +3244,30 @@ int unregister_console(struct console *console)
>> if (res > 0)
>> return 0;
>>
>> - res = -ENODEV;
>> console_lock();
>> - if (console_drivers == console) {
>> - console_drivers=console->next;
>> - res = 0;
>> - } else {
>> - for_each_console(con) {
>> - if (con->next == console) {
>> - con->next = console->next;
>> - res = 0;
>> - break;
>> - }
>> - }
>> +
>> + /* Disable it unconditionally */
>> + console->flags &= ~CON_ENABLED;
>> +
>> + if (hlist_unhashed(&console->node)) {
>
> How can this ever be hit? The console lock is held, so it shouldn't
> have gone away already. Or am I missing something else here?
Mainline also has this check. I expect it is for the case that some code
tries to call unregister_console() for a console that is not
registered.
Since register_console() does not return if it succeeded, I suppose some
code somewhere my try to unregister without knowing that it never
registered in the first place.
> Other than that minor question, looks good to me!
>
> Reviewed-by: Greg Kroah-Hartman <[email protected]>
Thanks!
John
On Wed, Oct 19, 2022 at 11:52:53PM +0206, John Ogness wrote:
> On 2022-10-19, Greg Kroah-Hartman <[email protected]> wrote:
> >> diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
> >> index e4f1e7478b52..867becc40021 100644
> >> --- a/kernel/printk/printk.c
> >> +++ b/kernel/printk/printk.c
> >> @@ -3229,32 +3244,30 @@ int unregister_console(struct console *console)
> >> if (res > 0)
> >> return 0;
> >>
> >> - res = -ENODEV;
> >> console_lock();
> >> - if (console_drivers == console) {
> >> - console_drivers=console->next;
> >> - res = 0;
> >> - } else {
> >> - for_each_console(con) {
> >> - if (con->next == console) {
> >> - con->next = console->next;
> >> - res = 0;
> >> - break;
> >> - }
> >> - }
> >> +
> >> + /* Disable it unconditionally */
> >> + console->flags &= ~CON_ENABLED;
> >> +
> >> + if (hlist_unhashed(&console->node)) {
> >
> > How can this ever be hit? The console lock is held, so it shouldn't
> > have gone away already. Or am I missing something else here?
>
> Mainline also has this check. I expect it is for the case that some code
> tries to call unregister_console() for a console that is not
> registered.
>
> Since register_console() does not return if it succeeded, I suppose some
> code somewhere my try to unregister without knowing that it never
> registered in the first place.
Ick, ok, that's fine for now.
What a mess, thanks for working to unwind it!
greg k-h
On Wed, Oct 19, 2022 at 05:01:45PM +0206, John Ogness wrote:
> show_cons_active() uses the console_lock to gather information
> on registered consoles. Since the console_lock is being used for
> multiple reasons, explicitly document these reasons. This will
> be useful when the console_lock is split into fine-grained
> locking.
>
> Signed-off-by: John Ogness <[email protected]>
> ---
> drivers/tty/tty_io.c | 8 ++++++++
> 1 file changed, 8 insertions(+)
>
> diff --git a/drivers/tty/tty_io.c b/drivers/tty/tty_io.c
> index 2050e63963bb..333579bfa335 100644
> --- a/drivers/tty/tty_io.c
> +++ b/drivers/tty/tty_io.c
> @@ -3526,6 +3526,14 @@ static ssize_t show_cons_active(struct device *dev,
> struct console *c;
> ssize_t count = 0;
>
> + /*
> + * Hold the console_lock to guarantee that no consoles are
> + * unregistered until all console processing is complete.
> + * This also allows safe traversal of the console list.
> + *
> + * Stop console printing because the device() callback may
> + * assume the console is not within its write() callback.
> + */
> console_lock();
> for_each_console(c) {
> if (!c->device)
> --
> 2.30.2
>
Reviewed-by: Greg Kroah-Hartman <[email protected]>
On Wed, Oct 19, 2022 at 05:01:55PM +0206, John Ogness wrote:
> Currently there exist races in console_register(), where the types
> of registered consoles are checked (without holding the console_lock)
> and then after acquiring the console_lock, it is assumed that the list
> has not changed. Also, some code that performs console_unregister()
> make similar assumptions.
>
> Introduce a console_list_lock to provide full synchronization for any
> console list changes. The console_list_lock also provides
> synchronization for updates to console->flags.
>
> Note that one of the various responsibilities of the console_lock is
> also intended to provide this synchronization. Later changes will
> update call sites relying on the console_lock for this purpose. Once
> all call sites have been updated, the console_lock will be relieved
> of synchronizing console_list and console->flags updates.
>
> Signed-off-by: John Ogness <[email protected]>
> ---
> include/linux/console.h | 20 ++++++++--
> kernel/printk/printk.c | 82 +++++++++++++++++++++++++++++++++++------
> 2 files changed, 88 insertions(+), 14 deletions(-)
Reviewed-by: Greg Kroah-Hartman <[email protected]>
On Wed 2022-10-19 17:01:24, John Ogness wrote:
> Replace the open coded single linked list with a hlist so a conversion
> to SRCU protected list walks can reuse the existing primitives.
>
> Signed-off-by: John Ogness <[email protected]>
Reviewed-by: Petr Mladek <[email protected]>
Just one nit below.
> @@ -3229,32 +3244,30 @@ int unregister_console(struct console *console)
> if (res > 0)
> return 0;
>
> - res = -ENODEV;
> console_lock();
> - if (console_drivers == console) {
> - console_drivers=console->next;
> - res = 0;
> - } else {
> - for_each_console(con) {
> - if (con->next == console) {
> - con->next = console->next;
> - res = 0;
> - break;
> - }
> - }
> +
> + /* Disable it unconditionally */
> + console->flags &= ~CON_ENABLED;
> +
> + if (hlist_unhashed(&console->node)) {
> + res = -ENODEV;
> + goto out_unlock;
Nit: It might make sense to replace this with:
console_unlock();
return -ENODEV;
This is the only code path using the extra goto target.
It is just an idea. I do not resist on this change.
> }
>
> - if (res)
> - goto out_disable_unlock;
> + hlist_del_init(&console->node);
>
> /*
> + * <HISTORICAL>
> * If this isn't the last console and it has CON_CONSDEV set, we
> * need to set it on the next preferred console.
> + * </HISTORICAL>
> + *
> + * The above makes no sense as there is no guarantee that the next
> + * console has any device attached. Oh well....
> */
> - if (console_drivers != NULL && console->flags & CON_CONSDEV)
> - console_drivers->flags |= CON_CONSDEV;
> + if (!hlist_empty(&console_list) && console->flags & CON_CONSDEV)
> + console_first()->flags |= CON_CONSDEV;
>
> - console->flags &= ~CON_ENABLED;
> console_unlock();
> console_sysfs_notify();
>
> @@ -3263,10 +3276,8 @@ int unregister_console(struct console *console)
>
> return res;
>
> -out_disable_unlock:
> - console->flags &= ~CON_ENABLED;
> +out_unlock:
> console_unlock();
> -
> return res;
> }
> EXPORT_SYMBOL(unregister_console);
Best Regards,
Petr
On Wed 2022-10-19 17:01:42, John Ogness wrote:
> Rather than using the console_lock to guarantee safe console list
> traversal, use srcu console list iteration.
>
> Signed-off-by: John Ogness <[email protected]>
> ---
> arch/um/kernel/kmsg_dump.c | 11 ++++-------
> 1 file changed, 4 insertions(+), 7 deletions(-)
>
> diff --git a/arch/um/kernel/kmsg_dump.c b/arch/um/kernel/kmsg_dump.c
> index 3a3bbbb22090..44a418dec919 100644
> --- a/arch/um/kernel/kmsg_dump.c
> +++ b/arch/um/kernel/kmsg_dump.c
> @@ -16,20 +16,17 @@ static void kmsg_dumper_stdout(struct kmsg_dumper *dumper,
> struct console *con;
> unsigned long flags;
> size_t len = 0;
> + int cookie;
>
> /* only dump kmsg when no console is available */
I agree that it is perfectly fine to use RCU here. The previous
code was just a best effort. The kdump was done without console_lock()
so that the console might get available in the meantime.
I guess that it is not a big deal. The dumper is called typically only
during panic() where new consoles are not added.
> - if (!console_trylock())
> - return;
> -
> - for_each_console(con) {
> + cookie = console_srcu_read_lock();
> + for_each_console_srcu(con) {
> if (strcmp(con->name, "tty") == 0 &&
> (console_is_enabled(con) || (con->flags & CON_CONSDEV))) {
This is slightly more racy than the previous code. Only one console
could have CON_CONSDEV. It might be moved to another console when the
list is manipulated. As a result, rcu walk might see zero, one, or two
consoles with this flag unless the flag is moved carefully.
Anyway, this check does not match the comment and does not make much
sense to me. It is true that CON_CONSDEV flag is used only for
registered consoles. But messages are printed on the console only
when CON_ENABLED flag is set.
IMHO, we should check only console_is_enabled() here.
Adding Thomas Meyer and Richard Weinberger <[email protected]> into Cc.
Thomas added this check in the commit e23fe90dec286cd77e90
("um: kmsg_dumper: always dump when not tty console") and
Richard pushed it.
> break;
> }
> }
> -
> - console_unlock();
> -
> + console_srcu_read_unlock(cookie);
>
> if (con)
> return;
Best Regards,
Petr
On (22/10/19 17:01), John Ogness wrote:
> Replace the open coded single linked list with a hlist so a conversion
> to SRCU protected list walks can reuse the existing primitives.
>
> Signed-off-by: John Ogness <[email protected]>
Reviewed-by: Sergey Senozhatsky <[email protected]>
On Wed 2022-10-19 17:01:45, John Ogness wrote:
> show_cons_active() uses the console_lock to gather information
> on registered consoles. Since the console_lock is being used for
> multiple reasons, explicitly document these reasons. This will
> be useful when the console_lock is split into fine-grained
> locking.
>
> Signed-off-by: John Ogness <[email protected]>
> ---
> drivers/tty/tty_io.c | 8 ++++++++
> 1 file changed, 8 insertions(+)
>
> diff --git a/drivers/tty/tty_io.c b/drivers/tty/tty_io.c
> index 2050e63963bb..333579bfa335 100644
> --- a/drivers/tty/tty_io.c
> +++ b/drivers/tty/tty_io.c
> @@ -3526,6 +3526,14 @@ static ssize_t show_cons_active(struct device *dev,
> struct console *c;
> ssize_t count = 0;
>
> + /*
> + * Hold the console_lock to guarantee that no consoles are
> + * unregistered until all console processing is complete.
> + * This also allows safe traversal of the console list.
This is more or less clear. show_cons_active() reads a lot of
information from the registered consoles.
> + *
> + * Stop console printing because the device() callback may
> + * assume the console is not within its write() callback.
I wonder if this is based on some real example or if you just want
to stay on the safe side.
It is perfectly fine to stay on the safe side. But we should make
it clear if the dependency really exists or if it has to be
investigated later during the clean up.
> + */
> console_lock();
> for_each_console(c) {
> if (!c->device)
Anyway, thanks for adding the comment.
Best Regards,
Petr
On Wed 2022-10-19 17:01:52, John Ogness wrote:
> Use srcu console list iteration for console list traversal.
>
> Document why the console_lock is still necessary.
>
> Signed-off-by: John Ogness <[email protected]>
> ---
> kernel/printk/printk.c | 11 ++++++++++-
> 1 file changed, 10 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
> index e478cb92e7ba..410ad9d5649c 100644
> --- a/kernel/printk/printk.c
> +++ b/kernel/printk/printk.c
> @@ -3025,15 +3025,24 @@ struct tty_driver *console_device(int *index)
> {
> struct console *c;
> struct tty_driver *driver = NULL;
> + int cookie;
>
> + /*
> + * Stop console printing because the device() callback may
> + * assume the console is not within its write() callback.
Again, I would like to know more details about the possible races
with the write() callback. It is not that obvious.
> + */
> console_lock();
> - for_each_console(c) {
> +
> + cookie = console_srcu_read_lock();
> + for_each_console_srcu(c) {
> if (!c->device)
> continue;
> driver = c->device(c, index);
> if (driver)
> break;
> }
> + console_srcu_read_unlock(cookie);
> +
> console_unlock();
> return driver;
Otherwise, the change looks good.
Best Regards,
Petr
On Wed 2022-10-19 17:01:51, John Ogness wrote:
> With SRCU it is now safe to traverse the console list, even if
> the console_trylock() failed. However, overwriting console->seq
> when console_trylock() failed is still an issue.
>
> Switch to SRCU iteration and document remaining issue with
> console->seq.
>
> Signed-off-by: John Ogness <[email protected]>
Reviewed-by: Petr Mladek <[email protected]>
Best Regards,
Petr
Adding Paul into Cc so that he is aware of using a custom SRCU lockdep
check in console_list_lock().
On Wed 2022-10-19 17:01:55, John Ogness wrote:
> Currently there exist races in console_register(), where the types
> of registered consoles are checked (without holding the console_lock)
> and then after acquiring the console_lock, it is assumed that the list
> has not changed. Also, some code that performs console_unregister()
> make similar assumptions.
This sounds like that the list lock is added just to fix the races.
People might wonder why it is not done using console_lock().
My understanding is that removing this responsibility from console_lock() is
the main motivation. It would deserve a comment, e.g.
<proposal>
A solution would be to synchronize this using console_lock(). But it
would require a complex analyze of all console drivers to make sure
that console_lock() is not taken in match() and setup() callbacks.
And splitting the responsibilities of console_lock() is actually
one big motivation here.
Instead, introduce a console_list_lock...
</proposal>
> Introduce a console_list_lock to provide full synchronization for any
> console list changes.
> The console_list_lock also provides synchronization for updates
> to console->flags.
This would deserve some explanation. The synchronization of the list
manipulation is obvious, especially when the lock is called
console_list_lock. But the motivation to use this lock
for console->flags is much more complicated.
I had some problems to create a reasonable mental model about it.
I did split the flags into groups:
1. Flags that are driver-specific and static. They do not need
any synchronization:
+ CON_BOOT
+ CON_ANYTIME
2. Flags that are modified only during console registration [*]:
+ CON_PRINTBUFFER
+ CON_CONSDEV
+ CON_BRL
+ CON_EXTENDED
3. Flags that might be modified by more operations, namely: console
registration, start, and stop [*].
+ CON_ENABLED
[*] It is actually more complicated. Some flags are modified also
outside console registration code. But we are not going to
synchronize these changes because they are not visible
to others via console_drivers list.
This explained why it made sense to synchronize console->flags using
console_list_lock. Also this explained why
console_start()/console_stop() were updated in this patch.
The following description would have helped me:
<proposal>
In addition, use console_list_lock also for synchronization of
console->flags updates. All flags are either static or modified
only during the console registration. There are only few
exceptions.
The only exception is CON_ENABLED that is modified also by
console_start()/console_stop(). We need to take console_list_lock()
here as well.
Another exception is when the flags are modified by the console
driver init code before the console gets registered. These will
be ignored because they are not visible to the rest of the system
via console_drivers list.
</proposal>
> Note that one of the various responsibilities of the console_lock is
> also intended to provide this synchronization. Later changes will
> update call sites relying on the console_lock for this purpose. Once
> all call sites have been updated, the console_lock will be relieved
> of synchronizing console_list and console->flags updates.
It seems that this patch actually updates all writers. It would be
nice to mention it to define the scope of this patch.
<proposal>
Note that one of the various responsibilities of the console_lock is
also intended to provide this synchronization. Only the callers that
modify the list or flags are updated here. Later changes will
update the readers Once all call sites have been updated,
the console_lock will be relieved of synchronizing console_list
and console->flags updates.
</proposal>
> diff --git a/include/linux/console.h b/include/linux/console.h
> index 60195cd086dc..bf1e8136424a 100644
> --- a/include/linux/console.h
> +++ b/include/linux/console.h
> @@ -159,8 +159,12 @@ struct console {
> };
>
> #ifdef CONFIG_LOCKDEP
> +extern void lockdep_assert_console_list_lock_held(void);
> extern bool console_srcu_read_lock_is_held(void);
> #else
> +static inline void lockdep_assert_console_list_lock_held(void)
> +{
> +}
> static inline bool console_srcu_read_lock_is_held(void)
> {
> return 1;
> @@ -170,6 +174,9 @@ static inline bool console_srcu_read_lock_is_held(void)
> extern int console_srcu_read_lock(void);
> extern void console_srcu_read_unlock(int cookie);
>
> +extern void console_list_lock(void) __acquires(console_mutex);
> +extern void console_list_unlock(void) __releases(console_mutex);
> +
> extern struct hlist_head console_list;
>
> /**
> @@ -206,10 +213,17 @@ static inline bool console_is_enabled(const struct console *con)
> hlist_for_each_entry_srcu(con, &console_list, node, \
> console_srcu_read_lock_is_held())
>
> -/*
> - * for_each_console() allows you to iterate on each console
> +/**
> + * for_each_console() - Iterator over registered consoles
> + * @con: struct console pointer used as loop cursor
> + *
> + * The console list and all struct console fields are immutable while
> + * iterating.
s/all struct console fields/console->flags/ ?
> + *
> + * Requires console_list_lock to be held.
> */
> -#define for_each_console(con) \
> +#define for_each_console(con) \
> + lockdep_assert_console_list_lock_held(); \
> hlist_for_each_entry(con, &console_list, node)
>
> extern int console_set_on_cmdline;
> diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
> index 2faa6e3e2fb8..3615ee6d68fd 100644
> --- a/kernel/printk/printk.c
> +++ b/kernel/printk/printk.c
> @@ -78,6 +78,9 @@ EXPORT_SYMBOL(ignore_console_lock_warning);
> int oops_in_progress;
> EXPORT_SYMBOL(oops_in_progress);
>
> +/* console_mutex protects console_list and console->flags updates. */
/*
* console_mutex protects console_list updates and console->flags updates.
* The flags are synchronized only for consoles that are registered,
* accessible via the list.
*/
> +static DEFINE_MUTEX(console_mutex);
> +
> /*
> * console_sem protects console_list and console->flags updates, and also
> * provides serialization for access to the entire console driver system.
> @@ -104,6 +107,11 @@ static struct lockdep_map console_lock_dep_map = {
> .name = "console_lock"
> };
>
> +void lockdep_assert_console_list_lock_held(void)
> +{
> + lockdep_assert_held(&console_mutex);
> +}
> +
> bool console_srcu_read_lock_is_held(void)
> {
> return srcu_read_lock_held(&console_srcu);
> @@ -225,6 +233,40 @@ int devkmsg_sysctl_set_loglvl(struct ctl_table *table, int write,
> }
> #endif /* CONFIG_PRINTK && CONFIG_SYSCTL */
>
> +/**
> + * console_list_lock - Lock the console list
> + *
> + * For console list or console->flags updates
> + */
> +void console_list_lock(void)
> +{
> +#ifdef CONFIG_DEBUG_LOCK_ALLOC
> + /*
> + * In unregister_console(), synchronize_srcu() is called with the
> + * console_list_lock held. Therefore it is not allowed that the
> + * console_list_lock is taken with the srcu_lock held.
> + *
> + * Whether or not this context is in the read-side critical section
> + * can only be detected if the appropriate debug options are enabled.
> + */
> + WARN_ON_ONCE(debug_lockdep_rcu_enabled() &&
> + srcu_read_lock_held(&console_srcu));
> +#endif
> + mutex_lock(&console_mutex);
> +}
> +EXPORT_SYMBOL(console_list_lock);
> +
> +/**
> + * console_list_unlock - Unlock the console list
> + *
> + * Counterpart to console_list_lock()
> + */
> +void console_list_unlock(void)
> +{
> + mutex_unlock(&console_mutex);
> +}
> +EXPORT_SYMBOL(console_list_unlock);
> +
> /**
> * console_srcu_read_lock - Register a new reader for the
> * SRCU-protected console list
> @@ -3304,13 +3350,15 @@ void register_console(struct console *newcon)
>
> hlist_for_each_entry_safe(con, tmp, &console_list, node) {
> if (con->flags & CON_BOOT)
> - unregister_console(con);
> + unregister_console_locked(con);
> }
> }
> +unlock:
> + console_list_unlock();
> }
> EXPORT_SYMBOL(register_console);
>
> -int unregister_console(struct console *console)
We should make it clear that it must be locked by console_list_lock().
Many people would expect console_lock() ;-) I would add a comment
and assert.
/* Must be called under console_list_lock(). */
> +static int unregister_console_locked(struct console *console)
> {
assert_console_list_lock_held();
> int res;
>
With updated comments:
Reviewed-by: Petr Mladek <[email protected]>
Best Regards,
Petr
On 2022-10-27, Petr Mladek <[email protected]> wrote:
>> - if (c) {
>> - unregister_console(c);
>> - c->flags |= CON_CONSDEV;
>> - c->flags &= ~CON_PRINTBUFFER; /* don't print again */
>> - register_console(c);
>> - }
>> + if (c)
>> + console_force_preferred(c);
>
> I would prefer to fix this a clean way.
>
> [...]
>
> I would suggest to implement:
>
> [...]
>
> It is a more code. But it is race-free. Also it is much more clear
> what is going on.
>
> How does this sound, please?
I wasn't sure if any of the other preferred-console magic in
register_console() was needed, which is why I kept a full
register_console() call. But if it really is just about forcing it the
head and setting a new CON_CONSDEV, then your suggestion is much
simpler. Thanks.
John
On Wed 2022-10-19 17:02:00, John Ogness wrote:
> With commit 9e124fe16ff2("xen: Enable console tty by default in domU
> if it's not a dummy") a hack was implemented to make sure that the
> tty console remains the console behind the /dev/console device. The
> main problem with the hack is that, after getting the console pointer
> to the tty console, it is assumed the pointer is still valid after
> releasing the console_sem. This assumption is incorrect and unsafe.
>
> Make the hack safe by introducing a new function
> console_force_preferred() to perform the full operation under
> the console_list_lock.
>
> Signed-off-by: John Ogness <[email protected]>
> ---
> drivers/video/fbdev/xen-fbfront.c | 8 +---
> include/linux/console.h | 1 +
> kernel/printk/printk.c | 69 +++++++++++++++++++------------
> 3 files changed, 46 insertions(+), 32 deletions(-)
>
> diff --git a/drivers/video/fbdev/xen-fbfront.c b/drivers/video/fbdev/xen-fbfront.c
> index 2552c853c6c2..aa362b25a60f 100644
> --- a/drivers/video/fbdev/xen-fbfront.c
> +++ b/drivers/video/fbdev/xen-fbfront.c
> @@ -512,12 +512,8 @@ static void xenfb_make_preferred_console(void)
> }
> console_srcu_read_unlock(cookie);
>
> - if (c) {
> - unregister_console(c);
> - c->flags |= CON_CONSDEV;
> - c->flags &= ~CON_PRINTBUFFER; /* don't print again */
> - register_console(c);
> - }
> + if (c)
> + console_force_preferred(c);
I would prefer to fix this a clean way. The current code is a real hack.
It tries to find a console under console_srcu. Then the console
is unregistered, flags are modified, and gets registered again.
The locks are released between all these operations.
I would suggest to implement:
void console_force_preferred_locked(struct console *new_pref_con)
{
struct console *cur_pref_con;
assert_console_list_lock_held();
if (hlist_unhashed(&new_pref_con->node))
return;
for_each_console(cur_pref_con) {
if (cur_pref_con->flags & CON_CONSDEV)
break;
}
/* Already preferred? */
if (cur_pref_con == new_pref_con)
return;
hlist_del_init_rcu(&new_pref_con->node);
/*
* Ensure that all SRCU list walks have completed before @con
* is added back as the first console
*/
synchronize_srcu()
hlist_add_behind_rcu(&new_pref_con->node, console_list.first);
cur_pref_con->flags &= ~CON_CONSDEV;
new_pref_con->flags |= CON_CONSDEV;
}
And do:
static void xenfb_make_preferred_console(void)
{
struct console *c;
if (console_set_on_cmdline)
return;
console_list_lock();
for_each_console(c) {
if (!strcmp(c->name, "tty") && c->index == 0)
break;
}
if (c)
console_force_preferred_locked(c);
console_list_unlock();
}
It is a more code. But it is race-free. Also it is much more clear
what is going on.
How does this sound, please?
Best Regards,
Petr
On Thu 2022-10-27 15:41:23, John Ogness wrote:
> On 2022-10-27, Petr Mladek <[email protected]> wrote:
> >> - if (c) {
> >> - unregister_console(c);
> >> - c->flags |= CON_CONSDEV;
> >> - c->flags &= ~CON_PRINTBUFFER; /* don't print again */
> >> - register_console(c);
> >> - }
> >> + if (c)
> >> + console_force_preferred(c);
> >
> > I would prefer to fix this a clean way.
> >
> > [...]
> >
> > I would suggest to implement:
> >
> > [...]
> >
> > It is a more code. But it is race-free. Also it is much more clear
> > what is going on.
> >
> > How does this sound, please?
>
> I wasn't sure if any of the other preferred-console magic in
> register_console() was needed, which is why I kept a full
> register_console() call. But if it really is just about forcing it the
> head and setting a new CON_CONSDEV, then your suggestion is much
> simpler. Thanks.
I believe that it is just this. I have spent a lot of time
investigating these hacks when thinking about refactoring
of register_console().
Best Regards,
Petr
On Thu, Oct 27, 2022 at 12:09:32PM +0200, Petr Mladek wrote:
> Adding Paul into Cc so that he is aware of using a custom SRCU lockdep
> check in console_list_lock().
[ . . . ]
> > +/**
> > + * console_list_lock - Lock the console list
> > + *
> > + * For console list or console->flags updates
> > + */
> > +void console_list_lock(void)
> > +{
> > +#ifdef CONFIG_DEBUG_LOCK_ALLOC
> > + /*
> > + * In unregister_console(), synchronize_srcu() is called with the
> > + * console_list_lock held. Therefore it is not allowed that the
> > + * console_list_lock is taken with the srcu_lock held.
> > + *
> > + * Whether or not this context is in the read-side critical section
> > + * can only be detected if the appropriate debug options are enabled.
> > + */
> > + WARN_ON_ONCE(debug_lockdep_rcu_enabled() &&
> > + srcu_read_lock_held(&console_srcu));
Yes, this is an interesting case.
The problem that John is facing is that srcu_read_lock_held() believes
that it is safer to err on the side of there being an SRCU reader.
This is because the standard use is to complain if there is -not-
an SRCU reader. So as soon as there is a lockdep issue anywhere,
srcu_read_lock_held() switches to unconditionally returning true.
Which is exactly what John does not want in this case.
So he excludes the CONFIG_DEBUG_LOCK_ALLOC=n case and the
!debug_lockdep_rcu_enabled() case, both of which cause
srcu_read_lock_held() to unconditionally return true.
This can result in false-positive splats if some other CPU issues a
lockdep warning after debug_lockdep_rcu_enabled() is invoked but before
srcu_read_lock_held() is invoked. But similar vulnerabilities are
present in RCU_LOCKDEP_WARN(), so unless and until there is a problem,
this code should suffice.
One way to save a line is as follows:
WARN_ON_ONCE(IS_ENABLED(CONFIG_DEBUG_LOCK_ALLOC) &&
debug_lockdep_rcu_enabled() &&
srcu_read_lock_held(&console_srcu));
Longer term, it might be possible to teach lockdep about this sort of
SRCU deadlock. This is not an issue for vanilla RCU because the RCU
reader context prohibits such deadlocks. This isn't exactly the same
as reader-writer locking because this is perfectly OK with SRCU:
CPU 0:
spin_lock(&mylock);
idx = srcu_read_lock(&mysrcu);
do_something();
srcu_read_unlock(&mysrcu, idx);
spin_unlock(&mylock);
CPU 1:
idx = srcu_read_lock(&mysrcu);
spin_lock(&mylock);
do_something();
spin_unlock(&mylock);
srcu_read_unlock(&mysrcu, idx);
Adding Boqun on CC in case it is easier than I think. ;-)
Thanx, Paul
On Thu, Oct 27, 2022 at 11:50:07AM -0700, Paul E. McKenney wrote:
> On Thu, Oct 27, 2022 at 12:09:32PM +0200, Petr Mladek wrote:
> > Adding Paul into Cc so that he is aware of using a custom SRCU lockdep
> > check in console_list_lock().
>
> [ . . . ]
>
> > > +/**
> > > + * console_list_lock - Lock the console list
> > > + *
> > > + * For console list or console->flags updates
> > > + */
> > > +void console_list_lock(void)
> > > +{
> > > +#ifdef CONFIG_DEBUG_LOCK_ALLOC
> > > + /*
> > > + * In unregister_console(), synchronize_srcu() is called with the
> > > + * console_list_lock held. Therefore it is not allowed that the
> > > + * console_list_lock is taken with the srcu_lock held.
> > > + *
> > > + * Whether or not this context is in the read-side critical section
> > > + * can only be detected if the appropriate debug options are enabled.
> > > + */
> > > + WARN_ON_ONCE(debug_lockdep_rcu_enabled() &&
> > > + srcu_read_lock_held(&console_srcu));
>
> Yes, this is an interesting case.
>
> The problem that John is facing is that srcu_read_lock_held() believes
> that it is safer to err on the side of there being an SRCU reader.
> This is because the standard use is to complain if there is -not-
> an SRCU reader. So as soon as there is a lockdep issue anywhere,
> srcu_read_lock_held() switches to unconditionally returning true.
>
> Which is exactly what John does not want in this case.
>
> So he excludes the CONFIG_DEBUG_LOCK_ALLOC=n case and the
> !debug_lockdep_rcu_enabled() case, both of which cause
> srcu_read_lock_held() to unconditionally return true.
>
> This can result in false-positive splats if some other CPU issues a
> lockdep warning after debug_lockdep_rcu_enabled() is invoked but before
> srcu_read_lock_held() is invoked. But similar vulnerabilities are
> present in RCU_LOCKDEP_WARN(), so unless and until there is a problem,
> this code should suffice.
>
> One way to save a line is as follows:
>
> WARN_ON_ONCE(IS_ENABLED(CONFIG_DEBUG_LOCK_ALLOC) &&
> debug_lockdep_rcu_enabled() &&
> srcu_read_lock_held(&console_srcu));
>
> Longer term, it might be possible to teach lockdep about this sort of
> SRCU deadlock. This is not an issue for vanilla RCU because the RCU
> reader context prohibits such deadlocks. This isn't exactly the same
> as reader-writer locking because this is perfectly OK with SRCU:
>
> CPU 0:
> spin_lock(&mylock);
> idx = srcu_read_lock(&mysrcu);
> do_something();
> srcu_read_unlock(&mysrcu, idx);
> spin_unlock(&mylock);
>
> CPU 1:
> idx = srcu_read_lock(&mysrcu);
> spin_lock(&mylock);
> do_something();
> spin_unlock(&mylock);
> srcu_read_unlock(&mysrcu, idx);
>
> Adding Boqun on CC in case it is easier than I think. ;-)
>
First I think reader-writer deadlock detection won't treat this as a
deadlock, because srcu_read_lock() is a recursive read lock ;-) in other
words, lockdep knows they don't block each other.
I was actually considering to equip SRCU with reader-writer deadlock
detection when:
https://lore.kernel.org/lkml/20180412021233.ewncg5jjuzjw3x62@tardis/
The problem (for SRCU to use reader-writer deadlock detection) is
letting lockdep know synchronize_srcu() doesn't block srcu_read_lock(),
so looks like I owe you a new lockdep annotation primitive ;-)
Regards,
Boqun
> Thanx, Paul
On Fri, Oct 28, 2022 at 11:09:35AM -0700, Boqun Feng wrote:
> On Thu, Oct 27, 2022 at 11:50:07AM -0700, Paul E. McKenney wrote:
> > On Thu, Oct 27, 2022 at 12:09:32PM +0200, Petr Mladek wrote:
> > > Adding Paul into Cc so that he is aware of using a custom SRCU lockdep
> > > check in console_list_lock().
> >
> > [ . . . ]
> >
> > > > +/**
> > > > + * console_list_lock - Lock the console list
> > > > + *
> > > > + * For console list or console->flags updates
> > > > + */
> > > > +void console_list_lock(void)
> > > > +{
> > > > +#ifdef CONFIG_DEBUG_LOCK_ALLOC
> > > > + /*
> > > > + * In unregister_console(), synchronize_srcu() is called with the
> > > > + * console_list_lock held. Therefore it is not allowed that the
> > > > + * console_list_lock is taken with the srcu_lock held.
> > > > + *
> > > > + * Whether or not this context is in the read-side critical section
> > > > + * can only be detected if the appropriate debug options are enabled.
> > > > + */
> > > > + WARN_ON_ONCE(debug_lockdep_rcu_enabled() &&
> > > > + srcu_read_lock_held(&console_srcu));
> >
> > Yes, this is an interesting case.
> >
> > The problem that John is facing is that srcu_read_lock_held() believes
> > that it is safer to err on the side of there being an SRCU reader.
> > This is because the standard use is to complain if there is -not-
> > an SRCU reader. So as soon as there is a lockdep issue anywhere,
> > srcu_read_lock_held() switches to unconditionally returning true.
> >
> > Which is exactly what John does not want in this case.
> >
> > So he excludes the CONFIG_DEBUG_LOCK_ALLOC=n case and the
> > !debug_lockdep_rcu_enabled() case, both of which cause
> > srcu_read_lock_held() to unconditionally return true.
> >
> > This can result in false-positive splats if some other CPU issues a
> > lockdep warning after debug_lockdep_rcu_enabled() is invoked but before
> > srcu_read_lock_held() is invoked. But similar vulnerabilities are
> > present in RCU_LOCKDEP_WARN(), so unless and until there is a problem,
> > this code should suffice.
> >
> > One way to save a line is as follows:
> >
> > WARN_ON_ONCE(IS_ENABLED(CONFIG_DEBUG_LOCK_ALLOC) &&
> > debug_lockdep_rcu_enabled() &&
> > srcu_read_lock_held(&console_srcu));
> >
> > Longer term, it might be possible to teach lockdep about this sort of
> > SRCU deadlock. This is not an issue for vanilla RCU because the RCU
> > reader context prohibits such deadlocks. This isn't exactly the same
> > as reader-writer locking because this is perfectly OK with SRCU:
> >
> > CPU 0:
> > spin_lock(&mylock);
> > idx = srcu_read_lock(&mysrcu);
> > do_something();
> > srcu_read_unlock(&mysrcu, idx);
> > spin_unlock(&mylock);
> >
> > CPU 1:
> > idx = srcu_read_lock(&mysrcu);
> > spin_lock(&mylock);
> > do_something();
> > spin_unlock(&mylock);
> > srcu_read_unlock(&mysrcu, idx);
> >
> > Adding Boqun on CC in case it is easier than I think. ;-)
>
> First I think reader-writer deadlock detection won't treat this as a
> deadlock, because srcu_read_lock() is a recursive read lock ;-) in other
> words, lockdep knows they don't block each other.
Nice!
> I was actually considering to equip SRCU with reader-writer deadlock
> detection when:
>
> https://lore.kernel.org/lkml/20180412021233.ewncg5jjuzjw3x62@tardis/
>
> The problem (for SRCU to use reader-writer deadlock detection) is
> letting lockdep know synchronize_srcu() doesn't block srcu_read_lock(),
> so looks like I owe you a new lockdep annotation primitive ;-)
Even better! ;-)
Thanx, Paul
On 2022-10-27, "Paul E. McKenney" <[email protected]> wrote:
> One way to save a line is as follows:
>
> WARN_ON_ONCE(IS_ENABLED(CONFIG_DEBUG_LOCK_ALLOC) &&
> debug_lockdep_rcu_enabled() &&
> srcu_read_lock_held(&console_srcu));
Unfortunately this suggestion does not work because
debug_lockdep_rcu_enabled() only exists if CONFIG_DEBUG_LOCK_ALLOC is
enabled. Would you be interested in having an empty implementation?
Then my check would not need to be concerned about
CONFIG_DEBUG_LOCK_ALLOC at all. It would become:
WARN_ON_ONCE(debug_lockdep_rcu_enabled() &&
srcu_read_lock_held(&console_srcu));
The patch below could be used to achieve that.
John
--------8<-------------
From 71d9e484d5128cd1e57e5bff5391d91789f444fa Mon Sep 17 00:00:00 2001
From: John Ogness <[email protected]>
Date: Mon, 7 Nov 2022 11:06:40 +0106
Subject: [PATCH] rcu: implement lockdep_rcu_enabled for
!CONFIG_DEBUG_LOCK_ALLOC
Provide an implementation for debug_lockdep_rcu_enabled() when
CONFIG_DEBUG_LOCK_ALLOC is not enabled. This allows code to check
if rcu lockdep debugging is available without needing an extra
check if CONFIG_DEBUG_LOCK_ALLOC is enabled.
Signed-off-by: John Ogness <[email protected]>
---
include/linux/rcupdate.h | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
index 08605ce7379d..65178c40ab6f 100644
--- a/include/linux/rcupdate.h
+++ b/include/linux/rcupdate.h
@@ -340,6 +340,11 @@ static inline int rcu_read_lock_any_held(void)
return !preemptible();
}
+static inline int debug_lockdep_rcu_enabled(void)
+{
+ return 0;
+}
+
#endif /* #else #ifdef CONFIG_DEBUG_LOCK_ALLOC */
#ifdef CONFIG_PROVE_RCU
--
2.30.2
On Mon, Nov 07, 2022 at 11:16:56AM +0106, John Ogness wrote:
> On 2022-10-27, "Paul E. McKenney" <[email protected]> wrote:
> > One way to save a line is as follows:
> >
> > WARN_ON_ONCE(IS_ENABLED(CONFIG_DEBUG_LOCK_ALLOC) &&
> > debug_lockdep_rcu_enabled() &&
> > srcu_read_lock_held(&console_srcu));
>
> Unfortunately this suggestion does not work because
> debug_lockdep_rcu_enabled() only exists if CONFIG_DEBUG_LOCK_ALLOC is
> enabled. Would you be interested in having an empty implementation?
> Then my check would not need to be concerned about
> CONFIG_DEBUG_LOCK_ALLOC at all. It would become:
>
> WARN_ON_ONCE(debug_lockdep_rcu_enabled() &&
> srcu_read_lock_held(&console_srcu));
>
> The patch below could be used to achieve that.
Looks quite sensible to me! There are probably a few other places
where it might be used.
Thanx, Paul
> John
>
> --------8<-------------
> >From 71d9e484d5128cd1e57e5bff5391d91789f444fa Mon Sep 17 00:00:00 2001
> From: John Ogness <[email protected]>
> Date: Mon, 7 Nov 2022 11:06:40 +0106
> Subject: [PATCH] rcu: implement lockdep_rcu_enabled for
> !CONFIG_DEBUG_LOCK_ALLOC
>
> Provide an implementation for debug_lockdep_rcu_enabled() when
> CONFIG_DEBUG_LOCK_ALLOC is not enabled. This allows code to check
> if rcu lockdep debugging is available without needing an extra
> check if CONFIG_DEBUG_LOCK_ALLOC is enabled.
>
> Signed-off-by: John Ogness <[email protected]>
> ---
> include/linux/rcupdate.h | 5 +++++
> 1 file changed, 5 insertions(+)
>
> diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
> index 08605ce7379d..65178c40ab6f 100644
> --- a/include/linux/rcupdate.h
> +++ b/include/linux/rcupdate.h
> @@ -340,6 +340,11 @@ static inline int rcu_read_lock_any_held(void)
> return !preemptible();
> }
>
> +static inline int debug_lockdep_rcu_enabled(void)
> +{
> + return 0;
> +}
> +
> #endif /* #else #ifdef CONFIG_DEBUG_LOCK_ALLOC */
>
> #ifdef CONFIG_PROVE_RCU
> --
> 2.30.2