2018-10-16 05:05:26

by Sergey Senozhatsky

[permalink] [raw]
Subject: [RFC][PATCHv2 0/4] less deadlock prone serial consoles

Hello,


RFC


The patch set reduces the number of ways serial consoles
can deadlock the system, including the recently reported deadlock
in panic().

The test scenario I ran on my x86_64 box:
- panic() the system while one of the CPUs holds 8250's
uart_port->lock.

Test results:

base | patched
-----------------------------------------
deadlock | flush logbuf
| reboot the system

To keep the cover letter short, please find full description in
commit messages.

P.S. since this is RFC I modify only 8250 serial driver.

Sergey Senozhatsky (4):
panic: avoid deadlocks in re-entrant console drivers
printk: move printk_safe macros to printk header
seial: introduce uart_port locking helpers
tty: 8250: switch to uart_port locking helpers

drivers/tty/serial/8250/8250_core.c | 8 +--
drivers/tty/serial/8250/8250_dma.c | 4 +-
drivers/tty/serial/8250/8250_port.c | 81 +++++++++++++----------------
include/linux/printk.h | 40 ++++++++++++++
include/linux/serial_core.h | 48 +++++++++++++++++
kernel/panic.c | 6 +++
kernel/printk/internal.h | 37 -------------
kernel/printk/printk_safe.c | 6 ++-
8 files changed, 141 insertions(+), 89 deletions(-)

--
2.19.1



2018-10-16 05:05:36

by Sergey Senozhatsky

[permalink] [raw]
Subject: [RFC][PATCHv2 1/4] panic: avoid deadlocks in re-entrant console drivers

From printk()/serial console point of view panic() is special, because
it may force CPU to re-enter printk() or/and serial console driver.
Therefore, some of serial consoles drivers are re-entrant. E.g. 8250:

serial8250_console_write()
{
if (port->sysrq)
locked = 0;
else if (oops_in_progress)
locked = spin_trylock_irqsave(&port->lock, flags);
else
spin_lock_irqsave(&port->lock, flags);
...
}

panic() does set oops_in_progress via bust_spinlocks(1), so in theory
we should be able to re-enter serial console driver from panic():

CPU0
<NMI>
uart_console_write()
serial8250_console_write() // if (oops_in_progress)
// spin_trylock_irqsave()
call_console_drivers()
console_unlock()
console_flush_on_panic()
bust_spinlocks(1) // oops_in_progress++
panic()
<NMI/>
spin_lock_irqsave(&port->lock, flags) // spin_lock_irqsave()
serial8250_console_write()
call_console_drivers()
console_unlock()
printk()
...

However, this does not happen and we deadlock in serial console on
port->lock spinlock. And the problem is that console_flush_on_panic()
called after bust_spinlocks(0):

void panic(const char *fmt, ...)
{
bust_spinlocks(1);
...
bust_spinlocks(0);
console_flush_on_panic();
...
}

bust_spinlocks(0) decrements oops_in_progress, so oops_in_progress
can go back to zero. Thus even re-entrant console drivers will simply
spin on port->lock spinlock. Given that port->lock may already be
locked either by a stopped CPU, or by the very same CPU we execute
panic() on (for instance, NMI panic() on printing CPU) the system
deadlocks and does not reboot.

Fix this by setting oops_in_progress before console_flush_on_panic(),
so re-entrant console drivers will trylock the port->lock instead of
spinning on it forever.

Signed-off-by: Sergey Senozhatsky <[email protected]>
---
kernel/panic.c | 6 ++++++
1 file changed, 6 insertions(+)

diff --git a/kernel/panic.c b/kernel/panic.c
index f6d549a29a5c..a0e60ccf3031 100644
--- a/kernel/panic.c
+++ b/kernel/panic.c
@@ -237,7 +237,13 @@ void panic(const char *fmt, ...)
if (_crash_kexec_post_notifiers)
__crash_kexec(NULL);

+ /*
+ * Decrement oops_in_progress and let bust_spinlocks() to
+ * unblank_screen(), console_unblank() and wake_up_klogd()
+ */
bust_spinlocks(0);
+ /* Set oops_in_progress, so we can reenter serial console driver */
+ bust_spinlocks(1);

/*
* We may have ended up stopping the CPU holding the lock (in
--
2.19.1


2018-10-16 05:05:45

by Sergey Senozhatsky

[permalink] [raw]
Subject: [RFC][PATCHv2 2/4] printk: move printk_safe macros to printk header

Make printk_safe_enter_irqsave()/etc macros available to the
rest of the kernel.

Signed-off-by: Sergey Senozhatsky <[email protected]>
---
include/linux/printk.h | 40 +++++++++++++++++++++++++++++++++++++
kernel/printk/internal.h | 37 ----------------------------------
kernel/printk/printk_safe.c | 6 ++++--
3 files changed, 44 insertions(+), 39 deletions(-)

diff --git a/include/linux/printk.h b/include/linux/printk.h
index cf3eccfe1543..75f99441fd54 100644
--- a/include/linux/printk.h
+++ b/include/linux/printk.h
@@ -157,6 +157,46 @@ static inline void printk_nmi_direct_enter(void) { }
static inline void printk_nmi_direct_exit(void) { }
#endif /* PRINTK_NMI */

+#ifdef CONFIG_PRINTK
+extern void printk_safe_enter(void);
+extern void printk_safe_exit(void);
+
+#define printk_safe_enter_irqsave(flags) \
+ do { \
+ local_irq_save(flags); \
+ printk_safe_enter(); \
+ } while (0)
+
+#define printk_safe_exit_irqrestore(flags) \
+ do { \
+ printk_safe_exit(); \
+ local_irq_restore(flags); \
+ } while (0)
+
+#define printk_safe_enter_irq() \
+ do { \
+ local_irq_disable(); \
+ printk_safe_enter(); \
+ } while (0)
+
+#define printk_safe_exit_irq() \
+ do { \
+ printk_safe_exit(); \
+ local_irq_enable(); \
+ } while (0)
+#else
+/*
+ * On !PRINTK builds we still export console output related locks
+ * and some functions (console_unlock()/tty/etc.), so printk-safe
+ * must preserve the existing local IRQ guarantees.
+ */
+#define printk_safe_enter_irqsave(flags) local_irq_save(flags)
+#define printk_safe_exit_irqrestore(flags) local_irq_restore(flags)
+
+#define printk_safe_enter_irq() local_irq_disable()
+#define printk_safe_exit_irq() local_irq_enable()
+#endif
+
#ifdef CONFIG_PRINTK
asmlinkage __printf(5, 0)
int vprintk_emit(int facility, int level,
diff --git a/kernel/printk/internal.h b/kernel/printk/internal.h
index 0f1898820cba..03c10f361a15 100644
--- a/kernel/printk/internal.h
+++ b/kernel/printk/internal.h
@@ -32,32 +32,6 @@ int vprintk_store(int facility, int level,
__printf(1, 0) int vprintk_default(const char *fmt, va_list args);
__printf(1, 0) int vprintk_deferred(const char *fmt, va_list args);
__printf(1, 0) int vprintk_func(const char *fmt, va_list args);
-void __printk_safe_enter(void);
-void __printk_safe_exit(void);
-
-#define printk_safe_enter_irqsave(flags) \
- do { \
- local_irq_save(flags); \
- __printk_safe_enter(); \
- } while (0)
-
-#define printk_safe_exit_irqrestore(flags) \
- do { \
- __printk_safe_exit(); \
- local_irq_restore(flags); \
- } while (0)
-
-#define printk_safe_enter_irq() \
- do { \
- local_irq_disable(); \
- __printk_safe_enter(); \
- } while (0)
-
-#define printk_safe_exit_irq() \
- do { \
- __printk_safe_exit(); \
- local_irq_enable(); \
- } while (0)

void defer_console_output(void);

@@ -65,15 +39,4 @@ void defer_console_output(void);

__printf(1, 0) int vprintk_func(const char *fmt, va_list args) { return 0; }

-/*
- * In !PRINTK builds we still export logbuf_lock spin_lock, console_sem
- * semaphore and some of console functions (console_unlock()/etc.), so
- * printk-safe must preserve the existing local IRQ guarantees.
- */
-#define printk_safe_enter_irqsave(flags) local_irq_save(flags)
-#define printk_safe_exit_irqrestore(flags) local_irq_restore(flags)
-
-#define printk_safe_enter_irq() local_irq_disable()
-#define printk_safe_exit_irq() local_irq_enable()
-
#endif /* CONFIG_PRINTK */
diff --git a/kernel/printk/printk_safe.c b/kernel/printk/printk_safe.c
index 0913b4d385de..367fcb2c39f0 100644
--- a/kernel/printk/printk_safe.c
+++ b/kernel/printk/printk_safe.c
@@ -359,16 +359,18 @@ static __printf(1, 0) int vprintk_safe(const char *fmt, va_list args)
}

/* Can be preempted by NMI. */
-void __printk_safe_enter(void)
+void printk_safe_enter(void)
{
this_cpu_inc(printk_context);
}
+EXPORT_SYMBOL_GPL(printk_safe_enter);

/* Can be preempted by NMI. */
-void __printk_safe_exit(void)
+void printk_safe_exit(void)
{
this_cpu_dec(printk_context);
}
+EXPORT_SYMBOL_GPL(printk_safe_exit);

__printf(1, 0) int vprintk_func(const char *fmt, va_list args)
{
--
2.19.1


2018-10-16 05:06:24

by Sergey Senozhatsky

[permalink] [raw]
Subject: [RFC][PATCHv2 3/4] serial: introduce uart_port locking helpers

The patch introduces several macros which should make serial console
drivers less printk() deadlock prone.

There are several console driver entry points so we are looking at
several slightly different deadlock scenarios.

- The first entry point is console ->write() callback, which we call
from printk(). A possible deadlock scenario there is:

CPU0
<NMI>
spin_lock_irqsave(&port->lock, flags) << deadlock
serial_foo_write()
call_console_drivers()
console_unlock()
console_flush_on_panic()
panic()
<NMI/>
spin_lock_irqsave(&port->lock, flags)
serial_foo_write()
call_console_drivers()
console_unlock()
printk()
...

->write() callback, generally speaking, must be re-entrant in two cases:
1) sysrq
2) panic (oops_in_progress)

And this is what uart_port_lock_for_printk()/uart_port_unlock_after_printk()
macros are for.

Usage example:

static serial_foo_write(...)
{
struct uart_port *port;
unsigned long flags;
bool locked;

uart_port_lock_for_printk(port, flags, locked);
uart_console_write(port, s, count, serial_foo_putchar);
uart_port_unlock_after_printk(port, flags, locked);
}

Some of the serial drivers already use _some sort_ of
uart_port_lock_for_printk(), some are not panic() re-entrant. This
should address the issue.

- The rest (of entry points) requires a bit different handling.
Let's take a look at the following backtrace:

CPU0
<IRQ>
spin_lock_irqsave(&port->lock, flags) << deadlock
serial_foo_write()
call_console_drivers()
console_unlock()
printk()
__queue_work()
tty_flip_buffer_push()
spin_lock_irqsave(&port->lock, flags)
serial_foo_handle_IRQ()
<IRQ/>

Serial drivers invoke tons of core kernel functions - WQ, MM, etc. All
of which may printk() in various cases. So we can't really just
"remove those printk-s". The simples way to address this seems to be
PRINTK_SAFE_CONTEXT_MASK.

printk_safe() is a special printk() mode, which redirects recursive
printk()-s to a secondary (per-CPU) buffer; so we don't re-enter
printk() and serial console driver. The secondary (per-CPU) buffer is
flushed and printed to the consoles later, from a safe context, when
we know that we are not in printk()-recursion anymore.

And this is what uart_port_lock_*()/uart_port_unlock_*() macros are
for. With uart_port_lock_irqsave() the previous example would turn
into:

CPU0
<IRQ>
irq_work_queue()
printk_safe_log_store()
printk()
__queue_work()
tty_flip_buffer_push()
uart_port_lock_irqsave(port, flags)
serial_foo_handle_IRQ()
<IRQ/>

As of now, no consoles are re-entrant when printk() initiated by
console's driver IRQ handler. This macro should address the issue.

Signed-off-by: Sergey Senozhatsky <[email protected]>
---
include/linux/serial_core.h | 48 +++++++++++++++++++++++++++++++++++++
1 file changed, 48 insertions(+)

diff --git a/include/linux/serial_core.h b/include/linux/serial_core.h
index 047fa67d039b..acc6966fea8e 100644
--- a/include/linux/serial_core.h
+++ b/include/linux/serial_core.h
@@ -266,6 +266,54 @@ struct uart_port {
void *private_data; /* generic platform data pointer */
};

+#define uart_port_lock_irq(p) \
+ do { \
+ printk_safe_enter_irq(); \
+ spin_lock(&(p)->lock); \
+ } while (0)
+
+#define uart_port_unlock_irq(p) \
+ do { \
+ spin_unlock(&(p)->lock); \
+ printk_safe_exit_irq(); \
+ } while (0)
+
+#define uart_port_lock_irqsave(p, flags) \
+ do { \
+ printk_safe_enter_irqsave(flags); \
+ spin_lock(&(p)->lock); \
+ } while (0)
+
+#define uart_port_unlock_irqrestore(p, flags) \
+ do { \
+ spin_unlock(&(p)->lock); \
+ printk_safe_exit_irqrestore(flags); \
+ } while (0)
+
+#if defined(CONFIG_SERIAL_CORE_CONSOLE) || defined(SUPPORT_SYSRQ)
+#define uart_port_in_sysrq(p) p->sysrq
+#else
+#define uart_port_in_sysrq(p) 0
+#endif
+
+#define uart_port_lock_for_printk(p, flags, locked) \
+ do { \
+ locked = true; \
+ if (uart_port_in_sysrq(p)) \
+ locked = false; \
+ else if (oops_in_progress) \
+ locked = spin_trylock_irqsave(&(p)->lock, \
+ flags); \
+ else \
+ spin_lock_irqsave(&(p)->lock, flags); \
+ } while (0)
+
+#define uart_port_unlock_after_printk(p, flags, locked) \
+ do { \
+ if (locked) \
+ spin_unlock_irqrestore(&(p)->lock, flags); \
+ } while (0)
+
static inline int serial_port_in(struct uart_port *up, int offset)
{
return up->serial_in(up, offset);
--
2.19.1


2018-10-16 05:06:44

by Sergey Senozhatsky

[permalink] [raw]
Subject: [RFC][PATCHv2 4/4] tty: 8250: switch to uart_port locking helpers

Switch to uart_port_lock macros:

- use uart_port_lock_for_printk()/uart_port_unlock_after_printk() in
serial8250_console_write().

- use uart_port_lock_*()/uart_port_unlock_*() elsewhere.

Signed-off-by: Sergey Senozhatsky <[email protected]>
---
drivers/tty/serial/8250/8250_core.c | 8 +--
drivers/tty/serial/8250/8250_dma.c | 4 +-
drivers/tty/serial/8250/8250_port.c | 81 +++++++++++++----------------
3 files changed, 43 insertions(+), 50 deletions(-)

diff --git a/drivers/tty/serial/8250/8250_core.c b/drivers/tty/serial/8250/8250_core.c
index 94f3e1c64490..34fe546260f6 100644
--- a/drivers/tty/serial/8250/8250_core.c
+++ b/drivers/tty/serial/8250/8250_core.c
@@ -268,7 +268,7 @@ static void serial8250_backup_timeout(struct timer_list *t)
unsigned int iir, ier = 0, lsr;
unsigned long flags;

- spin_lock_irqsave(&up->port.lock, flags);
+ uart_port_lock_irqsave(&up->port, flags);

/*
* Must disable interrupts or else we risk racing with the interrupt
@@ -302,7 +302,7 @@ static void serial8250_backup_timeout(struct timer_list *t)
if (up->port.irq)
serial_out(up, UART_IER, ier);

- spin_unlock_irqrestore(&up->port.lock, flags);
+ uart_port_unlock_irqrestore(&up->port, flags);

/* Standard timer interval plus 0.2s to keep the port running */
mod_timer(&up->timer,
@@ -1078,9 +1078,9 @@ void serial8250_unregister_port(int line)
if (uart->em485) {
unsigned long flags;

- spin_lock_irqsave(&uart->port.lock, flags);
+ uart_port_lock_irqsave(&uart->port, flags);
serial8250_em485_destroy(uart);
- spin_unlock_irqrestore(&uart->port.lock, flags);
+ uart_port_unlock_irqrestore(&uart->port, flags);
}

uart_remove_one_port(&serial8250_reg, &uart->port);
diff --git a/drivers/tty/serial/8250/8250_dma.c b/drivers/tty/serial/8250/8250_dma.c
index bfa1a857f3ff..a9255b4e9353 100644
--- a/drivers/tty/serial/8250/8250_dma.c
+++ b/drivers/tty/serial/8250/8250_dma.c
@@ -22,7 +22,7 @@ static void __dma_tx_complete(void *param)
dma_sync_single_for_cpu(dma->txchan->device->dev, dma->tx_addr,
UART_XMIT_SIZE, DMA_TO_DEVICE);

- spin_lock_irqsave(&p->port.lock, flags);
+ uart_port_lock_irqsave(&p->port, flags);

dma->tx_running = 0;

@@ -39,7 +39,7 @@ static void __dma_tx_complete(void *param)
serial_port_out(&p->port, UART_IER, p->ier);
}

- spin_unlock_irqrestore(&p->port.lock, flags);
+ uart_port_unlock_irqrestore(&p->port, flags);
}

static void __dma_rx_complete(void *param)
diff --git a/drivers/tty/serial/8250/8250_port.c b/drivers/tty/serial/8250/8250_port.c
index f776b3eafb96..340aecf23dc5 100644
--- a/drivers/tty/serial/8250/8250_port.c
+++ b/drivers/tty/serial/8250/8250_port.c
@@ -790,9 +790,9 @@ static void enable_rsa(struct uart_8250_port *up)
{
if (up->port.type == PORT_RSA) {
if (up->port.uartclk != SERIAL_RSA_BAUD_BASE * 16) {
- spin_lock_irq(&up->port.lock);
+ uart_port_lock_irq(&up->port);
__enable_rsa(up);
- spin_unlock_irq(&up->port.lock);
+ uart_port_unlock_irq(&up->port);
}
if (up->port.uartclk == SERIAL_RSA_BAUD_BASE * 16)
serial_out(up, UART_RSA_FRR, 0);
@@ -812,7 +812,7 @@ static void disable_rsa(struct uart_8250_port *up)

if (up->port.type == PORT_RSA &&
up->port.uartclk == SERIAL_RSA_BAUD_BASE * 16) {
- spin_lock_irq(&up->port.lock);
+ uart_port_lock_irq(&up->port);

mode = serial_in(up, UART_RSA_MSR);
result = !(mode & UART_RSA_MSR_FIFO);
@@ -825,7 +825,7 @@ static void disable_rsa(struct uart_8250_port *up)

if (result)
up->port.uartclk = SERIAL_RSA_BAUD_BASE_LO * 16;
- spin_unlock_irq(&up->port.lock);
+ uart_port_unlock_irq(&up->port);
}
}
#endif /* CONFIG_SERIAL_8250_RSA */
@@ -1236,7 +1236,7 @@ static void autoconfig(struct uart_8250_port *up)
* We really do need global IRQs disabled here - we're going to
* be frobbing the chips IRQ enable register to see if it exists.
*/
- spin_lock_irqsave(&port->lock, flags);
+ uart_port_lock_irqsave(port, flags);

up->capabilities = 0;
up->bugs = 0;
@@ -1275,7 +1275,7 @@ static void autoconfig(struct uart_8250_port *up)
/*
* We failed; there's nothing here
*/
- spin_unlock_irqrestore(&port->lock, flags);
+ uart_port_unlock_irqrestore(port, flags);
DEBUG_AUTOCONF("IER test failed (%02x, %02x) ",
scratch2, scratch3);
goto out;
@@ -1299,7 +1299,7 @@ static void autoconfig(struct uart_8250_port *up)
status1 = serial_in(up, UART_MSR) & 0xF0;
serial8250_out_MCR(up, save_mcr);
if (status1 != 0x90) {
- spin_unlock_irqrestore(&port->lock, flags);
+ uart_port_unlock_irqrestore(port, flags);
DEBUG_AUTOCONF("LOOP test failed (%02x) ",
status1);
goto out;
@@ -1372,7 +1372,7 @@ static void autoconfig(struct uart_8250_port *up)
serial_out(up, UART_IER, 0);

out_lock:
- spin_unlock_irqrestore(&port->lock, flags);
+ uart_port_unlock_irqrestore(port, flags);

/*
* Check if the device is a Fintek F81216A
@@ -1483,12 +1483,12 @@ static enum hrtimer_restart serial8250_em485_handle_stop_tx(struct hrtimer *t)
p = em485->port;

serial8250_rpm_get(p);
- spin_lock_irqsave(&p->port.lock, flags);
+ uart_port_lock_irqsave(&p->port, flags);
if (em485->active_timer == &em485->stop_tx_timer) {
__do_stop_tx_rs485(p);
em485->active_timer = NULL;
}
- spin_unlock_irqrestore(&p->port.lock, flags);
+ uart_port_unlock_irqrestore(&p->port, flags);
serial8250_rpm_put(p);
return HRTIMER_NORESTART;
}
@@ -1637,12 +1637,12 @@ static enum hrtimer_restart serial8250_em485_handle_start_tx(struct hrtimer *t)
em485 = container_of(t, struct uart_8250_em485, start_tx_timer);
p = em485->port;

- spin_lock_irqsave(&p->port.lock, flags);
+ uart_port_lock_irqsave(&p->port, flags);
if (em485->active_timer == &em485->start_tx_timer) {
__start_tx(&p->port);
em485->active_timer = NULL;
}
- spin_unlock_irqrestore(&p->port.lock, flags);
+ uart_port_unlock_irqrestore(&p->port, flags);
return HRTIMER_NORESTART;
}

@@ -1884,7 +1884,7 @@ int serial8250_handle_irq(struct uart_port *port, unsigned int iir)
if (iir & UART_IIR_NO_INT)
return 0;

- spin_lock_irqsave(&port->lock, flags);
+ uart_port_lock_irqsave(port, flags);

status = serial_port_in(port, UART_LSR);

@@ -1897,7 +1897,7 @@ int serial8250_handle_irq(struct uart_port *port, unsigned int iir)
if ((!up->dma || up->dma->tx_err) && (status & UART_LSR_THRE))
serial8250_tx_chars(up);

- spin_unlock_irqrestore(&port->lock, flags);
+ uart_port_unlock_irqrestore(port, flags);
return 1;
}
EXPORT_SYMBOL_GPL(serial8250_handle_irq);
@@ -1932,9 +1932,9 @@ static int serial8250_tx_threshold_handle_irq(struct uart_port *port)
if ((iir & UART_IIR_ID) == UART_IIR_THRI) {
struct uart_8250_port *up = up_to_u8250p(port);

- spin_lock_irqsave(&port->lock, flags);
+ uart_port_lock_irqsave(port, flags);
serial8250_tx_chars(up);
- spin_unlock_irqrestore(&port->lock, flags);
+ uart_port_unlock_irqrestore(port, flags);
}

iir = serial_port_in(port, UART_IIR);
@@ -1949,10 +1949,10 @@ static unsigned int serial8250_tx_empty(struct uart_port *port)

serial8250_rpm_get(up);

- spin_lock_irqsave(&port->lock, flags);
+ uart_port_lock_irqsave(port, flags);
lsr = serial_port_in(port, UART_LSR);
up->lsr_saved_flags |= lsr & LSR_SAVE_FLAGS;
- spin_unlock_irqrestore(&port->lock, flags);
+ uart_port_unlock_irqrestore(port, flags);

serial8250_rpm_put(up);

@@ -2025,13 +2025,13 @@ static void serial8250_break_ctl(struct uart_port *port, int break_state)
unsigned long flags;

serial8250_rpm_get(up);
- spin_lock_irqsave(&port->lock, flags);
+ uart_port_lock_irqsave(port, flags);
if (break_state == -1)
up->lcr |= UART_LCR_SBC;
else
up->lcr &= ~UART_LCR_SBC;
serial_port_out(port, UART_LCR, up->lcr);
- spin_unlock_irqrestore(&port->lock, flags);
+ uart_port_unlock_irqrestore(port, flags);
serial8250_rpm_put(up);
}

@@ -2282,7 +2282,7 @@ int serial8250_do_startup(struct uart_port *port)
* the interrupt is enabled. Delays are necessary to
* allow register changes to become visible.
*/
- spin_lock_irqsave(&port->lock, flags);
+ uart_port_lock_irqsave(port, flags);
if (up->port.irqflags & IRQF_SHARED)
disable_irq_nosync(port->irq);

@@ -2298,7 +2298,7 @@ int serial8250_do_startup(struct uart_port *port)

if (port->irqflags & IRQF_SHARED)
enable_irq(port->irq);
- spin_unlock_irqrestore(&port->lock, flags);
+ uart_port_unlock_irqrestore(port, flags);

/*
* If the interrupt is not reasserted, or we otherwise
@@ -2320,7 +2320,7 @@ int serial8250_do_startup(struct uart_port *port)
*/
serial_port_out(port, UART_LCR, UART_LCR_WLEN8);

- spin_lock_irqsave(&port->lock, flags);
+ uart_port_lock_irqsave(port, flags);
if (up->port.flags & UPF_FOURPORT) {
if (!up->port.irq)
up->port.mctrl |= TIOCM_OUT1;
@@ -2367,7 +2367,7 @@ int serial8250_do_startup(struct uart_port *port)
}

dont_test_tx_en:
- spin_unlock_irqrestore(&port->lock, flags);
+ uart_port_unlock_irqrestore(port, flags);

/*
* Clear the interrupt registers again for luck, and clear the
@@ -2434,17 +2434,17 @@ void serial8250_do_shutdown(struct uart_port *port)
/*
* Disable interrupts from this port
*/
- spin_lock_irqsave(&port->lock, flags);
+ uart_port_lock_irqsave(port, flags);
up->ier = 0;
serial_port_out(port, UART_IER, 0);
- spin_unlock_irqrestore(&port->lock, flags);
+ uart_port_unlock_irqrestore(port, flags);

synchronize_irq(port->irq);

if (up->dma)
serial8250_release_dma(up);

- spin_lock_irqsave(&port->lock, flags);
+ uart_port_lock_irqsave(port, flags);
if (port->flags & UPF_FOURPORT) {
/* reset interrupts on the AST Fourport board */
inb((port->iobase & 0xfe0) | 0x1f);
@@ -2453,7 +2453,7 @@ void serial8250_do_shutdown(struct uart_port *port)
port->mctrl &= ~TIOCM_OUT2;

serial8250_set_mctrl(port, port->mctrl);
- spin_unlock_irqrestore(&port->lock, flags);
+ uart_port_unlock_irqrestore(port, flags);

/*
* Disable break condition and FIFOs
@@ -2679,7 +2679,7 @@ serial8250_do_set_termios(struct uart_port *port, struct ktermios *termios,
* interrupts disabled.
*/
serial8250_rpm_get(up);
- spin_lock_irqsave(&port->lock, flags);
+ uart_port_lock_irqsave(port, flags);

up->lcr = cval; /* Save computed LCR */

@@ -2783,7 +2783,7 @@ serial8250_do_set_termios(struct uart_port *port, struct ktermios *termios,
serial_port_out(port, UART_FCR, up->fcr); /* set fcr */
}
serial8250_set_mctrl(port, port->mctrl);
- spin_unlock_irqrestore(&port->lock, flags);
+ uart_port_unlock_irqrestore(port, flags);
serial8250_rpm_put(up);

/* Don't rewrite B0 */
@@ -2806,15 +2806,15 @@ void serial8250_do_set_ldisc(struct uart_port *port, struct ktermios *termios)
{
if (termios->c_line == N_PPS) {
port->flags |= UPF_HARDPPS_CD;
- spin_lock_irq(&port->lock);
+ uart_port_lock_irq(port);
serial8250_enable_ms(port);
- spin_unlock_irq(&port->lock);
+ uart_port_unlock_irq(port);
} else {
port->flags &= ~UPF_HARDPPS_CD;
if (!UART_ENABLE_MS(port, termios->c_cflag)) {
- spin_lock_irq(&port->lock);
+ uart_port_lock_irq(port);
serial8250_disable_ms(port);
- spin_unlock_irq(&port->lock);
+ uart_port_unlock_irq(port);
}
}
}
@@ -3252,19 +3252,13 @@ void serial8250_console_write(struct uart_8250_port *up, const char *s,
struct uart_port *port = &up->port;
unsigned long flags;
unsigned int ier;
- int locked = 1;
+ bool locked;

touch_nmi_watchdog();

serial8250_rpm_get(up);

- if (port->sysrq)
- locked = 0;
- else if (oops_in_progress)
- locked = spin_trylock_irqsave(&port->lock, flags);
- else
- spin_lock_irqsave(&port->lock, flags);
-
+ uart_port_lock_for_printk(port, flags, locked);
/*
* First save the IER then disable the interrupts
*/
@@ -3300,8 +3294,7 @@ void serial8250_console_write(struct uart_8250_port *up, const char *s,
if (up->msr_saved_flags)
serial8250_modem_status(up);

- if (locked)
- spin_unlock_irqrestore(&port->lock, flags);
+ uart_port_unlock_after_printk(port, flags, locked);
serial8250_rpm_put(up);
}

--
2.19.1


2018-10-16 07:24:52

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [RFC][PATCHv2 0/4] less deadlock prone serial consoles

On Tue, Oct 16, 2018 at 02:04:24PM +0900, Sergey Senozhatsky wrote:
> Hello,
>
>
> RFC
>
>
> The patch set reduces the number of ways serial consoles
> can deadlock the system, including the recently reported deadlock
> in panic().

That's what I use early_printk for... :-)

2018-10-16 07:28:12

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [RFC][PATCHv2 2/4] printk: move printk_safe macros to printk header

On Tue, Oct 16, 2018 at 02:04:26PM +0900, Sergey Senozhatsky wrote:
> Make printk_safe_enter_irqsave()/etc macros available to the
> rest of the kernel.
>
> Signed-off-by: Sergey Senozhatsky <[email protected]>
> ---
> include/linux/printk.h | 40 +++++++++++++++++++++++++++++++++++++
> kernel/printk/internal.h | 37 ----------------------------------
> kernel/printk/printk_safe.c | 6 ++++--
> 3 files changed, 44 insertions(+), 39 deletions(-)
>
> diff --git a/include/linux/printk.h b/include/linux/printk.h
> index cf3eccfe1543..75f99441fd54 100644
> --- a/include/linux/printk.h
> +++ b/include/linux/printk.h
> @@ -157,6 +157,46 @@ static inline void printk_nmi_direct_enter(void) { }
> static inline void printk_nmi_direct_exit(void) { }
> #endif /* PRINTK_NMI */
>
> +#ifdef CONFIG_PRINTK
> +extern void printk_safe_enter(void);
> +extern void printk_safe_exit(void);
> +
> +#define printk_safe_enter_irqsave(flags) \
> + do { \
> + local_irq_save(flags); \
> + printk_safe_enter(); \
> + } while (0)
> +
> +#define printk_safe_exit_irqrestore(flags) \
> + do { \
> + printk_safe_exit(); \
> + local_irq_restore(flags); \
> + } while (0)
> +
> +#define printk_safe_enter_irq() \
> + do { \
> + local_irq_disable(); \
> + printk_safe_enter(); \
> + } while (0)
> +
> +#define printk_safe_exit_irq() \
> + do { \
> + printk_safe_exit(); \
> + local_irq_enable(); \
> + } while (0)

So I really _really_ hate all this. Utterly detest it.

That whole magic 'safe' printk buffer crap is just that crap.

Instead of this tinkering around the edges, why don't you make the main
logbuf a lockless ringbuffer and then delegate the actual printing of
that buffer to a kthread, except for earlycon, which can do synchronous
output.

2018-10-16 08:13:08

by Sergey Senozhatsky

[permalink] [raw]
Subject: Re: [RFC][PATCHv2 0/4] less deadlock prone serial consoles

On (10/16/18 09:23), Peter Zijlstra wrote:
> On Tue, Oct 16, 2018 at 02:04:24PM +0900, Sergey Senozhatsky wrote:
> > Hello,
> >
> >
> > RFC
> >
> >
> > The patch set reduces the number of ways serial consoles
> > can deadlock the system, including the recently reported deadlock
> > in panic().
>
> That's what I use early_printk for... :-)

I've been waiting for this comment ;-)

-ss

2018-10-16 11:42:39

by Petr Mladek

[permalink] [raw]
Subject: Re: [RFC][PATCHv2 2/4] printk: move printk_safe macros to printk header

On Tue 2018-10-16 09:27:19, Peter Zijlstra wrote:
> On Tue, Oct 16, 2018 at 02:04:26PM +0900, Sergey Senozhatsky wrote:
> > Make printk_safe_enter_irqsave()/etc macros available to the
> > rest of the kernel.
> >
> > Signed-off-by: Sergey Senozhatsky <[email protected]>
> > ---
> > include/linux/printk.h | 40 +++++++++++++++++++++++++++++++++++++
> > kernel/printk/internal.h | 37 ----------------------------------
> > kernel/printk/printk_safe.c | 6 ++++--
> > 3 files changed, 44 insertions(+), 39 deletions(-)
> >
> > diff --git a/include/linux/printk.h b/include/linux/printk.h
> > index cf3eccfe1543..75f99441fd54 100644
> > --- a/include/linux/printk.h
> > +++ b/include/linux/printk.h
> > @@ -157,6 +157,46 @@ static inline void printk_nmi_direct_enter(void) { }
> > static inline void printk_nmi_direct_exit(void) { }
> > #endif /* PRINTK_NMI */
> >
> > +#ifdef CONFIG_PRINTK
> > +extern void printk_safe_enter(void);
> > +extern void printk_safe_exit(void);
> > +
> > +#define printk_safe_enter_irqsave(flags) \
> > + do { \
> > + local_irq_save(flags); \
> > + printk_safe_enter(); \
> > + } while (0)
> > +
> > +#define printk_safe_exit_irqrestore(flags) \
> > + do { \
> > + printk_safe_exit(); \
> > + local_irq_restore(flags); \
> > + } while (0)
> > +
> > +#define printk_safe_enter_irq() \
> > + do { \
> > + local_irq_disable(); \
> > + printk_safe_enter(); \
> > + } while (0)
> > +
> > +#define printk_safe_exit_irq() \
> > + do { \
> > + printk_safe_exit(); \
> > + local_irq_enable(); \
> > + } while (0)
>
> So I really _really_ hate all this. Utterly detest it.
>
> That whole magic 'safe' printk buffer crap is just that crap.
>
> Instead of this tinkering around the edges, why don't you make the main
> logbuf a lockless ringbuffer and then delegate the actual printing of
> that buffer to a kthread, except for earlycon, which can do synchronous
> output.

In fact, there is no problem with printk log buffer. This patchset
tries to avoid deadlock caused by console-specific locks
used by console->write() methods.

By other words, we neither need printk_safe or lockless log buffer
to fix this prolem. Instead, we need either printk_deferred context
or lockless consoles.

Best Regards,
Petr

2018-10-16 12:18:43

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [RFC][PATCHv2 2/4] printk: move printk_safe macros to printk header

On Tue, Oct 16, 2018 at 01:40:06PM +0200, Petr Mladek wrote:
> On Tue 2018-10-16 09:27:19, Peter Zijlstra wrote:

> > Instead of this tinkering around the edges, why don't you make the main
> > logbuf a lockless ringbuffer and then delegate the actual printing of
> > that buffer to a kthread, except for earlycon, which can do synchronous
> > output.
>
> In fact, there is no problem with printk log buffer. This patchset
> tries to avoid deadlock caused by console-specific locks
> used by console->write() methods.
>
> By other words, we neither need printk_safe or lockless log buffer
> to fix this prolem. Instead, we need either printk_deferred context
> or lockless consoles.

If you have a lockless buffer and a trailing printk thread, that's
deferred.

And earlycon _should_ be lockless (yes, I know, some suck)

But if you do this deferred nonsense that's currently in printk, then
you risk never seeing the message -- the same problem I have with the
whole printk_safe thing too.

Printing _after_ the fact is horrible for robustness.

2018-10-16 12:28:18

by Sergey Senozhatsky

[permalink] [raw]
Subject: Re: [RFC][PATCHv2 2/4] printk: move printk_safe macros to printk header

On (10/16/18 09:27), Peter Zijlstra wrote:
>
> So I really _really_ hate all this. Utterly detest it.

I learned a new word today - detest. I can haz re-entrant consoles now please?

> That whole magic 'safe' printk buffer crap is just that crap.

No, I don't see it this way; printk_safe is *semi-magic* at best! :)

> Instead of this tinkering around the edges, why don't you make the main
> logbuf a lockless ringbuffer and then delegate the actual printing of
> that buffer to a kthread, except for earlycon, which can do synchronous
> output.

Well, hmm. These are good questions. Let me think.

per-CPU printk_safe _semi-magic_ makes some things simple to handle.
We can't just remove per-CPU buffers and add a wake_up_process() at
the bottom of vprintk_emit(). Because this will deadlock:

printk()
wake_up_process()
try_to_wake_up()
raw_spin_lock_irqsave()
<NMI>
printk()
wake_up_process()
try_to_wake_up()
raw_spin_lock_irqsave()

So we still need some amount of per-CPU printk() semi-magic anyway.

And printk-kthread offloding will not eliminate the need of
printk_deferred(). Which is very hard to use in the right places, like
always; and we don't have any ways to find out that a random innocently
looking printk() may actually be invoked from somewhere deep in the
call-chain with rq lock or pi_lock being locked some 5 frames ago, until
that printk() actually gets invoked and possibly deadlocks the system.

Having "unprotected" wake_up_process() in vprintk_emit(), in fact,
will make the need for printk_deferred() even bigger.

On the other hand, we have wake_up_klogd_work_func() in printk, which
uses irq work, so we, may be, can wakeup printk_kthread from there and,
thus, remove all the external locks from printk()... But I doubt it that
anyone will ever ACK such a patch.

Speaking of lockless logbuf,
logbuf buffer and logbuf_lock are the smallest of the problems printk
currently has. Mainly because of lockless semi-magical printk_safe
buffers. Replacing one lockless buffer with another lockless buffer will
not simplify things in this department. We probably additionally will
have some nasty screw ups, e.g. when NMI printk on CPUA interrupts normal
printk on the same CPUA and now we have mixed in characters; and so on.
per-CPU printk_safe at least keeps us on the safe side in these cases and
looks fairly simple. I also sort of like that logbuf_lock lets us to have
a single static textbuf buffer which we use to vscnprintf() printk messages,
and how printk_safe helps us to get recursive errors/warnings from
vscnprintf(). So printk_safe/printk_nmi things are not _entirely_ crappy,
I guess.

We do, however, have loads of problems with all those dependencies which
come from serial drivers and friends: timekeeping, scheduler (scheduler
is brilliant and cool, but we do have some deadlocks in printk because of
it ;), tty, net, MM, wq, etc. So I generally like the idea of "detached
serial consoles" (that's what I call it). IOW, elimination of the direct
printk -> serial console path. I don't hate the idea of a complete printk
re-work. But some people do, tho. And they have their points. Some people
like the synchronous printk nature and see scheduler dependency as a
"regression".

The current approach - use the existing printk_safe mechanism and
case-by-case, manually, break dependencies which can deadlock us is
a lazy approach, yes; and not very convenient. I'm aware of it.

So, unless I'm missing something, things are not entirely that simple:
- throw away printk_safe semi-magic
- add a lockless logbuf
- add wake_up_process() to vprintk_emit().

It does not completely remove dependency on "external" locks - scheduler,
etc. And as long as we have them - printk can deadlock.

Am I missing something?


Another idea, which I had like a year ago, was to treat printk logbuf
messages in serial consoles the same way they treat uart xmit buffer.
Each console has an IRQ handler, which reads pending messages from xmit
buffer and prints them to the console:

int serial_foo_irq(...)
{
unsigned int max_count = TX_FIFO_DEPTH;

while (max_count) {
unsigned int c;

if (uart_circ_empty(xmit))
break;

c = xmit->buf[xmit->tail];
writel(port, UART_TX, c);
xmit->tail = (xmit->tail + 1) & (UART_XMIT_SIZE - 1);
max_count--;
}
}

We can do the same for printk logbuf. Each console can have last_seen_idx.
And read logbuf messages (poll logbuf) starting from that per-console
last_seen_idx in IRQ handler; we don't have to call into scheduler, net,
etc. printk() will simply add messages to logbuf, consoles will poll pending
messages from IRQs handlers; and everyone is happy... And we will do direct
printk -> console_drivers only for early_con or when in panic().

-ss

2018-10-16 12:39:28

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [RFC][PATCHv2 2/4] printk: move printk_safe macros to printk header

On Tue, Oct 16, 2018 at 09:27:34PM +0900, Sergey Senozhatsky wrote:
> On (10/16/18 09:27), Peter Zijlstra wrote:
> >
> > So I really _really_ hate all this. Utterly detest it.
>
> I learned a new word today - detest. I can haz re-entrant consoles now please?

early_serial_write() not haz a lock anywhere.

2018-10-16 12:55:03

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [RFC][PATCHv2 2/4] printk: move printk_safe macros to printk header

On Tue, Oct 16, 2018 at 09:27:34PM +0900, Sergey Senozhatsky wrote:
> per-CPU printk_safe _semi-magic_ makes some things simple to handle.
> We can't just remove per-CPU buffers and add a wake_up_process() at
> the bottom of vprintk_emit(). Because this will deadlock:
>
> printk()
> wake_up_process()
> try_to_wake_up()
> raw_spin_lock_irqsave()
> <NMI>
> printk()
> wake_up_process()
> try_to_wake_up()
> raw_spin_lock_irqsave()
>
> So we still need some amount of per-CPU printk() semi-magic anyway.

All we need is 4 max-line-length buffers per-CPU. Nothing more.

The above trainwreck is the direct result of forcing synchronous
printk'ing (which I'm otherwise a big fan of, but regular console
drivers stink and are unfixable).

> And printk-kthread offloding will not eliminate the need of
> printk_deferred().

Why not? printk() will reduce to a lockless buffer insert. IOW _all_
printk is deferred.

All you need are 4 max-line-length buffers per CPU and a global/shared
lockless buffer.

printk will determine the current context:

task, softirq, hardirq or NMI

and pick the corresponding per-cpu line buffer and do the vsnprintf()
thing. Then we have the actual line length and content. With the length
we reserve the bytes from the global buffer, we memcpy into the buffer
and commit.

Done.

The printk thread will observe it lags behind the buffer head and will
start printk-ing crud from task context.

> Speaking of lockless logbuf,
> logbuf buffer and logbuf_lock are the smallest of the problems printk
> currently has. Mainly because of lockless semi-magical printk_safe
> buffers.

Which are rubbish. You now have two buffers with duplicated
functionality and wildly different semantics.

Why not have a single buffer that does it all?

> Replacing one lockless buffer with another lockless buffer will
> not simplify things in this department. We probably additionally will
> have some nasty screw ups, e.g. when NMI printk on CPUA interrupts normal
> printk on the same CPUA and now we have mixed in characters; and so on.

Line buffers fix that. You don't need anything fancy for that.

> per-CPU printk_safe at least keeps us on the safe side in these cases and
> looks fairly simple.

printk_safe is an abomination, it fixes something that shouldn't need
fixing. See above.

> We do, however, have loads of problems with all those dependencies which
> come from serial drivers and friends: timekeeping, scheduler (scheduler
> is brilliant and cool, but we do have some deadlocks in printk because of
> it ;), tty, net, MM, wq, etc. So I generally like the idea of "detached
> serial consoles" (that's what I call it). IOW, elimination of the direct
> printk -> serial console path.

Right; we need to get rid of that in the generic case. Only allow
lockless consoles (earlycon) to be used synchonously. With maybe a
special case for the BUG/PANIC case to push things out ASAP.

But given 'fine' console drivers like USB-serial, or DRM-framebuffer, I
don't know if it makes much difference to wake a thread and rely on that
or call it directly and deadlock :/

> The current approach - use the existing printk_safe mechanism and
> case-by-case, manually, break dependencies which can deadlock us is
> a lazy approach, yes; and not very convenient. I'm aware of it.

It's ugly and pushes ugly all over :/

> So, unless I'm missing something, things are not entirely that simple:
> - throw away printk_safe semi-magic
> - add a lockless logbuf
> - add wake_up_process() to vprintk_emit().

No, no wakups. irq_work to wake the printk-thread, at most.



2018-10-16 14:23:00

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [RFC][PATCHv2 2/4] printk: move printk_safe macros to printk header

On Tue, Oct 16, 2018 at 02:54:15PM +0200, Peter Zijlstra wrote:
> printk will determine the current context:
>
> task, softirq, hardirq or NMI
>
> and pick the corresponding per-cpu line buffer and do the vsnprintf()

We need 4, but we don't need to do the exact context determination for
this. We can keep a simple counter:

#define MAX_IDX 4 /* task, sirq, hirq, nmi */
#define MAX_LEN 1020 /* sizeof(struct line_buffer) < 4k */

struct line_buffers {
int idx;
char line[MAX_IDX][MAX_LEN];
};

static DEFINE_PER_CPU(struct line_buffers, lbs);

static char *get_linebuf(void)
{
struct line_buffers *lbp = this_cpu_ptr(&lbs);
int idx;

idx = lbp->idx++;
return lbp->linx[idx];
}

static void put_linbuf(void)
{
this_cpu_dec(lbs.idx);
}

> thing. Then we have the actual line length and content. With the length
> we reserve the bytes from the global buffer, we memcpy into the buffer
> and commit.


2018-10-17 04:34:59

by Sergey Senozhatsky

[permalink] [raw]
Subject: Re: [RFC][PATCHv2 2/4] printk: move printk_safe macros to printk header

On (10/16/18 14:54), Peter Zijlstra wrote:
> On Tue, Oct 16, 2018 at 09:27:34PM +0900, Sergey Senozhatsky wrote:
> > per-CPU printk_safe _semi-magic_ makes some things simple to handle.
> > We can't just remove per-CPU buffers and add a wake_up_process() at
> > the bottom of vprintk_emit(). Because this will deadlock:
> >
> > printk()
> > wake_up_process()
> > try_to_wake_up()
> > raw_spin_lock_irqsave()
> > <NMI>
> > printk()
> > wake_up_process()
> > try_to_wake_up()
> > raw_spin_lock_irqsave()
> >
> > So we still need some amount of per-CPU printk() semi-magic anyway.
>
> All we need is 4 max-line-length buffers per-CPU. Nothing more.

OK, similar to what Steven did with cpu_buffer->current_context.

> The above trainwreck is the direct result of forcing synchronous
> printk'ing (which I'm otherwise a big fan of, but regular console
> drivers stink and are unfixable).

Yep.

> > And printk-kthread offloding will not eliminate the need of
> > printk_deferred().
>
> Why not? printk() will reduce to a lockless buffer insert. IOW _all_
> printk is deferred.

Aha! Interesting. I didn't realize you were talking about
"all printk()-s are deferred".
OK, jump to the last part of this mail.

> All you need are 4 max-line-length buffers per CPU and a global/shared
> lockless buffer.
>
> printk will determine the current context:
>
> task, softirq, hardirq or NMI
>
> and pick the corresponding per-cpu line buffer and do the vsnprintf()
> thing. Then we have the actual line length and content. With the length
> we reserve the bytes from the global buffer, we memcpy into the buffer
> and commit.
>
> Done.
>
> The printk thread will observe it lags behind the buffer head and will
> start printk-ing crud from task context.

[you can skip this part]

This probably will be a bit more hairy. logbuf is written to by many
sources and is read from by many sides, including user-space [both read()
and write()]. So we will need more flags/magic around memcpy(). A simple,
"grab the logbuf entry, set the proper offset to point to the next available
logbuf record and then do memcpy()" won't suffice. We need a flag for
"memcpy() complete, we can read this entry". Otherwise:

CPU0 CPU1 CPU2 CPU3
printk printk printk_kthread
logbuf_entry A logbuf_entry B syslog(read all) call_console_drivers
memcpy memcpy read unfinished print unfinished
A and B A and B

[..]

> > We do, however, have loads of problems with all those dependencies which
> > come from serial drivers and friends: timekeeping, scheduler (scheduler
> > is brilliant and cool, but we do have some deadlocks in printk because of
> > it ;), tty, net, MM, wq, etc. So I generally like the idea of "detached
> > serial consoles" (that's what I call it). IOW, elimination of the direct
> > printk -> serial console path.
>
> Right; we need to get rid of that in the generic case. Only allow
> lockless consoles (earlycon) to be used synchonously. With maybe a
> special case for the BUG/PANIC case to push things out ASAP.
[..]
> > So, unless I'm missing something, things are not entirely that simple:
> > - throw away printk_safe semi-magic
> > - add a lockless logbuf
> > - add wake_up_process() to vprintk_emit().
>
> No, no wakups. irq_work to wake the printk-thread, at most.

All right. OK. So we are on the same page here:

printk has internal locks - logbuf spin_lock; and external locks - all
the scheduler locks, console_sem, net, tty, wq, you name it. printk() is
not aware of those external locks; the only way to fix it is to remove
them from printk(). And that's why

"turn printk() into printk_deferred() and fix printk() deadlocks
in general case"

was my final proposal at the 2016 KS, NM, USA [1] (grep for printk_deferred).
I mentioned this idea several times since then, and even sent a patch, doing
this "printk is now printk_deferred unless we are in panic" thing. As far
as I remember, back then the idea/patch were rejected [2], and one of
reviewers even hinted that I was crazy :-) I have absolutely no issues
with that, but, considering past experiences, I'd really like to:

- Have more opinions on this. People please speak out.
- Have clear "let's do it" from Cc-ed people.


If we are really doing this, then let's split it and have
incremental changes. Namely, what I suggest is:

- keep internal printk lock - logbuf lock for now; we know how to
handle it. I promise.

- keep printk_safe for now, we need it to deal with logbuf lock

- keep printk_safe completely internal to printk

- add printk_kthread

- do printk()->irq_work()->wake_up_process(printk_kthread)
change and remove external locks dependency

- use direct printk() for panic() case

- do something about early_printk


That's big enough already.
From there, once we land this thing, we can start building new logbuf,
stealing code from Steven, improving per-CPU buffers and so on.


Are we doing this?


[1] https://lwn.net/Articles/705938/
[2] https://lore.kernel.org/lkml/[email protected]/

-ss

2018-10-17 04:51:10

by Sergey Senozhatsky

[permalink] [raw]
Subject: Re: [RFC][PATCHv2 1/4] panic: avoid deadlocks in re-entrant console drivers

On (10/16/18 14:04), Sergey Senozhatsky wrote:
>
> Fix this by setting oops_in_progress before console_flush_on_panic(),
> so re-entrant console drivers will trylock the port->lock instead of
> spinning on it forever.
>

Just a small note:
Regardless of what's going to happen to the series, this patch
deserves attention.

-ss

2018-10-17 07:58:26

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [RFC][PATCHv2 2/4] printk: move printk_safe macros to printk header

On Wed, Oct 17, 2018 at 01:32:51PM +0900, Sergey Senozhatsky wrote:
> This probably will be a bit more hairy. logbuf is written to by many
> sources and is read from by many sides, including user-space [both read()
> and write()]. So we will need more flags/magic around memcpy(). A simple,
> "grab the logbuf entry, set the proper offset to point to the next available
> logbuf record and then do memcpy()" won't suffice. We need a flag for
> "memcpy() complete, we can read this entry". Otherwise:

Sure, but lockless buffers mostly have reserve and commit stages anyway.
Exactly to avoid that problem.

> All right. OK. So we are on the same page here:

> - Have more opinions on this. People please speak out.
> - Have clear "let's do it" from Cc-ed people.
>
>
> If we are really doing this, then let's split it and have
> incremental changes. Namely, what I suggest is:

I'd start by replacing logbuf with the lockless buffer and ripping out
the current nmi/safe/etc.. bollocks.

There is absolutely no point what so ever in doing anything until that
is sorted.

After that, move the console output to a kthread, but keep earlycon (if
set) synchronous.

Also avoid the whole panic/warn special case when earlycon is set, no
point in doing dodgy crap when you know you have a good option already.


2018-10-17 10:50:55

by Petr Mladek

[permalink] [raw]
Subject: Re: [RFC][PATCHv2 2/4] printk: move printk_safe macros to printk header

On Tue 2018-10-16 14:17:52, Peter Zijlstra wrote:
> On Tue, Oct 16, 2018 at 01:40:06PM +0200, Petr Mladek wrote:
> > On Tue 2018-10-16 09:27:19, Peter Zijlstra wrote:
>
> > > Instead of this tinkering around the edges, why don't you make the main
> > > logbuf a lockless ringbuffer and then delegate the actual printing of
> > > that buffer to a kthread, except for earlycon, which can do synchronous
> > > output.
> >
> > In fact, there is no problem with printk log buffer. This patchset
> > tries to avoid deadlock caused by console-specific locks
> > used by console->write() methods.
> >
> > By other words, we neither need printk_safe or lockless log buffer
> > to fix this prolem. Instead, we need either printk_deferred context
> > or lockless consoles.
>
> If you have a lockless buffer and a trailing printk thread, that's
> deferred.
>
> And earlycon _should_ be lockless (yes, I know, some suck)
>
> But if you do this deferred nonsense that's currently in printk, then
> you risk never seeing the message -- the same problem I have with the
> whole printk_safe thing too.

What do you mean with printing the message, please? Storing it
into a buffer or showing on console?

I guess that it is storing because you suggest handling
the console by a printk kthread.

Now, please, what is your vision of the lock less buffer?
Would it be similar with the one used by ftrace?

The current implementation of the lockless buffer is very tricky
and supports only one reader. It will be even bigger maze
of code when we add such a support.

Also the messages are spread in many page-size buffers that
are linked in a per-cpu lists. The page used by the reader
is put outside the list. This all makes it very hard to
reconstruct the log. AFAIK, people use ftrace_dump_on_oops
even with a crashdump because nobody improved crash
to extract the messages from the complicated lockless
buffer for years.

IMHO, the current printk buffer is much more easy from
this point of view. It is one global buffer plus
two per-cpu buffers that are mostly empty.


I still think that the buffer is the minor problem these
days. The big problem are consoles and I do not see
how any lockless buffer could help with it.

Also note that by deferred printk I mean deferring the console
handling! IMHO, there are _no more problems_ with storing
the messages into the buffer if we accept that the current
very limited use of printk_safe per-cpu buffers is easier
than any complicated generic lockless buffer.

Best Regards,
Petr

2018-10-17 13:38:47

by Sergey Senozhatsky

[permalink] [raw]
Subject: Re: [RFC][PATCHv2 2/4] printk: move printk_safe macros to printk header

On (10/17/18 09:57), Peter Zijlstra wrote:
> On Wed, Oct 17, 2018 at 01:32:51PM +0900, Sergey Senozhatsky wrote:
> > This probably will be a bit more hairy. logbuf is written to by many
> > sources and is read from by many sides, including user-space [both read()
> > and write()]. So we will need more flags/magic around memcpy(). A simple,
> > "grab the logbuf entry, set the proper offset to point to the next available
> > logbuf record and then do memcpy()" won't suffice. We need a flag for
> > "memcpy() complete, we can read this entry". Otherwise:
>
> Sure, but lockless buffers mostly have reserve and commit stages anyway.
> Exactly to avoid that problem.

Right. So it should be a lockless ringbuffer supporting multiple
parallel readers and multiple parallel writers from multiple contexts
(task, sortirq, irq, NMI); with sane wrap around, etc. And we probably
need to somehow keep the existing format of logbuf entries to avoid
breakages in crash tool, and so on. It will take a lot of time, I'm
afraid.

> > All right. OK. So we are on the same page here:
>
> > - Have more opinions on this. People please speak out.
> > - Have clear "let's do it" from Cc-ed people.
> >
> >
> > If we are really doing this, then let's split it and have
> > incremental changes. Namely, what I suggest is:
>
> I'd start by replacing logbuf with the lockless buffer and ripping out
> the current nmi/safe/etc.. bollocks.
>
> There is absolutely no point what so ever in doing anything until that
> is sorted.

Peter, you have your point, however, I think I have a slightly different
priority list. On the top of my list - deadlocks in printk(). That's the
whole reason I sent this series. Deadlocks are real. We have real reports;
in the best case in form of lockdep splats; in the worst case - nothing,
simply because the system deadlocks in printk(), and even more - printk()
can deadlock in panic(). So addressing deadlocks in printk() will fix a
rather big, real problem. Then we can have fun with the rest of the things
you have mentioned. What do you think?

-ss

2018-10-17 14:02:00

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [RFC][PATCHv2 2/4] printk: move printk_safe macros to printk header

On Wed, Oct 17, 2018 at 12:50:15PM +0200, Petr Mladek wrote:

> > If you have a lockless buffer and a trailing printk thread, that's
> > deferred.
> >
> > And earlycon _should_ be lockless (yes, I know, some suck)
> >
> > But if you do this deferred nonsense that's currently in printk, then
> > you risk never seeing the message -- the same problem I have with the
> > whole printk_safe thing too.
>
> What do you mean with printing the message, please? Storing it
> into a buffer or showing on console?
>
> I guess that it is storing because you suggest handling
> the console by a printk kthread.

I'm not sure from the context you quoted.

> Now, please, what is your vision of the lock less buffer?
> Would it be similar with the one used by ftrace?

No, exactly not like ftrace in fact.

> The current implementation of the lockless buffer is very tricky
> and supports only one reader. It will be even bigger maze
> of code when we add such a support.

The ftrace thing is way complicated and unsuitable. It does what it
does, but I would've never build it that way around.

> IMHO, the current printk buffer is much more easy from
> this point of view. It is one global buffer plus
> two per-cpu buffers that are mostly empty.

I'm arguing for a single (global) lockless buffer with per-cpu line
buffers.

> I still think that the buffer is the minor problem these
> days. The big problem are consoles and I do not see
> how any lockless buffer could help with it.

Yes, the consoles are horrible unfixable crap (some far worse than
others).

But the logbuf is also horrible, and that we can in fact fix.

> Also note that by deferred printk I mean deferring the console
> handling! IMHO, there are _no more problems_ with storing
> the messages into the buffer if we accept that the current
> very limited use of printk_safe per-cpu buffers is easier
> than any complicated generic lockless buffer.

They hide messages. The whole two radically different printk paths is
also quite horrible.

And lockless buffers aren't all _that_ complicated, esp. not when
performance isn't the top priority.

And earlycon is mostly usable, esp. the serial ones. Those, when
configured, should synchronously print along. The current design
also makes that difficult.

A wee little like so; typed in a hurry, never been near a compiler.

struct ll_buffer {
void *buffer;
unsigned int size;

unsigned int tail, reserve, head;

unsigned int cpu;
unsigned int ctx;
};

struct ll_entry {
unsigned int size;
char data[0];
};

struct ll_handle {
unsigned int cpu;
struct ll_entry *lhe;
};

static struct ll_buffer logbuf = {
.buffer = /* static storage */,
.cpu = -1;
};

static void __ll_get(struct ll_buffer *llb, struct ll_handle *llh)
{
unsigned int cpu, old;

cpu = smp_processor_id();
for (;;) {
old = READ_ONCE(llb->cpu);
if (old == -1) {
if (try_cmpxchg_acquire(&llb->cpu, &old, cpu))
break;
}
if (old == cpu)
break;

cpu_relax();
}

llh->cpu = old;
}

static void __ll_put(struct ll_buffer *llb, struct ll_handle *llh)
{
smp_store_release(llb->cpu, llh->cpu);
}

static char *__ll_reserve(struct ll_buffer *llb, struct ll_hande *llh, unsigned int size)
{
unsigned int tail, res;
struct ll_entry *lle;

__ll_get(llb, llh);

llb->ctx++;

size += sizeof(struct ll_entry);
size += 3;
size &= ~3;

do {
for (;;) {
tail = READ_ONCE(llb->tail);
res = READ_ONCE(llb->resserve);

if (CIRC_SPACE(tail, res, llb->size) > size)
break;

lle = llb->buffer + (tail % llb->size);
cmpxchg(&llb->tail, tail, tail + lhe->size);
}

} while (cmpxchg(&llb->reserve, res, res + size) != res);

llh->lhe = llb->buffer + (res % llb->size);
llh->lhe->size = size;
}

static void __ll_commit(struct ll_buffer *llb, struct ll_handle *llh)
{
for (;;) {
unsigned int ctx, res;

res = READ_ONCE(llb->res);
barrier();
ctx = --llb->ctx;
if (ctx)
break;

smp_store_release(llb->head, res);

if (likely(READ_ONCE(llb->res) == res))
break;

llb->ctx++;
}

__ll_put(llb, llh);
}

#define MAX_LINE 1020
#define MAX_CTX 4

struct line_buffers {
int ctx;
char buffers[MAX_CTX][MAX_LINE];
};

static DEFINE_PER_CPU(struct line_buffers, linbufs);

int vprintk(const char *fmt, va_list args)
{
struct ll_handle handle;
struct line_buffers *lb;
struct ll_entry *lhe;
char linbuf;
int sz;

preempt_disable();
lb = this_cpu_ptr(&linbufs);
linbuf = lb->buffers[lb->ctx++];

sz = scnprintf(linbuf, MAX_LINE, fmt, args);

__ll_reserve(&logbuf, &handle, sz);
memcpy(llh->lhe->data, linbuf, sz); // XXX doesn't wrap on llb->size
__ll_commit(&logbuf, &handle);

ln->ctx--;
preempt_enable();
}


struct ll_reader {
unsigned int tail;
char line[MAX_LINE];
};

static bool __ll_read(struct ll_buffer *llb, struct ll_reader *llr)
{
unsigned int tail, reader, head, sz;
struct ll_entry *lhe;
bool lost = false;

for (;;) {
head = READ_ONCE(llb->head);
smp_rmb();
tail = READ_ONCE(llb->tail);
reader = READ_ONCE(llr->tail);
if (unlikely((signed)(tail - reader) < 0)) {
lost = true;
reader = tail;
}
smp_rmb();
lhe = llb->buffer + reader;
sz = lhe->size;
llr->tail = reader + sz;
sz -= sizeof(struct ll_entry);

if ((signed)(head - reader) > 0)
memcpy(llr->line, lhe->data, sz); // XXX wrap

smp_rmb();
tail = READ_ONCE(llb->tail);
if ((signed)(tail - reader) >= 0)
break;
}

return lost;
}

2018-10-22 14:52:31

by Petr Mladek

[permalink] [raw]
Subject: Re: [RFC][PATCHv2 2/4] printk: move printk_safe macros to printk header

On Wed 2018-10-17 16:00:44, Peter Zijlstra wrote:
> On Wed, Oct 17, 2018 at 12:50:15PM +0200, Petr Mladek wrote:
> > Also note that by deferred printk I mean deferring the console
> > handling! IMHO, there are _no more problems_ with storing
> > the messages into the buffer if we accept that the current
> > very limited use of printk_safe per-cpu buffers is easier
> > than any complicated generic lockless buffer.
>
> They hide messages. The whole two radically different printk paths is
> also quite horrible.

I agree that having everything in one buffer with unified
and safe access from any context would be a great win.


> And lockless buffers aren't all _that_ complicated, esp. not when
> performance isn't the top priority.
>
> And earlycon is mostly usable, esp. the serial ones. Those, when
> configured, should synchronously print along. The current design
> also makes that difficult.
>
> A wee little like so; typed in a hurry, never been near a compiler.

Thanks a lot for the code. I still need to find time to better
understand it. Anyway, it looks worth considering.

Best Regards,
Petr

2018-10-23 06:26:22

by Sergey Senozhatsky

[permalink] [raw]
Subject: Re: [RFC][PATCHv2 2/4] printk: move printk_safe macros to printk header

On (10/16/18 14:54), Peter Zijlstra wrote:
>
> No, no wakups. irq_work to wake the printk-thread, at most.
>

There are cases when we probably prefer to be in "direct printk" mode.
E.g. sysrq or late PM stages (probably).

Doing irq_work->wake_up_process->printk_kthread from sysrq probably
might not work all the time, so direct printk path may look more
"reliable" in some cases. In *theory* (just in theory), we can do

void __handle_sysrq(int key, bool check_mask)
{
...
op_p->handler(key);
...

+ if (console_trylock())
+ console_unlock();
}

type of thing. So sysrq handler will just log_store() the data and we
will try to flush logbuf immediately once we are dont with sysrq handler.
This will require additional work, tho. Some sysrq handlers can print
significant (depends on serial console baud) amounts of data. For instance,
sysrq_handle_showstate() calls show_workqueue_state() and show_state(),
which do numerous printk()-s. Therefore, those functions touch NMI and
softlockup watchdogs:

void show_workqueue_state(void)
{
for_each_pwq(pwq, wq) {
...
touch_nmi_watchdog();
}

for_each_pool(pool, pi) {
...
touch_nmi_watchdog();
}
}

and

void show_state_filter(unsigned long state_filter)
{
for_each_process_thread(g, p) {
touch_nmi_watchdog();
touch_all_softlockup_watchdogs();
...
}
}

If we will move the actual printout to console_unlock() then we will
have to start touching watchdogs from console_unlock().

Another troubling moment might be that with completely async printk()
it's easier to cause logbuf wrap around; because CPU which does printk()
in a loop is always async, console drivers don't throttle it anymore;
currently it's sometimes sync (when console_sem is not locked, or locked
but we have active console_sem_owner) or async (when console_sem is
locked and there is no active console_sem_owner).

-ss

2018-10-23 11:08:43

by Petr Mladek

[permalink] [raw]
Subject: Re: [RFC][PATCHv2 1/4] panic: avoid deadlocks in re-entrant console drivers

On Tue 2018-10-16 14:04:25, Sergey Senozhatsky wrote:
> >From printk()/serial console point of view panic() is special, because
> it may force CPU to re-enter printk() or/and serial console driver.
> Therefore, some of serial consoles drivers are re-entrant. E.g. 8250:
>
> serial8250_console_write()
> {
> if (port->sysrq)
> locked = 0;
> else if (oops_in_progress)
> locked = spin_trylock_irqsave(&port->lock, flags);
> else
> spin_lock_irqsave(&port->lock, flags);
> ...
> }
>
> panic() does set oops_in_progress via bust_spinlocks(1), so in theory
> we should be able to re-enter serial console driver from panic():
>
> CPU0
> <NMI>
> uart_console_write()
> serial8250_console_write() // if (oops_in_progress)
> // spin_trylock_irqsave()
> call_console_drivers()
> console_unlock()
> console_flush_on_panic()
> bust_spinlocks(1) // oops_in_progress++
> panic()
> <NMI/>
> spin_lock_irqsave(&port->lock, flags) // spin_lock_irqsave()
> serial8250_console_write()
> call_console_drivers()
> console_unlock()
> printk()
> ...
>
> However, this does not happen and we deadlock in serial console on
> port->lock spinlock. And the problem is that console_flush_on_panic()
> called after bust_spinlocks(0):
>
> void panic(const char *fmt, ...)
> {
> bust_spinlocks(1);
> ...
> bust_spinlocks(0);
> console_flush_on_panic();
> ...
> }
>
> bust_spinlocks(0) decrements oops_in_progress, so oops_in_progress
> can go back to zero. Thus even re-entrant console drivers will simply
> spin on port->lock spinlock. Given that port->lock may already be
> locked either by a stopped CPU, or by the very same CPU we execute
> panic() on (for instance, NMI panic() on printing CPU) the system
> deadlocks and does not reboot.

The idea makes sense to me. You are right that we already called
printk/console with busted spinlock many times in panic().
Therefore it should not be worse.


> Fix this by setting oops_in_progress before console_flush_on_panic(),
> so re-entrant console drivers will trylock the port->lock instead of
> spinning on it forever.
>
> Signed-off-by: Sergey Senozhatsky <[email protected]>
> ---
> kernel/panic.c | 6 ++++++
> 1 file changed, 6 insertions(+)
>
> diff --git a/kernel/panic.c b/kernel/panic.c
> index f6d549a29a5c..a0e60ccf3031 100644
> --- a/kernel/panic.c
> +++ b/kernel/panic.c
> @@ -237,7 +237,13 @@ void panic(const char *fmt, ...)
> if (_crash_kexec_post_notifiers)
> __crash_kexec(NULL);
>
> + /*
> + * Decrement oops_in_progress and let bust_spinlocks() to
> + * unblank_screen(), console_unblank() and wake_up_klogd()
> + */
> bust_spinlocks(0);
> + /* Set oops_in_progress, so we can reenter serial console driver */
> + bust_spinlocks(1);

Though this looks a bit weird.

I have just realized that console_unblank() is called by
bust_spinlocks(0) and does basically the same as
console_flush_on_panic(). Also it does not make much
sense wake_up_klogd() there. Finally, it seems to be
too late to disable lockdep there.

I would suggest something like:

diff --git a/kernel/panic.c b/kernel/panic.c
index 8b2e002d52eb..c78e3df8dd58 100644
--- a/kernel/panic.c
+++ b/kernel/panic.c
@@ -233,17 +233,14 @@ void panic(const char *fmt, ...)
if (_crash_kexec_post_notifiers)
__crash_kexec(NULL);

- bust_spinlocks(0);
-
/*
* We may have ended up stopping the CPU holding the lock (in
- * smp_send_stop()) while still having some valuable data in the console
- * buffer. Try to acquire the lock then release it regardless of the
- * result. The release will also print the buffers out. Locks debug
- * should be disabled to avoid reporting bad unlock balance when
- * panic() is not being callled from OOPS.
+ * smp_send_stop()) while still having some valuable data in
+ * the console buffer. Try hard to see them.
*/
- debug_locks_off();
+#ifdef CONFIG_VT
+ unblank_screen();
+#endif
console_flush_on_panic();

if (!panic_blink)
diff --git a/lib/bust_spinlocks.c b/lib/bust_spinlocks.c
index ab719495e2cb..e42d2fcd6453 100644
--- a/lib/bust_spinlocks.c
+++ b/lib/bust_spinlocks.c
@@ -20,6 +20,12 @@
void __attribute__((weak)) bust_spinlocks(int yes)
{
if (yes) {
+ /*
+ * Some locks might get ignored in the Oops situation
+ * to get an important work done. Locks debug should
+ * be disabled to avoid reporting bad unlock balance.
+ */
+ debug_locks_off();
++oops_in_progress;
} else {
#ifdef CONFIG_VT

2018-10-23 11:57:55

by Sergey Senozhatsky

[permalink] [raw]
Subject: Re: [RFC][PATCHv2 1/4] panic: avoid deadlocks in re-entrant console drivers

On (10/23/18 13:07), Petr Mladek wrote:
> Though this looks a bit weird.
>
> I have just realized that console_unblank() is called by
> bust_spinlocks(0) and does basically the same as
> console_flush_on_panic(). Also it does not make much
> sense wake_up_klogd() there. Finally, it seems to be
> too late to disable lockdep there.

Thanks for taking a look.
As of "weird" part I have some explanations:

> @@ -233,17 +233,14 @@ void panic(const char *fmt, ...)
> if (_crash_kexec_post_notifiers)
> __crash_kexec(NULL);
>
> - bust_spinlocks(0);
> -
[..]
> - debug_locks_off();
> +#ifdef CONFIG_VT
> + unblank_screen();
> +#endif
> console_flush_on_panic();
>
> if (!panic_blink)


So I did look at what lib/bust_spinlocks.c does; and I agree that waking
up klogd makes little sense, on the other hand it just sets per-cpu
pending bit, so not a big deal. console_unlock() should do there the
same thing as console_flush_on_panic(). Yes. However, a bit of a bigger
argument:
__attribute__((weak)) suggests that bust_spinlocks() is arch-dependent
and it's up to arch to do some extra stuff there [if needed]. So that's
why I decided to keep bust_spinlocks(0) in panic() and, thus, call into
arch-specific code (or common bust_spinlocks); then bump oops_in_progress
so serial consoles become re-entrant and finally call
console_flush_on_panic().

> void __attribute__((weak)) bust_spinlocks(int yes)
> {
> if (yes) {
> + /*
> + * Some locks might get ignored in the Oops situation
> + * to get an important work done. Locks debug should
> + * be disabled to avoid reporting bad unlock balance.
> + */
> + debug_locks_off();
> ++oops_in_progress;

Hmm, I don't think I've seen any reports because of this. From printk/console
POV the locks which are not taken under oops_in_progress are not released.

Wrt to uart port we usually have "bool locked" flag and unlock
port->lock only if we locked it:

{
if (oops_in_progress)
locked = spin_trylock_irqsave(&port->lock, flags);

...

if (locked)
spin_unlock_irqrestore(&port->lock, flags);
}

Wrt to console_sem we have

{
if (oops_in_progress)
if (!down_trylock_console_sem() != 0)
return;
...
console_unlock();
}

So the locks that we care about in this particular patch (console sem
and port->lock) probably should not see any locking imbalance.

If you have strong opinion then we can have debug_locks_off() change
as part of this patch. But maybe I'd prefer to have it as a separate
patch. What do you think?

-ss

2018-10-23 12:13:31

by Sergey Senozhatsky

[permalink] [raw]
Subject: Re: [RFC][PATCHv2 1/4] panic: avoid deadlocks in re-entrant console drivers

On (10/23/18 21:04), Sergey Senozhatsky wrote:
>
> Seems that s390 is the only arch which defines its own bust_spinlocks().
> Not sure why... Just to play games with console_loglevel?
>
> ---
>
> void bust_spinlocks(int yes)
> {
> if (yes) {
> oops_in_progress = 1;
> } else {
> int loglevel_save = console_loglevel;
> console_unblank();
> oops_in_progress = 0;
> /*
> * OK, the message is on the console. Now we call printk()
> * without oops_in_progress set so that printk will give klogd
> * a poke. Hold onto your hats...
> */
> console_loglevel = 15;
> printk(" ");
> console_loglevel = loglevel_save;
> }
> }
>
> ---
>
> The "printk(" "); without oops_in_progress" part is a bit worrisome.
> This thing technically can deadlock. Unless s390 has no NMI panic().

And console_unblank() is not guaranteed to print anything (unlike
console_flush_on_panic(), but oops is not panic() yet, so we can't
replace it with flush_on_panic()) - console_sem can be locked, so
console_unblank() would do nothing.

That printk(" ") probably wants to be printk_deferred(" ").

-ss

2018-10-23 12:49:13

by Sergey Senozhatsky

[permalink] [raw]
Subject: Re: [RFC][PATCHv2 1/4] panic: avoid deadlocks in re-entrant console drivers

On (10/23/18 20:54), Sergey Senozhatsky wrote:
> So I did look at what lib/bust_spinlocks.c does; and I agree that waking
> up klogd makes little sense, on the other hand it just sets per-cpu
> pending bit, so not a big deal. console_unlock() should do there the
> same thing as console_flush_on_panic(). Yes. However, a bit of a bigger
> argument:
> __attribute__((weak)) suggests that bust_spinlocks() is arch-dependent
> and it's up to arch to do some extra stuff there [if needed]. So that's
> why I decided to keep bust_spinlocks(0) in panic() and, thus, call into
> arch-specific code (or common bust_spinlocks); then bump oops_in_progress
> so serial consoles become re-entrant and finally call
> console_flush_on_panic().

Seems that s390 is the only arch which defines its own bust_spinlocks().
Not sure why... Just to play games with console_loglevel?

---

void bust_spinlocks(int yes)
{
if (yes) {
oops_in_progress = 1;
} else {
int loglevel_save = console_loglevel;
console_unblank();
oops_in_progress = 0;
/*
* OK, the message is on the console. Now we call printk()
* without oops_in_progress set so that printk will give klogd
* a poke. Hold onto your hats...
*/
console_loglevel = 15;
printk(" ");
console_loglevel = loglevel_save;
}
}

---

The "printk(" "); without oops_in_progress" part is a bit worrisome.
This thing technically can deadlock. Unless s390 has no NMI panic().

-ss

2018-10-25 08:31:07

by Petr Mladek

[permalink] [raw]
Subject: Re: [RFC][PATCHv2 1/4] panic: avoid deadlocks in re-entrant console drivers

On Tue 2018-10-23 20:54:33, Sergey Senozhatsky wrote:
> On (10/23/18 13:07), Petr Mladek wrote:
> > Though this looks a bit weird.
> >
> > I have just realized that console_unblank() is called by
> > bust_spinlocks(0) and does basically the same as
> > console_flush_on_panic(). Also it does not make much
> > sense wake_up_klogd() there. Finally, it seems to be
> > too late to disable lockdep there.
>
> So I did look at what lib/bust_spinlocks.c does; and I agree that waking
> up klogd makes little sense, on the other hand it just sets per-cpu
> pending bit, so not a big deal. console_unlock() should do there the
> same thing as console_flush_on_panic().

Yes, klogd is not a big deal. I just think that the bust_spinlocks()
ping-pong would just confuse the code. It might be better to keep
the spinlocks busted and make sure that we do not cause regressions
by not calling bust_spinlocks(0).

It tried to solve this by calling the important actions from
bust_spinlocks(0) directly in panic(). I wanted to omit anything
that does not make sense in panic().


> > void __attribute__((weak)) bust_spinlocks(int yes)
> > {
> > if (yes) {
> > + /*
> > + * Some locks might get ignored in the Oops situation
> > + * to get an important work done. Locks debug should
> > + * be disabled to avoid reporting bad unlock balance.
> > + */
> > + debug_locks_off();
> > ++oops_in_progress;
>
> Hmm, I don't think I've seen any reports because of this. From printk/console
> POV the locks which are not taken under oops_in_progress are not released.

Fair enough. Let's keep debug_locks_off() in panic().

Best Regards,
Petr

2018-10-25 09:06:33

by Sergey Senozhatsky

[permalink] [raw]
Subject: Re: [RFC][PATCHv2 1/4] panic: avoid deadlocks in re-entrant console drivers

On (10/25/18 10:29), Petr Mladek wrote:
>
> Yes, klogd is not a big deal. I just think that the bust_spinlocks()
> ping-pong would just confuse the code.

I agree; that's why I put some comments there.

> It might be better to keep the spinlocks busted and make sure that we do
> not cause regressions by not calling bust_spinlocks(0).

Sure, I understand. One reason to keep bust_spinlocks(0) there was "invoke
arch-specific bust_spinlocks(0), which might do something that common
bust_spinlocks() wouldn't do". Without going into details if any arch actually
does anything "special" in bust_spinlocks(0). Another reason was - this patch
looks like a -stable material to me; especially given that we have panic()
deadlock reports now. So I wanted to have a one liner which will not change
things for arch-s that re-define bust_spinlocks() and, at the same time,
fix the deadlock. Other than that I'm all for keeping spinlocks busted all
the time and just doing:

---
#ifdef CONFIG_VT
unblank_screen();
#endif
console_unblank();
---

in panic().

BTW, speaking of s390 bust_spinlocks(). It seems that starting from 4.21
all arch-s will use common bust_spinlocks() [1].

[..]
> > Hmm, I don't think I've seen any reports because of this. From printk/console
> > POV the locks which are not taken under oops_in_progress are not released.
>
> Fair enough. Let's keep debug_locks_off() in panic().

Agreed.

[1] lkml.kernel.org/r/20181025081108.GB26561@osiris

-ss

2018-10-25 09:07:02

by Petr Mladek

[permalink] [raw]
Subject: Re: [RFC][PATCHv2 1/4] panic: avoid deadlocks in re-entrant console drivers

On Tue 2018-10-23 21:12:30, Sergey Senozhatsky wrote:
> On (10/23/18 21:04), Sergey Senozhatsky wrote:
> >
> > Seems that s390 is the only arch which defines its own bust_spinlocks().
> > Not sure why... Just to play games with console_loglevel?
> >
> > ---
> >
> > void bust_spinlocks(int yes)
> > {
> > if (yes) {
> > oops_in_progress = 1;
> > } else {
> > int loglevel_save = console_loglevel;
> > console_unblank();
> > oops_in_progress = 0;
> > /*
> > * OK, the message is on the console. Now we call printk()
> > * without oops_in_progress set so that printk will give klogd
> > * a poke. Hold onto your hats...
> > */
> > console_loglevel = 15;
> > printk(" ");
> > console_loglevel = loglevel_save;
> > }
> > }
> >
> > ---
> >
> > The "printk(" "); without oops_in_progress" part is a bit worrisome.
> > This thing technically can deadlock. Unless s390 has no NMI panic().

The comment suggests that the main purpose it is to wake up
klogd(). It is done in the generic bust_spinlocks() a safe way.

The games with console_loglevel do not make much sense because
console_unblank() calls console_unlock() when it was able to
take the console_lock. It means that either console_unblank()
flush all messages to the console or console_trylock() would
most likely fail also in that extra printk().

IMHO, the custom s390 implementation can get removed.
The generic code should do the same job these days.


> And console_unblank() is not guaranteed to print anything (unlike
> console_flush_on_panic(), but oops is not panic() yet, so we can't
> replace it with flush_on_panic()) - console_sem can be locked, so
> console_unblank() would do nothing.

I see. I missed that console_unblank() returns early when
down_trylock_console_sem() fails.

I still would like to refactor the code somehow to avoid
the bust_spinlocks(0)/bust_spinlocks(1) ping-pong.

It might make sense to call console_unblank() from
console_flush_on_panic().

I wonder if it would make sense to call unblank_screen() in
console_unblank()...

Best Regards,
Petr

2018-10-25 09:32:49

by Sergey Senozhatsky

[permalink] [raw]
Subject: Re: [RFC][PATCHv2 1/4] panic: avoid deadlocks in re-entrant console drivers

On (10/25/18 11:06), Petr Mladek wrote:
>
> IMHO, the custom s390 implementation can get removed.
> The generic code should do the same job these days.
>

Yep.

> > And console_unblank() is not guaranteed to print anything (unlike
> > console_flush_on_panic(), but oops is not panic() yet, so we can't
> > replace it with flush_on_panic()) - console_sem can be locked, so
> > console_unblank() would do nothing.
>
> I see. I missed that console_unblank() returns early when
> down_trylock_console_sem() fails.
>
> I still would like to refactor the code somehow to avoid
> the bust_spinlocks(0)/bust_spinlocks(1) ping-pong.
>
> It might make sense to call console_unblank() from
> console_flush_on_panic().
>
> I wonder if it would make sense to call unblank_screen() in
> console_unblank()...

These are interesting thoughts.

I can add one more thing to the list.
bust_spinlock() is probably not doing enough. It says that it

"clears any spinlocks which would prevent oops, die(), BUG()
and panic() information from reaching the user."

But this is, technically, not true. Because bust_spinlock() does not remove
console_sem out of sight. And we do have several spinlocks behind it. E.g.
semaphore's ->lock. Both down() and down_trylock() do
raw_spin_lock_irqsave(&sem->lock, flags), so if we got NMI panic while one
of the CPUs was under raw_spin_lock_irqsave(&sem->lock, flags) trying to
lock the console_sem then we are done. And IIRC we had exactly this type
of a bug report from LG 1 or 2 years ago - deadlock on sem->lock.

-ss

2018-10-25 10:13:54

by Sergey Senozhatsky

[permalink] [raw]
Subject: [PATCHv3] panic: avoid deadlocks in re-entrant console drivers

From printk()/serial console point of view panic() is special, because
it may force CPU to re-enter printk() or/and serial console driver.
Therefore, some of serial consoles drivers are re-entrant. E.g. 8250:

serial8250_console_write()
{
if (port->sysrq)
locked = 0;
else if (oops_in_progress)
locked = spin_trylock_irqsave(&port->lock, flags);
else
spin_lock_irqsave(&port->lock, flags);
...
}

panic() does set oops_in_progress via bust_spinlocks(1), so in theory
we should be able to re-enter serial console driver from panic():

CPU0
<NMI>
uart_console_write()
serial8250_console_write() // if (oops_in_progress)
// spin_trylock_irqsave()
call_console_drivers()
console_unlock()
console_flush_on_panic()
bust_spinlocks(1) // oops_in_progress++
panic()
<NMI/>
spin_lock_irqsave(&port->lock, flags) // spin_lock_irqsave()
serial8250_console_write()
call_console_drivers()
console_unlock()
printk()
...

However, this does not happen and we deadlock in serial console on
port->lock spinlock. And the problem is that console_flush_on_panic()
called after bust_spinlocks(0):

void panic(const char *fmt, ...)
{
bust_spinlocks(1);
...
bust_spinlocks(0);
console_flush_on_panic();
...
}

bust_spinlocks(0) decrements oops_in_progress, so oops_in_progress
can go back to zero. Thus even re-entrant console drivers will simply
spin on port->lock spinlock. Given that port->lock may already be
locked either by a stopped CPU, or by the very same CPU we execute
panic() on (for instance, NMI panic() on printing CPU) the system
deadlocks and does not reboot.

Fix this by removing bust_spinlocks(0), so oops_in_progress is always
set in panic() now and, thus, re-entrant console drivers will trylock
the port->lock instead of spinning on it forever, when we call them
from console_flush_on_panic().

Signed-off-by: Sergey Senozhatsky <[email protected]>
---

v2: do not do bust_spinlocks(1);bust_spinlocks(0);bust_spinlocks(1)
thing and just copy paste from lib/bust_spinlocks what
bust_spinlocks(0) does. (Petr)

kernel/panic.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/kernel/panic.c b/kernel/panic.c
index f6d549a29a5c..fccd41628b24 100644
--- a/kernel/panic.c
+++ b/kernel/panic.c
@@ -237,7 +237,10 @@ void panic(const char *fmt, ...)
if (_crash_kexec_post_notifiers)
__crash_kexec(NULL);

- bust_spinlocks(0);
+#ifdef CONFIG_VT
+ unblank_screen();
+#endif
+ console_unblank();

/*
* We may have ended up stopping the CPU holding the lock (in
--
2.19.1


2018-10-25 10:52:52

by kernel test robot

[permalink] [raw]
Subject: Re: [PATCHv3] panic: avoid deadlocks in re-entrant console drivers

Hi Sergey,

I love your patch! Yet something to improve:

[auto build test ERROR on linux-sof-driver/master]
[also build test ERROR on v4.19 next-20181019]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url: https://github.com/0day-ci/linux/commits/Sergey-Senozhatsky/panic-avoid-deadlocks-in-re-entrant-console-drivers/20181025-181929
base: https://github.com/thesofproject/linux master
config: i386-randconfig-x077-201842 (attached as .config)
compiler: gcc-7 (Debian 7.3.0-1) 7.3.0
reproduce:
# save the attached .config to linux build tree
make ARCH=i386

All errors (new ones prefixed by >>):

kernel/panic.c: In function 'panic':
>> kernel/panic.c:237:2: error: implicit declaration of function 'unblank_screen' [-Werror=implicit-function-declaration]
unblank_screen();
^~~~~~~~~~~~~~
cc1: some warnings being treated as errors

vim +/unblank_screen +237 kernel/panic.c

186
187 /*
188 * If we have crashed and we have a crash kernel loaded let it handle
189 * everything else.
190 * If we want to run this after calling panic_notifiers, pass
191 * the "crash_kexec_post_notifiers" option to the kernel.
192 *
193 * Bypass the panic_cpu check and call __crash_kexec directly.
194 */
195 if (!_crash_kexec_post_notifiers) {
196 printk_safe_flush_on_panic();
197 __crash_kexec(NULL);
198
199 /*
200 * Note smp_send_stop is the usual smp shutdown function, which
201 * unfortunately means it may not be hardened to work in a
202 * panic situation.
203 */
204 smp_send_stop();
205 } else {
206 /*
207 * If we want to do crash dump after notifier calls and
208 * kmsg_dump, we will need architecture dependent extra
209 * works in addition to stopping other CPUs.
210 */
211 crash_smp_send_stop();
212 }
213
214 /*
215 * Run any panic handlers, including those that might need to
216 * add information to the kmsg dump output.
217 */
218 atomic_notifier_call_chain(&panic_notifier_list, 0, buf);
219
220 /* Call flush even twice. It tries harder with a single online CPU */
221 printk_safe_flush_on_panic();
222 kmsg_dump(KMSG_DUMP_PANIC);
223
224 /*
225 * If you doubt kdump always works fine in any situation,
226 * "crash_kexec_post_notifiers" offers you a chance to run
227 * panic_notifiers and dumping kmsg before kdump.
228 * Note: since some panic_notifiers can make crashed kernel
229 * more unstable, it can increase risks of the kdump failure too.
230 *
231 * Bypass the panic_cpu check and call __crash_kexec directly.
232 */
233 if (_crash_kexec_post_notifiers)
234 __crash_kexec(NULL);
235
236 #ifdef CONFIG_VT
> 237 unblank_screen();
238 #endif
239 console_unblank();
240
241 /*
242 * We may have ended up stopping the CPU holding the lock (in
243 * smp_send_stop()) while still having some valuable data in the console
244 * buffer. Try to acquire the lock then release it regardless of the
245 * result. The release will also print the buffers out. Locks debug
246 * should be disabled to avoid reporting bad unlock balance when
247 * panic() is not being callled from OOPS.
248 */
249 debug_locks_off();
250 console_flush_on_panic();
251
252 if (!panic_blink)
253 panic_blink = no_blink;
254
255 if (panic_timeout > 0) {
256 /*
257 * Delay timeout seconds before rebooting the machine.
258 * We can't use the "normal" timers since we just panicked.
259 */
260 pr_emerg("Rebooting in %d seconds..\n", panic_timeout);
261
262 for (i = 0; i < panic_timeout * 1000; i += PANIC_TIMER_STEP) {
263 touch_nmi_watchdog();
264 if (i >= i_next) {
265 i += panic_blink(state ^= 1);
266 i_next = i + 3600 / PANIC_BLINK_SPD;
267 }
268 mdelay(PANIC_TIMER_STEP);
269 }
270 }
271 if (panic_timeout != 0) {
272 /*
273 * This will not be a clean reboot, with everything
274 * shutting down. But if there is a chance of
275 * rebooting the system it will be rebooted.
276 */
277 emergency_restart();
278 }
279 #ifdef __sparc__
280 {
281 extern int stop_a_enabled;
282 /* Make sure the user can actually press Stop-A (L1-A) */
283 stop_a_enabled = 1;
284 pr_emerg("Press Stop-A (L1-A) from sun keyboard or send break\n"
285 "twice on console to return to the boot prom\n");
286 }
287 #endif
288 #if defined(CONFIG_S390)
289 {
290 unsigned long caller;
291
292 caller = (unsigned long)__builtin_return_address(0);
293 disabled_wait(caller);
294 }
295 #endif
296 pr_emerg("---[ end Kernel panic - not syncing: %s ]---\n", buf);
297 local_irq_enable();
298 for (i = 0; ; i += PANIC_TIMER_STEP) {
299 touch_softlockup_watchdog();
300 if (i >= i_next) {
301 i += panic_blink(state ^= 1);
302 i_next = i + 3600 / PANIC_BLINK_SPD;
303 }
304 mdelay(PANIC_TIMER_STEP);
305 }
306 }
307

---
0-DAY kernel test infrastructure Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all Intel Corporation


Attachments:
(No filename) (5.43 kB)
.config.gz (32.74 kB)
Download all attachments

2018-10-25 11:57:08

by Sergey Senozhatsky

[permalink] [raw]
Subject: Re: [PATCHv3] panic: avoid deadlocks in re-entrant console drivers

On (10/25/18 18:51), kbuild test robot wrote:
>
> [auto build test ERROR on linux-sof-driver/master]
> [also build test ERROR on v4.19 next-20181019]
> [if your patch is applied to the wrong git tree, please drop us a note to help improve the system]
>

My bad, sorry!

+#include <linux/vt_kern.h>

This should fix it.

=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=


From: Sergey Senozhatsky <[email protected]>
Subject: [PATCH] panic: avoid deadlocks in re-entrant console drivers

From printk()/serial console point of view panic() is special, because
it may force CPU to re-enter printk() or/and serial console driver.
Therefore, some of serial consoles drivers are re-entrant. E.g. 8250:

serial8250_console_write()
{
if (port->sysrq)
locked = 0;
else if (oops_in_progress)
locked = spin_trylock_irqsave(&port->lock, flags);
else
spin_lock_irqsave(&port->lock, flags);
...
}

panic() does set oops_in_progress via bust_spinlocks(1), so in theory
we should be able to re-enter serial console driver from panic():

CPU0
<NMI>
uart_console_write()
serial8250_console_write() // if (oops_in_progress)
// spin_trylock_irqsave()
call_console_drivers()
console_unlock()
console_flush_on_panic()
bust_spinlocks(1) // oops_in_progress++
panic()
<NMI/>
spin_lock_irqsave(&port->lock, flags) // spin_lock_irqsave()
serial8250_console_write()
call_console_drivers()
console_unlock()
printk()
...

However, this does not happen and we deadlock in serial console on
port->lock spinlock. And the problem is that console_flush_on_panic()
called after bust_spinlocks(0):

void panic(const char *fmt, ...)
{
bust_spinlocks(1);
...
bust_spinlocks(0);
console_flush_on_panic();
...
}

bust_spinlocks(0) decrements oops_in_progress, so oops_in_progress
can go back to zero. Thus even re-entrant console drivers will simply
spin on port->lock spinlock. Given that port->lock may already be
locked either by a stopped CPU, or by the very same CPU we execute
panic() on (for instance, NMI panic() on printing CPU) the system
deadlocks and does not reboot.

Fix this by removing bust_spinlocks(0), so oops_in_progress is always
set in panic() now and, thus, re-entrant console drivers will trylock
the port->lock instead of spinning on it forever, when we call them
from console_flush_on_panic().

Signed-off-by: Sergey Senozhatsky <[email protected]>
---
kernel/panic.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/kernel/panic.c b/kernel/panic.c
index f6d549a29a5c..272ac1c34e4b 100644
--- a/kernel/panic.c
+++ b/kernel/panic.c
@@ -29,6 +29,7 @@
#include <linux/ratelimit.h>
#include <linux/debugfs.h>
#include <asm/sections.h>
+#include <linux/vt_kern.h>

#define PANIC_TIMER_STEP 100
#define PANIC_BLINK_SPD 18
@@ -237,7 +238,10 @@ void panic(const char *fmt, ...)
if (_crash_kexec_post_notifiers)
__crash_kexec(NULL);

- bust_spinlocks(0);
+#ifdef CONFIG_VT
+ unblank_screen();
+#endif
+ console_unblank();

/*
* We may have ended up stopping the CPU holding the lock (in
--
2.19.1


2018-10-31 12:28:48

by Petr Mladek

[permalink] [raw]
Subject: Re: [PATCHv3] panic: avoid deadlocks in re-entrant console drivers

On Thu 2018-10-25 19:10:36, Sergey Senozhatsky wrote:
> >From printk()/serial console point of view panic() is special, because
> it may force CPU to re-enter printk() or/and serial console driver.
> Therefore, some of serial consoles drivers are re-entrant. E.g. 8250:
>
> serial8250_console_write()
> {
> if (port->sysrq)
> locked = 0;
> else if (oops_in_progress)
> locked = spin_trylock_irqsave(&port->lock, flags);
> else
> spin_lock_irqsave(&port->lock, flags);
> ...
> }
>
> panic() does set oops_in_progress via bust_spinlocks(1), so in theory
> we should be able to re-enter serial console driver from panic():
>
> CPU0
> <NMI>
> uart_console_write()
> serial8250_console_write() // if (oops_in_progress)
> // spin_trylock_irqsave()
> call_console_drivers()
> console_unlock()
> console_flush_on_panic()
> bust_spinlocks(1) // oops_in_progress++
> panic()
> <NMI/>
> spin_lock_irqsave(&port->lock, flags) // spin_lock_irqsave()
> serial8250_console_write()
> call_console_drivers()
> console_unlock()
> printk()
> ...
>
> However, this does not happen and we deadlock in serial console on
> port->lock spinlock. And the problem is that console_flush_on_panic()
> called after bust_spinlocks(0):
>
> void panic(const char *fmt, ...)
> {
> bust_spinlocks(1);
> ...
> bust_spinlocks(0);
> console_flush_on_panic();
> ...
> }
>
> bust_spinlocks(0) decrements oops_in_progress, so oops_in_progress
> can go back to zero. Thus even re-entrant console drivers will simply
> spin on port->lock spinlock. Given that port->lock may already be
> locked either by a stopped CPU, or by the very same CPU we execute
> panic() on (for instance, NMI panic() on printing CPU) the system
> deadlocks and does not reboot.
>
> Fix this by removing bust_spinlocks(0), so oops_in_progress is always
> set in panic() now and, thus, re-entrant console drivers will trylock
> the port->lock instead of spinning on it forever, when we call them
> from console_flush_on_panic().
>
> Signed-off-by: Sergey Senozhatsky <[email protected]>

The patch makes sense to me. The locks should stay busted also for
console_flush_on_panic().

With the added #include <linux/vt_kern.h>:

Reviewed-by: Petr Mladek <[email protected]>

Best Regards,
Petr

2018-11-01 01:52:19

by Sergey Senozhatsky

[permalink] [raw]
Subject: Re: [PATCHv3] panic: avoid deadlocks in re-entrant console drivers

On (10/31/18 13:27), Petr Mladek wrote:
> >
> > Signed-off-by: Sergey Senozhatsky <[email protected]>
>
> The patch makes sense to me. The locks should stay busted also for
> console_flush_on_panic().
>
> With the added #include <linux/vt_kern.h>:
>
> Reviewed-by: Petr Mladek <[email protected]>

Thanks!

Since there are no objections - how shall we route it? Via printk tree?
I'd also prefer this patch to be in -stable, it fixes a real issue after
all.

-ss

2018-11-01 08:09:08

by Petr Mladek

[permalink] [raw]
Subject: Re: [PATCHv3] panic: avoid deadlocks in re-entrant console drivers

On Thu 2018-11-01 10:48:21, Sergey Senozhatsky wrote:
> On (10/31/18 13:27), Petr Mladek wrote:
> > >
> > > Signed-off-by: Sergey Senozhatsky <[email protected]>
> >
> > The patch makes sense to me. The locks should stay busted also for
> > console_flush_on_panic().
> >
> > With the added #include <linux/vt_kern.h>:
> >
> > Reviewed-by: Petr Mladek <[email protected]>
>
> Thanks!
>
> Since there are no objections - how shall we route it? Via printk tree?

Good question. OK, I am going to put it into printk.git unless I hear
complains withing next couple of days.


> I'd also prefer this patch to be in -stable, it fixes a real issue after
> all.

It makes sense.

Best Regards,
Petr

2018-11-23 16:37:52

by Petr Mladek

[permalink] [raw]
Subject: Re: [PATCHv3] panic: avoid deadlocks in re-entrant console drivers

On Thu 2018-11-01 09:08:08, Petr Mladek wrote:
> On Thu 2018-11-01 10:48:21, Sergey Senozhatsky wrote:
> > On (10/31/18 13:27), Petr Mladek wrote:
> > > >
> > > > Signed-off-by: Sergey Senozhatsky <[email protected]>
> > >
> > > The patch makes sense to me. The locks should stay busted also for
> > > console_flush_on_panic().
> > >
> > > With the added #include <linux/vt_kern.h>:
> > >
> > > Reviewed-by: Petr Mladek <[email protected]>
> >
> > Thanks!
> >
> > Since there are no objections - how shall we route it? Via printk tree?
>
> Good question. OK, I am going to put it into printk.git unless I hear
> complains withing next couple of days.

I have pushed this into printk.git, branch for-4.21.

Best Regards,
Petr

2018-12-08 03:14:23

by Sergey Senozhatsky

[permalink] [raw]
Subject: Re: [RFC][PATCHv2 3/4] serial: introduce uart_port locking helpers

On (10/16/18 14:04), Sergey Senozhatsky wrote:
[..]
> - The first entry point is console ->write() callback, which we call
> from printk(). A possible deadlock scenario there is:
>
> CPU0
> <NMI>
> spin_lock_irqsave(&port->lock, flags) << deadlock
> serial_foo_write()
> call_console_drivers()
> console_unlock()
> console_flush_on_panic()
> panic()
> <NMI/>
> spin_lock_irqsave(&port->lock, flags)
> serial_foo_write()
> call_console_drivers()
> console_unlock()
> printk()
> ...

[..]
> - The rest (of entry points) requires a bit different handling.
> Let's take a look at the following backtrace:
>
> CPU0
> <IRQ>
> spin_lock_irqsave(&port->lock, flags) << deadlock
> serial_foo_write()
> call_console_drivers()
> console_unlock()
> printk()
> __queue_work()
> tty_flip_buffer_push()
> spin_lock_irqsave(&port->lock, flags)
> serial_foo_handle_IRQ()
> <IRQ/>
>
> Serial drivers invoke tons of core kernel functions - WQ, MM, etc. All
> of which may printk() in various cases. So we can't really just
> "remove those printk-s". The simples way to address this seems to be
> PRINTK_SAFE_CONTEXT_MASK.

serial/UART and printk guys, sorry to bother you, do you hate this
idea of removing console_driver->CORE KERNEL->printk->console_driver
deadlock path? Or is there any chance we can move forward?

-ss

2018-12-12 00:54:45

by Daniel Wang

[permalink] [raw]
Subject: Re: [PATCHv3] panic: avoid deadlocks in re-entrant console drivers

Is it okay to tag this commit with `Cc: [email protected]` so
that it'll get applied to the stable trees once merged into Linux's
tree, if it's not too late? Otherwise I'll follow up on the stable
merges separately. Thanks for making the changes anyway.

On Thu, Nov 22, 2018 at 5:12 AM Petr Mladek <[email protected]> wrote:
>
> On Thu 2018-11-01 09:08:08, Petr Mladek wrote:
> > On Thu 2018-11-01 10:48:21, Sergey Senozhatsky wrote:
> > > On (10/31/18 13:27), Petr Mladek wrote:
> > > > >
> > > > > Signed-off-by: Sergey Senozhatsky <[email protected]>
> > > >
> > > > The patch makes sense to me. The locks should stay busted also for
> > > > console_flush_on_panic().
> > > >
> > > > With the added #include <linux/vt_kern.h>:
> > > >
> > > > Reviewed-by: Petr Mladek <[email protected]>
> > >
> > > Thanks!
> > >
> > > Since there are no objections - how shall we route it? Via printk tree?
> >
> > Good question. OK, I am going to put it into printk.git unless I hear
> > complains withing next couple of days.
>
> I have pushed this into printk.git, branch for-4.21.
>
> Best Regards,
> Petr



--
Best,
Daniel


Attachments:
smime.p7s (4.73 kB)
S/MIME Cryptographic Signature

2018-12-12 05:24:49

by Sergey Senozhatsky

[permalink] [raw]
Subject: Re: [PATCHv3] panic: avoid deadlocks in re-entrant console drivers

On (12/11/18 16:53), Daniel Wang wrote:
> Is it okay to tag this commit with `Cc: [email protected]` so
> that it'll get applied to the stable trees once merged into Linux's
> tree, if it's not too late? Otherwise I'll follow up on the stable
> merges separately. Thanks for making the changes anyway.

My bad! I should have Cc-ed stable on that one, sorry. I guess now it
would be easier to wait and follow up once it gets merged.

-ss

2018-12-12 06:00:39

by Daniel Wang

[permalink] [raw]
Subject: Re: [PATCHv3] panic: avoid deadlocks in re-entrant console drivers

No worries. I will follow up. You would recommend that all four
patches in this set to be backported though, right?

On Tue, Dec 11, 2018 at 9:23 PM Sergey Senozhatsky
<[email protected]> wrote:
>
> On (12/11/18 16:53), Daniel Wang wrote:
> > Is it okay to tag this commit with `Cc: [email protected]` so
> > that it'll get applied to the stable trees once merged into Linux's
> > tree, if it's not too late? Otherwise I'll follow up on the stable
> > merges separately. Thanks for making the changes anyway.
>
> My bad! I should have Cc-ed stable on that one, sorry. I guess now it
> would be easier to wait and follow up once it gets merged.
>
> -ss



--
Best,
Daniel


Attachments:
smime.p7s (4.74 kB)
S/MIME Cryptographic Signature

2018-12-12 06:07:44

by Sergey Senozhatsky

[permalink] [raw]
Subject: Re: [PATCHv3] panic: avoid deadlocks in re-entrant console drivers

On (12/11/18 21:59), Daniel Wang wrote:
> No worries. I will follow up. You would recommend that all four
> patches in this set to be backported though, right?

Just the last one, which makes consoles re-entrant

panic: avoid deadlocks in re-entrant console drivers

Because only this one was applied so far. The reset got stuck.

-ss

2018-12-12 06:10:49

by Daniel Wang

[permalink] [raw]
Subject: Re: [PATCHv3] panic: avoid deadlocks in re-entrant console drivers

Got it. Thank you.

On Tue, Dec 11, 2018 at 10:06 PM Sergey Senozhatsky
<[email protected]> wrote:
>
> On (12/11/18 21:59), Daniel Wang wrote:
> > No worries. I will follow up. You would recommend that all four
> > patches in this set to be backported though, right?
>
> Just the last one, which makes consoles re-entrant
>
> panic: avoid deadlocks in re-entrant console drivers
>
> Because only this one was applied so far. The reset got stuck.
>
> -ss



--
Best,
Daniel


Attachments:
smime.p7s (4.73 kB)
S/MIME Cryptographic Signature

2018-12-12 11:09:46

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [RFC][PATCHv2 3/4] serial: introduce uart_port locking helpers

On Sat, Dec 08, 2018 at 12:12:49PM +0900, Sergey Senozhatsky wrote:
> On (10/16/18 14:04), Sergey Senozhatsky wrote:
> [..]
> > - The first entry point is console ->write() callback, which we call
> > from printk(). A possible deadlock scenario there is:
> >
> > CPU0
> > <NMI>
> > spin_lock_irqsave(&port->lock, flags) << deadlock
> > serial_foo_write()
> > call_console_drivers()
> > console_unlock()
> > console_flush_on_panic()
> > panic()
> > <NMI/>
> > spin_lock_irqsave(&port->lock, flags)
> > serial_foo_write()
> > call_console_drivers()
> > console_unlock()
> > printk()
> > ...
>
> [..]
> > - The rest (of entry points) requires a bit different handling.
> > Let's take a look at the following backtrace:
> >
> > CPU0
> > <IRQ>
> > spin_lock_irqsave(&port->lock, flags) << deadlock
> > serial_foo_write()
> > call_console_drivers()
> > console_unlock()
> > printk()
> > __queue_work()
> > tty_flip_buffer_push()
> > spin_lock_irqsave(&port->lock, flags)
> > serial_foo_handle_IRQ()
> > <IRQ/>
> >
> > Serial drivers invoke tons of core kernel functions - WQ, MM, etc. All
> > of which may printk() in various cases. So we can't really just
> > "remove those printk-s". The simples way to address this seems to be
> > PRINTK_SAFE_CONTEXT_MASK.
>
> serial/UART and printk guys, sorry to bother you, do you hate this
> idea of removing console_driver->CORE KERNEL->printk->console_driver
> deadlock path? Or is there any chance we can move forward?

If done in a sane manner, no objection from me.