2020-08-14 01:39:58

by Sergey Senozhatsky

Subject: [PATCH] uart:8250: change lock order in serial8250_do_startup()

We have a number of "uart.port->desc.lock vs desc.lock->uart.port"
lockdep reports coming from 8250 driver; this causes a bit of trouble
to people, so let's fix it.

The problem is reverse lock order in two different call paths:

chain #1:

serial8250_do_startup()
 spin_lock_irqsave(&port->lock);
  disable_irq_nosync(port->irq);
   raw_spin_lock_irqsave(&desc->lock)

chain #2:

__report_bad_irq()
 raw_spin_lock_irqsave(&desc->lock)
  for_each_action_of_desc()
   printk()
    spin_lock_irqsave(&port->lock);

Fix this by changing the order of locks in serial8250_do_startup():
do disable_irq_nosync() first, which grabs desc->lock, and grab
port->lock after that, so that chain #1 and chain #2 have the same
lock order.
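
The inversion can be illustrated outside the kernel. The following is a minimal userspace sketch, not kernel code and not part of the patch: it records "lock B taken while lock A is held" edges, roughly the way lockdep reasons about ordering, and rejects an edge whose reverse is already known. The names order_graph, record_order() and chains_consistent() are hypothetical and exist only for this illustration.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical lock classes, named after the two locks in the report. */
enum lock_class { PORT_LOCK, DESC_LOCK, NR_LOCKS };

/* dep[a][b] is true once lock b has been taken while lock a was held. */
struct order_graph {
	bool dep[NR_LOCKS][NR_LOCKS];
};

static void graph_init(struct order_graph *g)
{
	memset(g, 0, sizeof(*g));
}

/* Is 'to' reachable from 'from' by following recorded dependencies? */
static bool reachable(const struct order_graph *g, int from, int to, bool *seen)
{
	if (from == to)
		return true;
	seen[from] = true;
	for (int next = 0; next < NR_LOCKS; next++)
		if (g->dep[from][next] && !seen[next] &&
		    reachable(g, next, to, seen))
			return true;
	return false;
}

/*
 * Record "inner taken while outer is held". Returns false when the
 * reverse order is already known, i.e. an ABBA inversion.
 */
static bool record_order(struct order_graph *g, int outer, int inner)
{
	bool seen[NR_LOCKS] = { false };

	if (reachable(g, inner, outer, seen))
		return false;
	g->dep[outer][inner] = true;
	return true;
}

/* Replay both chains; returns true when their lock orders agree. */
static bool chains_consistent(bool startup_takes_desc_first)
{
	struct order_graph g;
	bool ok;

	graph_init(&g);

	/* chain #2: desc->lock is held, printk() then takes port->lock */
	ok = record_order(&g, DESC_LOCK, PORT_LOCK);

	/* chain #1: serial8250_do_startup(), after or before the change */
	if (startup_takes_desc_first)
		ok = record_order(&g, DESC_LOCK, PORT_LOCK) && ok;
	else
		ok = record_order(&g, PORT_LOCK, DESC_LOCK) && ok;

	return ok;
}
```

In this model chains_consistent(false) corresponds to the current code, where chain #1 contradicts chain #2, and chains_consistent(true) corresponds to the order after the change. Real lockdep tracks far more state (IRQ context, lock classes, longer chains); the sketch only captures the two-lock case described above.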

Full lockdep splat:

======================================================
WARNING: possible circular locking dependency detected
5.4.39 #55 Not tainted
------------------------------------------------------
swapper/0/0 is trying to acquire lock:
ffffffffab65b6c0 (console_owner){-...}, at: console_lock_spinning_enable+0x31/0x57

but task is already holding lock:
ffff88810a8e34c0 (&irq_desc_lock_class){-.-.}, at: __report_bad_irq+0x5b/0xba

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #2 (&irq_desc_lock_class){-.-.}:
_raw_spin_lock_irqsave+0x61/0x8d
__irq_get_desc_lock+0x65/0x89
__disable_irq_nosync+0x3b/0x93
serial8250_do_startup+0x451/0x75c
uart_startup+0x1b4/0x2ff
uart_port_activate+0x73/0xa0
tty_port_open+0xae/0x10a
uart_open+0x1b/0x26
tty_open+0x24d/0x3a0
chrdev_open+0xd5/0x1cc
do_dentry_open+0x299/0x3c8
path_openat+0x434/0x1100
do_filp_open+0x9b/0x10a
do_sys_open+0x15f/0x3d7
kernel_init_freeable+0x157/0x1dd
kernel_init+0xe/0x105
ret_from_fork+0x27/0x50

-> #1 (&port_lock_key){-.-.}:
_raw_spin_lock_irqsave+0x61/0x8d
serial8250_console_write+0xa7/0x2a0
console_unlock+0x3b7/0x528
vprintk_emit+0x111/0x17f
printk+0x59/0x73
register_console+0x336/0x3a4
uart_add_one_port+0x51b/0x5be
serial8250_register_8250_port+0x454/0x55e
dw8250_probe+0x4dc/0x5b9
platform_drv_probe+0x67/0x8b
really_probe+0x14a/0x422
driver_probe_device+0x66/0x130
device_driver_attach+0x42/0x5b
__driver_attach+0xca/0x139
bus_for_each_dev+0x97/0xc9
bus_add_driver+0x12b/0x228
driver_register+0x64/0xed
do_one_initcall+0x20c/0x4a6
do_initcall_level+0xb5/0xc5
do_basic_setup+0x4c/0x58
kernel_init_freeable+0x13f/0x1dd
kernel_init+0xe/0x105
ret_from_fork+0x27/0x50

-> #0 (console_owner){-...}:
__lock_acquire+0x118d/0x2714
lock_acquire+0x203/0x258
console_lock_spinning_enable+0x51/0x57
console_unlock+0x25d/0x528
vprintk_emit+0x111/0x17f
printk+0x59/0x73
__report_bad_irq+0xa3/0xba
note_interrupt+0x19a/0x1d6
handle_irq_event_percpu+0x57/0x79
handle_irq_event+0x36/0x55
handle_fasteoi_irq+0xc2/0x18a
do_IRQ+0xb3/0x157
ret_from_intr+0x0/0x1d
cpuidle_enter_state+0x12f/0x1fd
cpuidle_enter+0x2e/0x3d
do_idle+0x1ce/0x2ce
cpu_startup_entry+0x1d/0x1f
start_kernel+0x406/0x46a
secondary_startup_64+0xa4/0xb0

other info that might help us debug this:

Chain exists of:
console_owner --> &port_lock_key --> &irq_desc_lock_class

Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(&irq_desc_lock_class);
                               lock(&port_lock_key);
                               lock(&irq_desc_lock_class);
  lock(console_owner);

*** DEADLOCK ***

2 locks held by swapper/0/0:
#0: ffff88810a8e34c0 (&irq_desc_lock_class){-.-.}, at: __report_bad_irq+0x5b/0xba
#1: ffffffffab65b5c0 (console_lock){+.+.}, at: console_trylock_spinning+0x20/0x181

stack backtrace:
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.4.39 #55
Hardware name: XXXXXX
Call Trace:
<IRQ>
dump_stack+0xbf/0x133
? print_circular_bug+0xd6/0xe9
check_noncircular+0x1b9/0x1c3
__lock_acquire+0x118d/0x2714
lock_acquire+0x203/0x258
? console_lock_spinning_enable+0x31/0x57
console_lock_spinning_enable+0x51/0x57
? console_lock_spinning_enable+0x31/0x57
console_unlock+0x25d/0x528
? console_trylock+0x18/0x4e
vprintk_emit+0x111/0x17f
? lock_acquire+0x203/0x258
printk+0x59/0x73
__report_bad_irq+0xa3/0xba
note_interrupt+0x19a/0x1d6
handle_irq_event_percpu+0x57/0x79
handle_irq_event+0x36/0x55
handle_fasteoi_irq+0xc2/0x18a
do_IRQ+0xb3/0x157
common_interrupt+0xf/0xf
</IRQ>

Signed-off-by: Sergey Senozhatsky <[email protected]>
---
drivers/tty/serial/8250/8250_port.c | 11 +++++++----
1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/drivers/tty/serial/8250/8250_port.c b/drivers/tty/serial/8250/8250_port.c
index 09475695effd..67f1a4f31093 100644
--- a/drivers/tty/serial/8250/8250_port.c
+++ b/drivers/tty/serial/8250/8250_port.c
@@ -2275,6 +2275,11 @@ int serial8250_do_startup(struct uart_port *port)

 	if (port->irq && !(up->port.flags & UPF_NO_THRE_TEST)) {
 		unsigned char iir1;
+		bool irq_shared = up->port.irqflags & IRQF_SHARED;
+
+		if (irq_shared)
+			disable_irq_nosync(port->irq);
+
 		/*
 		 * Test for UARTs that do not reassert THRE when the
 		 * transmitter is idle and the interrupt has already
@@ -2284,8 +2289,6 @@ int serial8250_do_startup(struct uart_port *port)
 		 * allow register changes to become visible.
 		 */
 		spin_lock_irqsave(&port->lock, flags);
-		if (up->port.irqflags & IRQF_SHARED)
-			disable_irq_nosync(port->irq);

 		wait_for_xmitr(up, UART_LSR_THRE);
 		serial_port_out_sync(port, UART_IER, UART_IER_THRI);
@@ -2297,9 +2300,9 @@ int serial8250_do_startup(struct uart_port *port)
 		iir = serial_port_in(port, UART_IIR);
 		serial_port_out(port, UART_IER, 0);

-		if (port->irqflags & IRQF_SHARED)
-			enable_irq(port->irq);
 		spin_unlock_irqrestore(&port->lock, flags);
+		if (irq_shared)
+			enable_irq(port->irq);

 		/*
 		 * If the interrupt is not reasserted, or we otherwise
--
2.28.0


2020-08-14 01:55:22

by Guenter Roeck

Subject: Re: [PATCH] uart:8250: change lock order in serial8250_do_startup()

On Fri, Aug 14, 2020 at 10:38:02AM +0900, Sergey Senozhatsky wrote:
> We have a number of "uart.port->desc.lock vs desc.lock->uart.port"
> lockdep reports coming from 8250 driver; this causes a bit of trouble
> to people, so let's fix it.
>
> The problem is reverse lock order in two different call paths:
>
> chain #1:
>
> serial8250_do_startup()
>  spin_lock_irqsave(&port->lock);
>   disable_irq_nosync(port->irq);
>    raw_spin_lock_irqsave(&desc->lock)
>
> chain #2:
>
> __report_bad_irq()
>  raw_spin_lock_irqsave(&desc->lock)
>   for_each_action_of_desc()
>    printk()
>     spin_lock_irqsave(&port->lock);
>
> Fix this by changing the order of locks in serial8250_do_startup():
> do disable_irq_nosync() first, which grabs desc->lock, and grab
> port->lock after that, so that chain #1 and chain #2 have the same
> lock order.
>
> Full lockdep splat:
>
> ======================================================
> WARNING: possible circular locking dependency detected
> 5.4.39 #55 Not tainted
> ------------------------------------------------------
> swapper/0/0 is trying to acquire lock:
> ffffffffab65b6c0 (console_owner){-...}, at: console_lock_spinning_enable+0x31/0x57
>
> but task is already holding lock:
> ffff88810a8e34c0 (&irq_desc_lock_class){-.-.}, at: __report_bad_irq+0x5b/0xba
>
> which lock already depends on the new lock.
>
> the existing dependency chain (in reverse order) is:
>
> -> #2 (&irq_desc_lock_class){-.-.}:
> _raw_spin_lock_irqsave+0x61/0x8d
> __irq_get_desc_lock+0x65/0x89
> __disable_irq_nosync+0x3b/0x93
> serial8250_do_startup+0x451/0x75c
> uart_startup+0x1b4/0x2ff
> uart_port_activate+0x73/0xa0
> tty_port_open+0xae/0x10a
> uart_open+0x1b/0x26
> tty_open+0x24d/0x3a0
> chrdev_open+0xd5/0x1cc
> do_dentry_open+0x299/0x3c8
> path_openat+0x434/0x1100
> do_filp_open+0x9b/0x10a
> do_sys_open+0x15f/0x3d7
> kernel_init_freeable+0x157/0x1dd
> kernel_init+0xe/0x105
> ret_from_fork+0x27/0x50
>
> -> #1 (&port_lock_key){-.-.}:
> _raw_spin_lock_irqsave+0x61/0x8d
> serial8250_console_write+0xa7/0x2a0
> console_unlock+0x3b7/0x528
> vprintk_emit+0x111/0x17f
> printk+0x59/0x73
> register_console+0x336/0x3a4
> uart_add_one_port+0x51b/0x5be
> serial8250_register_8250_port+0x454/0x55e
> dw8250_probe+0x4dc/0x5b9
> platform_drv_probe+0x67/0x8b
> really_probe+0x14a/0x422
> driver_probe_device+0x66/0x130
> device_driver_attach+0x42/0x5b
> __driver_attach+0xca/0x139
> bus_for_each_dev+0x97/0xc9
> bus_add_driver+0x12b/0x228
> driver_register+0x64/0xed
> do_one_initcall+0x20c/0x4a6
> do_initcall_level+0xb5/0xc5
> do_basic_setup+0x4c/0x58
> kernel_init_freeable+0x13f/0x1dd
> kernel_init+0xe/0x105
> ret_from_fork+0x27/0x50
>
> -> #0 (console_owner){-...}:
> __lock_acquire+0x118d/0x2714
> lock_acquire+0x203/0x258
> console_lock_spinning_enable+0x51/0x57
> console_unlock+0x25d/0x528
> vprintk_emit+0x111/0x17f
> printk+0x59/0x73
> __report_bad_irq+0xa3/0xba
> note_interrupt+0x19a/0x1d6
> handle_irq_event_percpu+0x57/0x79
> handle_irq_event+0x36/0x55
> handle_fasteoi_irq+0xc2/0x18a
> do_IRQ+0xb3/0x157
> ret_from_intr+0x0/0x1d
> cpuidle_enter_state+0x12f/0x1fd
> cpuidle_enter+0x2e/0x3d
> do_idle+0x1ce/0x2ce
> cpu_startup_entry+0x1d/0x1f
> start_kernel+0x406/0x46a
> secondary_startup_64+0xa4/0xb0
>
> other info that might help us debug this:
>
> Chain exists of:
> console_owner --> &port_lock_key --> &irq_desc_lock_class
>
> Possible unsafe locking scenario:
>
>        CPU0                    CPU1
>        ----                    ----
>   lock(&irq_desc_lock_class);
>                                lock(&port_lock_key);
>                                lock(&irq_desc_lock_class);
>   lock(console_owner);
>
> *** DEADLOCK ***
>
> 2 locks held by swapper/0/0:
> #0: ffff88810a8e34c0 (&irq_desc_lock_class){-.-.}, at: __report_bad_irq+0x5b/0xba
> #1: ffffffffab65b5c0 (console_lock){+.+.}, at: console_trylock_spinning+0x20/0x181
>
> stack backtrace:
> CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.4.39 #55
> Hardware name: XXXXXX
> Call Trace:
> <IRQ>
> dump_stack+0xbf/0x133
> ? print_circular_bug+0xd6/0xe9
> check_noncircular+0x1b9/0x1c3
> __lock_acquire+0x118d/0x2714
> lock_acquire+0x203/0x258
> ? console_lock_spinning_enable+0x31/0x57
> console_lock_spinning_enable+0x51/0x57
> ? console_lock_spinning_enable+0x31/0x57
> console_unlock+0x25d/0x528
> ? console_trylock+0x18/0x4e
> vprintk_emit+0x111/0x17f
> ? lock_acquire+0x203/0x258
> printk+0x59/0x73
> __report_bad_irq+0xa3/0xba
> note_interrupt+0x19a/0x1d6
> handle_irq_event_percpu+0x57/0x79
> handle_irq_event+0x36/0x55
> handle_fasteoi_irq+0xc2/0x18a
> do_IRQ+0xb3/0x157
> common_interrupt+0xf/0xf
> </IRQ>
>
> Signed-off-by: Sergey Senozhatsky <[email protected]>

For the time being:

Reviewed-by: Guenter Roeck <[email protected]>

I triggered a complete set of test runs on chromeos-{4.4,4.14,4.19,5.4}.
I'll send an update, hopefully by tomorrow morning, with results.

Thanks,
Guenter

> ---
> drivers/tty/serial/8250/8250_port.c | 11 +++++++----
> 1 file changed, 7 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/tty/serial/8250/8250_port.c b/drivers/tty/serial/8250/8250_port.c
> index 09475695effd..67f1a4f31093 100644
> --- a/drivers/tty/serial/8250/8250_port.c
> +++ b/drivers/tty/serial/8250/8250_port.c
> @@ -2275,6 +2275,11 @@ int serial8250_do_startup(struct uart_port *port)
>
> if (port->irq && !(up->port.flags & UPF_NO_THRE_TEST)) {
> unsigned char iir1;
> + bool irq_shared = up->port.irqflags & IRQF_SHARED;
> +
> + if (irq_shared)
> + disable_irq_nosync(port->irq);
> +
> /*
> * Test for UARTs that do not reassert THRE when the
> * transmitter is idle and the interrupt has already
> @@ -2284,8 +2289,6 @@ int serial8250_do_startup(struct uart_port *port)
> * allow register changes to become visible.
> */
> spin_lock_irqsave(&port->lock, flags);
> - if (up->port.irqflags & IRQF_SHARED)
> - disable_irq_nosync(port->irq);
>
> wait_for_xmitr(up, UART_LSR_THRE);
> serial_port_out_sync(port, UART_IER, UART_IER_THRI);
> @@ -2297,9 +2300,9 @@ int serial8250_do_startup(struct uart_port *port)
> iir = serial_port_in(port, UART_IIR);
> serial_port_out(port, UART_IER, 0);
>
> - if (port->irqflags & IRQF_SHARED)
> - enable_irq(port->irq);
> spin_unlock_irqrestore(&port->lock, flags);
> + if (irq_shared)
> + enable_irq(port->irq);
>
> /*
> * If the interrupt is not reasserted, or we otherwise
> --
> 2.28.0
>

2020-08-14 11:31:39

by Andy Shevchenko

Subject: Re: [PATCH] uart:8250: change lock order in serial8250_do_startup()

On Fri, Aug 14, 2020 at 10:38:02AM +0900, Sergey Senozhatsky wrote:
> We have a number of "uart.port->desc.lock vs desc.lock->uart.port"
> lockdep reports coming from 8250 driver; this causes a bit of trouble
> to people, so let's fix it.
>
> The problem is reverse lock order in two different call paths:
>
> chain #1:
>
> serial8250_do_startup()
>  spin_lock_irqsave(&port->lock);
>   disable_irq_nosync(port->irq);
>    raw_spin_lock_irqsave(&desc->lock)
>
> chain #2:
>
> __report_bad_irq()
>  raw_spin_lock_irqsave(&desc->lock)
>   for_each_action_of_desc()
>    printk()
>     spin_lock_irqsave(&port->lock);
>
> Fix this by changing the order of locks in serial8250_do_startup():
> do disable_irq_nosync() first, which grabs desc->lock, and grab
> port->lock after that, so that chain #1 and chain #2 have the same
> lock order.
>
> Full lockdep splat:
>
> ======================================================
> WARNING: possible circular locking dependency detected
> 5.4.39 #55 Not tainted
> ------------------------------------------------------
> swapper/0/0 is trying to acquire lock:
> ffffffffab65b6c0 (console_owner){-...}, at: console_lock_spinning_enable+0x31/0x57
>
> but task is already holding lock:
> ffff88810a8e34c0 (&irq_desc_lock_class){-.-.}, at: __report_bad_irq+0x5b/0xba
>
> which lock already depends on the new lock.
>
> the existing dependency chain (in reverse order) is:
>
> -> #2 (&irq_desc_lock_class){-.-.}:
> _raw_spin_lock_irqsave+0x61/0x8d
> __irq_get_desc_lock+0x65/0x89
> __disable_irq_nosync+0x3b/0x93
> serial8250_do_startup+0x451/0x75c
> uart_startup+0x1b4/0x2ff
> uart_port_activate+0x73/0xa0
> tty_port_open+0xae/0x10a
> uart_open+0x1b/0x26
> tty_open+0x24d/0x3a0
> chrdev_open+0xd5/0x1cc
> do_dentry_open+0x299/0x3c8
> path_openat+0x434/0x1100
> do_filp_open+0x9b/0x10a
> do_sys_open+0x15f/0x3d7
> kernel_init_freeable+0x157/0x1dd
> kernel_init+0xe/0x105
> ret_from_fork+0x27/0x50
>
> -> #1 (&port_lock_key){-.-.}:
> _raw_spin_lock_irqsave+0x61/0x8d
> serial8250_console_write+0xa7/0x2a0
> console_unlock+0x3b7/0x528
> vprintk_emit+0x111/0x17f
> printk+0x59/0x73
> register_console+0x336/0x3a4
> uart_add_one_port+0x51b/0x5be
> serial8250_register_8250_port+0x454/0x55e
> dw8250_probe+0x4dc/0x5b9
> platform_drv_probe+0x67/0x8b
> really_probe+0x14a/0x422
> driver_probe_device+0x66/0x130
> device_driver_attach+0x42/0x5b
> __driver_attach+0xca/0x139
> bus_for_each_dev+0x97/0xc9
> bus_add_driver+0x12b/0x228
> driver_register+0x64/0xed
> do_one_initcall+0x20c/0x4a6
> do_initcall_level+0xb5/0xc5
> do_basic_setup+0x4c/0x58
> kernel_init_freeable+0x13f/0x1dd
> kernel_init+0xe/0x105
> ret_from_fork+0x27/0x50
>
> -> #0 (console_owner){-...}:
> __lock_acquire+0x118d/0x2714
> lock_acquire+0x203/0x258
> console_lock_spinning_enable+0x51/0x57
> console_unlock+0x25d/0x528
> vprintk_emit+0x111/0x17f
> printk+0x59/0x73
> __report_bad_irq+0xa3/0xba
> note_interrupt+0x19a/0x1d6
> handle_irq_event_percpu+0x57/0x79
> handle_irq_event+0x36/0x55
> handle_fasteoi_irq+0xc2/0x18a
> do_IRQ+0xb3/0x157
> ret_from_intr+0x0/0x1d
> cpuidle_enter_state+0x12f/0x1fd
> cpuidle_enter+0x2e/0x3d
> do_idle+0x1ce/0x2ce
> cpu_startup_entry+0x1d/0x1f
> start_kernel+0x406/0x46a
> secondary_startup_64+0xa4/0xb0
>
> other info that might help us debug this:
>
> Chain exists of:
> console_owner --> &port_lock_key --> &irq_desc_lock_class
>
> Possible unsafe locking scenario:
>
>        CPU0                    CPU1
>        ----                    ----
>   lock(&irq_desc_lock_class);
>                                lock(&port_lock_key);
>                                lock(&irq_desc_lock_class);
>   lock(console_owner);
>
> *** DEADLOCK ***
>
> 2 locks held by swapper/0/0:
> #0: ffff88810a8e34c0 (&irq_desc_lock_class){-.-.}, at: __report_bad_irq+0x5b/0xba
> #1: ffffffffab65b5c0 (console_lock){+.+.}, at: console_trylock_spinning+0x20/0x181
>
> stack backtrace:
> CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.4.39 #55
> Hardware name: XXXXXX
> Call Trace:
> <IRQ>
> dump_stack+0xbf/0x133
> ? print_circular_bug+0xd6/0xe9
> check_noncircular+0x1b9/0x1c3
> __lock_acquire+0x118d/0x2714
> lock_acquire+0x203/0x258
> ? console_lock_spinning_enable+0x31/0x57
> console_lock_spinning_enable+0x51/0x57
> ? console_lock_spinning_enable+0x31/0x57
> console_unlock+0x25d/0x528
> ? console_trylock+0x18/0x4e
> vprintk_emit+0x111/0x17f
> ? lock_acquire+0x203/0x258
> printk+0x59/0x73
> __report_bad_irq+0xa3/0xba
> note_interrupt+0x19a/0x1d6
> handle_irq_event_percpu+0x57/0x79
> handle_irq_event+0x36/0x55
> handle_fasteoi_irq+0xc2/0x18a
> do_IRQ+0xb3/0x157
> common_interrupt+0xf/0xf
> </IRQ>

I guess we may add some tags here

Fixes: 768aec0b5bcc ("serial: 8250: fix shared interrupts issues with SMP and RT kernels")
Reported-by: Guenter Roeck <[email protected]>
Reported-by: Raul Rangel <[email protected]>
BugLink: https://bugs.chromium.org/p/chromium/issues/detail?id=1114800
Link: https://lore.kernel.org/lkml/CAHQZ30BnfX+gxjPm1DUd5psOTqbyDh4EJE=2=VAMW_VDafctkA@mail.gmail.com/T/#u

The tags above and the comment below are nit-picks; after addressing them,
Reviewed-by: Andy Shevchenko <[email protected]>

Thanks!

> Signed-off-by: Sergey Senozhatsky <[email protected]>
> ---
> drivers/tty/serial/8250/8250_port.c | 11 +++++++----
> 1 file changed, 7 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/tty/serial/8250/8250_port.c b/drivers/tty/serial/8250/8250_port.c
> index 09475695effd..67f1a4f31093 100644
> --- a/drivers/tty/serial/8250/8250_port.c
> +++ b/drivers/tty/serial/8250/8250_port.c
> @@ -2275,6 +2275,11 @@ int serial8250_do_startup(struct uart_port *port)
>
> if (port->irq && !(up->port.flags & UPF_NO_THRE_TEST)) {
> unsigned char iir1;

> + bool irq_shared = up->port.irqflags & IRQF_SHARED;

I'm wondering why we need a temporary variable? This flag is not supposed to be
changed in between, can we leave original conditionals?

Nevertheless, I noticed an inconsistency in how the flags are dereferenced, which
seems to have been introduced by dfe42443ea1d ("serial: reduce number of
indirections in 8250 code").

I think we can stick with newer:

if (port->irqflags & IRQF_SHARED)
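
As a rough illustration of the call order this gives with the plain conditionals, here is a userspace mock. It is not the v2 patch: struct mock_port, step(), thre_test_mock() and run_mock() are hypothetical stand-ins, and the kernel calls are reduced to names appended to a trace string.

```c
#include <string.h>

#define IRQF_SHARED 0x00000080	/* only tested as a bit in this mock */

/* Hypothetical stand-in for struct uart_port: just the fields used here. */
struct mock_port {
	unsigned int irq;
	unsigned long irqflags;
};

/* Ordered record of the steps taken by the last run. */
static char trace[128];

/* Append one step name, followed by ';', without overflowing the buffer. */
static void step(const char *name)
{
	strncat(trace, name, sizeof(trace) - strlen(trace) - 1);
	strncat(trace, ";", sizeof(trace) - strlen(trace) - 1);
}

/* Mock of the THRE test block with desc->lock taken before port->lock. */
static void thre_test_mock(const struct mock_port *port)
{
	trace[0] = '\0';

	if (port->irqflags & IRQF_SHARED)
		step("disable_irq_nosync");	/* takes desc->lock internally */

	step("lock port");			/* spin_lock_irqsave(&port->lock) */
	step("probe THRE");
	step("unlock port");			/* spin_unlock_irqrestore() */

	if (port->irqflags & IRQF_SHARED)
		step("enable_irq");
}

/* Convenience wrapper: run the mock for the given flags, return the trace. */
static const char *run_mock(unsigned long irqflags)
{
	struct mock_port port = { .irq = 4, .irqflags = irqflags };

	thre_test_mock(&port);
	return trace;
}
```

The flag is read twice instead of being cached, which is fine as long as it does not change in between, and both IRQ calls sit outside the port lock.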

> +
> + if (irq_shared)
> + disable_irq_nosync(port->irq);
> +
> /*
> * Test for UARTs that do not reassert THRE when the
> * transmitter is idle and the interrupt has already
> @@ -2284,8 +2289,6 @@ int serial8250_do_startup(struct uart_port *port)
> * allow register changes to become visible.
> */
> spin_lock_irqsave(&port->lock, flags);
> - if (up->port.irqflags & IRQF_SHARED)
> - disable_irq_nosync(port->irq);
>
> wait_for_xmitr(up, UART_LSR_THRE);
> serial_port_out_sync(port, UART_IER, UART_IER_THRI);
> @@ -2297,9 +2300,9 @@ int serial8250_do_startup(struct uart_port *port)
> iir = serial_port_in(port, UART_IIR);
> serial_port_out(port, UART_IER, 0);
>
> - if (port->irqflags & IRQF_SHARED)
> - enable_irq(port->irq);
> spin_unlock_irqrestore(&port->lock, flags);
> + if (irq_shared)
> + enable_irq(port->irq);
>
> /*
> * If the interrupt is not reasserted, or we otherwise
> --
> 2.28.0
>

--
With Best Regards,
Andy Shevchenko


2020-08-14 11:42:40

by Sergey Senozhatsky

Subject: Re: [PATCH] uart:8250: change lock order in serial8250_do_startup()

On (20/08/14 12:59), Andy Shevchenko wrote:
> > ---
> > drivers/tty/serial/8250/8250_port.c | 11 +++++++----
> > 1 file changed, 7 insertions(+), 4 deletions(-)
> >
> > diff --git a/drivers/tty/serial/8250/8250_port.c b/drivers/tty/serial/8250/8250_port.c
> > index 09475695effd..67f1a4f31093 100644
> > --- a/drivers/tty/serial/8250/8250_port.c
> > +++ b/drivers/tty/serial/8250/8250_port.c
> > @@ -2275,6 +2275,11 @@ int serial8250_do_startup(struct uart_port *port)
> >
> > if (port->irq && !(up->port.flags & UPF_NO_THRE_TEST)) {
> > unsigned char iir1;
>
> > + bool irq_shared = up->port.irqflags & IRQF_SHARED;
>
> I'm wondering why we need a temporary variable? This flag is not supposed to be
> changed in between, can we leave original conditionals?

No particular reason. We can keep the original (long) ones, I guess.

> Nevertheless, I noticed an inconsistency in how the flags are dereferenced, which
> seems to have been introduced by dfe42443ea1d ("serial: reduce number of
> indirections in 8250 code").
>
> I think we can stick with newer:
>
> if (port->irqflags & IRQF_SHARED)

I'll take a look.

-ss

2020-08-14 14:49:20

by Guenter Roeck

Subject: Re: [PATCH] uart:8250: change lock order in serial8250_do_startup()

On 8/13/20 6:38 PM, Sergey Senozhatsky wrote:
> We have a number of "uart.port->desc.lock vs desc.lock->uart.port"
> lockdep reports coming from 8250 driver; this causes a bit of trouble
> to people, so let's fix it.
>
> The problem is reverse lock order in two different call paths:
>
> chain #1:
>
> serial8250_do_startup()
>  spin_lock_irqsave(&port->lock);
>   disable_irq_nosync(port->irq);
>    raw_spin_lock_irqsave(&desc->lock)
>
> chain #2:
>
> __report_bad_irq()
>  raw_spin_lock_irqsave(&desc->lock)
>   for_each_action_of_desc()
>    printk()
>     spin_lock_irqsave(&port->lock);
>
> Fix this by changing the order of locks in serial8250_do_startup():
> do disable_irq_nosync() first, which grabs desc->lock, and grab
> port->lock after that, so that chain #1 and chain #2 have the same
> lock order.
>
> Full lockdep splat:
>
> ======================================================
> WARNING: possible circular locking dependency detected
> 5.4.39 #55 Not tainted
> ------------------------------------------------------
> swapper/0/0 is trying to acquire lock:
> ffffffffab65b6c0 (console_owner){-...}, at: console_lock_spinning_enable+0x31/0x57
>
> but task is already holding lock:
> ffff88810a8e34c0 (&irq_desc_lock_class){-.-.}, at: __report_bad_irq+0x5b/0xba
>
> which lock already depends on the new lock.
>
> the existing dependency chain (in reverse order) is:
>
> -> #2 (&irq_desc_lock_class){-.-.}:
> _raw_spin_lock_irqsave+0x61/0x8d
> __irq_get_desc_lock+0x65/0x89
> __disable_irq_nosync+0x3b/0x93
> serial8250_do_startup+0x451/0x75c
> uart_startup+0x1b4/0x2ff
> uart_port_activate+0x73/0xa0
> tty_port_open+0xae/0x10a
> uart_open+0x1b/0x26
> tty_open+0x24d/0x3a0
> chrdev_open+0xd5/0x1cc
> do_dentry_open+0x299/0x3c8
> path_openat+0x434/0x1100
> do_filp_open+0x9b/0x10a
> do_sys_open+0x15f/0x3d7
> kernel_init_freeable+0x157/0x1dd
> kernel_init+0xe/0x105
> ret_from_fork+0x27/0x50
>
> -> #1 (&port_lock_key){-.-.}:
> _raw_spin_lock_irqsave+0x61/0x8d
> serial8250_console_write+0xa7/0x2a0
> console_unlock+0x3b7/0x528
> vprintk_emit+0x111/0x17f
> printk+0x59/0x73
> register_console+0x336/0x3a4
> uart_add_one_port+0x51b/0x5be
> serial8250_register_8250_port+0x454/0x55e
> dw8250_probe+0x4dc/0x5b9
> platform_drv_probe+0x67/0x8b
> really_probe+0x14a/0x422
> driver_probe_device+0x66/0x130
> device_driver_attach+0x42/0x5b
> __driver_attach+0xca/0x139
> bus_for_each_dev+0x97/0xc9
> bus_add_driver+0x12b/0x228
> driver_register+0x64/0xed
> do_one_initcall+0x20c/0x4a6
> do_initcall_level+0xb5/0xc5
> do_basic_setup+0x4c/0x58
> kernel_init_freeable+0x13f/0x1dd
> kernel_init+0xe/0x105
> ret_from_fork+0x27/0x50
>
> -> #0 (console_owner){-...}:
> __lock_acquire+0x118d/0x2714
> lock_acquire+0x203/0x258
> console_lock_spinning_enable+0x51/0x57
> console_unlock+0x25d/0x528
> vprintk_emit+0x111/0x17f
> printk+0x59/0x73
> __report_bad_irq+0xa3/0xba
> note_interrupt+0x19a/0x1d6
> handle_irq_event_percpu+0x57/0x79
> handle_irq_event+0x36/0x55
> handle_fasteoi_irq+0xc2/0x18a
> do_IRQ+0xb3/0x157
> ret_from_intr+0x0/0x1d
> cpuidle_enter_state+0x12f/0x1fd
> cpuidle_enter+0x2e/0x3d
> do_idle+0x1ce/0x2ce
> cpu_startup_entry+0x1d/0x1f
> start_kernel+0x406/0x46a
> secondary_startup_64+0xa4/0xb0
>
> other info that might help us debug this:
>
> Chain exists of:
> console_owner --> &port_lock_key --> &irq_desc_lock_class
>
> Possible unsafe locking scenario:
>
>        CPU0                    CPU1
>        ----                    ----
>   lock(&irq_desc_lock_class);
>                                lock(&port_lock_key);
>                                lock(&irq_desc_lock_class);
>   lock(console_owner);
>
> *** DEADLOCK ***
>
> 2 locks held by swapper/0/0:
> #0: ffff88810a8e34c0 (&irq_desc_lock_class){-.-.}, at: __report_bad_irq+0x5b/0xba
> #1: ffffffffab65b5c0 (console_lock){+.+.}, at: console_trylock_spinning+0x20/0x181
>
> stack backtrace:
> CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.4.39 #55
> Hardware name: XXXXXX
> Call Trace:
> <IRQ>
> dump_stack+0xbf/0x133
> ? print_circular_bug+0xd6/0xe9
> check_noncircular+0x1b9/0x1c3
> __lock_acquire+0x118d/0x2714
> lock_acquire+0x203/0x258
> ? console_lock_spinning_enable+0x31/0x57
> console_lock_spinning_enable+0x51/0x57
> ? console_lock_spinning_enable+0x31/0x57
> console_unlock+0x25d/0x528
> ? console_trylock+0x18/0x4e
> vprintk_emit+0x111/0x17f
> ? lock_acquire+0x203/0x258
> printk+0x59/0x73
> __report_bad_irq+0xa3/0xba
> note_interrupt+0x19a/0x1d6
> handle_irq_event_percpu+0x57/0x79
> handle_irq_event+0x36/0x55
> handle_fasteoi_irq+0xc2/0x18a
> do_IRQ+0xb3/0x157
> common_interrupt+0xf/0xf
> </IRQ>
>
> Signed-off-by: Sergey Senozhatsky <[email protected]>

Tested-by: Guenter Roeck <[email protected]>

> ---
> drivers/tty/serial/8250/8250_port.c | 11 +++++++----
> 1 file changed, 7 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/tty/serial/8250/8250_port.c b/drivers/tty/serial/8250/8250_port.c
> index 09475695effd..67f1a4f31093 100644
> --- a/drivers/tty/serial/8250/8250_port.c
> +++ b/drivers/tty/serial/8250/8250_port.c
> @@ -2275,6 +2275,11 @@ int serial8250_do_startup(struct uart_port *port)
>
> if (port->irq && !(up->port.flags & UPF_NO_THRE_TEST)) {
> unsigned char iir1;
> + bool irq_shared = up->port.irqflags & IRQF_SHARED;
> +
> + if (irq_shared)
> + disable_irq_nosync(port->irq);
> +
> /*
> * Test for UARTs that do not reassert THRE when the
> * transmitter is idle and the interrupt has already
> @@ -2284,8 +2289,6 @@ int serial8250_do_startup(struct uart_port *port)
> * allow register changes to become visible.
> */
> spin_lock_irqsave(&port->lock, flags);
> - if (up->port.irqflags & IRQF_SHARED)
> - disable_irq_nosync(port->irq);
>
> wait_for_xmitr(up, UART_LSR_THRE);
> serial_port_out_sync(port, UART_IER, UART_IER_THRI);
> @@ -2297,9 +2300,9 @@ int serial8250_do_startup(struct uart_port *port)
> iir = serial_port_in(port, UART_IIR);
> serial_port_out(port, UART_IER, 0);
>
> - if (port->irqflags & IRQF_SHARED)
> - enable_irq(port->irq);
> spin_unlock_irqrestore(&port->lock, flags);
> + if (irq_shared)
> + enable_irq(port->irq);
>
> /*
> * If the interrupt is not reasserted, or we otherwise
>

2020-08-14 17:58:15

by Andy Shevchenko

Subject: Re: [PATCH] uart:8250: change lock order in serial8250_do_startup()

On Fri, Aug 14, 2020 at 08:29:40PM +0900, Sergey Senozhatsky wrote:
> On (20/08/14 12:59), Andy Shevchenko wrote:

...

> > I think we can stick with newer:
> >
> > if (port->irqflags & IRQF_SHARED)
>
> I'll take a look.

Thanks!

One more thing: perhaps update the subject prefix to 'serial: 8250:'.

--
With Best Regards,
Andy Shevchenko


2020-08-18 12:53:32

by Petr Mladek

Subject: Re: [PATCH] uart:8250: change lock order in serial8250_do_startup()

On Fri 2020-08-14 12:59:28, Andy Shevchenko wrote:
> On Fri, Aug 14, 2020 at 10:38:02AM +0900, Sergey Senozhatsky wrote:
> > We have a number of "uart.port->desc.lock vs desc.lock->uart.port"
> > lockdep reports coming from 8250 driver; this causes a bit of trouble
> > to people, so let's fix it.
> >
>
> I guess we may add some tags here
>
> Fixes: 768aec0b5bcc ("serial: 8250: fix shared interrupts issues with SMP and RT kernels")
> Reported-by: Guenter Roeck <[email protected]>
> Reported-by: Raul Rangel <[email protected]>
> BugLink: https://bugs.chromium.org/p/chromium/issues/detail?id=1114800
> Link: https://lore.kernel.org/lkml/CAHQZ30BnfX+gxjPm1DUd5psOTqbyDh4EJE=2=VAMW_VDafctkA@mail.gmail.com/T/#u

The "Link:" tag should point to the mail that is applied using git am.
I am not sure whether there is a tag for related discussion in other
mail threads.

A solution might be to add a comment like:

This solution has been discussed in several threads:
https://lore.kernel.org/lkml/CAHQZ30BnfX+gxjPm1DUd5psOTqbyDh4EJE=2=VAMW_VDafctkA@mail.gmail.com/T/#u
https://lore.kernel.org/lkml/[email protected]/#t


> > --- a/drivers/tty/serial/8250/8250_port.c
> > +++ b/drivers/tty/serial/8250/8250_port.c
> > @@ -2275,6 +2275,11 @@ int serial8250_do_startup(struct uart_port *port)
> >
> > if (port->irq && !(up->port.flags & UPF_NO_THRE_TEST)) {
> > unsigned char iir1;
>
> > + bool irq_shared = up->port.irqflags & IRQF_SHARED;
>
> I'm wondering why we need a temporary variable? This flag is not supposed to be
> changed in between, can we leave original conditionals?
>
> Nevertheless, I noticed an inconsistency in how the flags are dereferenced, which
> seems to have been introduced by dfe42443ea1d ("serial: reduce number of
> indirections in 8250 code").
>
> I think we can stick with newer:
>
> if (port->irqflags & IRQF_SHARED)

Sounds reasonable to me.

Andy proposed many changes. Sergey, could you please send v2?

Best Regards,
Petr

2020-08-18 13:18:33

by Andy Shevchenko

Subject: Re: [PATCH] uart:8250: change lock order in serial8250_do_startup()

On Tue, Aug 18, 2020 at 02:52:18PM +0200, Petr Mladek wrote:
> On Fri 2020-08-14 12:59:28, Andy Shevchenko wrote:
> > On Fri, Aug 14, 2020 at 10:38:02AM +0900, Sergey Senozhatsky wrote:
> > > We have a number of "uart.port->desc.lock vs desc.lock->uart.port"
> > > lockdep reports coming from 8250 driver; this causes a bit of trouble
> > > to people, so let's fix it.
> >
> > I guess we may add some tags here
> >
> > Fixes: 768aec0b5bcc ("serial: 8250: fix shared interrupts issues with SMP and RT kernels")
> > Reported-by: Guenter Roeck <[email protected]>
> > Reported-by: Raul Rangel <[email protected]>
> > BugLink: https://bugs.chromium.org/p/chromium/issues/detail?id=1114800
> > Link: https://lore.kernel.org/lkml/CAHQZ30BnfX+gxjPm1DUd5psOTqbyDh4EJE=2=VAMW_VDafctkA@mail.gmail.com/T/#u
>
> The "Link:" tag should point to the mail that is applied using git am.
> I am not sure whether there is a tag for related discussion in other
> mail threads.

It's fine to have several Link tags; in the past we have used them for bug
reports sent through mailing lists and the like.

> Andy proposed many changes. Sergey, could you please send v2?

There is a v2.

https://lore.kernel.org/linux-serial/[email protected]/T/#u

--
With Best Regards,
Andy Shevchenko


2020-08-19 02:42:15

by Sergey Senozhatsky

Subject: Re: [PATCH] uart:8250: change lock order in serial8250_do_startup()

On (20/08/18 14:52), Petr Mladek wrote:
> > I guess we may add some tags here
> >
> > Fixes: 768aec0b5bcc ("serial: 8250: fix shared interrupts issues with SMP and RT kernels")
> > Reported-by: Guenter Roeck <[email protected]>
> > Reported-by: Raul Rangel <[email protected]>
> > BugLink: https://bugs.chromium.org/p/chromium/issues/detail?id=1114800
> > Link: https://lore.kernel.org/lkml/CAHQZ30BnfX+gxjPm1DUd5psOTqbyDh4EJE=2=VAMW_VDafctkA@mail.gmail.com/T/#u
>
> The "Link:" tag should point to the mail that is applied using git am.
> I am not sure whether there is a tag for related discussion in other
> mail threads.

Yes, that's a good point. I wonder if we can slightly change that
rule. That link points to a thread where we discussed various
approaches to the problem, what would work, what wouldn't and why;
there is some valuable feedback there. The "8250-fix-locks-v2.patch"
link, on the other hand, points to nothing valuable.

> Sounds reasonable to me.
>
> Andy proposed many changes. Sergey, could you please send v2?

Sure, I think I sent v2 already.

-ss

2020-08-19 09:25:11

by Petr Mladek

Subject: Re: [PATCH] uart:8250: change lock order in serial8250_do_startup()

On Wed 2020-08-19 10:52:09, Sergey Senozhatsky wrote:
> On (20/08/18 14:52), Petr Mladek wrote:
> > > I guess we may add some tags here
> > >
> > > Fixes: 768aec0b5bcc ("serial: 8250: fix shared interrupts issues with SMP and RT kernels")
> > > Reported-by: Guenter Roeck <[email protected]>
> > > Reported-by: Raul Rangel <[email protected]>
> > > BugLink: https://bugs.chromium.org/p/chromium/issues/detail?id=1114800
> > > Link: https://lore.kernel.org/lkml/CAHQZ30BnfX+gxjPm1DUd5psOTqbyDh4EJE=2=VAMW_VDafctkA@mail.gmail.com/T/#u
> >
> > The "Link:" tag should point to the mail that is applied using git am.
> > I am not sure whether there is a tag for related discussion in other
> > mail threads.
>
> Yes, that's a good point. I wonder if we can slightly change that
> rule. That link points to a thread where we discussed various
> approaches to the problem, what would work, what wouldn't and why;
> there is some valuable feedback there. The "8250-fix-locks-v2.patch"
> link, on the other hand, points to nothing valuable.

I agree that the other link is more valuable than the final one.
I just did not want to break a common rule. But it seems that
there already are commits with more Link: tags.

> > Sounds reasonable to me.
> >
> > Andy proposed many changes. Sergey, could you please send v2?
>
> Sure, I think I sent v2 already.

Ah, I missed it. It is pushed now.

Best Regards,
Petr