2013-11-13 07:30:03

by Huang Shijie

Subject: [PATCH RFC] tty_ldisc: add more limits to the @write_wakeup

In uart_handle_cts_change(), uart_write_wakeup() is called right after
@uart_port->ops->start_tx().

Documentation/serial/driver says:
-----------------------------------------------
start_tx(port)
Start transmitting characters.

Locking: port->lock taken.
Interrupts: locally disabled.
-----------------------------------------------

So when uart_write_wakeup() is called, port->lock is already held by
the caller. See the following call stack:

|_ uart_write_wakeup
|_ tty_wakeup
|_ ld->ops->write_wakeup

@write_wakeup is therefore called with port->lock held. Some implementations
of @write_wakeup do not account for this and still try to send data with
uart_write(), which tries to take port->lock again. A deadlock occurs; see
the following log, captured with Bluetooth over UART:

--------------------------------------------------------------------
BUG: spinlock lockup suspected on CPU#0, swapper/0/0
lock: 0xdc3f4410, .magic: dead4ead, .owner: swapper/0/0, .owner_cpu: 0
CPU: 0 PID: 0 Comm: swapper/0 Tainted: G W 3.10.17-16839-ge4a1bef #1320
[<80014cbc>] (unwind_backtrace+0x0/0x138) from [<8001251c>] (show_stack+0x10/0x14)
[<8001251c>] (show_stack+0x10/0x14) from [<802816ac>] (do_raw_spin_lock+0x108/0x184)
[<802816ac>] (do_raw_spin_lock+0x108/0x184) from [<806a22b0>] (_raw_spin_lock_irqsave+0x54/0x60)
[<806a22b0>] (_raw_spin_lock_irqsave+0x54/0x60) from [<802f5754>] (uart_write+0x38/0xe0)
[<802f5754>] (uart_write+0x38/0xe0) from [<80455270>] (hci_uart_tx_wakeup+0xa4/0x168)
[<80455270>] (hci_uart_tx_wakeup+0xa4/0x168) from [<802dab18>] (tty_wakeup+0x50/0x5c)
[<802dab18>] (tty_wakeup+0x50/0x5c) from [<802f81a4>] (imx_rtsint+0x50/0x80)
[<802f81a4>] (imx_rtsint+0x50/0x80) from [<802f88f4>] (imx_int+0x158/0x17c)
[<802f88f4>] (imx_int+0x158/0x17c) from [<8007abe0>] (handle_irq_event_percpu+0x50/0x194)
[<8007abe0>] (handle_irq_event_percpu+0x50/0x194) from [<8007ad60>] (handle_irq_event+0x3c/0x5c)
--------------------------------------------------------------------

This patch documents additional restrictions on @write_wakeup. Anyone
implementing @write_wakeup should follow them to avoid the deadlock.

Signed-off-by: Huang Shijie <[email protected]>
---
include/linux/tty_ldisc.h | 5 ++++-
1 files changed, 4 insertions(+), 1 deletions(-)

diff --git a/include/linux/tty_ldisc.h b/include/linux/tty_ldisc.h
index f15c898..539ccc5 100644
--- a/include/linux/tty_ldisc.h
+++ b/include/linux/tty_ldisc.h
@@ -91,7 +91,10 @@
* This function is called by the low-level tty driver to signal
* that line discpline should try to send more characters to the
* low-level driver for transmission. If the line discpline does
- * not have any more data to send, it can just return.
+ * not have any more data to send, it can just return. If the line
+ * discipline does have some data to send, please arise a tasklet
+ * or workqueue to do the real data transfer. Do not send data in
+ * this hook, it may leads to a deadlock.
*
* int (*hangup)(struct tty_struct *)
*
--
1.7.2.rc3
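
The locking problem described in this message can be modelled in user space. The following is only a sketch under stated assumptions: every toy_* name is hypothetical, and the "lock" is a flag that counts attempts to take it while it is already held instead of spinning forever as a real non-recursive spinlock would. It shows a write_wakeup hook that sends data directly while the caller still holds the port lock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a non-recursive spinlock. Instead of
 * spinning forever, it counts attempts to take it while already held. */
struct toy_lock {
    bool held;
    int recursive_attempts;
};

/* Returns true if the lock was taken, false if it was already held
 * (the point where a real spinlock would lock up on the same CPU). */
static bool toy_lock_acquire(struct toy_lock *l)
{
    if (l->held) {
        l->recursive_attempts++;
        return false;
    }
    l->held = true;
    return true;
}

static void toy_lock_release(struct toy_lock *l)
{
    assert(l->held);
    l->held = false;
}

struct toy_port {
    struct toy_lock lock;
    size_t bytes_queued;
    void (*write_wakeup)(struct toy_port *port); /* models the ldisc hook */
};

/* Models uart_write(): it needs the port lock to queue data. */
static int toy_uart_write(struct toy_port *port, size_t len)
{
    if (!toy_lock_acquire(&port->lock))
        return -1; /* the real code would deadlock here */
    port->bytes_queued += len;
    toy_lock_release(&port->lock);
    return (int)len;
}

/* Models a write_wakeup hook that sends data directly. */
static void toy_direct_wakeup(struct toy_port *port)
{
    (void)toy_uart_write(port, 4);
}

/* Models the CTS-change path: the lock is held across the wakeup. */
static void toy_cts_change(struct toy_port *port)
{
    if (!toy_lock_acquire(&port->lock))
        return;
    port->write_wakeup(port);
    toy_lock_release(&port->lock);
}
```

Calling toy_cts_change() on a port whose write_wakeup is toy_direct_wakeup records one recursive lock attempt and queues no data, which corresponds to the uart_write() frame sitting on top of tty_wakeup() in the trace above.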


2013-12-11 11:47:11

by Peter Hurley

Subject: Re: [PATCH RFC] tty_ldisc: add more limits to the @write_wakeup

On 12/11/2013 01:44 AM, Huang Shijie wrote:
> On 2013-12-07 00:18, Peter Hurley wrote:
>> hci_uart_tx_wakeup() should perform the actual tx in a work item.
> Does "work item" mean a workqueue or a tasklet?
> This patch is meant to tell line discipline writers to send the data from a workqueue or a tasklet.

Yes, "work item" means struct work_struct.

Regards,
Peter Hurley
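
The deferral suggested in this reply can be sketched in user space as follows. This is a model under stated assumptions, not the actual hci_uart code: all toy_* names are hypothetical, the lock is a flag that counts recursive attempts, and a boolean stands in for a scheduled work item. In the kernel, the flag would correspond to calling schedule_work() on a struct work_struct, with the work function running later without port->lock held.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a non-recursive spinlock. */
struct toy_lock {
    bool held;
    int recursive_attempts;
};

static bool toy_lock_acquire(struct toy_lock *l)
{
    if (l->held) {
        l->recursive_attempts++; /* a real spinlock would lock up here */
        return false;
    }
    l->held = true;
    return true;
}

static void toy_lock_release(struct toy_lock *l)
{
    assert(l->held);
    l->held = false;
}

struct toy_port {
    struct toy_lock lock;
    size_t bytes_sent;
    bool tx_work_pending; /* stand-in for a scheduled work item */
};

/* Models uart_write(): it needs the port lock to queue data. */
static int toy_uart_write(struct toy_port *port, size_t len)
{
    if (!toy_lock_acquire(&port->lock))
        return -1;
    port->bytes_sent += len;
    toy_lock_release(&port->lock);
    return (int)len;
}

/* write_wakeup hook: only records that transmission is wanted. */
static void toy_deferred_wakeup(struct toy_port *port)
{
    port->tx_work_pending = true;
}

/* Work function: runs later, after the caller has dropped the lock. */
static void toy_tx_work(struct toy_port *port)
{
    if (!port->tx_work_pending)
        return;
    port->tx_work_pending = false;
    (void)toy_uart_write(port, 4);
}

/* Models the CTS-change path: the lock is held across the wakeup. */
static void toy_cts_change(struct toy_port *port)
{
    if (!toy_lock_acquire(&port->lock))
        return;
    toy_deferred_wakeup(port);
    toy_lock_release(&port->lock);
}
```

Here toy_cts_change() only marks the work as pending. The write happens when toy_tx_work() runs afterwards, so the lock is never taken twice on the same path.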

2013-12-11 06:44:12

by Huang Shijie

Subject: Re: [PATCH RFC] tty_ldisc: add more limits to the @write_wakeup

On 2013-12-07 00:18, Peter Hurley wrote:
> hci_uart_tx_wakeup() should perform the actual tx in a work item.
Does "work item" mean a workqueue or a tasklet?
This patch is meant to tell line discipline writers to send the data
from a workqueue or a tasklet.

thanks
Huang Shijie

2013-12-06 16:18:07

by Peter Hurley

Subject: Re: [PATCH RFC] tty_ldisc: add more limits to the @write_wakeup

On 12/06/2013 05:34 AM, Huang Shijie wrote:
> just a ping.
>
> Actually, this is a bug in the tty code or the BT code.

hci_uart_tx_wakeup() should perform the actual tx in a work item.

Regards,
Peter Hurley

2013-12-06 10:34:51

by Huang Shijie

Subject: Re: [PATCH RFC] tty_ldisc: add more limits to the @write_wakeup

just a ping.

Actually, this is a bug in the tty code or the BT code.

thanks
Huang Shijie