2014-06-16 02:23:51

by Tyler Hall

[permalink] [raw]
Subject: [PATCH 1/2] slip: Fix deadlock in write_wakeup

Use schedule_work() to avoid potentially taking the spinlock in
interrupt context.

Commit cc9fa74e2a ("slip/slcan: added locking in wakeup function") added
necessary locking to the wakeup function and 367525c8c2/ddcde142be ("can:
slcan: Fix spinlock variant") converted it to spin_lock_bh() because the lock
is also taken in timers.

Disabling softirqs is not sufficient, however, as tty drivers may call
write_wakeup from interrupt context. This driver calls tty->ops->write() with
its spinlock held, which may immediately cause an interrupt on the same CPU and
subsequent spin_bug().

Simply converting to spin_lock_irq/irqsave() prevents this deadlock, but
causes lockdep to point out a possible circular locking dependency
between these locks:

(&(&sl->lock)->rlock){-.....}, at: slip_write_wakeup
(&port_lock_key){-.....}, at: serial8250_handle_irq.part.13

The slip transmit is holding the slip spinlock when calling the tty write.
This grabs the port lock. On an interrupt, the handler grabs the port
lock and calls write_wakeup which grabs the slip lock. This could be a
problem if a serial interrupt occurs on another CPU during the slip
transmit.

To deal with these issues, don't grab the lock in the wakeup function by
deferring the writeout to a workqueue. Also hold the lock during close
when de-assigning the tty pointer to safely disarm the worker and
timers.

This bug is easily reproducible on the first transmit when slip is
used with the standard 8250 serial driver.

[<c0410b7c>] (spin_bug+0x0/0x38) from [<c006109c>] (do_raw_spin_lock+0x60/0x1d0)
r5:eab27000 r4:ec02754c
[<c006103c>] (do_raw_spin_lock+0x0/0x1d0) from [<c04185c0>] (_raw_spin_lock+0x28/0x2c)
r10:0000001f r9:eabb814c r8:eabb8140 r7:40070193 r6:ec02754c r5:eab27000
r4:ec02754c r3:00000000
[<c0418598>] (_raw_spin_lock+0x0/0x2c) from [<bf3a0220>] (slip_write_wakeup+0x50/0xe0 [slip])
r4:ec027540 r3:00000003
[<bf3a01d0>] (slip_write_wakeup+0x0/0xe0 [slip]) from [<c026e420>] (tty_wakeup+0x48/0x68)
r6:00000000 r5:ea80c480 r4:eab27000 r3:bf3a01d0
[<c026e3d8>] (tty_wakeup+0x0/0x68) from [<c028a8ec>] (uart_write_wakeup+0x2c/0x30)
r5:ed68ea90 r4:c06790d8
[<c028a8c0>] (uart_write_wakeup+0x0/0x30) from [<c028dc44>] (serial8250_tx_chars+0x114/0x170)
[<c028db30>] (serial8250_tx_chars+0x0/0x170) from [<c028dffc>] (serial8250_handle_irq+0xa0/0xbc)
r6:000000c2 r5:00000060 r4:c06790d8 r3:00000000
[<c028df5c>] (serial8250_handle_irq+0x0/0xbc) from [<c02933a4>] (dw8250_handle_irq+0x38/0x64)
r7:00000000 r6:edd2f390 r5:000000c2 r4:c06790d8
[<c029336c>] (dw8250_handle_irq+0x0/0x64) from [<c028d2f4>] (serial8250_interrupt+0x44/0xc4)
r6:00000000 r5:00000000 r4:c06791c4 r3:c029336c
[<c028d2b0>] (serial8250_interrupt+0x0/0xc4) from [<c0067fe4>] (handle_irq_event_percpu+0xb4/0x2b0)
r10:c06790d8 r9:eab27000 r8:00000000 r7:00000000 r6:0000001f r5:edd52980
r4:ec53b6c0 r3:c028d2b0
[<c0067f30>] (handle_irq_event_percpu+0x0/0x2b0) from [<c006822c>] (handle_irq_event+0x4c/0x6c)
r10:c06790d8 r9:eab27000 r8:c0673ae0 r7:c05c2020 r6:ec53b6c0 r5:edd529d4
r4:edd52980
[<c00681e0>] (handle_irq_event+0x0/0x6c) from [<c006b140>] (handle_level_irq+0xe8/0x100)
r6:00000000 r5:edd529d4 r4:edd52980 r3:00022000
[<c006b058>] (handle_level_irq+0x0/0x100) from [<c00676f8>] (generic_handle_irq+0x30/0x40)
r5:0000001f r4:0000001f
[<c00676c8>] (generic_handle_irq+0x0/0x40) from [<c000f57c>] (handle_IRQ+0xd0/0x13c)
r4:ea997b18 r3:000000e0
[<c000f4ac>] (handle_IRQ+0x0/0x13c) from [<c00086c4>] (armada_370_xp_handle_irq+0x4c/0x118)
r8:000003ff r7:ea997b18 r6:ffffffff r5:60070013 r4:c0674dc0
[<c0008678>] (armada_370_xp_handle_irq+0x0/0x118) from [<c0013840>] (__irq_svc+0x40/0x70)
Exception stack(0xea997b18 to 0xea997b60)
7b00: 00000001 20070013
7b20: 00000000 0000000b 20070013 eab27000 20070013 00000000 ed10103e eab27000
7b40: c06790d8 ea997b74 ea997b60 ea997b60 c04186c0 c04186c8 60070013 ffffffff
r9:eab27000 r8:ed10103e r7:ea997b4c r6:ffffffff r5:60070013 r4:c04186c8
[<c04186a4>] (_raw_spin_unlock_irqrestore+0x0/0x54) from [<c0288fc0>] (uart_start+0x40/0x44)
r4:c06790d8 r3:c028ddd8
[<c0288f80>] (uart_start+0x0/0x44) from [<c028982c>] (uart_write+0xe4/0xf4)
r6:0000003e r5:00000000 r4:ed68ea90 r3:0000003e
[<c0289748>] (uart_write+0x0/0xf4) from [<bf3a0d20>] (sl_xmit+0x1c4/0x228 [slip])
r10:ed388e60 r9:0000003c r8:ffffffdd r7:0000003e r6:ec02754c r5:ea717eb8
r4:ec027000
[<bf3a0b5c>] (sl_xmit+0x0/0x228 [slip]) from [<c0368d74>] (dev_hard_start_xmit+0x39c/0x6d0)
r8:eaf163c0 r7:ec027000 r6:ea717eb8 r5:00000000 r4:00000000

Signed-off-by: Tyler Hall <[email protected]>
Cc: Oliver Hartkopp <[email protected]>
Cc: Andre Naujoks <[email protected]>
Cc: David S. Miller <[email protected]>
Cc: [email protected]
---
drivers/net/slip/slip.c | 36 ++++++++++++++++++++++++++----------
drivers/net/slip/slip.h | 1 +
2 files changed, 27 insertions(+), 10 deletions(-)

diff --git a/drivers/net/slip/slip.c b/drivers/net/slip/slip.c
index ad4a94e..8752644 100644
--- a/drivers/net/slip/slip.c
+++ b/drivers/net/slip/slip.c
@@ -83,6 +83,7 @@
#include <linux/delay.h>
#include <linux/init.h>
#include <linux/slab.h>
+#include <linux/workqueue.h>
#include "slip.h"
#ifdef CONFIG_INET
#include <linux/ip.h>
@@ -416,36 +417,46 @@ static void sl_encaps(struct slip *sl, unsigned char *icp, int len)
#endif
}

-/*
- * Called by the driver when there's room for more data. If we have
- * more packets to send, we send them here.
- */
-static void slip_write_wakeup(struct tty_struct *tty)
+/* Write out any remaining transmit buffer. Scheduled when tty is writable */
+static void slip_transmit(struct work_struct *work)
{
+ struct slip *sl = container_of(work, struct slip, tx_work);
int actual;
- struct slip *sl = tty->disc_data;

+ spin_lock_bh(&sl->lock);
/* First make sure we're connected. */
- if (!sl || sl->magic != SLIP_MAGIC || !netif_running(sl->dev))
+ if (!sl->tty || sl->magic != SLIP_MAGIC || !netif_running(sl->dev)) {
+ spin_unlock_bh(&sl->lock);
return;
+ }

- spin_lock_bh(&sl->lock);
if (sl->xleft <= 0) {
/* Now serial buffer is almost free & we can start
* transmission of another packet */
sl->dev->stats.tx_packets++;
- clear_bit(TTY_DO_WRITE_WAKEUP, &tty->flags);
+ clear_bit(TTY_DO_WRITE_WAKEUP, &sl->tty->flags);
spin_unlock_bh(&sl->lock);
sl_unlock(sl);
return;
}

- actual = tty->ops->write(tty, sl->xhead, sl->xleft);
+ actual = sl->tty->ops->write(sl->tty, sl->xhead, sl->xleft);
sl->xleft -= actual;
sl->xhead += actual;
spin_unlock_bh(&sl->lock);
}

+/*
+ * Called by the driver when there's room for more data.
+ * Schedule the transmit.
+ */
+static void slip_write_wakeup(struct tty_struct *tty)
+{
+ struct slip *sl = tty->disc_data;
+
+ schedule_work(&sl->tx_work);
+}
+
static void sl_tx_timeout(struct net_device *dev)
{
struct slip *sl = netdev_priv(dev);
@@ -749,6 +760,7 @@ static struct slip *sl_alloc(dev_t line)
sl->magic = SLIP_MAGIC;
sl->dev = dev;
spin_lock_init(&sl->lock);
+ INIT_WORK(&sl->tx_work, slip_transmit);
sl->mode = SL_MODE_DEFAULT;
#ifdef CONFIG_SLIP_SMART
/* initialize timer_list struct */
@@ -872,8 +884,12 @@ static void slip_close(struct tty_struct *tty)
if (!sl || sl->magic != SLIP_MAGIC || sl->tty != tty)
return;

+ spin_lock_bh(&sl->lock);
tty->disc_data = NULL;
sl->tty = NULL;
+ spin_unlock_bh(&sl->lock);
+
+ flush_work(&sl->tx_work);

/* VSV = very important to remove timers */
#ifdef CONFIG_SLIP_SMART
diff --git a/drivers/net/slip/slip.h b/drivers/net/slip/slip.h
index 67673cf..cf32aad 100644
--- a/drivers/net/slip/slip.h
+++ b/drivers/net/slip/slip.h
@@ -53,6 +53,7 @@ struct slip {
struct tty_struct *tty; /* ptr to TTY structure */
struct net_device *dev; /* easy for intr handling */
spinlock_t lock;
+ struct work_struct tx_work; /* Flushes transmit buffer */

#ifdef SL_INCLUDE_CSLIP
struct slcompress *slcomp; /* for header compression */
--
2.0.0


2014-06-16 02:23:56

by Tyler Hall

[permalink] [raw]
Subject: [PATCH 2/2] slcan: Port write_wakeup deadlock fix from slip

The commit "slip: Fix deadlock in write_wakeup" fixes a deadlock caused
by a change made in both slcan and slip. This is a direct port of that
fix.

Signed-off-by: Tyler Hall <[email protected]>
Cc: Oliver Hartkopp <[email protected]>
Cc: Andre Naujoks <[email protected]>
Cc: David S. Miller <[email protected]>
Cc: [email protected]
---
drivers/net/can/slcan.c | 37 +++++++++++++++++++++++++++----------
1 file changed, 27 insertions(+), 10 deletions(-)

diff --git a/drivers/net/can/slcan.c b/drivers/net/can/slcan.c
index dcf9196..ea4d4f1 100644
--- a/drivers/net/can/slcan.c
+++ b/drivers/net/can/slcan.c
@@ -52,6 +52,7 @@
#include <linux/delay.h>
#include <linux/init.h>
#include <linux/kernel.h>
+#include <linux/workqueue.h>
#include <linux/can.h>
#include <linux/can/skb.h>

@@ -85,6 +86,7 @@ struct slcan {
struct tty_struct *tty; /* ptr to TTY structure */
struct net_device *dev; /* easy for intr handling */
spinlock_t lock;
+ struct work_struct tx_work; /* Flushes transmit buffer */

/* These are pointers to the malloc()ed frame buffers. */
unsigned char rbuff[SLC_MTU]; /* receiver buffer */
@@ -309,36 +311,46 @@ static void slc_encaps(struct slcan *sl, struct can_frame *cf)
sl->dev->stats.tx_bytes += cf->can_dlc;
}

-/*
- * Called by the driver when there's room for more data. If we have
- * more packets to send, we send them here.
- */
-static void slcan_write_wakeup(struct tty_struct *tty)
+/* Write out any remaining transmit buffer. Scheduled when tty is writable */
+static void slcan_transmit(struct work_struct *work)
{
+ struct slcan *sl = container_of(work, struct slcan, tx_work);
int actual;
- struct slcan *sl = (struct slcan *) tty->disc_data;

+ spin_lock_bh(&sl->lock);
/* First make sure we're connected. */
- if (!sl || sl->magic != SLCAN_MAGIC || !netif_running(sl->dev))
+ if (!sl->tty || sl->magic != SLCAN_MAGIC || !netif_running(sl->dev)) {
+ spin_unlock_bh(&sl->lock);
return;
+ }

- spin_lock_bh(&sl->lock);
if (sl->xleft <= 0) {
/* Now serial buffer is almost free & we can start
* transmission of another packet */
sl->dev->stats.tx_packets++;
- clear_bit(TTY_DO_WRITE_WAKEUP, &tty->flags);
+ clear_bit(TTY_DO_WRITE_WAKEUP, &sl->tty->flags);
spin_unlock_bh(&sl->lock);
netif_wake_queue(sl->dev);
return;
}

- actual = tty->ops->write(tty, sl->xhead, sl->xleft);
+ actual = sl->tty->ops->write(sl->tty, sl->xhead, sl->xleft);
sl->xleft -= actual;
sl->xhead += actual;
spin_unlock_bh(&sl->lock);
}

+/*
+ * Called by the driver when there's room for more data.
+ * Schedule the transmit.
+ */
+static void slcan_write_wakeup(struct tty_struct *tty)
+{
+ struct slcan *sl = tty->disc_data;
+
+ schedule_work(&sl->tx_work);
+}
+
/* Send a can_frame to a TTY queue. */
static netdev_tx_t slc_xmit(struct sk_buff *skb, struct net_device *dev)
{
@@ -528,6 +540,7 @@ static struct slcan *slc_alloc(dev_t line)
sl->magic = SLCAN_MAGIC;
sl->dev = dev;
spin_lock_init(&sl->lock);
+ INIT_WORK(&sl->tx_work, slcan_transmit);
slcan_devs[i] = dev;

return sl;
@@ -626,8 +639,12 @@ static void slcan_close(struct tty_struct *tty)
if (!sl || sl->magic != SLCAN_MAGIC || sl->tty != tty)
return;

+ spin_lock_bh(&sl->lock);
tty->disc_data = NULL;
sl->tty = NULL;
+ spin_unlock_bh(&sl->lock);
+
+ flush_work(&sl->tx_work);

/* Flush network side */
unregister_netdev(sl->dev);
--
2.0.0

2014-06-16 18:01:10

by Oliver Hartkopp

[permalink] [raw]
Subject: Re: [PATCH 1/2] slip: Fix deadlock in write_wakeup

Hello Tyler,

On 16.06.2014 04:23, Tyler Hall wrote:
> Use schedule_work() to avoid potentially taking the spinlock in
> interrupt context.
>
(..)

>
> To deal with these issues, don't grab the lock in the wakeup function by
> deferring the writeout to a workqueue. Also hold the lock during close
> when de-assigning the tty pointer to safely disarm the worker and
> timers.
>
> This bug is easily reproducible on the first transmit when slip is
> used with the standard 8250 serial driver.
>

looks reasonable. Thanks for your patch!
Indeed I can't remember ever using the slcan driver with a real serial
controller hardware with irq line but only via serial-to-USB adapters :-)
Due to the recent fixes from Andre and Alexander these two drivers got in
motion again ...

@Andre/Alexander: Can you please check if slcan still works in your setup. I
don't have that hardware with me. I only was able to compile it successfully.

Just for the records: These two patches would be candidates for stable 3.12+

Thanks,
Oliver


> [<c0410b7c>] (spin_bug+0x0/0x38) from [<c006109c>] (do_raw_spin_lock+0x60/0x1d0)
> r5:eab27000 r4:ec02754c
> [<c006103c>] (do_raw_spin_lock+0x0/0x1d0) from [<c04185c0>] (_raw_spin_lock+0x28/0x2c)
> r10:0000001f r9:eabb814c r8:eabb8140 r7:40070193 r6:ec02754c r5:eab27000
> r4:ec02754c r3:00000000
> [<c0418598>] (_raw_spin_lock+0x0/0x2c) from [<bf3a0220>] (slip_write_wakeup+0x50/0xe0 [slip])
> r4:ec027540 r3:00000003
> [<bf3a01d0>] (slip_write_wakeup+0x0/0xe0 [slip]) from [<c026e420>] (tty_wakeup+0x48/0x68)
> r6:00000000 r5:ea80c480 r4:eab27000 r3:bf3a01d0
> [<c026e3d8>] (tty_wakeup+0x0/0x68) from [<c028a8ec>] (uart_write_wakeup+0x2c/0x30)
> r5:ed68ea90 r4:c06790d8
> [<c028a8c0>] (uart_write_wakeup+0x0/0x30) from [<c028dc44>] (serial8250_tx_chars+0x114/0x170)
> [<c028db30>] (serial8250_tx_chars+0x0/0x170) from [<c028dffc>] (serial8250_handle_irq+0xa0/0xbc)
> r6:000000c2 r5:00000060 r4:c06790d8 r3:00000000
> [<c028df5c>] (serial8250_handle_irq+0x0/0xbc) from [<c02933a4>] (dw8250_handle_irq+0x38/0x64)
> r7:00000000 r6:edd2f390 r5:000000c2 r4:c06790d8
> [<c029336c>] (dw8250_handle_irq+0x0/0x64) from [<c028d2f4>] (serial8250_interrupt+0x44/0xc4)
> r6:00000000 r5:00000000 r4:c06791c4 r3:c029336c
> [<c028d2b0>] (serial8250_interrupt+0x0/0xc4) from [<c0067fe4>] (handle_irq_event_percpu+0xb4/0x2b0)
> r10:c06790d8 r9:eab27000 r8:00000000 r7:00000000 r6:0000001f r5:edd52980
> r4:ec53b6c0 r3:c028d2b0
> [<c0067f30>] (handle_irq_event_percpu+0x0/0x2b0) from [<c006822c>] (handle_irq_event+0x4c/0x6c)
> r10:c06790d8 r9:eab27000 r8:c0673ae0 r7:c05c2020 r6:ec53b6c0 r5:edd529d4
> r4:edd52980
> [<c00681e0>] (handle_irq_event+0x0/0x6c) from [<c006b140>] (handle_level_irq+0xe8/0x100)
> r6:00000000 r5:edd529d4 r4:edd52980 r3:00022000
> [<c006b058>] (handle_level_irq+0x0/0x100) from [<c00676f8>] (generic_handle_irq+0x30/0x40)
> r5:0000001f r4:0000001f
> [<c00676c8>] (generic_handle_irq+0x0/0x40) from [<c000f57c>] (handle_IRQ+0xd0/0x13c)
> r4:ea997b18 r3:000000e0
> [<c000f4ac>] (handle_IRQ+0x0/0x13c) from [<c00086c4>] (armada_370_xp_handle_irq+0x4c/0x118)
> r8:000003ff r7:ea997b18 r6:ffffffff r5:60070013 r4:c0674dc0
> [<c0008678>] (armada_370_xp_handle_irq+0x0/0x118) from [<c0013840>] (__irq_svc+0x40/0x70)
> Exception stack(0xea997b18 to 0xea997b60)
> 7b00: 00000001 20070013
> 7b20: 00000000 0000000b 20070013 eab27000 20070013 00000000 ed10103e eab27000
> 7b40: c06790d8 ea997b74 ea997b60 ea997b60 c04186c0 c04186c8 60070013 ffffffff
> r9:eab27000 r8:ed10103e r7:ea997b4c r6:ffffffff r5:60070013 r4:c04186c8
> [<c04186a4>] (_raw_spin_unlock_irqrestore+0x0/0x54) from [<c0288fc0>] (uart_start+0x40/0x44)
> r4:c06790d8 r3:c028ddd8
> [<c0288f80>] (uart_start+0x0/0x44) from [<c028982c>] (uart_write+0xe4/0xf4)
> r6:0000003e r5:00000000 r4:ed68ea90 r3:0000003e
> [<c0289748>] (uart_write+0x0/0xf4) from [<bf3a0d20>] (sl_xmit+0x1c4/0x228 [slip])
> r10:ed388e60 r9:0000003c r8:ffffffdd r7:0000003e r6:ec02754c r5:ea717eb8
> r4:ec027000
> [<bf3a0b5c>] (sl_xmit+0x0/0x228 [slip]) from [<c0368d74>] (dev_hard_start_xmit+0x39c/0x6d0)
> r8:eaf163c0 r7:ec027000 r6:ea717eb8 r5:00000000 r4:00000000
>
> Signed-off-by: Tyler Hall <[email protected]>
> Cc: Oliver Hartkopp <[email protected]>
> Cc: Andre Naujoks <[email protected]>
> Cc: David S. Miller <[email protected]>
> Cc: [email protected]
> ---
> drivers/net/slip/slip.c | 36 ++++++++++++++++++++++++++----------
> drivers/net/slip/slip.h | 1 +
> 2 files changed, 27 insertions(+), 10 deletions(-)
>
> diff --git a/drivers/net/slip/slip.c b/drivers/net/slip/slip.c
> index ad4a94e..8752644 100644
> --- a/drivers/net/slip/slip.c
> +++ b/drivers/net/slip/slip.c
> @@ -83,6 +83,7 @@
> #include <linux/delay.h>
> #include <linux/init.h>
> #include <linux/slab.h>
> +#include <linux/workqueue.h>
> #include "slip.h"
> #ifdef CONFIG_INET
> #include <linux/ip.h>
> @@ -416,36 +417,46 @@ static void sl_encaps(struct slip *sl, unsigned char *icp, int len)
> #endif
> }
>
> -/*
> - * Called by the driver when there's room for more data. If we have
> - * more packets to send, we send them here.
> - */
> -static void slip_write_wakeup(struct tty_struct *tty)
> +/* Write out any remaining transmit buffer. Scheduled when tty is writable */
> +static void slip_transmit(struct work_struct *work)
> {
> + struct slip *sl = container_of(work, struct slip, tx_work);
> int actual;
> - struct slip *sl = tty->disc_data;
>
> + spin_lock_bh(&sl->lock);
> /* First make sure we're connected. */
> - if (!sl || sl->magic != SLIP_MAGIC || !netif_running(sl->dev))
> + if (!sl->tty || sl->magic != SLIP_MAGIC || !netif_running(sl->dev)) {
> + spin_unlock_bh(&sl->lock);
> return;
> + }
>
> - spin_lock_bh(&sl->lock);
> if (sl->xleft <= 0) {
> /* Now serial buffer is almost free & we can start
> * transmission of another packet */
> sl->dev->stats.tx_packets++;
> - clear_bit(TTY_DO_WRITE_WAKEUP, &tty->flags);
> + clear_bit(TTY_DO_WRITE_WAKEUP, &sl->tty->flags);
> spin_unlock_bh(&sl->lock);
> sl_unlock(sl);
> return;
> }
>
> - actual = tty->ops->write(tty, sl->xhead, sl->xleft);
> + actual = sl->tty->ops->write(sl->tty, sl->xhead, sl->xleft);
> sl->xleft -= actual;
> sl->xhead += actual;
> spin_unlock_bh(&sl->lock);
> }
>
> +/*
> + * Called by the driver when there's room for more data.
> + * Schedule the transmit.
> + */
> +static void slip_write_wakeup(struct tty_struct *tty)
> +{
> + struct slip *sl = tty->disc_data;
> +
> + schedule_work(&sl->tx_work);
> +}
> +
> static void sl_tx_timeout(struct net_device *dev)
> {
> struct slip *sl = netdev_priv(dev);
> @@ -749,6 +760,7 @@ static struct slip *sl_alloc(dev_t line)
> sl->magic = SLIP_MAGIC;
> sl->dev = dev;
> spin_lock_init(&sl->lock);
> + INIT_WORK(&sl->tx_work, slip_transmit);
> sl->mode = SL_MODE_DEFAULT;
> #ifdef CONFIG_SLIP_SMART
> /* initialize timer_list struct */
> @@ -872,8 +884,12 @@ static void slip_close(struct tty_struct *tty)
> if (!sl || sl->magic != SLIP_MAGIC || sl->tty != tty)
> return;
>
> + spin_lock_bh(&sl->lock);
> tty->disc_data = NULL;
> sl->tty = NULL;
> + spin_unlock_bh(&sl->lock);
> +
> + flush_work(&sl->tx_work);
>
> /* VSV = very important to remove timers */
> #ifdef CONFIG_SLIP_SMART
> diff --git a/drivers/net/slip/slip.h b/drivers/net/slip/slip.h
> index 67673cf..cf32aad 100644
> --- a/drivers/net/slip/slip.h
> +++ b/drivers/net/slip/slip.h
> @@ -53,6 +53,7 @@ struct slip {
> struct tty_struct *tty; /* ptr to TTY structure */
> struct net_device *dev; /* easy for intr handling */
> spinlock_t lock;
> + struct work_struct tx_work; /* Flushes transmit buffer */
>
> #ifdef SL_INCLUDE_CSLIP
> struct slcompress *slcomp; /* for header compression */
>

2014-06-17 04:29:57

by David Miller

[permalink] [raw]
Subject: Re: [PATCH 1/2] slip: Fix deadlock in write_wakeup

From: Tyler Hall <[email protected]>
Date: Sun, 15 Jun 2014 22:23:16 -0400

> Use schedule_work() to avoid potentially taking the spinlock in
> interrupt context.
>
> Commit cc9fa74e2a ("slip/slcan: added locking in wakeup function") added
> necessary locking to the wakeup function and 367525c8c2/ddcde142be ("can:
> slcan: Fix spinlock variant") converted it to spin_lock_bh() because the lock
> is also taken in timers.
>
> Disabling softirqs is not sufficient, however, as tty drivers may call
> write_wakeup from interrupt context. This driver calls tty->ops->write() with
> its spinlock held, which may immediately cause an interrupt on the same CPU and
> subsequent spin_bug().
>
> Simply converting to spin_lock_irq/irqsave() prevents this deadlock, but
> causes lockdep to point out a possible circular locking dependency
> between these locks:
>
> (&(&sl->lock)->rlock){-.....}, at: slip_write_wakeup
> (&port_lock_key){-.....}, at: serial8250_handle_irq.part.13
>
> The slip transmit is holding the slip spinlock when calling the tty write.
> This grabs the port lock. On an interrupt, the handler grabs the port
> lock and calls write_wakeup which grabs the slip lock. This could be a
> problem if a serial interrupt occurs on another CPU during the slip
> transmit.
>
> To deal with these issues, don't grab the lock in the wakeup function by
> deferring the writeout to a workqueue. Also hold the lock during close
> when de-assigning the tty pointer to safely disarm the worker and
> timers.
>
> This bug is easily reproducible on the first transmit when slip is
> used with the standard 8250 serial driver.
...
> Signed-off-by: Tyler Hall <[email protected]>

Applied and queued up for -stable.

2014-06-17 04:30:07

by David Miller

[permalink] [raw]
Subject: Re: [PATCH 2/2] slcan: Port write_wakeup deadlock fix from slip

From: Tyler Hall <[email protected]>
Date: Sun, 15 Jun 2014 22:23:17 -0400

> The commit "slip: Fix deadlock in write_wakeup" fixes a deadlock caused
> by a change made in both slcan and slip. This is a direct port of that
> fix.
>
> Signed-off-by: Tyler Hall <[email protected]>

Applied and queued up for -stable.

2014-06-17 08:06:33

by Andre Naujoks

[permalink] [raw]
Subject: Re: [PATCH 2/2] slcan: Port write_wakeup deadlock fix from slip

On 16.06.2014 04:23, schrieb Tyler Hall:
> The commit "slip: Fix deadlock in write_wakeup" fixes a deadlock caused
> by a change made in both slcan and slip. This is a direct port of that
> fix.

I don't know if it is still needed, but for the slcan part, you can add my:

Tested-by: Andre Naujoks <[email protected]>

Regards
Andre

>
> Signed-off-by: Tyler Hall <[email protected]>
> Cc: Oliver Hartkopp <[email protected]>
> Cc: Andre Naujoks <[email protected]>
> Cc: David S. Miller <[email protected]>
> Cc: [email protected]
> ---
> drivers/net/can/slcan.c | 37 +++++++++++++++++++++++++++----------
> 1 file changed, 27 insertions(+), 10 deletions(-)
>
> diff --git a/drivers/net/can/slcan.c b/drivers/net/can/slcan.c
> index dcf9196..ea4d4f1 100644
> --- a/drivers/net/can/slcan.c
> +++ b/drivers/net/can/slcan.c
> @@ -52,6 +52,7 @@
> #include <linux/delay.h>
> #include <linux/init.h>
> #include <linux/kernel.h>
> +#include <linux/workqueue.h>
> #include <linux/can.h>
> #include <linux/can/skb.h>
>
> @@ -85,6 +86,7 @@ struct slcan {
> struct tty_struct *tty; /* ptr to TTY structure */
> struct net_device *dev; /* easy for intr handling */
> spinlock_t lock;
> + struct work_struct tx_work; /* Flushes transmit buffer */
>
> /* These are pointers to the malloc()ed frame buffers. */
> unsigned char rbuff[SLC_MTU]; /* receiver buffer */
> @@ -309,36 +311,46 @@ static void slc_encaps(struct slcan *sl, struct can_frame *cf)
> sl->dev->stats.tx_bytes += cf->can_dlc;
> }
>
> -/*
> - * Called by the driver when there's room for more data. If we have
> - * more packets to send, we send them here.
> - */
> -static void slcan_write_wakeup(struct tty_struct *tty)
> +/* Write out any remaining transmit buffer. Scheduled when tty is writable */
> +static void slcan_transmit(struct work_struct *work)
> {
> + struct slcan *sl = container_of(work, struct slcan, tx_work);
> int actual;
> - struct slcan *sl = (struct slcan *) tty->disc_data;
>
> + spin_lock_bh(&sl->lock);
> /* First make sure we're connected. */
> - if (!sl || sl->magic != SLCAN_MAGIC || !netif_running(sl->dev))
> + if (!sl->tty || sl->magic != SLCAN_MAGIC || !netif_running(sl->dev)) {
> + spin_unlock_bh(&sl->lock);
> return;
> + }
>
> - spin_lock_bh(&sl->lock);
> if (sl->xleft <= 0) {
> /* Now serial buffer is almost free & we can start
> * transmission of another packet */
> sl->dev->stats.tx_packets++;
> - clear_bit(TTY_DO_WRITE_WAKEUP, &tty->flags);
> + clear_bit(TTY_DO_WRITE_WAKEUP, &sl->tty->flags);
> spin_unlock_bh(&sl->lock);
> netif_wake_queue(sl->dev);
> return;
> }
>
> - actual = tty->ops->write(tty, sl->xhead, sl->xleft);
> + actual = sl->tty->ops->write(sl->tty, sl->xhead, sl->xleft);
> sl->xleft -= actual;
> sl->xhead += actual;
> spin_unlock_bh(&sl->lock);
> }
>
> +/*
> + * Called by the driver when there's room for more data.
> + * Schedule the transmit.
> + */
> +static void slcan_write_wakeup(struct tty_struct *tty)
> +{
> + struct slcan *sl = tty->disc_data;
> +
> + schedule_work(&sl->tx_work);
> +}
> +
> /* Send a can_frame to a TTY queue. */
> static netdev_tx_t slc_xmit(struct sk_buff *skb, struct net_device *dev)
> {
> @@ -528,6 +540,7 @@ static struct slcan *slc_alloc(dev_t line)
> sl->magic = SLCAN_MAGIC;
> sl->dev = dev;
> spin_lock_init(&sl->lock);
> + INIT_WORK(&sl->tx_work, slcan_transmit);
> slcan_devs[i] = dev;
>
> return sl;
> @@ -626,8 +639,12 @@ static void slcan_close(struct tty_struct *tty)
> if (!sl || sl->magic != SLCAN_MAGIC || sl->tty != tty)
> return;
>
> + spin_lock_bh(&sl->lock);
> tty->disc_data = NULL;
> sl->tty = NULL;
> + spin_unlock_bh(&sl->lock);
> +
> + flush_work(&sl->tx_work);
>
> /* Flush network side */
> unregister_netdev(sl->dev);
>

2014-06-23 06:34:06

by Alexander Stein

[permalink] [raw]
Subject: Re: [PATCH 1/2] slip: Fix deadlock in write_wakeup

On Monday 16 June 2014 19:55:04, Oliver Hartkopp wrote:
> Hello Tyler,
>
> On 16.06.2014 04:23, Tyler Hall wrote:
> > Use schedule_work() to avoid potentially taking the spinlock in
> > interrupt context.
> >
> (..)
>
> >
> > To deal with these issues, don't grab the lock in the wakeup function by
> > deferring the writeout to a workqueue. Also hold the lock during close
> > when de-assigning the tty pointer to safely disarm the worker and
> > timers.
> >
> > This bug is easily reproducible on the first transmit when slip is
> > used with the standard 8250 serial driver.
> >
>
> looks reasonable. Thanks for your patch!
> Indeed I can't remember ever using the slcan driver with a real serial
> controller hardware with irq line but only via serial-to-USB adapters :-)
> Due to the recent fixes from Andre and Alexander these two drivers got in
> motion again ...
>
> @Andre/Alexander: Can you please check if slcan still works in your setup. I
> don't have that hardware with me. I only was able to compile it successfully.

Sorry, I don't have access to the serial hardware currently.

Best regards
Alexander