2016-12-02 23:06:30

by Lino Sanfilippo

[permalink] [raw]
Subject: Avoid deadlock situation due to use of xmit_lock

Hi,

after stumbling over a potential deadlock situation in the altera driver
(see http://marc.info/?l=linux-netdev&m=148054615230447&w=2), I checked
all other ethernet drivers for the same issue and actually found it in 2
more, namely stmmac, and sxgbe. Please see the commit messages for a
description of the problem.
These 2 patches fix the concerning drivers.

Regards,
Lino


2016-12-02 23:06:34

by Lino Sanfilippo

[permalink] [raw]
Subject: [PATCH 1/2] net: ethernet: sxgbe: do not use xmit_lock in tx completion handler

The driver already uses its private lock for synchronization between the
xmit function and the xmit completion handler, making the additional use of
the xmit_lock unnecessary.
Furthermore the driver does not set NETIF_F_LLTX resulting in xmit to be
called with the xmit_lock held and then taking the private lock.
On the other hand the xmit completion handler uses the reverse locking
order, by first taking the private lock, and then the xmit_lock, which
leads to the potential danger of a deadlock.

Fix this issue by not taking the xmit_lock in the completion handler.
By doing this also remove an unnecessary double check for a stopped tx
queue.

Signed-off-by: Lino Sanfilippo <[email protected]>
---
drivers/net/ethernet/samsung/sxgbe/sxgbe_main.c | 11 +++--------
1 file changed, 3 insertions(+), 8 deletions(-)

Please note that this patch is only compile tested.

diff --git a/drivers/net/ethernet/samsung/sxgbe/sxgbe_main.c b/drivers/net/ethernet/samsung/sxgbe/sxgbe_main.c
index 5dbe406..578cbec 100644
--- a/drivers/net/ethernet/samsung/sxgbe/sxgbe_main.c
+++ b/drivers/net/ethernet/samsung/sxgbe/sxgbe_main.c
@@ -782,14 +782,9 @@ static void sxgbe_tx_queue_clean(struct sxgbe_tx_queue *tqueue)
/* wake up queue */
if (unlikely(netif_tx_queue_stopped(dev_txq) &&
sxgbe_tx_avail(tqueue, tx_rsize) > SXGBE_TX_THRESH(priv))) {
- netif_tx_lock(priv->dev);
- if (netif_tx_queue_stopped(dev_txq) &&
- sxgbe_tx_avail(tqueue, tx_rsize) > SXGBE_TX_THRESH(priv)) {
- if (netif_msg_tx_done(priv))
- pr_debug("%s: restart transmit\n", __func__);
- netif_tx_wake_queue(dev_txq);
- }
- netif_tx_unlock(priv->dev);
+ if (netif_msg_tx_done(priv))
+ pr_debug("%s: restart transmit\n", __func__);
+ netif_tx_wake_queue(dev_txq);
}

spin_unlock(&tqueue->tx_lock);
--
1.9.1

2016-12-02 23:06:57

by Lino Sanfilippo

[permalink] [raw]
Subject: [PATCH 2/2] net: ethernet: stmmac: do not use xmit_lock in tx completion handler

The driver already uses its private lock for synchronization between the
xmit function and the xmit completion handler, making the additional use of
the xmit_lock unnecessary.
Furthermore the driver does not set NETIF_F_LLTX resulting in xmit to be
called with the xmit_lock held and then taking the private lock.
On the other hand the xmit completion handler uses the reverse locking
order, by first taking the private lock, and then the xmit_lock, which
leads to the potential danger of a deadlock.

Fix this issue by not taking the xmit_lock in the completion handler.
By doing this also remove an unnecessary double check for a stopped tx
queue.

Signed-off-by: Lino Sanfilippo <[email protected]>
---
drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 11 +++--------
1 file changed, 3 insertions(+), 8 deletions(-)

Please note that this patch is only compile tested.

diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index 48a4e84..8def423 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -1380,14 +1380,9 @@ static void stmmac_tx_clean(struct stmmac_priv *priv)

if (unlikely(netif_queue_stopped(priv->dev) &&
stmmac_tx_avail(priv) > STMMAC_TX_THRESH)) {
- netif_tx_lock(priv->dev);
- if (netif_queue_stopped(priv->dev) &&
- stmmac_tx_avail(priv) > STMMAC_TX_THRESH) {
- netif_dbg(priv, tx_done, priv->dev,
- "%s: restart transmit\n", __func__);
- netif_wake_queue(priv->dev);
- }
- netif_tx_unlock(priv->dev);
+ netif_dbg(priv, tx_done, priv->dev,
+ "%s: restart transmit\n", __func__);
+ netif_wake_queue(priv->dev);
}

if ((priv->eee_enabled) && (!priv->tx_path_in_lpi_mode)) {
--
1.9.1

2016-12-06 15:06:11

by David Miller

[permalink] [raw]
Subject: Re: Avoid deadlock situation due to use of xmit_lock

From: Lino Sanfilippo <[email protected]>
Date: Sat, 3 Dec 2016 00:06:04 +0100

> after stumbling over a potential deadlock situation in the altera driver
> (see http://marc.info/?l=linux-netdev&m=148054615230447&w=2), I checked
> all other ethernet drivers for the same issue and actually found it in 2
> more, namely stmmac, and sxgbe. Please see the commit messages for a
> description of the problem.
> These 2 patches fix the concerning drivers.

First of all, I don't want to apply these patches without proper testing
and ACKs from the individual driver maintainers.

For both of these drivers, this situation only exists because the TX
path uses the unnecessary ->tx_lock. This private lock should be
removed completely and the driver should use the lock the mid-layer
already holds in the transmit path and take it in the TX reclaim path
instead of the private ->tx_lock.

2016-12-06 19:10:59

by Lino Sanfilippo

[permalink] [raw]
Subject: Re: Avoid deadlock situation due to use of xmit_lock

Hi,

On 06.12.2016 16:06, David Miller wrote:
> From: Lino Sanfilippo <[email protected]>
> Date: Sat, 3 Dec 2016 00:06:04 +0100
>
>> after stumbling over a potential deadlock situation in the altera driver
>> (see http://marc.info/?l=linux-netdev&m=148054615230447&w=2), I checked
>> all other ethernet drivers for the same issue and actually found it in 2
>> more, namely stmmac, and sxgbe. Please see the commit messages for a
>> description of the problem.
>> These 2 patches fix the concerning drivers.
>
> First of all, I don't want to apply these patches without proper testing
> and ACKs from the individual driver maintainers.
>
> For both of these drivers, this situation only exists because the TX
> path uses the unnecessary ->tx_lock. This private lock should be
> removed completely and the driver should use the lock the mid-layer
> already holds in the transmit path and take it in the TX reclaim path
> instead of the private ->tx_lock.
>

Ok, I will prepare a new set of patches to remove those private locks entirely.

Regards,
Lino