2020-11-12 14:49:23

by Wong, Vee Khee

Subject: [PATCH net 1/1] net: stmmac: Use rtnl_lock/unlock on netif_set_real_num_rx_queues() call

Fix an issue where a stack dump is printed during the suspend/resume flow
because netif_set_real_num_rx_queues() is called without rtnl_lock held.

Fixes: 686cff3d7022 ("net: stmmac: Fix incorrect location to set real_num_rx|tx_queues")
Reported-by: Christophe ROULLIER <[email protected]>
Tested-by: Christophe ROULLIER <[email protected]>
Cc: Alexandre TORGUE <[email protected]>
Reviewed-by: Ong Boon Leong <[email protected]>
Signed-off-by: Wong Vee Khee <[email protected]>
---
drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 3 +++
1 file changed, 3 insertions(+)

diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index ba855465a2db..33e280040000 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -5278,7 +5278,10 @@ int stmmac_resume(struct device *dev)

 	stmmac_clear_descriptors(priv);

+	rtnl_lock();
 	stmmac_hw_setup(ndev, false);
+	rtnl_unlock();
+
 	stmmac_init_coalesce(priv);
 	stmmac_set_rx_mode(ndev);

--
2.17.0


2020-11-14 20:48:04

by Jakub Kicinski

Subject: Re: [PATCH net 1/1] net: stmmac: Use rtnl_lock/unlock on netif_set_real_num_rx_queues() call

On Thu, 12 Nov 2020 22:49:48 +0800 Wong Vee Khee wrote:
> Fix an issue where a stack dump is printed during the suspend/resume flow
> because netif_set_real_num_rx_queues() is called without rtnl_lock held.
>
> Fixes: 686cff3d7022 ("net: stmmac: Fix incorrect location to set real_num_rx|tx_queues")
> Reported-by: Christophe ROULLIER <[email protected]>
> Tested-by: Christophe ROULLIER <[email protected]>
> Cc: Alexandre TORGUE <[email protected]>
> Reviewed-by: Ong Boon Leong <[email protected]>
> Signed-off-by: Wong Vee Khee <[email protected]>
> ---
> drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> index ba855465a2db..33e280040000 100644
> --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> @@ -5278,7 +5278,10 @@ int stmmac_resume(struct device *dev)
>
>  	stmmac_clear_descriptors(priv);
>
> +	rtnl_lock();
>  	stmmac_hw_setup(ndev, false);
> +	rtnl_unlock();
> +
>  	stmmac_init_coalesce(priv);
>  	stmmac_set_rx_mode(ndev);
>

This doesn't look quite right. The new rtnl_lock() call is made while
priv->lock is held, but priv->lock is sometimes taken under rtnl_lock.
The two paths acquire the locks in opposite order, so theoretically
there could be a deadlock.
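
To make the inversion concrete, here is a small userspace model built on
pthread mutexes. It is only a sketch: `rtnl` and `priv_lock` are stand-ins
for the kernel's rtnl_lock and priv->lock, and inversion_would_block() is a
hypothetical helper, not part of stmmac. It uses trylock so that the stall
is reported rather than actually hanging.

```c
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/* Userspace stand-ins (assumptions, not kernel API). */
static pthread_mutex_t rtnl = PTHREAD_MUTEX_INITIALIZER;      /* models rtnl_lock */
static pthread_mutex_t priv_lock = PTHREAD_MUTEX_INITIALIZER; /* models priv->lock */

/*
 * Path A (for example a callback that runs with rtnl held) owns rtnl and
 * wants priv_lock. Path B (the resume path as patched) owns priv_lock and
 * wants rtnl. Both owners are simulated from a single thread; trylock
 * shows that neither acquisition can succeed, which with blocking locks
 * on two threads would be a deadlock.
 */
static bool inversion_would_block(void)
{
	bool a_blocked, b_blocked;

	pthread_mutex_lock(&rtnl);      /* path A holds rtnl */
	pthread_mutex_lock(&priv_lock); /* path B holds priv_lock */

	/* Path A now wants priv_lock, path B now wants rtnl. */
	a_blocked = pthread_mutex_trylock(&priv_lock) == EBUSY;
	b_blocked = pthread_mutex_trylock(&rtnl) == EBUSY;

	pthread_mutex_unlock(&priv_lock);
	pthread_mutex_unlock(&rtnl);

	return a_blocked && b_blocked;
}
```

In this model both attempts fail with EBUSY, so the function returns true.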

You should probably take rtnl_lock() before priv->lock and release
it after. It's pretty common for drivers to hold rtnl_lock around
most of the resume method.

With larger context:


 	mutex_lock(&priv->lock);

 	stmmac_reset_queues_param(priv);

 	stmmac_clear_descriptors(priv);

+	rtnl_lock();
 	stmmac_hw_setup(ndev, false);
+	rtnl_unlock();
+
 	stmmac_init_coalesce(priv);
 	stmmac_set_rx_mode(ndev);

 	stmmac_restore_hw_vlan_rx_fltr(priv, ndev, priv->hw);

 	stmmac_enable_all_queues(priv);

 	mutex_unlock(&priv->lock);
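
For illustration, the ordering suggested above (rtnl_lock taken before
priv->lock and released after it) can be modeled in userspace with pthread
mutexes. This is a sketch under stated assumptions, not the actual driver
change: `rtnl`, `priv_lock`, `rtnl_held`, model_set_real_num_rx_queues() and
model_resume() are all hypothetical stand-ins, and the model helper returns
an error where the kernel would print a stack dump.

```c
#include <pthread.h>
#include <stdbool.h>

/* Userspace stand-ins (assumptions, not kernel API). */
static pthread_mutex_t rtnl = PTHREAD_MUTEX_INITIALIZER;      /* models rtnl_lock */
static pthread_mutex_t priv_lock = PTHREAD_MUTEX_INITIALIZER; /* models priv->lock */
static bool rtnl_held;         /* only written while rtnl is held */
static int real_num_rx_queues;

/*
 * Models a helper that must run under rtnl, in the spirit of
 * netif_set_real_num_rx_queues(). It fails when rtnl is not held.
 */
static int model_set_real_num_rx_queues(int n)
{
	if (!rtnl_held)
		return -1;
	real_num_rx_queues = n;
	return 0;
}

/* Models the resume path with rtnl as the outer lock. */
static int model_resume(int rx_queues)
{
	int ret;

	pthread_mutex_lock(&rtnl);        /* outer lock first */
	rtnl_held = true;
	pthread_mutex_lock(&priv_lock);   /* inner lock second */

	/* Stand-in for the hardware setup step that sets the queue count. */
	ret = model_set_real_num_rx_queues(rx_queues);

	pthread_mutex_unlock(&priv_lock); /* release in reverse order */
	rtnl_held = false;
	pthread_mutex_unlock(&rtnl);

	return ret;
}
```

The ordering only helps if every path that needs both locks takes them in
the same rtnl-then-priv->lock order.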