2022-11-29 14:47:47

by Vladimir Oltean

[permalink] [raw]
Subject: [PATCH net-next 00/12] Fix rtnl_mutex deadlock with DPAA2 and SFP modules

This patch set deliberately targets net-next and lacks Fixes: tags due
to caution on my part.

While testing some SFP modules on the Solidrun Honeycomb LX2K platform,
I noticed that rebooting causes a deadlock:

============================================
WARNING: possible recursive locking detected
6.1.0-rc5-07010-ga9b9500ffaac-dirty #656 Not tainted
--------------------------------------------
systemd-shutdow/1 is trying to acquire lock:
ffffa62db6cf42f0 (rtnl_mutex){+.+.}-{4:4}, at: rtnl_lock+0x1c/0x30

but task is already holding lock:
ffffa62db6cf42f0 (rtnl_mutex){+.+.}-{4:4}, at: rtnl_lock+0x1c/0x30

other info that might help us debug this:
Possible unsafe locking scenario:

CPU0
----
lock(rtnl_mutex);
lock(rtnl_mutex);

*** DEADLOCK ***

May be due to missing lock nesting notation

6 locks held by systemd-shutdow/1:
#0: ffffa62db6863c70 (system_transition_mutex){+.+.}-{4:4}, at: __do_sys_reboot+0xd4/0x260
#1: ffff2f2b0176f100 (&dev->mutex){....}-{4:4}, at: device_shutdown+0xf4/0x260
#2: ffff2f2b017be900 (&dev->mutex){....}-{4:4}, at: device_shutdown+0x104/0x260
#3: ffff2f2b017680f0 (&dev->mutex){....}-{4:4}, at: device_release_driver_internal+0x40/0x260
#4: ffff2f2b0e1608f0 (&dev->mutex){....}-{4:4}, at: device_release_driver_internal+0x40/0x260
#5: ffffa62db6cf42f0 (rtnl_mutex){+.+.}-{4:4}, at: rtnl_lock+0x1c/0x30

stack backtrace:
CPU: 6 PID: 1 Comm: systemd-shutdow Not tainted 6.1.0-rc5-07010-ga9b9500ffaac-dirty #656
Hardware name: SolidRun LX2160A Honeycomb (DT)
Call trace:
lock_acquire+0x68/0x84
__mutex_lock+0x98/0x460
mutex_lock_nested+0x2c/0x40
rtnl_lock+0x1c/0x30
sfp_bus_del_upstream+0x1c/0xac
phylink_destroy+0x1c/0x50
dpaa2_mac_disconnect+0x28/0x70
dpaa2_eth_remove+0x1dc/0x1f0
fsl_mc_driver_remove+0x24/0x60
device_remove+0x70/0x80
device_release_driver_internal+0x1f0/0x260
device_links_unbind_consumers+0xe0/0x110
device_release_driver_internal+0x138/0x260
device_release_driver+0x18/0x24
bus_remove_device+0x12c/0x13c
device_del+0x16c/0x424
fsl_mc_device_remove+0x28/0x40
__fsl_mc_device_remove+0x10/0x20
device_for_each_child+0x5c/0xac
dprc_remove+0x94/0xb4
fsl_mc_driver_remove+0x24/0x60
device_remove+0x70/0x80
device_release_driver_internal+0x1f0/0x260
device_release_driver+0x18/0x24
bus_remove_device+0x12c/0x13c
device_del+0x16c/0x424
fsl_mc_bus_remove+0x8c/0x10c
fsl_mc_bus_shutdown+0x10/0x20
platform_shutdown+0x24/0x3c
device_shutdown+0x15c/0x260
kernel_restart+0x40/0xa4
__do_sys_reboot+0x1e4/0x260
__arm64_sys_reboot+0x24/0x30

But fixing this appears to be not so simple. The patch set represents my
attempt to address it.

In short, the problem is that dpaa2_mac_connect() and dpaa2_mac_disconnect()
call 2 phylink functions in a row, one takes rtnl_lock() itself -
phylink_create(), and one which requires rtnl_lock() to be held by the
caller - phylink_fwnode_phy_connect(). The existing approach in the
drivers is too simple. We take rtnl_lock() when calling dpaa2_mac_connect(),
which is what results in the deadlock.

Fixing just that creates another problem. The drivers make use of
rtnl_lock() for serializing with other code paths too. I think I've
found all those code paths, and established other mechanisms for
serializing with them.

Vladimir Oltean (12):
net: dpaa2-eth: don't use -ENOTSUPP error code
net: dpaa2: replace dpaa2_mac_is_type_fixed() with
dpaa2_mac_is_type_phy()
net: dpaa2-mac: absorb phylink_start() call into dpaa2_mac_start()
net: dpaa2-mac: remove defensive check in dpaa2_mac_disconnect()
net: dpaa2-eth: assign priv->mac after dpaa2_mac_connect() call
net: dpaa2-switch: assign port_priv->mac after dpaa2_mac_connect()
call
net: dpaa2: publish MAC stringset to ethtool -S even if MAC is missing
net: dpaa2-switch replace direct MAC access with
dpaa2_switch_port_has_mac()
net: dpaa2-eth: connect to MAC before requesting the "endpoint
changed" IRQ
net: dpaa2-eth: serialize changes to priv->mac with a mutex
net: dpaa2-switch: serialize changes to priv->mac with a mutex
net: dpaa2-mac: move rtnl_lock() only around
phylink_{,dis}connect_phy()

.../freescale/dpaa2/mac-phy-support.rst | 9 +-
.../net/ethernet/freescale/dpaa2/dpaa2-eth.c | 87 ++++++++++++-------
.../net/ethernet/freescale/dpaa2/dpaa2-eth.h | 11 +--
.../ethernet/freescale/dpaa2/dpaa2-ethtool.c | 70 ++++++++++-----
.../net/ethernet/freescale/dpaa2/dpaa2-mac.c | 16 +++-
.../net/ethernet/freescale/dpaa2/dpaa2-mac.h | 10 ++-
.../freescale/dpaa2/dpaa2-switch-ethtool.c | 45 ++++++----
.../ethernet/freescale/dpaa2/dpaa2-switch.c | 59 +++++++++----
.../ethernet/freescale/dpaa2/dpaa2-switch.h | 9 +-
9 files changed, 212 insertions(+), 104 deletions(-)

--
2.34.1


2022-11-29 14:48:05

by Vladimir Oltean

[permalink] [raw]
Subject: [PATCH net-next 06/12] net: dpaa2-switch: assign port_priv->mac after dpaa2_mac_connect() call

The dpaa2-switch has the exact same locking requirements when connected
to a DPMAC, so it needs port_priv->mac to always point either to NULL,
or to a DPMAC with a fully initialized phylink instance.

Make the same preparatory change in the dpaa2-switch driver as in the
dpaa2-eth one.

Signed-off-by: Vladimir Oltean <[email protected]>
---
.../ethernet/freescale/dpaa2/dpaa2-switch.c | 21 +++++++++++--------
1 file changed, 12 insertions(+), 9 deletions(-)

diff --git a/drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.c b/drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.c
index 42d3290ccd8b..3b0963d95f67 100644
--- a/drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.c
+++ b/drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.c
@@ -1449,9 +1449,8 @@ static int dpaa2_switch_port_connect_mac(struct ethsw_port_priv *port_priv)
err = dpaa2_mac_open(mac);
if (err)
goto err_free_mac;
- port_priv->mac = mac;

- if (dpaa2_switch_port_is_type_phy(port_priv)) {
+ if (dpaa2_mac_is_type_phy(mac)) {
err = dpaa2_mac_connect(mac);
if (err) {
netdev_err(port_priv->netdev,
@@ -1461,11 +1460,12 @@ static int dpaa2_switch_port_connect_mac(struct ethsw_port_priv *port_priv)
}
}

+ port_priv->mac = mac;
+
return 0;

err_close_mac:
dpaa2_mac_close(mac);
- port_priv->mac = NULL;
err_free_mac:
kfree(mac);
return err;
@@ -1473,15 +1473,18 @@ static int dpaa2_switch_port_connect_mac(struct ethsw_port_priv *port_priv)

static void dpaa2_switch_port_disconnect_mac(struct ethsw_port_priv *port_priv)
{
- if (dpaa2_switch_port_is_type_phy(port_priv))
- dpaa2_mac_disconnect(port_priv->mac);
+ struct dpaa2_mac *mac = port_priv->mac;

- if (!dpaa2_switch_port_has_mac(port_priv))
+ port_priv->mac = NULL;
+
+ if (!mac)
return;

- dpaa2_mac_close(port_priv->mac);
- kfree(port_priv->mac);
- port_priv->mac = NULL;
+ if (dpaa2_mac_is_type_phy(mac))
+ dpaa2_mac_disconnect(mac);
+
+ dpaa2_mac_close(mac);
+ kfree(mac);
}

static irqreturn_t dpaa2_switch_irq0_handler_thread(int irq_num, void *arg)
--
2.34.1

2022-11-29 14:48:08

by Vladimir Oltean

[permalink] [raw]
Subject: [PATCH net-next 12/12] net: dpaa2-mac: move rtnl_lock() only around phylink_{,dis}connect_phy()

After the introduction of a private mac_lock that serializes access to
priv->mac (and port_priv->mac in the switch), the only remaining purpose
of rtnl_lock() is to satisfy the locking requirements of
phylink_fwnode_phy_connect() and phylink_disconnect_phy().

But the functions these live in, dpaa2_mac_connect() and
dpaa2_mac_disconnect(), have contradictory locking requirements.
While phylink_fwnode_phy_connect() wants rtnl_lock() to be held,
phylink_create() wants it to not be held.

Move the rtnl_lock() from top-level (in the dpaa2-eth and dpaa2-switch
drivers) to only surround the phylink calls that require it, in the
dpaa2-mac library code.

This is possible because dpaa2_mac_connect() and dpaa2_mac_disconnect()
run unlocked, and there isn't any danger of an AB/BA deadlock between
the rtnl_mutex and other private locks.

Signed-off-by: Vladimir Oltean <[email protected]>
---
drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c | 4 ----
drivers/net/ethernet/freescale/dpaa2/dpaa2-mac.c | 5 +++++
drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.c | 4 ----
3 files changed, 5 insertions(+), 8 deletions(-)

diff --git a/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c b/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c
index 3ed54c147e98..0c35abb7d065 100644
--- a/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c
+++ b/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c
@@ -4715,7 +4715,6 @@ static irqreturn_t dpni_irq0_handler_thread(int irq_num, void *arg)
dpaa2_eth_set_mac_addr(netdev_priv(net_dev));
dpaa2_eth_update_tx_fqids(priv);

- rtnl_lock();
/* We can avoid locking because the "endpoint changed" IRQ
* handler is the only one who changes priv->mac at runtime,
* so we are not racing with anyone.
@@ -4725,7 +4724,6 @@ static irqreturn_t dpni_irq0_handler_thread(int irq_num, void *arg)
dpaa2_eth_disconnect_mac(priv);
else
dpaa2_eth_connect_mac(priv);
- rtnl_unlock();
}

return IRQ_HANDLED;
@@ -5045,9 +5043,7 @@ static int dpaa2_eth_remove(struct fsl_mc_device *ls_dev)
else
fsl_mc_free_irqs(ls_dev);

- rtnl_lock();
dpaa2_eth_disconnect_mac(priv);
- rtnl_unlock();
dpaa2_eth_free_rings(priv);
free_percpu(priv->fd);
free_percpu(priv->sgt_cache);
diff --git a/drivers/net/ethernet/freescale/dpaa2/dpaa2-mac.c b/drivers/net/ethernet/freescale/dpaa2/dpaa2-mac.c
index 9d1e7026eaef..8ba4ea4adeb3 100644
--- a/drivers/net/ethernet/freescale/dpaa2/dpaa2-mac.c
+++ b/drivers/net/ethernet/freescale/dpaa2/dpaa2-mac.c
@@ -431,7 +431,9 @@ int dpaa2_mac_connect(struct dpaa2_mac *mac)
}
mac->phylink = phylink;

+ rtnl_lock();
err = phylink_fwnode_phy_connect(mac->phylink, dpmac_node, 0);
+ rtnl_unlock();
if (err) {
netdev_err(net_dev, "phylink_fwnode_phy_connect() = %d\n", err);
goto err_phylink_destroy;
@@ -449,7 +451,10 @@ int dpaa2_mac_connect(struct dpaa2_mac *mac)

void dpaa2_mac_disconnect(struct dpaa2_mac *mac)
{
+ rtnl_lock();
phylink_disconnect_phy(mac->phylink);
+ rtnl_unlock();
+
phylink_destroy(mac->phylink);
dpaa2_pcs_destroy(mac);
of_phy_put(mac->serdes_phy);
diff --git a/drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.c b/drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.c
index 0472e24191ad..f4ae4289c41a 100644
--- a/drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.c
+++ b/drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.c
@@ -1529,7 +1529,6 @@ static irqreturn_t dpaa2_switch_irq0_handler_thread(int irq_num, void *arg)
}

if (status & DPSW_IRQ_EVENT_ENDPOINT_CHANGED) {
- rtnl_lock();
/* We can avoid locking because the "endpoint changed" IRQ
* handler is the only one who changes priv->mac at runtime,
* so we are not racing with anyone.
@@ -1539,7 +1538,6 @@ static irqreturn_t dpaa2_switch_irq0_handler_thread(int irq_num, void *arg)
dpaa2_switch_port_disconnect_mac(port_priv);
else
dpaa2_switch_port_connect_mac(port_priv);
- rtnl_unlock();
}

out:
@@ -2957,9 +2955,7 @@ static void dpaa2_switch_remove_port(struct ethsw_core *ethsw,
{
struct ethsw_port_priv *port_priv = ethsw->ports[port_idx];

- rtnl_lock();
dpaa2_switch_port_disconnect_mac(port_priv);
- rtnl_unlock();
free_netdev(port_priv->netdev);
ethsw->ports[port_idx] = NULL;
}
--
2.34.1

2022-11-29 14:59:57

by Vladimir Oltean

[permalink] [raw]
Subject: [PATCH net-next 11/12] net: dpaa2-switch: serialize changes to priv->mac with a mutex

The dpaa2-switch driver uses a DPMAC in the same way as the dpaa2-eth
driver, so we need to duplicate the locking solution established by the
previous change to the switch driver as well.

Signed-off-by: Vladimir Oltean <[email protected]>
---
.../freescale/dpaa2/dpaa2-switch-ethtool.c | 32 +++++++++++++++----
.../ethernet/freescale/dpaa2/dpaa2-switch.c | 31 ++++++++++++++++--
.../ethernet/freescale/dpaa2/dpaa2-switch.h | 2 ++
3 files changed, 55 insertions(+), 10 deletions(-)

diff --git a/drivers/net/ethernet/freescale/dpaa2/dpaa2-switch-ethtool.c b/drivers/net/ethernet/freescale/dpaa2/dpaa2-switch-ethtool.c
index 76a4b09e2854..6bc1988be311 100644
--- a/drivers/net/ethernet/freescale/dpaa2/dpaa2-switch-ethtool.c
+++ b/drivers/net/ethernet/freescale/dpaa2/dpaa2-switch-ethtool.c
@@ -60,11 +60,18 @@ dpaa2_switch_get_link_ksettings(struct net_device *netdev,
{
struct ethsw_port_priv *port_priv = netdev_priv(netdev);
struct dpsw_link_state state = {0};
- int err = 0;
+ int err;
+
+ mutex_lock(&port_priv->mac_lock);

- if (dpaa2_switch_port_is_type_phy(port_priv))
- return phylink_ethtool_ksettings_get(port_priv->mac->phylink,
- link_ksettings);
+ if (dpaa2_switch_port_is_type_phy(port_priv)) {
+ err = phylink_ethtool_ksettings_get(port_priv->mac->phylink,
+ link_ksettings);
+ mutex_unlock(&port_priv->mac_lock);
+ return err;
+ }
+
+ mutex_unlock(&port_priv->mac_lock);

err = dpsw_if_get_link_state(port_priv->ethsw_data->mc_io, 0,
port_priv->ethsw_data->dpsw_handle,
@@ -99,9 +106,16 @@ dpaa2_switch_set_link_ksettings(struct net_device *netdev,
bool if_running;
int err = 0, ret;

- if (dpaa2_switch_port_is_type_phy(port_priv))
- return phylink_ethtool_ksettings_set(port_priv->mac->phylink,
- link_ksettings);
+ mutex_lock(&port_priv->mac_lock);
+
+ if (dpaa2_switch_port_is_type_phy(port_priv)) {
+ err = phylink_ethtool_ksettings_set(port_priv->mac->phylink,
+ link_ksettings);
+ mutex_unlock(&port_priv->mac_lock);
+ return err;
+ }
+
+ mutex_unlock(&port_priv->mac_lock);

/* Interface needs to be down to change link settings */
if_running = netif_running(netdev);
@@ -189,8 +203,12 @@ static void dpaa2_switch_ethtool_get_stats(struct net_device *netdev,
dpaa2_switch_ethtool_counters[i].name, err);
}

+ mutex_lock(&port_priv->mac_lock);
+
if (dpaa2_switch_port_has_mac(port_priv))
dpaa2_mac_get_ethtool_stats(port_priv->mac, data + i);
+
+ mutex_unlock(&port_priv->mac_lock);
}

const struct ethtool_ops dpaa2_switch_port_ethtool_ops = {
diff --git a/drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.c b/drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.c
index 3b0963d95f67..0472e24191ad 100644
--- a/drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.c
+++ b/drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.c
@@ -602,8 +602,11 @@ static int dpaa2_switch_port_link_state_update(struct net_device *netdev)

/* When we manage the MAC/PHY using phylink there is no need
* to manually update the netif_carrier.
+ * We can avoid locking because we are called from the "link changed"
+ * IRQ handler, which is the same as the "endpoint changed" IRQ handler
+ * (the writer to port_priv->mac), so we cannot race with it.
*/
- if (dpaa2_switch_port_is_type_phy(port_priv))
+ if (dpaa2_mac_is_type_phy(port_priv->mac))
return 0;

/* Interrupts are received even though no one issued an 'ifconfig up'
@@ -683,6 +686,8 @@ static int dpaa2_switch_port_open(struct net_device *netdev)
struct ethsw_core *ethsw = port_priv->ethsw_data;
int err;

+ mutex_lock(&port_priv->mac_lock);
+
if (!dpaa2_switch_port_is_type_phy(port_priv)) {
/* Explicitly set carrier off, otherwise
* netif_carrier_ok() will return true and cause 'ip link show'
@@ -696,6 +701,7 @@ static int dpaa2_switch_port_open(struct net_device *netdev)
port_priv->ethsw_data->dpsw_handle,
port_priv->idx);
if (err) {
+ mutex_unlock(&port_priv->mac_lock);
netdev_err(netdev, "dpsw_if_enable err %d\n", err);
return err;
}
@@ -705,6 +711,8 @@ static int dpaa2_switch_port_open(struct net_device *netdev)
if (dpaa2_switch_port_is_type_phy(port_priv))
dpaa2_mac_start(port_priv->mac);

+ mutex_unlock(&port_priv->mac_lock);
+
return 0;
}

@@ -714,6 +722,8 @@ static int dpaa2_switch_port_stop(struct net_device *netdev)
struct ethsw_core *ethsw = port_priv->ethsw_data;
int err;

+ mutex_lock(&port_priv->mac_lock);
+
if (dpaa2_switch_port_is_type_phy(port_priv)) {
dpaa2_mac_stop(port_priv->mac);
} else {
@@ -721,6 +731,8 @@ static int dpaa2_switch_port_stop(struct net_device *netdev)
netif_carrier_off(netdev);
}

+ mutex_unlock(&port_priv->mac_lock);
+
err = dpsw_if_disable(port_priv->ethsw_data->mc_io, 0,
port_priv->ethsw_data->dpsw_handle,
port_priv->idx);
@@ -1460,7 +1472,9 @@ static int dpaa2_switch_port_connect_mac(struct ethsw_port_priv *port_priv)
}
}

+ mutex_lock(&port_priv->mac_lock);
port_priv->mac = mac;
+ mutex_unlock(&port_priv->mac_lock);

return 0;

@@ -1473,9 +1487,12 @@ static int dpaa2_switch_port_connect_mac(struct ethsw_port_priv *port_priv)

static void dpaa2_switch_port_disconnect_mac(struct ethsw_port_priv *port_priv)
{
- struct dpaa2_mac *mac = port_priv->mac;
+ struct dpaa2_mac *mac;

+ mutex_lock(&port_priv->mac_lock);
+ mac = port_priv->mac;
port_priv->mac = NULL;
+ mutex_unlock(&port_priv->mac_lock);

if (!mac)
return;
@@ -1494,6 +1511,7 @@ static irqreturn_t dpaa2_switch_irq0_handler_thread(int irq_num, void *arg)
struct ethsw_port_priv *port_priv;
u32 status = ~0;
int err, if_id;
+ bool had_mac;

err = dpsw_get_irq_status(ethsw->mc_io, 0, ethsw->dpsw_handle,
DPSW_IRQ_INDEX_IF, &status);
@@ -1512,7 +1530,12 @@ static irqreturn_t dpaa2_switch_irq0_handler_thread(int irq_num, void *arg)

if (status & DPSW_IRQ_EVENT_ENDPOINT_CHANGED) {
rtnl_lock();
- if (dpaa2_switch_port_has_mac(port_priv))
+ /* We can avoid locking because the "endpoint changed" IRQ
+ * handler is the only one who changes priv->mac at runtime,
+ * so we are not racing with anyone.
+ */
+ had_mac = !!port_priv->mac;
+ if (had_mac)
dpaa2_switch_port_disconnect_mac(port_priv);
else
dpaa2_switch_port_connect_mac(port_priv);
@@ -3255,6 +3278,8 @@ static int dpaa2_switch_probe_port(struct ethsw_core *ethsw,
port_priv->netdev = port_netdev;
port_priv->ethsw_data = ethsw;

+ mutex_init(&port_priv->mac_lock);
+
port_priv->idx = port_idx;
port_priv->stp_state = BR_STATE_FORWARDING;

diff --git a/drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.h b/drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.h
index 9898073abe01..42b3ca73f55d 100644
--- a/drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.h
+++ b/drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.h
@@ -161,6 +161,8 @@ struct ethsw_port_priv {

struct dpaa2_switch_filter_block *filter_block;
struct dpaa2_mac *mac;
+ /* Protects against changes to port_priv->mac */
+ struct mutex mac_lock;
};

/* Switch data */
--
2.34.1

2022-11-29 15:01:57

by Vladimir Oltean

[permalink] [raw]
Subject: [PATCH net-next 09/12] net: dpaa2-eth: connect to MAC before requesting the "endpoint changed" IRQ

dpaa2_eth_connect_mac() is called both from dpaa2_eth_probe() and from
dpni_irq0_handler_thread().

It could happen that the DPNI gets connected to a DPMAC on the fsl-mc
bus exactly during probe, as soon as the "endpoint change" interrupt is
requested in dpaa2_eth_setup_irqs(). This will cause the
dpni_irq0_handler_thread() to register a phylink instance for that DPMAC.

Then, the probing function will also try to register a phylink instance
for the same DPMAC, operation which should fail (and this will fail the
probing of the driver).

Reorder dpaa2_eth_setup_irqs() and dpaa2_eth_connect_mac(), such that
dpni_irq0_handler_thread() never races with the DPMAC-related portion of
the probing path.

Also reorder dpaa2_eth_disconnect_mac() to be in the mirror position of
dpaa2_eth_connect_mac() in the teardown path.

Signed-off-by: Vladimir Oltean <[email protected]>
---
.../net/ethernet/freescale/dpaa2/dpaa2-eth.c | 18 +++++++++---------
1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c b/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c
index 4dbf8a1651cd..b77d292cd960 100644
--- a/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c
+++ b/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c
@@ -4899,6 +4899,10 @@ static int dpaa2_eth_probe(struct fsl_mc_device *dpni_dev)
}
#endif

+ err = dpaa2_eth_connect_mac(priv);
+ if (err)
+ goto err_connect_mac;
+
err = dpaa2_eth_setup_irqs(dpni_dev);
if (err) {
netdev_warn(net_dev, "Failed to set link interrupt, fall back to polling\n");
@@ -4911,10 +4915,6 @@ static int dpaa2_eth_probe(struct fsl_mc_device *dpni_dev)
priv->do_link_poll = true;
}

- err = dpaa2_eth_connect_mac(priv);
- if (err)
- goto err_connect_mac;
-
err = dpaa2_eth_dl_alloc(priv);
if (err)
goto err_dl_register;
@@ -4948,13 +4948,13 @@ static int dpaa2_eth_probe(struct fsl_mc_device *dpni_dev)
err_dl_trap_register:
dpaa2_eth_dl_free(priv);
err_dl_register:
- dpaa2_eth_disconnect_mac(priv);
-err_connect_mac:
if (priv->do_link_poll)
kthread_stop(priv->poll_thread);
else
fsl_mc_free_irqs(dpni_dev);
err_poll_thread:
+ dpaa2_eth_disconnect_mac(priv);
+err_connect_mac:
dpaa2_eth_free_rings(priv);
err_alloc_rings:
err_csum:
@@ -5002,9 +5002,6 @@ static int dpaa2_eth_remove(struct fsl_mc_device *ls_dev)
#endif

unregister_netdev(net_dev);
- rtnl_lock();
- dpaa2_eth_disconnect_mac(priv);
- rtnl_unlock();

dpaa2_eth_dl_port_del(priv);
dpaa2_eth_dl_traps_unregister(priv);
@@ -5015,6 +5012,9 @@ static int dpaa2_eth_remove(struct fsl_mc_device *ls_dev)
else
fsl_mc_free_irqs(ls_dev);

+ rtnl_lock();
+ dpaa2_eth_disconnect_mac(priv);
+ rtnl_unlock();
dpaa2_eth_free_rings(priv);
free_percpu(priv->fd);
free_percpu(priv->sgt_cache);
--
2.34.1

2022-11-29 15:14:06

by Vladimir Oltean

[permalink] [raw]
Subject: [PATCH net-next 04/12] net: dpaa2-mac: remove defensive check in dpaa2_mac_disconnect()

dpaa2_mac_disconnect() will only be called with a NULL mac->phylink if
dpaa2_mac_connect() failed, or was never called.

The callers are these:

dpaa2_eth_disconnect_mac():

if (dpaa2_eth_is_type_phy(priv))
dpaa2_mac_disconnect(priv->mac);

dpaa2_switch_port_disconnect_mac():

if (dpaa2_switch_port_is_type_phy(port_priv))
dpaa2_mac_disconnect(port_priv->mac);

priv->mac can be NULL, but in that case, dpaa2_eth_is_type_phy() returns
false, and dpaa2_mac_disconnect() is never called. Similar for
dpaa2-switch.

When priv->mac is non-NULL, it means that dpaa2_mac_connect() returned
zero (success), and therefore, priv->mac->phylink is also a valid
pointer.

Signed-off-by: Vladimir Oltean <[email protected]>
---
drivers/net/ethernet/freescale/dpaa2/dpaa2-mac.c | 3 ---
1 file changed, 3 deletions(-)

diff --git a/drivers/net/ethernet/freescale/dpaa2/dpaa2-mac.c b/drivers/net/ethernet/freescale/dpaa2/dpaa2-mac.c
index c22ce1c871f3..9d1e7026eaef 100644
--- a/drivers/net/ethernet/freescale/dpaa2/dpaa2-mac.c
+++ b/drivers/net/ethernet/freescale/dpaa2/dpaa2-mac.c
@@ -449,9 +449,6 @@ int dpaa2_mac_connect(struct dpaa2_mac *mac)

void dpaa2_mac_disconnect(struct dpaa2_mac *mac)
{
- if (!mac->phylink)
- return;
-
phylink_disconnect_phy(mac->phylink);
phylink_destroy(mac->phylink);
dpaa2_pcs_destroy(mac);
--
2.34.1

2022-11-29 19:14:50

by Andrew Lunn

[permalink] [raw]
Subject: Re: [PATCH net-next 04/12] net: dpaa2-mac: remove defensive check in dpaa2_mac_disconnect()

On Tue, Nov 29, 2022 at 04:12:13PM +0200, Vladimir Oltean wrote:
> dpaa2_mac_disconnect() will only be called with a NULL mac->phylink if
> dpaa2_mac_connect() failed, or was never called.
>
> The callers are these:
>
> dpaa2_eth_disconnect_mac():
>
> if (dpaa2_eth_is_type_phy(priv))
> dpaa2_mac_disconnect(priv->mac);
>
> dpaa2_switch_port_disconnect_mac():
>
> if (dpaa2_switch_port_is_type_phy(port_priv))
> dpaa2_mac_disconnect(port_priv->mac);
>
> priv->mac can be NULL, but in that case, dpaa2_eth_is_type_phy() returns
> false, and dpaa2_mac_disconnect() is never called. Similar for
> dpaa2-switch.
>
> When priv->mac is non-NULL, it means that dpaa2_mac_connect() returned
> zero (success), and therefore, priv->mac->phylink is also a valid
> pointer.
>
> Signed-off-by: Vladimir Oltean <[email protected]>

Reviewed-by: Andrew Lunn <[email protected]>

Andrew

2022-11-30 16:53:29

by Ioana Ciornei

[permalink] [raw]
Subject: Re: [PATCH net-next 00/12] Fix rtnl_mutex deadlock with DPAA2 and SFP modules

On Tue, Nov 29, 2022 at 04:12:09PM +0200, Vladimir Oltean wrote:
> This patch set deliberately targets net-next and lacks Fixes: tags due
> to caution on my part.
>
> While testing some SFP modules on the Solidrun Honeycomb LX2K platform,
> I noticed that rebooting causes a deadlock:
>
> ============================================
> WARNING: possible recursive locking detected
> 6.1.0-rc5-07010-ga9b9500ffaac-dirty #656 Not tainted
> --------------------------------------------
> systemd-shutdow/1 is trying to acquire lock:
> ffffa62db6cf42f0 (rtnl_mutex){+.+.}-{4:4}, at: rtnl_lock+0x1c/0x30
>
> but task is already holding lock:
> ffffa62db6cf42f0 (rtnl_mutex){+.+.}-{4:4}, at: rtnl_lock+0x1c/0x30
>
> other info that might help us debug this:
> Possible unsafe locking scenario:
>
> CPU0
> ----
> lock(rtnl_mutex);
> lock(rtnl_mutex);
>
> *** DEADLOCK ***
>
> May be due to missing lock nesting notation
>
> 6 locks held by systemd-shutdow/1:
> #0: ffffa62db6863c70 (system_transition_mutex){+.+.}-{4:4}, at: __do_sys_reboot+0xd4/0x260
> #1: ffff2f2b0176f100 (&dev->mutex){....}-{4:4}, at: device_shutdown+0xf4/0x260
> #2: ffff2f2b017be900 (&dev->mutex){....}-{4:4}, at: device_shutdown+0x104/0x260
> #3: ffff2f2b017680f0 (&dev->mutex){....}-{4:4}, at: device_release_driver_internal+0x40/0x260
> #4: ffff2f2b0e1608f0 (&dev->mutex){....}-{4:4}, at: device_release_driver_internal+0x40/0x260
> #5: ffffa62db6cf42f0 (rtnl_mutex){+.+.}-{4:4}, at: rtnl_lock+0x1c/0x30
>
> stack backtrace:
> CPU: 6 PID: 1 Comm: systemd-shutdow Not tainted 6.1.0-rc5-07010-ga9b9500ffaac-dirty #656
> Hardware name: SolidRun LX2160A Honeycomb (DT)
> Call trace:
> lock_acquire+0x68/0x84
> __mutex_lock+0x98/0x460
> mutex_lock_nested+0x2c/0x40
> rtnl_lock+0x1c/0x30
> sfp_bus_del_upstream+0x1c/0xac
> phylink_destroy+0x1c/0x50
> dpaa2_mac_disconnect+0x28/0x70
> dpaa2_eth_remove+0x1dc/0x1f0
> fsl_mc_driver_remove+0x24/0x60
> device_remove+0x70/0x80
> device_release_driver_internal+0x1f0/0x260
> device_links_unbind_consumers+0xe0/0x110
> device_release_driver_internal+0x138/0x260
> device_release_driver+0x18/0x24
> bus_remove_device+0x12c/0x13c
> device_del+0x16c/0x424
> fsl_mc_device_remove+0x28/0x40
> __fsl_mc_device_remove+0x10/0x20
> device_for_each_child+0x5c/0xac
> dprc_remove+0x94/0xb4
> fsl_mc_driver_remove+0x24/0x60
> device_remove+0x70/0x80
> device_release_driver_internal+0x1f0/0x260
> device_release_driver+0x18/0x24
> bus_remove_device+0x12c/0x13c
> device_del+0x16c/0x424
> fsl_mc_bus_remove+0x8c/0x10c
> fsl_mc_bus_shutdown+0x10/0x20
> platform_shutdown+0x24/0x3c
> device_shutdown+0x15c/0x260
> kernel_restart+0x40/0xa4
> __do_sys_reboot+0x1e4/0x260
> __arm64_sys_reboot+0x24/0x30
>
> But fixing this appears to be not so simple. The patch set represents my
> attempt to address it.
>
> In short, the problem is that dpaa2_mac_connect() and dpaa2_mac_disconnect()
> call 2 phylink functions in a row, one takes rtnl_lock() itself -
> phylink_create(), and one which requires rtnl_lock() to be held by the
> caller - phylink_fwnode_phy_connect(). The existing approach in the
> drivers is too simple. We take rtnl_lock() when calling dpaa2_mac_connect(),
> which is what results in the deadlock.
>
> Fixing just that creates another problem. The drivers make use of
> rtnl_lock() for serializing with other code paths too. I think I've
> found all those code paths, and established other mechanisms for
> serializing with them.
>
> Vladimir Oltean (12):
> net: dpaa2-eth: don't use -ENOTSUPP error code
> net: dpaa2: replace dpaa2_mac_is_type_fixed() with
> dpaa2_mac_is_type_phy()
> net: dpaa2-mac: absorb phylink_start() call into dpaa2_mac_start()
> net: dpaa2-mac: remove defensive check in dpaa2_mac_disconnect()
> net: dpaa2-eth: assign priv->mac after dpaa2_mac_connect() call
> net: dpaa2-switch: assign port_priv->mac after dpaa2_mac_connect()
> call
> net: dpaa2: publish MAC stringset to ethtool -S even if MAC is missing
> net: dpaa2-switch replace direct MAC access with
> dpaa2_switch_port_has_mac()
> net: dpaa2-eth: connect to MAC before requesting the "endpoint
> changed" IRQ
> net: dpaa2-eth: serialize changes to priv->mac with a mutex
> net: dpaa2-switch: serialize changes to priv->mac with a mutex
> net: dpaa2-mac: move rtnl_lock() only around
> phylink_{,dis}connect_phy()
>
> .../freescale/dpaa2/mac-phy-support.rst | 9 +-
> .../net/ethernet/freescale/dpaa2/dpaa2-eth.c | 87 ++++++++++++-------
> .../net/ethernet/freescale/dpaa2/dpaa2-eth.h | 11 +--
> .../ethernet/freescale/dpaa2/dpaa2-ethtool.c | 70 ++++++++++-----
> .../net/ethernet/freescale/dpaa2/dpaa2-mac.c | 16 +++-
> .../net/ethernet/freescale/dpaa2/dpaa2-mac.h | 10 ++-
> .../freescale/dpaa2/dpaa2-switch-ethtool.c | 45 ++++++----
> .../ethernet/freescale/dpaa2/dpaa2-switch.c | 59 +++++++++----
> .../ethernet/freescale/dpaa2/dpaa2-switch.h | 9 +-
> 9 files changed, 212 insertions(+), 104 deletions(-)

For the entire patch set:

Reviewed-by: Ioana Ciornei <[email protected]>
Tested-by: Ioana Ciornei <[email protected]>

2022-12-01 13:21:22

by patchwork-bot+netdevbpf

[permalink] [raw]
Subject: Re: [PATCH net-next 00/12] Fix rtnl_mutex deadlock with DPAA2 and SFP modules

Hello:

This series was applied to netdev/net-next.git (master)
by Paolo Abeni <[email protected]>:

On Tue, 29 Nov 2022 16:12:09 +0200 you wrote:
> This patch set deliberately targets net-next and lacks Fixes: tags due
> to caution on my part.
>
> While testing some SFP modules on the Solidrun Honeycomb LX2K platform,
> I noticed that rebooting causes a deadlock:
>
> ============================================
> WARNING: possible recursive locking detected
> 6.1.0-rc5-07010-ga9b9500ffaac-dirty #656 Not tainted
>
> [...]

Here is the summary with links:
- [net-next,01/12] net: dpaa2-eth: don't use -ENOTSUPP error code
https://git.kernel.org/netdev/net-next/c/91c71bf14da4
- [net-next,02/12] net: dpaa2: replace dpaa2_mac_is_type_fixed() with dpaa2_mac_is_type_phy()
https://git.kernel.org/netdev/net-next/c/320fefa9e2ed
- [net-next,03/12] net: dpaa2-mac: absorb phylink_start() call into dpaa2_mac_start()
https://git.kernel.org/netdev/net-next/c/385333888154
- [net-next,04/12] net: dpaa2-mac: remove defensive check in dpaa2_mac_disconnect()
https://git.kernel.org/netdev/net-next/c/ccbd7822950f
- [net-next,05/12] net: dpaa2-eth: assign priv->mac after dpaa2_mac_connect() call
https://git.kernel.org/netdev/net-next/c/02d61948e8da
- [net-next,06/12] net: dpaa2-switch: assign port_priv->mac after dpaa2_mac_connect() call
https://git.kernel.org/netdev/net-next/c/88d64367cea0
- [net-next,07/12] net: dpaa2: publish MAC stringset to ethtool -S even if MAC is missing
https://git.kernel.org/netdev/net-next/c/29811d6e19d7
- [net-next,08/12] net: dpaa2-switch replace direct MAC access with dpaa2_switch_port_has_mac()
https://git.kernel.org/netdev/net-next/c/bc230671bfb2
- [net-next,09/12] net: dpaa2-eth: connect to MAC before requesting the "endpoint changed" IRQ
https://git.kernel.org/netdev/net-next/c/55f90a4d07ec
- [net-next,10/12] net: dpaa2-eth: serialize changes to priv->mac with a mutex
https://git.kernel.org/netdev/net-next/c/2291982e29b1
- [net-next,11/12] net: dpaa2-switch: serialize changes to priv->mac with a mutex
https://git.kernel.org/netdev/net-next/c/3c7f44fa9c4c
- [net-next,12/12] net: dpaa2-mac: move rtnl_lock() only around phylink_{,dis}connect_phy()
https://git.kernel.org/netdev/net-next/c/87db82cb6149

You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html