2021-03-29 15:22:41

by Eddie James

[permalink] [raw]
Subject: [PATCH] net/ncsi: Avoid channel_monitor hrtimer deadlock

From: Milton Miller <[email protected]>

Calling ncsi_stop_channel_monitor from channel_monitor is a guaranteed
deadlock on SMP because stop calls del_timer_sync on the timer that
inoked channel_monitor as its timer function.

Recognise the inherent race of marking the monitor disabled before
deleting the timer by just returning if enable was cleared. After
a timeout (the default case -- reset to START when response received)
just mark the monitor.enabled false.

If the channel has an entrie on the channel_queue list, or if the
state is not ACTIVE or INACTIVE, then warn and mark the timer stopped
and don't restart, as the locking is broken somehow.

Fixes: 0795fb2021f0 ("net/ncsi: Stop monitor if channel times out or is inactive")
Signed-off-by: Milton Miller <[email protected]>
Signed-off-by: Eddie James <[email protected]>
---
net/ncsi/ncsi-manage.c | 20 +++++++++++++-------
1 file changed, 13 insertions(+), 7 deletions(-)

diff --git a/net/ncsi/ncsi-manage.c b/net/ncsi/ncsi-manage.c
index a9cb355324d1..ffff8da707b8 100644
--- a/net/ncsi/ncsi-manage.c
+++ b/net/ncsi/ncsi-manage.c
@@ -105,13 +105,20 @@ static void ncsi_channel_monitor(struct timer_list *t)
monitor_state = nc->monitor.state;
spin_unlock_irqrestore(&nc->lock, flags);

- if (!enabled || chained) {
- ncsi_stop_channel_monitor(nc);
- return;
- }
+ if (!enabled)
+ return; /* expected race disabling timer */
+ if (WARN_ON_ONCE(chained))
+ goto bad_state;
+
if (state != NCSI_CHANNEL_INACTIVE &&
state != NCSI_CHANNEL_ACTIVE) {
- ncsi_stop_channel_monitor(nc);
+bad_state:
+ netdev_warn(ndp->ndev.dev,
+ "Bad NCSI monitor state channel %d 0x%x %s queue\n",
+ nc->id, state, chained ? "on" : "off");
+ spin_lock_irqsave(&nc->lock, flags);
+ nc->monitor.enabled = false;
+ spin_unlock_irqrestore(&nc->lock, flags);
return;
}

@@ -136,10 +143,9 @@ static void ncsi_channel_monitor(struct timer_list *t)
ncsi_report_link(ndp, true);
ndp->flags |= NCSI_DEV_RESHUFFLE;

- ncsi_stop_channel_monitor(nc);
-
ncm = &nc->modes[NCSI_MODE_LINK];
spin_lock_irqsave(&nc->lock, flags);
+ nc->monitor.enabled = false;
nc->state = NCSI_CHANNEL_INVISIBLE;
ncm->data[2] &= ~0x1;
spin_unlock_irqrestore(&nc->lock, flags);
--
2.27.0


2021-03-30 20:22:25

by patchwork-bot+netdevbpf

[permalink] [raw]
Subject: Re: [PATCH] net/ncsi: Avoid channel_monitor hrtimer deadlock

Hello:

This patch was applied to netdev/net.git (refs/heads/master):

On Mon, 29 Mar 2021 10:20:39 -0500 you wrote:
> From: Milton Miller <[email protected]>
>
> Calling ncsi_stop_channel_monitor from channel_monitor is a guaranteed
> deadlock on SMP because stop calls del_timer_sync on the timer that
> inoked channel_monitor as its timer function.
>
> Recognise the inherent race of marking the monitor disabled before
> deleting the timer by just returning if enable was cleared. After
> a timeout (the default case -- reset to START when response received)
> just mark the monitor.enabled false.
>
> [...]

Here is the summary with links:
- net/ncsi: Avoid channel_monitor hrtimer deadlock
https://git.kernel.org/netdev/net/c/03cb4d05b4ea

You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html