2024-03-07 19:30:09

by Ben Greear

Subject: [PATCH 5/6] wifi: mt76: mt7996: Mitigate mcu communication loss.

From: Ben Greear <[email protected]>

Many calls that end up sending MCU messages to the firmware hold
RTNL or other important locks. When the radio stops answering, each
of those calls waits out its timeout with the lock held, so the
entire system becomes very sluggish.

Add a timeout counter, and if the radio times out 3 times in a row,
consider it dead and no longer attempt to talk to it.

Signed-off-by: Ben Greear <[email protected]>
---
drivers/net/wireless/mediatek/mt76/mt7996/mcu.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/net/wireless/mediatek/mt76/mt7996/mcu.c b/drivers/net/wireless/mediatek/mt76/mt7996/mcu.c
index 5550671cdaf6..77c89d2d2423 100644
--- a/drivers/net/wireless/mediatek/mt76/mt7996/mcu.c
+++ b/drivers/net/wireless/mediatek/mt76/mt7996/mcu.c
@@ -202,14 +202,16 @@ mt7996_mcu_parse_response(struct mt76_dev *mdev, int cmd,
if (!skb) {
const char *first = "Secondary";

+ mdev->mcu_timeouts++;
if (!mdev->first_failed_mcu_cmd)
first = "Initial";

dev_err(mdev->dev,
- "MCU: %s Failure: Message %08x (cid %lx ext_cid: %lx seq %d) timeout. Last successful cmd: 0x%x\n",
+ "MCU: %s Failure: Message %08x (cid %lx ext_cid: %lx seq %d) timeout (%d/%d). Last successful cmd: 0x%x\n",
first,
cmd, FIELD_GET(__MCU_CMD_FIELD_ID, cmd),
FIELD_GET(__MCU_CMD_FIELD_EXT_ID, cmd), seq,
+ mdev->mcu_timeouts, MAX_MCU_TIMEOUTS,
mdev->last_successful_mcu_cmd);

if (!mdev->first_failed_mcu_cmd)
@@ -217,6 +219,7 @@ mt7996_mcu_parse_response(struct mt76_dev *mdev, int cmd,
return -ETIMEDOUT;
}

+ mdev->mcu_timeouts = 0;
mdev->last_successful_mcu_cmd = cmd;

if (mdev->first_failed_mcu_cmd) {
--
2.42.0
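
The counting scheme described in the commit message can be sketched in plain userspace C. This is only an illustration under stated assumptions, not the driver code: `struct fake_dev`, `mcu_is_dead()`, `parse_response()` and `send_msg()` are hypothetical stand-ins, the value 3 for `MAX_MCU_TIMEOUTS` is taken from the commit message, and the place where the limit is checked (as well as the `-EIO` return) is assumed, since the hunk above only shows the counter being incremented and reset.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Assumed value: the commit message says "3 times in a row". */
#define MAX_MCU_TIMEOUTS 3

/* Hypothetical stand-in for the relevant fields of struct mt76_dev. */
struct fake_dev {
	int mcu_timeouts;            /* consecutive timeouts so far */
	int last_successful_mcu_cmd; /* last command that got a reply */
};

/* Hypothetical helper: true once the radio is considered dead. */
static bool mcu_is_dead(const struct fake_dev *dev)
{
	return dev->mcu_timeouts >= MAX_MCU_TIMEOUTS;
}

/* Mirrors the counting in the hunk: got_response == false models a
 * NULL skb, i.e. the firmware did not answer in time. */
static int parse_response(struct fake_dev *dev, int cmd, bool got_response)
{
	if (!got_response) {
		dev->mcu_timeouts++;
		fprintf(stderr, "MCU: Message %08x timeout (%d/%d)\n",
			(unsigned int)cmd, dev->mcu_timeouts,
			MAX_MCU_TIMEOUTS);
		return -ETIMEDOUT;
	}

	/* Any successful reply resets the run of consecutive timeouts. */
	dev->mcu_timeouts = 0;
	dev->last_successful_mcu_cmd = cmd;
	return 0;
}

/* Assumed call site: refuse to talk to a dead radio so callers holding
 * locks fail fast instead of waiting for another timeout. */
static int send_msg(struct fake_dev *dev, int cmd, bool got_response)
{
	assert(dev != NULL);

	if (mcu_is_dead(dev))
		return -EIO; /* assumed error code */

	return parse_response(dev, cmd, got_response);
}
```

In this sketch a single successful reply clears the counter, so only an unbroken run of timeouts marks the radio dead; once that happens, `send_msg()` returns immediately without waiting.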