2021-02-24 13:19:50

by Sasha Levin

[permalink] [raw]
Subject: [PATCH AUTOSEL 5.11 01/67] ath10k: prevent deinitializing NAPI twice

From: Wen Gong <[email protected]>

[ Upstream commit e2f8b74e58cb1560c1399ba94a470b770e858259 ]

It happened "Kernel panic - not syncing: hung_task: blocked tasks" when
test simulate crash and ifconfig down/rmmod meanwhile.

Test steps:

1.Test commands, either can reproduce the hang for PCIe, SDIO and SNOC.
echo soft > /sys/kernel/debug/ieee80211/phy0/ath10k/simulate_fw_crash;sleep 0.05;ifconfig wlan0 down
echo soft > /sys/kernel/debug/ieee80211/phy0/ath10k/simulate_fw_crash;rmmod ath10k_sdio
echo hw-restart > /sys/kernel/debug/ieee80211/phy0/ath10k/simulate_fw_crash;rmmod ath10k_pci

2. dmesg:
[ 5622.548630] ath10k_sdio mmc1:0001:1: simulating soft firmware crash
[ 5622.655995] ieee80211 phy0: Hardware restart was requested
[ 5776.355164] INFO: task shill:1572 blocked for more than 122 seconds.
[ 5776.355687] INFO: task kworker/1:2:24437 blocked for more than 122 seconds.
[ 5776.359812] Kernel panic - not syncing: hung_task: blocked tasks
[ 5776.359836] CPU: 1 PID: 55 Comm: khungtaskd Tainted: G W 4.19.86 #137
[ 5776.359846] Hardware name: MediaTek krane sku176 board (DT)
[ 5776.359855] Call trace:
[ 5776.359868] dump_backtrace+0x0/0x170
[ 5776.359881] show_stack+0x20/0x2c
[ 5776.359896] dump_stack+0xd4/0x10c
[ 5776.359916] panic+0x12c/0x29c
[ 5776.359937] hung_task_panic+0x0/0x50
[ 5776.359953] kthread+0x120/0x130
[ 5776.359965] ret_from_fork+0x10/0x18
[ 5776.359986] SMP: stopping secondary CPUs
[ 5776.360012] Kernel Offset: 0x141ea00000 from 0xffffff8008000000
[ 5776.360026] CPU features: 0x0,2188200c
[ 5776.360035] Memory Limit: none

command "ifconfig wlan0 down" or "rmmod ath10k_sdio" will be blocked
callstack of ifconfig:
[<0>] __switch_to+0x120/0x13c
[<0>] msleep+0x28/0x38
[<0>] ath10k_sdio_hif_stop+0x24c/0x294 [ath10k_sdio]
[<0>] ath10k_core_stop+0x50/0x78 [ath10k_core]
[<0>] ath10k_halt+0x120/0x178 [ath10k_core]
[<0>] ath10k_stop+0x4c/0x8c [ath10k_core]
[<0>] drv_stop+0xe0/0x1e4 [mac80211]
[<0>] ieee80211_stop_device+0x48/0x54 [mac80211]
[<0>] ieee80211_do_stop+0x678/0x6f8 [mac80211]
[<0>] ieee80211_stop+0x20/0x30 [mac80211]
[<0>] __dev_close_many+0xb8/0x11c
[<0>] __dev_change_flags+0xe0/0x1d0
[<0>] dev_change_flags+0x30/0x6c
[<0>] devinet_ioctl+0x370/0x564
[<0>] inet_ioctl+0xdc/0x304
[<0>] sock_do_ioctl+0x50/0x288
[<0>] compat_sock_ioctl+0x1b4/0x1aac
[<0>] __se_compat_sys_ioctl+0x100/0x26fc
[<0>] __arm64_compat_sys_ioctl+0x20/0x2c
[<0>] el0_svc_common+0xa4/0x154
[<0>] el0_svc_compat_handler+0x2c/0x38
[<0>] el0_svc_compat+0x8/0x18
[<0>] 0xffffffffffffffff

callstack of rmmod:
[<0>] __switch_to+0x120/0x13c
[<0>] msleep+0x28/0x38
[<0>] ath10k_sdio_hif_stop+0x294/0x31c [ath10k_sdio]
[<0>] ath10k_core_stop+0x50/0x78 [ath10k_core]
[<0>] ath10k_halt+0x120/0x178 [ath10k_core]
[<0>] ath10k_stop+0x4c/0x8c [ath10k_core]
[<0>] drv_stop+0xe0/0x1e4 [mac80211]
[<0>] ieee80211_stop_device+0x48/0x54 [mac80211]
[<0>] ieee80211_do_stop+0x678/0x6f8 [mac80211]
[<0>] ieee80211_stop+0x20/0x30 [mac80211]
[<0>] __dev_close_many+0xb8/0x11c
[<0>] dev_close_many+0x70/0x100
[<0>] dev_close+0x4c/0x80
[<0>] cfg80211_shutdown_all_interfaces+0x50/0xcc [cfg80211]
[<0>] ieee80211_remove_interfaces+0x58/0x1a0 [mac80211]
[<0>] ieee80211_unregister_hw+0x40/0x100 [mac80211]
[<0>] ath10k_mac_unregister+0x1c/0x44 [ath10k_core]
[<0>] ath10k_core_unregister+0x38/0x7c [ath10k_core]
[<0>] ath10k_sdio_remove+0x8c/0xd0 [ath10k_sdio]
[<0>] sdio_bus_remove+0x48/0x108
[<0>] device_release_driver_internal+0x138/0x1ec
[<0>] driver_detach+0x6c/0xa8
[<0>] bus_remove_driver+0x78/0xa8
[<0>] driver_unregister+0x30/0x50
[<0>] sdio_unregister_driver+0x28/0x34
[<0>] cleanup_module+0x14/0x6bc [ath10k_sdio]
[<0>] __arm64_sys_delete_module+0x1e0/0x22c
[<0>] el0_svc_common+0xa4/0x154
[<0>] el0_svc_compat_handler+0x2c/0x38
[<0>] el0_svc_compat+0x8/0x18
[<0>] 0xffffffffffffffff

SNOC:
[ 647.156863] Call trace:
[ 647.162166] [<ffffff80080855a4>] __switch_to+0x120/0x13c
[ 647.164512] [<ffffff800899d8b8>] __schedule+0x5ec/0x798
[ 647.170062] [<ffffff800899dad8>] schedule+0x74/0x94
[ 647.175050] [<ffffff80089a0848>] schedule_timeout+0x314/0x42c
[ 647.179874] [<ffffff80089a0a14>] schedule_timeout_uninterruptible+0x34/0x40
[ 647.185780] [<ffffff80082a494>] msleep+0x28/0x38
[ 647.192546] [<ffffff800117ec4c>] ath10k_snoc_hif_stop+0x4c/0x1e0 [ath10k_snoc]
[ 647.197439] [<ffffff80010dfbd8>] ath10k_core_stop+0x50/0x7c [ath10k_core]
[ 647.204652] [<ffffff80010c8f48>] ath10k_halt+0x114/0x16c [ath10k_core]
[ 647.211420] [<ffffff80010cad68>] ath10k_stop+0x4c/0x88 [ath10k_core]
[ 647.217865] [<ffffff8000fdbf54>] drv_stop+0x110/0x244 [mac80211]
[ 647.224367] [<ffffff80010147ac>] ieee80211_stop_device+0x48/0x54 [mac80211]
[ 647.230359] [<ffffff8000ff3eec>] ieee80211_do_stop+0x6a4/0x73c [mac80211]
[ 647.237033] [<ffffff8000ff4500>] ieee80211_stop+0x20/0x30 [mac80211]
[ 647.243942] [<ffffff80087e39b8>] __dev_close_many+0xa0/0xfc
[ 647.250435] [<ffffff80087e3888>] dev_close_many+0x70/0x100
[ 647.255651] [<ffffff80087e3a60>] dev_close+0x4c/0x80
[ 647.261244] [<ffffff8000f1ba54>] cfg80211_shutdown_all_interfaces+0x44/0xcc [cfg80211]
[ 647.266383] [<ffffff8000ff3fdc>] ieee80211_remove_interfaces+0x58/0x1b4 [mac80211]
[ 647.274128] [<ffffff8000fda540>] ieee80211_unregister_hw+0x50/0x120 [mac80211]
[ 647.281659] [<ffffff80010ca314>] ath10k_mac_unregister+0x1c/0x44 [ath10k_core]
[ 647.288839] [<ffffff80010dfc94>] ath10k_core_unregister+0x48/0x90 [ath10k_core]
[ 647.296027] [<ffffff800117e598>] ath10k_snoc_remove+0x5c/0x150 [ath10k_snoc]
[ 647.303229] [<ffffff80085625fc>] platform_drv_remove+0x28/0x50
[ 647.310517] [<ffffff80085601a4>] device_release_driver_internal+0x114/0x1b8
[ 647.316257] [<ffffff80085602e4>] driver_detach+0x6c/0xa8
[ 647.323021] [<ffffff800855e5b8>] bus_remove_driver+0x78/0xa8
[ 647.328571] [<ffffff800856107c>] driver_unregister+0x30/0x50
[ 647.334213] [<ffffff8008562674>] platform_driver_unregister+0x1c/0x28
[ 647.339876] [<ffffff800117fefc>] cleanup_module+0x1c/0x120 [ath10k_snoc]
[ 647.346196] [<ffffff8008143ab8>] SyS_delete_module+0x1dc/0x22c

PCIe:
[ 615.392770] rmmod D 0 3523 3458 0x00000080
[ 615.392777] Call Trace:
[ 615.392784] __schedule+0x617/0x7d3
[ 615.392791] ? __mod_timer+0x263/0x35c
[ 615.392797] schedule+0x62/0x72
[ 615.392803] schedule_timeout+0x8d/0xf3
[ 615.392809] ? run_local_timers+0x6b/0x6b
[ 615.392814] msleep+0x1b/0x22
[ 615.392824] ath10k_pci_hif_stop+0x68/0xd6 [ath10k_pci]
[ 615.392844] ath10k_core_stop+0x44/0x67 [ath10k_core]
[ 615.392859] ath10k_halt+0x102/0x153 [ath10k_core]
[ 615.392873] ath10k_stop+0x38/0x75 [ath10k_core]
[ 615.392893] drv_stop+0x9a/0x13c [mac80211]
[ 615.392915] ieee80211_do_stop+0x772/0x7cd [mac80211]
[ 615.392937] ieee80211_stop+0x1a/0x1e [mac80211]
[ 615.392945] __dev_close_many+0x9e/0xf0
[ 615.392952] dev_close_many+0x62/0xe8
[ 615.392958] dev_close+0x54/0x7d
[ 615.392975] cfg80211_shutdown_all_interfaces+0x6e/0xa5 [cfg80211]
[ 615.393021] ieee80211_remove_interfaces+0x52/0x1aa [mac80211]
[ 615.393049] ieee80211_unregister_hw+0x54/0x136 [mac80211]
[ 615.393068] ath10k_mac_unregister+0x19/0x4a [ath10k_core]
[ 615.393091] ath10k_core_unregister+0x39/0x7e [ath10k_core]
[ 615.393104] ath10k_pci_remove+0x3d/0x7f [ath10k_pci]
[ 615.393117] pci_device_remove+0x41/0xa6
[ 615.393129] device_release_driver_internal+0x123/0x1ec
[ 615.393140] driver_detach+0x60/0x90
[ 615.393152] bus_remove_driver+0x72/0x9f
[ 615.393164] pci_unregister_driver+0x1e/0x87
[ 615.393177] SyS_delete_module+0x1d7/0x277
[ 615.393188] do_syscall_64+0x6b/0xf7
[ 615.393199] entry_SYSCALL_64_after_hwframe+0x41/0xa6

The test command run simulate_fw_crash firstly and it call into
ath10k_sdio_hif_stop from ath10k_core_restart, then napi_disable
is called and bit NAPI_STATE_SCHED is set. After that, function
ath10k_sdio_hif_stop is called again from ath10k_stop by command
"ifconfig wlan0 down" or "rmmod ath10k_sdio", then command blocked.

It is blocked by napi_synchronize, napi_disable will set bit with
NAPI_STATE_SCHED, and then napi_synchronize will enter dead loop
becuase bit NAPI_STATE_SCHED is set by napi_disable.

function of napi_synchronize
static inline void napi_synchronize(const struct napi_struct *n)
{
if (IS_ENABLED(CONFIG_SMP))
while (test_bit(NAPI_STATE_SCHED, &n->state))
msleep(1);
else
barrier();
}

function of napi_disable
void napi_disable(struct napi_struct *n)
{
might_sleep();
set_bit(NAPI_STATE_DISABLE, &n->state);

while (test_and_set_bit(NAPI_STATE_SCHED, &n->state))
msleep(1);
while (test_and_set_bit(NAPI_STATE_NPSVC, &n->state))
msleep(1);

hrtimer_cancel(&n->timer);

clear_bit(NAPI_STATE_DISABLE, &n->state);
}

Add flag for it avoid the hang and crash.

Tested-on: QCA6174 hw3.2 SDIO WLAN.RMH.4.4.1-00049
Tested-on: QCA6174 hw3.2 PCI WLAN.RM.4.4.1-00110-QCARMSWP-1
Tested-on: WCN3990 hw1.0 SNOC hw1.0 WLAN.HL.3.1-01307.1-QCAHLSWMTPL-2

Signed-off-by: Wen Gong <[email protected]>
Signed-off-by: Kalle Valo <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/net/wireless/ath/ath10k/ahb.c | 5 ++---
drivers/net/wireless/ath/ath10k/core.c | 25 +++++++++++++++++++++++++
drivers/net/wireless/ath/ath10k/core.h | 5 +++++
drivers/net/wireless/ath/ath10k/pci.c | 7 ++++---
drivers/net/wireless/ath/ath10k/sdio.c | 5 ++---
drivers/net/wireless/ath/ath10k/snoc.c | 6 +++---
6 files changed, 41 insertions(+), 12 deletions(-)

diff --git a/drivers/net/wireless/ath/ath10k/ahb.c b/drivers/net/wireless/ath/ath10k/ahb.c
index 05a61975c83f4..869524852fbaa 100644
--- a/drivers/net/wireless/ath/ath10k/ahb.c
+++ b/drivers/net/wireless/ath/ath10k/ahb.c
@@ -626,7 +626,7 @@ static int ath10k_ahb_hif_start(struct ath10k *ar)
{
ath10k_dbg(ar, ATH10K_DBG_BOOT, "boot ahb hif start\n");

- napi_enable(&ar->napi);
+ ath10k_core_napi_enable(ar);
ath10k_ce_enable_interrupts(ar);
ath10k_pci_enable_legacy_irq(ar);

@@ -644,8 +644,7 @@ static void ath10k_ahb_hif_stop(struct ath10k *ar)
ath10k_ahb_irq_disable(ar);
synchronize_irq(ar_ahb->irq);

- napi_synchronize(&ar->napi);
- napi_disable(&ar->napi);
+ ath10k_core_napi_sync_disable(ar);

ath10k_pci_flush(ar);
}
diff --git a/drivers/net/wireless/ath/ath10k/core.c b/drivers/net/wireless/ath/ath10k/core.c
index eeb6ff6aa2e1e..a419ec7130f97 100644
--- a/drivers/net/wireless/ath/ath10k/core.c
+++ b/drivers/net/wireless/ath/ath10k/core.c
@@ -2305,6 +2305,31 @@ void ath10k_core_start_recovery(struct ath10k *ar)
}
EXPORT_SYMBOL(ath10k_core_start_recovery);

+void ath10k_core_napi_enable(struct ath10k *ar)
+{
+ lockdep_assert_held(&ar->conf_mutex);
+
+ if (test_bit(ATH10K_FLAG_NAPI_ENABLED, &ar->dev_flags))
+ return;
+
+ napi_enable(&ar->napi);
+ set_bit(ATH10K_FLAG_NAPI_ENABLED, &ar->dev_flags);
+}
+EXPORT_SYMBOL(ath10k_core_napi_enable);
+
+void ath10k_core_napi_sync_disable(struct ath10k *ar)
+{
+ lockdep_assert_held(&ar->conf_mutex);
+
+ if (!test_bit(ATH10K_FLAG_NAPI_ENABLED, &ar->dev_flags))
+ return;
+
+ napi_synchronize(&ar->napi);
+ napi_disable(&ar->napi);
+ clear_bit(ATH10K_FLAG_NAPI_ENABLED, &ar->dev_flags);
+}
+EXPORT_SYMBOL(ath10k_core_napi_sync_disable);
+
static void ath10k_core_restart(struct work_struct *work)
{
struct ath10k *ar = container_of(work, struct ath10k, restart_work);
diff --git a/drivers/net/wireless/ath/ath10k/core.h b/drivers/net/wireless/ath/ath10k/core.h
index 51f7e960e2977..f4be6bfb25392 100644
--- a/drivers/net/wireless/ath/ath10k/core.h
+++ b/drivers/net/wireless/ath/ath10k/core.h
@@ -868,6 +868,9 @@ enum ath10k_dev_flags {

/* Indicates that ath10k device is during recovery process and not complete */
ATH10K_FLAG_RESTARTING,
+
+ /* protected by conf_mutex */
+ ATH10K_FLAG_NAPI_ENABLED,
};

enum ath10k_cal_mode {
@@ -1308,6 +1311,8 @@ static inline bool ath10k_peer_stats_enabled(struct ath10k *ar)

extern unsigned long ath10k_coredump_mask;

+void ath10k_core_napi_sync_disable(struct ath10k *ar);
+void ath10k_core_napi_enable(struct ath10k *ar);
struct ath10k *ath10k_core_create(size_t priv_size, struct device *dev,
enum ath10k_bus bus,
enum ath10k_hw_rev hw_rev,
diff --git a/drivers/net/wireless/ath/ath10k/pci.c b/drivers/net/wireless/ath/ath10k/pci.c
index 2328df09875ce..e7fde635e0eef 100644
--- a/drivers/net/wireless/ath/ath10k/pci.c
+++ b/drivers/net/wireless/ath/ath10k/pci.c
@@ -1958,7 +1958,7 @@ static int ath10k_pci_hif_start(struct ath10k *ar)

ath10k_dbg(ar, ATH10K_DBG_BOOT, "boot hif start\n");

- napi_enable(&ar->napi);
+ ath10k_core_napi_enable(ar);

ath10k_pci_irq_enable(ar);
ath10k_pci_rx_post(ar);
@@ -2075,8 +2075,9 @@ static void ath10k_pci_hif_stop(struct ath10k *ar)

ath10k_pci_irq_disable(ar);
ath10k_pci_irq_sync(ar);
- napi_synchronize(&ar->napi);
- napi_disable(&ar->napi);
+
+ ath10k_core_napi_sync_disable(ar);
+
cancel_work_sync(&ar_pci->dump_work);

/* Most likely the device has HTT Rx ring configured. The only way to
diff --git a/drivers/net/wireless/ath/ath10k/sdio.c b/drivers/net/wireless/ath/ath10k/sdio.c
index c415090d1f37c..b746052737e0b 100644
--- a/drivers/net/wireless/ath/ath10k/sdio.c
+++ b/drivers/net/wireless/ath/ath10k/sdio.c
@@ -1859,7 +1859,7 @@ static int ath10k_sdio_hif_start(struct ath10k *ar)
struct ath10k_sdio *ar_sdio = ath10k_sdio_priv(ar);
int ret;

- napi_enable(&ar->napi);
+ ath10k_core_napi_enable(ar);

/* Sleep 20 ms before HIF interrupts are disabled.
* This will give target plenty of time to process the BMI done
@@ -1992,8 +1992,7 @@ static void ath10k_sdio_hif_stop(struct ath10k *ar)

spin_unlock_bh(&ar_sdio->wr_async_lock);

- napi_synchronize(&ar->napi);
- napi_disable(&ar->napi);
+ ath10k_core_napi_sync_disable(ar);
}

#ifdef CONFIG_PM
diff --git a/drivers/net/wireless/ath/ath10k/snoc.c b/drivers/net/wireless/ath/ath10k/snoc.c
index bf9a8cb713dc0..2c2df0b8b3fd9 100644
--- a/drivers/net/wireless/ath/ath10k/snoc.c
+++ b/drivers/net/wireless/ath/ath10k/snoc.c
@@ -915,8 +915,7 @@ static void ath10k_snoc_hif_stop(struct ath10k *ar)
if (!test_bit(ATH10K_FLAG_CRASH_FLUSH, &ar->dev_flags))
ath10k_snoc_irq_disable(ar);

- napi_synchronize(&ar->napi);
- napi_disable(&ar->napi);
+ ath10k_core_napi_sync_disable(ar);
ath10k_snoc_buffer_cleanup(ar);
ath10k_dbg(ar, ATH10K_DBG_BOOT, "boot hif stop\n");
}
@@ -926,7 +925,8 @@ static int ath10k_snoc_hif_start(struct ath10k *ar)
struct ath10k_snoc *ar_snoc = ath10k_snoc_priv(ar);

bitmap_clear(ar_snoc->pending_ce_irqs, 0, CE_COUNT_MAX);
- napi_enable(&ar->napi);
+
+ ath10k_core_napi_enable(ar);
ath10k_snoc_irq_enable(ar);
ath10k_snoc_rx_post(ar);

--
2.27.0


2021-02-24 13:21:51

by Sasha Levin

[permalink] [raw]
Subject: [PATCH AUTOSEL 5.11 12/67] mt76: mt7615: reset token when mac_reset happens

From: Ryder Lee <[email protected]>

[ Upstream commit a6275e934605646ef81b02d8d1164f21343149c9 ]

Reset token in mt7615_mac_reset_work() to avoid possible leakege.

Signed-off-by: Ryder Lee <[email protected]>
Signed-off-by: Felix Fietkau <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
.../net/wireless/mediatek/mt76/mt7615/mac.c | 20 +++++++++++++++++++
.../wireless/mediatek/mt76/mt7615/mt7615.h | 2 +-
.../wireless/mediatek/mt76/mt7615/pci_init.c | 12 +----------
3 files changed, 22 insertions(+), 12 deletions(-)

diff --git a/drivers/net/wireless/mediatek/mt76/mt7615/mac.c b/drivers/net/wireless/mediatek/mt76/mt7615/mac.c
index 0f360be0b8851..fb10a6497ed05 100644
--- a/drivers/net/wireless/mediatek/mt76/mt7615/mac.c
+++ b/drivers/net/wireless/mediatek/mt76/mt7615/mac.c
@@ -2058,6 +2058,23 @@ void mt7615_dma_reset(struct mt7615_dev *dev)
}
EXPORT_SYMBOL_GPL(mt7615_dma_reset);

+void mt7615_tx_token_put(struct mt7615_dev *dev)
+{
+ struct mt76_txwi_cache *txwi;
+ int id;
+
+ spin_lock_bh(&dev->token_lock);
+ idr_for_each_entry(&dev->token, txwi, id) {
+ mt7615_txp_skb_unmap(&dev->mt76, txwi);
+ if (txwi->skb)
+ dev_kfree_skb_any(txwi->skb);
+ mt76_put_txwi(&dev->mt76, txwi);
+ }
+ spin_unlock_bh(&dev->token_lock);
+ idr_destroy(&dev->token);
+}
+EXPORT_SYMBOL_GPL(mt7615_tx_token_put);
+
void mt7615_mac_reset_work(struct work_struct *work)
{
struct mt7615_phy *phy2;
@@ -2101,6 +2118,9 @@ void mt7615_mac_reset_work(struct work_struct *work)

mt76_wr(dev, MT_MCU_INT_EVENT, MT_MCU_INT_EVENT_PDMA_STOPPED);

+ mt7615_tx_token_put(dev);
+ idr_init(&dev->token);
+
if (mt7615_wait_reset_state(dev, MT_MCU_CMD_RESET_DONE)) {
mt7615_dma_reset(dev);

diff --git a/drivers/net/wireless/mediatek/mt76/mt7615/mt7615.h b/drivers/net/wireless/mediatek/mt76/mt7615/mt7615.h
index 99b8abdbb08f7..d697ff2ea56e8 100644
--- a/drivers/net/wireless/mediatek/mt76/mt7615/mt7615.h
+++ b/drivers/net/wireless/mediatek/mt76/mt7615/mt7615.h
@@ -583,7 +583,7 @@ int mt7615_tx_prepare_skb(struct mt76_dev *mdev, void *txwi_ptr,
struct mt76_tx_info *tx_info);

void mt7615_tx_complete_skb(struct mt76_dev *mdev, struct mt76_queue_entry *e);
-
+void mt7615_tx_token_put(struct mt7615_dev *dev);
void mt7615_queue_rx_skb(struct mt76_dev *mdev, enum mt76_rxq_id q,
struct sk_buff *skb);
void mt7615_sta_ps(struct mt76_dev *mdev, struct ieee80211_sta *sta, bool ps);
diff --git a/drivers/net/wireless/mediatek/mt76/mt7615/pci_init.c b/drivers/net/wireless/mediatek/mt76/mt7615/pci_init.c
index 27fcb1374685b..58a0ec1bf8d7b 100644
--- a/drivers/net/wireless/mediatek/mt76/mt7615/pci_init.c
+++ b/drivers/net/wireless/mediatek/mt76/mt7615/pci_init.c
@@ -160,9 +160,7 @@ int mt7615_register_device(struct mt7615_dev *dev)

void mt7615_unregister_device(struct mt7615_dev *dev)
{
- struct mt76_txwi_cache *txwi;
bool mcu_running;
- int id;

mcu_running = mt7615_wait_for_mcu_init(dev);

@@ -172,15 +170,7 @@ void mt7615_unregister_device(struct mt7615_dev *dev)
mt7615_mcu_exit(dev);
mt7615_dma_cleanup(dev);

- spin_lock_bh(&dev->token_lock);
- idr_for_each_entry(&dev->token, txwi, id) {
- mt7615_txp_skb_unmap(&dev->mt76, txwi);
- if (txwi->skb)
- dev_kfree_skb_any(txwi->skb);
- mt76_put_txwi(&dev->mt76, txwi);
- }
- spin_unlock_bh(&dev->token_lock);
- idr_destroy(&dev->token);
+ mt7615_tx_token_put(dev);

tasklet_disable(&dev->irq_tasklet);

--
2.27.0

2021-02-24 13:21:51

by Sasha Levin

[permalink] [raw]
Subject: [PATCH AUTOSEL 5.11 11/67] mt76: mt7915: reset token when mac_reset happens

From: Ryder Lee <[email protected]>

[ Upstream commit f285dfb98562e8380101095d168910df1d07d8be ]

Reset buffering token in mt7915_mac_reset_work() to avoid possible leakege,
which leads to Tx stop after mac reset.

Tested-by: Bo Jiao <[email protected]>
Signed-off-by: Ryder Lee <[email protected]>
Signed-off-by: Felix Fietkau <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
.../net/wireless/mediatek/mt76/mt7915/init.c | 18 +-------------
.../net/wireless/mediatek/mt76/mt7915/mac.c | 24 +++++++++++++++++++
.../wireless/mediatek/mt76/mt7915/mt7915.h | 1 +
3 files changed, 26 insertions(+), 17 deletions(-)

diff --git a/drivers/net/wireless/mediatek/mt76/mt7915/init.c b/drivers/net/wireless/mediatek/mt76/mt7915/init.c
index 102a8f14c22d4..2ec18aaa82807 100644
--- a/drivers/net/wireless/mediatek/mt76/mt7915/init.c
+++ b/drivers/net/wireless/mediatek/mt76/mt7915/init.c
@@ -672,28 +672,12 @@ int mt7915_register_device(struct mt7915_dev *dev)

void mt7915_unregister_device(struct mt7915_dev *dev)
{
- struct mt76_txwi_cache *txwi;
- int id;
-
mt7915_unregister_ext_phy(dev);
mt76_unregister_device(&dev->mt76);
mt7915_mcu_exit(dev);
mt7915_dma_cleanup(dev);

- spin_lock_bh(&dev->token_lock);
- idr_for_each_entry(&dev->token, txwi, id) {
- mt7915_txp_skb_unmap(&dev->mt76, txwi);
- if (txwi->skb) {
- struct ieee80211_hw *hw;
-
- hw = mt76_tx_status_get_hw(&dev->mt76, txwi->skb);
- ieee80211_free_txskb(hw, txwi->skb);
- }
- mt76_put_txwi(&dev->mt76, txwi);
- dev->token_count--;
- }
- spin_unlock_bh(&dev->token_lock);
- idr_destroy(&dev->token);
+ mt7915_tx_token_put(dev);

mt76_free_device(&dev->mt76);
}
diff --git a/drivers/net/wireless/mediatek/mt76/mt7915/mac.c b/drivers/net/wireless/mediatek/mt76/mt7915/mac.c
index f504eeb221f95..1b4d65310b887 100644
--- a/drivers/net/wireless/mediatek/mt76/mt7915/mac.c
+++ b/drivers/net/wireless/mediatek/mt76/mt7915/mac.c
@@ -1485,6 +1485,27 @@ mt7915_dma_reset(struct mt7915_phy *phy)
MT_WFDMA1_GLO_CFG_TX_DMA_EN | MT_WFDMA1_GLO_CFG_RX_DMA_EN);
}

+void mt7915_tx_token_put(struct mt7915_dev *dev)
+{
+ struct mt76_txwi_cache *txwi;
+ int id;
+
+ spin_lock_bh(&dev->token_lock);
+ idr_for_each_entry(&dev->token, txwi, id) {
+ mt7915_txp_skb_unmap(&dev->mt76, txwi);
+ if (txwi->skb) {
+ struct ieee80211_hw *hw;
+
+ hw = mt76_tx_status_get_hw(&dev->mt76, txwi->skb);
+ ieee80211_free_txskb(hw, txwi->skb);
+ }
+ mt76_put_txwi(&dev->mt76, txwi);
+ dev->token_count--;
+ }
+ spin_unlock_bh(&dev->token_lock);
+ idr_destroy(&dev->token);
+}
+
/* system error recovery */
void mt7915_mac_reset_work(struct work_struct *work)
{
@@ -1525,6 +1546,9 @@ void mt7915_mac_reset_work(struct work_struct *work)

mt76_wr(dev, MT_MCU_INT_EVENT, MT_MCU_INT_EVENT_DMA_STOPPED);

+ mt7915_tx_token_put(dev);
+ idr_init(&dev->token);
+
if (mt7915_wait_reset_state(dev, MT_MCU_CMD_RESET_DONE)) {
mt7915_dma_reset(&dev->phy);

diff --git a/drivers/net/wireless/mediatek/mt76/mt7915/mt7915.h b/drivers/net/wireless/mediatek/mt76/mt7915/mt7915.h
index 0339abf360d3f..94bed8a3a050a 100644
--- a/drivers/net/wireless/mediatek/mt76/mt7915/mt7915.h
+++ b/drivers/net/wireless/mediatek/mt76/mt7915/mt7915.h
@@ -463,6 +463,7 @@ int mt7915_tx_prepare_skb(struct mt76_dev *mdev, void *txwi_ptr,
struct ieee80211_sta *sta,
struct mt76_tx_info *tx_info);
void mt7915_tx_complete_skb(struct mt76_dev *mdev, struct mt76_queue_entry *e);
+void mt7915_tx_token_put(struct mt7915_dev *dev);
int mt7915_init_tx_queues(struct mt7915_phy *phy, int idx, int n_desc);
void mt7915_queue_rx_skb(struct mt76_dev *mdev, enum mt76_rxq_id q,
struct sk_buff *skb);
--
2.27.0

2021-02-24 13:21:51

by Sasha Levin

[permalink] [raw]
Subject: [PATCH AUTOSEL 5.11 07/67] wlcore: Fix command execute failure 19 for wl12xx

From: Tony Lindgren <[email protected]>

[ Upstream commit cb88d01b67383a095e3f7caeb4cdade5a6cf0417 ]

We can currently get a "command execute failure 19" error on beacon loss
if the signal is weak:

wlcore: Beacon loss detected. roles:0xff
wlcore: Connection loss work (role_id: 0).
...
wlcore: ERROR command execute failure 19
...
WARNING: CPU: 0 PID: 1552 at drivers/net/wireless/ti/wlcore/main.c:803
...
(wl12xx_queue_recovery_work.part.0 [wlcore])
(wl12xx_cmd_role_start_sta [wlcore])
(wl1271_op_bss_info_changed [wlcore])
(ieee80211_prep_connection [mac80211])

Error 19 is defined as CMD_STATUS_WRONG_NESTING from the wlcore firmware,
and seems to mean that the firmware no longer wants to see the quirk
handling for WLCORE_QUIRK_START_STA_FAILS done.

This quirk got added with commit 18eab430700d ("wlcore: workaround
start_sta problem in wl12xx fw"), and it seems that this already got fixed
in the firmware long time ago back in 2012 as wl18xx never had this quirk
in place to start with.

As we no longer even support firmware that early, to me it seems that it's
safe to just drop WLCORE_QUIRK_START_STA_FAILS to fix the error. Looks
like earlier firmware got disabled back in 2013 with commit 0e284c074ef9
("wl12xx: increase minimum singlerole firmware version required").

If it turns out we still need WLCORE_QUIRK_START_STA_FAILS with any
firmware that the driver works with, we can simply revert this patch and
add extra checks for firmware version used.

With this fix wlcore reconnects properly after a beacon loss.

Cc: Raz Bouganim <[email protected]>
Signed-off-by: Tony Lindgren <[email protected]>
Signed-off-by: Kalle Valo <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/net/wireless/ti/wl12xx/main.c | 3 ---
drivers/net/wireless/ti/wlcore/main.c | 15 +--------------
drivers/net/wireless/ti/wlcore/wlcore.h | 3 ---
3 files changed, 1 insertion(+), 20 deletions(-)

diff --git a/drivers/net/wireless/ti/wl12xx/main.c b/drivers/net/wireless/ti/wl12xx/main.c
index 3c9c623bb4283..9d7dbfe7fe0c3 100644
--- a/drivers/net/wireless/ti/wl12xx/main.c
+++ b/drivers/net/wireless/ti/wl12xx/main.c
@@ -635,7 +635,6 @@ static int wl12xx_identify_chip(struct wl1271 *wl)
wl->quirks |= WLCORE_QUIRK_LEGACY_NVS |
WLCORE_QUIRK_DUAL_PROBE_TMPL |
WLCORE_QUIRK_TKIP_HEADER_SPACE |
- WLCORE_QUIRK_START_STA_FAILS |
WLCORE_QUIRK_AP_ZERO_SESSION_ID;
wl->sr_fw_name = WL127X_FW_NAME_SINGLE;
wl->mr_fw_name = WL127X_FW_NAME_MULTI;
@@ -659,7 +658,6 @@ static int wl12xx_identify_chip(struct wl1271 *wl)
wl->quirks |= WLCORE_QUIRK_LEGACY_NVS |
WLCORE_QUIRK_DUAL_PROBE_TMPL |
WLCORE_QUIRK_TKIP_HEADER_SPACE |
- WLCORE_QUIRK_START_STA_FAILS |
WLCORE_QUIRK_AP_ZERO_SESSION_ID;
wl->plt_fw_name = WL127X_PLT_FW_NAME;
wl->sr_fw_name = WL127X_FW_NAME_SINGLE;
@@ -688,7 +686,6 @@ static int wl12xx_identify_chip(struct wl1271 *wl)
wl->quirks |= WLCORE_QUIRK_TX_BLOCKSIZE_ALIGN |
WLCORE_QUIRK_DUAL_PROBE_TMPL |
WLCORE_QUIRK_TKIP_HEADER_SPACE |
- WLCORE_QUIRK_START_STA_FAILS |
WLCORE_QUIRK_AP_ZERO_SESSION_ID;

wlcore_set_min_fw_ver(wl, WL128X_CHIP_VER,
diff --git a/drivers/net/wireless/ti/wlcore/main.c b/drivers/net/wireless/ti/wlcore/main.c
index 122c7a4b374f1..0f9cc3de6aebc 100644
--- a/drivers/net/wireless/ti/wlcore/main.c
+++ b/drivers/net/wireless/ti/wlcore/main.c
@@ -2872,21 +2872,8 @@ static int wlcore_join(struct wl1271 *wl, struct wl12xx_vif *wlvif)

if (is_ibss)
ret = wl12xx_cmd_role_start_ibss(wl, wlvif);
- else {
- if (wl->quirks & WLCORE_QUIRK_START_STA_FAILS) {
- /*
- * TODO: this is an ugly workaround for wl12xx fw
- * bug - we are not able to tx/rx after the first
- * start_sta, so make dummy start+stop calls,
- * and then call start_sta again.
- * this should be fixed in the fw.
- */
- wl12xx_cmd_role_start_sta(wl, wlvif);
- wl12xx_cmd_role_stop_sta(wl, wlvif);
- }
-
+ else
ret = wl12xx_cmd_role_start_sta(wl, wlvif);
- }

return ret;
}
diff --git a/drivers/net/wireless/ti/wlcore/wlcore.h b/drivers/net/wireless/ti/wlcore/wlcore.h
index b7821311ac75b..81c94d390623b 100644
--- a/drivers/net/wireless/ti/wlcore/wlcore.h
+++ b/drivers/net/wireless/ti/wlcore/wlcore.h
@@ -547,9 +547,6 @@ wlcore_set_min_fw_ver(struct wl1271 *wl, unsigned int chip,
/* Each RX/TX transaction requires an end-of-transaction transfer */
#define WLCORE_QUIRK_END_OF_TRANSACTION BIT(0)

-/* the first start_role(sta) sometimes doesn't work on wl12xx */
-#define WLCORE_QUIRK_START_STA_FAILS BIT(1)
-
/* wl127x and SPI don't support SDIO block size alignment */
#define WLCORE_QUIRK_TX_BLOCKSIZE_ALIGN BIT(2)

--
2.27.0

2021-02-24 13:21:56

by Sasha Levin

[permalink] [raw]
Subject: [PATCH AUTOSEL 5.11 14/67] ath10k: fix wmi mgmt tx queue full due to race condition

From: Miaoqing Pan <[email protected]>

[ Upstream commit b55379e343a3472c35f4a1245906db5158cab453 ]

Failed to transmit wmi management frames:

[84977.840894] ath10k_snoc a000000.wifi: wmi mgmt tx queue is full
[84977.840913] ath10k_snoc a000000.wifi: failed to transmit packet, dropping: -28
[84977.840924] ath10k_snoc a000000.wifi: failed to submit frame: -28
[84977.840932] ath10k_snoc a000000.wifi: failed to transmit frame: -28

This issue is caused by race condition between skb_dequeue and
__skb_queue_tail. The queue of ‘wmi_mgmt_tx_queue’ is protected by a
different lock: ar->data_lock vs list->lock, the result is no protection.
So when ath10k_mgmt_over_wmi_tx_work() and ath10k_mac_tx_wmi_mgmt()
running concurrently on different CPUs, there appear to be a rare corner
cases when the queue length is 1,

CPUx (skb_deuque) CPUy (__skb_queue_tail)
next=list
prev=list
struct sk_buff *skb = skb_peek(list); WRITE_ONCE(newsk->next, next);
WRITE_ONCE(list->qlen, list->qlen - 1);WRITE_ONCE(newsk->prev, prev);
next = skb->next; WRITE_ONCE(next->prev, newsk);
prev = skb->prev; WRITE_ONCE(prev->next, newsk);
skb->next = skb->prev = NULL; list->qlen++;
WRITE_ONCE(next->prev, prev);
WRITE_ONCE(prev->next, next);

If the instruction ‘next = skb->next’ is executed before
‘WRITE_ONCE(prev->next, newsk)’, newsk will be lost, as CPUx get the
old ‘next’ pointer, but the length is still added by one. The final
result is the length of the queue will reach the maximum value but
the queue is empty.

So remove ar->data_lock, and use 'skb_queue_tail' instead of
'__skb_queue_tail' to prevent the potential race condition. Also switch
to use skb_queue_len_lockless, in case we queue a few SKBs simultaneously.

Tested-on: WCN3990 hw1.0 SNOC WLAN.HL.3.1.c2-00033-QCAHLSWMTPLZ-1

Signed-off-by: Miaoqing Pan <[email protected]>
Reviewed-by: Brian Norris <[email protected]>
Signed-off-by: Kalle Valo <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/net/wireless/ath/ath10k/mac.c | 15 ++++-----------
1 file changed, 4 insertions(+), 11 deletions(-)

diff --git a/drivers/net/wireless/ath/ath10k/mac.c b/drivers/net/wireless/ath/ath10k/mac.c
index 7d98250380ec5..91b1006ef2a27 100644
--- a/drivers/net/wireless/ath/ath10k/mac.c
+++ b/drivers/net/wireless/ath/ath10k/mac.c
@@ -3763,23 +3763,16 @@ bool ath10k_mac_tx_frm_has_freq(struct ath10k *ar)
static int ath10k_mac_tx_wmi_mgmt(struct ath10k *ar, struct sk_buff *skb)
{
struct sk_buff_head *q = &ar->wmi_mgmt_tx_queue;
- int ret = 0;
-
- spin_lock_bh(&ar->data_lock);

- if (skb_queue_len(q) == ATH10K_MAX_NUM_MGMT_PENDING) {
+ if (skb_queue_len_lockless(q) >= ATH10K_MAX_NUM_MGMT_PENDING) {
ath10k_warn(ar, "wmi mgmt tx queue is full\n");
- ret = -ENOSPC;
- goto unlock;
+ return -ENOSPC;
}

- __skb_queue_tail(q, skb);
+ skb_queue_tail(q, skb);
ieee80211_queue_work(ar->hw, &ar->wmi_mgmt_tx_work);

-unlock:
- spin_unlock_bh(&ar->data_lock);
-
- return ret;
+ return 0;
}

static enum ath10k_mac_tx_path
--
2.27.0

2021-02-24 13:22:47

by Sasha Levin

[permalink] [raw]
Subject: [PATCH AUTOSEL 5.11 22/67] brcmfmac: Add DMI nvram filename quirk for Predia Basic tablet

From: Hans de Goede <[email protected]>

[ Upstream commit af4b3a6f36d6c2fc5fca026bccf45e0fdcabddd9 ]

The Predia Basic tablet contains quite generic names in the sys_vendor and
product_name DMI strings, without this patch brcmfmac will try to load:
brcmfmac43340-sdio.Insyde-CherryTrail.txt as nvram file which is a bit
too generic.

Add a DMI quirk so that a unique and clearly identifiable nvram file name
is used on the Predia Basic tablet.

Signed-off-by: Hans de Goede <[email protected]>
Signed-off-by: Kalle Valo <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Sasha Levin <[email protected]>
---
.../net/wireless/broadcom/brcm80211/brcmfmac/dmi.c | 14 ++++++++++++++
1 file changed, 14 insertions(+)

diff --git a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/dmi.c b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/dmi.c
index 4aa2561934d77..824a79f243830 100644
--- a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/dmi.c
+++ b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/dmi.c
@@ -40,6 +40,10 @@ static const struct brcmf_dmi_data pov_tab_p1006w_data = {
BRCM_CC_43340_CHIP_ID, 2, "pov-tab-p1006w-data"
};

+static const struct brcmf_dmi_data predia_basic_data = {
+ BRCM_CC_43341_CHIP_ID, 2, "predia-basic"
+};
+
static const struct dmi_system_id dmi_platform_data[] = {
{
/* ACEPC T8 Cherry Trail Z8350 mini PC */
@@ -111,6 +115,16 @@ static const struct dmi_system_id dmi_platform_data[] = {
},
.driver_data = (void *)&pov_tab_p1006w_data,
},
+ {
+ /* Predia Basic tablet (+ with keyboard dock) */
+ .matches = {
+ DMI_MATCH(DMI_SYS_VENDOR, "Insyde"),
+ DMI_MATCH(DMI_PRODUCT_NAME, "CherryTrail"),
+ /* Mx.WT107.KUBNGEA02 with the version-nr dropped */
+ DMI_MATCH(DMI_BIOS_VERSION, "Mx.WT107.KUBNGEA"),
+ },
+ .driver_data = (void *)&predia_basic_data,
+ },
{}
};

--
2.27.0