2024-05-24 02:37:05

by Baochen Qiang

[permalink] [raw]
Subject: [PATCH] wifi: ath12k: fix race due to setting ATH12K_FLAG_EXT_IRQ_ENABLED too early

Commit 5082b3e3027e ("wifi: ath11k: fix race due to setting
ATH11K_FLAG_EXT_IRQ_ENABLED too early") fixes a race in ath11k
driver. Since ath12k shares the same logic as ath11k, currently
the race also exists in ath12k: in ath12k_pci_ext_irq_enable(),
ATH12K_FLAG_EXT_IRQ_ENABLED is set before NAPI is enabled.
In cases where only one MSI vector is allocated, this results
in a race condition: after ATH12K_FLAG_EXT_IRQ_ENABLED is set
but before NAPI enabled, CE interrupt breaks in. Since IRQ is
shared by CE and data path, ath12k_pci_ext_interrupt_handler()
is also called where we call disable_irq_nosync() to disable
IRQ. Then napi_schedule() is called but it does nothing because
NAPI is not enabled at that time, meaning that
ath12k_pci_ext_grp_napi_poll() will never run, so we have
no chance to call enable_irq() to enable IRQ back. Since IRQ
is shared, all interrupts are disabled and we would finally
get no response from target.

So port ath11k fix here, this is done by setting
ATH12K_FLAG_EXT_IRQ_ENABLED after all NAPI and IRQ work are
done. With the fix, we are sure that by the time
ATH12K_FLAG_EXT_IRQ_ENABLED is set, NAPI is enabled.

Note that the fix above also introduce some side effects:
if ath12k_pci_ext_interrupt_handler() breaks in after NAPI
enabled but before ATH12K_FLAG_EXT_IRQ_ENABLED set, nothing
will be done by the handler this time, the work will be
postponed till the next time the IRQ fires.

This is found during code review.

Tested-on: WCN7850 hw2.0 PCI WLAN.HMT.1.0.c5-00481-QCAHMTSWPL_V1.0_V2.0_SILICONZ-3

Signed-off-by: Baochen Qiang <[email protected]>
---
drivers/net/wireless/ath/ath12k/pci.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/wireless/ath/ath12k/pci.c b/drivers/net/wireless/ath/ath12k/pci.c
index ac75e8e3916b..11b95d037051 100644
--- a/drivers/net/wireless/ath/ath12k/pci.c
+++ b/drivers/net/wireless/ath/ath12k/pci.c
@@ -1107,14 +1107,14 @@ void ath12k_pci_ext_irq_enable(struct ath12k_base *ab)
{
int i;

- set_bit(ATH12K_FLAG_EXT_IRQ_ENABLED, &ab->dev_flags);
-
for (i = 0; i < ATH12K_EXT_IRQ_GRP_NUM_MAX; i++) {
struct ath12k_ext_irq_grp *irq_grp = &ab->ext_irq_grp[i];

napi_enable(&irq_grp->napi);
ath12k_pci_ext_grp_enable(irq_grp);
}
+
+ set_bit(ATH12K_FLAG_EXT_IRQ_ENABLED, &ab->dev_flags);
}

void ath12k_pci_ext_irq_disable(struct ath12k_base *ab)

base-commit: fae0804439b5833308ab7d8563c643edb0fa174c
--
2.25.1



2024-05-24 16:51:04

by Jeff Johnson

[permalink] [raw]
Subject: Re: [PATCH] wifi: ath12k: fix race due to setting ATH12K_FLAG_EXT_IRQ_ENABLED too early

On 5/23/2024 7:36 PM, Baochen Qiang wrote:
> Commit 5082b3e3027e ("wifi: ath11k: fix race due to setting
> ATH11K_FLAG_EXT_IRQ_ENABLED too early") fixes a race in ath11k
> driver. Since ath12k shares the same logic as ath11k, currently
> the race also exists in ath12k: in ath12k_pci_ext_irq_enable(),
> ATH12K_FLAG_EXT_IRQ_ENABLED is set before NAPI is enabled.
> In cases where only one MSI vector is allocated, this results
> in a race condition: after ATH12K_FLAG_EXT_IRQ_ENABLED is set
> but before NAPI enabled, CE interrupt breaks in. Since IRQ is
> shared by CE and data path, ath12k_pci_ext_interrupt_handler()
> is also called where we call disable_irq_nosync() to disable
> IRQ. Then napi_schedule() is called but it does nothing because
> NAPI is not enabled at that time, meaning that
> ath12k_pci_ext_grp_napi_poll() will never run, so we have
> no chance to call enable_irq() to enable IRQ back. Since IRQ
> is shared, all interrupts are disabled and we would finally
> get no response from target.
>
> So port ath11k fix here, this is done by setting
> ATH12K_FLAG_EXT_IRQ_ENABLED after all NAPI and IRQ work are
> done. With the fix, we are sure that by the time
> ATH12K_FLAG_EXT_IRQ_ENABLED is set, NAPI is enabled.
>
> Note that the fix above also introduce some side effects:
> if ath12k_pci_ext_interrupt_handler() breaks in after NAPI
> enabled but before ATH12K_FLAG_EXT_IRQ_ENABLED set, nothing
> will be done by the handler this time, the work will be
> postponed till the next time the IRQ fires.
>
> This is found during code review.
>
> Tested-on: WCN7850 hw2.0 PCI WLAN.HMT.1.0.c5-00481-QCAHMTSWPL_V1.0_V2.0_SILICONZ-3
>
> Signed-off-by: Baochen Qiang <[email protected]>

Acked-by: Jeff Johnson <[email protected]>


2024-05-28 17:45:11

by Kalle Valo

[permalink] [raw]
Subject: Re: [PATCH] wifi: ath12k: fix race due to setting ATH12K_FLAG_EXT_IRQ_ENABLED too early

Baochen Qiang <[email protected]> wrote:

> Commit 5082b3e3027e ("wifi: ath11k: fix race due to setting
> ATH11K_FLAG_EXT_IRQ_ENABLED too early") fixes a race in ath11k
> driver. Since ath12k shares the same logic as ath11k, currently
> the race also exists in ath12k: in ath12k_pci_ext_irq_enable(),
> ATH12K_FLAG_EXT_IRQ_ENABLED is set before NAPI is enabled.
> In cases where only one MSI vector is allocated, this results
> in a race condition: after ATH12K_FLAG_EXT_IRQ_ENABLED is set
> but before NAPI enabled, CE interrupt breaks in. Since IRQ is
> shared by CE and data path, ath12k_pci_ext_interrupt_handler()
> is also called where we call disable_irq_nosync() to disable
> IRQ. Then napi_schedule() is called but it does nothing because
> NAPI is not enabled at that time, meaning that
> ath12k_pci_ext_grp_napi_poll() will never run, so we have
> no chance to call enable_irq() to enable IRQ back. Since IRQ
> is shared, all interrupts are disabled and we would finally
> get no response from target.
>
> So port ath11k fix here, this is done by setting
> ATH12K_FLAG_EXT_IRQ_ENABLED after all NAPI and IRQ work are
> done. With the fix, we are sure that by the time
> ATH12K_FLAG_EXT_IRQ_ENABLED is set, NAPI is enabled.
>
> Note that the fix above also introduce some side effects:
> if ath12k_pci_ext_interrupt_handler() breaks in after NAPI
> enabled but before ATH12K_FLAG_EXT_IRQ_ENABLED set, nothing
> will be done by the handler this time, the work will be
> postponed till the next time the IRQ fires.
>
> This is found during code review.
>
> Tested-on: WCN7850 hw2.0 PCI WLAN.HMT.1.0.c5-00481-QCAHMTSWPL_V1.0_V2.0_SILICONZ-3
>
> Signed-off-by: Baochen Qiang <[email protected]>
> Acked-by: Jeff Johnson <[email protected]>
> Signed-off-by: Kalle Valo <[email protected]>

Patch applied to ath-next branch of ath.git, thanks.

0a993772e0f0 wifi: ath12k: fix race due to setting ATH12K_FLAG_EXT_IRQ_ENABLED too early

--
https://patchwork.kernel.org/project/linux-wireless/patch/[email protected]/

https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches