Subject: [PATCH v2] ath10k: Fix soft lockup during firmware crash/hw-restart

From: Mohammed Shafi Shajakhan <[email protected]>

During firmware crash (or) user requested manual restart
the system gets into a soft lock up state because of the
below root cause.

During user requested hardware restart / firmware crash
the system goes into a soft lockup state as 'napi_synchronize'
is called after 'napi_disable' (which sets 'NAPI_STATE_SCHED'
bit) and it sleeps into infinite loop as it waits for
'NAPI_STATE_SCHED' to be cleared. This condition is hit because
'ath10k_hif_stop' is called twice as below (resulting in calling
'napi_synchronize' after 'napi_disable')

'ath10k_core_restart' -> 'ath10k_hif_stop' (ATH10K_STATE_ON) ->
-> 'ieee80211_restart_hw' -> 'ath10k_start' -> 'ath10k_halt' ->
'ath10k_core_stop' -> 'ath10k_hif_stop' (ATH10K_STATE_RESTARTING)

Fix this by calling 'ath10k_halt' in ath10k_core_restart itself
as it makes more sense before informing mac80211 to restart h/w
Also remove 'ath10k_halt' in ath10k_start for the state of 'restarting'

Fixes: 3c97f5de1f28 ("ath10k: implement NAPI support")
Signed-off-by: Mohammed Shafi Shajakhan <[email protected]>
---
[v2 Added Fixes ]

drivers/net/wireless/ath/ath10k/core.c | 2 +-
drivers/net/wireless/ath/ath10k/mac.c | 1 -
2 files changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/net/wireless/ath/ath10k/core.c b/drivers/net/wireless/ath/ath10k/core.c
index 7005e2a..5bc6847 100644
--- a/drivers/net/wireless/ath/ath10k/core.c
+++ b/drivers/net/wireless/ath/ath10k/core.c
@@ -1536,7 +1536,7 @@ static void ath10k_core_restart(struct work_struct *work)
switch (ar->state) {
case ATH10K_STATE_ON:
ar->state = ATH10K_STATE_RESTARTING;
- ath10k_hif_stop(ar);
+ ath10k_halt(ar);
ath10k_scan_finish(ar);
ieee80211_restart_hw(ar->hw);
break;
diff --git a/drivers/net/wireless/ath/ath10k/mac.c b/drivers/net/wireless/ath/ath10k/mac.c
index 717b2fa..481842b 100644
--- a/drivers/net/wireless/ath/ath10k/mac.c
+++ b/drivers/net/wireless/ath/ath10k/mac.c
@@ -4449,7 +4449,6 @@ static int ath10k_start(struct ieee80211_hw *hw)
ar->state = ATH10K_STATE_ON;
break;
case ATH10K_STATE_RESTARTING:
- ath10k_halt(ar);
ar->state = ATH10K_STATE_RESTARTED;
break;
case ATH10K_STATE_ON:
--
1.9.1


2016-12-01 10:35:48

by Kalle Valo

[permalink] [raw]
Subject: Re: [PATCH v2] ath10k: Fix soft lockup during firmware crash/hw-restart

Mohammed Shafi Shajakhan <[email protected]> writes:

> From: Mohammed Shafi Shajakhan <[email protected]>
>
> During firmware crash (or) user requested manual restart
> the system gets into a soft lock up state because of the
> below root cause.
>
> During user requested hardware restart / firmware crash
> the system goes into a soft lockup state as 'napi_synchronize'
> is called after 'napi_disable' (which sets 'NAPI_STATE_SCHED'
> bit) and it sleeps into infinite loop as it waits for
> 'NAPI_STATE_SCHED' to be cleared. This condition is hit because
> 'ath10k_hif_stop' is called twice as below (resulting in calling
> 'napi_synchronize' after 'napi_disable')
>
> 'ath10k_core_restart' -> 'ath10k_hif_stop' (ATH10K_STATE_ON) ->
> -> 'ieee80211_restart_hw' -> 'ath10k_start' -> 'ath10k_halt' ->
> 'ath10k_core_stop' -> 'ath10k_hif_stop' (ATH10K_STATE_RESTARTING)
>
> Fix this by calling 'ath10k_halt' in ath10k_core_restart itself
> as it makes more sense before informing mac80211 to restart h/w
> Also remove 'ath10k_halt' in ath10k_start for the state of 'restarting'
>
> Fixes: 3c97f5de1f28 ("ath10k: implement NAPI support")
> Signed-off-by: Mohammed Shafi Shajakhan <[email protected]>
> ---
> [v2 Added Fixes ]

I'll also add:

Cc: <[email protected]> # v4.9

--=20
Kalle Valo=

2016-12-01 11:13:44

by Kalle Valo

[permalink] [raw]
Subject: Re: [v2] ath10k: Fix soft lockup during firmware crash/hw-restart

Mohammed Shafi Shajakhan <[email protected]> wrote:
> From: Mohammed Shafi Shajakhan <[email protected]>
>
> During firmware crash (or) user requested manual restart
> the system gets into a soft lock up state because of the
> below root cause.
>
> During user requested hardware restart / firmware crash
> the system goes into a soft lockup state as 'napi_synchronize'
> is called after 'napi_disable' (which sets 'NAPI_STATE_SCHED'
> bit) and it sleeps into infinite loop as it waits for
> 'NAPI_STATE_SCHED' to be cleared. This condition is hit because
> 'ath10k_hif_stop' is called twice as below (resulting in calling
> 'napi_synchronize' after 'napi_disable')
>
> 'ath10k_core_restart' -> 'ath10k_hif_stop' (ATH10K_STATE_ON) ->
> -> 'ieee80211_restart_hw' -> 'ath10k_start' -> 'ath10k_halt' ->
> 'ath10k_core_stop' -> 'ath10k_hif_stop' (ATH10K_STATE_RESTARTING)
>
> Fix this by calling 'ath10k_halt' in ath10k_core_restart itself
> as it makes more sense before informing mac80211 to restart h/w
> Also remove 'ath10k_halt' in ath10k_start for the state of 'restarting'
>
> Fixes: 3c97f5de1f28 ("ath10k: implement NAPI support")
> Signed-off-by: Mohammed Shafi Shajakhan <[email protected]>

Patch applied to ath-next branch of ath.git, thanks.

c2cac2f74ab4 ath10k: fix soft lockup during firmware crash/hw-restart

--
https://patchwork.kernel.org/patch/9453609/

Documentation about submitting wireless patches and checking status
from patchwork:

https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches

2016-12-01 11:12:51

by Mohammed Shafi Shajakhan

[permalink] [raw]
Subject: Re: [PATCH v2] ath10k: Fix soft lockup during firmware crash/hw-restart

On Thu, Dec 01, 2016 at 10:35:38AM +0000, Valo, Kalle wrote:
> Mohammed Shafi Shajakhan <[email protected]> writes:
>
> > From: Mohammed Shafi Shajakhan <[email protected]>
> >
> > During firmware crash (or) user requested manual restart
> > the system gets into a soft lock up state because of the
> > below root cause.
> >
> > During user requested hardware restart / firmware crash
> > the system goes into a soft lockup state as 'napi_synchronize'
> > is called after 'napi_disable' (which sets 'NAPI_STATE_SCHED'
> > bit) and it sleeps into infinite loop as it waits for
> > 'NAPI_STATE_SCHED' to be cleared. This condition is hit because
> > 'ath10k_hif_stop' is called twice as below (resulting in calling
> > 'napi_synchronize' after 'napi_disable')
> >
> > 'ath10k_core_restart' -> 'ath10k_hif_stop' (ATH10K_STATE_ON) ->
> > -> 'ieee80211_restart_hw' -> 'ath10k_start' -> 'ath10k_halt' ->
> > 'ath10k_core_stop' -> 'ath10k_hif_stop' (ATH10K_STATE_RESTARTING)
> >
> > Fix this by calling 'ath10k_halt' in ath10k_core_restart itself
> > as it makes more sense before informing mac80211 to restart h/w
> > Also remove 'ath10k_halt' in ath10k_start for the state of 'restarting'
> >
> > Fixes: 3c97f5de1f28 ("ath10k: implement NAPI support")
> > Signed-off-by: Mohammed Shafi Shajakhan <[email protected]>
> > ---
> > [v2 Added Fixes ]
>
> I'll also add:
>
> Cc: <[email protected]> # v4.9
>
thank you Kalle.

regards,
shafi