2022-10-11 07:25:48

by Wen Gong

[permalink] [raw]
Subject: [PATCH v3 0/2] wifi: ath11k: reduce the timeout value for hw scan

v3:
change code and log of "wifi: ath11k: reduce the timeout value back for hw scan from 10 seconds to 1 second"
to handle the "failed to start hw scan: -110" correctly.

v2:
add "wifi: ath11k: change to set 11d state instead of start 11d scan while disconnect"

Reduce the max timeout for hw scan started.

Wen Gong (2):
wifi: ath11k: change to set 11d state instead of start 11d scan while
disconnect
wifi: ath11k: reduce the timeout value back for hw scan from 10
seconds to 1 second

drivers/net/wireless/ath/ath11k/mac.c | 28 +++++++++++++++------------
1 file changed, 16 insertions(+), 12 deletions(-)


base-commit: c6d18be90f9b0c7fb64c6138b51c49151140fb57
--
2.31.1


2022-10-11 07:25:48

by Wen Gong

[permalink] [raw]
Subject: [PATCH v3 1/2] wifi: ath11k: change to set 11d state instead of start 11d scan while disconnect

When switch to connect to a new AP for station which is already connected
to an AP, the time cost is too long, it arrives 10 seconds.

The reason is when switch connection, disconnect operation happened on
the 1st AP, then 11d scan start command sent to firmware, and then a
new hw scan arrived for the 2nd AP. The 11d scan is running at this
moment, so the hw scan can not start immediately, it needs to wait
the 11d scan finished, it increased the time cost of switch AP and
even happened scan fail as log below after apply the incoming patch.

[ 1194.815104] ath11k_pci 0000:06:00.0: failed to start hw scan: -110
[ 1196.864157] ath11k_pci 0000:06:00.0: failed to start hw scan: -110
[ 1198.911926] ath11k_pci 0000:06:00.0: failed to start hw scan: -110

Change to set 11d state while disconnect, and the 11d scan will be
started after the new hw scan in ath11k_mac_op_hw_scan(). Then the
time cost of switching AP is small and not happened scan fail.

Tested-on: WCN6855 hw2.0 PCI WLAN.HSP.1.1-03125-QCAHSPSWPL_V1_V2_SILICONZ_LITE-3

Fixes: 9dcf6808b253 ("ath11k: add 11d scan offload support")
Signed-off-by: Wen Gong <[email protected]>
---
drivers/net/wireless/ath/ath11k/mac.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/net/wireless/ath/ath11k/mac.c b/drivers/net/wireless/ath/ath11k/mac.c
index 4218211afa30..b0c3cf258d12 100644
--- a/drivers/net/wireless/ath/ath11k/mac.c
+++ b/drivers/net/wireless/ath/ath11k/mac.c
@@ -7190,8 +7190,12 @@ ath11k_mac_op_unassign_vif_chanctx(struct ieee80211_hw *hw,
ret);
}

- if (arvif->vdev_type == WMI_VDEV_TYPE_STA)
- ath11k_mac_11d_scan_start(ar, arvif->vdev_id);
+ if (arvif->vdev_type == WMI_VDEV_TYPE_STA &&
+ ar->state_11d != ATH11K_11D_PREPARING &&
+ test_bit(WMI_TLV_SERVICE_11D_OFFLOAD, ab->wmi_ab.svc_map)) {
+ reinit_completion(&ar->completed_11d_scan);
+ ar->state_11d = ATH11K_11D_PREPARING;
+ }

mutex_unlock(&ar->conf_mutex);
}
--
2.31.1

2022-10-11 07:25:56

by Wen Gong

[permalink] [raw]
Subject: [PATCH v3 2/2] wifi: ath11k: reduce the timeout value back for hw scan from 10 seconds to 1 second

For 11d scan, commit 9dcf6808b253 ("ath11k: add 11d scan offload support")
increased the timeout from one second to max 10 seconds when 11d scan
offload enabled and 6 GHz enabled, it is reasonable for the commit, it
is because the first 11d scan request is sent to firmware before the
first hw scan request after wlan load, then the hw scan started event
will reported from firmware after the 11d scan finished, it needs about
6 seconds when 6 GHz enabled, so increased it from one second to 10
seconds in the commit to avoid timed out for hw scan started. Then
another commit 1f682dc9fb37 ("ath11k: reduce the wait time of 11d scan
and hw scan while add interface") change the sequence of the first 11d
scan and hw scan, then ath11k will receive the hw scan started event
from firmware immediately for the first hw scan, thus ath11k does not
need set the timeout value to max 10 seconds again, and this is to set
the timeout value back from 10 seconds to 1 second.

After the 1st hw scan finished, firmware will start 11d scan immediately,
and firmware need use some seconds to finish 11d scan, if the 2nd hw
scan is sent from ath11k to firmware before 11d scan finished, the 2nd
hw scan will started after 11d scan finished, this will lead timeout to
wait scan started in ath11k. Treat the timeout as a normal situation if
11d scan is running and skip report scan fail for this situation.

Tested-on: WCN6855 hw2.0 PCI WLAN.HSP.1.1-03125-QCAHSPSWPL_V1_V2_SILICONZ_LITE-3

Signed-off-by: Wen Gong <[email protected]>
---
drivers/net/wireless/ath/ath11k/mac.c | 20 ++++++++++----------
1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/drivers/net/wireless/ath/ath11k/mac.c b/drivers/net/wireless/ath/ath11k/mac.c
index b0c3cf258d12..666775a1e2a9 100644
--- a/drivers/net/wireless/ath/ath11k/mac.c
+++ b/drivers/net/wireless/ath/ath11k/mac.c
@@ -3560,7 +3560,6 @@ static int ath11k_start_scan(struct ath11k *ar,
struct scan_req_params *arg)
{
int ret;
- unsigned long timeout = 1 * HZ;

lockdep_assert_held(&ar->conf_mutex);

@@ -3571,19 +3570,15 @@ static int ath11k_start_scan(struct ath11k *ar,
if (ret)
return ret;

- if (test_bit(WMI_TLV_SERVICE_11D_OFFLOAD, ar->ab->wmi_ab.svc_map)) {
- timeout = 5 * HZ;
-
- if (ar->supports_6ghz)
- timeout += 5 * HZ;
- }
-
- ret = wait_for_completion_timeout(&ar->scan.started, timeout);
+ ret = wait_for_completion_timeout(&ar->scan.started, 1 * HZ);
if (ret == 0) {
ret = ath11k_scan_stop(ar);
if (ret)
ath11k_warn(ar->ab, "failed to stop scan: %d\n", ret);

+ if (ar->state_11d == ATH11K_11D_RUNNING)
+ return -EBUSY;
+
return -ETIMEDOUT;
}

@@ -3682,7 +3677,12 @@ static int ath11k_mac_op_hw_scan(struct ieee80211_hw *hw,

ret = ath11k_start_scan(ar, &arg);
if (ret) {
- ath11k_warn(ar->ab, "failed to start hw scan: %d\n", ret);
+ if (ret == -EBUSY)
+ ath11k_dbg(ar->ab, ATH11K_DBG_MAC,
+ "scan engine is busy 11d state %d\n", ar->state_11d);
+ else
+ ath11k_warn(ar->ab, "failed to start hw scan: %d\n", ret);
+
spin_lock_bh(&ar->data_lock);
ar->scan.state = ATH11K_SCAN_IDLE;
spin_unlock_bh(&ar->data_lock);
--
2.31.1

2022-11-08 10:21:21

by Kalle Valo

[permalink] [raw]
Subject: Re: [PATCH v3 2/2] wifi: ath11k: reduce the timeout value back for hw scan from 10 seconds to 1 second

Wen Gong <[email protected]> writes:

> For 11d scan, commit 9dcf6808b253 ("ath11k: add 11d scan offload support")
> increased the timeout from one second to max 10 seconds when 11d scan
> offload enabled and 6 GHz enabled, it is reasonable for the commit, it
> is because the first 11d scan request is sent to firmware before the
> first hw scan request after wlan load, then the hw scan started event
> will reported from firmware after the 11d scan finished, it needs about
> 6 seconds when 6 GHz enabled, so increased it from one second to 10
> seconds in the commit to avoid timed out for hw scan started. Then
> another commit 1f682dc9fb37 ("ath11k: reduce the wait time of 11d scan
> and hw scan while add interface") change the sequence of the first 11d
> scan and hw scan, then ath11k will receive the hw scan started event
> from firmware immediately for the first hw scan, thus ath11k does not
> need set the timeout value to max 10 seconds again, and this is to set
> the timeout value back from 10 seconds to 1 second.
>
> After the 1st hw scan finished, firmware will start 11d scan immediately,
> and firmware need use some seconds to finish 11d scan, if the 2nd hw
> scan is sent from ath11k to firmware before 11d scan finished, the 2nd
> hw scan will started after 11d scan finished, this will lead timeout to
> wait scan started in ath11k. Treat the timeout as a normal situation if
> 11d scan is running and skip report scan fail for this situation.
>
> Tested-on: WCN6855 hw2.0 PCI WLAN.HSP.1.1-03125-QCAHSPSWPL_V1_V2_SILICONZ_LITE-3
>
> Signed-off-by: Wen Gong <[email protected]>

[...]

> @@ -3682,7 +3677,12 @@ static int ath11k_mac_op_hw_scan(struct ieee80211_hw *hw,
>
> ret = ath11k_start_scan(ar, &arg);
> if (ret) {
> - ath11k_warn(ar->ab, "failed to start hw scan: %d\n", ret);
> + if (ret == -EBUSY)
> + ath11k_dbg(ar->ab, ATH11K_DBG_MAC,
> + "scan engine is busy 11d state %d\n", ar->state_11d);
> + else
> + ath11k_warn(ar->ab, "failed to start hw scan: %d\n", ret);
> +
> spin_lock_bh(&ar->data_lock);
> ar->scan.state = ATH11K_SCAN_IDLE;
> spin_unlock_bh(&ar->data_lock);

This feels like a hack to me, for example will these failed scans now
cause delays is connection establishment? IMHO it's crucial from user's
point of view that we don't delay that in any way.

I would rather fix the root cause, do we know what's causing this?

--
https://patchwork.kernel.org/project/linux-wireless/list/

https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches

2022-11-18 10:43:05

by Wen Gong

[permalink] [raw]
Subject: Re: [PATCH v3 2/2] wifi: ath11k: reduce the timeout value back for hw scan from 10 seconds to 1 second

On 11/8/2022 6:20 PM, Kalle Valo wrote:
> Wen Gong <[email protected]> writes:
>
...
> [...]
>
>> @@ -3682,7 +3677,12 @@ static int ath11k_mac_op_hw_scan(struct ieee80211_hw *hw,
>>
>> ret = ath11k_start_scan(ar, &arg);
>> if (ret) {
>> - ath11k_warn(ar->ab, "failed to start hw scan: %d\n", ret);
>> + if (ret == -EBUSY)
>> + ath11k_dbg(ar->ab, ATH11K_DBG_MAC,
>> + "scan engine is busy 11d state %d\n", ar->state_11d);
>> + else
>> + ath11k_warn(ar->ab, "failed to start hw scan: %d\n", ret);
>> +
>> spin_lock_bh(&ar->data_lock);
>> ar->scan.state = ATH11K_SCAN_IDLE;
>> spin_unlock_bh(&ar->data_lock);
> This feels like a hack to me, for example will these failed scans now
> cause delays is connection establishment? IMHO it's crucial from user's
> point of view that we don't delay that in any way.
It will not delay connection.
After wlan load, the 1st hw scan will arrived to ath11k, and then 11d
scan will be sent to firmware after the 1st hw scan. It means the hw
scan for connection is run before 11d scan, and then connection could
be started immediately after the 1st hw scan finished. It means no
delay for connection.
> I would rather fix the root cause, do we know what's causing this?
In firmware, hw scan and 11d scan are all running in the same queue,
they can not be run parallel.

When 6 GHz enabled, the 1st hw scan cost about 7s and finished, and
then 11d scan cost the next 7s. After the 14s, the each hw scan arrived
to ath11k will be run immediately. If the 2nd hw scan arrived before
the 11d scan finished, for example, it arrived 7.1 seconds after the
1st hw scan, at this moment, the 11d scan is still running in firmware,
then the 2nd hw scan will not receive scan started event untill the 11d
scan finished, and meanwhile, the 2nd hw scan is holding the ar->conf_mutex
in ath11k_mac_op_hw_scan(), it is not good to hold a lock for some
seconds because ar->conf_mutex is widely used. So reduce the 10s to 1s
to avoid holding ar->conf_mutex for long time.

2022-11-23 04:04:34

by Wen Gong

[permalink] [raw]
Subject: Re: [PATCH v3 2/2] wifi: ath11k: reduce the timeout value back for hw scan from 10 seconds to 1 second

On 11/18/2022 6:29 PM, Wen Gong wrote:
> On 11/8/2022 6:20 PM, Kalle Valo wrote:
>> Wen Gong <[email protected]> writes:
>>
> ...
>> [...]
>>
>>> @@ -3682,7 +3677,12 @@ static int ath11k_mac_op_hw_scan(struct
>>> ieee80211_hw *hw,
>>>         ret = ath11k_start_scan(ar, &arg);
>>>       if (ret) {
>>> -        ath11k_warn(ar->ab, "failed to start hw scan: %d\n", ret);
>>> +        if (ret == -EBUSY)
>>> +            ath11k_dbg(ar->ab, ATH11K_DBG_MAC,
>>> +                   "scan engine is busy 11d state %d\n",
>>> ar->state_11d);
>>> +        else
>>> +            ath11k_warn(ar->ab, "failed to start hw scan: %d\n", ret);
>>> +
>>>           spin_lock_bh(&ar->data_lock);
>>>           ar->scan.state = ATH11K_SCAN_IDLE;
>>>           spin_unlock_bh(&ar->data_lock);
>> This feels like a hack to me, for example will these failed scans now
>> cause delays is connection establishment? IMHO it's crucial from user's
>> point of view that we don't delay that in any way.
> It will not delay connection.
> After wlan load, the 1st hw scan will arrived to ath11k, and then 11d
> scan will be sent to firmware after the 1st hw scan. It means the hw
> scan for connection is run before 11d scan, and then connection could
> be started immediately after the 1st hw scan finished. It means no
> delay for connection.
>> I would rather fix the root cause, do we know what's causing this?
> In firmware, hw scan and 11d scan are all running in the same queue,
> they can not be run parallel.
>
> When 6 GHz enabled, the 1st hw scan cost about 7s and finished, and
> then 11d scan cost the next 7s. After the 14s, the each hw scan arrived
> to ath11k will be run immediately. If the 2nd hw scan arrived before
> the 11d scan finished, for example, it arrived 7.1 seconds after the
> 1st hw scan, at this moment, the 11d scan is still running in firmware,
> then the 2nd hw scan will not receive scan started event untill the 11d
> scan finished, and meanwhile, the 2nd hw scan is holding the
> ar->conf_mutex
> in ath11k_mac_op_hw_scan(), it is not good to hold a lock for some
> seconds because ar->conf_mutex is widely used. So reduce the 10s to 1s
> to avoid holding ar->conf_mutex for long time.

Hi Kalle,

Should I change commit log with above explanation and send v4?

2022-12-16 03:14:13

by Wen Gong

[permalink] [raw]
Subject: Re: [PATCH v3 2/2] wifi: ath11k: reduce the timeout value back for hw scan from 10 seconds to 1 second

Hi Kalle,

Should I change commit log with below explanation and send v4?

On 11/23/2022 11:41 AM, Wen Gong wrote:
> On 11/18/2022 6:29 PM, Wen Gong wrote:
>> On 11/8/2022 6:20 PM, Kalle Valo wrote:
>>> Wen Gong <[email protected]> writes:
>>>
>> ...
>>> [...]
>>>
>>>> @@ -3682,7 +3677,12 @@ static int ath11k_mac_op_hw_scan(struct
>>>> ieee80211_hw *hw,
>>>>         ret = ath11k_start_scan(ar, &arg);
>>>>       if (ret) {
>>>> -        ath11k_warn(ar->ab, "failed to start hw scan: %d\n", ret);
>>>> +        if (ret == -EBUSY)
>>>> +            ath11k_dbg(ar->ab, ATH11K_DBG_MAC,
>>>> +                   "scan engine is busy 11d state %d\n",
>>>> ar->state_11d);
>>>> +        else
>>>> +            ath11k_warn(ar->ab, "failed to start hw scan: %d\n",
>>>> ret);
>>>> +
>>>>           spin_lock_bh(&ar->data_lock);
>>>>           ar->scan.state = ATH11K_SCAN_IDLE;
>>>>           spin_unlock_bh(&ar->data_lock);
>>> This feels like a hack to me, for example will these failed scans now
>>> cause delays is connection establishment? IMHO it's crucial from user's
>>> point of view that we don't delay that in any way.
>> It will not delay connection.
>> After wlan load, the 1st hw scan will arrived to ath11k, and then 11d
>> scan will be sent to firmware after the 1st hw scan. It means the hw
>> scan for connection is run before 11d scan, and then connection could
>> be started immediately after the 1st hw scan finished. It means no
>> delay for connection.
>>> I would rather fix the root cause, do we know what's causing this?
>> In firmware, hw scan and 11d scan are all running in the same queue,
>> they can not be run parallel.
>>
>> When 6 GHz enabled, the 1st hw scan cost about 7s and finished, and
>> then 11d scan cost the next 7s. After the 14s, the each hw scan arrived
>> to ath11k will be run immediately. If the 2nd hw scan arrived before
>> the 11d scan finished, for example, it arrived 7.1 seconds after the
>> 1st hw scan, at this moment, the 11d scan is still running in firmware,
>> then the 2nd hw scan will not receive scan started event untill the 11d
>> scan finished, and meanwhile, the 2nd hw scan is holding the
>> ar->conf_mutex
>> in ath11k_mac_op_hw_scan(), it is not good to hold a lock for some
>> seconds because ar->conf_mutex is widely used. So reduce the 10s to 1s
>> to avoid holding ar->conf_mutex for long time.
>
> Hi Kalle,
>
> Should I change commit log with above explanation and send v4?
>

2023-01-13 07:15:01

by Wen Gong

[permalink] [raw]
Subject: Re: [PATCH v3 2/2] wifi: ath11k: reduce the timeout value back for hw scan from 10 seconds to 1 second

Hi Kalle,

Should I change commit log with below explanation and send v4?

On 12/16/2022 11:08 AM, Wen Gong wrote:
> Hi Kalle,
>
> Should I change commit log with below explanation and send v4?
>
> On 11/23/2022 11:41 AM, Wen Gong wrote:
>> On 11/18/2022 6:29 PM, Wen Gong wrote:
>>> On 11/8/2022 6:20 PM, Kalle Valo wrote:
>>>> Wen Gong <[email protected]> writes:
>>>>
>>> ...
>>>> [...]
>>>>
>>>>> @@ -3682,7 +3677,12 @@ static int ath11k_mac_op_hw_scan(struct
>>>>> ieee80211_hw *hw,
>>>>>         ret = ath11k_start_scan(ar, &arg);
>>>>>       if (ret) {
>>>>> -        ath11k_warn(ar->ab, "failed to start hw scan: %d\n", ret);
>>>>> +        if (ret == -EBUSY)
>>>>> +            ath11k_dbg(ar->ab, ATH11K_DBG_MAC,
>>>>> +                   "scan engine is busy 11d state %d\n",
>>>>> ar->state_11d);
>>>>> +        else
>>>>> +            ath11k_warn(ar->ab, "failed to start hw scan: %d\n",
>>>>> ret);
>>>>> +
>>>>>           spin_lock_bh(&ar->data_lock);
>>>>>           ar->scan.state = ATH11K_SCAN_IDLE;
>>>>>           spin_unlock_bh(&ar->data_lock);
>>>> This feels like a hack to me, for example will these failed scans now
>>>> cause delays is connection establishment? IMHO it's crucial from
>>>> user's
>>>> point of view that we don't delay that in any way.
>>> It will not delay connection.
>>> After wlan load, the 1st hw scan will arrived to ath11k, and then 11d
>>> scan will be sent to firmware after the 1st hw scan. It means the hw
>>> scan for connection is run before 11d scan, and then connection could
>>> be started immediately after the 1st hw scan finished. It means no
>>> delay for connection.
>>>> I would rather fix the root cause, do we know what's causing this?
>>> In firmware, hw scan and 11d scan are all running in the same queue,
>>> they can not be run parallel.
>>>
>>> When 6 GHz enabled, the 1st hw scan cost about 7s and finished, and
>>> then 11d scan cost the next 7s. After the 14s, the each hw scan arrived
>>> to ath11k will be run immediately. If the 2nd hw scan arrived before
>>> the 11d scan finished, for example, it arrived 7.1 seconds after the
>>> 1st hw scan, at this moment, the 11d scan is still running in firmware,
>>> then the 2nd hw scan will not receive scan started event untill the 11d
>>> scan finished, and meanwhile, the 2nd hw scan is holding the
>>> ar->conf_mutex
>>> in ath11k_mac_op_hw_scan(), it is not good to hold a lock for some
>>> seconds because ar->conf_mutex is widely used. So reduce the 10s to 1s
>>> to avoid holding ar->conf_mutex for long time.
>>
>> Hi Kalle,
>>
>> Should I change commit log with above explanation and send v4?
>>

2023-01-13 12:24:10

by Kalle Valo

[permalink] [raw]
Subject: Re: [PATCH v3 2/2] wifi: ath11k: reduce the timeout value back for hw scan from 10 seconds to 1 second

Wen Gong <[email protected]> writes:

> Should I change commit log with below explanation and send v4?

Please stop spamming the same question over and over, it's really
annoying. If I don't have time to look at something, spamming me won't
help, quite the opposite. It would be a lot better if you would help
with the other upstream related tasks we have, that way I might have
more time to look at your patches.

To answer your question I need to look at this patchset in detail and I
don't know when I'm able to do that. But at this moment I don't trust
this patchset is the right approach and I'm not willing to take it.

--
https://patchwork.kernel.org/project/linux-wireless/list/

https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches

2023-01-31 02:41:01

by Wen Gong

[permalink] [raw]
Subject: Re: [PATCH v3 2/2] wifi: ath11k: reduce the timeout value back for hw scan from 10 seconds to 1 second

On 1/13/2023 8:14 PM, Kalle Valo wrote:
> Wen Gong <[email protected]> writes:
>
>> Should I change commit log with below explanation and send v4?
> Please stop spamming the same question over and over, it's really
> annoying. If I don't have time to look at something, spamming me won't
> help, quite the opposite. It would be a lot better if you would help
> with the other upstream related tasks we have, that way I might have
> more time to look at your patches.
>
> To answer your question I need to look at this patchset in detail and I
> don't know when I'm able to do that. But at this moment I don't trust
> this patchset is the right approach and I'm not willing to take it.

yes.

I will send v4 only for one patch "[v3,1/2] wifi: ath11k: change to set
11d state instead of start 11d scan while disconnect",

is it ok?