2021-02-10 00:49:06

by Shuah Khan

[permalink] [raw]
Subject: [PATCH 0/5] ath10k fixes for warns

I have been seeing lockdep asserts for a couple of months and finally
found time to debug and fix the problems. The dmesg looks clean with
these fixes.

Enabling LOCKDEP and ATH10K_DEBUGFS triggers the lockdep assert and
RCU warns.

The first two patches in this series are fixes to lockdep assert and
RCU usage bugs.

The last patch (5/5) is a fix to reduce invalid ht params rate message
noise. Patch 3/4 changes a message from debug to warn. Patch 4 adds
detect to assert not calling ath10k_drain_tx() holding conf_mutex.

Shuah Khan (5):
ath10k: fix conf_mutex lock assert in ath10k_debug_fw_stats_request()
ath10k: fix WARNING: suspicious RCU usage
ath10k: change ath10k_offchan_tx_work() peer present msg to a warn
ath10k: detect conf_mutex held ath10k_drain_tx() calls
ath10k: reduce invalid ht params rate message noise

drivers/net/wireless/ath/ath10k/mac.c | 13 ++++++++-----
drivers/net/wireless/ath/ath10k/wmi-tlv.c | 15 +++++++++++----
2 files changed, 19 insertions(+), 9 deletions(-)

--
2.27.0


2021-02-10 00:51:01

by Shuah Khan

[permalink] [raw]
Subject: [PATCH 5/5] ath10k: reduce invalid ht params rate message noise

ath10k_mac_get_rate_flags_ht() floods dmesg with the following messages,
when it fails to find a match for mcs=7 and rate=1440.

supported_ht_mcs_rate_nss2:
{7, {1300, 2700, 1444, 3000} }

ath10k_pci 0000:02:00.0: invalid ht params rate 1440 100kbps nss 2 mcs 7

dev_warn_ratelimited() isn't helping the noise. Use dev_warn_once()
instead.

Signed-off-by: Shuah Khan <[email protected]>
---
drivers/net/wireless/ath/ath10k/mac.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/net/wireless/ath/ath10k/mac.c b/drivers/net/wireless/ath/ath10k/mac.c
index 3545ce7dce0a..276321f0cfdd 100644
--- a/drivers/net/wireless/ath/ath10k/mac.c
+++ b/drivers/net/wireless/ath/ath10k/mac.c
@@ -8970,8 +8970,9 @@ static void ath10k_mac_get_rate_flags_ht(struct ath10k *ar, u32 rate, u8 nss, u8
*bw |= RATE_INFO_BW_40;
*flags |= RATE_INFO_FLAGS_SHORT_GI;
} else {
- ath10k_warn(ar, "invalid ht params rate %d 100kbps nss %d mcs %d",
- rate, nss, mcs);
+ dev_warn_once(ar->dev,
+ "invalid ht params rate %d 100kbps nss %d mcs %d",
+ rate, nss, mcs);
}
}

--
2.27.0

2021-02-10 00:51:30

by Shuah Khan

[permalink] [raw]
Subject: [PATCH 3/5] ath10k: change ath10k_offchan_tx_work() peer present msg to a warn

Based on the comment block in this function and the FIXME for this, peer
being present for the offchannel tx is unlikely. Peer is deleted once tx
is complete. Change peer present msg to a warn to detect this condition.

Signed-off-by: Shuah Khan <[email protected]>
---
drivers/net/wireless/ath/ath10k/mac.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/net/wireless/ath/ath10k/mac.c b/drivers/net/wireless/ath/ath10k/mac.c
index e815aab412d7..53f92945006f 100644
--- a/drivers/net/wireless/ath/ath10k/mac.c
+++ b/drivers/net/wireless/ath/ath10k/mac.c
@@ -3954,9 +3954,8 @@ void ath10k_offchan_tx_work(struct work_struct *work)
spin_unlock_bh(&ar->data_lock);

if (peer)
- /* FIXME: should this use ath10k_warn()? */
- ath10k_dbg(ar, ATH10K_DBG_MAC, "peer %pM on vdev %d already present\n",
- peer_addr, vdev_id);
+ ath10k_warn(ar, "peer %pM on vdev %d already present\n",
+ peer_addr, vdev_id);

if (!peer) {
ret = ath10k_peer_create(ar, NULL, NULL, vdev_id,
--
2.27.0

2021-02-10 00:51:48

by Shuah Khan

[permalink] [raw]
Subject: [PATCH 4/5] ath10k: detect conf_mutex held ath10k_drain_tx() calls

ath10k_drain_tx() must not be called with conf_mutex held as workers can
use that also. Add check to detect conf_mutex held calls.

Signed-off-by: Shuah Khan <[email protected]>
---
drivers/net/wireless/ath/ath10k/mac.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/drivers/net/wireless/ath/ath10k/mac.c b/drivers/net/wireless/ath/ath10k/mac.c
index 53f92945006f..3545ce7dce0a 100644
--- a/drivers/net/wireless/ath/ath10k/mac.c
+++ b/drivers/net/wireless/ath/ath10k/mac.c
@@ -4566,6 +4566,7 @@ static void ath10k_mac_op_wake_tx_queue(struct ieee80211_hw *hw,
/* Must not be called with conf_mutex held as workers can use that also. */
void ath10k_drain_tx(struct ath10k *ar)
{
+ WARN_ON(lockdep_is_held(&ar->conf_mutex));
/* make sure rcu-protected mac80211 tx path itself is drained */
synchronize_net();

--
2.27.0

2021-02-10 02:40:12

by Wen Gong

[permalink] [raw]
Subject: Re: [PATCH 5/5] ath10k: reduce invalid ht params rate message noise

On 2021-02-10 08:42, Shuah Khan wrote:
> ath10k_mac_get_rate_flags_ht() floods dmesg with the following
> messages,
> when it fails to find a match for mcs=7 and rate=1440.
>
> supported_ht_mcs_rate_nss2:
> {7, {1300, 2700, 1444, 3000} }
>
> ath10k_pci 0000:02:00.0: invalid ht params rate 1440 100kbps nss 2 mcs
> 7
>
> dev_warn_ratelimited() isn't helping the noise. Use dev_warn_once()
> instead.
>
> Signed-off-by: Shuah Khan <[email protected]>
> ---
> drivers/net/wireless/ath/ath10k/mac.c | 5 +++--
> 1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/net/wireless/ath/ath10k/mac.c
> b/drivers/net/wireless/ath/ath10k/mac.c
> index 3545ce7dce0a..276321f0cfdd 100644
> --- a/drivers/net/wireless/ath/ath10k/mac.c
> +++ b/drivers/net/wireless/ath/ath10k/mac.c
> @@ -8970,8 +8970,9 @@ static void ath10k_mac_get_rate_flags_ht(struct
> ath10k *ar, u32 rate, u8 nss, u8
> *bw |= RATE_INFO_BW_40;
> *flags |= RATE_INFO_FLAGS_SHORT_GI;
> } else {
> - ath10k_warn(ar, "invalid ht params rate %d 100kbps nss %d mcs %d",
> - rate, nss, mcs);
> + dev_warn_once(ar->dev,
> + "invalid ht params rate %d 100kbps nss %d mcs %d",
> + rate, nss, mcs);
> }
> }
The {7, {1300, 2700, 1444, 3000} } is a correct value.
The 1440 is report from firmware, its a wrong value, it has fixed in
firmware.
If change it to dev_warn_once, then it will have no chance to find the
other wrong values which report by firmware, and it indicate
a wrong value to mac80211/cfg80211 and lead "iw wlan0 station dump" get
a wrong bitrate.

2021-02-10 08:43:28

by kernel test robot

[permalink] [raw]
Subject: Re: [PATCH 4/5] ath10k: detect conf_mutex held ath10k_drain_tx() calls

Hi Shuah,

I love your patch! Yet something to improve:

[auto build test ERROR on wireless-drivers-next/master]
[also build test ERROR on wireless-drivers/master v5.11-rc7 next-20210125]
[cannot apply to ath6kl/ath-next]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url: https://github.com/0day-ci/linux/commits/Shuah-Khan/ath10k-fixes-for-warns/20210210-085259
base: https://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers-next.git master
config: arm64-defconfig (attached as .config)
compiler: aarch64-linux-gcc (GCC) 9.3.0
reproduce (this is a W=1 build):
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# https://github.com/0day-ci/linux/commit/5a96c0c9dfec2bd59e680b7ec8ade9378b654c4c
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review Shuah-Khan/ath10k-fixes-for-warns/20210210-085259
git checkout 5a96c0c9dfec2bd59e680b7ec8ade9378b654c4c
# save the attached .config to linux build tree
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=arm64

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <[email protected]>

All errors (new ones prefixed by >>, old ones prefixed by <<):

>> ERROR: modpost: "lockdep_is_held" [drivers/net/wireless/ath/ath10k/ath10k_core.ko] undefined!

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/[email protected]


Attachments:
(No filename) (1.69 kB)
.config.gz (52.15 kB)
Download all attachments

2021-02-10 08:45:29

by Kalle Valo

[permalink] [raw]
Subject: Re: [PATCH 5/5] ath10k: reduce invalid ht params rate message noise

Wen Gong <[email protected]> writes:

> On 2021-02-10 08:42, Shuah Khan wrote:
>> ath10k_mac_get_rate_flags_ht() floods dmesg with the following
>> messages,
>> when it fails to find a match for mcs=7 and rate=1440.
>>
>> supported_ht_mcs_rate_nss2:
>> {7, {1300, 2700, 1444, 3000} }
>>
>> ath10k_pci 0000:02:00.0: invalid ht params rate 1440 100kbps nss 2
>> mcs 7
>>
>> dev_warn_ratelimited() isn't helping the noise. Use dev_warn_once()
>> instead.
>>
>> Signed-off-by: Shuah Khan <[email protected]>
>> ---
>> drivers/net/wireless/ath/ath10k/mac.c | 5 +++--
>> 1 file changed, 3 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/net/wireless/ath/ath10k/mac.c
>> b/drivers/net/wireless/ath/ath10k/mac.c
>> index 3545ce7dce0a..276321f0cfdd 100644
>> --- a/drivers/net/wireless/ath/ath10k/mac.c
>> +++ b/drivers/net/wireless/ath/ath10k/mac.c
>> @@ -8970,8 +8970,9 @@ static void ath10k_mac_get_rate_flags_ht(struct
>> ath10k *ar, u32 rate, u8 nss, u8
>> *bw |= RATE_INFO_BW_40;
>> *flags |= RATE_INFO_FLAGS_SHORT_GI;
>> } else {
>> - ath10k_warn(ar, "invalid ht params rate %d 100kbps nss %d mcs %d",
>> - rate, nss, mcs);
>> + dev_warn_once(ar->dev,
>> + "invalid ht params rate %d 100kbps nss %d mcs %d",
>> + rate, nss, mcs);
>> }
>> }
>
> The {7, {1300, 2700, 1444, 3000} } is a correct value.
> The 1440 is report from firmware, its a wrong value, it has fixed in
> firmware.

In what version?

> If change it to dev_warn_once, then it will have no chance to find the
> other wrong values which report by firmware, and it indicate
> a wrong value to mac80211/cfg80211 and lead "iw wlan0 station dump"
> get a wrong bitrate.

I agree, we should keep this warning. If the firmware still keeps
sending invalid rates we should add a specific check to ignore the known
invalid values, but not all of them.

--
https://patchwork.kernel.org/project/linux-wireless/list/

https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches

2021-02-11 06:53:15

by Kalle Valo

[permalink] [raw]
Subject: Re: [PATCH 3/5] ath10k: change ath10k_offchan_tx_work() peer present msg to a warn

Shuah Khan <[email protected]> wrote:

> Based on the comment block in this function and the FIXME for this, peer
> being present for the offchannel tx is unlikely. Peer is deleted once tx
> is complete. Change peer present msg to a warn to detect this condition.
>
> Signed-off-by: Shuah Khan <[email protected]>
> Signed-off-by: Kalle Valo <[email protected]>

Patch applied to ath-next branch of ath.git, thanks.

83bae26532ca ath10k: change ath10k_offchan_tx_work() peer present msg to a warn

--
https://patchwork.kernel.org/project/linux-wireless/patch/3b1f71272d56ee1d7f567fbce13bdb56cc06d342.1612915444.git.skhan@linuxfoundation.org/

https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches

2021-02-11 11:26:15

by Kalle Valo

[permalink] [raw]
Subject: Re: [PATCH 4/5] ath10k: detect conf_mutex held ath10k_drain_tx() calls

Shuah Khan <[email protected]> writes:

> On 2/10/21 1:25 AM, Kalle Valo wrote:
>> Shuah Khan <[email protected]> writes:
>>
>>> ath10k_drain_tx() must not be called with conf_mutex held as workers can
>>> use that also. Add check to detect conf_mutex held calls.
>>>
>>> Signed-off-by: Shuah Khan <[email protected]>
>>
>> The commit log does not answer to "Why?". How did you find this? What
>> actual problem are you trying to solve?
>>
>
> I came across the comment block above the ath10k_drain_tx() as I was
> reviewing at conf_mutex holds while I was debugging the conf_mutex
> lock assert in ath10k_debug_fw_stats_request().
>
> My reasoning is that having this will help detect incorrect usages
> of ath10k_drain_tx() while holding conf_mutex which could lead to
> locking problems when async worker routines try to call this routine.

Ok, makes sense. I prefer having this background info in the commit log,
for example "found by code review" or something like that. Or just copy
what you wrote above :)

>>> --- a/drivers/net/wireless/ath/ath10k/mac.c
>>> +++ b/drivers/net/wireless/ath/ath10k/mac.c
>>> @@ -4566,6 +4566,7 @@ static void
>>> ath10k_mac_op_wake_tx_queue(struct ieee80211_hw *hw,
>>> /* Must not be called with conf_mutex held as workers can use that also. */
>>> void ath10k_drain_tx(struct ath10k *ar)
>>> {
>>> + WARN_ON(lockdep_is_held(&ar->conf_mutex));
>>
>> Empty line after WARN_ON().
>>
>
> Will do.
>
>> Shouldn't this check debug_locks similarly lockdep_assert_held() does?
>>
>> #define lockdep_assert_held(l) do { \
>> WARN_ON(debug_locks && !lockdep_is_held(l)); \
>> } while (0)
>>
>> And I suspect you need #ifdef CONFIG_LOCKDEP which should fix the kbuild
>> bot error.
>>
>
> Yes.
>
>> But honestly I would prefer to have lockdep_assert_not_held() in
>> include/linux/lockdep.h, much cleaner that way. Also
>> i915_gem_object_lookup_rcu() could then use the same macro.
>>
>
> Right. This is the right way to go. That was first instinct and
> decided to have the discussion evolve in that direction. Now that
> it has, I will combine this change with
> include/linux/lockdep.h and add lockdep_assert_not_held()
>
> I think we might have other places in the kernel that could use
> lockdep_assert_not_held() in addition to i915_gem_object_lookup_rcu()

Great, thank you. The only problem is that lockdep.h changes have to go
via some other tree, I just don't know which :) I think it would be
easiest if also the ath10k patch goes via that other tree, I can ack the
ath10k changes.

Another option is that I'll apply the ath10k patch after the lockdep.h
change has trickled down to my tree, but that usually happens only after
the merge window and means weeks of waiting. Either is fine for me.

--
https://patchwork.kernel.org/project/linux-wireless/list/

https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches

2021-02-11 11:28:15

by Kalle Valo

[permalink] [raw]
Subject: Re: [PATCH 5/5] ath10k: reduce invalid ht params rate message noise

Shuah Khan <[email protected]> writes:

> On 2/10/21 1:28 AM, Kalle Valo wrote:
>> Wen Gong <[email protected]> writes:
>>
>>> On 2021-02-10 08:42, Shuah Khan wrote:
>>>> ath10k_mac_get_rate_flags_ht() floods dmesg with the following
>>>> messages,
>>>> when it fails to find a match for mcs=7 and rate=1440.
>>>>
>>>> supported_ht_mcs_rate_nss2:
>>>> {7, {1300, 2700, 1444, 3000} }
>>>>
>>>> ath10k_pci 0000:02:00.0: invalid ht params rate 1440 100kbps nss 2
>>>> mcs 7
>>>>
>>>> dev_warn_ratelimited() isn't helping the noise. Use dev_warn_once()
>>>> instead.
>>>>
>>>> Signed-off-by: Shuah Khan <[email protected]>
>>>> ---
>>>> drivers/net/wireless/ath/ath10k/mac.c | 5 +++--
>>>> 1 file changed, 3 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/drivers/net/wireless/ath/ath10k/mac.c
>>>> b/drivers/net/wireless/ath/ath10k/mac.c
>>>> index 3545ce7dce0a..276321f0cfdd 100644
>>>> --- a/drivers/net/wireless/ath/ath10k/mac.c
>>>> +++ b/drivers/net/wireless/ath/ath10k/mac.c
>>>> @@ -8970,8 +8970,9 @@ static void ath10k_mac_get_rate_flags_ht(struct
>>>> ath10k *ar, u32 rate, u8 nss, u8
>>>> *bw |= RATE_INFO_BW_40;
>>>> *flags |= RATE_INFO_FLAGS_SHORT_GI;
>>>> } else {
>>>> - ath10k_warn(ar, "invalid ht params rate %d 100kbps nss %d mcs %d",
>>>> - rate, nss, mcs);
>>>> + dev_warn_once(ar->dev,
>>>> + "invalid ht params rate %d 100kbps nss %d mcs %d",
>>>> + rate, nss, mcs);
>>>> }
>>>> }
>>>
>>> The {7, {1300, 2700, 1444, 3000} } is a correct value.
>>> The 1440 is report from firmware, its a wrong value, it has fixed in
>>> firmware.
>>
>> In what version?
>>
>
> Here is the info:
>
> ath10k_pci 0000:02:00.0: qca6174 hw3.2 target 0x05030000 chip_id
> 0x00340aff sub 17aa:0827
>
> ath10k_pci 0000:02:00.0: firmware ver WLAN.RM.4.4.1-00140-QCARMSWPZ-1
> api 6 features wowlan,ignore-otp,mfp crc32 29eb8ca1
>
> ath10k_pci 0000:02:00.0: board_file api 2 bmi_id N/A crc32 4ac0889b
>
> ath10k_pci 0000:02:00.0: htt-ver 3.60 wmi-op 4 htt-op 3 cal otp
> max-sta 32 raw 0 hwcrypto 1
>
>>> If change it to dev_warn_once, then it will have no chance to find the
>>> other wrong values which report by firmware, and it indicate
>>> a wrong value to mac80211/cfg80211 and lead "iw wlan0 station dump"
>>> get a wrong bitrate.
>>
>
> Agreed.
>
>> I agree, we should keep this warning. If the firmware still keeps
>> sending invalid rates we should add a specific check to ignore the known
>> invalid values, but not all of them.
>>
>
> Would it be helpful to adjust the default rate limits and set the to
> a higher value instead. It might be difficult to account all possible
> invalid values?
>
> Something like, ath10k_warn_ratelimited() to adjust the
>
> DEFAULT_RATELIMIT_INTERVAL and DEFAULT_RATELIMIT_BURST using
> DEFINE_RATELIMIT_STATE
>
> Let me know if you like this idea. I can send a patch in to do this.
> I will hang on to this firmware version for a little but longer, so
> we have a test case. :)

I would rather first try to fix the root cause, which is the firmware
sending invalid rates. Wen, you mentioned there's a fix in firmware. Do
you know which firmware version (and branch) has the fix?

--
https://patchwork.kernel.org/project/linux-wireless/list/

https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches

2021-02-26 18:03:29

by Shuah Khan

[permalink] [raw]
Subject: Re: [PATCH 5/5] ath10k: reduce invalid ht params rate message noise

On 2/11/21 4:24 AM, Kalle Valo wrote:
> Shuah Khan <[email protected]> writes:
>
>> On 2/10/21 1:28 AM, Kalle Valo wrote:
>>> Wen Gong <[email protected]> writes:
>>>
>>>> On 2021-02-10 08:42, Shuah Khan wrote:
>>>>> ath10k_mac_get_rate_flags_ht() floods dmesg with the following
>>>>> messages,
>>>>> when it fails to find a match for mcs=7 and rate=1440.
>>>>>
>>>>> supported_ht_mcs_rate_nss2:
>>>>> {7, {1300, 2700, 1444, 3000} }
>>>>>
>>>>> ath10k_pci 0000:02:00.0: invalid ht params rate 1440 100kbps nss 2
>>>>> mcs 7
>>>>>
>>>>> dev_warn_ratelimited() isn't helping the noise. Use dev_warn_once()
>>>>> instead.
>>>>>
>>>>> Signed-off-by: Shuah Khan <[email protected]>
>>>>> ---
>>>>> drivers/net/wireless/ath/ath10k/mac.c | 5 +++--
>>>>> 1 file changed, 3 insertions(+), 2 deletions(-)
>>>>>
>>>>> diff --git a/drivers/net/wireless/ath/ath10k/mac.c
>>>>> b/drivers/net/wireless/ath/ath10k/mac.c
>>>>> index 3545ce7dce0a..276321f0cfdd 100644
>>>>> --- a/drivers/net/wireless/ath/ath10k/mac.c
>>>>> +++ b/drivers/net/wireless/ath/ath10k/mac.c
>>>>> @@ -8970,8 +8970,9 @@ static void ath10k_mac_get_rate_flags_ht(struct
>>>>> ath10k *ar, u32 rate, u8 nss, u8
>>>>> *bw |= RATE_INFO_BW_40;
>>>>> *flags |= RATE_INFO_FLAGS_SHORT_GI;
>>>>> } else {
>>>>> - ath10k_warn(ar, "invalid ht params rate %d 100kbps nss %d mcs %d",
>>>>> - rate, nss, mcs);
>>>>> + dev_warn_once(ar->dev,
>>>>> + "invalid ht params rate %d 100kbps nss %d mcs %d",
>>>>> + rate, nss, mcs);
>>>>> }
>>>>> }
>>>>
>>>> The {7, {1300, 2700, 1444, 3000} } is a correct value.
>>>> The 1440 is report from firmware, its a wrong value, it has fixed in
>>>> firmware.
>>>
>>> In what version?
>>>
>>
>> Here is the info:
>>
>> ath10k_pci 0000:02:00.0: qca6174 hw3.2 target 0x05030000 chip_id
>> 0x00340aff sub 17aa:0827
>>
>> ath10k_pci 0000:02:00.0: firmware ver WLAN.RM.4.4.1-00140-QCARMSWPZ-1
>> api 6 features wowlan,ignore-otp,mfp crc32 29eb8ca1
>>
>> ath10k_pci 0000:02:00.0: board_file api 2 bmi_id N/A crc32 4ac0889b
>>
>> ath10k_pci 0000:02:00.0: htt-ver 3.60 wmi-op 4 htt-op 3 cal otp
>> max-sta 32 raw 0 hwcrypto 1
>>
>>>> If change it to dev_warn_once, then it will have no chance to find the
>>>> other wrong values which report by firmware, and it indicate
>>>> a wrong value to mac80211/cfg80211 and lead "iw wlan0 station dump"
>>>> get a wrong bitrate.
>>>
>>
>> Agreed.
>>
>>> I agree, we should keep this warning. If the firmware still keeps
>>> sending invalid rates we should add a specific check to ignore the known
>>> invalid values, but not all of them.
>>>
>>
>> Would it be helpful to adjust the default rate limits and set the to
>> a higher value instead. It might be difficult to account all possible
>> invalid values?
>>
>> Something like, ath10k_warn_ratelimited() to adjust the
>>
>> DEFAULT_RATELIMIT_INTERVAL and DEFAULT_RATELIMIT_BURST using
>> DEFINE_RATELIMIT_STATE
>>
>> Let me know if you like this idea. I can send a patch in to do this.
>> I will hang on to this firmware version for a little but longer, so
>> we have a test case. :)
>
> I would rather first try to fix the root cause, which is the firmware
> sending invalid rates. Wen, you mentioned there's a fix in firmware. Do
> you know which firmware version (and branch) has the fix?
>

Picking this back up. Wen, which firmware version has this fix? I can
test this on my system and get rid of the noisy messages. :)

thanks,
-- Shuah

2023-09-20 18:39:04

by James Prestwood

[permalink] [raw]
Subject: Re: [PATCH 5/5] ath10k: reduce invalid ht params rate message noise

On 2/26/21 10:01 AM, Shuah Khan wrote:
> On 2/11/21 4:24 AM, Kalle Valo wrote:
>> Shuah Khan <[email protected]> writes:
>>
>>> On 2/10/21 1:28 AM, Kalle Valo wrote:
>>>> Wen Gong <[email protected]> writes:
>>>>
>>>>> On 2021-02-10 08:42, Shuah Khan wrote:
>>>>>> ath10k_mac_get_rate_flags_ht() floods dmesg with the following
>>>>>> messages,
>>>>>> when it fails to find a match for mcs=7 and rate=1440.
>>>>>>
>>>>>> supported_ht_mcs_rate_nss2:
>>>>>> {7,  {1300, 2700, 1444, 3000} }
>>>>>>
>>>>>> ath10k_pci 0000:02:00.0: invalid ht params rate 1440 100kbps nss 2
>>>>>> mcs 7
>>>>>>
>>>>>> dev_warn_ratelimited() isn't helping the noise. Use dev_warn_once()
>>>>>> instead.
>>>>>>
>>>>>> Signed-off-by: Shuah Khan <[email protected]>
>>>>>> ---
>>>>>>    drivers/net/wireless/ath/ath10k/mac.c | 5 +++--
>>>>>>    1 file changed, 3 insertions(+), 2 deletions(-)
>>>>>>
>>>>>> diff --git a/drivers/net/wireless/ath/ath10k/mac.c
>>>>>> b/drivers/net/wireless/ath/ath10k/mac.c
>>>>>> index 3545ce7dce0a..276321f0cfdd 100644
>>>>>> --- a/drivers/net/wireless/ath/ath10k/mac.c
>>>>>> +++ b/drivers/net/wireless/ath/ath10k/mac.c
>>>>>> @@ -8970,8 +8970,9 @@ static void
>>>>>> ath10k_mac_get_rate_flags_ht(struct
>>>>>> ath10k *ar, u32 rate, u8 nss, u8
>>>>>>            *bw |= RATE_INFO_BW_40;
>>>>>>            *flags |= RATE_INFO_FLAGS_SHORT_GI;
>>>>>>        } else {
>>>>>> -        ath10k_warn(ar, "invalid ht params rate %d 100kbps nss
>>>>>> %d mcs %d",
>>>>>> -                rate, nss, mcs);
>>>>>> +        dev_warn_once(ar->dev,
>>>>>> +                  "invalid ht params rate %d 100kbps nss %d mcs
>>>>>> %d",
>>>>>> +                  rate, nss, mcs);
>>>>>>        }
>>>>>>    }
>>>>>
>>>>> The {7,  {1300, 2700, 1444, 3000} } is a correct value.
>>>>> The 1440 is report from firmware, its a wrong value, it has fixed in
>>>>> firmware.
>>>>
>>>> In what version?
>>>>
>>>
>>> Here is the info:
>>>
>>> ath10k_pci 0000:02:00.0: qca6174 hw3.2 target 0x05030000 chip_id
>>> 0x00340aff sub 17aa:0827
>>>
>>> ath10k_pci 0000:02:00.0: firmware ver WLAN.RM.4.4.1-00140-QCARMSWPZ-1
>>> api 6 features wowlan,ignore-otp,mfp crc32 29eb8ca1
>>>
>>> ath10k_pci 0000:02:00.0: board_file api 2 bmi_id N/A crc32 4ac0889b
>>>
>>> ath10k_pci 0000:02:00.0: htt-ver 3.60 wmi-op 4 htt-op 3 cal otp
>>> max-sta 32 raw 0 hwcrypto 1
>>>
>>>>> If change it to dev_warn_once, then it will have no chance to find
>>>>> the
>>>>> other wrong values which report by firmware, and it indicate
>>>>> a wrong value to mac80211/cfg80211 and lead "iw wlan0 station dump"
>>>>> get a wrong bitrate.
>>>>
>>>
>>> Agreed.
>>>
>>>> I agree, we should keep this warning. If the firmware still keeps
>>>> sending invalid rates we should add a specific check to ignore the
>>>> known
>>>> invalid values, but not all of them.
>>>>
>>>
>>> Would it be helpful to adjust the default rate limits and set the to
>>> a higher value instead. It might be difficult to account all possible
>>> invalid values?
>>>
>>> Something like, ath10k_warn_ratelimited() to adjust the
>>>
>>> DEFAULT_RATELIMIT_INTERVAL and DEFAULT_RATELIMIT_BURST using
>>> DEFINE_RATELIMIT_STATE
>>>
>>> Let me know if you like this idea. I can send a patch in to do this.
>>> I will hang on to this firmware version for a little but longer, so
>>> we have a test case. :)
>>
>> I would rather first try to fix the root cause, which is the firmware
>> sending invalid rates. Wen, you mentioned there's a fix in firmware. Do
>> you know which firmware version (and branch) has the fix?
>>
>
> Picking this back up. Wen, which firmware version has this fix? I can
> test this on my system and get rid of the noisy messages. :)
>
> thanks,
> -- Shuah

I know its been years, but reading through this Wen mentioned there is a
fix in the firmware? I haven't tried all of the firmware binaries in
Kalle's tree but the most recent definitely still spam the logs with
this message. Is there a specific version I can use to get rid of these?

One thing to note is the older "firmware-4.bin" did not have this
problem, but was met with worse problems like driver/firmware crashes.

Thanks,

James

>
> _______________________________________________
> ath10k mailing list
> [email protected]
> http://lists.infradead.org/mailman/listinfo/ath10k
>

2023-09-21 09:50:17

by Jeff Johnson

[permalink] [raw]
Subject: Re: [PATCH 5/5] ath10k: reduce invalid ht params rate message noise

(just a resend with Wen's current e-mail address, no further comments)
On 9/20/2023 11:27 AM, James Prestwood wrote:
> On 2/26/21 10:01 AM, Shuah Khan wrote:
>> On 2/11/21 4:24 AM, Kalle Valo wrote:
>>> Shuah Khan <[email protected]> writes:
>>>
>>>> On 2/10/21 1:28 AM, Kalle Valo wrote:
>>>>> Wen Gong <[email protected]> writes:
>>>>>
>>>>>> On 2021-02-10 08:42, Shuah Khan wrote:
>>>>>>> ath10k_mac_get_rate_flags_ht() floods dmesg with the following
>>>>>>> messages,
>>>>>>> when it fails to find a match for mcs=7 and rate=1440.
>>>>>>>
>>>>>>> supported_ht_mcs_rate_nss2:
>>>>>>> {7,  {1300, 2700, 1444, 3000} }
>>>>>>>
>>>>>>> ath10k_pci 0000:02:00.0: invalid ht params rate 1440 100kbps nss 2
>>>>>>> mcs 7
>>>>>>>
>>>>>>> dev_warn_ratelimited() isn't helping the noise. Use dev_warn_once()
>>>>>>> instead.
>>>>>>>
>>>>>>> Signed-off-by: Shuah Khan <[email protected]>
>>>>>>> ---
>>>>>>>    drivers/net/wireless/ath/ath10k/mac.c | 5 +++--
>>>>>>>    1 file changed, 3 insertions(+), 2 deletions(-)
>>>>>>>
>>>>>>> diff --git a/drivers/net/wireless/ath/ath10k/mac.c
>>>>>>> b/drivers/net/wireless/ath/ath10k/mac.c
>>>>>>> index 3545ce7dce0a..276321f0cfdd 100644
>>>>>>> --- a/drivers/net/wireless/ath/ath10k/mac.c
>>>>>>> +++ b/drivers/net/wireless/ath/ath10k/mac.c
>>>>>>> @@ -8970,8 +8970,9 @@ static void
>>>>>>> ath10k_mac_get_rate_flags_ht(struct
>>>>>>> ath10k *ar, u32 rate, u8 nss, u8
>>>>>>>            *bw |= RATE_INFO_BW_40;
>>>>>>>            *flags |= RATE_INFO_FLAGS_SHORT_GI;
>>>>>>>        } else {
>>>>>>> -        ath10k_warn(ar, "invalid ht params rate %d 100kbps nss
>>>>>>> %d mcs %d",
>>>>>>> -                rate, nss, mcs);
>>>>>>> +        dev_warn_once(ar->dev,
>>>>>>> +                  "invalid ht params rate %d 100kbps nss %d mcs
>>>>>>> %d",
>>>>>>> +                  rate, nss, mcs);
>>>>>>>        }
>>>>>>>    }
>>>>>>
>>>>>> The {7,  {1300, 2700, 1444, 3000} } is a correct value.
>>>>>> The 1440 is report from firmware, its a wrong value, it has fixed in
>>>>>> firmware.
>>>>>
>>>>> In what version?
>>>>>
>>>>
>>>> Here is the info:
>>>>
>>>> ath10k_pci 0000:02:00.0: qca6174 hw3.2 target 0x05030000 chip_id
>>>> 0x00340aff sub 17aa:0827
>>>>
>>>> ath10k_pci 0000:02:00.0: firmware ver WLAN.RM.4.4.1-00140-QCARMSWPZ-1
>>>> api 6 features wowlan,ignore-otp,mfp crc32 29eb8ca1
>>>>
>>>> ath10k_pci 0000:02:00.0: board_file api 2 bmi_id N/A crc32 4ac0889b
>>>>
>>>> ath10k_pci 0000:02:00.0: htt-ver 3.60 wmi-op 4 htt-op 3 cal otp
>>>> max-sta 32 raw 0 hwcrypto 1
>>>>
>>>>>> If change it to dev_warn_once, then it will have no chance to find
>>>>>> the
>>>>>> other wrong values which report by firmware, and it indicate
>>>>>> a wrong value to mac80211/cfg80211 and lead "iw wlan0 station dump"
>>>>>> get a wrong bitrate.
>>>>>
>>>>
>>>> Agreed.
>>>>
>>>>> I agree, we should keep this warning. If the firmware still keeps
>>>>> sending invalid rates we should add a specific check to ignore the
>>>>> known
>>>>> invalid values, but not all of them.
>>>>>
>>>>
>>>> Would it be helpful to adjust the default rate limits and set the to
>>>> a higher value instead. It might be difficult to account all possible
>>>> invalid values?
>>>>
>>>> Something like, ath10k_warn_ratelimited() to adjust the
>>>>
>>>> DEFAULT_RATELIMIT_INTERVAL and DEFAULT_RATELIMIT_BURST using
>>>> DEFINE_RATELIMIT_STATE
>>>>
>>>> Let me know if you like this idea. I can send a patch in to do this.
>>>> I will hang on to this firmware version for a little but longer, so
>>>> we have a test case. :)
>>>
>>> I would rather first try to fix the root cause, which is the firmware
>>> sending invalid rates. Wen, you mentioned there's a fix in firmware. Do
>>> you know which firmware version (and branch) has the fix?
>>>
>>
>> Picking this back up. Wen, which firmware version has this fix? I can
>> test this on my system and get rid of the noisy messages. :)
>>
>> thanks,
>> -- Shuah
>
> I know its been years, but reading through this Wen mentioned there is a
> fix in the firmware? I haven't tried all of the firmware binaries in
> Kalle's tree but the most recent definitely still spam the logs with
> this message. Is there a specific version I can use to get rid of these?
>
> One thing to note is the older "firmware-4.bin" did not have this
> problem, but was met with worse problems like driver/firmware crashes.