2024-04-13 23:05:29

by Lewis Robbins

[permalink] [raw]
Subject: [PATCH] wifi: rtw88: reduce failed to flush queue severity

Reduce the log message severity when we fail to flush device priority
queue. If a system has a lot of traffic, we may fail to flush the queue
in time. This generates a lot of messages in the kernel ring buffer. As
this is a common occurrence, we should use dev_info instead of dev_warn.

Signed-off-by: Lewis Robbins <[email protected]>
---
drivers/net/wireless/realtek/rtw88/mac.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/wireless/realtek/rtw88/mac.c b/drivers/net/wireless/realtek/rtw88/mac.c
index 0c1c1ff31085..800f4eabae13 100644
--- a/drivers/net/wireless/realtek/rtw88/mac.c
+++ b/drivers/net/wireless/realtek/rtw88/mac.c
@@ -1040,7 +1040,7 @@ static void __rtw_mac_flush_prio_queue(struct rtw_dev *rtwdev,
* And it requires like ~2secs to flush the full priority queue.
*/
if (!drop)
- rtw_warn(rtwdev, "timed out to flush queue %d\n", prio_queue);
+ rtw_info(rtwdev, "timed out to flush queue %d\n", prio_queue);
}

static void rtw_mac_flush_prio_queues(struct rtw_dev *rtwdev,
--
2.44.0



2024-04-15 01:40:54

by Ping-Ke Shih

[permalink] [raw]
Subject: RE: [PATCH] wifi: rtw88: reduce failed to flush queue severity

Lewis Robbins <[email protected]> wrote:
>
> Reduce the log message severity when we fail to flush device priority
> queue. If a system has a lot of traffic, we may fail to flush the queue
> in time. This generates a lot of messages in the kernel ring buffer. As
> this is a common occurrence, we should use dev_info instead of dev_warn.
>
> Signed-off-by: Lewis Robbins <[email protected]>

Acked-by: Ping-Ke Shih <[email protected]>

I'd like to know situations of " If a system has a lot of traffic...".
Did you scan or do something during traffic?


2024-04-15 16:17:40

by Kalle Valo

[permalink] [raw]
Subject: Re: [PATCH] wifi: rtw88: reduce failed to flush queue severity

Ping-Ke Shih <[email protected]> writes:

> Lewis Robbins <[email protected]> wrote:
>>
>> Reduce the log message severity when we fail to flush device priority
>> queue. If a system has a lot of traffic, we may fail to flush the queue
>> in time. This generates a lot of messages in the kernel ring buffer. As
>> this is a common occurrence, we should use dev_info instead of dev_warn.
>>
>> Signed-off-by: Lewis Robbins <[email protected]>
>
> Acked-by: Ping-Ke Shih <[email protected]>
>
> I'd like to know situations of " If a system has a lot of traffic...".
> Did you scan or do something during traffic?

The driver shouldn't print any warnings in normal usage, even using info
level. If this is expected scenario then maybe change it to debug print?
Or if is this an actual bug then it's better try to investigate and fix
it.

--
https://patchwork.kernel.org/project/linux-wireless/list/

https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches

2024-04-15 23:31:16

by Lewis Robbins

[permalink] [raw]
Subject: Re: [PATCH] wifi: rtw88: reduce failed to flush queue severity

Ping-Ke Shih <[email protected]> writes:

> Lewis Robbins <[email protected]> wrote:
>>
>> Reduce the log message severity when we fail to flush device priority
>> queue. If a system has a lot of traffic, we may fail to flush the queue
>> in time. This generates a lot of messages in the kernel ring buffer. As
>> this is a common occurrence, we should use dev_info instead of dev_warn.
>>
>> Signed-off-by: Lewis Robbins <[email protected]>
>
> Acked-by: Ping-Ke Shih <[email protected]>
>
> I'd like to know situations of " If a system has a lot of traffic...".
> Did you scan or do something during traffic?

So, after digging a bit more, it seems you're right this only happens during a
scan. The log message itself is repeated about 5-10x.

Kalle Valo <[email protected]> writes:

> The driver shouldn't print any warnings in normal usage, even using info
> level. If this is expected scenario then maybe change it to debug print?
> Or if is this an actual bug then it's better try to investigate and fix
> it.

I have the stack-trace:

[23838.633664] rtw_8821ce 0000:02:00.0: timed out to flush queue 2
[23838.633685] CPU: 1 PID: 363059 Comm: kworker/u8:1 Tainted G 6.8.5
[23838.633698] Hardware name: /, BIOS 5.26 09/26/2023
[23838.633704] Workqueue: events_unbound cfg80211_wiphy_work [cfg80211]
[23838.633881] Call Trace:
[23838.633889] <TASK>
[23838.633898] dump_stack_lvl+0x47/0x60
[23838.633918] rtw_mac_flush_queues+0x148/0x190 [rtw88_core 0d7ad2d9d6116c633c0aab4e7bc6016d572d75d4]
[23838.633993] rtw_ops_flush+0x5a/0x70 [rtw88_core 0d7ad2d9d6116c633c0aab4e7bc6016d572d75d4]
[23838.634056] __ieee80211_flush_queues+0x10b/0x2e0 [mac80211 5d0b446baffe1290bc56d55aa496e941688b7b40]
[23838.634309] ieee80211_scan_work+0x3e3/0x520 [mac80211 5d0b446baffe1290bc56d55aa496e941688b7b40]
[23838.634494] cfg80211_wiphy_work+0xa7/0xe0 [cfg80211 b36d5437ba649ace42ea92e8f83a3ec499e0d5b7]
[23838.634646] process_one_work+0x178/0x350
[23838.634660] worker_thread+0x30f/0x450
[23838.634670] ? __pfx_worker_thread+0x10/0x10
[23838.634678] kthread+0xe5/0x120
[23838.634691] ? __pfx_kthread+0x10/0x10
[23838.634702] ret_from_fork+0x31/0x50
[23838.634714] ? __pfx_kthread+0x10/0x10
[23838.634724] ret_from_fork_asm+0x1b/0x30
[23838.634736] </TASK>

I'm not sure as to the cause. If the flush operation takes a long time do we
need to release any mutexes etc? And if this is just a hardware issue, then we
can do a debug print as you say.

BugZilla: https://bugzilla.kernel.org/show_bug.cgi?id=218697

2024-04-16 02:58:20

by Ping-Ke Shih

[permalink] [raw]
Subject: RE: [PATCH] wifi: rtw88: reduce failed to flush queue severity

Lewis Robbins <[email protected]> wrote:
>
> Ping-Ke Shih <[email protected]> writes:
>
> > Lewis Robbins <[email protected]> wrote:
> >>
> >> Reduce the log message severity when we fail to flush device priority
> >> queue. If a system has a lot of traffic, we may fail to flush the queue
> >> in time. This generates a lot of messages in the kernel ring buffer. As
> >> this is a common occurrence, we should use dev_info instead of dev_warn.
> >>
> >> Signed-off-by: Lewis Robbins <[email protected]>
> >
> > Acked-by: Ping-Ke Shih <[email protected]>
> >
> > I'd like to know situations of " If a system has a lot of traffic...".
> > Did you scan or do something during traffic?
>
> So, after digging a bit more, it seems you're right this only happens during a
> scan. The log message itself is repeated about 5-10x.

That is the same as my test before.

>
> I'm not sure as to the cause. If the flush operation takes a long time do we
> need to release any mutexes etc? And if this is just a hardware issue, then we
> can do a debug print as you say.

The cause is because packets in hardware TX queue that can't be sent out in time,
and flush ops with 'drop = false', so driver throws one warning. I don't have
good idea for now. Maybe, we can add a special debug mask to replace this kind of
verbose warning with uncertain solution.



2024-04-17 01:43:17

by Ping-Ke Shih

[permalink] [raw]
Subject: RE: [PATCH] wifi: rtw88: reduce failed to flush queue severity

>
> The cause is because packets in hardware TX queue that can't be sent out in time,
> and flush ops with 'drop = false', so driver throws one warning. I don't have
> good idea for now. Maybe, we can add a special debug mask to replace this kind of
> verbose warning with uncertain solution.

I have made a patch [1] as mentioned before. Lewis, please give it a try.

[1] https://lore.kernel.org/linux-wireless/[email protected]/T/#u