2017-06-30 16:32:35

by Andrey Ryabinin

[permalink] [raw]
Subject: WARN_ON_ONCE(work > weight) in napi_poll()

I occasionally hit WARN_ON_ONCE(work > weight); in napi_poll() on a laptop with ath10k card.


[37207.593370] ------------[ cut here ]------------
[37207.593380] WARNING: CPU: 0 PID: 7 at ../net/core/dev.c:5274 net_rx_action+0x258/0x360
[37207.593381] Modules linked in: snd_hda_codec_realtek snd_soc_skl snd_hda_codec_generic snd_soc_skl_ipc snd_soc_sst_ipc snd_soc_sst_dsp snd_hda_ext_core snd_soc_sst_match snd_soc_core rtsx_pci_sdmmc mmc_core snd_pcm_dmaengine x86_pkg_temp_thermal snd_hda_intel uvcvideo kvm_intel videobuf2_vmalloc kvm snd_hda_codec snd_hwdep btusb videobuf2_memops btintel snd_hda_core videobuf2_v4l2 bluetooth irqbypass snd_pcm videobuf2_core crc32c_intel videodev mei_me mei rtsx_pci intel_lpss_pci intel_lpss_acpi intel_vbtn intel_lpss mfd_core tpm_tis intel_hid tpm_tis_core tpm efivarfs
[37207.593430] CPU: 0 PID: 7 Comm: ksoftirqd/0 Not tainted 4.11.7 #28
[37207.593432] Hardware name: Dell Inc. XPS 13 9360/0839Y6, BIOS 1.3.5 05/08/2017
[37207.593433] Call Trace:
[37207.593442] dump_stack+0x68/0x96
[37207.593447] __warn+0xcb/0xf0
[37207.593450] warn_slowpath_null+0x1d/0x20
[37207.593452] net_rx_action+0x258/0x360
[37207.593459] __do_softirq+0x12f/0x2f6
[37207.593463] run_ksoftirqd+0x1f/0x40
[37207.593468] smpboot_thread_fn+0x10a/0x160
[37207.593471] kthread+0x108/0x140
[37207.593474] ? sort_range+0x30/0x30
[37207.593477] ? kthread_create_on_node+0x40/0x40
[37207.593480] ret_from_fork+0x2e/0x40
[37207.593483] ---[ end trace 5e4e25ee6369e3eb ]---


2017-07-04 15:57:46

by Andrey Ryabinin

[permalink] [raw]
Subject: Re: WARN_ON_ONCE(work > weight) in napi_poll()

On 07/04/2017 04:49 PM, Kalle Valo wrote:
> Andrey Ryabinin <[email protected]> writes:
>
>> I occasionally hit WARN_ON_ONCE(work > weight); in napi_poll() on a
>> laptop with ath10k card.
>>
>>
>> [37207.593370] ------------[ cut here ]------------
>> [37207.593380] WARNING: CPU: 0 PID: 7 at ../net/core/dev.c:5274
>> net_rx_action+0x258/0x360
>> [37207.593381] Modules linked in: snd_hda_codec_realtek snd_soc_skl
>> snd_hda_codec_generic snd_soc_skl_ipc snd_soc_sst_ipc snd_soc_sst_dsp
>> snd_hda_ext_core snd_soc_sst_match snd_soc_core rtsx_pci_sdmmc
>> mmc_core snd_pcm_dmaengine x86_pkg_temp_thermal snd_hda_intel uvcvideo
>> kvm_intel videobuf2_vmalloc kvm snd_hda_codec snd_hwdep btusb
>> videobuf2_memops btintel snd_hda_core videobuf2_v4l2 bluetooth
>> irqbypass snd_pcm videobuf2_core crc32c_intel videodev mei_me mei
>> rtsx_pci intel_lpss_pci intel_lpss_acpi intel_vbtn intel_lpss mfd_core
>> tpm_tis intel_hid tpm_tis_core tpm efivarfs
>> [37207.593430] CPU: 0 PID: 7 Comm: ksoftirqd/0 Not tainted 4.11.7 #28
>> [37207.593432] Hardware name: Dell Inc. XPS 13 9360/0839Y6, BIOS 1.3.5 05/08/2017
>
> What firmware and hardware versions exactly? The dmesg output when
> ath10k loads is preferred. As you are using XPS 13 I'm guessing it's one
> of QCA6174 family.
>

Yes it's qca6174.

$ dmesg |grep ath10
[ 0.624828] ath10k_pci 0000:3a:00.0: enabling device (0000 -> 0002)
[ 0.626370] ath10k_pci 0000:3a:00.0: pci irq msi oper_irq_mode 2 irq_mode 0 reset_mode 0
[ 0.837862] ath10k_pci 0000:3a:00.0: qca6174 hw3.2 target 0x05030000 chip_id 0x00340aff sub 1a56:1535
[ 0.837863] ath10k_pci 0000:3a:00.0: kconfig debug 0 debugfs 1 tracing 0 dfs 0 testmode 0
[ 0.838388] ath10k_pci 0000:3a:00.0: firmware ver WLAN.RM.2.0-00180-QCARMSWPZ-1 api 4 features wowlan,ignore-otp,no-4addr-pad crc32 75dee6c5
[ 0.900606] ath10k_pci 0000:3a:00.0: board_file api 2 bmi_id N/A crc32 19644295
[ 3.020681] ath10k_pci 0000:3a:00.0: htt-ver 3.26 wmi-op 4 htt-op 3 cal otp max-sta 32 raw 0 hwcrypto 1
[ 9.574087] ath10k_pci 0000:3a:00.0 wlp58s0: renamed from wlan0

2017-07-04 13:53:55

by Kalle Valo

[permalink] [raw]
Subject: Re: WARN_ON_ONCE(work > weight) in napi_poll()

Andrey Ryabinin <[email protected]> writes:

> I occasionally hit WARN_ON_ONCE(work > weight); in napi_poll() on a
> laptop with ath10k card.
>
>
> [37207.593370] ------------[ cut here ]------------
> [37207.593380] WARNING: CPU: 0 PID: 7 at ../net/core/dev.c:5274
> net_rx_action+0x258/0x360
> [37207.593381] Modules linked in: snd_hda_codec_realtek snd_soc_skl
> snd_hda_codec_generic snd_soc_skl_ipc snd_soc_sst_ipc snd_soc_sst_dsp
> snd_hda_ext_core snd_soc_sst_match snd_soc_core rtsx_pci_sdmmc
> mmc_core snd_pcm_dmaengine x86_pkg_temp_thermal snd_hda_intel uvcvideo
> kvm_intel videobuf2_vmalloc kvm snd_hda_codec snd_hwdep btusb
> videobuf2_memops btintel snd_hda_core videobuf2_v4l2 bluetooth
> irqbypass snd_pcm videobuf2_core crc32c_intel videodev mei_me mei
> rtsx_pci intel_lpss_pci intel_lpss_acpi intel_vbtn intel_lpss mfd_core
> tpm_tis intel_hid tpm_tis_core tpm efivarfs
> [37207.593430] CPU: 0 PID: 7 Comm: ksoftirqd/0 Not tainted 4.11.7 #28
> [37207.593432] Hardware name: Dell Inc. XPS 13 9360/0839Y6, BIOS 1.3.5 05=
/08/2017

What firmware and hardware versions exactly? The dmesg output when
ath10k loads is preferred. As you are using XPS 13 I'm guessing it's one
of QCA6174 family.

--=20
Kalle Valo=

2017-07-18 06:47:48

by Ryan Hsu

[permalink] [raw]
Subject: Re: WARN_ON_ONCE(work > weight) in napi_poll()

On 07/11/2017 06:19 PM, Igor Mitsyanko wrote:

> On 07/11/2017 10:28 AM, Andrey Ryabinin wrote:
>>
>> It gave me this:
>>
>> [118648.825347] #1 quota too big 72 64 16
>> [118648.825351] #2 quota too big 72 64 16
>> [118648.825471] ------------[ cut here ]------------
>> [118648.825484] WARNING: CPU: 0 PID: 0 at ../net/core/dev.c:5274 net_rx_=
action+0x258/0x360
>>
>> So this means that we didn't met the condition bellow, i.e. skb_queue_em=
pty() returned true.
>>
>> ath10k_htt_txrx_compl_task():
>>
>> if ((quota > ATH10K_NAPI_QUOTA_LIMIT) &&
>> !skb_queue_empty(&htt->rx_in_ord_compl_q)) {
>> resched_napi =3D true;
>> goto exit;
>> }
>>
>>> Also WLAN.RM.2.0-00180-QCARMSWPZ-1 firmware is a bit old, could you als=
o update firmware to give it a try?
>>> https://github.com/kvalo/ath10k-firmware/tree/master/QCA6174/hw3.0/4.4
>>>
>>
>> Will try.
>>
>
> Maybe ath10k_htt_rx_in_ord_ind() has to accept "budget_left" parameter an=
d use it to limit number of processed MSDUs in queued AMSDU and saving rest=
for later (NAPI has to be rescheduled in this case).
> It seems natural that this problem happens with current logic, in case AM=
SDU in Rx queue has more elements then left in budget.

Thanks, likely in current logic, it does have chance to exceed the budget w=
hile dequeuing from the last list.

Can you give it a try this one? for QCA6174 reorder is offload, so this sho=
uld be good enough for your case to test, will have to check non-offload re=
order case... but let me know if you're seeing something different....

--
diff --git a/drivers/net/wireless/ath/ath10k/htt_rx.c b/drivers/net/wireles=
s/ath/ath10k/htt_rx.c
index 398dda9..e8697a1 100644
--- a/drivers/net/wireless/ath/ath10k/htt_rx.c
+++ b/drivers/net/wireless/ath/ath10k/htt_rx.c
@@ -1735,7 +1735,8 @@ static void ath10k_htt_rx_delba(struct ath10k *ar, st=
ruct htt_resp *resp)
}
=20
static int ath10k_htt_rx_extract_amsdu(struct sk_buff_head *list,
- struct sk_buff_head *amsdu)
+ struct sk_buff_head *amsdu,
+ int budget_left)
{
struct sk_buff *msdu;
struct htt_rx_desc *rxd;
@@ -1746,8 +1747,9 @@ static int ath10k_htt_rx_extract_amsdu(struct sk_buff=
_head *list,
if (WARN_ON(!skb_queue_empty(amsdu)))
return -EINVAL;
=20
- while ((msdu =3D __skb_dequeue(list))) {
+ while ((msdu =3D __skb_dequeue(list)) && budget_left) {
__skb_queue_tail(amsdu, msdu);
+ budget_left--;
=20
rxd =3D (void *)msdu->data - sizeof(*rxd);
if (rxd->msdu_end.common.info0 &
@@ -1838,7 +1840,8 @@ static int ath10k_htt_rx_h_rx_offload(struct ath10k *=
ar,
return num_msdu;
}
=20
-static int ath10k_htt_rx_in_ord_ind(struct ath10k *ar, struct sk_buff *skb=
)
+static int ath10k_htt_rx_in_ord_ind(struct ath10k *ar, struct sk_buff *skb=
,
+ int budget_left)
{
struct ath10k_htt *htt =3D &ar->htt;
struct htt_resp *resp =3D (void *)skb->data;
@@ -1895,9 +1898,9 @@ static int ath10k_htt_rx_in_ord_ind(struct ath10k *ar=
, struct sk_buff *skb)
if (offload)
num_msdus =3D ath10k_htt_rx_h_rx_offload(ar, &list);
=20
- while (!skb_queue_empty(&list)) {
+ while (!skb_queue_empty(&list) && budget_left) {
__skb_queue_head_init(&amsdu);
- ret =3D ath10k_htt_rx_extract_amsdu(&list, &amsdu);
+ ret =3D ath10k_htt_rx_extract_amsdu(&list, &amsdu, budget_left);
switch (ret) {
case 0:
/* Note: The in-order indication may report interleaved
@@ -1907,6 +1910,7 @@ static int ath10k_htt_rx_in_ord_ind(struct ath10k *ar=
, struct sk_buff *skb)
* should still give an idea about rx rate to the user.
*/
num_msdus +=3D skb_queue_len(&amsdu);
+ budget_left -=3D skb_queue_len(&amsdu);
ath10k_htt_rx_h_ppdu(ar, &amsdu, status, vdev_id);
ath10k_htt_rx_h_filter(ar, &amsdu, status);
ath10k_htt_rx_h_mpdu(ar, &amsdu, status);
@@ -2549,7 +2553,8 @@ int ath10k_htt_txrx_compl_task(struct ath10k *ar, int=
budget)
}
=20
spin_lock_bh(&htt->rx_ring.lock);
- num_rx_msdus =3D ath10k_htt_rx_in_ord_ind(ar, skb);
+ num_rx_msdus =3D ath10k_htt_rx_in_ord_ind(ar, skb,
+ (budget - quota));
spin_unlock_bh(&htt->rx_ring.lock);
if (num_rx_msdus < 0) {
resched_napi =3D true;
--

--=20
Ryan Hsu

2017-07-12 01:20:06

by Igor Mitsyanko

[permalink] [raw]
Subject: Re: WARN_ON_ONCE(work > weight) in napi_poll()

On 07/11/2017 10:28 AM, Andrey Ryabinin wrote:
>
> It gave me this:
>
> [118648.825347] #1 quota too big 72 64 16
> [118648.825351] #2 quota too big 72 64 16
> [118648.825471] ------------[ cut here ]------------
> [118648.825484] WARNING: CPU: 0 PID: 0 at ../net/core/dev.c:5274 net_rx_action+0x258/0x360
>
> So this means that we didn't met the condition bellow, i.e. skb_queue_empty() returned true.
>
> ath10k_htt_txrx_compl_task():
>
> if ((quota > ATH10K_NAPI_QUOTA_LIMIT) &&
> !skb_queue_empty(&htt->rx_in_ord_compl_q)) {
> resched_napi = true;
> goto exit;
> }
>
>> Also WLAN.RM.2.0-00180-QCARMSWPZ-1 firmware is a bit old, could you also update firmware to give it a try?
>> https://github.com/kvalo/ath10k-firmware/tree/master/QCA6174/hw3.0/4.4
>>
>
> Will try.
>

Maybe ath10k_htt_rx_in_ord_ind() has to accept "budget_left" parameter
and use it to limit number of processed MSDUs in queued AMSDU and saving
rest for later (NAPI has to be rescheduled in this case).
It seems natural that this problem happens with current logic, in case
AMSDU in Rx queue has more elements then left in budget.

2017-07-11 17:26:29

by Andrey Ryabinin

[permalink] [raw]
Subject: Re: WARN_ON_ONCE(work > weight) in napi_poll()

On 07/11/2017 12:24 AM, Ryan Hsu wrote:
> On 07/04/2017 08:59 AM, Andrey Ryabinin wrote:
>
>> On 07/04/2017 04:49 PM, Kalle Valo wrote:
>>> Andrey Ryabinin <[email protected]> writes:
>>>
>>>> I occasionally hit WARN_ON_ONCE(work > weight); in napi_poll() on a
>>>> laptop with ath10k card.
>>>>
>>>>
>>>> [37207.593370] ------------[ cut here ]------------
>>>> [37207.593380] WARNING: CPU: 0 PID: 7 at ../net/core/dev.c:5274
>>>> net_rx_action+0x258/0x360
>>>> [37207.593381] Modules linked in: snd_hda_codec_realtek snd_soc_skl
>>>> snd_hda_codec_generic snd_soc_skl_ipc snd_soc_sst_ipc snd_soc_sst_dsp
>>>> snd_hda_ext_core snd_soc_sst_match snd_soc_core rtsx_pci_sdmmc
>>>> mmc_core snd_pcm_dmaengine x86_pkg_temp_thermal snd_hda_intel uvcvideo
>>>> kvm_intel videobuf2_vmalloc kvm snd_hda_codec snd_hwdep btusb
>>>> videobuf2_memops btintel snd_hda_core videobuf2_v4l2 bluetooth
>>>> irqbypass snd_pcm videobuf2_core crc32c_intel videodev mei_me mei
>>>> rtsx_pci intel_lpss_pci intel_lpss_acpi intel_vbtn intel_lpss mfd_core
>>>> tpm_tis intel_hid tpm_tis_core tpm efivarfs
>>>> [37207.593430] CPU: 0 PID: 7 Comm: ksoftirqd/0 Not tainted 4.11.7 #28
>>>> [37207.593432] Hardware name: Dell Inc. XPS 13 9360/0839Y6, BIOS 1.3.5 05/08/2017
>>> What firmware and hardware versions exactly? The dmesg output when
>>> ath10k loads is preferred. As you are using XPS 13 I'm guessing it's one
>>> of QCA6174 family.
>>>
>> Yes it's qca6174.
>>
>> $ dmesg |grep ath10
>> [ 0.624828] ath10k_pci 0000:3a:00.0: enabling device (0000 -> 0002)
>> [ 0.626370] ath10k_pci 0000:3a:00.0: pci irq msi oper_irq_mode 2 irq_mode 0 reset_mode 0
>> [ 0.837862] ath10k_pci 0000:3a:00.0: qca6174 hw3.2 target 0x05030000 chip_id 0x00340aff sub 1a56:1535
>> [ 0.837863] ath10k_pci 0000:3a:00.0: kconfig debug 0 debugfs 1 tracing 0 dfs 0 testmode 0
>> [ 0.838388] ath10k_pci 0000:3a:00.0: firmware ver WLAN.RM.2.0-00180-QCARMSWPZ-1 api 4 features wowlan,ignore-otp,no-4addr-pad crc32 75dee6c5
>> [ 0.900606] ath10k_pci 0000:3a:00.0: board_file api 2 bmi_id N/A crc32 19644295
>> [ 3.020681] ath10k_pci 0000:3a:00.0: htt-ver 3.26 wmi-op 4 htt-op 3 cal otp max-sta 32 raw 0 hwcrypto 1
>> [ 9.574087] ath10k_pci 0000:3a:00.0 wlp58s0: renamed from wlan0
>>
>
> I can't reproduce of this past few days, not sure if this is due to the amsdu packets received from AP, would you mind share what Ap you're using?
> And if there any specific steps you're doing?
>

The bug is quite rare, and I don't know how to reproduce it on purpose. It just happens once in several days
and it happened with different APs.

Perhaps you could think of some debug patch that could help to understand what's going on? I could carry it and will
report if warning hits again.

FYI I applied small debug patch bellow, just to understand where exactly we bail out with too big quota.

---
drivers/net/wireless/ath/ath10k/htt_rx.c | 4 ++++
1 file changed, 4 insertions(+)

diff --git a/drivers/net/wireless/ath/ath10k/htt_rx.c b/drivers/net/wireless/ath/ath10k/htt_rx.c
index 02a3fc81fbe3..02b0cc6ec671 100644
--- a/drivers/net/wireless/ath/ath10k/htt_rx.c
+++ b/drivers/net/wireless/ath/ath10k/htt_rx.c
@@ -2548,6 +2548,8 @@ int ath10k_htt_txrx_compl_task(struct ath10k *ar, int budget)
goto exit;
}
}
+ if (quota > budget)
+ pr_err("#1 quota too big %d %d %d\n", quota, budget, num_rx_msdus);

while (quota < budget) {
/* no more data to receive */
@@ -2568,6 +2570,8 @@ int ath10k_htt_txrx_compl_task(struct ath10k *ar, int budget)
goto exit;
}
}
+ if (quota > budget)
+ pr_err("#2 quota too big %d %d %d\n", quota, budget, num_rx_msdus);

/* From NAPI documentation:
* The napi poll() function may also process TX completions, in which
--


It gave me this:

[118648.825347] #1 quota too big 72 64 16
[118648.825351] #2 quota too big 72 64 16
[118648.825471] ------------[ cut here ]------------
[118648.825484] WARNING: CPU: 0 PID: 0 at ../net/core/dev.c:5274 net_rx_action+0x258/0x360

So this means that we didn't met the condition bellow, i.e. skb_queue_empty() returned true.

ath10k_htt_txrx_compl_task():

if ((quota > ATH10K_NAPI_QUOTA_LIMIT) &&
!skb_queue_empty(&htt->rx_in_ord_compl_q)) {
resched_napi = true;
goto exit;
}

> Also WLAN.RM.2.0-00180-QCARMSWPZ-1 firmware is a bit old, could you also update firmware to give it a try?
> https://github.com/kvalo/ath10k-firmware/tree/master/QCA6174/hw3.0/4.4
>

Will try.

2017-07-10 21:24:53

by Ryan Hsu

[permalink] [raw]
Subject: Re: WARN_ON_ONCE(work > weight) in napi_poll()

On 07/04/2017 08:59 AM, Andrey Ryabinin wrote:

> On 07/04/2017 04:49 PM, Kalle Valo wrote:
>> Andrey Ryabinin <[email protected]> writes:
>>
>>> I occasionally hit WARN_ON_ONCE(work > weight); in napi_poll() on a
>>> laptop with ath10k card.
>>>
>>>
>>> [37207.593370] ------------[ cut here ]------------
>>> [37207.593380] WARNING: CPU: 0 PID: 7 at ../net/core/dev.c:5274
>>> net_rx_action+0x258/0x360
>>> [37207.593381] Modules linked in: snd_hda_codec_realtek snd_soc_skl
>>> snd_hda_codec_generic snd_soc_skl_ipc snd_soc_sst_ipc snd_soc_sst_dsp
>>> snd_hda_ext_core snd_soc_sst_match snd_soc_core rtsx_pci_sdmmc
>>> mmc_core snd_pcm_dmaengine x86_pkg_temp_thermal snd_hda_intel uvcvideo
>>> kvm_intel videobuf2_vmalloc kvm snd_hda_codec snd_hwdep btusb
>>> videobuf2_memops btintel snd_hda_core videobuf2_v4l2 bluetooth
>>> irqbypass snd_pcm videobuf2_core crc32c_intel videodev mei_me mei
>>> rtsx_pci intel_lpss_pci intel_lpss_acpi intel_vbtn intel_lpss mfd_core
>>> tpm_tis intel_hid tpm_tis_core tpm efivarfs
>>> [37207.593430] CPU: 0 PID: 7 Comm: ksoftirqd/0 Not tainted 4.11.7 #28
>>> [37207.593432] Hardware name: Dell Inc. XPS 13 9360/0839Y6, BIOS 1.3.5 =
05/08/2017
>> What firmware and hardware versions exactly? The dmesg output when
>> ath10k loads is preferred. As you are using XPS 13 I'm guessing it's one
>> of QCA6174 family.
>>
> Yes it's qca6174.
>
> $ dmesg |grep ath10
> [ 0.624828] ath10k_pci 0000:3a:00.0: enabling device (0000 -> 0002)
> [ 0.626370] ath10k_pci 0000:3a:00.0: pci irq msi oper_irq_mode 2 irq_m=
ode 0 reset_mode 0
> [ 0.837862] ath10k_pci 0000:3a:00.0: qca6174 hw3.2 target 0x05030000 c=
hip_id 0x00340aff sub 1a56:1535
> [ 0.837863] ath10k_pci 0000:3a:00.0: kconfig debug 0 debugfs 1 tracing=
0 dfs 0 testmode 0
> [ 0.838388] ath10k_pci 0000:3a:00.0: firmware ver WLAN.RM.2.0-00180-QC=
ARMSWPZ-1 api 4 features wowlan,ignore-otp,no-4addr-pad crc32 75dee6c5
> [ 0.900606] ath10k_pci 0000:3a:00.0: board_file api 2 bmi_id N/A crc32=
19644295
> [ 3.020681] ath10k_pci 0000:3a:00.0: htt-ver 3.26 wmi-op 4 htt-op 3 ca=
l otp max-sta 32 raw 0 hwcrypto 1
> [ 9.574087] ath10k_pci 0000:3a:00.0 wlp58s0: renamed from wlan0
>

I can't reproduce of this past few days, not sure if this is due to the ams=
du packets received from AP, would you mind share what Ap you're using?
And if there any specific steps you're doing?

Also WLAN.RM.2.0-00180-QCARMSWPZ-1 firmware is a bit old, could you also up=
date firmware to give it a try?
https://github.com/kvalo/ath10k-firmware/tree/master/QCA6174/hw3.0/4.4

--=20
Ryan Hsu

2017-07-24 09:28:24

by Andrey Ryabinin

[permalink] [raw]
Subject: Re: WARN_ON_ONCE(work > weight) in napi_poll()

On 07/18/2017 09:47 AM, Ryan Hsu wrote:
> On 07/11/2017 06:19 PM, Igor Mitsyanko wrote:
>
>> On 07/11/2017 10:28 AM, Andrey Ryabinin wrote:
>>>
>>> It gave me this:
>>>
>>> [118648.825347] #1 quota too big 72 64 16
>>> [118648.825351] #2 quota too big 72 64 16
>>> [118648.825471] ------------[ cut here ]------------
>>> [118648.825484] WARNING: CPU: 0 PID: 0 at ../net/core/dev.c:5274 net_rx_action+0x258/0x360
>>>
>>> So this means that we didn't met the condition bellow, i.e. skb_queue_empty() returned true.
>>>
>>> ath10k_htt_txrx_compl_task():
>>>
>>> if ((quota > ATH10K_NAPI_QUOTA_LIMIT) &&
>>> !skb_queue_empty(&htt->rx_in_ord_compl_q)) {
>>> resched_napi = true;
>>> goto exit;
>>> }
>>>
>>>> Also WLAN.RM.2.0-00180-QCARMSWPZ-1 firmware is a bit old, could you also update firmware to give it a try?
>>>> https://github.com/kvalo/ath10k-firmware/tree/master/QCA6174/hw3.0/4.4
>>>>
>>>
>>> Will try.
>>>
>>
>> Maybe ath10k_htt_rx_in_ord_ind() has to accept "budget_left" parameter and use it to limit number of processed MSDUs in queued AMSDU and saving rest for later (NAPI has to be rescheduled in this case).
>> It seems natural that this problem happens with current logic, in case AMSDU in Rx queue has more elements then left in budget.
>
> Thanks, likely in current logic, it does have chance to exceed the budget while dequeuing from the last list.
>
> Can you give it a try this one? for QCA6174 reorder is offload, so this should be good enough for your case to test, will have to check non-offload reorder case... but let me know if you're seeing something different....
>

I've been running with this patch almost a week and haven't seen the WARNING. One week is usually enough to trigger it several times.
I guess we can assume that the patch fixed the problem.