2021-05-19 18:44:16

by Tong Zhu

Subject: ath9k: possible bug with AR93xx during background scanning

Hello

After moving from kernel 4.1 to 4.14, our station with an AR93xx PCIe card began
to see packet loss during background scanning. After inspecting the changes, I
realized that ath9k traffic is no longer stopped at the VIF for off-channel scans
after this change:

50f08edf98096a68f01ff4566b605a25bf8e42ce ath9k: Switch to using mac80211 intermediate software queues.

Here is a snippet of an ftrace function sequence captured during a scan.

kworker/u4:0-28892 [001] .n..... 3082.957750: ieee80211_offchannel_stop_vifs <-ieee80211_scan_work
XX-XXXXXXXXXXd-489 [001] .....12 3082.971105: __ieee80211_subif_start_xmit <-ieee80211_subif_start_xmit
XX-XXXXXXXXXXd-489 [001] .....14 3082.971128: ath_tx_txqaddbuf <-ath_txq_schedule
irq/175-ath9k-189 [001] .....15 3082.998888: ath_tx_process_buffer <-ath_tx_edma_tasklet
irq/175-ath9k-189 [001] .....16 3082.998898: ath_tx_txqaddbuf <-ath_txq_schedule
irq/175-ath9k-189 [000] .....15 3083.011497: ath_tx_process_buffer <-ath_tx_edma_tasklet
irq/175-ath9k-189 [000] .....16 3083.011504: ath_tx_txqaddbuf <-ath_txq_schedule
irq/175-ath9k-189 [001] .....15 3083.022261: ath_tx_process_buffer <-ath_tx_edma_tasklet
irq/175-ath9k-189 [001] .....16 3083.022268: ath_tx_txqaddbuf <-ath_txq_schedule
irq/175-ath9k-189 [000] .....15 3083.034131: ath_tx_process_buffer <-ath_tx_edma_tasklet
irq/175-ath9k-189 [000] .....15 3083.034134: ath_tx_complete_buf <-ath_tx_process_buffer
irq/175-ath9k-189 [000] .....15 3083.034136: ath_tx_complete <-ath_tx_complete_buf
kworker/u4:0-28892 [001] ....... 3083.083246: ieee80211_offchannel_return <-ieee80211_scan_work

Between timestamps 3082.957750 and 3083.083246, the device was off channel. A packet
came down and reached the hardware queue, and hardware TX processing continued. This
is a serious issue if PDUs are indeed sent out while off channel.
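
To make the reading of the trace reproducible, here is a minimal Python sketch that scans function-trace lines in the format shown above and reports every ath_tx_txqaddbuf call that falls between ieee80211_offchannel_stop_vifs and ieee80211_offchannel_return. The helper name, the regular expression, and the choice of ath_tx_txqaddbuf as the marker for "reached the hardware queue" are assumptions made for illustration; the sample lines are an excerpt of the trace above.

```python
import re

# Matches lines such as:
#   kworker/u4:0-28892 [001] .n..... 3082.957750: func <-caller
LINE_RE = re.compile(
    r"^\s*(?P<task>.+?)\s+\[(?P<cpu>\d+)\]\s+(?P<flags>\S+)\s+"
    r"(?P<ts>\d+\.\d+):\s+(?P<func>\w+)\s+<-(?P<caller>\w+)"
)

STOP = "ieee80211_offchannel_stop_vifs"
RETURN = "ieee80211_offchannel_return"
ENQUEUE = "ath_tx_txqaddbuf"  # assumed marker for the hand-off to the hardware queue

# Excerpt of the trace from this report.
SAMPLE = """\
kworker/u4:0-28892 [001] .n..... 3082.957750: ieee80211_offchannel_stop_vifs <-ieee80211_scan_work
XX-XXXXXXXXXXd-489 [001] .....12 3082.971105: __ieee80211_subif_start_xmit <-ieee80211_subif_start_xmit
XX-XXXXXXXXXXd-489 [001] .....14 3082.971128: ath_tx_txqaddbuf <-ath_txq_schedule
irq/175-ath9k-189 [001] .....15 3082.998888: ath_tx_process_buffer <-ath_tx_edma_tasklet
irq/175-ath9k-189 [001] .....16 3082.998898: ath_tx_txqaddbuf <-ath_txq_schedule
kworker/u4:0-28892 [001] ....... 3083.083246: ieee80211_offchannel_return <-ieee80211_scan_work
"""


def offchannel_enqueues(lines):
    """Return (seconds_since_stop, task) for each enqueue inside an off-channel window."""
    window_start = None  # timestamp of the last stop_vifs, or None when on channel
    hits = []
    for line in lines:
        m = LINE_RE.match(line)
        if m is None:
            continue  # skip headers and anything that is not a function-trace line
        ts, func = float(m["ts"]), m["func"]
        if func == STOP:
            window_start = ts
        elif func == RETURN:
            window_start = None
        elif func == ENQUEUE and window_start is not None:
            hits.append((round(ts - window_start, 6), m["task"]))
    return hits


hits = offchannel_enqueues(SAMPLE.splitlines())
for offset, task in hits:
    print(f"{ENQUEUE} {offset * 1000:.3f} ms after stop_vifs (task {task})")
```

Any hit means a frame was handed to the hardware queue while mac80211 considered the VIFs stopped. It does not by itself prove that the frame went out on the wrong channel.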

Regards,
Tong



2021-05-31 16:12:15

by Toke Høiland-Jørgensen

Subject: Re: ath9k: possible bug with AR93xx during background scanning

Tong Zhu writes:

> Hello
>
> After moving from kernel 4.1 to 4.14, our station with an AR93xx PCIe card began
> to see packet loss during background scanning. After inspecting the changes, I
> realized that ath9k traffic is no longer stopped at the VIF for off-channel scans
> after this change:
>
> 50f08edf98096a68f01ff4566b605a25bf8e42ce ath9k: Switch to using mac80211 intermediate software queues.
>
> Here is a snippet of an ftrace function sequence captured during a scan.
>
> kworker/u4:0-28892 [001] .n..... 3082.957750: ieee80211_offchannel_stop_vifs <-ieee80211_scan_work
> XX-XXXXXXXXXXd-489 [001] .....12 3082.971105: __ieee80211_subif_start_xmit <-ieee80211_subif_start_xmit
> XX-XXXXXXXXXXd-489 [001] .....14 3082.971128: ath_tx_txqaddbuf <-ath_txq_schedule
> irq/175-ath9k-189 [001] .....15 3082.998888: ath_tx_process_buffer <-ath_tx_edma_tasklet
> irq/175-ath9k-189 [001] .....16 3082.998898: ath_tx_txqaddbuf <-ath_txq_schedule
> irq/175-ath9k-189 [000] .....15 3083.011497: ath_tx_process_buffer <-ath_tx_edma_tasklet
> irq/175-ath9k-189 [000] .....16 3083.011504: ath_tx_txqaddbuf <-ath_txq_schedule
> irq/175-ath9k-189 [001] .....15 3083.022261: ath_tx_process_buffer <-ath_tx_edma_tasklet
> irq/175-ath9k-189 [001] .....16 3083.022268: ath_tx_txqaddbuf <-ath_txq_schedule
> irq/175-ath9k-189 [000] .....15 3083.034131: ath_tx_process_buffer <-ath_tx_edma_tasklet
> irq/175-ath9k-189 [000] .....15 3083.034134: ath_tx_complete_buf <-ath_tx_process_buffer
> irq/175-ath9k-189 [000] .....15 3083.034136: ath_tx_complete <-ath_tx_complete_buf
> kworker/u4:0-28892 [001] ....... 3083.083246: ieee80211_offchannel_return <-ieee80211_scan_work
>
> Between timestamps 3082.957750 and 3083.083246, the device was off channel. A packet
> came down and reached the hardware queue, and hardware TX processing continued. This
> is a serious issue if PDUs are indeed sent out while off channel.

I believe this was fixed by this commit:
21a5d4c3a45c ("mac80211: add stop/start logic for software TXQs")

which first appeared in kernel 4.20. It doesn't appear to have been
backported to 4.14, so I suppose it makes sense if you're seeing queues
not getting stopped on that kernel...
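
For anyone who wants to check this against their own tree, here is a minimal Python sketch that asks git whether a commit is reachable from a given release tag. It relies on `git merge-base --is-ancestor`, which exits 0 when the commit is an ancestor and 1 when it is not. The function names and the repository path are hypothetical; the tags v4.14 and v4.20 are assumed to be present in the clone.

```python
import subprocess


def ancestor_check_argv(repo, commit, tag):
    """Build the git command that tests whether `commit` is reachable from `tag`."""
    return ["git", "-C", repo, "merge-base", "--is-ancestor", commit, tag]


def commit_in_tag(repo, commit, tag):
    """Return True if `commit` is an ancestor of `tag` in the clone at `repo`."""
    proc = subprocess.run(
        ancestor_check_argv(repo, commit, tag),
        capture_output=True,
        text=True,
    )
    # Exit status 0: ancestor. Exit status 1: not an ancestor.
    # Anything else (unknown revision, not a repository) is an error.
    if proc.returncode not in (0, 1):
        raise RuntimeError(proc.stderr.strip())
    return proc.returncode == 0
```

With a kernel clone at a hypothetical path such as /path/to/linux, `commit_in_tag("/path/to/linux", "21a5d4c3a45c", "v4.20")` and the same call with "v4.14" should bear out the statement above. Note that stable backports get new hashes, so for a stable branch it is more reliable to search the commit messages for the upstream hash (for example with `git log --grep`) than to test ancestry.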

-Toke

2021-06-01 19:07:19

by Tong Zhu

Subject: Re: ath9k: possible bug with AR93xx during background scanning

Yes, this commit would fix it. I am glad the community has taken care of this.

Thank you.
-Tong
