2014-07-23 13:41:09

by Felix Fietkau

[permalink] [raw]
Subject: [PATCH] ath9k: fix aggregation session lockup

If an aggregation session fails, frames still end up in the driver queue
with IEEE80211_TX_CTL_AMPDU set.
This causes tx for the affected station/tid to stall, since
ath_tx_get_tid_subframe returning packets to send.

Fix this by clearing IEEE80211_TX_CTL_AMPDU as long as no aggregation
session is running.

Cc: [email protected]
Reported-by: Antonio Quartulli <[email protected]>
Signed-off-by: Felix Fietkau <[email protected]>
---
drivers/net/wireless/ath/ath9k/xmit.c | 9 +++++++++
1 file changed, 9 insertions(+)

diff --git a/drivers/net/wireless/ath/ath9k/xmit.c b/drivers/net/wireless/ath/ath9k/xmit.c
index 3611529..704fcbc 100644
--- a/drivers/net/wireless/ath/ath9k/xmit.c
+++ b/drivers/net/wireless/ath/ath9k/xmit.c
@@ -897,6 +897,15 @@ ath_tx_get_tid_subframe(struct ath_softc *sc, struct ath_txq *txq,

tx_info = IEEE80211_SKB_CB(skb);
tx_info->flags &= ~IEEE80211_TX_CTL_CLEAR_PS_FILT;
+
+ /*
+ * No aggregation session is running, but there may be frames
+ * from a previous session or a failed attempt in the queue.
+ * Send them out as normal data frames
+ */
+ if (!tid->active)
+ tx_info->flags &= ~IEEE80211_TX_CTL_AMPDU;
+
if (!(tx_info->flags & IEEE80211_TX_CTL_AMPDU)) {
bf->bf_state.bf_type = 0;
return bf;
--
1.8.5.2 (Apple Git-48)



2014-07-28 20:54:22

by Ben Greear

[permalink] [raw]
Subject: Re: [PATCH] ath9k: fix aggregation session lockup

On 07/23/2014 06:40 AM, Felix Fietkau wrote:
> If an aggregation session fails, frames still end up in the driver queue
> with IEEE80211_TX_CTL_AMPDU set.
> This causes tx for the affected station/tid to stall, since
> ath_tx_get_tid_subframe returning packets to send.
>
> Fix this by clearing IEEE80211_TX_CTL_AMPDU as long as no aggregation
> session is running.

Did this and your previous patch about 'pending tx frames accounting'
fix the issues reported?

They at least do not fix the problems I see where tx appears to hang,
but my setup is pretty 'special'....

Thanks,
Ben

--
Ben Greear <[email protected]>
Candela Technologies Inc http://www.candelatech.com


2014-07-28 21:17:45

by Ben Greear

[permalink] [raw]
Subject: Re: [PATCH] ath9k: fix aggregation session lockup

On 07/28/2014 02:11 PM, Felix Fietkau wrote:
> On 2014-07-28 22:54, Ben Greear wrote:
>> On 07/23/2014 06:40 AM, Felix Fietkau wrote:
>>> If an aggregation session fails, frames still end up in the driver queue
>>> with IEEE80211_TX_CTL_AMPDU set.
>>> This causes tx for the affected station/tid to stall, since
>>> ath_tx_get_tid_subframe returning packets to send.
>>>
>>> Fix this by clearing IEEE80211_TX_CTL_AMPDU as long as no aggregation
>>> session is running.
>>
>> Did this and your previous patch about 'pending tx frames accounting'
>> fix the issues reported?
> Yes.
>
>> They at least do not fix the problems I see where tx appears to hang,
>> but my setup is pretty 'special'....
> Interesting. Maybe you should ask Antonio for an updated version of the
> patch that he used to debug this issue. If you give me the output of it
> while it's locked up, I might be able to figure out what's going on in
> your setup.

Let me first try to reproduce with a recent, un-hacked-upon kernel...

I would welcome the debug patch in the meantime.

Thanks,
Ben

--
Ben Greear <[email protected]>
Candela Technologies Inc http://www.candelatech.com


2014-07-28 21:11:52

by Felix Fietkau

[permalink] [raw]
Subject: Re: [PATCH] ath9k: fix aggregation session lockup

On 2014-07-28 22:54, Ben Greear wrote:
> On 07/23/2014 06:40 AM, Felix Fietkau wrote:
>> If an aggregation session fails, frames still end up in the driver queue
>> with IEEE80211_TX_CTL_AMPDU set.
>> This causes tx for the affected station/tid to stall, since
>> ath_tx_get_tid_subframe returning packets to send.
>>
>> Fix this by clearing IEEE80211_TX_CTL_AMPDU as long as no aggregation
>> session is running.
>
> Did this and your previous patch about 'pending tx frames accounting'
> fix the issues reported?
Yes.

> They at least do not fix the problems I see where tx appears to hang,
> but my setup is pretty 'special'....
Interesting. Maybe you should ask Antonio for an updated version of the
patch that he used to debug this issue. If you give me the output of it
while it's locked up, I might be able to figure out what's going on in
your setup.

- Felix

2014-07-29 16:05:54

by Ben Greear

[permalink] [raw]
Subject: Re: [PATCH] ath9k: fix aggregation session lockup

On 07/28/2014 02:11 PM, Felix Fietkau wrote:
> On 2014-07-28 22:54, Ben Greear wrote:
>> On 07/23/2014 06:40 AM, Felix Fietkau wrote:
>>> If an aggregation session fails, frames still end up in the driver queue
>>> with IEEE80211_TX_CTL_AMPDU set.
>>> This causes tx for the affected station/tid to stall, since
>>> ath_tx_get_tid_subframe returning packets to send.
>>>
>>> Fix this by clearing IEEE80211_TX_CTL_AMPDU as long as no aggregation
>>> session is running.
>>
>> Did this and your previous patch about 'pending tx frames accounting'
>> fix the issues reported?
> Yes.
>
>> They at least do not fix the problems I see where tx appears to hang,
>> but my setup is pretty 'special'....
> Interesting. Maybe you should ask Antonio for an updated version of the
> patch that he used to debug this issue. If you give me the output of it
> while it's locked up, I might be able to figure out what's going on in
> your setup.

What is the easiest upstream tree for you to debug?

Thanks,
Ben

>
> - Felix
> --
> To unsubscribe from this list: send the line "unsubscribe linux-wireless" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>


--
Ben Greear <[email protected]>
Candela Technologies Inc http://www.candelatech.com


2014-08-04 13:34:05

by Antonio Quartulli

[permalink] [raw]
Subject: Re: [PATCH] ath9k: fix aggregation session lockup



On 29/07/14 18:05, Ben Greear wrote:
> On 07/28/2014 02:11 PM, Felix Fietkau wrote:
>> Interesting. Maybe you should ask Antonio for an updated version of the
>> patch that he used to debug this issue. If you give me the output of it
>> while it's locked up, I might be able to figure out what's going on in
>> your setup.
>
> What is the easiest upstream tree for you to debug?

Sorry for the late reply but I was pretty "disconnected" in the last days.

Is the debugging patch still required? If so, I can dig in my buildroot
and retrieve it.

Cheers,



--
Antonio Quartulli


Attachments:
signature.asc (884.00 B)
OpenPGP digital signature

2014-08-08 11:23:30

by Antonio Quartulli

[permalink] [raw]
Subject: Re: [PATCH] ath9k: fix aggregation session lockup

On 04/08/14 22:14, Ben Greear wrote:
> On 08/04/2014 06:27 AM, Antonio Quartulli wrote:
>
>
>> On 29/07/14 18:05, Ben Greear wrote:
>>> On 07/28/2014 02:11 PM, Felix Fietkau wrote:
>>>> Interesting. Maybe you should ask Antonio for an updated version of the patch that he used to debug this issue. If you give me the output of it while
>>>> it's locked up, I might be able to figure out what's going on in your setup.
>>>
>>> What is the easiest upstream tree for you to debug?
>
>> Sorry for the late reply but I was pretty "disconnected" in the last days.
>
>> Is the debugging patch still required? If so, I can dig in my buildroot and retrieve it.
>
> I would appreciate it if you could post the patch. I'd like to
> see if I can get the bug fixed once and for all.....
>
> Thanks,
> Ben

Ben,

attached you have the patches we have been using during our debug.

They create a new debugfs file that you can read when the wifi is stuck
in order to gather some information about the queues/tids status. The
file is located in the ath9k debugfs folder and its name is "nodes".

I hope it can help.

Cheers,

--
Antonio Quartulli


Attachments:
921-debug.patch (8.25 kB)
923-ath9k-queue-debug.patch (1.43 kB)
924-ath9k-more-debug-output.patch (3.16 kB)
925-ath9k-print-tx_info.patch (956.00 B)
signature.asc (884.00 B)
OpenPGP digital signature
Download all attachments

2014-08-04 20:14:48

by Ben Greear

[permalink] [raw]
Subject: Re: [PATCH] ath9k: fix aggregation session lockup

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 08/04/2014 06:27 AM, Antonio Quartulli wrote:
>
>
> On 29/07/14 18:05, Ben Greear wrote:
>> On 07/28/2014 02:11 PM, Felix Fietkau wrote:
>>> Interesting. Maybe you should ask Antonio for an updated version of the patch that he used to debug this issue. If you give me the output of it while
>>> it's locked up, I might be able to figure out what's going on in your setup.
>>
>> What is the easiest upstream tree for you to debug?
>
> Sorry for the late reply but I was pretty "disconnected" in the last days.
>
> Is the debugging patch still required? If so, I can dig in my buildroot and retrieve it.

I would appreciate it if you could post the patch. I'd like to
see if I can get the bug fixed once and for all.....

Thanks,
Ben

>
> Cheers,
>
>
>


- --
Ben Greear <[email protected]>
Candela Technologies Inc http://www.candelatech.com

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.13 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/

iQEcBAEBAgAGBQJT3+m0AAoJELbHqkYeJT4OniwH/1mLfg10xq5CPR5eNKCiM2vY
Ww38H/fC5N2K0Uyw84uWTHmJTjqvl4IbW2Xhve4i6yqiQa+NSva/ge5ogK6bLiMH
13AxHysUMnPgu3f9so4wziRuc3izkcs4qgArnXOW03UnVOpAoUbRUVkNPojDzQpM
zkxmQTYTd61XAW6cvOUlvLJsaY6Eapa9ClbIRtj8SuyoIvrvKzKlww96ibnJ7jFR
AXDBSt6n0bPy07DpHuvIs5UypbV2rO6JjQ4WFD85tqScP3+ZEs0KcYk3lv06FXnq
o1SbMSIY24AiLMMhIdIREPadETgft65DGd7wgMarvF8nnqsB7Gnnq0f/YJS32uc=
=daKD
-----END PGP SIGNATURE-----