2012-03-05 16:48:31

by Stanislaw Gruszka

[permalink] [raw]
Subject: [PATCH 3.3] rt2x00: fix random stalls

Is possible that we stop queue and then do not wake up it again,
especially when packets are transmitted fast. That can be easily
reproduced with modified tx queue entry_num to some small value e.g. 16.

If mac80211 already hold local->queue_stop_reason_lock, then we can wait
on that lock in both rt2x00queue_pause_queue() and
rt2x00queue_unpause_queue(). After drooping ->queue_stop_reason_lock
is possible that __ieee80211_wake_queue() will be performed before
__ieee80211_stop_queue(), hence we stop queue and newer wake up it
again.

To prevent stalls serialize pause/unpause by queue->tx_lock.

Cc: [email protected]
Signed-off-by: Stanislaw Gruszka <[email protected]>
---
drivers/net/wireless/rt2x00/rt2x00dev.c | 10 ++++++++--
drivers/net/wireless/rt2x00/rt2x00mac.c | 10 +++++++++-
2 files changed, 17 insertions(+), 3 deletions(-)

diff --git a/drivers/net/wireless/rt2x00/rt2x00dev.c b/drivers/net/wireless/rt2x00/rt2x00dev.c
index 49a51b4..6c64658 100644
--- a/drivers/net/wireless/rt2x00/rt2x00dev.c
+++ b/drivers/net/wireless/rt2x00/rt2x00dev.c
@@ -430,10 +430,16 @@ void rt2x00lib_txdone(struct queue_entry *entry,
/*
* If the data queue was below the threshold before the txdone
* handler we must make sure the packet queue in the mac80211 stack
- * is reenabled when the txdone handler has finished.
+ * is reenabled when the txdone handler has finished. This has to be
+ * serialized with rt2x00mac_tx, otherwise we can wake up mac80211
+ * queue before it was stopped if someone else hold mac80211 internal
+ * local->queue_stop_reason_lock .
*/
- if (!rt2x00queue_threshold(entry->queue))
+ if (!rt2x00queue_threshold(entry->queue)) {
+ spin_lock_irq(&entry->queue->tx_lock);
rt2x00queue_unpause_queue(entry->queue);
+ spin_unlock_irq(&entry->queue->tx_lock);
+ }
}
EXPORT_SYMBOL_GPL(rt2x00lib_txdone);

diff --git a/drivers/net/wireless/rt2x00/rt2x00mac.c b/drivers/net/wireless/rt2x00/rt2x00mac.c
index ede3c58..2880512 100644
--- a/drivers/net/wireless/rt2x00/rt2x00mac.c
+++ b/drivers/net/wireless/rt2x00/rt2x00mac.c
@@ -152,13 +152,21 @@ void rt2x00mac_tx(struct ieee80211_hw *hw, struct sk_buff *skb)
if (unlikely(rt2x00queue_write_tx_frame(queue, skb, false)))
goto exit_fail;

- if (rt2x00queue_threshold(queue))
+ /*
+ * Pausing queue has to be serialized with rt2x00lib_txdone .
+ */
+ if (rt2x00queue_threshold(queue)) {
+ spin_lock(&queue->tx_lock);
rt2x00queue_pause_queue(queue);
+ spin_unlock(&queue->tx_lock);
+ }

return;

exit_fail:
+ spin_lock(&queue->tx_lock);
rt2x00queue_pause_queue(queue);
+ spin_unlock(&queue->tx_lock);
exit_free_skb:
ieee80211_free_txskb(hw, skb);
}
--
1.7.1



2012-03-06 07:45:22

by Helmut Schaa

[permalink] [raw]
Subject: Re: [PATCH 3.3] rt2x00: fix random stalls

Hi,

On Mon, Mar 5, 2012 at 5:48 PM, Stanislaw Gruszka <[email protected]> wrote:
> Is possible that we stop queue and then do not wake up it again,
> especially when packets are transmitted fast. That can be easily
> reproduced with modified tx queue entry_num to some small value e.g. 16.
>
> If mac80211 already hold local->queue_stop_reason_lock, then we can wait
> on that lock in both rt2x00queue_pause_queue() and
> rt2x00queue_unpause_queue(). After drooping ->queue_stop_reason_lock
> is possible that __ieee80211_wake_queue() will be performed before
> __ieee80211_stop_queue(), hence we stop queue and newer wake up it
> again.
>
> To prevent stalls serialize pause/unpause by queue->tx_lock.
>
> Cc: [email protected]
> Signed-off-by: Stanislaw Gruszka <[email protected]>
> ---
> ?drivers/net/wireless/rt2x00/rt2x00dev.c | ? 10 ++++++++--
> ?drivers/net/wireless/rt2x00/rt2x00mac.c | ? 10 +++++++++-
> ?2 files changed, 17 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/net/wireless/rt2x00/rt2x00dev.c b/drivers/net/wireless/rt2x00/rt2x00dev.c
> index 49a51b4..6c64658 100644
> --- a/drivers/net/wireless/rt2x00/rt2x00dev.c
> +++ b/drivers/net/wireless/rt2x00/rt2x00dev.c
> @@ -430,10 +430,16 @@ void rt2x00lib_txdone(struct queue_entry *entry,
> ? ? ? ?/*
> ? ? ? ? * If the data queue was below the threshold before the txdone
> ? ? ? ? * handler we must make sure the packet queue in the mac80211 stack
> - ? ? ? ?* is reenabled when the txdone handler has finished.
> + ? ? ? ?* is reenabled when the txdone handler has finished. This has to be
> + ? ? ? ?* serialized with rt2x00mac_tx, otherwise we can wake up mac80211
> + ? ? ? ?* queue before it was stopped if someone else hold mac80211 internal
> + ? ? ? ?* local->queue_stop_reason_lock .
> ? ? ? ? */
> - ? ? ? if (!rt2x00queue_threshold(entry->queue))
> + ? ? ? if (!rt2x00queue_threshold(entry->queue)) {
> + ? ? ? ? ? ? ? spin_lock_irq(&entry->queue->tx_lock);
> ? ? ? ? ? ? ? ?rt2x00queue_unpause_queue(entry->queue);
> + ? ? ? ? ? ? ? spin_unlock_irq(&entry->queue->tx_lock);

Why do we need to disable interrupts here? spin_lock_bh should
be sufficient.

> + ? ? ? }
> ?}
> ?EXPORT_SYMBOL_GPL(rt2x00lib_txdone);
>
> diff --git a/drivers/net/wireless/rt2x00/rt2x00mac.c b/drivers/net/wireless/rt2x00/rt2x00mac.c
> index ede3c58..2880512 100644
> --- a/drivers/net/wireless/rt2x00/rt2x00mac.c
> +++ b/drivers/net/wireless/rt2x00/rt2x00mac.c
> @@ -152,13 +152,21 @@ void rt2x00mac_tx(struct ieee80211_hw *hw, struct sk_buff *skb)
> ? ? ? ?if (unlikely(rt2x00queue_write_tx_frame(queue, skb, false)))
> ? ? ? ? ? ? ? ?goto exit_fail;
>
> - ? ? ? if (rt2x00queue_threshold(queue))
> + ? ? ? /*
> + ? ? ? ?* Pausing queue has to be serialized with rt2x00lib_txdone .
> + ? ? ? ?*/
> + ? ? ? if (rt2x00queue_threshold(queue)) {
> + ? ? ? ? ? ? ? spin_lock(&queue->tx_lock);
> ? ? ? ? ? ? ? ?rt2x00queue_pause_queue(queue);
> + ? ? ? ? ? ? ? spin_unlock(&queue->tx_lock);
> + ? ? ? }
>
> ? ? ? ?return;
>
> ?exit_fail:
> + ? ? ? spin_lock(&queue->tx_lock);
> ? ? ? ?rt2x00queue_pause_queue(queue);
> + ? ? ? spin_unlock(&queue->tx_lock);
> ?exit_free_skb:
> ? ? ? ?ieee80211_free_txskb(hw, skb);
> ?}
> --
> 1.7.1
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-wireless" in
> the body of a message to [email protected]
> More majordomo info at ?http://vger.kernel.org/majordomo-info.html

2012-03-06 06:53:49

by Stanislaw Gruszka

[permalink] [raw]
Subject: Re: [rt2x00-users] [PATCH 3.3] rt2x00: fix random stalls

Hi Gertjan

On Mon, Mar 05, 2012 at 08:54:37PM +0100, Gertjan van Wingerde wrote:
> There are more places in the rt2x00 code that call upon rt2x00queue_pause_queue and rt2x00queue_unpause_queue. Shouldn't these places be protected with tx_lock as well?

Hmm, good question. Perhaps there are possible races between other usage
of pause/unpause, but they are not obvious for me.

Seems there are more bugs there, i.e on rt2x00queue_stop_queue, we first
clear QUEUE_STARTED then call pause_queue, which will just return with
that bit cleared.

On rt2x00queue_flush_queue(), we first call pause queue, then
->kick_queue, which will cause to call txdone and unpause queue.

So, it's hard to tell for me right now if other paces needs serialization,
apparently related area needs some more detailed review,

> Or better, shouldn't the locking be moved inside the pause / unpause functions?

Thats a bit more complication, because we need lock for TX queues only
and it has to be taken before test_and_{set,clear}_bit(QUEUE_PAUSED, &queue->flags),
otherwise we still can race.

I believe this patch is simplest possible approach to solve the problem,
at least for now.

Stanislaw

2012-03-06 11:53:45

by Stanislaw Gruszka

[permalink] [raw]
Subject: Re: [PATCH 3.3] rt2x00: fix random stalls

On Tue, Mar 06, 2012 at 08:45:21AM +0100, Helmut Schaa wrote:
> > - ? ? ? if (!rt2x00queue_threshold(entry->queue))
> > + ? ? ? if (!rt2x00queue_threshold(entry->queue)) {
> > + ? ? ? ? ? ? ? spin_lock_irq(&entry->queue->tx_lock);
> > ? ? ? ? ? ? ? ?rt2x00queue_unpause_queue(entry->queue);
> > + ? ? ? ? ? ? ? spin_unlock_irq(&entry->queue->tx_lock);
>
> Why do we need to disable interrupts here? spin_lock_bh should
> be sufficient.

I'm not 100% sure, and I was to lazy to find out, and chose safer
version. I guess I need to find out now ...

Stanislaw

2012-03-07 18:25:39

by Stanislaw Gruszka

[permalink] [raw]
Subject: Re: [PATCH 3.3] rt2x00: fix random stalls

On Tue, Mar 06, 2012 at 01:08:04PM +0100, Gertjan van Wingerde wrote:
> On Tue, Mar 6, 2012 at 12:53 PM, Stanislaw Gruszka <[email protected]> wrote:
> > On Tue, Mar 06, 2012 at 08:45:21AM +0100, Helmut Schaa wrote:
> >> > - ? ? ? if (!rt2x00queue_threshold(entry->queue))
> >> > + ? ? ? if (!rt2x00queue_threshold(entry->queue)) {
> >> > + ? ? ? ? ? ? ? spin_lock_irq(&entry->queue->tx_lock);
> >> > ? ? ? ? ? ? ? ?rt2x00queue_unpause_queue(entry->queue);
> >> > + ? ? ? ? ? ? ? spin_unlock_irq(&entry->queue->tx_lock);
> >>
> >> Why do we need to disable interrupts here? spin_lock_bh should
> >> be sufficient.
> >
> > I'm not 100% sure, and I was to lazy to find out, and chose safer
> > version. I guess I need to find out now ...

Ok, locking with bh is fine.

> That is actually a good point of Helmut. In all other cases where the tx_lock
> is used we actually use spin_lock and spin_unlock. AFAIK we shouldn't mix
> the different spinlock variants, so with this the other uses may have to change
> as well.

We use this lock only in rt2x00mac_tx (2 times) with bh disabled by
generic net or mac80211 layer. And now from txdone in process context
(usb) or tasklet (pci), so existing spin_lock function version does not
need to be changed.

On the meantime, I realized that we should also serialize
rt2x00queue_threshold(). I'll post second version of a patch shortly.

Stanislaw

2012-03-05 19:54:33

by Gertjan van Wingerde

[permalink] [raw]
Subject: Re: [rt2x00-users] [PATCH 3.3] rt2x00: fix random stalls

Hi Stanislaw,

On 5 mrt. 2012, at 17:48, Stanislaw Gruszka <[email protected]> wrote:

> Is possible that we stop queue and then do not wake up it again,
> especially when packets are transmitted fast. That can be easily
> reproduced with modified tx queue entry_num to some small value e.g. 16.
>
> If mac80211 already hold local->queue_stop_reason_lock, then we can wait
> on that lock in both rt2x00queue_pause_queue() and
> rt2x00queue_unpause_queue(). After drooping ->queue_stop_reason_lock
> is possible that __ieee80211_wake_queue() will be performed before
> __ieee80211_stop_queue(), hence we stop queue and newer wake up it
> again.
>
> To prevent stalls serialize pause/unpause by queue->tx_lock.
>
> Cc: [email protected]
> Signed-off-by: Stanislaw Gruszka <[email protected]>
> ---
> drivers/net/wireless/rt2x00/rt2x00dev.c | 10 ++++++++--
> drivers/net/wireless/rt2x00/rt2x00mac.c | 10 +++++++++-
> 2 files changed, 17 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/net/wireless/rt2x00/rt2x00dev.c b/drivers/net/wireless/rt2x00/rt2x00dev.c
> index 49a51b4..6c64658 100644
> --- a/drivers/net/wireless/rt2x00/rt2x00dev.c
> +++ b/drivers/net/wireless/rt2x00/rt2x00dev.c
> @@ -430,10 +430,16 @@ void rt2x00lib_txdone(struct queue_entry *entry,
> /*
> * If the data queue was below the threshold before the txdone
> * handler we must make sure the packet queue in the mac80211 stack
> - * is reenabled when the txdone handler has finished.
> + * is reenabled when the txdone handler has finished. This has to be
> + * serialized with rt2x00mac_tx, otherwise we can wake up mac80211
> + * queue before it was stopped if someone else hold mac80211 internal
> + * local->queue_stop_reason_lock .
> */
> - if (!rt2x00queue_threshold(entry->queue))
> + if (!rt2x00queue_threshold(entry->queue)) {
> + spin_lock_irq(&entry->queue->tx_lock);
> rt2x00queue_unpause_queue(entry->queue);
> + spin_unlock_irq(&entry->queue->tx_lock);
> + }
> }
> EXPORT_SYMBOL_GPL(rt2x00lib_txdone);
>
> diff --git a/drivers/net/wireless/rt2x00/rt2x00mac.c b/drivers/net/wireless/rt2x00/rt2x00mac.c
> index ede3c58..2880512 100644
> --- a/drivers/net/wireless/rt2x00/rt2x00mac.c
> +++ b/drivers/net/wireless/rt2x00/rt2x00mac.c
> @@ -152,13 +152,21 @@ void rt2x00mac_tx(struct ieee80211_hw *hw, struct sk_buff *skb)
> if (unlikely(rt2x00queue_write_tx_frame(queue, skb, false)))
> goto exit_fail;
>
> - if (rt2x00queue_threshold(queue))
> + /*
> + * Pausing queue has to be serialized with rt2x00lib_txdone .
> + */
> + if (rt2x00queue_threshold(queue)) {
> + spin_lock(&queue->tx_lock);
> rt2x00queue_pause_queue(queue);
> + spin_unlock(&queue->tx_lock);
> + }
>
> return;
>
> exit_fail:
> + spin_lock(&queue->tx_lock);
> rt2x00queue_pause_queue(queue);
> + spin_unlock(&queue->tx_lock);
> exit_free_skb:
> ieee80211_free_txskb(hw, skb);
> }

There are more places in the rt2x00 code that call upon rt2x00queue_pause_queue and rt2x00queue_unpause_queue. Shouldn't these places be protected with tx_lock as well?
Or better, shouldn't the locking be moved inside the pause / unpause functions?

---
Gertjan

2012-03-07 18:46:52

by Stanislaw Gruszka

[permalink] [raw]
Subject: Re: [PATCH 3.3] rt2x00: fix random stalls

Hi

On Tue, Mar 06, 2012 at 06:43:56PM +0100, Martin Hundeb?ll wrote:
> Hi,
>
> On 03/05/2012 05:48 PM, Stanislaw Gruszka wrote:
> >Is possible that we stop queue and then do not wake up it again,
> >especially when packets are transmitted fast. That can be easily
> >reproduced with modified tx queue entry_num to some small value e.g. 16.
> >
> >If mac80211 already hold local->queue_stop_reason_lock, then we can wait
> >on that lock in both rt2x00queue_pause_queue() and
> >rt2x00queue_unpause_queue(). After drooping ->queue_stop_reason_lock
> >is possible that __ieee80211_wake_queue() will be performed before
> >__ieee80211_stop_queue(), hence we stop queue and newer wake up it
> >again.
> >
> >To prevent stalls serialize pause/unpause by queue->tx_lock.
>
> I've been having CPU load issues with rt2800usb/Ralink RT2870, when doing simultaneous TX/RX between to nodes in an adhoc network. While transfering UDP packets in one direction with iperf[1], I get ~23Mbit/s and kworker is utilizing <10% of the CPU (OMAP4 1GHz dualcore or/and Pentium M 1.70GHz) on both ends. When doing bidirectional tests with iperf[2], one kworker thread jumps too 100% and throughput drops.
>
> By using two iperf clients to do bidirectional TCP transfers, I got ~6Mbit/s in both directions, so I suspected some queueing issues and thus applied this patch, but no change. I've tried to do some tracing[3], but this is quite new to me, so please instruct me, if you need more info.

I did short test here and do not enter that issue. Which kernel version are you using?

> [3]
> out.txt has a trace from 10.10.10.55 while running iperf as in [2] and the following commands:

Please newer do this again :-) If you want to provide such big data, put it
somewhere and paste download link to the email. Moreover that tracing
did not provide any useful information.

Stanislaw


2012-03-06 12:08:05

by Gertjan van Wingerde

[permalink] [raw]
Subject: Re: [PATCH 3.3] rt2x00: fix random stalls

Hi Stanislaw,

On Tue, Mar 6, 2012 at 12:53 PM, Stanislaw Gruszka <[email protected]> wrote:
> On Tue, Mar 06, 2012 at 08:45:21AM +0100, Helmut Schaa wrote:
>> > - ? ? ? if (!rt2x00queue_threshold(entry->queue))
>> > + ? ? ? if (!rt2x00queue_threshold(entry->queue)) {
>> > + ? ? ? ? ? ? ? spin_lock_irq(&entry->queue->tx_lock);
>> > ? ? ? ? ? ? ? ?rt2x00queue_unpause_queue(entry->queue);
>> > + ? ? ? ? ? ? ? spin_unlock_irq(&entry->queue->tx_lock);
>>
>> Why do we need to disable interrupts here? spin_lock_bh should
>> be sufficient.
>
> I'm not 100% sure, and I was to lazy to find out, and chose safer
> version. I guess I need to find out now ...
>

That is actually a good point of Helmut. In all other cases where the tx_lock
is used we actually use spin_lock and spin_unlock. AFAIK we shouldn't mix
the different spinlock variants, so with this the other uses may have to change
as well.

---
Gertjan

2012-03-05 19:27:48

by Ivo Van Doorn

[permalink] [raw]
Subject: Re: [rt2x00-users] [PATCH 3.3] rt2x00: fix random stalls

On Mon, Mar 5, 2012 at 5:48 PM, Stanislaw Gruszka <[email protected]> wrote:
> Is possible that we stop queue and then do not wake up it again,
> especially when packets are transmitted fast. That can be easily
> reproduced with modified tx queue entry_num to some small value e.g. 16.
>
> If mac80211 already hold local->queue_stop_reason_lock, then we can wait
> on that lock in both rt2x00queue_pause_queue() and
> rt2x00queue_unpause_queue(). After drooping ->queue_stop_reason_lock
> is possible that __ieee80211_wake_queue() will be performed before
> __ieee80211_stop_queue(), hence we stop queue and newer wake up it
> again.
>
> To prevent stalls serialize pause/unpause by queue->tx_lock.
>
> Cc: [email protected]
> Signed-off-by: Stanislaw Gruszka <[email protected]>

Acked-by: Ivo van Doorn <[email protected]>

> ---
> ?drivers/net/wireless/rt2x00/rt2x00dev.c | ? 10 ++++++++--
> ?drivers/net/wireless/rt2x00/rt2x00mac.c | ? 10 +++++++++-
> ?2 files changed, 17 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/net/wireless/rt2x00/rt2x00dev.c b/drivers/net/wireless/rt2x00/rt2x00dev.c
> index 49a51b4..6c64658 100644
> --- a/drivers/net/wireless/rt2x00/rt2x00dev.c
> +++ b/drivers/net/wireless/rt2x00/rt2x00dev.c
> @@ -430,10 +430,16 @@ void rt2x00lib_txdone(struct queue_entry *entry,
> ? ? ? ?/*
> ? ? ? ? * If the data queue was below the threshold before the txdone
> ? ? ? ? * handler we must make sure the packet queue in the mac80211 stack
> - ? ? ? ?* is reenabled when the txdone handler has finished.
> + ? ? ? ?* is reenabled when the txdone handler has finished. This has to be
> + ? ? ? ?* serialized with rt2x00mac_tx, otherwise we can wake up mac80211
> + ? ? ? ?* queue before it was stopped if someone else hold mac80211 internal
> + ? ? ? ?* local->queue_stop_reason_lock .
> ? ? ? ? */
> - ? ? ? if (!rt2x00queue_threshold(entry->queue))
> + ? ? ? if (!rt2x00queue_threshold(entry->queue)) {
> + ? ? ? ? ? ? ? spin_lock_irq(&entry->queue->tx_lock);
> ? ? ? ? ? ? ? ?rt2x00queue_unpause_queue(entry->queue);
> + ? ? ? ? ? ? ? spin_unlock_irq(&entry->queue->tx_lock);
> + ? ? ? }
> ?}
> ?EXPORT_SYMBOL_GPL(rt2x00lib_txdone);
>
> diff --git a/drivers/net/wireless/rt2x00/rt2x00mac.c b/drivers/net/wireless/rt2x00/rt2x00mac.c
> index ede3c58..2880512 100644
> --- a/drivers/net/wireless/rt2x00/rt2x00mac.c
> +++ b/drivers/net/wireless/rt2x00/rt2x00mac.c
> @@ -152,13 +152,21 @@ void rt2x00mac_tx(struct ieee80211_hw *hw, struct sk_buff *skb)
> ? ? ? ?if (unlikely(rt2x00queue_write_tx_frame(queue, skb, false)))
> ? ? ? ? ? ? ? ?goto exit_fail;
>
> - ? ? ? if (rt2x00queue_threshold(queue))
> + ? ? ? /*
> + ? ? ? ?* Pausing queue has to be serialized with rt2x00lib_txdone .
> + ? ? ? ?*/
> + ? ? ? if (rt2x00queue_threshold(queue)) {
> + ? ? ? ? ? ? ? spin_lock(&queue->tx_lock);
> ? ? ? ? ? ? ? ?rt2x00queue_pause_queue(queue);
> + ? ? ? ? ? ? ? spin_unlock(&queue->tx_lock);
> + ? ? ? }
>
> ? ? ? ?return;
>
> ?exit_fail:
> + ? ? ? spin_lock(&queue->tx_lock);
> ? ? ? ?rt2x00queue_pause_queue(queue);
> + ? ? ? spin_unlock(&queue->tx_lock);
> ?exit_free_skb:
> ? ? ? ?ieee80211_free_txskb(hw, skb);
> ?}
> --
> 1.7.1
>
>
> _______________________________________________
> users mailing list
> [email protected]
> http://rt2x00.serialmonkey.com/mailman/listinfo/users_rt2x00.serialmonkey.com

2012-03-11 09:53:36

by Martin Hundebøll

[permalink] [raw]
Subject: Re: [PATCH 3.3] rt2x00: fix random stalls

Hi,

On 03/07/2012 07:46 PM, Stanislaw Gruszka wrote:
> Hi
>
> On Tue, Mar 06, 2012 at 06:43:56PM +0100, Martin Hundebøll wrote:
>> Hi,
>>
>> On 03/05/2012 05:48 PM, Stanislaw Gruszka wrote:
>>> Is possible that we stop queue and then do not wake up it again,
>>> especially when packets are transmitted fast. That can be easily
>>> reproduced with modified tx queue entry_num to some small value e.g. 16.
>>>
>>> If mac80211 already hold local->queue_stop_reason_lock, then we can wait
>>> on that lock in both rt2x00queue_pause_queue() and
>>> rt2x00queue_unpause_queue(). After drooping ->queue_stop_reason_lock
>>> is possible that __ieee80211_wake_queue() will be performed before
>>> __ieee80211_stop_queue(), hence we stop queue and newer wake up it
>>> again.
>>>
>>> To prevent stalls serialize pause/unpause by queue->tx_lock.
>> I've been having CPU load issues with rt2800usb/Ralink RT2870, when doing simultaneous TX/RX between to nodes in an adhoc network. While transfering UDP packets in one direction with iperf[1], I get ~23Mbit/s and kworker is utilizing<10% of the CPU (OMAP4 1GHz dualcore or/and Pentium M 1.70GHz) on both ends. When doing bidirectional tests with iperf[2], one kworker thread jumps too 100% and throughput drops.
>>
>> By using two iperf clients to do bidirectional TCP transfers, I got ~6Mbit/s in both directions, so I suspected some queueing issues and thus applied this patch, but no change. I've tried to do some tracing[3], but this is quite new to me, so please instruct me, if you need more info.
> I did short test here and do not enter that issue. Which kernel version are you using?

I forgot to mention, that we work with adhoc networks. I've tested 2.6.38.4 and 3.2.7, with in-tree and compat-wireless on both versions.

However, the issue is fixed by another patch-set posted on [email protected] ([RFC/RFT 0/5] rt2x00: rt2800usb: rework tx path). So now I "just" see random stalls, which should be fixed by this patch. (I will test the most recent version shortly.)

>> [3]
>> out.txt has a trace from 10.10.10.55 while running iperf as in [2] and the following commands:
> Please newer do this again :-) If you want to provide such big data, put it
> somewhere and paste download link to the email. Moreover that tracing
> did not provide any useful information.

Will never happen again :)

// Martin

2012-03-06 21:37:17

by Martin Hundebøll

[permalink] [raw]
Subject: Re: [PATCH 3.3] rt2x00: fix random stalls

Hi,

I'm very sorry about the size of my previous mail. I hope that it was not too trouble.

/ Martin

On 03/06/2012 06:43 PM, Martin Hundebøll wrote:
> Hi,
>
> On 03/05/2012 05:48 PM, Stanislaw Gruszka wrote:
>> Is possible that we stop queue and then do not wake up it again,
>> especially when packets are transmitted fast. That can be easily
>> reproduced with modified tx queue entry_num to some small value e.g. 16.
>>
>> If mac80211 already hold local->queue_stop_reason_lock, then we can wait
>> on that lock in both rt2x00queue_pause_queue() and
>> rt2x00queue_unpause_queue(). After drooping ->queue_stop_reason_lock
>> is possible that __ieee80211_wake_queue() will be performed before
>> __ieee80211_stop_queue(), hence we stop queue and newer wake up it
>> again.
>>
>> To prevent stalls serialize pause/unpause by queue->tx_lock.
>
> I've been having CPU load issues with rt2800usb/Ralink RT2870, when doing simultaneous TX/RX between to nodes in an adhoc network. While transfering UDP packets in one direction with iperf[1], I get ~23Mbit/s and kworker is utilizing <10% of the CPU (OMAP4 1GHz dualcore or/and Pentium M 1.70GHz) on both ends. When doing bidirectional tests with iperf[2], one kworker thread jumps too 100% and throughput drops.
>
> By using two iperf clients to do bidirectional TCP transfers, I got ~6Mbit/s in both directions, so I suspected some queueing issues and thus applied this patch, but no change. I've tried to do some tracing[3], but this is quite new to me, so please instruct me, if you need more info.
>
> Kind regards,
> Martin Hundebøll
>
> [1]
> iperf unidirectional cmd and output:
> # iperf -c10.10.10.56 -ub50M
> Server Report:
> 0.0-10.0 sec 27.5 MBytes 22.9 Mbits/sec 1.639 ms 0/19602 (0%)
>
> [2]
> iperf bidirectional cmd and output:
> # iperf -c10.10.10.56 -udb8M
> Sent 2501 datagrams
> [ 3] 0.0-11.0 sec 1.26 MBytes 963 Kbits/sec 22.437 ms 943/ 1840 (51%)
> Server Report:
> [ 4] 0.0-10.9 sec 2.11 MBytes 1.62 Mbits/sec 309.803 ms 993/ 2500 (40%)
>
> [3]
> out.txt has a trace from 10.10.10.55 while running iperf as in [2] and the following commands:
> $ echo workqueue:workqueue_queue_work > /sys/kernel/debug/tracing/set_event
> $ cat /sys/kernel/debug/tracing/trace_pipe > out.txt