2012-06-27 13:51:42

by Johannes Berg

[permalink] [raw]
Subject: 3.4.x bug with channel type changes

Hi Paul,

I'm debugging a problem for a user (Florian)
(http://bugzilla.intellinuxwireless.org/show_bug.cgi?id=2363) that turns
out to be a mac80211 bug in 3.4, apparently introduced by an interaction
between your commit

commit 3117bbdb7899d43927c8ce4fe885ab7c1231c121
Author: Paul Stewart <[email protected]>
Date: Tue Mar 13 07:46:18 2012 -0700

mac80211: Don't let regulatory make us deaf


and Rajkumar's commit

commit 7cc44ed48d0ec0937c1f098642540b6c9ca38de5
Author: Rajkumar Manoharan <[email protected]>
Date: Fri Sep 16 15:32:34 2011 +0530

mac80211: Fix regression on queue stop during 2040 bss change


The problem is that the code looks like this in 3.4, after both commits:

if (beacon_htcap_ie && (prev_chantype != rx_channel_type)) {
/*
* Whenever the AP announces the HT mode change that can be
* 40MHz intolerant or etc., it would be safer to stop tx
* queues before doing hw config to avoid buffer overflow.
*/
ieee80211_stop_queues_by_reason(&sdata->local->hw,
IEEE80211_QUEUE_STOP_REASON_CHTYPE_CHANGE);

/* flush out all packets */
synchronize_net();

drv_flush(local, false);
}

/* channel_type change automatically detected */
ieee80211_hw_config(local, 0);

if (prev_chantype != tx_channel_type) {
rcu_read_lock();
sta = sta_info_get(sdata, bssid);
if (sta)
rate_control_rate_update(local, sband, sta,
IEEE80211_RC_HT_CHANGED,
tx_channel_type);
rcu_read_unlock();

if (beacon_htcap_ie)
ieee80211_wake_queues_by_reason(&sdata->local->hw,
IEEE80211_QUEUE_STOP_REASON_CHTYPE_CHANGE);
}


In the tracing I got from Florian, I see that the AP is constantly
switching between HT20 and HT40-. At some point, all this switching
leads to all queues remaining stopped -- forever. The module has to be
unloaded & reloaded for this to get fixed because the CHTYPE_CHANGE stop
reason is never cleared and thus we never transmit any packets again.

Clearly, the problem is that in the first if, we check beacon_htcap_ie
as well as "prev_chantype != rx_channel_type", and in the second if that
would wake the queues, it's "prev_chantype != tx_channel_type". Since RX
and TX channel type can differ (which was the point of your commit),
this can lead to queues being woken that weren't stopped (which is not a
problem) or queues being stopped and never woken again (major issue).

Now, the reason I'm writing all this in an email and not a commit log is
that I don't know what the TX/RX channel type usage should be, or if
maybe it should be "if (queues_stopped) wake_queues()" or something like
that... Can you tell me what it should be?

Thanks,
johannes



2012-06-27 15:55:12

by Rajkumar Manoharan

[permalink] [raw]
Subject: Re: 3.4.x bug with channel type changes

On Wed, Jun 27, 2012 at 07:58:08AM -0700, Paul Stewart wrote:
> On Wed, Jun 27, 2012 at 7:06 AM, Johannes Berg
> <[email protected]> wrote:
> > On Wed, 2012-06-27 at 15:51 +0200, Johannes Berg wrote:
> >
> >> Now, the reason I'm writing all this in an email and not a commit log is
> >> that I don't know what the TX/RX channel type usage should be, or if
> >> maybe it should be "if (queues_stopped) wake_queues()" or something like
> >> that... Can you tell me what it should be?
> >
> > Something like this should help, of course, but maybe there are other
> > issues in this area too?
> >
> > http://p.sipsolutions.net/3e89ee42463d6cc9.txt
>
> This looks good to me, however, Rajkumar might be better at answering
> the question of correctness since I was only involved as far as
> changing the rate control, and didn't (intend to) change anything as
> far as queues go.
>
The changes are good to me.

-Rajkumar

2012-06-27 14:06:23

by Johannes Berg

[permalink] [raw]
Subject: Re: 3.4.x bug with channel type changes

On Wed, 2012-06-27 at 15:51 +0200, Johannes Berg wrote:

> Now, the reason I'm writing all this in an email and not a commit log is
> that I don't know what the TX/RX channel type usage should be, or if
> maybe it should be "if (queues_stopped) wake_queues()" or something like
> that... Can you tell me what it should be?

Something like this should help, of course, but maybe there are other
issues in this area too?

http://p.sipsolutions.net/3e89ee42463d6cc9.txt

johannes


2012-06-27 14:58:09

by Paul Stewart

[permalink] [raw]
Subject: Re: 3.4.x bug with channel type changes

On Wed, Jun 27, 2012 at 7:06 AM, Johannes Berg
<[email protected]> wrote:
> On Wed, 2012-06-27 at 15:51 +0200, Johannes Berg wrote:
>
>> Now, the reason I'm writing all this in an email and not a commit log is
>> that I don't know what the TX/RX channel type usage should be, or if
>> maybe it should be "if (queues_stopped) wake_queues()" or something like
>> that... Can you tell me what it should be?
>
> Something like this should help, of course, but maybe there are other
> issues in this area too?
>
> http://p.sipsolutions.net/3e89ee42463d6cc9.txt

This looks good to me, however, Rajkumar might be better at answering
the question of correctness since I was only involved as far as
changing the rate control, and didn't (intend to) change anything as
far as queues go.


>
> johannes
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-wireless" in
> the body of a message to [email protected]
> More majordomo info at ?http://vger.kernel.org/majordomo-info.html