2009-07-18 15:02:54

by Larry Finger

Subject: Re: WARN_ON at minstrel_get_rate (include/net/mac80211.h:2111) with p54usb

Christian Lamparter wrote:
> On Saturday 18 July 2009 03:05:04 Larry Finger wrote:
>> Johannes and Christian,
>>
>> I am getting a WARN_ON from mac80211 in the location stated in the
>> subject. I put in some test prints and got the following:
>>
>> sband->n_bitrates 8, Band 1, supp_rates 0x0
> hmm, so something decides to talk to a 5GHz network here?
> But the AP doesn't have any available rates in that band!?

Ah, band 1 is 5GHz, not 2.4. There are no 802.11a APs in my neighborhood.

> If my memory serves me right, some people have triggered the same WARN_ON
> with the ath5k driver as well...
> unfortunately, I didn't follow the thread and now I can't find it anymore.
>
> (btw: is your b43 11a capable as well?)

Yes it is, but that section is software crippled.

Thanks for the patch. I'll let you know what happens.

Larry


2009-07-19 14:36:49

by Larry Finger

Subject: Re: WARN_ON at minstrel_get_rate (include/net/mac80211.h:2111) with p54usb

Christian Lamparter wrote:
> well, ieee80211_sta_monitor_work - which probes the AP every now and then -
> didn't check if we're scanning.
> The attached diff survives non-stop scanning without triggering the WARN in
> rate_lowest_index even once.
>
> However, I'm not so sure about the locking for hw_scanning and sw_scanning.
> It looks like only scan.c manipulates them, under the scan mutex.
> But then, do we need locking for a single-threaded workqueue? I guess not.
> ---
> Larry,
>
> here's another _fix_ which might even fix the problem after all ;-)
>
> Regards,
> Chr
> ---
> diff --git a/net/mac80211/mlme.c b/net/mac80211/mlme.c
> index 18dad22..4833e7c 100644
> --- a/net/mac80211/mlme.c
> +++ b/net/mac80211/mlme.c
> @@ -2210,6 +2210,9 @@ static void ieee80211_sta_monitor_work(struct work_struct *work)
>  		container_of(work, struct ieee80211_sub_if_data,
>  			     u.mgd.monitor_work);
>
> +	if (sdata->local->sw_scanning || sdata->local->hw_scanning)
> +		return;
> +
>  	ieee80211_mgd_probe_ap(sdata, false);
>  }

Yes - this patch gets rid of the warnings.

Thanks,

Larry


2009-07-18 17:48:56

by Christian Lamparter

Subject: Re: WARN_ON at minstrel_get_rate (include/net/mac80211.h:2111)

On Saturday 18 July 2009 18:08:04 Christian Lamparter wrote:
> On Saturday 18 July 2009 17:03:18 Larry Finger wrote:
> > Thanks for the patch. I'll let you know what happens.
> I have my doubts. Do you still have the WARN_ON dump?
> or at least which of the two functions are listed in the dump?
> - rate_control_get_rate
> - rate_control_rate_init <- unlikely/really surprising

found similar WARNINGs on kerneloops:
http://www.kerneloops.org/raw.php?rawid=449304&msgid=
but the user has ath5k.

and there's more: iwlagn used to have this issue, but they fixed it.
---
commit c338ba3ca5bef2df2082d9e8d336ff7b2880c326
Author: Abbas, Mohamed <[email protected]>
Date: Wed Jan 21 10:58:02 2009 -0800

iwlwifi: fix rs_get_rate WARN_ON()

In ieee80211_sta structure there is u64 supp_rates[IEEE80211_NUM_BANDS]
this is filled with all support rate from assoc_resp. If we associate
with G-band AP only supp_rates of G-band will be set the other band
supp_rates will be set to 0. If the user type this command
this will cause mac80211 to set to new channel, mac80211
does not disassociate in setting new channel, so the active
band is now A-band. then in handling the new essid mac80211 will
kick in the assoc steps which involve sending disassociation frame.
in this mac80211 will WARN_ON sta->supp_rates[A_BAND] == 0.
---

Larry, can you confirm that the frame which triggers the WARN is a
disassociation frame?
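
To make the quoted commit message concrete, here is a minimal userspace sketch of a per-band supported-rates mask and a lowest-rate lookup. It is not the kernel code: `band_model`, `sta_model` and `lowest_rate_index_model` are hypothetical names, and returning -1 on an empty mask is this sketch's own convention, not a statement about what mac80211 returns.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical userspace stand-ins; not the kernel's definitions. */
enum band_model { BAND_2GHZ = 0, BAND_5GHZ = 1, NUM_BANDS_MODEL = 2 };

struct sta_model {
	/* One bit per rate index, kept separately for each band. */
	uint64_t supp_rates[NUM_BANDS_MODEL];
};

/*
 * Return the lowest rate index the peer supports on the given band,
 * or -1 (this sketch's own convention) when the mask is empty.
 */
int lowest_rate_index_model(const struct sta_model *sta,
			    enum band_model band, int n_bitrates)
{
	for (int i = 0; i < n_bitrates && i < 64; i++)
		if (sta->supp_rates[band] & (1ULL << i))
			return i;

	/* This is the situation the WARN_ON in this thread complains about. */
	fprintf(stderr, "no rate: n_bitrates %d, band %d, supp_rates 0x%llx\n",
		n_bitrates, (int)band,
		(unsigned long long)sta->supp_rates[band]);
	return -1;
}
```

A station that associated on 2.4GHz only has bits set in `supp_rates[BAND_2GHZ]`, so any lookup made while the active band is 5GHz finds nothing, which matches the "Band 1, supp_rates 0x0" test print.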

2009-07-18 22:33:00

by Christian Lamparter

Subject: Re: WARN_ON at minstrel_get_rate (include/net/mac80211.h:2111) with p54usb

On Saturday 18 July 2009 23:19:36 Bob Copeland wrote:
> On Sat, Jul 18, 2009 at 11:03 AM, Larry Finger<[email protected]> wrote:
> > Christian Lamparter wrote:
> >> On Saturday 18 July 2009 03:05:04 Larry Finger wrote:
> >>> Johannes and Christian,
> >>>
> >>> I am getting a WARN_ON from mac80211 in the location stated in the
> >>> subject. I put in some test prints and got the following:
> >>>
> >>> sband->n_bitrates 8, Band 1, supp_rates 0x0
> >> hmm, so something decides to talk to a 5GHz network here?
> >> But the AP doesn't have any available rates in that band!?
> >
> > Ah, band 1 is 5GHz, not 2.4. There are no 802.11a AP's in my neighborhood.
> >
> >> If my memory serves me right, some people have triggered the same WARN_ON
> >> with the ath5k driver as well...
> >> unfortunately, I didn't follow the thread and now I can't find it anymore.
> >>
> >> (btw: is your b43 11a capable as well?)
> >
> > Yes it is, but that section is software crippled.
> >
> > Thanks for the patch. I'll let you know what happens.
>
> In the ath5k case as well, I'm willing to bet it has something to do with dual
> band operation. IIRC at least some instances of this in the past were due to
> getting the rate after the band was changed, e.g. due to scanning, and the peer
> of course didn't support any rates on that band.
>
well, ieee80211_sta_monitor_work - which probes the AP every now and then -
didn't check if we're scanning.
The attached diff survives non-stop scanning without triggering the WARN in
rate_lowest_index even once.

However, I'm not so sure about the locking for hw_scanning and sw_scanning.
It looks like only scan.c manipulates them, under the scan mutex.
But then, do we need locking for a single-threaded workqueue? I guess not.
---
Larry,

here's another _fix_ which might even fix the problem after all ;-)

Regards,
Chr
---
diff --git a/net/mac80211/mlme.c b/net/mac80211/mlme.c
index 18dad22..4833e7c 100644
--- a/net/mac80211/mlme.c
+++ b/net/mac80211/mlme.c
@@ -2210,6 +2210,9 @@ static void ieee80211_sta_monitor_work(struct work_struct *work)
 		container_of(work, struct ieee80211_sub_if_data,
 			     u.mgd.monitor_work);

+	if (sdata->local->sw_scanning || sdata->local->hw_scanning)
+		return;
+
 	ieee80211_mgd_probe_ap(sdata, false);
 }
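
For readers who want to see the effect of the guard in isolation, the following is a rough userspace model of the patched work function. Everything here is an assumption made for illustration: `local_model`, `probe_ap_model` and `monitor_work_model` are made-up names, and the probe is reduced to a counter instead of a real `ieee80211_mgd_probe_ap()` call.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the two scan flags used in the diff above. */
struct local_model {
	bool sw_scanning;
	bool hw_scanning;
	int probes_sent;	/* counts simulated AP probes */
};

/* Stand-in for the AP probe; it only records that it was called. */
void probe_ap_model(struct local_model *local)
{
	local->probes_sent++;
}

/* Model of the patched monitor work: skip the probe while scanning. */
void monitor_work_model(struct local_model *local)
{
	if (local->sw_scanning || local->hw_scanning)
		return;

	probe_ap_model(local);
}
```

With either flag set the work item returns early, so no frame is sent while the hardware may sit on a channel or band where the AP has no usable rates. The model ignores locking entirely, so it says nothing about the open question on the scan mutex.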


2009-07-19 20:11:11

by Christian Lamparter

Subject: [PATCH] mac80211: do not monitor the connection while scanning

mac80211 periodically probes the associated AP to check whether it is
out of reach or dead.

This is fine most of the time, but not while a scan of the whole
neighborhood is running: the probe could then fire while we are on
a different channel. Even worse, it triggers a WARN_ON in
rate_lowest_index when the scan code has switched bands!
( http://www.kerneloops.org/raw.php?rawid=449304 )

Reported-by: Larry Finger <[email protected]>
Signed-off-by: Christian Lamparter <[email protected]>
Tested-by: Larry Finger <[email protected]>
---
diff --git a/net/mac80211/mlme.c b/net/mac80211/mlme.c
index 18dad22..4833e7c 100644
--- a/net/mac80211/mlme.c
+++ b/net/mac80211/mlme.c
@@ -2210,6 +2210,9 @@ static void ieee80211_sta_monitor_work(struct work_struct *work)
 		container_of(work, struct ieee80211_sub_if_data,
 			     u.mgd.monitor_work);

+	if (sdata->local->sw_scanning || sdata->local->hw_scanning)
+		return;
+
 	ieee80211_mgd_probe_ap(sdata, false);
 }


2009-07-18 16:08:07

by Christian Lamparter

Subject: Re: WARN_ON at minstrel_get_rate (include/net/mac80211.h:2111)

On Saturday 18 July 2009 17:03:18 Larry Finger wrote:
> Christian Lamparter wrote:
> > If my memory serves me right, some people have triggered the same WARN_ON
> > with the ath5k driver as well...
> > unfortunately, I didn't follow the thread and now I can't find it anymore.
anyone? Bob, Nick?

> Thanks for the patch. I'll let you know what happens.
I have my doubts. Do you still have the WARN_ON dump?
or at least which of the two functions are listed in the dump?
- rate_control_get_rate
- rate_control_rate_init <- unlikely/really surprising

Regards,
Chr

2009-07-18 21:19:37

by Bob Copeland

Subject: Re: WARN_ON at minstrel_get_rate (include/net/mac80211.h:2111) with p54usb

On Sat, Jul 18, 2009 at 11:03 AM, Larry Finger<[email protected]> wrote:
> Christian Lamparter wrote:
>> On Saturday 18 July 2009 03:05:04 Larry Finger wrote:
>>> Johannes and Christian,
>>>
>>> I am getting a WARN_ON from mac80211 in the location stated in the
>>> subject. I put in some test prints and got the following:
>>>
>>> sband->n_bitrates 8, Band 1, supp_rates 0x0
>> hmm, so something decides to talk to a 5GHz network here?
>> But the AP doesn't have any available rates in that band!?
>
> Ah, band 1 is 5GHz, not 2.4. There are no 802.11a AP's in my neighborhood.
>
>> If my memory serves me right, some people have triggered the same WARN_ON
>> with the ath5k driver as well...
>> unfortunately, I didn't follow the thread and now I can't find it anymore.
>>
>> (btw: is your b43 11a capable as well?)
>
> Yes it is, but that section is software crippled.
>
> Thanks for the patch. I'll let you know what happens.

In the ath5k case as well, I'm willing to bet it has something to do with dual
band operation. IIRC at least some instances of this in the past were due to
getting the rate after the band was changed, e.g. due to scanning, and the peer
of course didn't support any rates on that band.

--
Bob Copeland %% http://www.bobcopeland.com

2009-07-20 16:17:24

by Luis R. Rodriguez

Subject: Re: WARN_ON at minstrel_get_rate (include/net/mac80211.h:2111)

On Sat, Jul 18, 2009 at 10:48 AM, Christian Lamparter<[email protected]> wrote:
> On Saturday 18 July 2009 18:08:04 Christian Lamparter wrote:
>> On Saturday 18 July 2009 17:03:18 Larry Finger wrote:
>> > Thanks for the patch. I'll let you know what happens.
>> I have my doubts. Do you still have the WARN_ON dump?
>> or at least which of the two functions are listed in the dump?
>> - rate_control_get_rate
>> - rate_control_rate_init <- unlikely/really surprising
>
> found similar WARNINGs on kerneloops:
> http://www.kerneloops.org/raw.php?rawid=449304&msgid=
> but the user has ath5k.
>
> and there's more: iwlagn used to have this issue, but they fixed it.
> ---
> commit c338ba3ca5bef2df2082d9e8d336ff7b2880c326
> Author: Abbas, Mohamed <[email protected]>
> Date:   Wed Jan 21 10:58:02 2009 -0800
>
>    iwlwifi: fix rs_get_rate WARN_ON()
>
>    In ieee80211_sta structure there is u64 supp_rates[IEEE80211_NUM_BANDS]
>    this is filled with all support rate from assoc_resp.  If we associate
>    with G-band AP only supp_rates of G-band will be set the other band
>    supp_rates will be set to 0. If the user type this command
>    this will cause mac80211 to set to new channel, mac80211
>    does not disassociate in setting new channel, so the active
>    band is now A-band. then in handling the new essid mac80211 will
>    kick in the assoc steps which involve sending disassociation frame.
>    in this mac80211 will WARN_ON sta->supp_rates[A_BAND] == 0.

I revert this patch in my rate control cleanup series, the real issue
was underneath the hood in mac80211 and should no longer be present.
It was caused by scans being issued with the assumption a valid rate
will be found on a different band for the target sta entry (the AP).
In my patch, "mac80211: drop frames for sta with no valid rate" we
simply drop these frames now and warn whenever such attempts are being
made within mac80211.
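
As a rough illustration of the "drop frames for sta with no valid rate" idea, and not the actual patch, a transmit-path check could look like the sketch below. The names `tx_result_model` and `tx_check_rates_model` are hypothetical, and the sketch only models the decision, not the warning.

```c
#include <stdint.h>

/* Hypothetical result codes for this sketch only. */
enum tx_result_model { TX_CONTINUE_MODEL, TX_DROP_MODEL };

/*
 * Decide whether a frame to a station may be sent on the current band.
 * sta_supp_rates is the station's rate mask for that band and
 * n_bitrates is the size of the band's rate table.
 */
enum tx_result_model tx_check_rates_model(uint64_t sta_supp_rates,
					  unsigned int n_bitrates)
{
	/* Mask off bits beyond the band's rate table. */
	uint64_t band_mask = (n_bitrates >= 64) ?
		~0ULL : ((1ULL << n_bitrates) - 1);

	if ((sta_supp_rates & band_mask) == 0)
		return TX_DROP_MODEL;	/* no usable rate: drop the frame */

	return TX_CONTINUE_MODEL;
}
```

With the mask from the original report (supp_rates 0x0 and 8 bitrates on band 1) the check returns the drop result, so the frame never reaches rate control.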

Luis