2010-01-06 14:31:11

by Johannes Berg

Subject: [PATCH] mac80211: fix a few work bugs

Kalle and Lennert reported problems with the new work
code, and at least Kalle's problem I was able to trace
to a missing jiffies initialisation.

I also ran into a problem where occasionally I couldn't
connect, which seems fixed with kicking the work items
after scanning.

Finally, also add some sanity checking code to verify
that we're not adding work items while an interface is
down -- that case could lead to something similar to
what Lennert was seeing.

There still seems to be a race condition that we're
trying to figure out separately.

Signed-off-by: Johannes Berg <[email protected]>
---
net/mac80211/scan.c | 1 +
net/mac80211/work.c | 4 ++++
2 files changed, 5 insertions(+)

--- wireless-testing.orig/net/mac80211/scan.c 2010-01-06 09:50:40.000000000 +0100
+++ wireless-testing/net/mac80211/scan.c 2010-01-06 09:53:23.000000000 +0100
@@ -285,6 +285,7 @@ void ieee80211_scan_completed(struct iee
 	ieee80211_mlme_notify_scan_completed(local);
 	ieee80211_ibss_notify_scan_completed(local);
 	ieee80211_mesh_notify_scan_completed(local);
+	ieee80211_queue_work(&local->hw, &local->work_work);
 }
 EXPORT_SYMBOL(ieee80211_scan_completed);

--- wireless-testing.orig/net/mac80211/work.c 2010-01-06 09:51:03.000000000 +0100
+++ wireless-testing/net/mac80211/work.c 2010-01-06 11:30:26.000000000 +0100
@@ -818,6 +818,7 @@ static void ieee80211_work_work(struct w
 		    wk->chan == local->tmp_channel &&
 		    wk->chan_type == local->tmp_channel_type) {
 			wk->started = true;
+			wk->timeout = jiffies;
 		}
 
 		if (!wk->started && !local->tmp_channel) {
@@ -935,6 +936,9 @@ void ieee80211_add_work(struct ieee80211
 	if (WARN_ON(!wk->done))
 		return;
 
+	if (WARN_ON(!ieee80211_sdata_running(wk->sdata)))
+		return;
+
 	wk->started = false;
 
 	local = wk->sdata->local;
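
To illustrate the jiffies point from the description above, here is a
minimal user-space sketch; it is not the mac80211 code. It assumes the
work loop skips any item whose timeout still lies in the future, so an
item that is marked as started while carrying a stale timeout would sit
idle until that timeout passes. The names fake_jiffies, after(),
item_is_due(), mark_started() and self_check() are hypothetical;
after() only mimics the wraparound-safe comparison style of the
kernel's time_after().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space stand-in for the kernel's jiffies counter. */
static unsigned long fake_jiffies;

/* Wraparound-safe "a is after b", in the style of time_after(). */
static bool after(unsigned long a, unsigned long b)
{
	return (long)(b - a) < 0;
}

/* Simplified work item; only the two fields relevant here. */
struct work_item {
	bool started;
	unsigned long timeout;
};

/* Assumption: a started item is processed once its timeout is not in the future. */
static bool item_is_due(const struct work_item *wk)
{
	return wk->started && !after(wk->timeout, fake_jiffies);
}

/* Mark an item as started, with or without resetting its timeout. */
static void mark_started(struct work_item *wk, bool init_timeout)
{
	wk->started = true;
	if (init_timeout)
		wk->timeout = fake_jiffies;	/* the one-line fix, in sketch form */
}

/* Usage: compare the behaviour with and without the initialisation. */
static void self_check(void)
{
	struct work_item wk = { .started = false, .timeout = 1000 + 5000 };

	fake_jiffies = 1000;

	mark_started(&wk, false);	/* stale timeout kept */
	assert(!item_is_due(&wk));	/* item would wait needlessly */

	mark_started(&wk, true);	/* timeout reset to "now" */
	assert(item_is_due(&wk));	/* item is handled on this pass */
}
```

Under that assumption, resetting the timeout when the item is marked as
started makes it eligible immediately instead of after a leftover delay.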




2010-01-07 06:51:00

by Kalle Valo

Subject: Re: [PATCH] mac80211: fix a few work bugs

Kalle Valo <[email protected]> writes:

> Johannes Berg <[email protected]> writes:
>
>> Kalle and Lennert reported problems with the new work
>> code, and at least Kalle's problem I was able to trace
>> to a missing jiffies initialisation.
>>
>> I also ran into a problem where occasionally I couldn't
>> connect, which seems fixed with kicking the work items
>> after scanning.

[...]

> Because I'm currently travelling, I have tested this only once. Will
> test more when I get back home.

I have now tested this by rebooting a few times and I didn't notice any
problems. So the patch really does seem to help.


--
Kalle Valo

2010-01-06 14:32:08

by Lennert Buytenhek

Subject: Re: [PATCH] mac80211: fix a few work bugs

On Wed, Jan 06, 2010 at 03:30:58PM +0100, Johannes Berg wrote:

> Kalle and Lennert reported problems with the new work
> code, and at least Kalle's problem I was able to trace
> to a missing jiffies initialisation.
>
> I also ran into a problem where occasionally I couldn't
> connect, which seems fixed with kicking the work items
> after scanning.
>
> Finally, also add some sanity checking code to verify
> that we're not adding work items while an interface is
> down -- that case could lead to something similar to
> what Lennert was seeing.
>
> There still seems to be a race condition that we're
> trying to figure out separately.
>
> Signed-off-by: Johannes Berg <[email protected]>

Tested-by: Lennert Buytenhek <[email protected]>
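
The sanity check quoted above (refusing to add work items while the
interface is down) can be sketched in user space as follows. This is
only an illustration under stated assumptions, not the mac80211
implementation: struct iface, add_work(), noop_done() and
guard_self_check() are hypothetical names, and the sketch returns an
error code where the patch uses WARN_ON() and a plain return.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an interface that is either up or down. */
struct iface {
	bool running;
};

struct work_item {
	struct iface *iface;
	void (*done)(struct work_item *wk);
	bool started;
	bool queued;
};

/* Completion callback that does nothing; used only by the usage example. */
static void noop_done(struct work_item *wk)
{
	(void)wk;
}

/* Returns 0 if the item was queued, -1 if it was rejected. */
static int add_work(struct work_item *wk)
{
	if (wk->done == NULL)		/* mirrors the existing !wk->done check */
		return -1;

	if (!wk->iface->running)	/* mirrors the new "interface down" check */
		return -1;

	wk->started = false;
	wk->queued = true;
	return 0;
}

/* Usage: an item on a running interface is queued, one on a stopped interface is not. */
static void guard_self_check(void)
{
	struct iface up = { .running = true };
	struct iface down = { .running = false };
	struct work_item a = { .iface = &up, .done = noop_done };
	struct work_item b = { .iface = &down, .done = noop_done };

	assert(add_work(&a) == 0 && a.queued);
	assert(add_work(&b) == -1 && !b.queued);
}
```

Rejecting the item before it is queued means nothing is left behind for
the worker to process against an interface that is no longer running.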

2010-01-06 16:07:15

by Kalle Valo

Subject: Re: [PATCH] mac80211: fix a few work bugs

Johannes Berg <[email protected]> writes:

> Kalle and Lennert reported problems with the new work
> code, and at least Kalle's problem I was able to trace
> to a missing jiffies initialisation.
>
> I also ran into a problem where occasionally I couldn't
> connect, which seems fixed with kicking the work items
> after scanning.
>
> Finally, also add some sanity checking code to verify
> that we're not adding work items while an interface is
> down -- that case could lead to something similar to
> what Lennert was seeing.
>
> There still seems to be a race condition that we're
> trying to figure out separately.

This seems to help with my problem; it associated quickly:

Jan 6 18:05:36 tikku kernel: [ 153.527515] wlan0: direct probe responded
Jan 6 18:05:36 tikku kernel: [ 153.536061] wlan0: authenticate with 00:1e:ab:10:e0:c2 (try 1)
Jan 6 18:05:36 tikku kernel: [ 153.537910] wlan0: authenticated
Jan 6 18:05:36 tikku kernel: [ 153.537924] phy0: device now idle
Jan 6 18:05:36 tikku kernel: [ 153.538563] wlan0: associate with 00:1e:ab:10:e0:c2 (try 1)
Jan 6 18:05:36 tikku kernel: [ 153.538578] phy0: device no longer idle - working
Jan 6 18:05:36 tikku kernel: [ 153.578808] wlan0: RX AssocResp from 00:1e:ab:10:e0:c2 (capab=0x11 status=0 aid=1)
Jan 6 18:05:36 tikku kernel: [ 153.578814] wlan0: associated


Because I'm currently travelling, I have tested this only once. Will
test more when I get back home.

Thanks for the patch.

> Signed-off-by: Johannes Berg <[email protected]>

Tested-by: Kalle Valo <[email protected]>

--
Kalle Valo