2010-01-10 13:07:55

by Lennert Buytenhek

Subject: [PATCH] mac80211: flush workqueue before calling driver ->stop() method

Since commit "mwl8k: handle station database update for AP's sta entry
via ->sta_notify()", mwl8k every now and then gets a command timeout
when ifconfig'ing a STA interface down. This turns out to be due to
mwl8k_stop() being called while the work queue item that was scheduled
by mwl8k_sta_notify() to remove the STA entry for the associated AP is
still queued, and the former disables interrupts so that when the
latter eventually runs, a command completion interrupt is never seen.

Fix this by changing ieee80211_stop_device() so that the workqueue is
flushed before drv_stop() is called, instead of doing it the other way
around as is done now. (As ->stop() is allowed to sleep, there isn't
any reason for drivers to queue work from within it.)

Signed-off-by: Lennert Buytenhek <[email protected]>

diff --git a/net/mac80211/util.c b/net/mac80211/util.c
index bc73904..04680ca 100644
--- a/net/mac80211/util.c
+++ b/net/mac80211/util.c
@@ -1077,9 +1077,9 @@ void ieee80211_stop_device(struct ieee80211_local *local)
 	ieee80211_led_radio(local, false);

 	cancel_work_sync(&local->reconfig_filter);
-	drv_stop(local);

 	flush_workqueue(local->workqueue);
+	drv_stop(local);
 }

 int ieee80211_reconfig(struct ieee80211_local *local)
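
As a rough illustration of the ordering problem described in the patch above, here is a minimal user-space sketch. It is a toy model, not kernel or mwl8k code: every `toy_*` name is hypothetical, the "workqueue" is a plain array that is drained synchronously, and a disabled interrupt is modelled as a boolean flag that makes the queued command time out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: none of these names exist in mac80211 or mwl8k. */
#define TOY_MAX_WORK 8

struct toy_device {
	bool irq_enabled;	/* cleared by toy_drv_stop() */
	int cmds_completed;	/* commands whose completion was seen */
	int cmd_timeouts;	/* commands that never saw a completion */
};

typedef void (*toy_work_fn)(struct toy_device *dev);

struct toy_workqueue {
	toy_work_fn items[TOY_MAX_WORK];
	size_t count;
};

static bool toy_queue_work(struct toy_workqueue *wq, toy_work_fn fn)
{
	if (wq->count == TOY_MAX_WORK)
		return false;
	wq->items[wq->count++] = fn;
	return true;
}

/* Run every queued item, in order, then leave the queue empty. */
static void toy_flush_workqueue(struct toy_workqueue *wq,
				struct toy_device *dev)
{
	assert(wq->count <= TOY_MAX_WORK);
	for (size_t i = 0; i < wq->count; i++)
		wq->items[i](dev);
	wq->count = 0;
}

/* Stand-in for the queued "remove STA entry" work: it issues a
 * command that only completes if interrupts are still enabled. */
static void toy_remove_sta_work(struct toy_device *dev)
{
	if (dev->irq_enabled)
		dev->cmds_completed++;
	else
		dev->cmd_timeouts++;
}

/* Stand-in for the driver ->stop() method: disables interrupts. */
static void toy_drv_stop(struct toy_device *dev)
{
	dev->irq_enabled = false;
}

/* Returns the number of command timeouts seen for a given ordering. */
int toy_stop_device(bool flush_first)
{
	struct toy_device dev = { .irq_enabled = true };
	struct toy_workqueue wq = { .count = 0 };

	/* Work is already queued when the interface goes down. */
	toy_queue_work(&wq, toy_remove_sta_work);

	if (flush_first) {		/* ordering after the patch */
		toy_flush_workqueue(&wq, &dev);
		toy_drv_stop(&dev);
	} else {			/* ordering before the patch */
		toy_drv_stop(&dev);
		toy_flush_workqueue(&wq, &dev);
	}
	return dev.cmd_timeouts;
}
```

In this model, `toy_stop_device(false)` returns 1 (the work runs after interrupts are off and times out), while `toy_stop_device(true)` returns 0. The real workqueue runs items asynchronously, so the actual failure is intermittent rather than deterministic as it is here.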


2010-01-10 13:43:44

by Michael Büsch

Subject: Re: [PATCH] mac80211: flush workqueue before calling driver ->stop() method

On Sunday 10 January 2010 14:40:39 Johannes Berg wrote:
> See, it's not strictly forbidden to queue work, there's just no

Ok, well. Lennert's mail sounded different to me.

--
Greetings, Michael.

2010-01-10 13:46:02

by Johannes Berg

Subject: Re: [PATCH] mac80211: flush workqueue before calling driver ->stop() method

On Sun, 2010-01-10 at 14:43 +0100, Michael Buesch wrote:
> On Sunday 10 January 2010 14:40:39 Johannes Berg wrote:
> > See, it's not strictly forbidden to queue work, there's just no
>
> Ok, well. Lennert's mail sounded different to me.

Ok. But would you agree with my assertion? I don't see what flushing it
again would buy us since after stop, it doesn't seem to matter when it
gets executed.

johannes



2010-01-10 13:35:31

by Michael Büsch

Subject: Re: [PATCH] mac80211: flush workqueue before calling driver ->stop() method

On Sunday 10 January 2010 14:07:53 Lennert Buytenhek wrote:
> Since commit "mwl8k: handle station database update for AP's sta entry
> via ->sta_notify()", mwl8k every now and then gets a command timeout
> when ifconfig'ing a STA interface down. This turns out to be due to
> mwl8k_stop() being called while the work queue item that was scheduled
> by mwl8k_sta_notify() to remove the STA entry for the associated AP is
> still queued, and the former disables interrupts so that when the
> latter eventually runs, a command completion interrupt is never seen.
>
> Fix this by changing ieee80211_stop_device() so that the workqueue is
> flushed before drv_stop() is called, instead of doing it the other way
> around as is done now. (As ->stop() is allowed to sleep, there isn't
> any reason for drivers to queue work from within it.)

This smells like we should either:
o Add an assertion that checks whether the driver queued work although it was forbidden.
or
o Call flush_workqueue twice. Once before and once after drv_stop.

> Signed-off-by: Lennert Buytenhek <[email protected]>
>
> diff --git a/net/mac80211/util.c b/net/mac80211/util.c
> index bc73904..04680ca 100644
> --- a/net/mac80211/util.c
> +++ b/net/mac80211/util.c
> @@ -1077,9 +1077,9 @@ void ieee80211_stop_device(struct ieee80211_local *local)
>  	ieee80211_led_radio(local, false);
>
>  	cancel_work_sync(&local->reconfig_filter);
> -	drv_stop(local);
>
>  	flush_workqueue(local->workqueue);
> +	drv_stop(local);
>  }
>
>  int ieee80211_reconfig(struct ieee80211_local *local)

--
Greetings, Michael.
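
The two options listed in the message above can be pictured with another small user-space sketch. This is an assumption-laden toy, not mac80211 code: all `model_*` names are hypothetical, the workqueue is reduced to a counter of pending items, and the "assertion" is modelled as a flag that reports whether the stand-in ->stop() callback queued anything.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: none of these names exist in mac80211. */
struct model_wq {
	size_t pending;		/* queued, not yet executed work items */
	size_t executed;	/* work items that have run */
};

static void model_queue(struct model_wq *wq)
{
	wq->pending++;
}

static void model_flush(struct model_wq *wq)
{
	wq->executed += wq->pending;
	wq->pending = 0;
}

/* Stand-in for a driver ->stop() that may queue work itself. */
static void model_stop(struct model_wq *wq, bool stop_queues_work)
{
	if (stop_queues_work)
		model_queue(wq);
}

struct model_result {
	size_t left_pending;	/* still queued when stop_device returns */
	bool stop_queued_work;	/* what a check after ->stop() would see */
};

struct model_result model_stop_device(bool stop_queues_work,
				      bool flush_twice)
{
	/* One item is already queued before the device is stopped. */
	struct model_wq wq = { .pending = 1, .executed = 0 };
	struct model_result res;

	model_flush(&wq);		/* flush before ->stop() */
	assert(wq.pending == 0);
	model_stop(&wq, stop_queues_work);

	/* Option 1: detect work queued from within ->stop(). */
	res.stop_queued_work = wq.pending != 0;

	/* Option 2: flush a second time after ->stop(). */
	if (flush_twice)
		model_flush(&wq);

	res.left_pending = wq.pending;
	return res;
}
```

With a ->stop() stand-in that queues work, `model_stop_device(true, false)` leaves one item pending and sets `stop_queued_work`, while `model_stop_device(true, true)` leaves nothing pending. Whether leftover work matters in practice depends on what that work touches, which this model does not capture.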

2010-01-10 13:41:08

by Johannes Berg

Subject: Re: [PATCH] mac80211: flush workqueue before calling driver ->stop() method

On Sun, 2010-01-10 at 14:35 +0100, Michael Buesch wrote:
> On Sunday 10 January 2010 14:07:53 Lennert Buytenhek wrote:
> > Since commit "mwl8k: handle station database update for AP's sta entry
> > via ->sta_notify()", mwl8k every now and then gets a command timeout
> > when ifconfig'ing a STA interface down. This turns out to be due to
> > mwl8k_stop() being called while the work queue item that was scheduled
> > by mwl8k_sta_notify() to remove the STA entry for the associated AP is
> > still queued, and the former disables interrupts so that when the
> > latter eventually runs, a command completion interrupt is never seen.
> >
> > Fix this by changing ieee80211_stop_device() so that the workqueue is
> > flushed before drv_stop() is called, instead of doing it the other way
> > around as is done now. (As ->stop() is allowed to sleep, there isn't
> > any reason for drivers to queue work from within it.)
>
> This smells like we should either:
> o Add an assertion that checks whether the driver queued work although
> it was forbidden.
> or
> o Call flush_workqueue twice. Once before and once after drv_stop.

I don't think we need to do either.

See, it's not strictly forbidden to queue work, there's just no
guarantee when it will be executed. But since the driver is already
stopped, what kind of guarantee should it get anyway? And destroying the
workqueue will flush it anyway, if this is part of the unregistration
procedure. So, the driver would be stupid to queue work there that did
anything to the hw, but if it has some work that's purely software state
it shouldn't matter at all.

johannes



2010-01-10 13:50:42

by Michael Büsch

Subject: Re: [PATCH] mac80211: flush workqueue before calling driver ->stop() method

On Sunday 10 January 2010 14:45:58 Johannes Berg wrote:
> On Sun, 2010-01-10 at 14:43 +0100, Michael Buesch wrote:
> > On Sunday 10 January 2010 14:40:39 Johannes Berg wrote:
> > > See, it's not strictly forbidden to queue work, there's just no
> >
> > Ok, well. Lennert's mail sounded different to me.
>
> Ok. But would you agree with my assertion? I don't see what flushing it
> again would buy us since after stop, it doesn't seem to matter when it
> gets executed.

Yeah, well. I was just thinking about possible existing driver bugs that depend
on the current behavior of flushing the workqueue after stop. Those would probably
silently blow up.
And as I thought (I might have been wrong) that drv_stop callbacks were not
allowed to queue work, I thought it was a good idea to assert that.
In my experience work flush bugs are hard to track down and debug, so...

Well, not that this is important, but well...

--
Greetings, Michael.