2010-04-09 10:12:38

by Teemu Paasikivi

[permalink] [raw]
Subject: [PATCH] mac80211: check whether scan is in progress before queueing scan_work

From: Paasikivi Teemu.3 (EXT-Ixonos/Tampere) <[email protected]>

As scan_work is queued from work_work it needs to be checked if scan
has been started during execution of work_work. Otherwise, when hw
scan is used, the stack gets error about hw being busy with ongoing
scan. This causes the stack to abort scan without notifying the driver
about it. This leads to a situation where the hw is scanning and the stack
thinks it's not. Then when the scan finishes, the stack will complain by
warnings.

Signed-off-by: Teemu Paasikivi <[email protected]>
Reviewed-by: Johannes Berg <[email protected]>
---
net/mac80211/work.c | 7 ++++++-
1 files changed, 6 insertions(+), 1 deletions(-)

diff --git a/net/mac80211/work.c b/net/mac80211/work.c
index 1e1ea30..7bd8670 100644
--- a/net/mac80211/work.c
+++ b/net/mac80211/work.c
@@ -919,11 +919,16 @@ static void ieee80211_work_work(struct work_struct *work)
run_again(local, jiffies + HZ/2);
}

- if (list_empty(&local->work_list) && local->scan_req)
+ mutex_lock(&local->scan_mtx);
+
+ if (list_empty(&local->work_list) && local->scan_req &&
+ !local->scanning)
ieee80211_queue_delayed_work(&local->hw,
&local->scan_work,
round_jiffies_relative(0));

+ mutex_unlock(&local->scan_mtx);
+
mutex_unlock(&local->work_mtx);

ieee80211_recalc_idle(local);
--
1.5.6.3



2010-04-06 09:08:04

by Johannes Berg

[permalink] [raw]
Subject: Re: [PATCH] mac80211: check whether scan is in progress before queueing scan_work

On Tue, 2010-04-06 at 11:06 +0200, Johannes Berg wrote:

> > - if (list_empty(&local->work_list) && local->scan_req)
> > + mutex_lock(&local->scan_mtx);
> > +
> > + if (list_empty(&local->work_list) && local->scan_req &&
> > + !local->scanning)
> > ieee80211_queue_delayed_work(&local->hw,
> > &local->scan_work,
> > round_jiffies_relative(0));
> >
> > + mutex_unlock(&local->scan_mtx);
> > +
> > mutex_unlock(&local->work_mtx);
>
> That really doesn't look right ... at all.

Oops, confused the two locks.

Still though, wouldn't the scan work simply not do anything?

johannes


2010-04-06 16:23:24

by Dan Williams

[permalink] [raw]
Subject: Re: [PATCH] mac80211: check whether scan is in progress before queueing scan_work

On Tue, 2010-04-06 at 14:05 +0300, Teemu Paasikivi wrote:
> On Tue, 2010-04-06 at 12:29 +0200, ext Johannes Berg wrote:
> > On Tue, 2010-04-06 at 12:27 +0200, Johannes Berg wrote:
> > > On Tue, 2010-04-06 at 13:16 +0300, Teemu Paasikivi wrote:
> > > > On Tue, 2010-04-06 at 11:06 +0200, ext Johannes Berg wrote:
> > > > > On Tue, 2010-04-06 at 11:54 +0300, Teemu Paasikivi wrote:
> > > > > > As scan_work is queued from work_work it needs to be checked if scan
> > > > > > has been started during execution of work_work. Otherwise, when hw
> > > > > > scan is used, the stack gets error about hw being busy with ongoing
> > > > > > scan.
> > > > >
> > > > > Does that mean we ask the driver to scan twice? And your particular
> > > > > driver returns busy?
> > > > >
> > > >
> > > > Yes. There seems to be a possibility, that when ieee80211_work_work is
> > > > being executed, __ieee80211_start_scan gets called and it starts a hw
> > > > scan, and also sets local->hw_scan_req etc. (because it looks like
> > > > there's some holes in use of scan_mtx). Result is that work_work queues
> > > > ieee80211_scan_work and when it is executed, as there's already
> > > > hw_scan_req set, it will try to start hw scan again. And as the driver
> > > > used returns busy, scan_work will call ieee80211_scan_completed function
> > > > which leaves the driver (hw more precisely) scanning and the stack
> > > > thinking that it is not anymore scanning.
> > > >
> > > > Obviously this kind of situation doesn't happen very often in normal
> > > > use, but it can be caused quite easily by associating to access points
> > > > in a loop while running scan in another loop.
> > >
> > > Makes some sense. Ignore my other email(s), I looked at this in more
> > > detail now.
> > >
> > > It would appear that we need to fix some of the locking here,
> > > potentially simply using a single mutex for both work and scan?
> >
> > I'm mostly worried about deadlocks between work_mtx and scan_mtx really,
> > when we acquire them both like your patch.
> >
>
> Yes, I'm bit worried about possibility of a possible deadlock too. One
> solution would be to move the block acquiring scan_mtx and queueing
> scan_work outside of the work_mtx locked section, I'll take a look on
> that. In any case, this patch won't most likely fix all issues with the
> locking here. For example there's that checking if scan is in progress
> in the beginning of the work_work. After that it is still possible that
> the scan is started. So there's obviously need for some fixing in the
> locking yet to be done.

One issue would be if the new scan request has different parameters than
the in-progress scan request. This is especially important when the
incoming scan request is for a specific SSID (hidden network probing)
while the ongoing one is not. Can we make sure that the patch at least
returns the error up the stack when the incoming scan is dropped?
Otherwise userspace has no idea that the specific SSID scan was dropped
on the floor, and just thinks that the hidden SSID wasn't found.

Dan



2010-04-06 10:28:03

by Johannes Berg

[permalink] [raw]
Subject: Re: [PATCH] mac80211: check whether scan is in progress before queueing scan_work

On Tue, 2010-04-06 at 13:16 +0300, Teemu Paasikivi wrote:
> On Tue, 2010-04-06 at 11:06 +0200, ext Johannes Berg wrote:
> > On Tue, 2010-04-06 at 11:54 +0300, Teemu Paasikivi wrote:
> > > As scan_work is queued from work_work it needs to be checked if scan
> > > has been started during execution of work_work. Otherwise, when hw
> > > scan is used, the stack gets error about hw being busy with ongoing
> > > scan.
> >
> > Does that mean we ask the driver to scan twice? And your particular
> > driver returns busy?
> >
>
> Yes. There seems to be a possibility, that when ieee80211_work_work is
> being executed, __ieee80211_start_scan gets called and it starts a hw
> scan, and also sets local->hw_scan_req etc. (because it looks like
> there's some holes in use of scan_mtx). Result is that work_work queues
> ieee80211_scan_work and when it is executed, as there's already
> hw_scan_req set, it will try to start hw scan again. And as the driver
> used returns busy, scan_work will call ieee80211_scan_completed function
> which leaves the driver (hw more precisely) scanning and the stack
> thinking that it is not anymore scanning.
>
> Obviously this kind of situation doesn't happen very often in normal
> use, but it can be caused quite easily by associating to access points
> in a loop while running scan in another loop.

Makes some sense. Ignore my other email(s), I looked at this in more
detail now.

It would appear that we need to fix some of the locking here,
potentially simply using a single mutex for both work and scan?

johannes


2010-04-06 17:17:57

by Johannes Berg

[permalink] [raw]
Subject: Re: [PATCH] mac80211: check whether scan is in progress before queueing scan_work

On Tue, 2010-04-06 at 09:23 -0700, Dan Williams wrote:

> > Yes, I'm bit worried about possibility of a possible deadlock too. One
> > solution would be to move the block acquiring scan_mtx and queueing
> > scan_work outside of the work_mtx locked section, I'll take a look on
> > that. In any case, this patch won't most likely fix all issues with the
> > locking here. For example there's that checking if scan is in progress
> > in the beginning of the work_work. After that it is still possible that
> > the scan is started. So there's obviously need for some fixing in the
> > locking yet to be done.
>
> One issue would be if the new scan request has different parameters than
> the in-progress scan request. This is especially important when the
> incoming scan request is for a specific SSID (hidden network probing)
> while the ongoing one is not. Can we make sure that the patch at least
> returns the error up the stack when the incoming scan is dropped?
> Otherwise userspace has no idea that the specific SSID scan was dropped
> on the floor, and just thinks that the hidden SSID wasn't found.

There's no scan request being dropped -- it's just that we try to ask
the hw to do the same scan twice without noticing that we've already
asked it.

johannes


2010-04-06 10:21:48

by Teemu Paasikivi

[permalink] [raw]
Subject: Re: [PATCH] mac80211: check whether scan is in progress before queueing scan_work

On Tue, 2010-04-06 at 11:06 +0200, ext Johannes Berg wrote:
> On Tue, 2010-04-06 at 11:54 +0300, Teemu Paasikivi wrote:
> > As scan_work is queued from work_work it needs to be checked if scan
> > has been started during execution of work_work. Otherwise, when hw
> > scan is used, the stack gets error about hw being busy with ongoing
> > scan.
>
> Does that mean we ask the driver to scan twice? And your particular
> driver returns busy?
>

Yes. There seems to be a possibility, that when ieee80211_work_work is
being executed, __ieee80211_start_scan gets called and it starts a hw
scan, and also sets local->hw_scan_req etc. (because it looks like
there's some holes in use of scan_mtx). Result is that work_work queues
ieee80211_scan_work and when it is executed, as there's already
hw_scan_req set, it will try to start hw scan again. And as the driver
used returns busy, scan_work will call ieee80211_scan_completed function
which leaves the driver (hw more precisely) scanning and the stack
thinking that it is not anymore scanning.

Obviously this kind of situation doesn't happen very often in normal
use, but it can be caused quite easily by associating to access points
in a loop while running scan in another loop.


> > This causes the stack to abort scan without notifying the driver
> > about it. This leads to a situation where the hw is scanning and the stack
> > thinks it's not. Then when the scan finishes, the stack will complain by
> > warnings.
> >
> > Signed-off-by: Teemu Paasikivi <[email protected]>
> > ---
> > net/mac80211/work.c | 7 ++++++-
> > 1 files changed, 6 insertions(+), 1 deletions(-)
> >
> > diff --git a/net/mac80211/work.c b/net/mac80211/work.c
> > index 1e1ea30..7bd8670 100644
> > --- a/net/mac80211/work.c
> > +++ b/net/mac80211/work.c
> > @@ -919,11 +919,16 @@ static void ieee80211_work_work(struct work_struct *work)
> > run_again(local, jiffies + HZ/2);
> > }
> >
> > - if (list_empty(&local->work_list) && local->scan_req)
> > + mutex_lock(&local->scan_mtx);
> > +
> > + if (list_empty(&local->work_list) && local->scan_req &&
> > + !local->scanning)
> > ieee80211_queue_delayed_work(&local->hw,
> > &local->scan_work,
> > round_jiffies_relative(0));
> >
> > + mutex_unlock(&local->scan_mtx);
> > +
> > mutex_unlock(&local->work_mtx);
>
> That really doesn't look right ... at all.
>
> johannes
>

Teemu



2010-04-07 06:13:40

by Teemu Paasikivi

[permalink] [raw]
Subject: Re: [PATCH] mac80211: check whether scan is in progress before queueing scan_work

On Tue, 2010-04-06 at 12:29 +0200, ext Johannes Berg wrote:
> On Tue, 2010-04-06 at 12:27 +0200, Johannes Berg wrote:
> > On Tue, 2010-04-06 at 13:16 +0300, Teemu Paasikivi wrote:
> > > On Tue, 2010-04-06 at 11:06 +0200, ext Johannes Berg wrote:
> > > > On Tue, 2010-04-06 at 11:54 +0300, Teemu Paasikivi wrote:
> > > > > As scan_work is queued from work_work it needs to be checked if scan
> > > > > has been started during execution of work_work. Otherwise, when hw
> > > > > scan is used, the stack gets error about hw being busy with ongoing
> > > > > scan.
> > > >
> > > > Does that mean we ask the driver to scan twice? And your particular
> > > > driver returns busy?
> > > >
> > >
> > > Yes. There seems to be a possibility, that when ieee80211_work_work is
> > > being executed, __ieee80211_start_scan gets called and it starts a hw
> > > scan, and also sets local->hw_scan_req etc. (because it looks like
> > > there's some holes in use of scan_mtx). Result is that work_work queues
> > > ieee80211_scan_work and when it is executed, as there's already
> > > hw_scan_req set, it will try to start hw scan again. And as the driver
> > > used returns busy, scan_work will call ieee80211_scan_completed function
> > > which leaves the driver (hw more precisely) scanning and the stack
> > > thinking that it is not anymore scanning.
> > >
> > > Obviously this kind of situation doesn't happen very often in normal
> > > use, but it can be caused quite easily by associating to access points
> > > in a loop while running scan in another loop.
> >
> > Makes some sense. Ignore my other email(s), I looked at this in more
> > detail now.
> >
> > It would appear that we need to fix some of the locking here,
> > potentially simply using a single mutex for both work and scan?
>
> I'm mostly worried about deadlocks between work_mtx and scan_mtx really,
> when we acquire them both like your patch.
>

I sent a new version of the patch to the linux-wireless. Now both
mutexes are not locked at the same time.


Br,

Teemu



2010-04-06 09:06:10

by Johannes Berg

[permalink] [raw]
Subject: Re: [PATCH] mac80211: check whether scan is in progress before queueing scan_work

On Tue, 2010-04-06 at 11:54 +0300, Teemu Paasikivi wrote:
> As scan_work is queued from work_work it needs to be checked if scan
> has been started during execution of work_work. Otherwise, when hw
> scan is used, the stack gets error about hw being busy with ongoing
> scan.

Does that mean we ask the driver to scan twice? And your particular
driver returns busy?

> This causes the stack to abort scan without notifying the driver
> about it. This leads to a situation where the hw is scanning and the stack
> thinks it's not. Then when the scan finishes, the stack will complain by
> warnings.
>
> Signed-off-by: Teemu Paasikivi <[email protected]>
> ---
> net/mac80211/work.c | 7 ++++++-
> 1 files changed, 6 insertions(+), 1 deletions(-)
>
> diff --git a/net/mac80211/work.c b/net/mac80211/work.c
> index 1e1ea30..7bd8670 100644
> --- a/net/mac80211/work.c
> +++ b/net/mac80211/work.c
> @@ -919,11 +919,16 @@ static void ieee80211_work_work(struct work_struct *work)
> run_again(local, jiffies + HZ/2);
> }
>
> - if (list_empty(&local->work_list) && local->scan_req)
> + mutex_lock(&local->scan_mtx);
> +
> + if (list_empty(&local->work_list) && local->scan_req &&
> + !local->scanning)
> ieee80211_queue_delayed_work(&local->hw,
> &local->scan_work,
> round_jiffies_relative(0));
>
> + mutex_unlock(&local->scan_mtx);
> +
> mutex_unlock(&local->work_mtx);

That really doesn't look right ... at all.

johannes


2010-04-09 08:09:22

by Johannes Berg

[permalink] [raw]
Subject: Re: [PATCH] mac80211: check whether scan is in progress before queueing scan_work

On Tue, 2010-04-06 at 11:54 +0300, Teemu Paasikivi wrote:
> As scan_work is queued from work_work it needs to be checked if scan
> has been started during execution of work_work. Otherwise, when hw
> scan is used, the stack gets error about hw being busy with ongoing
> scan. This causes the stack to abort scan without notifying the driver
> about it. This leads to a situation where the hw is scanning and the stack
> thinks it's not. Then when the scan finishes, the stack will complain by
> warnings.

Actually, let's go with this patch. It should solve both Luis's and your
problem, and I just reviewed the locking -- currently scan_mtx and
work_mtx are completely independent.

Reviewed-by: Johannes Berg <[email protected]>

> Signed-off-by: Teemu Paasikivi <[email protected]>
> ---
> net/mac80211/work.c | 7 ++++++-
> 1 files changed, 6 insertions(+), 1 deletions(-)
>
> diff --git a/net/mac80211/work.c b/net/mac80211/work.c
> index 1e1ea30..7bd8670 100644
> --- a/net/mac80211/work.c
> +++ b/net/mac80211/work.c
> @@ -919,11 +919,16 @@ static void ieee80211_work_work(struct work_struct *work)
> run_again(local, jiffies + HZ/2);
> }
>
> - if (list_empty(&local->work_list) && local->scan_req)
> + mutex_lock(&local->scan_mtx);
> +
> + if (list_empty(&local->work_list) && local->scan_req &&
> + !local->scanning)
> ieee80211_queue_delayed_work(&local->hw,
> &local->scan_work,
> round_jiffies_relative(0));
>
> + mutex_unlock(&local->scan_mtx);
> +
> mutex_unlock(&local->work_mtx);
>
> ieee80211_recalc_idle(local);



2010-04-06 11:10:27

by Teemu Paasikivi

[permalink] [raw]
Subject: Re: [PATCH] mac80211: check whether scan is in progress before queueing scan_work

On Tue, 2010-04-06 at 12:29 +0200, ext Johannes Berg wrote:
> On Tue, 2010-04-06 at 12:27 +0200, Johannes Berg wrote:
> > On Tue, 2010-04-06 at 13:16 +0300, Teemu Paasikivi wrote:
> > > On Tue, 2010-04-06 at 11:06 +0200, ext Johannes Berg wrote:
> > > > On Tue, 2010-04-06 at 11:54 +0300, Teemu Paasikivi wrote:
> > > > > As scan_work is queued from work_work it needs to be checked if scan
> > > > > has been started during execution of work_work. Otherwise, when hw
> > > > > scan is used, the stack gets error about hw being busy with ongoing
> > > > > scan.
> > > >
> > > > Does that mean we ask the driver to scan twice? And your particular
> > > > driver returns busy?
> > > >
> > >
> > > Yes. There seems to be a possibility, that when ieee80211_work_work is
> > > being executed, __ieee80211_start_scan gets called and it starts a hw
> > > scan, and also sets local->hw_scan_req etc. (because it looks like
> > > there's some holes in use of scan_mtx). Result is that work_work queues
> > > ieee80211_scan_work and when it is executed, as there's already
> > > hw_scan_req set, it will try to start hw scan again. And as the driver
> > > used returns busy, scan_work will call ieee80211_scan_completed function
> > > which leaves the driver (hw more precisely) scanning and the stack
> > > thinking that it is not anymore scanning.
> > >
> > > Obviously this kind of situation doesn't happen very often in normal
> > > use, but it can be caused quite easily by associating to access points
> > > in a loop while running scan in another loop.
> >
> > Makes some sense. Ignore my other email(s), I looked at this in more
> > detail now.
> >
> > It would appear that we need to fix some of the locking here,
> > potentially simply using a single mutex for both work and scan?
>
> I'm mostly worried about deadlocks between work_mtx and scan_mtx really,
> when we acquire them both like your patch.
>

Yes, I'm bit worried about possibility of a possible deadlock too. One
solution would be to move the block acquiring scan_mtx and queueing
scan_work outside of the work_mtx locked section, I'll take a look on
that. In any case, this patch won't most likely fix all issues with the
locking here. For example there's that checking if scan is in progress
in the beginning of the work_work. After that it is still possible that
the scan is started. So there's obviously need for some fixing in the
locking yet to be done.


Teemu



2010-04-06 18:07:48

by Dan Williams

[permalink] [raw]
Subject: Re: [PATCH] mac80211: check whether scan is in progress before queueing scan_work

On Tue, 2010-04-06 at 19:17 +0200, Johannes Berg wrote:
> On Tue, 2010-04-06 at 09:23 -0700, Dan Williams wrote:
>
> > > Yes, I'm bit worried about possibility of a possible deadlock too. One
> > > solution would be to move the block acquiring scan_mtx and queueing
> > > scan_work outside of the work_mtx locked section, I'll take a look on
> > > that. In any case, this patch won't most likely fix all issues with the
> > > locking here. For example there's that checking if scan is in progress
> > > in the beginning of the work_work. After that it is still possible that
> > > the scan is started. So there's obviously need for some fixing in the
> > > locking yet to be done.
> >
> > One issue would be if the new scan request has different parameters than
> > the in-progress scan request. This is especially important when the
> > incoming scan request is for a specific SSID (hidden network probing)
> > while the ongoing one is not. Can we make sure that the patch at least
> > returns the error up the stack when the incoming scan is dropped?
> > Otherwise userspace has no idea that the specific SSID scan was dropped
> > on the floor, and just thinks that the hidden SSID wasn't found.
>
> There's no scan request being dropped -- it's just that we try to ask
> the hw to do the same scan twice without noticing that we've already
> asked it.

Aha; thanks for the clarification.

Dan



2010-04-06 10:29:25

by Johannes Berg

[permalink] [raw]
Subject: Re: [PATCH] mac80211: check whether scan is in progress before queueing scan_work

On Tue, 2010-04-06 at 12:27 +0200, Johannes Berg wrote:
> On Tue, 2010-04-06 at 13:16 +0300, Teemu Paasikivi wrote:
> > On Tue, 2010-04-06 at 11:06 +0200, ext Johannes Berg wrote:
> > > On Tue, 2010-04-06 at 11:54 +0300, Teemu Paasikivi wrote:
> > > > As scan_work is queued from work_work it needs to be checked if scan
> > > > has been started during execution of work_work. Otherwise, when hw
> > > > scan is used, the stack gets error about hw being busy with ongoing
> > > > scan.
> > >
> > > Does that mean we ask the driver to scan twice? And your particular
> > > driver returns busy?
> > >
> >
> > Yes. There seems to be a possibility, that when ieee80211_work_work is
> > being executed, __ieee80211_start_scan gets called and it starts a hw
> > scan, and also sets local->hw_scan_req etc. (because it looks like
> > there's some holes in use of scan_mtx). Result is that work_work queues
> > ieee80211_scan_work and when it is executed, as there's already
> > hw_scan_req set, it will try to start hw scan again. And as the driver
> > used returns busy, scan_work will call ieee80211_scan_completed function
> > which leaves the driver (hw more precisely) scanning and the stack
> > thinking that it is not anymore scanning.
> >
> > Obviously this kind of situation doesn't happen very often in normal
> > use, but it can be caused quite easily by associating to access points
> > in a loop while running scan in another loop.
>
> Makes some sense. Ignore my other email(s), I looked at this in more
> detail now.
>
> It would appear that we need to fix some of the locking here,
> potentially simply using a single mutex for both work and scan?

I'm mostly worried about deadlocks between work_mtx and scan_mtx really,
when we acquire them both like your patch.

johannes