Changes in mac/cfg80211 to notify userspace
of NL80211_CMD_SCHED_SCAN_STOPPED only when the driver
actually reported back that it was stopped. Also
blocks other attempts until we're really done.
This fixes a scenario where stop sched scan
is issued and immediately afterwards a new sched scan
is requested (e.g. with other parameters).
Current state caused a race where the driver started
stopping the sched scan but didn't finish and got
another sched scan request which it couldn't handle.
Eyal Shapira (2):
mac80211: mark stopped sched scan only after driver does
nl80211: report stopped sched scan only after driver does
net/mac80211/scan.c | 10 +++-------
net/wireless/scan.c | 17 ++++++++---------
2 files changed, 11 insertions(+), 16 deletions(-)
--
1.7.4.1
On Wed, 2011-12-21 at 12:11 +0200, Eyal Shapira wrote:
> Changes in mac/cfg80211 to notify userspace
> of NL80211_CMD_SCHED_SCAN_STOPPED only when the driver
> actually reported back that it was stopped. Also
> blocks other attempts until we're really done.
>
> This fixes a scenario where stop sched scan
> is issued and immediately afterwards a new sched scan
> is requested (e.g. with other parameters).
> Current state caused a race where the driver started
> stopping the sched scan but didn't finish and got
> another sched scan request which it couldn't handle.
Luca really needs to take a close look at this, and I think you should
also take a look to see if this kind of API change can be avoided.
johannes
On Sun, 2011-12-25 at 01:49 +0200, Eyal Shapira wrote:
> On Sun, Dec 25, 2011 at 1:48 AM, Eyal Shapira <[email protected]> wrote:
> > On Fri, Dec 23, 2011 at 10:46 AM, Luciano Coelho <[email protected]> wrote:
> >>
> >> Could the code be changed so that we delay the STOP_SCHED_SCAN command
> >> completion instead? Then userspace can rely on that (as it should
> >> anyway, because the command can fail) instead of having to wait for the
> >> stopped event (which is a change in the API).
> >>
> > Sounds like this would be the best way to go and I'll do that. So no need
> > for these patches. I misunderstood the STOP_SCHED_SCAN NL API to be async (due to the NL events) but it can and should be synchronous AFAIU.
> > (and might sleep / block).
Yep, this sounds better.
> > The only mac80211 change I'd be glad to add is to make sched_scan_stop op return
> > a value so errors can be reported back like it's being done in the sched_scan_stop cfg80211 op
Returning errors from driver errors is not exactly the same things as we
already do now. If sched_scan_stop in cfg80211 fails it is either
because the userspace screwed up (and sent a stop when it was not
running) or because someone else already stopped the scan. In both
cases, it is okay, because the userspace can just ignore it and be sure
the scan is stopped.
Now, if the driver fails for some other reason and we return the error
to the userspace, how should it react? In this case it would mean that
the scan is still running, but should the userspace try again? Or
ignore? Hard to know what to do. (This was Johannes's original opinion
about returning errors there, IIRC).
--
Cheers,
Luca.
On Sun, Dec 25, 2011 at 10:22 AM, Luciano Coelho <[email protected]> wrote:
> On Sun, 2011-12-25 at 01:49 +0200, Eyal Shapira wrote:
>> On Sun, Dec 25, 2011 at 1:48 AM, Eyal Shapira <[email protected]> wrote:
>> > On Fri, Dec 23, 2011 at 10:46 AM, Luciano Coelho <[email protected]> wrote:
>> >>
>> >> Could the code be changed so that we delay the STOP_SCHED_SCAN command
>> >> completion instead? Then userspace can rely on that (as it should
>> >> anyway, because the command can fail) instead of having to wait for the
>> >> stopped event (which is a change in the API).
>> >>
>> > Sounds like this would be the best way to go and I'll do that. So no need
>> > for these patches. I misunderstood the STOP_SCHED_SCAN NL API to be async (due to the NL events) but it can and should be synchronous AFAIU.
>> > (and might sleep / block).
>
> Yep, this sounds better.
>
>> > The only mac80211 change I'd be glad to add is to make sched_scan_stop op return
>> > a value so errors can be reported back like it's being done in the sched_scan_stop cfg80211 op
>
> Returning errors from driver errors is not exactly the same things as we
> already do now. ?If sched_scan_stop in cfg80211 fails it is either
> because the userspace screwed up (and sent a stop when it was not
> running) or because someone else already stopped the scan. ?In both
> cases, it is okay, because the userspace can just ignore it and be sure
> the scan is stopped.
>
> Now, if the driver fails for some other reason and we return the error
> to the userspace, how should it react? In this case it would mean that
> the scan is still running, but should the userspace try again? Or
> ignore? Hard to know what to do. ?(This was Johannes's original opinion
> about returning errors there, IIRC).
>
I was referring to sched_scan_stop op in cfg80211_ops which can return
an int. This is used by ath6kl for example in
ath6kl_cfg80211_sscan_stop() to return -EIO
in case there's an error in stopping so userspace might get that.
What's the difference between this and having our driver return an error ?
Since there might be real errors which would prevent us from stopping
, isn't it better
to report to userspace (and let it decide what to do - retry, ignore,
exit program, panic ,....)
other than just silently ignore it and let the userspace think that it
succeeded ?
> --
> Cheers,
> Luca.
>
On Sun, 2011-12-25 at 12:16 +0200, Eyal Shapira wrote:
> On Sun, Dec 25, 2011 at 10:22 AM, Luciano Coelho <[email protected]> wrote:
> > On Sun, 2011-12-25 at 01:49 +0200, Eyal Shapira wrote:
> >> On Sun, Dec 25, 2011 at 1:48 AM, Eyal Shapira <[email protected]> wrote:
> >> > On Fri, Dec 23, 2011 at 10:46 AM, Luciano Coelho <[email protected]> wrote:
> >> >>
> >> >> Could the code be changed so that we delay the STOP_SCHED_SCAN command
> >> >> completion instead? Then userspace can rely on that (as it should
> >> >> anyway, because the command can fail) instead of having to wait for the
> >> >> stopped event (which is a change in the API).
> >> >>
> >> > Sounds like this would be the best way to go and I'll do that. So no need
> >> > for these patches. I misunderstood the STOP_SCHED_SCAN NL API to be async (due to the NL events) but it can and should be synchronous AFAIU.
> >> > (and might sleep / block).
> >
> > Yep, this sounds better.
> >
> >> > The only mac80211 change I'd be glad to add is to make sched_scan_stop op return
> >> > a value so errors can be reported back like it's being done in the sched_scan_stop cfg80211 op
> >
> > Returning errors from driver errors is not exactly the same things as we
> > already do now. If sched_scan_stop in cfg80211 fails it is either
> > because the userspace screwed up (and sent a stop when it was not
> > running) or because someone else already stopped the scan. In both
> > cases, it is okay, because the userspace can just ignore it and be sure
> > the scan is stopped.
> >
> > Now, if the driver fails for some other reason and we return the error
> > to the userspace, how should it react? In this case it would mean that
> > the scan is still running, but should the userspace try again? Or
> > ignore? Hard to know what to do. (This was Johannes's original opinion
> > about returning errors there, IIRC).
> >
> I was referring to sched_scan_stop op in cfg80211_ops which can return
> an int. This is used by ath6kl for example in
> ath6kl_cfg80211_sscan_stop() to return -EIO
> in case there's an error in stopping so userspace might get that.
Okay, so ath6kl does it like this and userspace will have to handle it.
The reason why we didn't do with mac80211 was the one I explained
before.
> What's the difference between this and having our driver return an error ?
What you're proposing for mac80211 is the same as ath6kl is doing, so
you're right.
The difference I meant here is that the errors that mac80211 returns are
easier for the userspace to react. It means userspace itself did
something wrong. But it doesn't propagate errors in the hardware (or
lower driver).
> Since there might be real errors which would prevent us from stopping
> , isn't it better
> to report to userspace (and let it decide what to do - retry, ignore,
> exit program, panic ,....)
> other than just silently ignore it and let the userspace think that it
> succeeded ?
Yes, I tend to agree. If we really can't do anything in the driver,
it's probably better to return the error up, instead of just hiding it.
Johannes is the one to ack this change, though, not me. :)
--
Cheers,
Luca.
Change internal state in mac80211 only after the driver
reports that sched scan was actually stopped.
Signed-off-by: Eyal Shapira <[email protected]>
---
v2: the driver should always call ieee80211_sched_scan_stopped
so a single kfree in the stopped flow is enough (Thanks Luca)
net/mac80211/scan.c | 10 +++-------
1 files changed, 3 insertions(+), 7 deletions(-)
diff --git a/net/mac80211/scan.c b/net/mac80211/scan.c
index 2c5041c..99a38df 100644
--- a/net/mac80211/scan.c
+++ b/net/mac80211/scan.c
@@ -875,7 +875,7 @@ out:
int ieee80211_request_sched_scan_stop(struct ieee80211_sub_if_data *sdata)
{
struct ieee80211_local *local = sdata->local;
- int ret = 0, i;
+ int ret = 0;
mutex_lock(&sdata->local->mtx);
@@ -884,13 +884,9 @@ int ieee80211_request_sched_scan_stop(struct ieee80211_sub_if_data *sdata)
goto out;
}
- if (local->sched_scanning) {
- for (i = 0; i < IEEE80211_NUM_BANDS; i++)
- kfree(local->sched_scan_ies.ie[i]);
-
+ if (local->sched_scanning)
drv_sched_scan_stop(local, sdata);
- local->sched_scanning = false;
- }
+
out:
mutex_unlock(&sdata->local->mtx);
--
1.7.4.1
On Wed, 2011-12-21 at 16:10 +0100, Johannes Berg wrote:
> On Wed, 2011-12-21 at 12:11 +0200, Eyal Shapira wrote:
> > Changes in mac/cfg80211 to notify userspace
> > of NL80211_CMD_SCHED_SCAN_STOPPED only when the driver
> > actually reported back that it was stopped. Also
> > blocks other attempts until we're really done.
> >
> > This fixes a scenario where stop sched scan
> > is issued and immediately afterwards a new sched scan
> > is requested (e.g. with other parameters).
> > Current state caused a race where the driver started
> > stopping the sched scan but didn't finish and got
> > another sched scan request which it couldn't handle.
>
> Luca really needs to take a close look at this, and I think you should
> also take a look to see if this kind of API change can be avoided.
I had discussed this with Eyal briefly and at first it seemed ok. But
actually I now agree that this is an API change.
Looking closer, I see that in the current documentation, there's nothing
stating that after sending sched_scan_stop the userspace needs to wait
for the stopped event. It just says that "[this] event is also sent
when the %NL80211_CMD_STOP_SCHED_SCAN command is received [...]".
Applications may assume that after the stop command completes, it is
really stopped, without waiting for the stopped event.
Could the code be changed so that we delay the STOP_SCHED_SCAN command
completion instead? Then userspace can rely on that (as it should
anyway, because the command can fail) instead of having to wait for the
stopped event (which is a change in the API).
--
Cheers,
Luca.
On Sun, Dec 25, 2011 at 1:48 AM, Eyal Shapira <[email protected]> wrote:
>
>
>
> On Fri, Dec 23, 2011 at 10:46 AM, Luciano Coelho <[email protected]> wrote:
>>
>> On Wed, 2011-12-21 at 16:10 +0100, Johannes Berg wrote:
>> > On Wed, 2011-12-21 at 12:11 +0200, Eyal Shapira wrote:
>> > > Changes in mac/cfg80211 to notify userspace
>> > > of NL80211_CMD_SCHED_SCAN_STOPPED only when the driver
>> > > actually reported back that it was stopped. Also
>> > > blocks other attempts until we're really done.
>> > >
>> > > This fixes a scenario where stop sched scan
>> > > is issued and immediately afterwards a new sched scan
>> > > is requested (e.g. with other parameters).
>> > > Current state caused a race where the driver started
>> > > stopping the sched scan but didn't finish and got
>> > > another sched scan request which it couldn't handle.
>> >
>> > Luca really needs to take a close look at this, and I think you should
>> > also take a look to see if this kind of API change can be avoided.
>>
>> I had discussed this with Eyal briefly and at first it seemed ok. ?But
>> actually I now agree that this is an API change.
>>
>> Looking closer, I see that in the current documentation, there's nothing
>> stating that after sending sched_scan_stop the userspace needs to wait
>> for the stopped event. ?It just says that "[this] event is also sent
>> when the %NL80211_CMD_STOP_SCHED_SCAN command is received [...]".
>> Applications may assume that after the stop command completes, it is
>> really stopped, without waiting for the stopped event.
>>
>> Could the code be changed so that we delay the STOP_SCHED_SCAN command
>> completion instead? Then userspace can rely on that (as it should
>> anyway, because the command can fail) instead of having to wait for the
>> stopped event (which is a change in the API).
>>
> Sounds like this would be the best way to go and I'll do that. So no need
> for these patches.?I misunderstood the STOP_SCHED_SCAN NL API to be async (due to the NL events) but it can and should be synchronous AFAIU.
> (and might sleep / block).
>
> The only mac80211 change I'd be glad to add is to make sched_scan_stop op return
> a value so errors can be reported back like it's being done in the sched_scan_stop cfg80211 op
>
>> --
>> Cheers,
>> Luca.
>>
>
Report NL80211_CMD_SCHED_SCAN_STOPPED and change internal
states in cfg80211 only after the driver reports that
sched scan was actually stopped.
Signed-off-by: Eyal Shapira <[email protected]>
---
v2: No changes
net/wireless/scan.c | 17 ++++++++---------
1 files changed, 8 insertions(+), 9 deletions(-)
diff --git a/net/wireless/scan.c b/net/wireless/scan.c
index 31119e3..d6c87d9 100644
--- a/net/wireless/scan.c
+++ b/net/wireless/scan.c
@@ -133,6 +133,7 @@ EXPORT_SYMBOL(cfg80211_sched_scan_stopped);
int __cfg80211_stop_sched_scan(struct cfg80211_registered_device *rdev,
bool driver_initiated)
{
+ int err = 0;
struct net_device *dev;
lockdep_assert_held(&rdev->sched_scan_mtx);
@@ -143,17 +144,15 @@ int __cfg80211_stop_sched_scan(struct cfg80211_registered_device *rdev,
dev = rdev->sched_scan_req->dev;
if (!driver_initiated) {
- int err = rdev->ops->sched_scan_stop(&rdev->wiphy, dev);
- if (err)
- return err;
+ err = rdev->ops->sched_scan_stop(&rdev->wiphy, dev);
+ } else {
+ nl80211_send_sched_scan(rdev, dev,
+ NL80211_CMD_SCHED_SCAN_STOPPED);
+ kfree(rdev->sched_scan_req);
+ rdev->sched_scan_req = NULL;
}
- nl80211_send_sched_scan(rdev, dev, NL80211_CMD_SCHED_SCAN_STOPPED);
-
- kfree(rdev->sched_scan_req);
- rdev->sched_scan_req = NULL;
-
- return 0;
+ return err;
}
static void bss_release(struct kref *ref)
--
1.7.4.1