2009-07-31 16:15:04

by Maxim Levitsky

Subject: [PATCH 000/002] Fix frequent reconnects caused by new connection monitor

Hi, here is the updated version of the two patches that fix the
$SUBJECT issue.

I attach them (in case the mailer mangles them) and also send them as replies.

Tested with both a low-quality signal and beacon loss.
Lack of TX is now detected every 30 seconds, and quite reliably.
Lack of beacons still triggers a probe every 2 seconds, as before.



Best regards,
Maxim Levitsky


Attachments:
0001--MAC80211-Retry-probe-request-few-times.patch (3.87 kB)
0002--MAC80211-Increase-timeouts-for-station-polling.patch (1.23 kB)

2009-07-31 16:21:12

by Johannes Berg

Subject: Re: [PATCH 001/002] [MAC80211] Retry probe request few times

On Fri, 2009-07-31 at 19:14 +0300, Maxim Levitsky wrote:
> >From 0bf5749f2878f9245b8fb1b64456386374205225 Mon Sep 17 00:00:00 2001
> From: Maxim Levitsky <[email protected]>
> Date: Fri, 31 Jul 2009 18:54:12 +0300
> Subject: [PATCH] [MAC80211] Retry probe request few times
>
> Retry 5 times (chosen arbitary ), before assuming
> that station is out of range.
>
> Fixes frequent disassociations while connected to weak,
> and sometimes even strong access points.

Looks good, thanks.

Acked-by: Johannes Berg <[email protected]>

> Signed-off-by: Maxim Levitky <[email protected]>
> ---
> net/mac80211/ieee80211_i.h | 1 +
> net/mac80211/mlme.c | 42 ++++++++++++++++++++++++++++++------------
> 2 files changed, 31 insertions(+), 12 deletions(-)
>
> diff --git a/net/mac80211/ieee80211_i.h b/net/mac80211/ieee80211_i.h
> index aec6853..bca7b60 100644
> --- a/net/mac80211/ieee80211_i.h
> +++ b/net/mac80211/ieee80211_i.h
> @@ -280,6 +280,7 @@ struct ieee80211_if_managed {
> struct work_struct beacon_loss_work;
>
> unsigned long probe_timeout;
> + int probe_send_count;
>
> struct mutex mtx;
> struct ieee80211_bss *associated;
> diff --git a/net/mac80211/mlme.c b/net/mac80211/mlme.c
> index ee83125..1d8640a 100644
> --- a/net/mac80211/mlme.c
> +++ b/net/mac80211/mlme.c
> @@ -31,6 +31,7 @@
> #define IEEE80211_AUTH_MAX_TRIES 3
> #define IEEE80211_ASSOC_TIMEOUT (HZ / 5)
> #define IEEE80211_ASSOC_MAX_TRIES 3
> +#define IEEE80211_MAX_PROBE_TRIES 5
>
> /*
> * beacon loss detection timeout
> @@ -1156,11 +1157,24 @@ void ieee80211_sta_rx_notify(struct ieee80211_sub_if_data *sdata,
> round_jiffies_up(jiffies + IEEE80211_CONNECTION_IDLE_TIME));
> }
>
> +static void ieee80211_mgd_probe_ap_send(struct ieee80211_sub_if_data *sdata)
> +{
> + struct ieee80211_if_managed *ifmgd = &sdata->u.mgd;
> + const u8 *ssid;
> +
> + ssid = ieee80211_bss_get_ie(&ifmgd->associated->cbss, WLAN_EID_SSID);
> + ieee80211_send_probe_req(sdata, ifmgd->associated->cbss.bssid,
> + ssid + 2, ssid[1], NULL, 0);
> +
> + ifmgd->probe_send_count++;
> + ifmgd->probe_timeout = jiffies + IEEE80211_PROBE_WAIT;
> + run_again(ifmgd, ifmgd->probe_timeout);
> +}
> +
> static void ieee80211_mgd_probe_ap(struct ieee80211_sub_if_data *sdata,
> bool beacon)
> {
> struct ieee80211_if_managed *ifmgd = &sdata->u.mgd;
> - const u8 *ssid;
> bool already = false;
>
> if (!netif_running(sdata->dev))
> @@ -1203,18 +1217,12 @@ static void ieee80211_mgd_probe_ap(struct ieee80211_sub_if_data *sdata,
> if (already)
> goto out;
>
> - ifmgd->probe_timeout = jiffies + IEEE80211_PROBE_WAIT;
> -
> mutex_lock(&sdata->local->iflist_mtx);
> ieee80211_recalc_ps(sdata->local, -1);
> mutex_unlock(&sdata->local->iflist_mtx);
>
> - ssid = ieee80211_bss_get_ie(&ifmgd->associated->cbss, WLAN_EID_SSID);
> - ieee80211_send_probe_req(sdata, ifmgd->associated->cbss.bssid,
> - ssid + 2, ssid[1], NULL, 0);
> -
> - run_again(ifmgd, ifmgd->probe_timeout);
> -
> + ifmgd->probe_send_count = 0;
> + ieee80211_mgd_probe_ap_send(sdata);
> out:
> mutex_unlock(&ifmgd->mtx);
> }
> @@ -2072,17 +2080,27 @@ static void ieee80211_sta_work(struct work_struct *work)
> if (ifmgd->flags & (IEEE80211_STA_BEACON_POLL |
> IEEE80211_STA_CONNECTION_POLL) &&
> ifmgd->associated) {
> + u8 bssid[ETH_ALEN];
> +
> + memcpy(bssid, ifmgd->associated->cbss.bssid, ETH_ALEN);
> if (time_is_after_jiffies(ifmgd->probe_timeout))
> run_again(ifmgd, ifmgd->probe_timeout);
> - else {
> - u8 bssid[ETH_ALEN];
> +
> + else if (ifmgd->probe_send_count < IEEE80211_MAX_PROBE_TRIES) {
> +#ifdef CONFIG_MAC80211_VERBOSE_DEBUG
> + printk(KERN_DEBUG "No probe response from AP %pM"
> + " after %dms, try %d\n", bssid,
> + (1000 * IEEE80211_PROBE_WAIT)/HZ,
> + ifmgd->probe_send_count);
> +#endif
> + ieee80211_mgd_probe_ap_send(sdata);
> + } else {
> /*
> * We actually lost the connection ... or did we?
> * Let's make sure!
> */
> ifmgd->flags &= ~(IEEE80211_STA_CONNECTION_POLL |
> IEEE80211_STA_BEACON_POLL);
> - memcpy(bssid, ifmgd->associated->cbss.bssid, ETH_ALEN);
> printk(KERN_DEBUG "No probe response from AP %pM"
> " after %dms, disconnecting.\n",
> bssid, (1000 * IEEE80211_PROBE_WAIT)/HZ);



2009-07-31 19:08:18

by Maxim Levitsky

Subject: Re: [PATCH 000/002] Fix frequent reconnects caused by new connection monitor

On Fri, 2009-07-31 at 11:52 -0700, reinette chatre wrote:
> On Fri, 2009-07-31 at 09:13 -0700, Maxim Levitsky wrote:
> > Hi, here is the updated version of these two patches that fix the
> > $SUBJECT issue.
> >
> > I attach these (in case mailer mangles them), and reply with patches.
> >
> > Tested both with low quality signal, and beacon loss.
> > Lack of TX is found, every 30 seconds now, and quite reliable.
> > Lack of beacons, triggers probe like it did every 2 seconds.
>
> Thanks!
>
> I've been running with this for two hours now with no disconnects. This
> is where before the patches I would get disconnected after a few
> minutes. I did get two "No probe response from AP xx:xx:xx:xx:xx:xx
> after 500ms, try 1" messages in my log.
This is normal, or at least can be normal. I patched mac80211 to
display this message when there is a probe timeout; instead of
disconnecting, it now retries, currently 5 times, and this can be
increased further if necessary.
(These messages only appear in the logs when verbose mac80211 debugging
is enabled.)

I don't know exactly why probes aren't answered, but I strongly suspect
that my AP sometimes 'goes out to lunch' and then answers, since
typically after a failed probe it sends many replies.
(Or it could be some buffering done by the iwl3945 microcode.) I currently
can't monitor the connection from outside, but as soon as I can, I will
check whether the above is true. Nevertheless, if signal quality isn't
great, there are valid reasons for probe loss, and it shouldn't cause all
the fuss. Since I use WPA2, every reconnection causes a whole WPA
handshake to be performed, which takes at least 2 seconds; if a
reconnection happens every 5 seconds, it gets very annoying and
almost unusable.

And polling every 2 seconds, one way or another, is too much anyway,
I think.


Best regards,
Maxim Levitsky

>
> Reinette
>
>


2009-07-31 18:52:18

by Reinette Chatre

Subject: Re: [PATCH 000/002] Fix frequent reconnects caused by new connection monitor

On Fri, 2009-07-31 at 09:13 -0700, Maxim Levitsky wrote:
> Hi, here is the updated version of these two patches that fix the
> $SUBJECT issue.
>
> I attach these (in case mailer mangles them), and reply with patches.
>
> Tested both with low quality signal, and beacon loss.
> Lack of TX is found, every 30 seconds now, and quite reliable.
> Lack of beacons, triggers probe like it did every 2 seconds.

Thanks!

I've been running with this for two hours now with no disconnects. This
is where before the patches I would get disconnected after a few
minutes. I did get two "No probe response from AP xx:xx:xx:xx:xx:xx
after 500ms, try 1" messages in my log.

Reinette



2009-07-31 16:15:10

by Maxim Levitsky

Subject: [PATCH 001/002] [MAC80211] Retry probe request few times

From 0bf5749f2878f9245b8fb1b64456386374205225 Mon Sep 17 00:00:00 2001
From: Maxim Levitsky <[email protected]>
Date: Fri, 31 Jul 2009 18:54:12 +0300
Subject: [PATCH] [MAC80211] Retry probe request few times

Retry 5 times (chosen arbitrarily) before assuming
that the station is out of range.

Fixes frequent disassociations while connected to weak,
and sometimes even strong access points.

Signed-off-by: Maxim Levitsky <[email protected]>
---
net/mac80211/ieee80211_i.h | 1 +
net/mac80211/mlme.c | 42 ++++++++++++++++++++++++++++++------------
2 files changed, 31 insertions(+), 12 deletions(-)

diff --git a/net/mac80211/ieee80211_i.h b/net/mac80211/ieee80211_i.h
index aec6853..bca7b60 100644
--- a/net/mac80211/ieee80211_i.h
+++ b/net/mac80211/ieee80211_i.h
@@ -280,6 +280,7 @@ struct ieee80211_if_managed {
struct work_struct beacon_loss_work;

unsigned long probe_timeout;
+ int probe_send_count;

struct mutex mtx;
struct ieee80211_bss *associated;
diff --git a/net/mac80211/mlme.c b/net/mac80211/mlme.c
index ee83125..1d8640a 100644
--- a/net/mac80211/mlme.c
+++ b/net/mac80211/mlme.c
@@ -31,6 +31,7 @@
#define IEEE80211_AUTH_MAX_TRIES 3
#define IEEE80211_ASSOC_TIMEOUT (HZ / 5)
#define IEEE80211_ASSOC_MAX_TRIES 3
+#define IEEE80211_MAX_PROBE_TRIES 5

/*
* beacon loss detection timeout
@@ -1156,11 +1157,24 @@ void ieee80211_sta_rx_notify(struct ieee80211_sub_if_data *sdata,
round_jiffies_up(jiffies + IEEE80211_CONNECTION_IDLE_TIME));
}

+static void ieee80211_mgd_probe_ap_send(struct ieee80211_sub_if_data *sdata)
+{
+ struct ieee80211_if_managed *ifmgd = &sdata->u.mgd;
+ const u8 *ssid;
+
+ ssid = ieee80211_bss_get_ie(&ifmgd->associated->cbss, WLAN_EID_SSID);
+ ieee80211_send_probe_req(sdata, ifmgd->associated->cbss.bssid,
+ ssid + 2, ssid[1], NULL, 0);
+
+ ifmgd->probe_send_count++;
+ ifmgd->probe_timeout = jiffies + IEEE80211_PROBE_WAIT;
+ run_again(ifmgd, ifmgd->probe_timeout);
+}
+
static void ieee80211_mgd_probe_ap(struct ieee80211_sub_if_data *sdata,
bool beacon)
{
struct ieee80211_if_managed *ifmgd = &sdata->u.mgd;
- const u8 *ssid;
bool already = false;

if (!netif_running(sdata->dev))
@@ -1203,18 +1217,12 @@ static void ieee80211_mgd_probe_ap(struct ieee80211_sub_if_data *sdata,
if (already)
goto out;

- ifmgd->probe_timeout = jiffies + IEEE80211_PROBE_WAIT;
-
mutex_lock(&sdata->local->iflist_mtx);
ieee80211_recalc_ps(sdata->local, -1);
mutex_unlock(&sdata->local->iflist_mtx);

- ssid = ieee80211_bss_get_ie(&ifmgd->associated->cbss, WLAN_EID_SSID);
- ieee80211_send_probe_req(sdata, ifmgd->associated->cbss.bssid,
- ssid + 2, ssid[1], NULL, 0);
-
- run_again(ifmgd, ifmgd->probe_timeout);
-
+ ifmgd->probe_send_count = 0;
+ ieee80211_mgd_probe_ap_send(sdata);
out:
mutex_unlock(&ifmgd->mtx);
}
@@ -2072,17 +2080,27 @@ static void ieee80211_sta_work(struct work_struct *work)
if (ifmgd->flags & (IEEE80211_STA_BEACON_POLL |
IEEE80211_STA_CONNECTION_POLL) &&
ifmgd->associated) {
+ u8 bssid[ETH_ALEN];
+
+ memcpy(bssid, ifmgd->associated->cbss.bssid, ETH_ALEN);
if (time_is_after_jiffies(ifmgd->probe_timeout))
run_again(ifmgd, ifmgd->probe_timeout);
- else {
- u8 bssid[ETH_ALEN];
+
+ else if (ifmgd->probe_send_count < IEEE80211_MAX_PROBE_TRIES) {
+#ifdef CONFIG_MAC80211_VERBOSE_DEBUG
+ printk(KERN_DEBUG "No probe response from AP %pM"
+ " after %dms, try %d\n", bssid,
+ (1000 * IEEE80211_PROBE_WAIT)/HZ,
+ ifmgd->probe_send_count);
+#endif
+ ieee80211_mgd_probe_ap_send(sdata);
+ } else {
/*
* We actually lost the connection ... or did we?
* Let's make sure!
*/
ifmgd->flags &= ~(IEEE80211_STA_CONNECTION_POLL |
IEEE80211_STA_BEACON_POLL);
- memcpy(bssid, ifmgd->associated->cbss.bssid, ETH_ALEN);
printk(KERN_DEBUG "No probe response from AP %pM"
" after %dms, disconnecting.\n",
bssid, (1000 * IEEE80211_PROBE_WAIT)/HZ);
--
1.6.0.4
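
For readers following the control flow, the retry logic of the patch above can be sketched as a small standalone C file. This is a userspace illustration under stated assumptions, not kernel code: `struct probe_state`, `probe_start`, `probe_timeout` and `probes_until_disconnect` are hypothetical names, and only the counter handling (reset on a new poll, increment on every send, compare against the maximum on every timeout) is taken from the diff.

```c
#include <assert.h>
#include <stdbool.h>

/* Mirrors IEEE80211_MAX_PROBE_TRIES from the patch. */
#define MAX_PROBE_TRIES 5

/* Hypothetical stand-in for the probe-related fields of
 * struct ieee80211_if_managed. */
struct probe_state {
    int probe_send_count; /* probe requests sent in the current poll */
    bool polling;         /* true while a poll is outstanding */
    bool connected;       /* cleared once we give up on the AP */
};

enum probe_action { PROBE_RESEND, PROBE_DISCONNECT };

/* Start a poll: reset the counter, then send the first probe request
 * (the send itself is modelled only by the counter increment). */
static void probe_start(struct probe_state *s)
{
    s->probe_send_count = 0;
    s->polling = true;
    s->probe_send_count++;
}

/* Called when the probe wait expired without a response. */
static enum probe_action probe_timeout(struct probe_state *s)
{
    assert(s->polling);
    if (s->probe_send_count < MAX_PROBE_TRIES) {
        s->probe_send_count++; /* resend the probe request */
        return PROBE_RESEND;
    }
    s->polling = false;
    s->connected = false;
    return PROBE_DISCONNECT;
}

/* Number of probe requests sent before giving up on an AP that
 * never answers. */
static int probes_until_disconnect(void)
{
    struct probe_state s = { 0, false, true };

    probe_start(&s);
    while (probe_timeout(&s) == PROBE_RESEND)
        ;
    return s.probe_send_count;
}
```

In this model an AP that never answers receives five probe requests: four timeouts lead to a resend and the fifth leads to the disconnect.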




2009-07-31 16:21:54

by Johannes Berg

Subject: Re: [PATCH 002/002] [MAC80211] Increase timeouts for station polling

On Fri, 2009-07-31 at 19:17 +0300, Maxim Levitsky wrote:
> >From 04976d22d45f26aa4b4dece5dd520e3347ac32d7 Mon Sep 17 00:00:00 2001
> From: Maxim Levitsky <[email protected]>
> Date: Fri, 31 Jul 2009 18:54:23 +0300
> Subject: [PATCH] [MAC80211] Increase timeouts for station polling
>
> Do a probe request every 30 seconds, and wait for probe response, half a second
> This should lower the traffic that card sends, thus save power

Indeed. We just tested that :)

Acked-by: Johannes Berg <[email protected]>

>
> Signed-off-by: Maxim Levitsky <[email protected]>
> ---
> net/mac80211/mlme.c | 4 ++--
> 1 files changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/net/mac80211/mlme.c b/net/mac80211/mlme.c
> index 1d8640a..e4bb590 100644
> --- a/net/mac80211/mlme.c
> +++ b/net/mac80211/mlme.c
> @@ -42,13 +42,13 @@
> * Time the connection can be idle before we probe
> * it to see if we can still talk to the AP.
> */
> -#define IEEE80211_CONNECTION_IDLE_TIME (2 * HZ)
> +#define IEEE80211_CONNECTION_IDLE_TIME (30 * HZ)
> /*
> * Time we wait for a probe response after sending
> * a probe request because of beacon loss or for
> * checking the connection still works.
> */
> -#define IEEE80211_PROBE_WAIT (HZ / 5)
> +#define IEEE80211_PROBE_WAIT (HZ / 2)
>
> #define TMR_RUNNING_TIMER 0
> #define TMR_RUNNING_CHANSW 1



2009-07-31 19:27:29

by Marcel Holtmann

Subject: Re: [PATCH 000/002] Fix frequent reconnects caused by new connection monitor

Hi Maxim,

> > > Hi, here is the updated version of these two patches that fix the
> > > $SUBJECT issue.
> > >
> > > I attach these (in case mailer mangles them), and reply with patches.
> > >
> > > Tested both with low quality signal, and beacon loss.
> > > Lack of TX is found, every 30 seconds now, and quite reliable.
> > > Lack of beacons, triggers probe like it did every 2 seconds.
> >
> > Thanks!
> >
> > I've been running with this for two hours now with no disconnects. This
> > is where before the patches I would get disconnected after a few
> > minutes. I did get two "No probe response from AP xx:xx:xx:xx:xx:xx
> > after 500ms, try 1" messages in my log.
> This is normal, or at least can be normal, I patched the driver to
> display this message, when there is a probe timeout, but instead of
> disconnect, it retries, currently 5 times, but this can be even further
> increased is necessarily.
> (these messages are only in logs when verbose mac debugging is enabled)
>
> I don't know exactly why probes aren't answered, but I strongly suspect
> that my AP sometimes 'goes out to lunch' and then answers, since
> typically after a failed probe it sends many replies.
> (Or it could be some buffering done by iwl3945 microcode). I currently
> can't monitor the connection from outside, but as soon as I can I see
> whether the above is true. Nevertheless if signal quality isn't great,
> there are valid reasons for probe loss, and it shouldn't cause all the
> fuzz (and since I use WPA2, every reconnection causes whole WPA
> handshake to be preformed, and this takes at least 2 seconds, and if a
> reconnection happens each 5 seconds, it gets very very annoying, and
> almost unusable.

I am testing your patches and so far so good. Seems to be working
perfectly fine. I see this in the logs:

[41027.333419] wlan0: detected beacon loss from AP - sending probe request
[41027.389260] wlan0: cancelling probereq poll due to a received beacon
[41027.793518] No probe response from AP 00:1c:f0:62:88:5b after 500ms, try 1
[41028.292731] No probe response from AP 00:1c:f0:62:88:5b after 500ms, try 2

Need to watch out if this pattern emerges and if the beacon loss trigger
might give us an indication. Maybe the ucode is just not ready then.

Regards

Marcel



2009-07-31 16:17:07

by Maxim Levitsky

Subject: [PATCH 002/002] [MAC80211] Increase timeouts for station polling

From 04976d22d45f26aa4b4dece5dd520e3347ac32d7 Mon Sep 17 00:00:00 2001
From: Maxim Levitsky <[email protected]>
Date: Fri, 31 Jul 2009 18:54:23 +0300
Subject: [PATCH] [MAC80211] Increase timeouts for station polling

Do a probe request every 30 seconds, and wait half a second for the probe response.
This should lower the traffic that the card sends, and thus save power.
Waiting longer for a response makes the probe more robust against 'slow' access points.

Signed-off-by: Maxim Levitsky <[email protected]>
---
net/mac80211/mlme.c | 4 ++--
1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/mac80211/mlme.c b/net/mac80211/mlme.c
index 1d8640a..e4bb590 100644
--- a/net/mac80211/mlme.c
+++ b/net/mac80211/mlme.c
@@ -42,13 +42,13 @@
* Time the connection can be idle before we probe
* it to see if we can still talk to the AP.
*/
-#define IEEE80211_CONNECTION_IDLE_TIME (2 * HZ)
+#define IEEE80211_CONNECTION_IDLE_TIME (30 * HZ)
/*
* Time we wait for a probe response after sending
* a probe request because of beacon loss or for
* checking the connection still works.
*/
-#define IEEE80211_PROBE_WAIT (HZ / 5)
+#define IEEE80211_PROBE_WAIT (HZ / 2)

#define TMR_RUNNING_TIMER 0
#define TMR_RUNNING_CHANSW 1
--
1.6.0.4
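
The effect of the two new constants can be checked with a little arithmetic. The sketch below is a standalone illustration, not code from the patch: `jiffies_to_ms` and `worst_case_detect_ms` are hypothetical helpers, the tick rate is passed in as a parameter because the real `HZ` depends on the kernel configuration, and the rounding done by `round_jiffies_up()` is ignored.

```c
#include <assert.h>

/* Convert a tick count to milliseconds the same way the printk in
 * patch 001 does: (1000 * IEEE80211_PROBE_WAIT) / HZ. */
static long jiffies_to_ms(long ticks, long hz)
{
    assert(hz > 0);
    return (1000 * ticks) / hz;
}

/* Rough upper bound on the time between the last received frame and
 * the disconnect decision: one idle period plus `tries` probe waits.
 * Assumption: timer rounding and scheduling delays are ignored. */
static long worst_case_detect_ms(long idle_ticks, long wait_ticks,
                                 int tries, long hz)
{
    return jiffies_to_ms(idle_ticks, hz) +
           tries * jiffies_to_ms(wait_ticks, hz);
}
```

For example, with a tick rate of 1000, `worst_case_detect_ms(2 * 1000, 1000 / 5, 1, 1000)` gives 2200 ms for the old values with a single try, and `worst_case_detect_ms(30 * 1000, 1000 / 2, 5, 1000)` gives 32500 ms for the new values with five tries.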




2009-07-31 20:05:44

by Maxim Levitsky

Subject: Re: [PATCH 000/002] Fix frequent reconnects caused by new connection monitor

On Fri, 2009-07-31 at 12:27 -0700, Marcel Holtmann wrote:
> Hi Maxim,
>
> > > > Hi, here is the updated version of these two patches that fix the
> > > > $SUBJECT issue.
> > > >
> > > > I attach these (in case mailer mangles them), and reply with patches.
> > > >
> > > > Tested both with low quality signal, and beacon loss.
> > > > Lack of TX is found, every 30 seconds now, and quite reliable.
> > > > Lack of beacons, triggers probe like it did every 2 seconds.
> > >
> > > Thanks!
> > >
> > > I've been running with this for two hours now with no disconnects. This
> > > is where before the patches I would get disconnected after a few
> > > minutes. I did get two "No probe response from AP xx:xx:xx:xx:xx:xx
> > > after 500ms, try 1" messages in my log.
> > This is normal, or at least can be normal, I patched the driver to
> > display this message, when there is a probe timeout, but instead of
> > disconnect, it retries, currently 5 times, but this can be even further
> > increased is necessarily.
> > (these messages are only in logs when verbose mac debugging is enabled)
> >
> > I don't know exactly why probes aren't answered, but I strongly suspect
> > that my AP sometimes 'goes out to lunch' and then answers, since
> > typically after a failed probe it sends many replies.
> > (Or it could be some buffering done by iwl3945 microcode). I currently
> > can't monitor the connection from outside, but as soon as I can I see
> > whether the above is true. Nevertheless if signal quality isn't great,
> > there are valid reasons for probe loss, and it shouldn't cause all the
> > fuzz (and since I use WPA2, every reconnection causes whole WPA
> > handshake to be preformed, and this takes at least 2 seconds, and if a
> > reconnection happens each 5 seconds, it gets very very annoying, and
> > almost unusable.
>
> I am testing your patches and so far so good. Seems to be working
> perfectly fine. I see this in the logs:
>
> [41027.333419] wlan0: detected beacon loss from AP - sending probe request
> [41027.389260] wlan0: cancelling probereq poll due to a received beacon
> [41027.793518] No probe response from AP 00:1c:f0:62:88:5b after 500ms, try 1
> [41028.292731] No probe response from AP 00:1c:f0:62:88:5b after 500ms, try 2
>
> Need to watch out if this pattern emerges and if the beacon loss trigger
> might give us an indication. Maybe the ucode is just not ready then.
Here (on my system) I see no beacon losses at all. Like I said, there
could be many reasons behind packet losses, and the best way to mitigate
them is to retry.

Your logs indicate that beacons weren't received for 2 seconds, so
mac80211 sent a probe. A beacon was then received before the probe
response, so the probe was cancelled (or at least should have been).
After a while, a probe request (the same one?) timed out and was retried
twice before it was finally answered.


Best regards,
Maxim Levitsky

>
> Regards
>
> Marcel
>
>


2009-08-05 05:50:53

by Gábor Stefanik

Subject: Re: [PATCH 001/002] [MAC80211] Retry probe request few times

On Wed, Aug 5, 2009 at 7:33 AM, Johannes Berg<[email protected]> wrote:
> On Wed, 2009-08-05 at 08:29 +0300, Maxim Levitsky wrote:
>
>> > [54632.657912] No probe response from AP 00:1c:f0:xx:xx:xx after 500ms, try 1
>> > [54633.154560] No probe response from AP 00:1c:f0:xx:xx:xx after 500ms, try 2
>> > [54873.231210] No probe response from AP 00:1c:f0:xx:xx:xx after 500ms, try 1
>> > [55113.467840] No probe response from AP 00:1c:f0:xx:xx:xx after 500ms, try 1
>> > [55113.964510] No probe response from AP 00:1c:f0:xx:xx:xx after 500ms, try 2
>> > [55114.464516] No probe response from AP 00:1c:f0:xx:xx:xx after 500ms, try 3
>> > [55114.967868] No probe response from AP 00:1c:f0:xx:xx:xx after 500ms, try 4
>> > [55115.464511] No probe response from AP 00:1c:f0:xx:xx:xx after 500ms, disconnecting.
>> >
>> > Should we increase the value to 8 or something? Or just accept that
>> > sometimes we get a disconnect?
>> >
>> I think a disconnect or two aren't a problem, they are always handled
>> automatically, and can be caused by natural events.
>>
>> Even without these patches (I have WPA2, and disconnects were every 4
>> five seconds), and still it was possible to use network.
>>
>> One disconnect in a day is really nothing to worry about (I have seen
>> such here as well)
>>
>> You can increase try count, probably won't hurt much (each try is 0.5
>> seconds, so even 10 tries gives total of 5 seconds, before disconnect on
>> a really unaccessible AP, anyway)
>
> Mind you, each try will end up sending a LOT of probe request frames,
> iwlwifi sends like 12 frames for every try and never gets a response for
> some reason, so I wouldn't really increase it much further since you'd
> be spamming around a lot.
>
> johannes
>

My €0.02 (€ because I'm European :) ): Shouldn't probe requests be NO_ACK?

--
Vista: [V]iruses, [I]ntruders, [S]pyware, [T]rojans and [A]dware. :-)

2009-08-05 05:51:49

by Johannes Berg

Subject: Re: [PATCH 001/002] [MAC80211] Retry probe request few times

On Wed, 2009-08-05 at 07:50 +0200, Gábor Stefanik wrote:

> My €0.02 (€ because I'm European :) ): Shouldn't probe requests be NO_ACK?

Of course not. What makes you think so?

johannes



2009-08-03 23:58:11

by Marcel Holtmann

Subject: Re: [PATCH 000/002] Fix frequent reconnects caused by new connection monitor

Hi John,

> > Hi, here is the updated version of these two patches that fix the
> > $SUBJECT issue.
> >
> > I attach these (in case mailer mangles them), and reply with patches.
> >
> > Tested both with low quality signal, and beacon loss.
> > Lack of TX is found, every 30 seconds now, and quite reliable.
> > Lack of beacons, triggers probe like it did every 2 seconds.
> >
> >
> >
> > Best regards,
> > Maxim Levitsky
>
> Just a question, when to see these in wireless-testing?

patches have been acked and tested by various people. Should be pretty
much safe to apply and they are helping many of us to get back a stable
WiFi connection.

Regards

Marcel



2009-08-05 05:54:14

by Gábor Stefanik

Subject: Re: [PATCH 001/002] [MAC80211] Retry probe request few times

2009/8/5 Johannes Berg <[email protected]>:
> On Wed, 2009-08-05 at 07:50 +0200, Gábor Stefanik wrote:
>
>> My €0.02 (€ because I'm European :) ): Shouldn't probe requests be NO_ACK?
>
> Of course not. What makes you think so?
>
> johannes
>
>

It just feels illogical to me that the AP essentially has to respond
to probes twice (it sends an ACK, then a probe response), but if that
is what the 802.11 spec calls for, then it's fine.

--
Vista: [V]iruses, [I]ntruders, [S]pyware, [T]rojans and [A]dware. :-)

2009-08-05 02:22:34

by Marcel Holtmann

Subject: Re: [PATCH 001/002] [MAC80211] Retry probe request few times

Hi Maxim,

> From: Maxim Levitsky <[email protected]>
> Date: Fri, 31 Jul 2009 18:54:12 +0300
> Subject: [PATCH] [MAC80211] Retry probe request few times
>
> Retry 5 times (chosen arbitary ), before assuming
> that station is out of range.

so today I got the disconnect :(

[54632.657912] No probe response from AP 00:1c:f0:xx:xx:xx after 500ms, try 1
[54633.154560] No probe response from AP 00:1c:f0:xx:xx:xx after 500ms, try 2
[54873.231210] No probe response from AP 00:1c:f0:xx:xx:xx after 500ms, try 1
[55113.467840] No probe response from AP 00:1c:f0:xx:xx:xx after 500ms, try 1
[55113.964510] No probe response from AP 00:1c:f0:xx:xx:xx after 500ms, try 2
[55114.464516] No probe response from AP 00:1c:f0:xx:xx:xx after 500ms, try 3
[55114.967868] No probe response from AP 00:1c:f0:xx:xx:xx after 500ms, try 4
[55115.464511] No probe response from AP 00:1c:f0:xx:xx:xx after 500ms, disconnecting.

Should we increase the value to 8 or something? Or just accept that
sometimes we get a disconnect?

Regards

Marcel



2009-08-05 05:58:05

by Johannes Berg

Subject: Re: [PATCH 001/002] [MAC80211] Retry probe request few times

On Wed, 2009-08-05 at 07:53 +0200, Gábor Stefanik wrote:

> It just feels illogical to me that the AP essentially has to respond
> to probes twice (it sends an ACK, then a probe response) - but if that
> is what the 802.11 spec calls for, then its fine.

You're just confused then. Think about it again, it's perfectly logical.

johannes



2009-08-05 05:29:18

by Maxim Levitsky

Subject: Re: [PATCH 001/002] [MAC80211] Retry probe request few times

On Tue, 2009-08-04 at 19:22 -0700, Marcel Holtmann wrote:
> Hi Maxim,
>
> > From: Maxim Levitsky <[email protected]>
> > Date: Fri, 31 Jul 2009 18:54:12 +0300
> > Subject: [PATCH] [MAC80211] Retry probe request few times
> >
> > Retry 5 times (chosen arbitary ), before assuming
> > that station is out of range.
>
> so today I got the disconnect :(
>
> [54632.657912] No probe response from AP 00:1c:f0:xx:xx:xx after 500ms, try 1
> [54633.154560] No probe response from AP 00:1c:f0:xx:xx:xx after 500ms, try 2
> [54873.231210] No probe response from AP 00:1c:f0:xx:xx:xx after 500ms, try 1
> [55113.467840] No probe response from AP 00:1c:f0:xx:xx:xx after 500ms, try 1
> [55113.964510] No probe response from AP 00:1c:f0:xx:xx:xx after 500ms, try 2
> [55114.464516] No probe response from AP 00:1c:f0:xx:xx:xx after 500ms, try 3
> [55114.967868] No probe response from AP 00:1c:f0:xx:xx:xx after 500ms, try 4
> [55115.464511] No probe response from AP 00:1c:f0:xx:xx:xx after 500ms, disconnecting.
>
> Should we increase the value to 8 or something? Or just accept that
> sometimes we get a disconnect?
>
I think a disconnect or two isn't a problem; they are always handled
automatically and can be caused by natural events.

Even without these patches (I have WPA2, and disconnects happened every
four or five seconds) it was still possible to use the network.

One disconnect a day is really nothing to worry about (I have seen
such here as well).

You can increase the try count; it probably won't hurt much (each try is
0.5 seconds, so even 10 tries give a total of 5 seconds before
disconnecting from a really inaccessible AP anyway).

Best regards,
Maxim Levitsky


2009-08-01 15:25:49

by Marcel Holtmann

Subject: Re: [PATCH 000/002] Fix frequent reconnects caused by new connection monitor

Hi Maxim,

> > > > > Hi, here is the updated version of these two patches that fix the
> > > > > $SUBJECT issue.
> > > > >
> > > > > I attach these (in case mailer mangles them), and reply with patches.
> > > > >
> > > > > Tested both with low quality signal, and beacon loss.
> > > > > Lack of TX is found, every 30 seconds now, and quite reliable.
> > > > > Lack of beacons, triggers probe like it did every 2 seconds.
> > > >
> > > > Thanks!
> > > >
> > > > I've been running with this for two hours now with no disconnects. This
> > > > is where before the patches I would get disconnected after a few
> > > > minutes. I did get two "No probe response from AP xx:xx:xx:xx:xx:xx
> > > > after 500ms, try 1" messages in my log.
> > > This is normal, or at least can be normal, I patched the driver to
> > > display this message, when there is a probe timeout, but instead of
> > > disconnect, it retries, currently 5 times, but this can be even further
> > > increased is necessarily.
> > > (these messages are only in logs when verbose mac debugging is enabled)
> > >
> > > I don't know exactly why probes aren't answered, but I strongly suspect
> > > that my AP sometimes 'goes out to lunch' and then answers, since
> > > typically after a failed probe it sends many replies.
> > > (Or it could be some buffering done by iwl3945 microcode). I currently
> > > can't monitor the connection from outside, but as soon as I can I see
> > > whether the above is true. Nevertheless if signal quality isn't great,
> > > there are valid reasons for probe loss, and it shouldn't cause all the
> > > fuzz (and since I use WPA2, every reconnection causes whole WPA
> > > handshake to be preformed, and this takes at least 2 seconds, and if a
> > > reconnection happens each 5 seconds, it gets very very annoying, and
> > > almost unusable.
> >
> > I am testing your patches and so far so good. Seems to be working
> > perfectly fine. I see this in the logs:
> >
> > [41027.333419] wlan0: detected beacon loss from AP - sending probe request
> > [41027.389260] wlan0: cancelling probereq poll due to a received beacon
> > [41027.793518] No probe response from AP 00:1c:f0:62:88:5b after 500ms, try 1
> > [41028.292731] No probe response from AP 00:1c:f0:62:88:5b after 500ms, try 2
> >
> > Need to watch out if this pattern emerges and if the beacon loss trigger
> > might give us an indication. Maybe the ucode is just not ready then.
> Here (on my system) I see no beacon losses at all, like I said there
> could be many reasons behind packet losses, and best way to mitigate
> them is to retry.
>
> Your logs indicate that beacons weren't recieved for 2 seconds, then
> mac80211 tried to send a probe, but a beacon is recieved before the
> probe answer, this probe is canceled (at least should be) then after a
> while, a probe request (same one?) is time outed, and retried twice,
> then finally answered.

It looked related, but it wasn't at all. I have had this running for over
24 hours now and the patches work perfectly fine. Today I saw a try 4
message for the first time. Otherwise it only had to try up to three
times before it succeeded.

Tested-by: Marcel Holtmann <[email protected]>

Regards

Marcel



2009-08-05 05:33:41

by Johannes Berg

Subject: Re: [PATCH 001/002] [MAC80211] Retry probe request few times

On Wed, 2009-08-05 at 08:29 +0300, Maxim Levitsky wrote:

> > [54632.657912] No probe response from AP 00:1c:f0:xx:xx:xx after 500ms, try 1
> > [54633.154560] No probe response from AP 00:1c:f0:xx:xx:xx after 500ms, try 2
> > [54873.231210] No probe response from AP 00:1c:f0:xx:xx:xx after 500ms, try 1
> > [55113.467840] No probe response from AP 00:1c:f0:xx:xx:xx after 500ms, try 1
> > [55113.964510] No probe response from AP 00:1c:f0:xx:xx:xx after 500ms, try 2
> > [55114.464516] No probe response from AP 00:1c:f0:xx:xx:xx after 500ms, try 3
> > [55114.967868] No probe response from AP 00:1c:f0:xx:xx:xx after 500ms, try 4
> > [55115.464511] No probe response from AP 00:1c:f0:xx:xx:xx after 500ms, disconnecting.
> >
> > Should we increase the value to 8 or something? Or just accept that
> > sometimes we get a disconnect?
> >
> I think a disconnect or two aren't a problem, they are always handled
> automatically, and can be caused by natural events.
>
> Even without these patches (I have WPA2, and disconnects were every 4
> five seconds), and still it was possible to use network.
>
> One disconnect in a day is really nothing to worry about (I have seen
> such here as well)
>
> You can increase try count, probably won't hurt much (each try is 0.5
> seconds, so even 10 tries gives total of 5 seconds, before disconnect on
> a really unaccessible AP, anyway)

Mind you, each try will end up sending a LOT of probe request frames,
iwlwifi sends like 12 frames for every try and never gets a response for
some reason, so I wouldn't really increase it much further since you'd
be spamming around a lot.

johannes
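
The trade-off discussed here (more tries mean a longer wait and more frames on the air) can be put into numbers. The sketch below is only an illustration: `struct retry_cost` and `compute_retry_cost` are hypothetical names, the 500 ms wait comes from patch 002, and the figure of 12 frames per try is the rough iwlwifi estimate quoted above, not a measured constant.

```c
#include <assert.h>

/* Per-try wait in milliseconds after patch 002 (HZ / 2). */
#define PROBE_WAIT_MS 500

/* Assumption: about 12 probe request frames per try, the rough
 * figure mentioned for iwlwifi in this thread. */
#define FRAMES_PER_TRY 12

struct retry_cost {
    int wait_ms; /* time spent waiting before the disconnect decision */
    int frames;  /* probe request frames put on the air */
};

/* Cost of polling an AP that never answers, for a given try count. */
static struct retry_cost compute_retry_cost(int tries)
{
    struct retry_cost c;

    assert(tries > 0);
    c.wait_ms = tries * PROBE_WAIT_MS;
    c.frames = tries * FRAMES_PER_TRY;
    return c;
}
```

Under these assumptions, 5 tries cost 2500 ms and 60 frames, 8 tries cost 4000 ms and 96 frames, and 10 tries cost 5000 ms and 120 frames.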



2009-08-05 15:45:10

by Marcel Holtmann

Subject: Re: [PATCH 001/002] [MAC80211] Retry probe request few times

Hi Maxim,

> > > From: Maxim Levitsky <[email protected]>
> > > Date: Fri, 31 Jul 2009 18:54:12 +0300
> > > Subject: [PATCH] [MAC80211] Retry probe request few times
> > >
> > > Retry 5 times (chosen arbitary ), before assuming
> > > that station is out of range.
> >
> > so today I got the disconnect :(
> >
> > [54632.657912] No probe response from AP 00:1c:f0:xx:xx:xx after 500ms, try 1
> > [54633.154560] No probe response from AP 00:1c:f0:xx:xx:xx after 500ms, try 2
> > [54873.231210] No probe response from AP 00:1c:f0:xx:xx:xx after 500ms, try 1
> > [55113.467840] No probe response from AP 00:1c:f0:xx:xx:xx after 500ms, try 1
> > [55113.964510] No probe response from AP 00:1c:f0:xx:xx:xx after 500ms, try 2
> > [55114.464516] No probe response from AP 00:1c:f0:xx:xx:xx after 500ms, try 3
> > [55114.967868] No probe response from AP 00:1c:f0:xx:xx:xx after 500ms, try 4
> > [55115.464511] No probe response from AP 00:1c:f0:xx:xx:xx after 500ms, disconnecting.
> >
> > Should we increase the value to 8 or something? Or just accept that
> > sometimes we get a disconnect?
> >
> I think a disconnect or two aren't a problem, they are always handled
> automatically, and can be caused by natural events.
>
> Even without these patches (I have WPA2, and disconnects were every 4
> five seconds), and still it was possible to use network.

This assumption only works if you are not using NetworkManager or the
like, where every new connect triggers DHCP again. If we want to survive
disconnects, we have to teach these tools to handle the lease time more
intelligently, which is kind of tricky. It might be worth doing anyway.
I have to play with it a little bit.

> One disconnect in a day is really nothing to worry about (I have seen
> such here as well)
>
> You can increase try count, probably won't hurt much (each try is 0.5
> seconds, so even 10 tries gives total of 5 seconds, before disconnect on
> a really unaccessible AP, anyway)

Or just increase the wait per try to 1 second. I still think there is
something odd going on with the iwlwifi driver here, but so far nobody
has seen anything that is obviously wrong.

Regards

Marcel



2009-08-03 22:33:27

by Maxim Levitsky

Subject: Re: [PATCH 000/002] Fix frequent reconnects caused by new connection monitor

On Fri, 2009-07-31 at 19:13 +0300, Maxim Levitsky wrote:
> Hi, here is the updated version of these two patches that fix the
> $SUBJECT issue.
>
> I attach these (in case mailer mangles them), and reply with patches.
>
> Tested both with low quality signal, and beacon loss.
> Lack of TX is found, every 30 seconds now, and quite reliable.
> Lack of beacons, triggers probe like it did every 2 seconds.
>
>
>
> Best regards,
> Maxim Levitsky

Just a question: when will we see these in wireless-testing?

Best regards,
Maxim Levitsky