2020-12-15 17:27:41

by Youghandhar Chintala

Subject: [PATCH 2/3] mac80211: Add support to trigger sta disconnect on hardware restart

Currently in case of target hardware restart, we just reconfig and
re-enable the security keys and enable the network queues to start
data traffic back from where it was interrupted.

Many ath10k wifi chipsets have sequence numbers for the data
packets assigned by firmware and the mac sequence number will
restart from zero after target hardware restart leading to mismatch
in the sequence number expected by the remote peer vs the sequence
number of the frame sent by the target firmware.

This mismatch in sequence number will cause out-of-order packets
on the remote peer and all the frames sent by the device are dropped
until we reach the sequence number which was sent before we restarted
the target hardware.

In order to fix this, we trigger a sta disconnect, for the targets
which expose this corresponding wiphy flag, in case of target hw
restart. After this there will be a fresh connection and thereby
avoiding the dropping of frames by remote peer.

The right fix would be to pull the entire data path into the host
which is not feasible or would need lots of complex changes and
will still be inefficient.

Tested on ath10k using WCN3990, QCA6174

Signed-off-by: Youghandhar Chintala <[email protected]>
---
net/mac80211/ieee80211_i.h | 3 +++
net/mac80211/mlme.c | 9 +++++++++
net/mac80211/util.c | 22 +++++++++++++++++++---
3 files changed, 31 insertions(+), 3 deletions(-)

diff --git a/net/mac80211/ieee80211_i.h b/net/mac80211/ieee80211_i.h
index cde2e3f..8cbeb5f 100644
--- a/net/mac80211/ieee80211_i.h
+++ b/net/mac80211/ieee80211_i.h
@@ -748,6 +748,8 @@ struct ieee80211_if_mesh {
  *	back to wireless media and to the local net stack.
  * @IEEE80211_SDATA_DISCONNECT_RESUME: Disconnect after resume.
  * @IEEE80211_SDATA_IN_DRIVER: indicates interface was added to driver
+ * @IEEE80211_SDATA_DISCONNECT_HW_RESTART: Disconnect after hardware restart
+ *	recovery
  */
 enum ieee80211_sub_if_data_flags {
 	IEEE80211_SDATA_ALLMULTI		= BIT(0),
@@ -755,6 +757,7 @@ enum ieee80211_sub_if_data_flags {
 	IEEE80211_SDATA_DONT_BRIDGE_PACKETS	= BIT(3),
 	IEEE80211_SDATA_DISCONNECT_RESUME	= BIT(4),
 	IEEE80211_SDATA_IN_DRIVER		= BIT(5),
+	IEEE80211_SDATA_DISCONNECT_HW_RESTART	= BIT(6),
 };

 /**
diff --git a/net/mac80211/mlme.c b/net/mac80211/mlme.c
index 6adfcb9..e4d0d16 100644
--- a/net/mac80211/mlme.c
+++ b/net/mac80211/mlme.c
@@ -4769,6 +4769,15 @@ void ieee80211_sta_restart(struct ieee80211_sub_if_data *sdata)
 					      true);
 		sdata_unlock(sdata);
 		return;
+	} else if (sdata->flags & IEEE80211_SDATA_DISCONNECT_HW_RESTART) {
+		sdata->flags &= ~IEEE80211_SDATA_DISCONNECT_HW_RESTART;
+		mlme_dbg(sdata, "driver requested disconnect after hardware restart\n");
+		ieee80211_sta_connection_lost(sdata,
+					      ifmgd->associated->bssid,
+					      WLAN_REASON_UNSPECIFIED,
+					      true);
+		sdata_unlock(sdata);
+		return;
 	}
 	sdata_unlock(sdata);
 }
diff --git a/net/mac80211/util.c b/net/mac80211/util.c
index 8c3c01a..98567a3 100644
--- a/net/mac80211/util.c
+++ b/net/mac80211/util.c
@@ -2567,9 +2567,12 @@ int ieee80211_reconfig(struct ieee80211_local *local)
 	}
 	mutex_unlock(&local->sta_mtx);

-	/* add back keys */
-	list_for_each_entry(sdata, &local->interfaces, list)
-		ieee80211_reenable_keys(sdata);
+
+	if (!(hw->wiphy->flags & WIPHY_FLAG_STA_DISCONNECT_ON_HW_RESTART)) {
+		/* add back keys */
+		list_for_each_entry(sdata, &local->interfaces, list)
+			ieee80211_reenable_keys(sdata);
+	}

 	/* Reconfigure sched scan if it was interrupted by FW restart */
 	mutex_lock(&local->mtx);
@@ -2643,6 +2646,19 @@ int ieee80211_reconfig(struct ieee80211_local *local)
 					IEEE80211_QUEUE_STOP_REASON_SUSPEND,
 					false);

+	if ((hw->wiphy->flags & WIPHY_FLAG_STA_DISCONNECT_ON_HW_RESTART) &&
+	    !reconfig_due_to_wowlan) {
+		list_for_each_entry(sdata, &local->interfaces, list) {
+			if (!ieee80211_sdata_running(sdata))
+				continue;
+			if (sdata->vif.type == NL80211_IFTYPE_STATION) {
+				sdata->flags |=
+					IEEE80211_SDATA_DISCONNECT_HW_RESTART;
+				ieee80211_sta_restart(sdata);
+			}
+		}
+	}
+
 	/*
 	 * If this is for hw restart things are still running.
 	 * We may want to change that later, however.
--
2.7.4


2020-12-15 17:43:59

by Felix Fietkau

Subject: Re: [PATCH 2/3] mac80211: Add support to trigger sta disconnect on hardware restart


On 2020-12-15 18:23, Youghandhar Chintala wrote:
> Currently in case of target hardware restart, we just reconfig and
> re-enable the security keys and enable the network queues to start
> data traffic back from where it was interrupted.
>
> Many ath10k wifi chipsets have sequence numbers for the data
> packets assigned by firmware and the mac sequence number will
> restart from zero after target hardware restart leading to mismatch
> in the sequence number expected by the remote peer vs the sequence
> number of the frame sent by the target firmware.
>
> This mismatch in sequence number will cause out-of-order packets
> on the remote peer and all the frames sent by the device are dropped
> until we reach the sequence number which was sent before we restarted
> the target hardware
>
> In order to fix this, we trigger a sta disconnect, for the targets
> which expose this corresponding wiphy flag, in case of target hw
> restart. After this there will be a fresh connection and thereby
> avoiding the dropping of frames by remote peer.
>
> The right fix would be to pull the entire data path into the host
> which is not feasible or would need lots of complex changes and
> will still be inefficient.
How about simply tracking which tids have aggregation enabled and
sending DELBA frames for those after the restart?
It would mean less disruption for affected stations and less ugly hacks
in the stack for unreliable hardware.

- Felix
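
The bookkeeping suggested above could look roughly like the following. This is a minimal, self-contained sketch and not mac80211 code: `agg_tracker`, `agg_session_started()`, `agg_session_stopped()`, `agg_teardown_after_restart()` and the `send_delba_fn` callback are all hypothetical names chosen for illustration, and the 16-TID bitmap is an assumption about how a driver might record active sessions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_TIDS 16u

/* Hypothetical callback type: transmit a DELBA for one TID. */
typedef void (*send_delba_fn)(unsigned int tid, void *ctx);

/* Hypothetical per-station tracker: bit n set means TID n has a BA session. */
struct agg_tracker {
	uint16_t active_tids;
};

static void agg_session_started(struct agg_tracker *t, unsigned int tid)
{
	assert(tid < NUM_TIDS);
	t->active_tids |= (uint16_t)(1u << tid);
}

static void agg_session_stopped(struct agg_tracker *t, unsigned int tid)
{
	assert(tid < NUM_TIDS);
	t->active_tids &= (uint16_t)~(1u << tid);
}

/*
 * After a restart, tear down every tracked session: call send_delba (if
 * non-NULL) once per active TID, clear the bitmap, and return the number
 * of TIDs that were torn down.
 */
static unsigned int agg_teardown_after_restart(struct agg_tracker *t,
					       send_delba_fn send_delba,
					       void *ctx)
{
	unsigned int count = 0;

	for (unsigned int tid = 0; tid < NUM_TIDS; tid++) {
		if (!(t->active_tids & (1u << tid)))
			continue;
		if (send_delba)
			send_delba(tid, ctx);
		count++;
	}
	t->active_tids = 0;
	return count;
}
```

Only the TIDs that actually had a session are touched, which is the "less disruption" part of the suggestion; whether the peer honours the DELBA is outside what this sketch can show.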

2021-01-28 08:12:18

by Youghandhar Chintala

Subject: Re: [PATCH 2/3] mac80211: Add support to trigger sta disconnect on hardware restart

On 2020-12-15 23:10, Felix Fietkau wrote:
> On 2020-12-15 18:23, Youghandhar Chintala wrote:
>> Currently in case of target hardware restart, we just reconfig and
>> re-enable the security keys and enable the network queues to start
>> data traffic back from where it was interrupted.
>>
>> Many ath10k wifi chipsets have sequence numbers for the data
>> packets assigned by firmware and the mac sequence number will
>> restart from zero after target hardware restart leading to mismatch
>> in the sequence number expected by the remote peer vs the sequence
>> number of the frame sent by the target firmware.
>>
>> This mismatch in sequence number will cause out-of-order packets
>> on the remote peer and all the frames sent by the device are dropped
>> until we reach the sequence number which was sent before we restarted
>> the target hardware
>>
>> In order to fix this, we trigger a sta disconnect, for the targets
>> which expose this corresponding wiphy flag, in case of target hw
>> restart. After this there will be a fresh connection and thereby
>> avoiding the dropping of frames by remote peer.
>>
>> The right fix would be to pull the entire data path into the host
>> which is not feasible or would need lots of complex changes and
>> will still be inefficient.
> How about simply tracking which tids have aggregation enabled and send
> DELBA frames for those after the restart?
> It would mean less disruption for affected stations and less ugly hacks
> in the stack for unreliable hardware.
>
> - Felix

Hi Felix,

We did try sending an ADDBA frame to the AP once the SSR happened. The
AP acked the frame and a new BA session with a fresh sequence number
was established. Even so, the AP did not respond to ping requests
carrying the new sequence numbers until one of two things happened:
1. The sequence number exceeded the last sequence number the DUT had
used before the SSR.
2. The DUT disconnected and then reconnected.
The other option is to send a DELBA frame to the AP and force the AP to
re-establish the BA session from its side. We feel this can cause
interoperability issues, as some APs may not honour the DELBA frame and
will continue to use the BA session established earlier. Given that
renegotiating the BA session is prone to IOT issues, we feel it would be
better to go with the disconnect/reconnect solution, which is foolproof
and will work in all scenarios.

Regards,
Youghandhar
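
Case 1 above is what one would expect from a receiver that treats sequence numbers behind its window start as stale. The sketch below is a deliberately simplified model of that behaviour, not the AP's actual logic and not the full 802.11 reordering rules: `rx_window`, `rx_accept()` and `drops_after_restart()` are hypothetical names, and the only assumptions are the 12-bit sequence space and the half-space comparison.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SN_MODULO 4096u	/* 802.11 sequence numbers are 12 bits wide */
#define SN_HALF   2048u

/* Hypothetical receiver state: next sequence number expected from the peer. */
struct rx_window {
	uint16_t win_start;
};

/* Accept a frame only if its SN is at or ahead of win_start (mod 4096). */
static bool rx_accept(struct rx_window *w, uint16_t sn)
{
	assert(sn < SN_MODULO);

	uint16_t delta = (uint16_t)((sn - w->win_start) & (SN_MODULO - 1));

	if (delta >= SN_HALF)
		return false;	/* falls in the "old" half: dropped */
	w->win_start = (uint16_t)((sn + 1) & (SN_MODULO - 1));
	return true;
}

/* Count drops when the sender restarts at SN 0 and sends 'count' frames. */
static unsigned int drops_after_restart(uint16_t last_sn_before_restart,
					unsigned int count)
{
	struct rx_window w = {
		.win_start = (uint16_t)((last_sn_before_restart + 1) &
					(SN_MODULO - 1)),
	};
	unsigned int dropped = 0;

	for (unsigned int i = 0; i < count; i++)
		if (!rx_accept(&w, (uint16_t)(i & (SN_MODULO - 1))))
			dropped++;
	return dropped;
}
```

In this model, a sender that had reached sequence number 38 before the restart loses its next 39 frames (0 through 38) and is heard again from 39 onward. If the pre-restart number was already in the upper half of the space, the same model accepts the restarted stream immediately, so the length of the outage depends on where the counter stood.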

2021-02-05 21:54:25

by Abhishek Kumar

Subject: Re: [PATCH 2/3] mac80211: Add support to trigger sta disconnect on hardware restart

Since using a DELBA frame to re-establish the BA session depends on
the AP, and some APs may not honour the DELBA frame, I am fine with
the disconnect/reconnect solution. The change looks good to me.

Reviewed-by: Abhishek Kumar <[email protected]>

Thanks
Abhishek

On Thu, Jan 28, 2021 at 12:08 AM <[email protected]> wrote:
>
> On 2020-12-15 23:10, Felix Fietkau wrote:
> > On 2020-12-15 18:23, Youghandhar Chintala wrote:
> >> Currently in case of target hardware restart, we just reconfig and
> >> re-enable the security keys and enable the network queues to start
> >> data traffic back from where it was interrupted.
> >>
> >> Many ath10k wifi chipsets have sequence numbers for the data
> >> packets assigned by firmware and the mac sequence number will
> >> restart from zero after target hardware restart leading to mismatch
> >> in the sequence number expected by the remote peer vs the sequence
> >> number of the frame sent by the target firmware.
> >>
> >> This mismatch in sequence number will cause out-of-order packets
> >> on the remote peer and all the frames sent by the device are dropped
> >> until we reach the sequence number which was sent before we restarted
> >> the target hardware
> >>
> >> In order to fix this, we trigger a sta disconnect, for the targets
> >> which expose this corresponding wiphy flag, in case of target hw
> >> restart. After this there will be a fresh connection and thereby
> >> avoiding the dropping of frames by remote peer.
> >>
> >> The right fix would be to pull the entire data path into the host
> >> which is not feasible or would need lots of complex changes and
> >> will still be inefficient.
> > How about simply tracking which tids have aggregation enabled and send
> > DELBA frames for those after the restart?
> > It would mean less disruption for affected stations and less ugly hacks
> > in the stack for unreliable hardware.
> >
> > - Felix
>
> Hi Felix,
>
> We did try to send an ADDBA frame to the AP once the SSR happened. The
> AP ack’ed the frame and the new BA session with renewed sequence number
> was established. But still, the AP did not respond to the ping requests
> with the new sequence number. It did not respond until one of the two
> happened.
> 1. The sequence number was more than the sequence number that DUT had
> used before SSR happened
> 2. DUT disconnected and then reconnected.
> The other option is to send a DELBA frame to the AP and make the AP also
> force to establish the BA session from its side. This we feel can have
> some interoperability issues as some of the AP’s may not honour the
> DELBA frame and will continue to use the earlier BA session that it had
> established. Given that re-negotiating the BA session is prone to IOT
> issues, we feel that it would be good to go with the
> Disconnect/Reconnect solution which is foolproof and will work in all
> scenarios.
>
> Regards,
> Youghandhar

2021-02-08 15:45:40

by Guenter Roeck

Subject: Re: [PATCH 2/3] mac80211: Add support to trigger sta disconnect on hardware restart

On Tue, Dec 15, 2020 at 10:53:52PM +0530, Youghandhar Chintala wrote:
> Currently in case of target hardware restart, we just reconfig and
> re-enable the security keys and enable the network queues to start
> data traffic back from where it was interrupted.
>
> Many ath10k wifi chipsets have sequence numbers for the data
> packets assigned by firmware and the mac sequence number will
> restart from zero after target hardware restart leading to mismatch
> in the sequence number expected by the remote peer vs the sequence
> number of the frame sent by the target firmware.
>
> This mismatch in sequence number will cause out-of-order packets
> on the remote peer and all the frames sent by the device are dropped
> until we reach the sequence number which was sent before we restarted
> the target hardware
>
> In order to fix this, we trigger a sta disconnect, for the targets
> which expose this corresponding wiphy flag, in case of target hw
> restart. After this there will be a fresh connection and thereby
> avoiding the dropping of frames by remote peer.
>
> The right fix would be to pull the entire data path into the host
> which is not feasible or would need lots of complex changes and
> will still be inefficient.
>
> Tested on ath10k using WCN3990, QCA6174
>
> Signed-off-by: Youghandhar Chintala <[email protected]>
> Reviewed-by: Abhishek Kumar <[email protected]>
> ---
> net/mac80211/ieee80211_i.h | 3 +++
> net/mac80211/mlme.c | 9 +++++++++
> net/mac80211/util.c | 22 +++++++++++++++++++---
> 3 files changed, 31 insertions(+), 3 deletions(-)
>
> diff --git a/net/mac80211/ieee80211_i.h b/net/mac80211/ieee80211_i.h
> index cde2e3f..8cbeb5f 100644
> --- a/net/mac80211/ieee80211_i.h
> +++ b/net/mac80211/ieee80211_i.h
> @@ -748,6 +748,8 @@ struct ieee80211_if_mesh {
> * back to wireless media and to the local net stack.
> * @IEEE80211_SDATA_DISCONNECT_RESUME: Disconnect after resume.
> * @IEEE80211_SDATA_IN_DRIVER: indicates interface was added to driver
> + * @IEEE80211_SDATA_DISCONNECT_HW_RESTART: Disconnect after hardware restart
> + * recovery
> */
> enum ieee80211_sub_if_data_flags {
> IEEE80211_SDATA_ALLMULTI = BIT(0),
> @@ -755,6 +757,7 @@ enum ieee80211_sub_if_data_flags {
> IEEE80211_SDATA_DONT_BRIDGE_PACKETS = BIT(3),
> IEEE80211_SDATA_DISCONNECT_RESUME = BIT(4),
> IEEE80211_SDATA_IN_DRIVER = BIT(5),
> + IEEE80211_SDATA_DISCONNECT_HW_RESTART = BIT(6),
> };
>
> /**
> diff --git a/net/mac80211/mlme.c b/net/mac80211/mlme.c
> index 6adfcb9..e4d0d16 100644
> --- a/net/mac80211/mlme.c
> +++ b/net/mac80211/mlme.c
> @@ -4769,6 +4769,15 @@ void ieee80211_sta_restart(struct ieee80211_sub_if_data *sdata)
> true);
> sdata_unlock(sdata);
> return;
> + } else if (sdata->flags & IEEE80211_SDATA_DISCONNECT_HW_RESTART) {
> + sdata->flags &= ~IEEE80211_SDATA_DISCONNECT_HW_RESTART;
> + mlme_dbg(sdata, "driver requested disconnect after hardware restart\n");
> + ieee80211_sta_connection_lost(sdata,
> + ifmgd->associated->bssid,
> + WLAN_REASON_UNSPECIFIED,
> + true);
> + sdata_unlock(sdata);
> + return;
> }
> sdata_unlock(sdata);
> }
> diff --git a/net/mac80211/util.c b/net/mac80211/util.c
> index 8c3c01a..98567a3 100644
> --- a/net/mac80211/util.c
> +++ b/net/mac80211/util.c
> @@ -2567,9 +2567,12 @@ int ieee80211_reconfig(struct ieee80211_local *local)
> }
> mutex_unlock(&local->sta_mtx);
>
> - /* add back keys */
> - list_for_each_entry(sdata, &local->interfaces, list)
> - ieee80211_reenable_keys(sdata);
> +
> + if (!(hw->wiphy->flags & WIPHY_FLAG_STA_DISCONNECT_ON_HW_RESTART)) {
> + /* add back keys */
> + list_for_each_entry(sdata, &local->interfaces, list)
> + ieee80211_reenable_keys(sdata);
> + }
>
> /* Reconfigure sched scan if it was interrupted by FW restart */
> mutex_lock(&local->mtx);
> @@ -2643,6 +2646,19 @@ int ieee80211_reconfig(struct ieee80211_local *local)
> IEEE80211_QUEUE_STOP_REASON_SUSPEND,
> false);
>
> + if ((hw->wiphy->flags & WIPHY_FLAG_STA_DISCONNECT_ON_HW_RESTART) &&
> + !reconfig_due_to_wowlan) {
> + list_for_each_entry(sdata, &local->interfaces, list) {
> + if (!ieee80211_sdata_running(sdata))
> + continue;
> + if (sdata->vif.type == NL80211_IFTYPE_STATION) {
> + sdata->flags |=
> + IEEE80211_SDATA_DISCONNECT_HW_RESTART;
> + ieee80211_sta_restart(sdata);

If CONFIG_PM=n:

ERROR: "ieee80211_sta_restart" [net/mac80211/mac80211.ko] undefined!

Guenter

> + }
> + }
> + }
> +
> /*
> * If this is for hw restart things are still running.
> * We may want to change that later, however.

2021-02-12 08:40:09

by Johannes Berg

Subject: Re: [PATCH 2/3] mac80211: Add support to trigger sta disconnect on hardware restart

On Fri, 2021-02-05 at 13:51 -0800, Abhishek Kumar wrote:
> Since using DELBA frame to APs to re-establish BA session has a
> dependency on APs and also some APs may not honour the DELBA frame.


That's completely out of spec ... Can you say which AP this was?

You could also try sending a BAR that updates the SN.

johannes

2021-02-12 08:43:45

by Johannes Berg

Subject: Re: [PATCH 2/3] mac80211: Add support to trigger sta disconnect on hardware restart

On Tue, 2020-12-15 at 22:53 +0530, Youghandhar Chintala wrote:
> The right fix would be to pull the entire data path into the host

> +++ b/net/mac80211/ieee80211_i.h
> @@ -748,6 +748,8 @@ struct ieee80211_if_mesh {
> * back to wireless media and to the local net stack.
> * @IEEE80211_SDATA_DISCONNECT_RESUME: Disconnect after resume.
> * @IEEE80211_SDATA_IN_DRIVER: indicates interface was added to driver
> + * @IEEE80211_SDATA_DISCONNECT_HW_RESTART: Disconnect after hardware restart
> + * recovery

How did you model this on IEEE80211_SDATA_DISCONNECT_RESUME, but then
didn't check how that's actually used?

Please change it so that the two models are the same. You really don't
need the wiphy flag.

johannes

2021-02-12 08:45:30

by Johannes Berg

Subject: Re: [PATCH 2/3] mac80211: Add support to trigger sta disconnect on hardware restart

On Fri, 2021-02-12 at 09:42 +0100, Johannes Berg wrote:
> On Tue, 2020-12-15 at 22:53 +0530, Youghandhar Chintala wrote:
> > The right fix would be to pull the entire data path into the host
> > +++ b/net/mac80211/ieee80211_i.h
> > @@ -748,6 +748,8 @@ struct ieee80211_if_mesh {
> > * back to wireless media and to the local net stack.
> > * @IEEE80211_SDATA_DISCONNECT_RESUME: Disconnect after resume.
> > * @IEEE80211_SDATA_IN_DRIVER: indicates interface was added to driver
> > + * @IEEE80211_SDATA_DISCONNECT_HW_RESTART: Disconnect after hardware restart
> > + * recovery
>
> How did you model this on IEEE80211_SDATA_DISCONNECT_RESUME, but then
> didn't check how that's actually used?
>
> Please change it so that the two models are the same. You really don't
> need the wiphy flag.

In fact, you could even simply
generalize IEEE80211_SDATA_DISCONNECT_RESUME
and ieee80211_resume_disconnect() to _reconfig_ instead of _resume_, and
call it from the driver just before requesting HW restart.

johannes
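
The flow suggested here (the driver requests the disconnect before asking for the restart, and the reconfig path consumes the request) can be sketched in isolation as below. None of this is mac80211 code: `fake_sdata`, `FAKE_SDATA_DISCONNECT_RECONFIG`, `fake_request_disconnect()` and `fake_reconfig_disconnect()` are hypothetical stand-ins, and the real generalization of IEEE80211_SDATA_DISCONNECT_RESUME may look quite different.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the interface type and per-interface flag. */
enum fake_iftype { FAKE_IFTYPE_STATION, FAKE_IFTYPE_AP };
#define FAKE_SDATA_DISCONNECT_RECONFIG (1u << 0)

struct fake_sdata {
	enum fake_iftype type;
	unsigned int flags;
	bool running;
	bool associated;
};

/* Driver side: mark running station interfaces just before the restart. */
static void fake_request_disconnect(struct fake_sdata *ifaces, size_t n)
{
	assert(ifaces != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (!ifaces[i].running)
			continue;
		if (ifaces[i].type == FAKE_IFTYPE_STATION)
			ifaces[i].flags |= FAKE_SDATA_DISCONNECT_RECONFIG;
	}
}

/* Reconfig side: consume the flag and drop the association if there is one. */
static unsigned int fake_reconfig_disconnect(struct fake_sdata *ifaces, size_t n)
{
	unsigned int disconnected = 0;

	assert(ifaces != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (!(ifaces[i].flags & FAKE_SDATA_DISCONNECT_RECONFIG))
			continue;
		ifaces[i].flags &= ~FAKE_SDATA_DISCONNECT_RECONFIG;
		if (ifaces[i].associated) {
			ifaces[i].associated = false;
			disconnected++;
		}
	}
	return disconnected;
}
```

With the request coming from the driver, the reconfig path needs no per-device capability flag: interfaces that were never marked simply go through the normal restore.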

2021-09-24 07:37:58

by Youghandhar Chintala

Subject: Re: [PATCH 2/3] mac80211: Add support to trigger sta disconnect on hardware restart

Hi Johannes and Felix,

We tested the DELBA approach after SSR. The DUT's packet sequence
number and TX PN reset to 0 as expected, but the AP (Netgear R8000)
does not honor the TX PN from the DUT.
When we ran the same DELBA experiment with a Linux Android device as
SAP and the DUT as STA, we did not see any issue: ping resumed after
SSR without a disconnect.

The logs below were collected during my test, for reference.

192.168.0.15(AtherosC_12:af:af) ===> DUT IP and MAC
192.168.0.55(Netgear_d2:93:3d) ===> AP IP and MAC

No.  Time       Source        Destination   Protocol  Channel  Seq no.  Protected flag     CCMP Ext. IV    TID  Info
474  22.186433  192.168.0.15  192.168.0.55  ICMP      44       37       Data is protected  0x000000000026  0    Echo (ping) request id=0x0d00, seq=256/1, ttl=64 (reply in 480)
480  22.188371  192.168.0.55  192.168.0.15  ICMP      44       5        Data is protected  0x000000000011  6    Echo (ping) reply id=0x0d00, seq=256/1, ttl=64 (request in 474)
483  22.246335  192.168.0.15  192.168.0.55  ICMP      44       38       Data is protected  0x000000000027  0    Echo (ping) request id=0x1258, seq=11/2816, ttl=64 (reply in 489)
489  22.248127  192.168.0.55  192.168.0.15  ICMP      44       13       Data is protected  0x000000000012  0    Echo (ping) reply id=0x1258, seq=11/2816, ttl=64 (request in 483)


The above pings (with TID 0) are from before the SSR. As soon as the
DUT recovers after the SSR, it sends DELBAs to the AP.

No.  Time       Source             Destination       Protocol  Channel  Seq no.  Protected flag         Action code       TID  Info
546  26.129127  AtherosC_12:af:af  Netgear_d2:93:3d  802.11    44       4        Data is not protected  Delete Block Ack  0x0  Action, SN=4, FN=0, Flags=........C
548  26.129977  AtherosC_12:af:af  Netgear_d2:93:3d  802.11    44       5        Data is not protected  Delete Block Ack  0x6  Action, SN=5, FN=0, Flags=........C


After the SSR, we started ping traffic with TID 7 and TID 0. Ping
succeeds for TID 7 and fails for TID 0.
For TID 0, the TX PN of the ping requests is reset to 0, but the AP
does not seem to have reset its PN, hence the ping failure.
The TID 7 ping succeeds because we started it after the SSR.


No.  Time       Source        Destination   Protocol  Channel  Seq no.  Protected flag     CCMP Ext. IV    TID  Info
557  26.355256  192.168.0.15  192.168.0.55  ICMP      44       0        Data is protected  0x000000000001  0    Echo (ping) request id=0x1258, seq=15/3840, ttl=64 (no response found!)
571  27.376895  192.168.0.15  192.168.0.55  ICMP      44       1        Data is protected  0x000000000002  0    Echo (ping) request id=0x1258, seq=16/4096, ttl=64 (no response found!)
588  28.400946  192.168.0.15  192.168.0.55  ICMP      44       2        Data is protected  0x000000000003  0    Echo (ping) request id=0x1258, seq=17/4352, ttl=64 (no response found!)
600  29.424881  192.168.0.15  192.168.0.55  ICMP      44       3        Data is protected  0x000000000004  0    Echo (ping) request id=0x1258, seq=18/4608, ttl=64 (no response found!)


The ping packets below use TID 7.

No.  Time       Source        Destination   Protocol  Channel  Seq no.  Protected flag     CCMP Ext. IV    TID  Info
622  30.898249  192.168.0.15  192.168.0.55  ICMP      44       0        Data is protected  0x000000000006  7    Echo (ping) request id=0x1276, seq=1/256, ttl=64 (reply in 626)
626  30.900015  192.168.0.55  192.168.0.15  ICMP      44       0        Data is protected  0x000000000013  7    Echo (ping) reply id=0x1276, seq=1/256, ttl=64 (request in 622)
644  31.897456  192.168.0.15  192.168.0.55  ICMP      44       1        Data is protected  0x000000000008  7    Echo (ping) request id=0x1276, seq=2/512, ttl=64 (reply in 648)
648  31.899266  192.168.0.55  192.168.0.15  ICMP      44       1        Data is protected  0x000000000014  7    Echo (ping) reply id=0x1276, seq=2/512, ttl=64 (request in 644)

Regards,
Youghandhar


On 2021-02-12 14:07, Johannes Berg wrote:
> On Fri, 2021-02-05 at 13:51 -0800, Abhishek Kumar wrote:
>> Since using DELBA frame to APs to re-establish BA session has a
>> dependency on APs and also some APs may not honor the DELBA frame.
>
>
> That's completely out of spec ... Can you say which AP this was?
>
> You could also try sending a BAR that updates the SN.
>
> johannes

Regards,
Youghandhar
--
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation

2021-09-24 07:41:09

by Johannes Berg

Subject: Re: [PATCH 2/3] mac80211: Add support to trigger sta disconnect on hardware restart

On Fri, 2021-09-24 at 13:07 +0530, Youghandhar Chintala wrote:
> Hi Johannes and felix,
>
> We have tested with DELBA experiment during post SSR, DUT packet seq
> number and tx pn is resetting to 0 as expected but AP(Netgear R8000) is
> not honoring the tx pn from DUT.
> Whereas when we tested with DELBA experiment by making Linux android
> device as SAP and DUT as STA with which we don’t see any issue. Ping got
> resumed post SSR without disconnect.

Hm. That's a lot of data, and not a lot of explanation :)

I don't understand how DelBA and PN are related?

johannes

2021-09-24 09:22:39

by Youghandhar Chintala

Subject: Re: [PATCH 2/3] mac80211: Add support to trigger sta disconnect on hardware restart

Hi Johannes,

We initially thought sending the DELBA would solve the problem, but
the actual problem is with the TX PN in secure mode.
The sequence number and TX PN are not reset to zero because of the
DELBA; they are reset because of the HW restart.
Since the FW/HW decides the TX PN, all of these parameters are reset
when it goes through SSR.
The other peer, say an AP, knows nothing about the SSR on the peer
device. It expects the next TX PN to be the current PN + 1.
Since the TX PN starts from zero after SSR, the PN check at the AP
fails and it silently drops all the packets.

Regards,
Youghandhar
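
The drop pattern in the earlier capture (TID 0 fails after the SSR, TID 7 works) matches a receiver that keeps one replay counter per TID and discards any frame whose PN is not greater than the last one accepted on that TID. The sketch below is a minimal model of such a check, not the AP's implementation: `ccmp_rx` and `ccmp_rx_accept()` are hypothetical names, and the strictly-greater comparison is the assumption being illustrated.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NUM_TIDS 16u

/* Hypothetical receiver state: last accepted PN for each TID. */
struct ccmp_rx {
	uint64_t last_pn[NUM_TIDS];
};

/* Accept only a PN strictly greater than the last one seen on this TID. */
static bool ccmp_rx_accept(struct ccmp_rx *rx, unsigned int tid, uint64_t pn)
{
	assert(tid < NUM_TIDS);
	if (pn <= rx->last_pn[tid])
		return false;	/* treated as a replay: silently dropped */
	rx->last_pn[tid] = pn;
	return true;
}
```

Feeding it the PN values from the capture reproduces the observation: once TID 0 has seen 0x27, the post-restart values 0x1 through 0x4 are rejected, while TID 7, which carried no traffic before the restart, accepts 0x6 and 0x8.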

On 2021-09-24 13:09, Johannes Berg wrote:
> On Fri, 2021-09-24 at 13:07 +0530, Youghandhar Chintala wrote:
>> Hi Johannes and felix,
>>
>> We have tested with DELBA experiment during post SSR, DUT packet seq
>> number and tx pn is resetting to 0 as expected but AP(Netgear R8000) is
>> not honoring the tx pn from DUT.
>> Whereas when we tested with DELBA experiment by making Linux android
>> device as SAP and DUT as STA with which we don’t see any issue. Ping got
>> resumed post SSR without disconnect.
>
> Hm. That's a lot of data, and not a lot of explanation :)
>
> I don't understand how DelBA and PN are related?
>
> johannes

Regards,
Youghandhar

2021-09-24 11:09:29

by Johannes Berg

Subject: Re: [PATCH 2/3] mac80211: Add support to trigger sta disconnect on hardware restart

Hi,


> We thought sending the delba would solve the problem as earlier thought
> but the actual problem is with TX PN in a secure mode.
> It is not because of delba that the Seq number and TX PN are reset to
> zero.
> It’s because of the HW restart, these parameters are reset to zero.
> Since FW/HW is the one which decides the TX PN, when it goes through
> SSR, all these parameters are reset.

Right, we solved this problem too - in a sense the driver reads the
database (not just TX PN btw, also RX replay counters) when the firmware
crashes, and sends it back after the restart. mac80211 has some hooks
for that.

johannes
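
A rough picture of the save-and-restore idea described above, as a self-contained sketch. The names `fake_fw`, `pn_state`, `pn_snapshot()` and `pn_restore()` are hypothetical and do not correspond to the mac80211 hooks mentioned; the sketch also assumes the counters are still readable at crash time.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define NUM_TIDS 16u

/* Hypothetical copy of the counters that live in firmware. */
struct pn_state {
	uint64_t tx_pn;
	uint64_t rx_pn[NUM_TIDS];
};

struct fake_fw {
	struct pn_state pn;
};

/* Firmware assigns the next TX PN. */
static uint64_t fw_next_tx_pn(struct fake_fw *fw)
{
	return ++fw->pn.tx_pn;
}

/* Firmware records the latest RX replay counter for a TID. */
static void fw_note_rx_pn(struct fake_fw *fw, unsigned int tid, uint64_t pn)
{
	assert(tid < NUM_TIDS);
	fw->pn.rx_pn[tid] = pn;
}

/* A restart wipes all counters. */
static void fw_restart(struct fake_fw *fw)
{
	memset(&fw->pn, 0, sizeof(fw->pn));
}

/* Driver side: read the counters out when the crash is detected... */
static struct pn_state pn_snapshot(const struct fake_fw *fw)
{
	return fw->pn;
}

/* ...and write them back once the firmware is up again. */
static void pn_restore(struct fake_fw *fw, const struct pn_state *saved)
{
	fw->pn = *saved;
}
```

Without the restore step the first PN after the restart is 1; with it, numbering continues from the saved value, so the peer's replay check keeps passing.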


2021-10-05 20:27:22

by Jouni Malinen

Subject: Re: [PATCH 2/3] mac80211: Add support to trigger sta disconnect on hardware restart

On Fri, Sep 24, 2021 at 11:20:50AM +0200, Johannes Berg wrote:
> > We thought sending the delba would solve the problem as earlier thought
> > but the actual problem is with TX PN in a secure mode.
> > It is not because of delba that the Seq number and TX PN are reset to
> > zero.
> > It’s because of the HW restart, these parameters are reset to zero.
> > Since FW/HW is the one which decides the TX PN, when it goes through
> > SSR, all these parameters are reset.
>
> Right, we solved this problem too - in a sense the driver reads the
> database (not just TX PN btw, also RX replay counters) when the firmware
> crashes, and sending it back after the restart. mac80211 has some hooks
> for that.

This might be doable for some cases where the firmware is the component
assigning the PN values on TX and the firmware is still in a state
where the counter used for this could be fetched after a crash or
detected misbehavior. However, this does not sound like a very reliable
mechanism for cases where the firmware state for this cannot be trusted
or for the cases where the TX PN is actually assigned by the hardware
(which would get cleared on that restart and the value might be
unreadable before that restart). Trying to poll for this information
periodically before the issue is detected does not sound like a very
robust design either, since that would both waste resources and have a
race condition with the lower layers having transmitted additional
frames.

Obviously it would be nice to be able to restore this type of state in
all cases accurately, but that may not really be a viable approach for
all designs and it would seem to make sense to provide an alternative
approach to minimize the user visible impact from the rare cases of
having to restart some low level components during an association.

--
Jouni Malinen PGP id EFC895FA