2019-07-03 12:31:51

by Sergey Matyukevich

[permalink] [raw]
Subject: [RFC PATCH v2] cfg80211: fix duplicated scan entries after channel switch

When associated BSS completes channel switch procedure, its channel
record needs to be updated. The existing mac80211 solution was
extended to cfg80211 in commit 5dc8cdce1d72 ("mac80211/cfg80211:
update bss channel on channel switch")

However that solution still appears to be incomplete as it may lead
to duplicated scan entries for associated BSS after channel switch.
The root cause of the problem is as follows. Each BSS entry is
included into the following data structures:
- bss list rdev->bss_list
- bss search tree rdev->bss_tree
Updating BSS channel record without rebuilding bss_tree may break
tree search since cmp_bss considers all of the following: channel,
bssid, ssid. When BSS channel is updated, but its location in bss_tree
is not updated, then subsequent search operations may fail to locate
this BSS. As a result, for scan performed after associated BSS channel
switch, cfg80211_bss_update may add the second entry for the same BSS
to both bss_list and bss_tree, rather then update the existing one.

To summarize, if BSS channel needs to be updated, then bss_tree should
be rebuilt in order to put updated BSS entry into a proper location.

This commit suggests the following straightforward solution:
- if new entry has been already created for BSS after channel switch,
then use its IEs to update known BSS entry and then remove new
entry completely
- use rb_erase/rb_insert_bss reinstall updated BSS in bss_tree

Signed-off-by: Sergey Matyukevich <[email protected]>

---

v1 -> v2
- use IEs of new BSS entry to update known BSS entry
for this purpose extract BSS update code from cfg80211_bss_update
into a separate function cfg80211_update_known_bss

Tested on both iwlwifi and qtnfmac.

From my perspective, primary reason for RFC tag is nontrans_list bss handling.
I am not sure whether nontrans_bss should be removed for new entry or not.
The approach varies between cfg80211 API functions. For instance,
cfg80211_unlink_bss removes them, while cfg80211_bss_expire does not.
I would appreciate any comments in this regard.

Regards,
Sergey

---
net/wireless/core.h | 2 +
net/wireless/nl80211.c | 5 +-
net/wireless/scan.c | 221 +++++++++++++++++++++++++++++++------------------
3 files changed, 144 insertions(+), 84 deletions(-)

diff --git a/net/wireless/core.h b/net/wireless/core.h
index ee8388fe4a92..77556c58d9ac 100644
--- a/net/wireless/core.h
+++ b/net/wireless/core.h
@@ -306,6 +306,8 @@ void ieee80211_set_bitrate_flags(struct wiphy *wiphy);
void cfg80211_bss_expire(struct cfg80211_registered_device *rdev);
void cfg80211_bss_age(struct cfg80211_registered_device *rdev,
unsigned long age_secs);
+void cfg80211_update_assoc_bss_entry(struct wireless_dev *wdev,
+ struct ieee80211_channel *channel);

/* IBSS */
int __cfg80211_join_ibss(struct cfg80211_registered_device *rdev,
diff --git a/net/wireless/nl80211.c b/net/wireless/nl80211.c
index fc83dd179c1a..9bc4999a8c59 100644
--- a/net/wireless/nl80211.c
+++ b/net/wireless/nl80211.c
@@ -16091,8 +16091,9 @@ void cfg80211_ch_switch_notify(struct net_device *dev,
wdev->preset_chandef = *chandef;

if (wdev->iftype == NL80211_IFTYPE_STATION &&
- !WARN_ON(!wdev->current_bss))
- wdev->current_bss->pub.channel = chandef->chan;
+ !WARN_ON(!wdev->current_bss)) {
+ cfg80211_update_assoc_bss_entry(wdev, chandef->chan);
+ }

nl80211_ch_switch_notify(rdev, dev, chandef, GFP_KERNEL,
NL80211_CMD_CH_SWITCH_NOTIFY, 0);
diff --git a/net/wireless/scan.c b/net/wireless/scan.c
index d66e6d4b7555..53c07f69365d 100644
--- a/net/wireless/scan.c
+++ b/net/wireless/scan.c
@@ -1091,6 +1091,93 @@ struct cfg80211_non_tx_bss {
u8 bssid_index;
};

+static bool
+cfg80211_update_known_bss(struct cfg80211_registered_device *rdev,
+ struct cfg80211_internal_bss *known,
+ struct cfg80211_internal_bss *new,
+ bool signal_valid)
+{
+ lockdep_assert_held(&rdev->bss_lock);
+
+ /* Update IEs */
+ if (rcu_access_pointer(new->pub.proberesp_ies)) {
+ const struct cfg80211_bss_ies *old;
+
+ old = rcu_access_pointer(known->pub.proberesp_ies);
+
+ rcu_assign_pointer(known->pub.proberesp_ies,
+ new->pub.proberesp_ies);
+ /* Override possible earlier Beacon frame IEs */
+ rcu_assign_pointer(known->pub.ies,
+ new->pub.proberesp_ies);
+ if (old)
+ kfree_rcu((struct cfg80211_bss_ies *)old, rcu_head);
+ } else if (rcu_access_pointer(new->pub.beacon_ies)) {
+ const struct cfg80211_bss_ies *old;
+ struct cfg80211_internal_bss *bss;
+
+ if (known->pub.hidden_beacon_bss &&
+ !list_empty(&known->hidden_list)) {
+ const struct cfg80211_bss_ies *f;
+
+ /* The known BSS struct is one of the probe
+ * response members of a group, but we're
+ * receiving a beacon (beacon_ies in the new
+ * bss is used). This can only mean that the
+ * AP changed its beacon from not having an
+ * SSID to showing it, which is confusing so
+ * drop this information.
+ */
+
+ f = rcu_access_pointer(new->pub.beacon_ies);
+ kfree_rcu((struct cfg80211_bss_ies *)f, rcu_head);
+ return false;
+ }
+
+ old = rcu_access_pointer(known->pub.beacon_ies);
+
+ rcu_assign_pointer(known->pub.beacon_ies, new->pub.beacon_ies);
+
+ /* Override IEs if they were from a beacon before */
+ if (old == rcu_access_pointer(known->pub.ies))
+ rcu_assign_pointer(known->pub.ies, new->pub.beacon_ies);
+
+ /* Assign beacon IEs to all sub entries */
+ list_for_each_entry(bss, &known->hidden_list, hidden_list) {
+ const struct cfg80211_bss_ies *ies;
+
+ ies = rcu_access_pointer(bss->pub.beacon_ies);
+ WARN_ON(ies != old);
+
+ rcu_assign_pointer(bss->pub.beacon_ies,
+ new->pub.beacon_ies);
+ }
+
+ if (old)
+ kfree_rcu((struct cfg80211_bss_ies *)old, rcu_head);
+ }
+
+ known->pub.beacon_interval = new->pub.beacon_interval;
+
+ /* don't update the signal if beacon was heard on
+ * adjacent channel.
+ */
+ if (signal_valid)
+ known->pub.signal = new->pub.signal;
+ known->pub.capability = new->pub.capability;
+ known->ts = new->ts;
+ known->ts_boottime = new->ts_boottime;
+ known->parent_tsf = new->parent_tsf;
+ known->pub.chains = new->pub.chains;
+ memcpy(known->pub.chain_signal, new->pub.chain_signal,
+ IEEE80211_MAX_CHAINS);
+ ether_addr_copy(known->parent_bssid, new->parent_bssid);
+ known->pub.max_bssid_indicator = new->pub.max_bssid_indicator;
+ known->pub.bssid_index = new->pub.bssid_index;
+
+ return true;
+}
+
/* Returned bss is reference counted and must be cleaned up appropriately. */
struct cfg80211_internal_bss *
cfg80211_bss_update(struct cfg80211_registered_device *rdev,
@@ -1114,88 +1201,8 @@ cfg80211_bss_update(struct cfg80211_registered_device *rdev,
found = rb_find_bss(rdev, tmp, BSS_CMP_REGULAR);

if (found) {
- /* Update IEs */
- if (rcu_access_pointer(tmp->pub.proberesp_ies)) {
- const struct cfg80211_bss_ies *old;
-
- old = rcu_access_pointer(found->pub.proberesp_ies);
-
- rcu_assign_pointer(found->pub.proberesp_ies,
- tmp->pub.proberesp_ies);
- /* Override possible earlier Beacon frame IEs */
- rcu_assign_pointer(found->pub.ies,
- tmp->pub.proberesp_ies);
- if (old)
- kfree_rcu((struct cfg80211_bss_ies *)old,
- rcu_head);
- } else if (rcu_access_pointer(tmp->pub.beacon_ies)) {
- const struct cfg80211_bss_ies *old;
- struct cfg80211_internal_bss *bss;
-
- if (found->pub.hidden_beacon_bss &&
- !list_empty(&found->hidden_list)) {
- const struct cfg80211_bss_ies *f;
-
- /*
- * The found BSS struct is one of the probe
- * response members of a group, but we're
- * receiving a beacon (beacon_ies in the tmp
- * bss is used). This can only mean that the
- * AP changed its beacon from not having an
- * SSID to showing it, which is confusing so
- * drop this information.
- */
-
- f = rcu_access_pointer(tmp->pub.beacon_ies);
- kfree_rcu((struct cfg80211_bss_ies *)f,
- rcu_head);
- goto drop;
- }
-
- old = rcu_access_pointer(found->pub.beacon_ies);
-
- rcu_assign_pointer(found->pub.beacon_ies,
- tmp->pub.beacon_ies);
-
- /* Override IEs if they were from a beacon before */
- if (old == rcu_access_pointer(found->pub.ies))
- rcu_assign_pointer(found->pub.ies,
- tmp->pub.beacon_ies);
-
- /* Assign beacon IEs to all sub entries */
- list_for_each_entry(bss, &found->hidden_list,
- hidden_list) {
- const struct cfg80211_bss_ies *ies;
-
- ies = rcu_access_pointer(bss->pub.beacon_ies);
- WARN_ON(ies != old);
-
- rcu_assign_pointer(bss->pub.beacon_ies,
- tmp->pub.beacon_ies);
- }
-
- if (old)
- kfree_rcu((struct cfg80211_bss_ies *)old,
- rcu_head);
- }
-
- found->pub.beacon_interval = tmp->pub.beacon_interval;
- /*
- * don't update the signal if beacon was heard on
- * adjacent channel.
- */
- if (signal_valid)
- found->pub.signal = tmp->pub.signal;
- found->pub.capability = tmp->pub.capability;
- found->ts = tmp->ts;
- found->ts_boottime = tmp->ts_boottime;
- found->parent_tsf = tmp->parent_tsf;
- found->pub.chains = tmp->pub.chains;
- memcpy(found->pub.chain_signal, tmp->pub.chain_signal,
- IEEE80211_MAX_CHAINS);
- ether_addr_copy(found->parent_bssid, tmp->parent_bssid);
- found->pub.max_bssid_indicator = tmp->pub.max_bssid_indicator;
- found->pub.bssid_index = tmp->pub.bssid_index;
+ if (!cfg80211_update_known_bss(rdev, found, tmp, signal_valid))
+ goto drop;
} else {
struct cfg80211_internal_bss *new;
struct cfg80211_internal_bss *hidden;
@@ -1995,6 +2002,56 @@ void cfg80211_bss_iter(struct wiphy *wiphy,
}
EXPORT_SYMBOL(cfg80211_bss_iter);

+void cfg80211_update_assoc_bss_entry(struct wireless_dev *wdev,
+ struct ieee80211_channel *chan)
+{
+ struct wiphy *wiphy = wdev->wiphy;
+ struct cfg80211_registered_device *rdev = wiphy_to_rdev(wiphy);
+ struct cfg80211_internal_bss *cbss = wdev->current_bss;
+ struct cfg80211_internal_bss *new = NULL;
+ struct cfg80211_internal_bss *bss;
+
+ spin_lock_bh(&rdev->bss_lock);
+
+ if (WARN_ON(cbss->pub.channel == chan))
+ goto done;
+
+ cbss->pub.channel = chan;
+ cbss->ts = jiffies;
+
+ list_for_each_entry(bss, &rdev->bss_list, list) {
+ if (!cfg80211_bss_type_match(bss->pub.capability,
+ bss->pub.channel->band,
+ wdev->conn_bss_type))
+ continue;
+
+ if (bss == cbss)
+ continue;
+
+ if (!cmp_bss(&bss->pub, &cbss->pub, BSS_CMP_REGULAR)) {
+ new = bss;
+ break;
+ }
+ }
+
+ if (new) {
+ if (cfg80211_update_known_bss(rdev, cbss, new, false)) {
+ new->pub.proberesp_ies = NULL;
+ new->pub.beacon_ies = NULL;
+ }
+
+ WARN_ON(atomic_read(&new->hold));
+ WARN_ON(!__cfg80211_unlink_bss(rdev, new));
+ }
+
+ rb_erase(&cbss->rbn, &rdev->bss_tree);
+ rb_insert_bss(rdev, cbss);
+ rdev->bss_generation++;
+
+done:
+ spin_unlock_bh(&rdev->bss_lock);
+}
+
#ifdef CONFIG_CFG80211_WEXT
static struct cfg80211_registered_device *
cfg80211_get_dev_from_ifindex(struct net *net, int ifindex)
--
2.11.0


2019-07-03 13:38:47

by Johannes Berg

[permalink] [raw]
Subject: Re: [RFC PATCH v2] cfg80211: fix duplicated scan entries after channel switch

On Wed, 2019-07-03 at 12:28 +0000, Sergey Matyukevich wrote:
> When associated BSS completes channel switch procedure, its channel
> record needs to be updated. The existing mac80211 solution was
> extended to cfg80211 in commit 5dc8cdce1d72 ("mac80211/cfg80211:
> update bss channel on channel switch")
>
> However that solution still appears to be incomplete as it may lead
> to duplicated scan entries for associated BSS after channel switch.
> The root cause of the problem is as follows. Each BSS entry is
> included into the following data structures:
> - bss list rdev->bss_list
> - bss search tree rdev->bss_tree
> Updating BSS channel record without rebuilding bss_tree may break
> tree search since cmp_bss considers all of the following: channel,
> bssid, ssid. When BSS channel is updated, but its location in bss_tree
> is not updated, then subsequent search operations may fail to locate
> this BSS. As a result, for scan performed after associated BSS channel
> switch, cfg80211_bss_update may add the second entry for the same BSS
> to both bss_list and bss_tree, rather then update the existing one.
>
> To summarize, if BSS channel needs to be updated, then bss_tree should
> be rebuilt in order to put updated BSS entry into a proper location.
>
> This commit suggests the following straightforward solution:
> - if new entry has been already created for BSS after channel switch,
> then use its IEs to update known BSS entry and then remove new
> entry completely
> - use rb_erase/rb_insert_bss reinstall updated BSS in bss_tree
>
> Signed-off-by: Sergey Matyukevich <[email protected]>
>
> ---
>
> v1 -> v2
> - use IEs of new BSS entry to update known BSS entry
> for this purpose extract BSS update code from cfg80211_bss_update
> into a separate function cfg80211_update_known_bss
>
> Tested on both iwlwifi and qtnfmac.

Looks good to me, a few nitpicks below.

> From my perspective, primary reason for RFC tag is nontrans_list bss handling.
> I am not sure whether nontrans_bss should be removed for new entry or not.
> The approach varies between cfg80211 API functions. For instance,
> cfg80211_unlink_bss removes them, while cfg80211_bss_expire does not.
> I would appreciate any comments in this regard.

Hmm. Good question. Ideally, if we detect CSA on any of them, we'd move
*all* of the nontrans BSSes and their transmitting BSS? Because really
that's what must happen, since a non-transmitting BSS cannot change
channel without its transmitting BSS, and vice versa.

Btw, I think expire would also remove them, since they all should always
have the same timestamp? Actually, they don't, if we jiffies changed
while we updated ... I guess we should fix that.

https://p.sipsolutions.net/09da5943735d7983.txt


> if (wdev->iftype == NL80211_IFTYPE_STATION &&
> - !WARN_ON(!wdev->current_bss))
> - wdev->current_bss->pub.channel = chandef->chan;
> + !WARN_ON(!wdev->current_bss)) {
> + cfg80211_update_assoc_bss_entry(wdev, chandef->chan);
> + }

don't really need the braces, I suppose you previously had more code
here and refactored it? :)

> +static bool
> +cfg80211_update_known_bss(struct cfg80211_registered_device *rdev,
> + struct cfg80211_internal_bss *known,
> + struct cfg80211_internal_bss *new,
> + bool signal_valid)

Maybe split out this refactoring to a separate patch, or at least add a
small statement to the commit log saying that it's just broken out of
the function.

> +void cfg80211_update_assoc_bss_entry(struct wireless_dev *wdev,
> + struct ieee80211_channel *chan)
> +{
>
[...]

> + cbss->pub.channel = chan;
> + cbss->ts = jiffies;

Not entirely sure we should set the timestamp here, you're going to have
an interesting situation where if you have a "new" entry found in the
loop below, you would actually have a *lower* timestamp?

Then again, we surely know that we still have valid data now, so I guess
that's fine. Doesn't matter that much anyway.

So yeah, looks fine to me.

johannes