2013-12-04 21:30:15

by Johannes Berg

[permalink] [raw]
Subject: question for mac80211 driver maintainers - RCU usage in drivers

Hi all,

Except for iwlmvm, I don't find much RCU usage wrt. stations in drivers.

Is there any other driver that assumes it is safe to delete a station
pointer in the sta_state callback and not use synchronize_rcu()? From
looking at the code, I don't see any, but I can't really be sure that
everyone uses __rcu annotations correctly ... :)

Would anyone object if we changed mac80211 to *immediately* free the
station after calling the driver's sta_state (or sta_remove) callback?
We currently delay this until after an RCU grace period, but that way we
end up having a lot of delay in station freeing ... We'd like to
optimise that.

johannes

PS: I'll probably have to add another callback "sta going away before
RCU" so you can invalidate pointers there ... otherwise I'd have to
synchronize_rcu() in iwlmvm which would kinda defeat the purpose.



2013-12-05 12:36:05

by Johannes Berg

[permalink] [raw]
Subject: [PATCH 1/8] mac80211: add pre-RCU-sync sta removal driver operation

From: Johannes Berg <[email protected]>

Currently, mac80211 allows drivers to keep RCU-protected station
references that are cleared when the station is removed from the
driver and consequently needs to synchronize twice, once before
removing the station from the driver (so it can guarantee that
the station is no longer used in TX towards the driver) and once
after the station is removed from the driver.

Add a new pre-RCU-synchronisation station removal operation to
the API to allow drivers to clear/invalidate their RCU-protected
station pointers before the RCU synchronisation.

This will allow removing the second synchronisation by changing
the driver API so that the driver may no longer assume a valid
RCU-protected pointer after sta_remove/sta_state returns.

The alternative to this would be to synchronize_rcu() in all the
drivers that currently rely on this behaviour (only iwlmvm) but
that would defeat the purpose.

Signed-off-by: Johannes Berg <[email protected]>
---
include/net/mac80211.h | 21 +++++++++++++++++++--
net/mac80211/driver-ops.h | 16 ++++++++++++++++
net/mac80211/sta_info.c | 2 ++
net/mac80211/trace.h | 34 +++++++++++++++-------------------
4 files changed, 52 insertions(+), 21 deletions(-)

diff --git a/include/net/mac80211.h b/include/net/mac80211.h
index c014acc..22b6dd9 100644
--- a/include/net/mac80211.h
+++ b/include/net/mac80211.h
@@ -2495,7 +2495,11 @@ enum ieee80211_roc_type {
* AP, IBSS/WDS/mesh peer etc. This callback can sleep.
*
* @sta_remove: Notifies low level driver about removal of an associated
- * station, AP, IBSS/WDS/mesh peer etc. This callback can sleep.
+ * station, AP, IBSS/WDS/mesh peer etc. Note that after the callback
+ * returns it isn't safe to use the pointer, not even RCU protected;
+ * no RCU grace period is guaranteed between returning here and freeing
+ * the station. See @sta_pre_rcu_remove if needed.
+ * This callback can sleep.
*
* @sta_add_debugfs: Drivers can use this callback to add debugfs files
* when a station is added to mac80211's station list. This callback
@@ -2514,7 +2518,17 @@ enum ieee80211_roc_type {
* station (which can be the AP, a client, IBSS/WDS/mesh peer etc.)
* This callback is mutually exclusive with @sta_add/@sta_remove.
* It must not fail for down transitions but may fail for transitions
- * up the list of states.
+ * up the list of states. Also note that after the callback returns it
+ * isn't safe to use the pointer, not even RCU protected - no RCU grace
+ * period is guaranteed between returning here and freeing the station.
+ * See @sta_pre_rcu_remove if needed.
+ * The callback can sleep.
+ *
+ * @sta_pre_rcu_remove: Notify driver about station removal before RCU
+ * synchronisation. This is useful if a driver needs to have station
+ * pointers protected using RCU, it can then use this call to clear
+ * the pointers instead of waiting for an RCU grace period to elapse
+ * in @sta_state.
* The callback can sleep.
*
* @sta_rc_update: Notifies the driver of changes to the bitrates that can be
@@ -2827,6 +2841,9 @@ struct ieee80211_ops {
struct ieee80211_sta *sta,
enum ieee80211_sta_state old_state,
enum ieee80211_sta_state new_state);
+ void (*sta_pre_rcu_remove)(struct ieee80211_hw *hw,
+ struct ieee80211_vif *vif,
+ struct ieee80211_sta *sta);
void (*sta_rc_update)(struct ieee80211_hw *hw,
struct ieee80211_vif *vif,
struct ieee80211_sta *sta,
diff --git a/net/mac80211/driver-ops.h b/net/mac80211/driver-ops.h
index f98059a..ef8b385 100644
--- a/net/mac80211/driver-ops.h
+++ b/net/mac80211/driver-ops.h
@@ -534,6 +534,22 @@ static inline void drv_sta_remove_debugfs(struct ieee80211_local *local,
}
#endif

+static inline void drv_sta_pre_rcu_remove(struct ieee80211_local *local,
+ struct ieee80211_sub_if_data *sdata,
+ struct sta_info *sta)
+{
+ might_sleep();
+
+ sdata = get_bss_sdata(sdata);
+ check_sdata_in_driver(sdata);
+
+ trace_drv_sta_pre_rcu_remove(local, sdata, &sta->sta);
+ if (local->ops->sta_pre_rcu_remove)
+ local->ops->sta_pre_rcu_remove(&local->hw, &sdata->vif,
+ &sta->sta);
+ trace_drv_return_void(local);
+}
+
static inline __must_check
int drv_sta_state(struct ieee80211_local *local,
struct ieee80211_sub_if_data *sdata,
diff --git a/net/mac80211/sta_info.c b/net/mac80211/sta_info.c
index 8ae37f6..1e14774 100644
--- a/net/mac80211/sta_info.c
+++ b/net/mac80211/sta_info.c
@@ -873,6 +873,8 @@ int __must_check __sta_info_destroy(struct sta_info *sta)

list_del_rcu(&sta->list);

+ drv_sta_pre_rcu_remove(local, sta->sdata, sta);
+
/* this always calls synchronize_net() */
ieee80211_free_sta_keys(local, sta);

diff --git a/net/mac80211/trace.h b/net/mac80211/trace.h
index 9d213e8..dc7d745 100644
--- a/net/mac80211/trace.h
+++ b/net/mac80211/trace.h
@@ -766,7 +766,7 @@ TRACE_EVENT(drv_sta_rc_update,
)
);

-TRACE_EVENT(drv_sta_add,
+DECLARE_EVENT_CLASS(sta_event,
TP_PROTO(struct ieee80211_local *local,
struct ieee80211_sub_if_data *sdata,
struct ieee80211_sta *sta),
@@ -791,29 +791,25 @@ TRACE_EVENT(drv_sta_add,
)
);

-TRACE_EVENT(drv_sta_remove,
+DEFINE_EVENT(sta_event, drv_sta_add,
TP_PROTO(struct ieee80211_local *local,
struct ieee80211_sub_if_data *sdata,
struct ieee80211_sta *sta),
+ TP_ARGS(local, sdata, sta)
+);

- TP_ARGS(local, sdata, sta),
-
- TP_STRUCT__entry(
- LOCAL_ENTRY
- VIF_ENTRY
- STA_ENTRY
- ),
-
- TP_fast_assign(
- LOCAL_ASSIGN;
- VIF_ASSIGN;
- STA_ASSIGN;
- ),
+DEFINE_EVENT(sta_event, drv_sta_remove,
+ TP_PROTO(struct ieee80211_local *local,
+ struct ieee80211_sub_if_data *sdata,
+ struct ieee80211_sta *sta),
+ TP_ARGS(local, sdata, sta)
+);

- TP_printk(
- LOCAL_PR_FMT VIF_PR_FMT STA_PR_FMT,
- LOCAL_PR_ARG, VIF_PR_ARG, STA_PR_ARG
- )
+DEFINE_EVENT(sta_event, drv_sta_pre_rcu_remove,
+ TP_PROTO(struct ieee80211_local *local,
+ struct ieee80211_sub_if_data *sdata,
+ struct ieee80211_sta *sta),
+ TP_ARGS(local, sdata, sta)
);

TRACE_EVENT(drv_conf_tx,
--
1.8.4.rc3


2013-12-05 12:36:04

by Johannes Berg

[permalink] [raw]
Subject: [PATCH 4/8] mac80211: don't delay station destruction

From: Johannes Berg <[email protected]>

If we can assume that stations are never referenced by the
driver after sta_state returns (and this is true since the
previous iwlmvm patch and for all other drivers) then we
don't need to delay station destruction, and don't need to
play tricks with rcu_barrier() etc.

This should speed up some scenarios like hostapd shutdown.

Signed-off-by: Johannes Berg <[email protected]>
---
net/mac80211/cfg.c | 10 +++-----
net/mac80211/ieee80211_i.h | 4 ----
net/mac80211/iface.c | 27 ++++-----------------
net/mac80211/mlme.c | 2 +-
net/mac80211/pm.c | 3 +--
net/mac80211/sta_info.c | 58 ++--------------------------------------------
net/mac80211/sta_info.h | 27 +--------------------
7 files changed, 12 insertions(+), 119 deletions(-)

diff --git a/net/mac80211/cfg.c b/net/mac80211/cfg.c
index 754069c..f47c7c0 100644
--- a/net/mac80211/cfg.c
+++ b/net/mac80211/cfg.c
@@ -1098,15 +1098,11 @@ static int ieee80211_stop_ap(struct wiphy *wiphy, struct net_device *dev)
kfree_rcu(old_probe_resp, rcu_head);

list_for_each_entry(vlan, &sdata->u.ap.vlans, u.vlan.list)
- sta_info_flush_defer(vlan);
- sta_info_flush_defer(sdata);
+ sta_info_flush(vlan);
+ sta_info_flush(sdata);
synchronize_net();
- rcu_barrier();
- list_for_each_entry(vlan, &sdata->u.ap.vlans, u.vlan.list) {
- sta_info_flush_cleanup(vlan);
+ list_for_each_entry(vlan, &sdata->u.ap.vlans, u.vlan.list)
ieee80211_free_keys(vlan);
- }
- sta_info_flush_cleanup(sdata);
ieee80211_free_keys(sdata);

sdata->vif.bss_conf.enable_beacon = false;
diff --git a/net/mac80211/ieee80211_i.h b/net/mac80211/ieee80211_i.h
index 8cd84ce..2251111 100644
--- a/net/mac80211/ieee80211_i.h
+++ b/net/mac80211/ieee80211_i.h
@@ -800,10 +800,6 @@ struct ieee80211_sub_if_data {
u32 mntr_flags;
} u;

- spinlock_t cleanup_stations_lock;
- struct list_head cleanup_stations;
- struct work_struct cleanup_stations_wk;
-
#ifdef CONFIG_MAC80211_DEBUGFS
struct {
struct dentry *subdir_stations;
diff --git a/net/mac80211/iface.c b/net/mac80211/iface.c
index 3e5c89a..c5e9a11 100644
--- a/net/mac80211/iface.c
+++ b/net/mac80211/iface.c
@@ -786,10 +786,8 @@ static void ieee80211_do_stop(struct ieee80211_sub_if_data *sdata,
* This is relevant only in WDS mode, in all other modes we've
* already removed all stations when disconnecting or similar,
* so warn otherwise.
- *
- * We call sta_info_flush_cleanup() later, to combine RCU waits.
*/
- flushed = sta_info_flush_defer(sdata);
+ flushed = sta_info_flush(sdata);
WARN_ON_ONCE((sdata->vif.type != NL80211_IFTYPE_WDS && flushed > 0) ||
(sdata->vif.type == NL80211_IFTYPE_WDS && flushed != 1));

@@ -892,16 +890,11 @@ static void ieee80211_do_stop(struct ieee80211_sub_if_data *sdata,
/*
* When we get here, the interface is marked down.
*
- * sta_info_flush_cleanup() requires rcu_barrier()
- * first to wait for the station call_rcu() calls
- * to complete, and we also need synchronize_rcu()
- * to wait for the RX path in case it is using the
- * interface and enqueuing frames at this very time on
- * another CPU.
+ * We need synchronize_rcu() to wait for the RX path in
+ * case it is using the interface and enqueuing frames
+ * at this very time on another CPU.
*/
synchronize_rcu();
- rcu_barrier();
- sta_info_flush_cleanup(sdata);

/*
* Free all remaining keys, there shouldn't be any,
@@ -1569,15 +1562,6 @@ static void ieee80211_assign_perm_addr(struct ieee80211_local *local,
mutex_unlock(&local->iflist_mtx);
}

-static void ieee80211_cleanup_sdata_stas_wk(struct work_struct *wk)
-{
- struct ieee80211_sub_if_data *sdata;
-
- sdata = container_of(wk, struct ieee80211_sub_if_data, cleanup_stations_wk);
-
- ieee80211_cleanup_sdata_stas(sdata);
-}
-
int ieee80211_if_add(struct ieee80211_local *local, const char *name,
struct wireless_dev **new_wdev, enum nl80211_iftype type,
struct vif_params *params)
@@ -1650,9 +1634,6 @@ int ieee80211_if_add(struct ieee80211_local *local, const char *name,

INIT_LIST_HEAD(&sdata->key_list);

- spin_lock_init(&sdata->cleanup_stations_lock);
- INIT_LIST_HEAD(&sdata->cleanup_stations);
- INIT_WORK(&sdata->cleanup_stations_wk, ieee80211_cleanup_sdata_stas_wk);
INIT_DELAYED_WORK(&sdata->dfs_cac_timer_work,
ieee80211_dfs_cac_timer_work);
INIT_DELAYED_WORK(&sdata->dec_tailroom_needed_wk,
diff --git a/net/mac80211/mlme.c b/net/mac80211/mlme.c
index dcec1bd..37b09d2 100644
--- a/net/mac80211/mlme.c
+++ b/net/mac80211/mlme.c
@@ -1698,7 +1698,7 @@ static void ieee80211_set_disassoc(struct ieee80211_sub_if_data *sdata,
memset(ifmgd->bssid, 0, ETH_ALEN);

/* remove AP and TDLS peers */
- sta_info_flush_defer(sdata);
+ sta_info_flush(sdata);

/* finally reset all BSS / config parameters */
changed |= ieee80211_reset_erp_info(sdata);
diff --git a/net/mac80211/pm.c b/net/mac80211/pm.c
index 3401262..af64fb8 100644
--- a/net/mac80211/pm.c
+++ b/net/mac80211/pm.c
@@ -37,9 +37,8 @@ int __ieee80211_suspend(struct ieee80211_hw *hw, struct cfg80211_wowlan *wowlan)
IEEE80211_MAX_QUEUE_MAP,
IEEE80211_QUEUE_STOP_REASON_SUSPEND);

- /* flush out all packets and station cleanup call_rcu()s */
+ /* flush out all packets */
synchronize_net();
- rcu_barrier();

ieee80211_flush_queues(local, NULL);

diff --git a/net/mac80211/sta_info.c b/net/mac80211/sta_info.c
index 557ac25..7241f322 100644
--- a/net/mac80211/sta_info.c
+++ b/net/mac80211/sta_info.c
@@ -99,23 +99,6 @@ static void cleanup_single_sta(struct sta_info *sta)
struct ieee80211_local *local = sdata->local;
struct ps_data *ps;

- /*
- * At this point, when being called as call_rcu callback,
- * neither mac80211 nor the driver can reference this
- * sta struct any more except by still existing timers
- * associated with this station that we clean up below.
- *
- * Note though that this still uses the sdata and even
- * calls the driver in AP and mesh mode, so interfaces
- * of those types mush use call sta_info_flush_cleanup()
- * (typically via sta_info_flush()) before deconfiguring
- * the driver.
- *
- * In station mode, nothing happens here so it doesn't
- * have to (and doesn't) do that, this is intentional to
- * speed up roaming.
- */
-
if (test_sta_flag(sta, WLAN_STA_PS_STA)) {
if (sta->sdata->vif.type == NL80211_IFTYPE_AP ||
sta->sdata->vif.type == NL80211_IFTYPE_AP_VLAN)
@@ -160,37 +143,6 @@ static void cleanup_single_sta(struct sta_info *sta)
sta_info_free(local, sta);
}

-void ieee80211_cleanup_sdata_stas(struct ieee80211_sub_if_data *sdata)
-{
- struct sta_info *sta;
-
- spin_lock_bh(&sdata->cleanup_stations_lock);
- while (!list_empty(&sdata->cleanup_stations)) {
- sta = list_first_entry(&sdata->cleanup_stations,
- struct sta_info, list);
- list_del(&sta->list);
- spin_unlock_bh(&sdata->cleanup_stations_lock);
-
- cleanup_single_sta(sta);
-
- spin_lock_bh(&sdata->cleanup_stations_lock);
- }
-
- spin_unlock_bh(&sdata->cleanup_stations_lock);
-}
-
-static void free_sta_rcu(struct rcu_head *h)
-{
- struct sta_info *sta = container_of(h, struct sta_info, rcu_head);
- struct ieee80211_sub_if_data *sdata = sta->sdata;
-
- spin_lock(&sdata->cleanup_stations_lock);
- list_add_tail(&sta->list, &sdata->cleanup_stations);
- spin_unlock(&sdata->cleanup_stations_lock);
-
- ieee80211_queue_work(&sdata->local->hw, &sdata->cleanup_stations_wk);
-}
-
/* protected by RCU */
struct sta_info *sta_info_get(struct ieee80211_sub_if_data *sdata,
const u8 *addr)
@@ -909,7 +861,7 @@ int __must_check __sta_info_destroy(struct sta_info *sta)
ieee80211_sta_debugfs_remove(sta);
ieee80211_recalc_min_chandef(sdata);

- call_rcu(&sta->rcu_head, free_sta_rcu);
+ cleanup_single_sta(sta);

return 0;
}
@@ -979,7 +931,7 @@ void sta_info_stop(struct ieee80211_local *local)
}


-int sta_info_flush_defer(struct ieee80211_sub_if_data *sdata)
+int sta_info_flush(struct ieee80211_sub_if_data *sdata)
{
struct ieee80211_local *local = sdata->local;
struct sta_info *sta, *tmp;
@@ -999,12 +951,6 @@ int sta_info_flush_defer(struct ieee80211_sub_if_data *sdata)
return ret;
}

-void sta_info_flush_cleanup(struct ieee80211_sub_if_data *sdata)
-{
- ieee80211_cleanup_sdata_stas(sdata);
- cancel_work_sync(&sdata->cleanup_stations_wk);
-}
-
void ieee80211_sta_expire(struct ieee80211_sub_if_data *sdata,
unsigned long exp_time)
{
diff --git a/net/mac80211/sta_info.h b/net/mac80211/sta_info.h
index 0218caf..9104f81 100644
--- a/net/mac80211/sta_info.h
+++ b/net/mac80211/sta_info.h
@@ -605,21 +605,6 @@ void sta_info_recalc_tim(struct sta_info *sta);

void sta_info_init(struct ieee80211_local *local);
void sta_info_stop(struct ieee80211_local *local);
-int sta_info_flush_defer(struct ieee80211_sub_if_data *sdata);
-
-/**
- * sta_info_flush_cleanup - flush the sta_info cleanup queue
- * @sdata: the interface
- *
- * Flushes the sta_info cleanup queue for a given interface;
- * this is necessary before the interface is removed or, for
- * AP/mesh interfaces, before it is deconfigured.
- *
- * Note an rcu_barrier() must precede the function, after all
- * stations have been flushed/removed to ensure the call_rcu()
- * calls that add stations to the cleanup queue have completed.
- */
-void sta_info_flush_cleanup(struct ieee80211_sub_if_data *sdata);

/**
* sta_info_flush - flush matching STA entries from the STA table
@@ -628,15 +613,7 @@ void sta_info_flush_cleanup(struct ieee80211_sub_if_data *sdata);
*
* @sdata: sdata to remove all stations from
*/
-static inline int sta_info_flush(struct ieee80211_sub_if_data *sdata)
-{
- int ret = sta_info_flush_defer(sdata);
-
- rcu_barrier();
- sta_info_flush_cleanup(sdata);
-
- return ret;
-}
+int sta_info_flush(struct ieee80211_sub_if_data *sdata);

void sta_set_rate_info_tx(struct sta_info *sta,
const struct ieee80211_tx_rate *rate,
@@ -651,6 +628,4 @@ void ieee80211_sta_ps_deliver_wakeup(struct sta_info *sta);
void ieee80211_sta_ps_deliver_poll_response(struct sta_info *sta);
void ieee80211_sta_ps_deliver_uapsd(struct sta_info *sta);

-void ieee80211_cleanup_sdata_stas(struct ieee80211_sub_if_data *sdata);
-
#endif /* STA_INFO_H */
--
1.8.4.rc3


2013-12-04 22:51:54

by Johannes Berg

[permalink] [raw]
Subject: Re: question for mac80211 driver maintainers - RCU usage in drivers

On Wed, 2013-12-04 at 23:47 +0100, Christian Lamparter wrote:
> On Wednesday, December 04, 2013 11:23:49 PM Johannes Berg wrote:
> > On Wed, 2013-12-04 at 23:16 +0100, Christian Lamparter wrote:
> > > > Is there any other driver that assumes it is safe to delete a station
> > > > pointer in the sta_state callback and not use synchronize_rcu()? From
> > > > looking at the code, I don't see any, but I can't really be sure that
> > > > everyone uses __rcu annotations correctly ... :)
> >
> > > No, carl9170 doesn't use sta_state but it uses sta_remove.
> >
> > Yeah, I actually saw this. But I think you use it for aggregation
> > sessions only? And the code didn't look like it could still get to the
> > station pointer after it was removed with sta_remove() callback, but I
> > may be wrong.
> no, you are not wrong. The code gets its sta pointers either from
> callback-parameters or via ieee80211_get_sta. [But would it even be
> possible to do it any other way? After all, this should crash "sooner
> or later" otherwise].

Yeah that's what I thought. Both of those are obviously fine. In iwlmvm,
we have something like this:

struct ieee80211_sta __rcu *our_stations[16];

and then just set it to NULL (well, simplifying a bit) on sta_remove
(actually sta_state.)

This is safe because after sta_state/sta_remove, mac80211 still does
call_rcu() to really free the station, but it's kinda pointless. By
adding the callback I'd suggested before to set this to NULL before the
synchronize_net() in mac80211, I think we can get rid of the whole
call_rcu() stuff.

Anyway, right now my patch is breaking iwlmvm, I'll try to sort that out
tomorrow and then post patches.

johannes


2013-12-04 22:47:15

by Christian Lamparter

[permalink] [raw]
Subject: Re: question for mac80211 driver maintainers - RCU usage in drivers

On Wednesday, December 04, 2013 11:23:49 PM Johannes Berg wrote:
> On Wed, 2013-12-04 at 23:16 +0100, Christian Lamparter wrote:
> > > Is there any other driver that assumes it is safe to delete a station
> > > pointer in the sta_state callback and not use synchronize_rcu()? From
> > > looking at the code, I don't see any, but I can't really be sure that
> > > everyone uses __rcu annotations correctly ... :)
>
> > No, carl9170 doesn't use sta_state but it uses sta_remove.
>
> Yeah, I actually saw this. But I think you use it for aggregation
> sessions only? And the code didn't look like it could still get to the
> station pointer after it was removed with sta_remove() callback, but I
> may be wrong.
no, you are not wrong. The code gets its sta pointers either from
callback-parameters or via ieee80211_get_sta. [But would it even be
possible to do it any other way? After all, this should crash "sooner
or later" otherwise].

> > I'm curious: how you would achieve this feat? After all, mac80211's
> > tx- and rx-paths currently relies on RCU protected station pointers
> > too (and all that goes along with it. e.g.: ampdu sessions, keys, ...)?
>
> Oh, that'll stay. But right now we do *two* things:
> a) synchronize_net() to protect against mac80211
> b) rcu_barrier() stuff (see commit b22cfcfca) to protect again
>
> It seems to me if the drivers don't assume RCU grace period
> after sta_state/sta_remove, then the second one can go away.
>
> > > PS: I'll probably have to add another callback "sta going away before
> > > RCU" so you can invalidate pointers there ... otherwise I'd have to
> > > synchronize_rcu() in iwlmvm which would kinda defeat the purpose.
> > hm, invalidating ampdu sessions is going to be tricky isn't it?
> > But ok, some time ago I played around with moving the complicated
> > amdpu scheduler of carl9170 into the (single-threaded) firmware.
> > So large parts of carl9170's code which uses rcu in this context would
> > become "unnecessary".
> >
> > I didn't merge the patch - since there was no obvious benefit - but
> > I still have them and they can be revived if needed.
>
> No, don't worry. I'm keeping the aggregation session guarantees even,
> right now I'm not touching that so each aggregation session still uses
> synchronize_rcu() etc.
Sold!

Regards

Christian

2013-12-04 22:16:34

by Christian Lamparter

[permalink] [raw]
Subject: Re: question for mac80211 driver maintainers - RCU usage in drivers

Hello,

On Wednesday, December 04, 2013 10:30:10 PM Johannes Berg wrote:
> Except for iwlmvm, I don't find much RCU usage wrt. stations in drivers.
carl9170 uses rcu too. (Once there was a nice statistic about rcu
usage by subsystem and drivers, but it somehow disappeared and I
can't find it anymore)

> Is there any other driver that assumes it is safe to delete a station
> pointer in the sta_state callback and not use synchronize_rcu()? From
> looking at the code, I don't see any, but I can't really be sure that
> everyone uses __rcu annotations correctly ... :)
No, carl9170 doesn't use sta_state but it uses sta_remove.
I'm curious: how you would achieve this feat? After all, mac80211's
tx- and rx-paths currently relies on RCU protected station pointers
too (and all that goes along with it. e.g.: ampdu sessions, keys, ...)?

> Would anyone object if we changed mac80211 to *immediately* free the
> station after calling the driver's sta_state (or sta_remove) callback?
No objection.

> We currently delay this until after an RCU grace period, but that way we
> end up having a lot of delay in station freeing ... We'd like to
> optimise that.
Regards,

Christian

> PS: I'll probably have to add another callback "sta going away before
> RCU" so you can invalidate pointers there ... otherwise I'd have to
> synchronize_rcu() in iwlmvm which would kinda defeat the purpose.
hm, invalidating ampdu sessions is going to be tricky isn't it?
But ok, some time ago I played around with moving the complicated
amdpu scheduler of carl9170 into the (single-threaded) firmware.
So large parts of carl9170's code which uses rcu in this context would
become "unnecessary".

I didn't merge the patch - since there was no obvious benefit - but
I still have them and they can be revived if needed.

2013-12-05 12:36:04

by Johannes Berg

[permalink] [raw]
Subject: [PATCH 7/8] mac80211: optimise mixed AP/VLAN station removal

From: Johannes Berg <[email protected]>

Teach sta_info_flush() to optionally also remove stations
from all VLANs associated with an AP interface to optimise
the station removal (in particular, synchronize_net().)

To not have to add the vlans argument throughout, do some
refactoring.

Signed-off-by: Johannes Berg <[email protected]>
---
net/mac80211/cfg.c | 4 +---
net/mac80211/sta_info.c | 8 ++++++--
net/mac80211/sta_info.h | 8 +++++++-
3 files changed, 14 insertions(+), 6 deletions(-)

diff --git a/net/mac80211/cfg.c b/net/mac80211/cfg.c
index f47c7c0..b045deb2 100644
--- a/net/mac80211/cfg.c
+++ b/net/mac80211/cfg.c
@@ -1097,9 +1097,7 @@ static int ieee80211_stop_ap(struct wiphy *wiphy, struct net_device *dev)
if (old_probe_resp)
kfree_rcu(old_probe_resp, rcu_head);

- list_for_each_entry(vlan, &sdata->u.ap.vlans, u.vlan.list)
- sta_info_flush(vlan);
- sta_info_flush(sdata);
+ __sta_info_flush(sdata, true);
synchronize_net();
list_for_each_entry(vlan, &sdata->u.ap.vlans, u.vlan.list)
ieee80211_free_keys(vlan);
diff --git a/net/mac80211/sta_info.c b/net/mac80211/sta_info.c
index 89d449d..4576ba0 100644
--- a/net/mac80211/sta_info.c
+++ b/net/mac80211/sta_info.c
@@ -960,7 +960,7 @@ void sta_info_stop(struct ieee80211_local *local)
}


-int sta_info_flush(struct ieee80211_sub_if_data *sdata)
+int __sta_info_flush(struct ieee80211_sub_if_data *sdata, bool vlans)
{
struct ieee80211_local *local = sdata->local;
struct sta_info *sta, *tmp;
@@ -969,9 +969,13 @@ int sta_info_flush(struct ieee80211_sub_if_data *sdata)

might_sleep();

+ WARN_ON(vlans && sdata->vif.type != NL80211_IFTYPE_AP);
+ WARN_ON(vlans && !sdata->bss);
+
mutex_lock(&local->sta_mtx);
list_for_each_entry_safe(sta, tmp, &local->sta_list, list) {
- if (sdata == sta->sdata) {
+ if (sdata == sta->sdata ||
+ (vlans && sdata->bss == sta->sdata->bss)) {
if (!WARN_ON(__sta_info_destroy_part1(sta)))
list_add(&sta->free_list, &free_list);
ret++;
diff --git a/net/mac80211/sta_info.h b/net/mac80211/sta_info.h
index 1f20664..82043e85 100644
--- a/net/mac80211/sta_info.h
+++ b/net/mac80211/sta_info.h
@@ -612,8 +612,14 @@ void sta_info_stop(struct ieee80211_local *local);
* Returns the number of removed STA entries.
*
* @sdata: sdata to remove all stations from
+ * @vlans: if the given interface is an AP interface, also flush VLANs
*/
-int sta_info_flush(struct ieee80211_sub_if_data *sdata);
+int __sta_info_flush(struct ieee80211_sub_if_data *sdata, bool vlans);
+
+static inline int sta_info_flush(struct ieee80211_sub_if_data *sdata)
+{
+ return __sta_info_flush(sdata, false);
+}

void sta_set_rate_info_tx(struct sta_info *sta,
const struct ieee80211_tx_rate *rate,
--
1.8.4.rc3


2013-12-04 22:23:53

by Johannes Berg

[permalink] [raw]
Subject: Re: question for mac80211 driver maintainers - RCU usage in drivers

On Wed, 2013-12-04 at 23:16 +0100, Christian Lamparter wrote:

> > Is there any other driver that assumes it is safe to delete a station
> > pointer in the sta_state callback and not use synchronize_rcu()? From
> > looking at the code, I don't see any, but I can't really be sure that
> > everyone uses __rcu annotations correctly ... :)

> No, carl9170 doesn't use sta_state but it uses sta_remove.

Yeah, I actually saw this. But I think you use it for aggregation
sessions only? And the code didn't look like it could still get to the
station pointer after it was removed with sta_remove() callback, but I
may be wrong.

> I'm curious: how you would achieve this feat? After all, mac80211's
> tx- and rx-paths currently relies on RCU protected station pointers
> too (and all that goes along with it. e.g.: ampdu sessions, keys, ...)?

Oh, that'll stay. But right now we do *two* things:
a) synchronize_net() to protect against mac80211
b) rcu_barrier() stuff (see commit b22cfcfca) to protect again

It seems to me if the drivers don't assume RCU grace period after
sta_state/sta_remove, then the second one can go away.

> > PS: I'll probably have to add another callback "sta going away before
> > RCU" so you can invalidate pointers there ... otherwise I'd have to
> > synchronize_rcu() in iwlmvm which would kinda defeat the purpose.
> hm, invalidating ampdu sessions is going to be tricky isn't it?
> But ok, some time ago I played around with moving the complicated
> amdpu scheduler of carl9170 into the (single-threaded) firmware.
> So large parts of carl9170's code which uses rcu in this context would
> become "unnecessary".
>
> I didn't merge the patch - since there was no obvious benefit - but
> I still have them and they can be revived if needed.

No, don't worry. I'm keeping the aggregation session guarantees even,
right now I'm not touching that so each aggregation session still uses
synchronize_rcu() etc.

johannes


2013-12-16 10:30:31

by Johannes Berg

[permalink] [raw]
Subject: Re: [PATCH 0/8] mac80211: speed up station cleanup/AP shutdown

On Thu, 2013-12-05 at 13:35 +0100, Johannes Berg wrote:
> Inspired by hostap shutdown potentially taking a long time (which is
> in part also its fault), I took a closer look at the station removal
> code and I think it's too complex. Only a single driver (iwlmvm) has
> a need for an RCU grace period after sta_state() returns, and that
> can easily be addressed differently.
>
> So consolidate many of the synchronize_rcu()/_net() in these paths
> and get rid of the rcu_barrier() stuff completely.
>
> Testing is still pending, but it should improve timing.

Timing tests showed a huge improvement, applied all patches.

johannes


2013-12-05 17:27:36

by Johannes Berg

[permalink] [raw]
Subject: Re: [PATCH 0/8] mac80211: speed up station cleanup/AP shutdown

On Thu, 2013-12-05 at 18:02 +0100, Christian Lamparter wrote:
> On Thursday, December 05, 2013 01:35:48 PM Johannes Berg wrote:
> > Inspired by hostap shutdown potentially taking a long time (which is
> > in part also its fault), I took a closer look at the station removal
> > code and I think it's too complex. Only a single driver (iwlmvm) has
> > a need for an RCU grace period after sta_state() returns, and that
> > can easily be addressed differently.
> >
> > So consolidate many of the synchronize_rcu()/_net() in these paths
> > and get rid of the rcu_barrier() stuff completely.
> >
> > Testing is still pending, but it should improve timing.
>
> So far, I see no crashes or other undefined behavior with
> current carl9170 driver.

Cool. What I wrote came out bad - I did test that it still works with
hwsim and iwlwifi, but I didn't get any *timing* testing yet :)

johannes


2013-12-05 12:36:04

by Johannes Berg

[permalink] [raw]
Subject: [PATCH 6/8] mac80211: optimise synchronize_net() for sta_info_flush

From: Johannes Berg <[email protected]>

There's no reason to have one synchronize_net() for each
removed station, refactor the code slightly to have just
a single synchronize_net() for all stations.

Note that this is currently useless as hostapd removes
stations one by one and this coalescing never happens.

Signed-off-by: Johannes Berg <[email protected]>
---
net/mac80211/sta_info.c | 42 +++++++++++++++++++++++++++++++++++++++---
net/mac80211/sta_info.h | 2 +-
2 files changed, 40 insertions(+), 4 deletions(-)

diff --git a/net/mac80211/sta_info.c b/net/mac80211/sta_info.c
index 08e5076..89d449d 100644
--- a/net/mac80211/sta_info.c
+++ b/net/mac80211/sta_info.c
@@ -794,7 +794,7 @@ static bool sta_info_cleanup_expire_buffered(struct ieee80211_local *local,
return have_buffered;
}

-int __must_check __sta_info_destroy(struct sta_info *sta)
+static int __must_check __sta_info_destroy_part1(struct sta_info *sta)
{
struct ieee80211_local *local;
struct ieee80211_sub_if_data *sdata;
@@ -831,7 +831,23 @@ int __must_check __sta_info_destroy(struct sta_info *sta)
rcu_access_pointer(sdata->u.vlan.sta) == sta)
RCU_INIT_POINTER(sdata->u.vlan.sta, NULL);

- synchronize_net();
+ return 0;
+}
+
+static void __sta_info_destroy_part2(struct sta_info *sta)
+{
+ struct ieee80211_local *local = sta->local;
+ struct ieee80211_sub_if_data *sdata = sta->sdata;
+ int ret;
+
+ /*
+ * NOTE: This assumes at least synchronize_net() was done
+ * after _part1 and before _part2!
+ */
+
+ might_sleep();
+ lockdep_assert_held(&local->sta_mtx);
+
/* now keys can no longer be reached */
ieee80211_free_sta_keys(local, sta);

@@ -863,6 +879,18 @@ int __must_check __sta_info_destroy(struct sta_info *sta)
ieee80211_recalc_min_chandef(sdata);

cleanup_single_sta(sta);
+}
+
+int __must_check __sta_info_destroy(struct sta_info *sta)
+{
+ int err = __sta_info_destroy_part1(sta);
+
+ if (err)
+ return err;
+
+ synchronize_net();
+
+ __sta_info_destroy_part2(sta);

return 0;
}
@@ -936,6 +964,7 @@ int sta_info_flush(struct ieee80211_sub_if_data *sdata)
{
struct ieee80211_local *local = sdata->local;
struct sta_info *sta, *tmp;
+ LIST_HEAD(free_list);
int ret = 0;

might_sleep();
@@ -943,10 +972,17 @@ int sta_info_flush(struct ieee80211_sub_if_data *sdata)
mutex_lock(&local->sta_mtx);
list_for_each_entry_safe(sta, tmp, &local->sta_list, list) {
if (sdata == sta->sdata) {
- WARN_ON(__sta_info_destroy(sta));
+ if (!WARN_ON(__sta_info_destroy_part1(sta)))
+ list_add(&sta->free_list, &free_list);
ret++;
}
}
+
+ if (!list_empty(&free_list)) {
+ synchronize_net();
+ list_for_each_entry_safe(sta, tmp, &free_list, free_list)
+ __sta_info_destroy_part2(sta);
+ }
mutex_unlock(&local->sta_mtx);

return ret;
diff --git a/net/mac80211/sta_info.h b/net/mac80211/sta_info.h
index 9104f81..1f20664 100644
--- a/net/mac80211/sta_info.h
+++ b/net/mac80211/sta_info.h
@@ -329,7 +329,7 @@ struct ieee80211_tx_latency_stat {
*/
struct sta_info {
/* General information, mostly static */
- struct list_head list;
+ struct list_head list, free_list;
struct rcu_head rcu_head;
struct sta_info __rcu *hnext;
struct ieee80211_local *local;
--
1.8.4.rc3


2013-12-05 12:36:05

by Johannes Berg

[permalink] [raw]
Subject: [PATCH 8/8] mac80211: free all AP/VLAN keys at once

From: Johannes Berg <[email protected]>

When the AP interface is stopped, free all AP and VLAN keys at
once to only require synchronize_net() once. Since that does
synchronize_net(), also move two such calls into the function
(using the new force_synchronize parameter) to avoid doing it
twice.

Signed-off-by: Johannes Berg <[email protected]>
---
net/mac80211/cfg.c | 5 +----
net/mac80211/iface.c | 19 ++++++++-----------
net/mac80211/key.c | 44 ++++++++++++++++++++++++++++++++------------
net/mac80211/key.h | 3 ++-
4 files changed, 43 insertions(+), 28 deletions(-)

diff --git a/net/mac80211/cfg.c b/net/mac80211/cfg.c
index b045deb2..556809a 100644
--- a/net/mac80211/cfg.c
+++ b/net/mac80211/cfg.c
@@ -1098,10 +1098,7 @@ static int ieee80211_stop_ap(struct wiphy *wiphy, struct net_device *dev)
kfree_rcu(old_probe_resp, rcu_head);

__sta_info_flush(sdata, true);
- synchronize_net();
- list_for_each_entry(vlan, &sdata->u.ap.vlans, u.vlan.list)
- ieee80211_free_keys(vlan);
- ieee80211_free_keys(sdata);
+ ieee80211_free_keys(sdata, true);

sdata->vif.bss_conf.enable_beacon = false;
sdata->vif.bss_conf.ssid_len = 0;
diff --git a/net/mac80211/iface.c b/net/mac80211/iface.c
index c5e9a11..3ef9674 100644
--- a/net/mac80211/iface.c
+++ b/net/mac80211/iface.c
@@ -889,18 +889,15 @@ static void ieee80211_do_stop(struct ieee80211_sub_if_data *sdata,
cancel_work_sync(&sdata->work);
/*
* When we get here, the interface is marked down.
+ * Free the remaining keys, if there are any
+ * (shouldn't be, except maybe in WDS mode?)
*
- * We need synchronize_rcu() to wait for the RX path in
- * case it is using the interface and enqueuing frames
- * at this very time on another CPU.
+ * Force the key freeing to always synchronize_net()
+ * to wait for the RX path in case it is using this
+ * interface enqueuing frames * at this very time on
+ * another CPU.
*/
- synchronize_rcu();
-
- /*
- * Free all remaining keys, there shouldn't be any,
- * except maybe in WDS mode?
- */
- ieee80211_free_keys(sdata);
+ ieee80211_free_keys(sdata, true);

/* fall through */
case NL80211_IFTYPE_AP:
@@ -1026,7 +1023,7 @@ static void ieee80211_teardown_sdata(struct ieee80211_sub_if_data *sdata)
int i;

/* free extra data */
- ieee80211_free_keys(sdata);
+ ieee80211_free_keys(sdata, false);

ieee80211_debugfs_remove_netdev(sdata);

diff --git a/net/mac80211/key.c b/net/mac80211/key.c
index 12e6154..6ff65a1 100644
--- a/net/mac80211/key.c
+++ b/net/mac80211/key.c
@@ -589,14 +589,10 @@ void ieee80211_iter_keys(struct ieee80211_hw *hw,
}
EXPORT_SYMBOL(ieee80211_iter_keys);

-void ieee80211_free_keys(struct ieee80211_sub_if_data *sdata)
+static void ieee80211_free_keys_iface(struct ieee80211_sub_if_data *sdata,
+ struct list_head *keys)
{
struct ieee80211_key *key, *tmp;
- LIST_HEAD(keys);
-
- cancel_delayed_work_sync(&sdata->dec_tailroom_needed_wk);
-
- mutex_lock(&sdata->local->key_mtx);

sdata->crypto_tx_tailroom_needed_cnt -=
sdata->crypto_tx_tailroom_pending_dec;
@@ -608,21 +604,45 @@ void ieee80211_free_keys(struct ieee80211_sub_if_data *sdata)
ieee80211_key_replace(key->sdata, key->sta,
key->conf.flags & IEEE80211_KEY_FLAG_PAIRWISE,
key, NULL);
- list_add_tail(&key->list, &keys);
+ list_add_tail(&key->list, keys);
}

ieee80211_debugfs_key_update_default(sdata);
+}

- if (!list_empty(&keys)) {
- synchronize_net();
- list_for_each_entry_safe(key, tmp, &keys, list)
- __ieee80211_key_destroy(key, false);
+void ieee80211_free_keys(struct ieee80211_sub_if_data *sdata,
+ bool force_synchronize)
+{
+ struct ieee80211_local *local = sdata->local;
+ struct ieee80211_sub_if_data *vlan;
+ struct ieee80211_key *key, *tmp;
+ LIST_HEAD(keys);
+
+ cancel_delayed_work_sync(&sdata->dec_tailroom_needed_wk);
+
+ mutex_lock(&local->key_mtx);
+
+ ieee80211_free_keys_iface(sdata, &keys);
+
+ if (sdata->vif.type == NL80211_IFTYPE_AP) {
+ list_for_each_entry(vlan, &sdata->u.ap.vlans, u.vlan.list)
+ ieee80211_free_keys_iface(vlan, &keys);
}

+ if (!list_empty(&keys) || force_synchronize)
+ synchronize_net();
+ list_for_each_entry_safe(key, tmp, &keys, list)
+ __ieee80211_key_destroy(key, false);
+
WARN_ON_ONCE(sdata->crypto_tx_tailroom_needed_cnt ||
sdata->crypto_tx_tailroom_pending_dec);
+ if (sdata->vif.type == NL80211_IFTYPE_AP) {
+ list_for_each_entry(vlan, &sdata->u.ap.vlans, u.vlan.list)
+ WARN_ON_ONCE(vlan->crypto_tx_tailroom_needed_cnt ||
+ vlan->crypto_tx_tailroom_pending_dec);
+ }

- mutex_unlock(&sdata->local->key_mtx);
+ mutex_unlock(&local->key_mtx);
}

void ieee80211_free_sta_keys(struct ieee80211_local *local,
diff --git a/net/mac80211/key.h b/net/mac80211/key.h
index 0aebb88..19db686 100644
--- a/net/mac80211/key.h
+++ b/net/mac80211/key.h
@@ -136,7 +136,8 @@ void ieee80211_set_default_key(struct ieee80211_sub_if_data *sdata, int idx,
bool uni, bool multi);
void ieee80211_set_default_mgmt_key(struct ieee80211_sub_if_data *sdata,
int idx);
-void ieee80211_free_keys(struct ieee80211_sub_if_data *sdata);
+void ieee80211_free_keys(struct ieee80211_sub_if_data *sdata,
+ bool force_synchronize);
void ieee80211_free_sta_keys(struct ieee80211_local *local,
struct sta_info *sta);
void ieee80211_enable_keys(struct ieee80211_sub_if_data *sdata);
--
1.8.4.rc3


2013-12-05 12:36:04

by Johannes Berg

[permalink] [raw]
Subject: [PATCH 5/8] mac80211: move synchronize_net() before sta key removal

From: Johannes Berg <[email protected]>

There's no reason to do this inside the sta key removal
since the keys can only be reached through the sta (and
not by the driver at all) so once the sta can no longer
be reached, the keys are safe.

This will allow further optimisation opportunities with
multiple stations.

Signed-off-by: Johannes Berg <[email protected]>
---
net/mac80211/key.c | 16 +++-------------
net/mac80211/sta_info.c | 3 ++-
2 files changed, 5 insertions(+), 14 deletions(-)

diff --git a/net/mac80211/key.c b/net/mac80211/key.c
index e568d98..12e6154 100644
--- a/net/mac80211/key.c
+++ b/net/mac80211/key.c
@@ -628,8 +628,7 @@ void ieee80211_free_keys(struct ieee80211_sub_if_data *sdata)
void ieee80211_free_sta_keys(struct ieee80211_local *local,
struct sta_info *sta)
{
- struct ieee80211_key *key, *tmp;
- LIST_HEAD(keys);
+ struct ieee80211_key *key;
int i;

mutex_lock(&local->key_mtx);
@@ -640,7 +639,7 @@ void ieee80211_free_sta_keys(struct ieee80211_local *local,
ieee80211_key_replace(key->sdata, key->sta,
key->conf.flags & IEEE80211_KEY_FLAG_PAIRWISE,
key, NULL);
- list_add(&key->list, &keys);
+ __ieee80211_key_destroy(key, true);
}

for (i = 0; i < NUM_DEFAULT_KEYS; i++) {
@@ -650,17 +649,8 @@ void ieee80211_free_sta_keys(struct ieee80211_local *local,
ieee80211_key_replace(key->sdata, key->sta,
key->conf.flags & IEEE80211_KEY_FLAG_PAIRWISE,
key, NULL);
- list_add(&key->list, &keys);
- }
-
- /*
- * NB: the station code relies on this being
- * done even if there aren't any keys
- */
- synchronize_net();
-
- list_for_each_entry_safe(key, tmp, &keys, list)
__ieee80211_key_destroy(key, true);
+ }

mutex_unlock(&local->key_mtx);
}
diff --git a/net/mac80211/sta_info.c b/net/mac80211/sta_info.c
index 7241f322..08e5076 100644
--- a/net/mac80211/sta_info.c
+++ b/net/mac80211/sta_info.c
@@ -831,7 +831,8 @@ int __must_check __sta_info_destroy(struct sta_info *sta)
rcu_access_pointer(sdata->u.vlan.sta) == sta)
RCU_INIT_POINTER(sdata->u.vlan.sta, NULL);

- /* this always calls synchronize_net() */
+ synchronize_net();
+ /* now keys can no longer be reached */
ieee80211_free_sta_keys(local, sta);

sta->dead = true;
--
1.8.4.rc3


2013-12-05 12:36:04

by Johannes Berg

[permalink] [raw]
Subject: [PATCH 0/8] mac80211: speed up station cleanup/AP shutdown

Inspired by hostap shutdown potentially taking a long time (which is
in part also its fault), I took a closer look at the station removal
code and I think it's too complex. Only a single driver (iwlmvm) has
a need for an RCU grace period after sta_state() returns, and that
can easily be addressed differently.

So consolidate many of the synchronize_rcu()/_net() in these paths
and get rid of the rcu_barrier() stuff completely.

Testing is still pending, but it should improve timing.

johannes


2013-12-05 12:36:04

by Johannes Berg

[permalink] [raw]
Subject: [PATCH 2/8] iwlwifi: mvm: use pre-RCU-sync sta removal operation

From: Johannes Berg <[email protected]>

iwlmvm relies on the current mac80211 behaviour of allowing
station pointers to be valid for an RCU grace period after
returning from the sta_state() callback. To optimise these
cases, this behaviour is going away, so make the driver use
the new sta_pre_rcu_remove() method to clear the pointer in
the fw_id_to_mac_id[] array.

Since this may happen while the station is still present in
the firmware, don't set the pointer to NULL but to -ENOENT
to mark this particular case. In client mode, the station
is kept even longer (until marking the MAC as unassociated)
so the drain flow must take this new behavior into account.

Signed-off-by: Johannes Berg <[email protected]>
---
drivers/net/wireless/iwlwifi/mvm/debugfs.c | 13 ++++++++-----
drivers/net/wireless/iwlwifi/mvm/mac80211.c | 23 +++++++++++++++++++++++
drivers/net/wireless/iwlwifi/mvm/sta.c | 11 +++++++++--
3 files changed, 40 insertions(+), 7 deletions(-)

diff --git a/drivers/net/wireless/iwlwifi/mvm/debugfs.c b/drivers/net/wireless/iwlwifi/mvm/debugfs.c
index 9864d71..fc570d0 100644
--- a/drivers/net/wireless/iwlwifi/mvm/debugfs.c
+++ b/drivers/net/wireless/iwlwifi/mvm/debugfs.c
@@ -479,14 +479,17 @@ static ssize_t iwl_dbgfs_mac_params_read(struct file *file,
if (vif->type == NL80211_IFTYPE_STATION &&
ap_sta_id != IWL_MVM_STATION_COUNT) {
struct ieee80211_sta *sta;
- struct iwl_mvm_sta *mvm_sta;

sta = rcu_dereference_protected(mvm->fw_id_to_mac_id[ap_sta_id],
lockdep_is_held(&mvm->mutex));
- mvm_sta = (void *)sta->drv_priv;
- pos += scnprintf(buf+pos, bufsz-pos,
- "ap_sta_id %d - reduced Tx power %d\n",
- ap_sta_id, mvm_sta->bt_reduced_txpower);
+ if (!IS_ERR_OR_NULL(sta)) {
+ struct iwl_mvm_sta *mvm_sta = (void *)sta->drv_priv;
+
+ pos += scnprintf(buf+pos, bufsz-pos,
+ "ap_sta_id %d - reduced Tx power %d\n",
+ ap_sta_id,
+ mvm_sta->bt_reduced_txpower);
+ }
}

rcu_read_lock();
diff --git a/drivers/net/wireless/iwlwifi/mvm/mac80211.c b/drivers/net/wireless/iwlwifi/mvm/mac80211.c
index b56c989..241b8f4 100644
--- a/drivers/net/wireless/iwlwifi/mvm/mac80211.c
+++ b/drivers/net/wireless/iwlwifi/mvm/mac80211.c
@@ -1102,6 +1102,28 @@ static void iwl_mvm_mac_sta_notify(struct ieee80211_hw *hw,
}
}

+static void iwl_mvm_sta_pre_rcu_remove(struct ieee80211_hw *hw,
+ struct ieee80211_vif *vif,
+ struct ieee80211_sta *sta)
+{
+ struct iwl_mvm *mvm = IWL_MAC80211_GET_MVM(hw);
+ struct iwl_mvm_sta *mvm_sta = (void *)sta->drv_priv;
+
+ /*
+ * This is called before mac80211 does RCU synchronisation,
+ * so here we already invalidate our internal RCU-protected
+ * station pointer. The rest of the code will thus no longer
+ * be able to find the station this way, and we don't rely
+ * on further RCU synchronisation after the sta_state()
+ * callback deleted the station.
+ */
+ mutex_lock(&mvm->mutex);
+ if (sta == rcu_access_pointer(mvm->fw_id_to_mac_id[mvm_sta->sta_id]))
+ rcu_assign_pointer(mvm->fw_id_to_mac_id[mvm_sta->sta_id],
+ ERR_PTR(-ENOENT));
+ mutex_unlock(&mvm->mutex);
+}
+
static int iwl_mvm_mac_sta_state(struct ieee80211_hw *hw,
struct ieee80211_vif *vif,
struct ieee80211_sta *sta,
@@ -1748,6 +1770,7 @@ struct ieee80211_ops iwl_mvm_hw_ops = {
.bss_info_changed = iwl_mvm_bss_info_changed,
.hw_scan = iwl_mvm_mac_hw_scan,
.cancel_hw_scan = iwl_mvm_mac_cancel_hw_scan,
+ .sta_pre_rcu_remove = iwl_mvm_sta_pre_rcu_remove,
.sta_state = iwl_mvm_mac_sta_state,
.sta_notify = iwl_mvm_mac_sta_notify,
.allow_buffered_frames = iwl_mvm_mac_allow_buffered_frames,
diff --git a/drivers/net/wireless/iwlwifi/mvm/sta.c b/drivers/net/wireless/iwlwifi/mvm/sta.c
index 3299523..1b6b4a6 100644
--- a/drivers/net/wireless/iwlwifi/mvm/sta.c
+++ b/drivers/net/wireless/iwlwifi/mvm/sta.c
@@ -452,8 +452,15 @@ void iwl_mvm_sta_drained_wk(struct work_struct *wk)
rcu_dereference_protected(mvm->fw_id_to_mac_id[sta_id],
lockdep_is_held(&mvm->mutex));

- /* This station is in use */
- if (!IS_ERR(sta))
+ /*
+ * This station is in use or RCU-removed; the latter happens in
+ * managed mode, where mac80211 removes the station before we
+ * can remove it from firmware (we can only do that after the
+ * MAC is marked unassociated), and possibly while the deauth
+ * frame to disconnect from the AP is still queued. Then, the
+ * station pointer is -ENOENT when the last skb is reclaimed.
+ */
+ if (!IS_ERR(sta) || PTR_ERR(sta) == -ENOENT)
continue;

if (PTR_ERR(sta) == -EINVAL) {
--
1.8.4.rc3


2013-12-05 12:36:04

by Johannes Berg

[permalink] [raw]
Subject: [PATCH 3/8] mac80211: move 4-addr sta pointer clearing before synchronize_rcu()

From: Johannes Berg <[email protected]>

The pointer should be cleared before synchronize_rcu() so that the
consequently dead station won't be found by any lookups in the TX
or RX paths.

Also check that the station is actually the one being removed, the
check is not needed because each 4-addr VLAN can only have a single
station and non-4-addr VLANs always have a NULL pointer there, but
the code is clearer this way (and we avoid the memory write.)

Signed-off-by: Johannes Berg <[email protected]>
---
net/mac80211/sta_info.c | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/net/mac80211/sta_info.c b/net/mac80211/sta_info.c
index 1e14774..557ac25 100644
--- a/net/mac80211/sta_info.c
+++ b/net/mac80211/sta_info.c
@@ -875,6 +875,10 @@ int __must_check __sta_info_destroy(struct sta_info *sta)

drv_sta_pre_rcu_remove(local, sta->sdata, sta);

+ if (sdata->vif.type == NL80211_IFTYPE_AP_VLAN &&
+ rcu_access_pointer(sdata->u.vlan.sta) == sta)
+ RCU_INIT_POINTER(sdata->u.vlan.sta, NULL);
+
/* this always calls synchronize_net() */
ieee80211_free_sta_keys(local, sta);

@@ -883,9 +887,6 @@ int __must_check __sta_info_destroy(struct sta_info *sta)
local->num_sta--;
local->sta_generation++;

- if (sdata->vif.type == NL80211_IFTYPE_AP_VLAN)
- RCU_INIT_POINTER(sdata->u.vlan.sta, NULL);
-
while (sta->sta_state > IEEE80211_STA_NONE) {
ret = sta_info_move_state(sta, sta->sta_state - 1);
if (ret) {
--
1.8.4.rc3


2013-12-05 17:02:11

by Christian Lamparter

[permalink] [raw]
Subject: Re: [PATCH 0/8] mac80211: speed up station cleanup/AP shutdown

On Thursday, December 05, 2013 01:35:48 PM Johannes Berg wrote:
> Inspired by hostap shutdown potentially taking a long time (which is
> in part also its fault), I took a closer look at the station removal
> code and I think it's too complex. Only a single driver (iwlmvm) has
> a need for an RCU grace period after sta_state() returns, and that
> can easily be addressed differently.
>
> So consolidate many of the synchronize_rcu()/_net() in these paths
> and get rid of the rcu_barrier() stuff completely.
>
> Testing is still pending, but it should improve timing.

So far, I see no crashes or other undefined behavior with
current carl9170 driver.

Regards,
Chr