2018-08-29 20:40:24

by Alexander Wetzel

[permalink] [raw]
Subject: [PATCH v7 0/2] Fix PTK rekey freezes and clear text leak

This patch series addresses a long standing issue how mac80211 handles
PTK rekeys. The current implementation can only work with SW crypto or
by chance, e.g. if there are no frames transmitted while the STAs are
rekeying or you hit just the right combination of cards/drivers.

Any ongoing transmission while rekeying will very likely freeze the
connection till the connection is rekeyed again or the user manually
reconnects. Without any indication why, even in a kernel trace.

The multiple ways how this can go wrong are outlined in the commit
message from the last patch in this series, but here a short overview:

The main culprit for that is encryption offloading to the card while
handling the PN (IV) in mac80211 without any synchronization in between.
This allows the replay detection code to account frames intended for
the old key against the new one, which sets the PN to a value which was
correct for the old key but will drop all frames send with a PN
belonging to the new key.
The solution is of course to make sure frames prepared for the old key
are never checked against the PN (IV) of the new key, thus preventing
the invalid carry over of the old PN value to the new key.

The issue is complicated by the fact that at last some drivers do not
expect to be asked to replace a key which may be actively in use for
transmitting frames. Ath9k is e.g. simply removing the key and then
sends out the queued frames in clear till the new key is installed.

As a conclusion we therefore have to assume that all drivers which do
not actively tell us that they can handle replacing an in-use key must
not be asked to do so. Unfortunately the rekey decision is solely the
responsibility of the user space and when the kernel refuses to replace
a key those programs are suddenly exposed to an new error condition.
At least wpa_supplicant currently reacts badly to that and assumes the
PSK is wrong instead of simply quickly reconnecting when trying that.

We also do not have an established way to inform the user space that the
rekey operation is not supported and it must not use it.

As a way forward this patch series makes the needed changes to correctly
rekey connections and allowing the user space to check if PTK rekeys can
be used at all. While enforcing this restriction would probably be OK
there are some constellations where it can work. So instead of reporting
an error back to the user space we now only print out a warning and fall
back to a best-we-can-do approach to maintain backward compatibility.

Downside here is, that till the user space catches up - or all drivers
are supporting the new API for in-use key replaces - users may still
suffer connection freezes and leak clear text frames. But it should only
be a fraction of what it would be without this patch and not break
anything in the cases where it's currently working.

It's also worth mentioning that most of the pitfalls rekeying a PTK has
could have been avoided if the first IEEE 802.11 standards would already
have had the option introduced in the 2012 version, named
"Extended Key ID for Individually Addressed Frames". This basically
drills down to using key ID 0 and 1 for PTK keys (and shift GTKs to 2
and 3), allowing to rollover PTK keys the same way it has been
established for GTK keys.
Supporting this addition will be the ultimate solution for the issues,
but since it only can be used if both sides are supporting it we still
have to handle PTK only using the key ID 0.

Here a quick overview of the patches in the series:
1) nl80211: Add CAN_REPLACE_PTK0 API
This adds support for @NL80211_EXT_FEATURE_CAN_REPLACE_PTK0
to nl80211. We expect the user space (hostap, wpa_supplicant, iwd ...)
to check this flag and only rekey a connection when this flag is
present. When the flag is not set a rekey has to be "emulated" by a
full de-association and a reconnect if it can't be avoided by the
user space.

2) mac80211: Fix PTK rekey freezes and clear text leak
This changes how mac80211 handles the rekey. HW keys are now switched
over to the new key prior to mac80211 for both PTK and GTK. Also the
driver won't get any frames depending on the key we are replacing
from mac80211 till the switch to the new key has been completed and
all running aggregation sessions for the STA are torn down to avoid
really complex code to making those save during a rekey.
When a driver is not signaling compliance with the new requirements
requested for @NL80211_EXT_FEATURE_CAN_REPLACE_PTK0 by setting the
new flag mac80211 will output a rate limited warning and calls the
optional flush() callback from the driver to increase the chance the
driver will work correctly.

Version history:
V7 Fix PTK rekey freezes and clear text leaks
- renamed @NL80211_EXT_FEATURE_ATOMIC_KEY_REPLACE to
@NL80211_EXT_FEATURE_CAN_REPLACE_PTK0
- dropped replace_key() patch, using
@NL80211_EXT_FEATURE_ATOMIC_KEY_REPLACE as replacement
- updated the "Hardware crypto acceleration" doc
- updated commits, cover letter and some comments

V6 Fix PTK rekey freezes and clear text leaks
- typo fix in comment (beeing -> being)

V5 Fix PTK rekey freezes and clear text leaks
- rewritten most of the cover letter to give a better overview
- Make "HW installs key prior to mac80211" the default for all
key installs. (Cleaner, better understandable code.)
- best-we-can-do approach for drivers not implementing replace_key
which should work for many drivers.

V4 Fix PTK rekey freezes and clear text leaks
- Switched over to a small patch series.
- Allows insecure rekeys again for compatibility
- Allows the user space to check if rekeys are safe by extending
nl80211.

V3 mac80211: Fix PTK rekey freezes and clear text leaks
Updates the mac80211 API. When the driver is implementing the new
callback replace_key mac80211 allows PTK rekeys. If not it returns
an error on key install to the requester.

V2 mac80211: Fix wlan freezes under load at rekey
This fixes the issue in mac80211 only without API changes on a
best-can-do approach. Drivers still can freeze the connection and if
so have to be fixed.

V1 mac80211: Fix wlan freezes under load at rekey
Very simple approach, only fixing the freezes and using a not
guaranteed to be working fallback to SW encryption.

Alexander Wetzel (2):
nl80211: Add CAN_REPLACE_PTK0 API
mac80211: Fix PTK rekey freezes and clear text leak

include/net/mac80211.h | 13 ++++
include/uapi/linux/nl80211.h | 6 ++
net/mac80211/key.c | 118 +++++++++++++++++++++++++++++------
net/mac80211/tx.c | 4 ++
4 files changed, 121 insertions(+), 20 deletions(-)

--
2.18.0


2018-08-29 20:33:20

by Alexander Wetzel

[permalink] [raw]
Subject: [PATCH v7 2/2] mac80211: Fix PTK rekey freezes and clear text leak

Rekeying PTK keys without "Extended Key ID for Individually Addressed
Frames" did use a procedure not suitable to replace in-use keys and
could caused the following issues:

1) Freeze caused by incoming frames:
If the local STA installed the key prior to the remote STA we still
had the old key active in the hardware when mac80211 switched over
to the new key.
Therefore there was a window where the card could hand over frames
decoded with the old key to mac80211 and bump the new PN (IV) value
to an incorrect high number. When it happened the local replay
detection silently started to drop all frames sent with the new key.

2) Freeze caused by outgoing frames:
If mac80211 was providing the PN (IV) and handed over a clear text
frame for encryption to the hardware prior to a key change the
driver/card could have processed the queued frame after switching
to the new key. This bumped the PN value on the remote STA to an
incorrect high number, tricking the remote STA to discard all frames
we sent later.

3) Freeze caused by RX aggregation reorder buffer:
An aggregation session started with the old key and ending after the
switch to the new key also bumped the PN to an incorrect high number,
freezing the connection quite similar to 1).

4) Freeze caused by repeating lost frames in an aggregation session:
A driver could repeat a lost frame and encrypt it with the new key
while in a TX aggregation session without updating the PN for the
new key. This also could freeze connections similar to 2).

5) Clear text leak:
Removing encryption offload from the card cleared the encryption
offload flag only after the card had deleted the key and we did not
stop TX during the rekey. The driver/card could therefore get
unencrypted frames from mac80211 while no longer be instructed to
encrypt them.

To prevent those issues the key install logic has been changed:
- We now flag drivers known to be able to rekey PTK0 keys with
NL80211_EXT_FEATURE_CAN_REPLACE_PTK0,
- stop queuing frames depending on the key while replacing it
- replace the key first in the hardware and only after that in mac80211
- and stop/block new aggregation sessions during the rekey.

For drivers not setting
NL80211_EXT_FEATURE_CAN_REPLACE_PTK0 the user space must avoid PTK
rekeys if "Extended Key ID for Individually Addressed Frames" is not
being used. Rekeys for mac80211 drivers without this flag will generate a
warning and use an extra call to ieee80211_flush_queues() to both
highlight and try to prevent the issues with not updated drivers.

The core of the fix changes the key install procedure from:
- atomic switch over to the new key in mac80211
- remove the old key in the hardware (stops encryption offloading, fall
back to software encryption with a potential clear text packet leak
in between)
- delete the inactive old key in mac80211
- enable hardware encryption offloading for the new key
to:
- if it's a PTK mark the old key as tainted to drop TX frames with the
outgoing key
- replace the key in hardware with the new one
- atomic switch over to the new (not marked as tainted) key in
mac80211 (which also resumes TX)
- delete the inactive old key in mac80211

With the new sequence the hardware will be unable to decrypt frames
encrypted with the old key prior to switching to the new key in mac80211
and thus prevent PNs from packets decrypted with the old key to be
accounted against the new key.

For that to work the drivers have to provide a clear boundary.
Mac80211 drivers setting @NL80211_EXT_FEATURE_CAN_REPLACE_PTK0 confirm
to provide it and mac80211 will then be able to correctly rekey in-use
PTK keys with those drivers.

The mac80211 requirements for drivers to set the flag have been added to
the "Hardware crypto acceleration" documentation section. It drills down
to:
The drivers must not hand over frames decrypted with the old key to
mac80211 once the call to set_key() with %DISABLE_KEY has been
completed. It's allowed to either drop or continue to use the old key
for any outgoing frames which are already in the queues, but it must not
send out any of them unencrypted or encrypted with the new key.

Even with the new boundary in place aggregation sessions with the
reorder buffer are a problem:
RX aggregation session started prior and completed after the rekey could
still dump frames received with the old key at mac80211 after it
switched over to the new key. This is side stepped by stopping all (RX
and TX) aggregation sessions when replacing a PTK key and hardware key
offloading.
Stopping TX aggregation sessions avoids the need to get
the PNs (IVs) updated in frames prepared for the old key and
(re)transmitted after the switch to the new key. As a bonus it improves
the compatibility when the remote STA is not handling rekeys as it
should.

When using software crypto aggregation sessions are not stopped.
Mac80211 won't be able to decode the dangerous frames and discard them
without special handling.

Signed-off-by: Alexander Wetzel <[email protected]>
---
include/net/mac80211.h | 13 +++++
net/mac80211/key.c | 118 ++++++++++++++++++++++++++++++++++-------
net/mac80211/tx.c | 4 ++
3 files changed, 115 insertions(+), 20 deletions(-)

diff --git a/include/net/mac80211.h b/include/net/mac80211.h
index 5790f55c241d..4b5df94c4cea 100644
--- a/include/net/mac80211.h
+++ b/include/net/mac80211.h
@@ -2506,6 +2506,19 @@ void ieee80211_free_txskb(struct ieee80211_hw *hw, struct sk_buff *skb);
* The set_default_unicast_key() call updates the default WEP key index
* configured to the hardware for WEP encryption type. This is required
* for devices that support offload of data packets (e.g. ARP responses).
+ *
+ * Mac80211 drivers should set the @NL80211_EXT_FEATURE_CAN_REPLACE_PTK0 flag
+ * when they are able to replace in-use PTK keys according to to following
+ * requirements:
+ * 1) They do not hand over frames decrypted with the old key to
+ mac80211 once the call to set_key() with command %DISABLE_KEY has been
+ completed when also setting @IEEE80211_KEY_FLAG_GENERATE_IV for any key
+ 2) either drop or continue to use the old key for any outgoing frames queued
+ at the time of the key deletion (especially re-transmits),
+ 3) never send out a frame queued prior to the set_key() %SET_KEY command
+ encrypted with the new key and
+ 4) never send out a frame unencrypted when it should be encrypted.
+ Mac80211 will not queue any new frames for a deleted key to the driver.
*/

/**
diff --git a/net/mac80211/key.c b/net/mac80211/key.c
index c054ac85793c..8dc33c29f653 100644
--- a/net/mac80211/key.c
+++ b/net/mac80211/key.c
@@ -248,6 +248,7 @@ static void ieee80211_key_disable_hw_accel(struct ieee80211_key *key)
(key->conf.flags & IEEE80211_KEY_FLAG_RESERVE_TAILROOM)))
increment_tailroom_need_count(sdata);

+ key->flags &= ~KEY_FLAG_UPLOADED_TO_HARDWARE;
ret = drv_set_key(key->local, DISABLE_KEY, sdata,
sta ? &sta->sta : NULL, &key->conf);

@@ -256,8 +257,73 @@ static void ieee80211_key_disable_hw_accel(struct ieee80211_key *key)
"failed to remove key (%d, %pM) from hardware (%d)\n",
key->conf.keyidx,
sta ? sta->sta.addr : bcast_addr, ret);
+}

- key->flags &= ~KEY_FLAG_UPLOADED_TO_HARDWARE;
+static int ieee80211_hw_key_replace(struct ieee80211_key *old_key,
+ struct ieee80211_key *new_key,
+ bool ptk0rekey)
+{
+ struct ieee80211_sub_if_data *sdata;
+ struct ieee80211_local *local;
+ struct sta_info *sta;
+ int ret;
+
+ /* Aggregation sessions are ok when running on SW crypto.
+ * A broken remote STA may cause issues not observed with HW
+ * crypto, though.
+ */
+ if (!(old_key->flags & KEY_FLAG_UPLOADED_TO_HARDWARE))
+ return 0;
+
+ assert_key_lock(old_key->local);
+ sta = old_key->sta;
+
+ /* PTK only using key ID 0 needs special handling on rekey */
+ if (new_key && sta && ptk0rekey) {
+ local = old_key->local;
+ sdata = old_key->sdata;
+
+ /* Stop TX till we are on the new key */
+ old_key->flags |= KEY_FLAG_TAINTED;
+ ieee80211_clear_fast_xmit(sta);
+
+ /* Aggregation sessions during rekey are complicated due to
+ * the reorder buffer. Side step that by blocking aggregation
+ * and tear down running connections.
+ */
+ if (ieee80211_hw_check(&local->hw, AMPDU_AGGREGATION)) {
+ set_sta_flag(sta, WLAN_STA_BLOCK_BA);
+ ieee80211_sta_tear_down_BA_sessions(sta,
+ AGG_STOP_LOCAL_REQUEST);
+ }
+
+ if (!wiphy_ext_feature_isset(new_key->local->hw.wiphy,
+ NL80211_EXT_FEATURE_CAN_REPLACE_PTK0)) {
+ pr_warn_ratelimited("Userspace is rekeying PTK for " \
+ "STA %pM with a driver not " \
+ "setting NL80211_EXT_FEATURE_" \
+ "CAN_REPLACE_PTK0. This may leak " \
+ "clear text packets or randomly " \
+ "freeze the connection.",
+ sta->sta.addr);
+ /* Flushing the driver queues *may* help prevent
+ * the cleartext leaks and freezes.
+ */
+ ieee80211_flush_queues(old_key->local,
+ old_key->sdata,
+ false);
+ }
+ ieee80211_key_disable_hw_accel(old_key);
+ ret = ieee80211_key_enable_hw_accel(new_key);
+ } else {
+ ieee80211_key_disable_hw_accel(old_key);
+ if (new_key)
+ ret = ieee80211_key_enable_hw_accel(new_key);
+ else
+ ret = 0;
+ }
+
+ return ret;
}

static void __ieee80211_set_default_key(struct ieee80211_sub_if_data *sdata,
@@ -316,38 +382,56 @@ void ieee80211_set_default_mgmt_key(struct ieee80211_sub_if_data *sdata,
}


-static void ieee80211_key_replace(struct ieee80211_sub_if_data *sdata,
+static int ieee80211_key_replace(struct ieee80211_sub_if_data *sdata,
struct sta_info *sta,
bool pairwise,
struct ieee80211_key *old,
struct ieee80211_key *new)
{
int idx;
+ int ret;
bool defunikey, defmultikey, defmgmtkey;

/* caller must provide at least one old/new */
if (WARN_ON(!new && !old))
- return;
+ return 0;

if (new)
list_add_tail_rcu(&new->list, &sdata->key_list);

WARN_ON(new && old && new->conf.keyidx != old->conf.keyidx);

- if (old)
+ if (old) {
idx = old->conf.keyidx;
- else
+ /* TODO: proper implement and test "Extended Key ID for
+ * Individually Addressed Frames" from IEEE 802.11-2016.
+ * Till then always assume only key ID 0 is used for
+ * pairwise keys.*/
+ ret = ieee80211_hw_key_replace(old, new, pairwise);
+ } else {
idx = new->conf.keyidx;
+ if (new && !new->local->wowlan)
+ ret = ieee80211_key_enable_hw_accel(new);
+ else
+ ret = 0;
+ }
+
+ if (ret)
+ return ret;

if (sta) {
if (pairwise) {
rcu_assign_pointer(sta->ptk[idx], new);
sta->ptk_idx = idx;
- ieee80211_check_fast_xmit(sta);
+ if (new) {
+ clear_sta_flag(sta, WLAN_STA_BLOCK_BA);
+ ieee80211_check_fast_xmit(sta);
+ }
} else {
rcu_assign_pointer(sta->gtk[idx], new);
}
- ieee80211_check_fast_rx(sta);
+ if (new)
+ ieee80211_check_fast_rx(sta);
} else {
defunikey = old &&
old == key_mtx_dereference(sdata->local,
@@ -380,6 +464,8 @@ static void ieee80211_key_replace(struct ieee80211_sub_if_data *sdata,

if (old)
list_del_rcu(&old->list);
+
+ return 0;
}

struct ieee80211_key *
@@ -575,9 +661,6 @@ static void ieee80211_key_free_common(struct ieee80211_key *key)
static void __ieee80211_key_destroy(struct ieee80211_key *key,
bool delay_tailroom)
{
- if (key->local)
- ieee80211_key_disable_hw_accel(key);
-
if (key->local) {
struct ieee80211_sub_if_data *sdata = key->sdata;

@@ -654,7 +737,6 @@ int ieee80211_key_link(struct ieee80211_key *key,
struct ieee80211_sub_if_data *sdata,
struct sta_info *sta)
{
- struct ieee80211_local *local = sdata->local;
struct ieee80211_key *old_key;
int idx = key->conf.keyidx;
bool pairwise = key->conf.flags & IEEE80211_KEY_FLAG_PAIRWISE;
@@ -691,17 +773,13 @@ int ieee80211_key_link(struct ieee80211_key *key,

increment_tailroom_need_count(sdata);

- ieee80211_key_replace(sdata, sta, pairwise, old_key, key);
- ieee80211_key_destroy(old_key, delay_tailroom);
-
- ieee80211_debugfs_key_add(key);
+ ret = ieee80211_key_replace(sdata, sta, pairwise, old_key, key);

- if (!local->wowlan) {
- ret = ieee80211_key_enable_hw_accel(key);
- if (ret)
- ieee80211_key_free(key, delay_tailroom);
+ if (!ret) {
+ ieee80211_debugfs_key_add(key);
+ ieee80211_key_destroy(old_key, delay_tailroom);
} else {
- ret = 0;
+ ieee80211_key_free(key, delay_tailroom);
}

out:
diff --git a/net/mac80211/tx.c b/net/mac80211/tx.c
index cd332e3e1134..5c79b3c2a7e1 100644
--- a/net/mac80211/tx.c
+++ b/net/mac80211/tx.c
@@ -2951,6 +2951,10 @@ void ieee80211_check_fast_xmit(struct sta_info *sta)
if (!(build.key->flags & KEY_FLAG_UPLOADED_TO_HARDWARE))
goto out;

+ /* Key is being removed */
+ if (build.key->flags & KEY_FLAG_TAINTED)
+ goto out;
+
switch (build.key->conf.cipher) {
case WLAN_CIPHER_SUITE_CCMP:
case WLAN_CIPHER_SUITE_CCMP_256:
--
2.18.0

2018-08-29 20:39:48

by Alexander Wetzel

[permalink] [raw]
Subject: [PATCH v7 1/2] nl80211: Add CAN_REPLACE_PTK0 API

Drivers able to correctly replace a in-use key should set
@NL80211_EXT_FEATURE_CAN_REPLACE_PTK0 to allow the user space (e.g.
hostapd or wpa_supplicant) to rekey PTK keys.

The user space must detect a PTK rekey attempt and only go ahead with it
when the driver has set this flag. If the driver is not supporting the
feature the user space either must not replace the PTK key or perform a
full re-association instead.

Ignoring this flag and continuing to rekey the connection can still work
but has to be considered insecure and broken. Depending on the driver it
can leak clear text packets or freeze the connection and is only
supported to allow the user space to be updated.

Signed-off-by: Alexander Wetzel <[email protected]>
Reviewed-by: Denis Kenzior <[email protected]>
---
include/uapi/linux/nl80211.h | 6 ++++++
1 file changed, 6 insertions(+)

diff --git a/include/uapi/linux/nl80211.h b/include/uapi/linux/nl80211.h
index 7acc16f34942..cf238064d8ab 100644
--- a/include/uapi/linux/nl80211.h
+++ b/include/uapi/linux/nl80211.h
@@ -5224,6 +5224,11 @@ enum nl80211_feature_flags {
* except for supported rates from the probe request content if requested
* by the %NL80211_SCAN_FLAG_MIN_PREQ_CONTENT flag.
*
+ * @NL80211_EXT_FEATURE_CAN_REPLACE_PTK0: Driver/device confirm that they are
+ * able to rekey an in-use key correctly. Userspace must not rekey PTK keys
+ * if this flag is not set. Ignoring this can leak clear text packets and/or
+ * freeze the connection.
+ *
* @NUM_NL80211_EXT_FEATURES: number of extended features.
* @MAX_NL80211_EXT_FEATURES: highest extended feature index.
*/
@@ -5259,6 +5264,7 @@ enum nl80211_ext_feature_index {
NL80211_EXT_FEATURE_TXQS,
NL80211_EXT_FEATURE_SCAN_RANDOM_SN,
NL80211_EXT_FEATURE_SCAN_MIN_PREQ_CONTENT,
+ NL80211_EXT_FEATURE_CAN_REPLACE_PTK0,

/* add new features before the definition below */
NUM_NL80211_EXT_FEATURES,
--
2.18.0