2022-11-23 17:45:25

by Dmitry Safonov

Subject: [PATCH v6 0/5] net/tcp: Dynamically disable TCP-MD5 static key

Changes from v5:
- Corrected comment for static_key_fast_inc_not_negative() (Peter)
- Renamed static_key_fast_inc_not_negative() =>
static_key_fast_inc_not_disabled() (as suggested by Peter)
- static_key_fast_inc_not_disabled() is exported and declared in the
patch 1 that defines it, rather than in patch 3 that uses it (Peter)

Changes from v4:
- Used rcu_dereference_protected() for tp->md5sig_info in
tcp_md5_do_add() and tcp_md5_key_copy() fail paths to make sure
there won't be false-positives from sparse (Jakub)
- Added Acked-by: Jakub Kicinski

Changes from v3:
- Used atomic_try_cmpxchg() as suggested by Peter Zijlstra
- Renamed static_key_fast_inc() => static_key_fast_inc_not_negative()
(addressing Peter Zijlstra's review)
- Based on linux-tip/master
- tcp_md5_key_copy() now does net_warn_ratelimited()
(addressing Peter Zijlstra's review)
tcp_md5_do_add() does not as it returns -EUSERS from setsockopt()
syscall back to the userspace
- Corrected WARN_ON_ONCE(!static_key_fast_inc(key))
(Spotted by Jason Baron)
- Moved declaration of static_key_fast_inc_not_negative() and its
EXPORT_SYMBOL_GPL() to the patch 3 that uses it,
"net/tcp: Disable TCP-MD5 static key on tcp_md5sig_info destruction"
(addressing Peter Zijlstra's review)
- Added patch 4 that destroys the newly created request socket
if md5 info allocation or the static_key increment was unsuccessful,
instead of proceeding to add a socket without TCP-MD5 keys.
- Added patch 5 that separates helper tcp_time_wait_init()
and converts BUG_ON() to WARN_ON_ONCE().

Changes from v2:
- Prevent key->enabled from turning negative by overflow from
static_key_slow_inc() or static_key_fast_inc()
(addressing Peter Zijlstra's review)
- Added checks whether static_branch_inc() and static_key_fast_inc()
were successful to the TCP-MD5 code.

Changes from v1:
- Add static_key_fast_inc() helper rather than open-coded atomic_inc()
(as suggested by Eric Dumazet)

Version 5:
https://lore.kernel.org/all/[email protected]/T/#u
Version 4:
https://lore.kernel.org/all/[email protected]/T/#u
Version 3:
https://lore.kernel.org/all/[email protected]/T/#u
Version 2:
https://lore.kernel.org/all/[email protected]/T/#u
Version 1:
https://lore.kernel.org/all/[email protected]/T/#u

The static key introduced by commit 6015c71e656b ("tcp: md5: add
tcp_md5_needed jump label") is a fast-path optimization aimed at
avoiding a cache line miss.
Once an MD5 key is introduced in the system, the static key is enabled
and never disabled. Address this by disabling the static key when
the last tcp_md5sig_info in the system is destroyed.

Previously this was submitted as part of the TCP-AO patch set [1].
Now, in an attempt to split up the 36-patch submission, I am sending this independently.

Cc: Ard Biesheuvel <[email protected]>
Cc: Bob Gilligan <[email protected]>
Cc: David Ahern <[email protected]>
Cc: "David S. Miller" <[email protected]>
Cc: Dmitry Safonov <[email protected]>
Cc: Eric Dumazet <[email protected]>
Cc: Francesco Ruggeri <[email protected]>
Cc: Hideaki YOSHIFUJI <[email protected]>
Cc: Jakub Kicinski <[email protected]>
Cc: Jason Baron <[email protected]>
Cc: Josh Poimboeuf <[email protected]>
Cc: Paolo Abeni <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Salam Noureddine <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: [email protected]
Cc: [email protected]

[1]: https://lore.kernel.org/all/[email protected]/T/#u

Thanks,
Dmitry

Dmitry Safonov (5):
jump_label: Prevent key->enabled int overflow
net/tcp: Separate tcp_md5sig_info allocation into
tcp_md5sig_info_add()
net/tcp: Disable TCP-MD5 static key on tcp_md5sig_info destruction
net/tcp: Do cleanup on tcp_md5_key_copy() failure
net/tcp: Separate initialization of twsk

include/linux/jump_label.h | 21 +++++++--
include/net/tcp.h | 10 ++--
kernel/jump_label.c | 56 +++++++++++++++++-----
net/ipv4/tcp.c | 5 +-
net/ipv4/tcp_ipv4.c | 96 +++++++++++++++++++++++++++++---------
net/ipv4/tcp_minisocks.c | 61 +++++++++++++++---------
net/ipv4/tcp_output.c | 4 +-
net/ipv6/tcp_ipv6.c | 21 ++++-----
8 files changed, 194 insertions(+), 80 deletions(-)


base-commit: 736b6d81d93cf61a0601af90bd552103ef997b3f
--
2.38.1


2022-11-23 17:45:44

by Dmitry Safonov

Subject: [PATCH v6 1/5] jump_label: Prevent key->enabled int overflow

1. With CONFIG_JUMP_LABEL=n static_key_slow_inc() doesn't have any
protection against key->enabled refcounter overflow.
2. With CONFIG_JUMP_LABEL=y static_key_slow_inc_cpuslocked()
still may turn the refcounter negative as (v + 1) may overflow.

key->enabled is indeed a ref-counter as it's documented in multiple
places: top comment in jump_label.h, Documentation/staging/static-keys.rst,
etc.

As -1 is reserved for a static key that's in the process of being
enabled, functions would break with a negative key->enabled refcount:
- for CONFIG_JUMP_LABEL=n, a negative return of static_key_count()
breaks static_key_false() and static_key_true()
- the ref counter may become 0 from the negative side after too many
static_key_slow_inc() calls and lead to use-after-free issues.

These flaws mean that some users have to introduce an additional mutex
and prevent the reference counter from overflowing themselves; see
bpf_enable_runtime_stats() checking the counter against INT_MAX / 2.

Prevent the reference counter overflow by checking if (v + 1) > 0.
Change the functions' API to return whether the increment was successful.

Signed-off-by: Dmitry Safonov <[email protected]>
Acked-by: Jakub Kicinski <[email protected]>
---
include/linux/jump_label.h | 21 +++++++++++---
kernel/jump_label.c | 56 ++++++++++++++++++++++++++++++--------
2 files changed, 61 insertions(+), 16 deletions(-)

diff --git a/include/linux/jump_label.h b/include/linux/jump_label.h
index 570831ca9951..4e968ebadce6 100644
--- a/include/linux/jump_label.h
+++ b/include/linux/jump_label.h
@@ -224,9 +224,10 @@ extern bool arch_jump_label_transform_queue(struct jump_entry *entry,
enum jump_label_type type);
extern void arch_jump_label_transform_apply(void);
extern int jump_label_text_reserved(void *start, void *end);
-extern void static_key_slow_inc(struct static_key *key);
+extern bool static_key_slow_inc(struct static_key *key);
+extern bool static_key_fast_inc_not_disabled(struct static_key *key);
extern void static_key_slow_dec(struct static_key *key);
-extern void static_key_slow_inc_cpuslocked(struct static_key *key);
+extern bool static_key_slow_inc_cpuslocked(struct static_key *key);
extern void static_key_slow_dec_cpuslocked(struct static_key *key);
extern int static_key_count(struct static_key *key);
extern void static_key_enable(struct static_key *key);
@@ -278,11 +279,23 @@ static __always_inline bool static_key_true(struct static_key *key)
return false;
}

-static inline void static_key_slow_inc(struct static_key *key)
+static inline bool static_key_fast_inc_not_disabled(struct static_key *key)
{
+ int v;
+
STATIC_KEY_CHECK_USE(key);
- atomic_inc(&key->enabled);
+ /*
+ * Prevent key->enabled getting negative to follow the same semantics
+ * as for CONFIG_JUMP_LABEL=y, see kernel/jump_label.c comment.
+ */
+ v = atomic_read(&key->enabled);
+ do {
+ if (v < 0 || (v + 1) < 0)
+ return false;
+ } while (!likely(atomic_try_cmpxchg(&key->enabled, &v, v + 1)));
+ return true;
}
+#define static_key_slow_inc(key) static_key_fast_inc_not_disabled(key)

static inline void static_key_slow_dec(struct static_key *key)
{
diff --git a/kernel/jump_label.c b/kernel/jump_label.c
index 4d6c6f5f60db..d9c822bbffb8 100644
--- a/kernel/jump_label.c
+++ b/kernel/jump_label.c
@@ -113,9 +113,40 @@ int static_key_count(struct static_key *key)
}
EXPORT_SYMBOL_GPL(static_key_count);

-void static_key_slow_inc_cpuslocked(struct static_key *key)
+/*
+ * static_key_fast_inc_not_disabled - adds a user for a static key
+ * @key: static key that must be already enabled
+ *
+ * The caller must make sure that the static key can't get disabled while
+ * in this function. It doesn't patch jump labels, only adds a user to
+ * an already enabled static key.
+ *
+ * Returns true if the increment was done. Unlike refcount_t the ref counter
+ * is not saturated, but will fail to increment on overflow.
+ */
+bool static_key_fast_inc_not_disabled(struct static_key *key)
{
+ int v;
+
STATIC_KEY_CHECK_USE(key);
+ /*
+ * Negative key->enabled has a special meaning: it sends
+ * static_key_slow_inc() down the slow path, and it is non-zero
+ * so it counts as "enabled" in jump_label_update(). Note that
+ * atomic_inc_unless_negative() checks >= 0, so roll our own.
+ */
+ v = atomic_read(&key->enabled);
+ do {
+ if (v <= 0 || (v + 1) < 0)
+ return false;
+ } while (!likely(atomic_try_cmpxchg(&key->enabled, &v, v + 1)));
+
+ return true;
+}
+EXPORT_SYMBOL_GPL(static_key_fast_inc_not_disabled);
+
+bool static_key_slow_inc_cpuslocked(struct static_key *key)
+{
lockdep_assert_cpus_held();

/*
@@ -124,15 +155,9 @@ void static_key_slow_inc_cpuslocked(struct static_key *key)
* jump_label_update() process. At the same time, however,
* the jump_label_update() call below wants to see
* static_key_enabled(&key) for jumps to be updated properly.
- *
- * So give a special meaning to negative key->enabled: it sends
- * static_key_slow_inc() down the slow path, and it is non-zero
- * so it counts as "enabled" in jump_label_update(). Note that
- * atomic_inc_unless_negative() checks >= 0, so roll our own.
*/
- for (int v = atomic_read(&key->enabled); v > 0; )
- if (likely(atomic_try_cmpxchg(&key->enabled, &v, v + 1)))
- return;
+ if (static_key_fast_inc_not_disabled(key))
+ return true;

jump_label_lock();
if (atomic_read(&key->enabled) == 0) {
@@ -144,16 +169,23 @@ void static_key_slow_inc_cpuslocked(struct static_key *key)
*/
atomic_set_release(&key->enabled, 1);
} else {
- atomic_inc(&key->enabled);
+ if (WARN_ON_ONCE(!static_key_fast_inc_not_disabled(key))) {
+ jump_label_unlock();
+ return false;
+ }
}
jump_label_unlock();
+ return true;
}

-void static_key_slow_inc(struct static_key *key)
+bool static_key_slow_inc(struct static_key *key)
{
+ bool ret;
+
cpus_read_lock();
- static_key_slow_inc_cpuslocked(key);
+ ret = static_key_slow_inc_cpuslocked(key);
cpus_read_unlock();
+ return ret;
}
EXPORT_SYMBOL_GPL(static_key_slow_inc);

--
2.38.1

2022-11-23 17:45:48

by Dmitry Safonov

Subject: [PATCH v6 2/5] net/tcp: Separate tcp_md5sig_info allocation into tcp_md5sig_info_add()

Add a helper that allocates tcp_md5sig_info. This will make it easier
later to do or allocate things once per socket, when the info is first
allocated.

Signed-off-by: Dmitry Safonov <[email protected]>
Reviewed-by: Eric Dumazet <[email protected]>
Acked-by: Jakub Kicinski <[email protected]>
---
net/ipv4/tcp_ipv4.c | 30 +++++++++++++++++++++---------
1 file changed, 21 insertions(+), 9 deletions(-)

diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index f0343538d1f8..2d76d50b8ae8 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -1172,6 +1172,24 @@ struct tcp_md5sig_key *tcp_v4_md5_lookup(const struct sock *sk,
}
EXPORT_SYMBOL(tcp_v4_md5_lookup);

+static int tcp_md5sig_info_add(struct sock *sk, gfp_t gfp)
+{
+ struct tcp_sock *tp = tcp_sk(sk);
+ struct tcp_md5sig_info *md5sig;
+
+ if (rcu_dereference_protected(tp->md5sig_info, lockdep_sock_is_held(sk)))
+ return 0;
+
+ md5sig = kmalloc(sizeof(*md5sig), gfp);
+ if (!md5sig)
+ return -ENOMEM;
+
+ sk_gso_disable(sk);
+ INIT_HLIST_HEAD(&md5sig->head);
+ rcu_assign_pointer(tp->md5sig_info, md5sig);
+ return 0;
+}
+
/* This can be called on a newly created socket, from other files */
int tcp_md5_do_add(struct sock *sk, const union tcp_md5_addr *addr,
int family, u8 prefixlen, int l3index, u8 flags,
@@ -1202,17 +1220,11 @@ int tcp_md5_do_add(struct sock *sk, const union tcp_md5_addr *addr,
return 0;
}

+ if (tcp_md5sig_info_add(sk, gfp))
+ return -ENOMEM;
+
md5sig = rcu_dereference_protected(tp->md5sig_info,
lockdep_sock_is_held(sk));
- if (!md5sig) {
- md5sig = kmalloc(sizeof(*md5sig), gfp);
- if (!md5sig)
- return -ENOMEM;
-
- sk_gso_disable(sk);
- INIT_HLIST_HEAD(&md5sig->head);
- rcu_assign_pointer(tp->md5sig_info, md5sig);
- }

key = sock_kmalloc(sk, sizeof(*key), gfp | __GFP_ZERO);
if (!key)
--
2.38.1

2022-11-23 17:46:25

by Dmitry Safonov

Subject: [PATCH v6 5/5] net/tcp: Separate initialization of twsk

Convert BUG_ON() to WARN_ON_ONCE(), and warn as well on the unlikely
static key int overflow error path.

Signed-off-by: Dmitry Safonov <[email protected]>
Acked-by: Jakub Kicinski <[email protected]>
---
net/ipv4/tcp_minisocks.c | 61 +++++++++++++++++++++++-----------------
1 file changed, 35 insertions(+), 26 deletions(-)

diff --git a/net/ipv4/tcp_minisocks.c b/net/ipv4/tcp_minisocks.c
index 6908812d50d3..e002f2e1d4f2 100644
--- a/net/ipv4/tcp_minisocks.c
+++ b/net/ipv4/tcp_minisocks.c
@@ -240,6 +240,40 @@ tcp_timewait_state_process(struct inet_timewait_sock *tw, struct sk_buff *skb,
}
EXPORT_SYMBOL(tcp_timewait_state_process);

+static void tcp_time_wait_init(struct sock *sk, struct tcp_timewait_sock *tcptw)
+{
+#ifdef CONFIG_TCP_MD5SIG
+ const struct tcp_sock *tp = tcp_sk(sk);
+ struct tcp_md5sig_key *key;
+
+ /*
+ * The timewait bucket does not have the key DB from the
+ * sock structure. We just make a quick copy of the
+ * md5 key being used (if indeed we are using one)
+ * so the timewait ack generating code has the key.
+ */
+ tcptw->tw_md5_key = NULL;
+ if (!static_branch_unlikely(&tcp_md5_needed.key))
+ return;
+
+ key = tp->af_specific->md5_lookup(sk, sk);
+ if (key) {
+ tcptw->tw_md5_key = kmemdup(key, sizeof(*key), GFP_ATOMIC);
+ if (!tcptw->tw_md5_key)
+ return;
+ if (!tcp_alloc_md5sig_pool())
+ goto out_free;
+ if (!static_key_fast_inc_not_disabled(&tcp_md5_needed.key.key))
+ goto out_free;
+ }
+ return;
+out_free:
+ WARN_ON_ONCE(1);
+ kfree(tcptw->tw_md5_key);
+ tcptw->tw_md5_key = NULL;
+#endif
+}
+
/*
* Move a socket to time-wait or dead fin-wait-2 state.
*/
@@ -282,32 +316,7 @@ void tcp_time_wait(struct sock *sk, int state, int timeo)
}
#endif

-#ifdef CONFIG_TCP_MD5SIG
- /*
- * The timewait bucket does not have the key DB from the
- * sock structure. We just make a quick copy of the
- * md5 key being used (if indeed we are using one)
- * so the timewait ack generating code has the key.
- */
- do {
- tcptw->tw_md5_key = NULL;
- if (static_branch_unlikely(&tcp_md5_needed.key)) {
- struct tcp_md5sig_key *key;
-
- key = tp->af_specific->md5_lookup(sk, sk);
- if (key) {
- tcptw->tw_md5_key = kmemdup(key, sizeof(*key), GFP_ATOMIC);
- if (!tcptw->tw_md5_key)
- break;
- BUG_ON(!tcp_alloc_md5sig_pool());
- if (!static_key_fast_inc_not_disabled(&tcp_md5_needed.key.key)) {
- kfree(tcptw->tw_md5_key);
- tcptw->tw_md5_key = NULL;
- }
- }
- }
- } while (0);
-#endif
+ tcp_time_wait_init(sk, tcptw);

/* Get the TIME_WAIT timeout firing. */
if (timeo < rto)
--
2.38.1

2022-11-23 17:56:38

by Dmitry Safonov

Subject: [PATCH v6 3/5] net/tcp: Disable TCP-MD5 static key on tcp_md5sig_info destruction

To do that, separate two scenarios:
- where it's the first MD5 key on the system, which means that enabling
of the static key may need to sleep;
- copying of an existing key from a listening socket to the request
socket upon receiving a signed TCP segment, where the static key was
already enabled (when the key was added to the listening socket).

Now the lifetime of the static branch for TCP-MD5 lasts until:
- the last tcp_md5sig_info is destroyed
- the last socket in time-wait state with an MD5 key is closed.

This means that after all sockets with TCP-MD5 keys are gone, the
system gets back the performance of the disabled md5-key static branch.

While at it, provide a static_key_fast_inc_not_disabled() helper that
does the ref counter increment in an atomic fashion (without grabbing
cpus_read_lock() on CONFIG_JUMP_LABEL=y). This is needed to add a new
user for a static_key when the caller controls the lifetime of another
user.

Signed-off-by: Dmitry Safonov <[email protected]>
Acked-by: Jakub Kicinski <[email protected]>
---
include/net/tcp.h | 10 ++++--
net/ipv4/tcp.c | 5 +--
net/ipv4/tcp_ipv4.c | 71 ++++++++++++++++++++++++++++++++--------
net/ipv4/tcp_minisocks.c | 16 ++++++---
net/ipv4/tcp_output.c | 4 +--
net/ipv6/tcp_ipv6.c | 10 +++---
6 files changed, 84 insertions(+), 32 deletions(-)

diff --git a/include/net/tcp.h b/include/net/tcp.h
index 6b814e788f00..f925377066fe 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -1675,7 +1675,11 @@ int tcp_v4_md5_hash_skb(char *md5_hash, const struct tcp_md5sig_key *key,
const struct sock *sk, const struct sk_buff *skb);
int tcp_md5_do_add(struct sock *sk, const union tcp_md5_addr *addr,
int family, u8 prefixlen, int l3index, u8 flags,
- const u8 *newkey, u8 newkeylen, gfp_t gfp);
+ const u8 *newkey, u8 newkeylen);
+int tcp_md5_key_copy(struct sock *sk, const union tcp_md5_addr *addr,
+ int family, u8 prefixlen, int l3index,
+ struct tcp_md5sig_key *key);
+
int tcp_md5_do_del(struct sock *sk, const union tcp_md5_addr *addr,
int family, u8 prefixlen, int l3index, u8 flags);
struct tcp_md5sig_key *tcp_v4_md5_lookup(const struct sock *sk,
@@ -1683,7 +1687,7 @@ struct tcp_md5sig_key *tcp_v4_md5_lookup(const struct sock *sk,

#ifdef CONFIG_TCP_MD5SIG
#include <linux/jump_label.h>
-extern struct static_key_false tcp_md5_needed;
+extern struct static_key_false_deferred tcp_md5_needed;
struct tcp_md5sig_key *__tcp_md5_do_lookup(const struct sock *sk, int l3index,
const union tcp_md5_addr *addr,
int family);
@@ -1691,7 +1695,7 @@ static inline struct tcp_md5sig_key *
tcp_md5_do_lookup(const struct sock *sk, int l3index,
const union tcp_md5_addr *addr, int family)
{
- if (!static_branch_unlikely(&tcp_md5_needed))
+ if (!static_branch_unlikely(&tcp_md5_needed.key))
return NULL;
return __tcp_md5_do_lookup(sk, l3index, addr, family);
}
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 4a69c5fcfedc..267406f199bc 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -4465,11 +4465,8 @@ bool tcp_alloc_md5sig_pool(void)
if (unlikely(!READ_ONCE(tcp_md5sig_pool_populated))) {
mutex_lock(&tcp_md5sig_mutex);

- if (!tcp_md5sig_pool_populated) {
+ if (!tcp_md5sig_pool_populated)
__tcp_alloc_md5sig_pool();
- if (tcp_md5sig_pool_populated)
- static_branch_inc(&tcp_md5_needed);
- }

mutex_unlock(&tcp_md5sig_mutex);
}
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 2d76d50b8ae8..2ae6a061f36e 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -1064,7 +1064,7 @@ static void tcp_v4_reqsk_destructor(struct request_sock *req)
* We need to maintain these in the sk structure.
*/

-DEFINE_STATIC_KEY_FALSE(tcp_md5_needed);
+DEFINE_STATIC_KEY_DEFERRED_FALSE(tcp_md5_needed, HZ);
EXPORT_SYMBOL(tcp_md5_needed);

static bool better_md5_match(struct tcp_md5sig_key *old, struct tcp_md5sig_key *new)
@@ -1177,9 +1177,6 @@ static int tcp_md5sig_info_add(struct sock *sk, gfp_t gfp)
struct tcp_sock *tp = tcp_sk(sk);
struct tcp_md5sig_info *md5sig;

- if (rcu_dereference_protected(tp->md5sig_info, lockdep_sock_is_held(sk)))
- return 0;
-
md5sig = kmalloc(sizeof(*md5sig), gfp);
if (!md5sig)
return -ENOMEM;
@@ -1191,9 +1188,9 @@ static int tcp_md5sig_info_add(struct sock *sk, gfp_t gfp)
}

/* This can be called on a newly created socket, from other files */
-int tcp_md5_do_add(struct sock *sk, const union tcp_md5_addr *addr,
- int family, u8 prefixlen, int l3index, u8 flags,
- const u8 *newkey, u8 newkeylen, gfp_t gfp)
+static int __tcp_md5_do_add(struct sock *sk, const union tcp_md5_addr *addr,
+ int family, u8 prefixlen, int l3index, u8 flags,
+ const u8 *newkey, u8 newkeylen, gfp_t gfp)
{
/* Add Key to the list */
struct tcp_md5sig_key *key;
@@ -1220,9 +1217,6 @@ int tcp_md5_do_add(struct sock *sk, const union tcp_md5_addr *addr,
return 0;
}

- if (tcp_md5sig_info_add(sk, gfp))
- return -ENOMEM;
-
md5sig = rcu_dereference_protected(tp->md5sig_info,
lockdep_sock_is_held(sk));

@@ -1246,8 +1240,59 @@ int tcp_md5_do_add(struct sock *sk, const union tcp_md5_addr *addr,
hlist_add_head_rcu(&key->node, &md5sig->head);
return 0;
}
+
+int tcp_md5_do_add(struct sock *sk, const union tcp_md5_addr *addr,
+ int family, u8 prefixlen, int l3index, u8 flags,
+ const u8 *newkey, u8 newkeylen)
+{
+ struct tcp_sock *tp = tcp_sk(sk);
+
+ if (!rcu_dereference_protected(tp->md5sig_info, lockdep_sock_is_held(sk))) {
+ if (tcp_md5sig_info_add(sk, GFP_KERNEL))
+ return -ENOMEM;
+
+ if (!static_branch_inc(&tcp_md5_needed.key)) {
+ struct tcp_md5sig_info *md5sig;
+
+ md5sig = rcu_dereference_protected(tp->md5sig_info, lockdep_sock_is_held(sk));
+ rcu_assign_pointer(tp->md5sig_info, NULL);
+ kfree_rcu(md5sig);
+ return -EUSERS;
+ }
+ }
+
+ return __tcp_md5_do_add(sk, addr, family, prefixlen, l3index, flags,
+ newkey, newkeylen, GFP_KERNEL);
+}
EXPORT_SYMBOL(tcp_md5_do_add);

+int tcp_md5_key_copy(struct sock *sk, const union tcp_md5_addr *addr,
+ int family, u8 prefixlen, int l3index,
+ struct tcp_md5sig_key *key)
+{
+ struct tcp_sock *tp = tcp_sk(sk);
+
+ if (!rcu_dereference_protected(tp->md5sig_info, lockdep_sock_is_held(sk))) {
+ if (tcp_md5sig_info_add(sk, sk_gfp_mask(sk, GFP_ATOMIC)))
+ return -ENOMEM;
+
+ if (!static_key_fast_inc_not_disabled(&tcp_md5_needed.key.key)) {
+ struct tcp_md5sig_info *md5sig;
+
+ md5sig = rcu_dereference_protected(tp->md5sig_info, lockdep_sock_is_held(sk));
+ net_warn_ratelimited("Too many TCP-MD5 keys in the system\n");
+ rcu_assign_pointer(tp->md5sig_info, NULL);
+ kfree_rcu(md5sig);
+ return -EUSERS;
+ }
+ }
+
+ return __tcp_md5_do_add(sk, addr, family, prefixlen, l3index,
+ key->flags, key->key, key->keylen,
+ sk_gfp_mask(sk, GFP_ATOMIC));
+}
+EXPORT_SYMBOL(tcp_md5_key_copy);
+
int tcp_md5_do_del(struct sock *sk, const union tcp_md5_addr *addr, int family,
u8 prefixlen, int l3index, u8 flags)
{
@@ -1334,7 +1379,7 @@ static int tcp_v4_parse_md5_keys(struct sock *sk, int optname,
return -EINVAL;

return tcp_md5_do_add(sk, addr, AF_INET, prefixlen, l3index, flags,
- cmd.tcpm_key, cmd.tcpm_keylen, GFP_KERNEL);
+ cmd.tcpm_key, cmd.tcpm_keylen);
}

static int tcp_v4_md5_hash_headers(struct tcp_md5sig_pool *hp,
@@ -1591,8 +1636,7 @@ struct sock *tcp_v4_syn_recv_sock(const struct sock *sk, struct sk_buff *skb,
* memory, then we end up not copying the key
* across. Shucks.
*/
- tcp_md5_do_add(newsk, addr, AF_INET, 32, l3index, key->flags,
- key->key, key->keylen, GFP_ATOMIC);
+ tcp_md5_key_copy(newsk, addr, AF_INET, 32, l3index, key);
sk_gso_disable(newsk);
}
#endif
@@ -2284,6 +2328,7 @@ void tcp_v4_destroy_sock(struct sock *sk)
tcp_clear_md5_list(sk);
kfree_rcu(rcu_dereference_protected(tp->md5sig_info, 1), rcu);
tp->md5sig_info = NULL;
+ static_branch_slow_dec_deferred(&tcp_md5_needed);
}
#endif

diff --git a/net/ipv4/tcp_minisocks.c b/net/ipv4/tcp_minisocks.c
index c375f603a16c..6908812d50d3 100644
--- a/net/ipv4/tcp_minisocks.c
+++ b/net/ipv4/tcp_minisocks.c
@@ -291,13 +291,19 @@ void tcp_time_wait(struct sock *sk, int state, int timeo)
*/
do {
tcptw->tw_md5_key = NULL;
- if (static_branch_unlikely(&tcp_md5_needed)) {
+ if (static_branch_unlikely(&tcp_md5_needed.key)) {
struct tcp_md5sig_key *key;

key = tp->af_specific->md5_lookup(sk, sk);
if (key) {
tcptw->tw_md5_key = kmemdup(key, sizeof(*key), GFP_ATOMIC);
- BUG_ON(tcptw->tw_md5_key && !tcp_alloc_md5sig_pool());
+ if (!tcptw->tw_md5_key)
+ break;
+ BUG_ON(!tcp_alloc_md5sig_pool());
+ if (!static_key_fast_inc_not_disabled(&tcp_md5_needed.key.key)) {
+ kfree(tcptw->tw_md5_key);
+ tcptw->tw_md5_key = NULL;
+ }
}
}
} while (0);
@@ -337,11 +343,13 @@ EXPORT_SYMBOL(tcp_time_wait);
void tcp_twsk_destructor(struct sock *sk)
{
#ifdef CONFIG_TCP_MD5SIG
- if (static_branch_unlikely(&tcp_md5_needed)) {
+ if (static_branch_unlikely(&tcp_md5_needed.key)) {
struct tcp_timewait_sock *twsk = tcp_twsk(sk);

- if (twsk->tw_md5_key)
+ if (twsk->tw_md5_key) {
kfree_rcu(twsk->tw_md5_key, rcu);
+ static_branch_slow_dec_deferred(&tcp_md5_needed);
+ }
}
#endif
}
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 894410dc9293..71d01cf3c13e 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -766,7 +766,7 @@ static unsigned int tcp_syn_options(struct sock *sk, struct sk_buff *skb,

*md5 = NULL;
#ifdef CONFIG_TCP_MD5SIG
- if (static_branch_unlikely(&tcp_md5_needed) &&
+ if (static_branch_unlikely(&tcp_md5_needed.key) &&
rcu_access_pointer(tp->md5sig_info)) {
*md5 = tp->af_specific->md5_lookup(sk, sk);
if (*md5) {
@@ -922,7 +922,7 @@ static unsigned int tcp_established_options(struct sock *sk, struct sk_buff *skb

*md5 = NULL;
#ifdef CONFIG_TCP_MD5SIG
- if (static_branch_unlikely(&tcp_md5_needed) &&
+ if (static_branch_unlikely(&tcp_md5_needed.key) &&
rcu_access_pointer(tp->md5sig_info)) {
*md5 = tp->af_specific->md5_lookup(sk, sk);
if (*md5) {
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index f676be14e6b6..83304d6a6bd0 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -677,12 +677,11 @@ static int tcp_v6_parse_md5_keys(struct sock *sk, int optname,
if (ipv6_addr_v4mapped(&sin6->sin6_addr))
return tcp_md5_do_add(sk, (union tcp_md5_addr *)&sin6->sin6_addr.s6_addr32[3],
AF_INET, prefixlen, l3index, flags,
- cmd.tcpm_key, cmd.tcpm_keylen,
- GFP_KERNEL);
+ cmd.tcpm_key, cmd.tcpm_keylen);

return tcp_md5_do_add(sk, (union tcp_md5_addr *)&sin6->sin6_addr,
AF_INET6, prefixlen, l3index, flags,
- cmd.tcpm_key, cmd.tcpm_keylen, GFP_KERNEL);
+ cmd.tcpm_key, cmd.tcpm_keylen);
}

static int tcp_v6_md5_hash_headers(struct tcp_md5sig_pool *hp,
@@ -1382,9 +1381,8 @@ static struct sock *tcp_v6_syn_recv_sock(const struct sock *sk, struct sk_buff *
* memory, then we end up not copying the key
* across. Shucks.
*/
- tcp_md5_do_add(newsk, (union tcp_md5_addr *)&newsk->sk_v6_daddr,
- AF_INET6, 128, l3index, key->flags, key->key, key->keylen,
- sk_gfp_mask(sk, GFP_ATOMIC));
+ tcp_md5_key_copy(newsk, (union tcp_md5_addr *)&newsk->sk_v6_daddr,
+ AF_INET6, 128, l3index, key);
}
#endif

--
2.38.1

2022-11-23 18:26:52

by Dmitry Safonov

Subject: [PATCH v6 4/5] net/tcp: Do cleanup on tcp_md5_key_copy() failure

If the kernel is short on (atomic) memory and fails to allocate it,
don't proceed to creating the request socket. Otherwise the socket
would be unsigned, and userspace likely doesn't expect that the TCP
connection is no longer MD5-signed.

Signed-off-by: Dmitry Safonov <[email protected]>
Acked-by: Jakub Kicinski <[email protected]>
---
net/ipv4/tcp_ipv4.c | 9 ++-------
net/ipv6/tcp_ipv6.c | 15 ++++++++-------
2 files changed, 10 insertions(+), 14 deletions(-)

diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 2ae6a061f36e..e214098087fe 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -1630,13 +1630,8 @@ struct sock *tcp_v4_syn_recv_sock(const struct sock *sk, struct sk_buff *skb,
addr = (union tcp_md5_addr *)&newinet->inet_daddr;
key = tcp_md5_do_lookup(sk, l3index, addr, AF_INET);
if (key) {
- /*
- * We're using one, so create a matching key
- * on the newsk structure. If we fail to get
- * memory, then we end up not copying the key
- * across. Shucks.
- */
- tcp_md5_key_copy(newsk, addr, AF_INET, 32, l3index, key);
+ if (tcp_md5_key_copy(newsk, addr, AF_INET, 32, l3index, key))
+ goto put_and_exit;
sk_gso_disable(newsk);
}
#endif
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index 83304d6a6bd0..21486b4a9774 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -1376,13 +1376,14 @@ static struct sock *tcp_v6_syn_recv_sock(const struct sock *sk, struct sk_buff *
/* Copy over the MD5 key from the original socket */
key = tcp_v6_md5_do_lookup(sk, &newsk->sk_v6_daddr, l3index);
if (key) {
- /* We're using one, so create a matching key
- * on the newsk structure. If we fail to get
- * memory, then we end up not copying the key
- * across. Shucks.
- */
- tcp_md5_key_copy(newsk, (union tcp_md5_addr *)&newsk->sk_v6_daddr,
- AF_INET6, 128, l3index, key);
+ const union tcp_md5_addr *addr;
+
+ addr = (union tcp_md5_addr *)&newsk->sk_v6_daddr;
+ if (tcp_md5_key_copy(newsk, addr, AF_INET6, 128, l3index, key)) {
+ inet_csk_prepare_forced_close(newsk);
+ tcp_done(newsk);
+ goto out;
+ }
}
#endif

--
2.38.1

2022-11-25 08:16:08

by Peter Zijlstra

Subject: Re: [PATCH v6 1/5] jump_label: Prevent key->enabled int overflow

On Wed, Nov 23, 2022 at 05:38:55PM +0000, Dmitry Safonov wrote:
> 1. With CONFIG_JUMP_LABEL=n static_key_slow_inc() doesn't have any
> protection against key->enabled refcounter overflow.
> 2. With CONFIG_JUMP_LABEL=y static_key_slow_inc_cpuslocked()
> still may turn the refcounter negative as (v + 1) may overflow.
>
> key->enabled is indeed a ref-counter as it's documented in multiple
> places: top comment in jump_label.h, Documentation/staging/static-keys.rst,
> etc.
>
> As -1 is reserved for a static key that's in the process of being
> enabled, functions would break with a negative key->enabled refcount:
> - for CONFIG_JUMP_LABEL=n, a negative return of static_key_count()
> breaks static_key_false() and static_key_true()
> - the ref counter may become 0 from the negative side after too many
> static_key_slow_inc() calls and lead to use-after-free issues.
>
> These flaws mean that some users have to introduce an additional mutex
> and prevent the reference counter from overflowing themselves; see
> bpf_enable_runtime_stats() checking the counter against INT_MAX / 2.
>
> Prevent the reference counter overflow by checking if (v + 1) > 0.
> Change the functions' API to return whether the increment was successful.
>
> Signed-off-by: Dmitry Safonov <[email protected]>
> Acked-by: Jakub Kicinski <[email protected]>

This looks good to me:

Acked-by: Peter Zijlstra (Intel) <[email protected]>

What is the plan for merging this? I'm assuming it would want to go
through the network tree, but as already noted earlier it depends on a
patch I have in tip/locking/core.

Now I checked, tip/locking/core is *just* that one patch, so it might be
possible to merge that branch and this series into the network tree and
note that during the pull request to Linus.

2022-11-25 15:25:20

by Dmitry Safonov

Subject: Re: [PATCH v6 1/5] jump_label: Prevent key->enabled int overflow

On 11/25/22 07:59, Peter Zijlstra wrote:
> On Wed, Nov 23, 2022 at 05:38:55PM +0000, Dmitry Safonov wrote:
>> 1. With CONFIG_JUMP_LABEL=n static_key_slow_inc() doesn't have any
>> protection against key->enabled refcounter overflow.
>> 2. With CONFIG_JUMP_LABEL=y static_key_slow_inc_cpuslocked()
>> still may turn the refcounter negative as (v + 1) may overflow.
>>
>> key->enabled is indeed a ref-counter as it's documented in multiple
>> places: top comment in jump_label.h, Documentation/staging/static-keys.rst,
>> etc.
>>
>> As -1 is reserved for a static key that's in the process of being
>> enabled, functions would break with a negative key->enabled refcount:
>> - for CONFIG_JUMP_LABEL=n, a negative return of static_key_count()
>> breaks static_key_false() and static_key_true()
>> - the ref counter may become 0 from the negative side after too many
>> static_key_slow_inc() calls and lead to use-after-free issues.
>>
>> These flaws mean that some users have to introduce an additional mutex
>> and prevent the reference counter from overflowing themselves; see
>> bpf_enable_runtime_stats() checking the counter against INT_MAX / 2.
>>
>> Prevent the reference counter overflow by checking if (v + 1) > 0.
>> Change the functions' API to return whether the increment was successful.
>>
>> Signed-off-by: Dmitry Safonov <[email protected]>
>> Acked-by: Jakub Kicinski <[email protected]>
>
> This looks good to me:
>
> Acked-by: Peter Zijlstra (Intel) <[email protected]>

Thank you, Peter!

> What is the plan for merging this? I'm assuming it would want to go
> through the network tree, but as already noted earlier it depends on a
> patch I have in tip/locking/core.
>
> Now I checked, tip/locking/core is *just* that one patch, so it might be
> possible to merge that branch and this series into the network tree and
> note that during the pull request to Linus.

I initially thought it would have to go through the tip tree because of
the dependency, but as you say, it's just one patch.

I was also asked by Jakub on v4 to wait for Eric's Ack/Review, so once I
get a go from him, I will send all 6 patches for inclusion into the -net
tree, if that is in time before the merge window.

Thanks again for the review and ack,
Dmitry
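For readers outside the thread: the check Peter acked above makes the increment refuse both a disabled key (enabled <= 0, with -1 reserved as the transient "being enabled" state) and an increment that would overflow. Below is a userspace C11 model of that logic; the names here are stand-ins, while the real kernel helper is static_key_fast_inc_not_disabled() doing atomic_try_cmpxchg() on key->enabled:

```c
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Stand-in for key->enabled in this userspace sketch. */
static _Atomic int enabled;

/*
 * Mirror of the overflow-safe increment discussed in the patch:
 * bump the counter only when it is currently positive and the
 * result (v + 1) would not wrap into the negative range.
 */
static bool fast_inc_not_disabled(_Atomic int *v)
{
	int old = atomic_load(v);

	do {
		/* refuse: disabled (0), transient (-1), or about to overflow */
		if (old <= 0 || old == INT_MAX)
			return false;
	} while (!atomic_compare_exchange_weak(v, &old, old + 1));

	return true;
}
```

With this shape, a caller like the TCP-MD5 code can treat a false return as an error (e.g. -EUSERS back to userspace) instead of silently wrapping the refcount.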

2022-12-01 19:58:47

by Eric Dumazet

Subject: Re: [PATCH v6 4/5] net/tcp: Do cleanup on tcp_md5_key_copy() failure

On Wed, Nov 23, 2022 at 6:39 PM Dmitry Safonov <[email protected]> wrote:
>
> If the kernel was short on (atomic) memory and failed to allocate it,
> don't proceed to creating the request socket. Otherwise the socket would
> be unsigned and userspace likely doesn't expect that the TCP connection
> is no longer MD5-signed.
>
> Signed-off-by: Dmitry Safonov <[email protected]>
> Acked-by: Jakub Kicinski <[email protected]>

Reviewed-by: Eric Dumazet <[email protected]>

2022-12-01 20:00:23

by Eric Dumazet

Subject: Re: [PATCH v6 5/5] net/tcp: Separate initialization of twsk

On Wed, Nov 23, 2022 at 6:39 PM Dmitry Safonov <[email protected]> wrote:
>
> Convert BUG_ON() to WARN_ON_ONCE(), and warn as well on the unlikely
> static key int overflow error path.
>
> Signed-off-by: Dmitry Safonov <[email protected]>
> Acked-by: Jakub Kicinski <[email protected]>
> ---

Reviewed-by: Eric Dumazet <[email protected]>

Thanks !

2022-12-01 20:39:56

by Eric Dumazet

Subject: Re: [PATCH v6 3/5] net/tcp: Disable TCP-MD5 static key on tcp_md5sig_info destruction

On Wed, Nov 23, 2022 at 6:39 PM Dmitry Safonov <[email protected]> wrote:
>
> To do that, separate two scenarios:
> - where it's the first MD5 key on the system, which means that enabling
> of the static key may need to sleep;
> - copying of an existing key from a listening socket to the request
> socket upon receiving a signed TCP segment, where static key was
> already enabled (when the key was added to the listening socket).
>
> Now the lifetime of the static branch for TCP-MD5 is until:
> - the last tcp_md5sig_info is destroyed
> - the last socket in time-wait state with an MD5 key is closed.
>
> This means that after all sockets with TCP-MD5 keys are gone, the
> system gets back the performance of a disabled md5-key static branch.
>
> While at it, provide a static_key_fast_inc() helper that does the ref
> counter increment in atomic fashion (without grabbing cpus_read_lock()
> on CONFIG_JUMP_LABEL=y). This is needed to add a new user for
> a static_key when the caller controls the lifetime of another user.
>
> Signed-off-by: Dmitry Safonov <[email protected]>
> Acked-by: Jakub Kicinski <[email protected]>

Reviewed-by: Eric Dumazet <[email protected]>

2022-12-01 23:11:02

by Jakub Kicinski

Subject: Re: [PATCH v6 1/5] jump_label: Prevent key->enabled int overflow

On Fri, 25 Nov 2022 14:28:30 +0000 Dmitry Safonov wrote:
> > What is the plan for merging this? I'm assuming it would want to go
> > through the network tree, but as already noted earlier it depends on a
> > patch I have in tip/locking/core.
> >
> > Now I checked, tip/locking/core is *just* that one patch, so it might be
> > possible to merge that branch and this series into the network tree and
> > note that during the pull request to Linus.
>
> I initially thought it had to go through the tip tree because of the
> dependency, but as you say it's just one patch.
>
> I was also asked by Jakub on v4 to wait for Eric's Ack/Review, so once I
> get a go-ahead from him, I will send all 6 patches for inclusion into the
> -net tree, if that's in time before the merge window.

Looks like we're all set on the networking side (thanks Eric!!)

Should I pull Peter's branch? Or do you want to just resend the patch Peter
already queued? A bit of an unusual situation..

2022-12-01 23:54:59

by Jakub Kicinski

Subject: Re: [PATCH v6 1/5] jump_label: Prevent key->enabled int overflow

On Thu, 1 Dec 2022 23:17:11 +0000 Dmitry Safonov wrote:
> On Thu, 1 Dec 2022 at 22:31, Jakub Kicinski <[email protected]> wrote:
> > > I initially thought it had to go through the tip tree because of the
> > > dependency, but as you say it's just one patch.
> > >
> > > I was also asked by Jakub on v4 to wait for Eric's Ack/Review, so once I
> > > get a go-ahead from him, I will send all 6 patches for inclusion into the
> > > -net tree, if that's in time before the merge window.
> >
> > Looks like we're all set on the networking side (thanks Eric!!)
>
> Thanks!
>
> Should I pull Peter's branch? Or do you want to just resend the patch Peter
> already queued? A bit of an unusual situation..
>
> Either way would work for me.
> I can send it in a couple of hours if you prefer instead of pulling the branch.

I prefer to pull, seems safer in case Peter does get another patch.

It's this one, right?

https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git/log/?h=locking/core

I've pulled; I'll push out once the build is done.

2022-12-02 00:01:15

by Dmitry Safonov

Subject: Re: [PATCH v6 1/5] jump_label: Prevent key->enabled int overflow

On Thu, 1 Dec 2022 at 22:31, Jakub Kicinski <[email protected]> wrote:
>
> On Fri, 25 Nov 2022 14:28:30 +0000 Dmitry Safonov wrote:
> > > What is the plan for merging this? I'm assuming it would want to go
> > > through the network tree, but as already noted earlier it depends on a
> > > patch I have in tip/locking/core.
> > >
> > > Now I checked, tip/locking/core is *just* that one patch, so it might be
> > > possible to merge that branch and this series into the network tree and
> > > note that during the pull request to Linus.
> >
> > I initially thought it had to go through the tip tree because of the
> > dependency, but as you say it's just one patch.
> >
> > I was also asked by Jakub on v4 to wait for Eric's Ack/Review, so once I
> > get a go-ahead from him, I will send all 6 patches for inclusion into the
> > -net tree, if that's in time before the merge window.
>
> Looks like we're all set on the networking side (thanks Eric!!)

Thanks!

> Should I pull Peter's branch? Or do you want to just resend the patch Peter
> already queued? A bit of an unusual situation..

Either way would work for me.
I can send it in a couple of hours if you prefer instead of pulling the branch.

Thank you,
Dmitry

2022-12-02 00:53:36

by Dmitry Safonov

Subject: Re: [PATCH v6 1/5] jump_label: Prevent key->enabled int overflow

On Thu, 1 Dec 2022 at 23:36, Jakub Kicinski <[email protected]> wrote:
>
> On Thu, 1 Dec 2022 23:17:11 +0000 Dmitry Safonov wrote:
> > On Thu, 1 Dec 2022 at 22:31, Jakub Kicinski <[email protected]> wrote:
> > > > I initially thought it had to go through the tip tree because of the
> > > > dependency, but as you say it's just one patch.
> > > >
> > > > I was also asked by Jakub on v4 to wait for Eric's Ack/Review, so once I
> > > > get a go-ahead from him, I will send all 6 patches for inclusion into the
> > > > -net tree, if that's in time before the merge window.
> > >
> > > Looks like we're all set on the networking side (thanks Eric!!)
> >
> > Thanks!
> >
> > > Should I pull Peter's branch? Or do you want to just resend the patch Peter
> > > already queued? A bit of an unusual situation..
> >
> > Either way would work for me.
> > I can send it in a couple of hours if you prefer instead of pulling the branch.
>
> I prefer to pull, seems safer in case Peter does get another patch.
>
> It's this one, right?

It is the correct one, yes.

> https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git/log/?h=locking/core
>
> I've pulled; I'll push out once the build is done.

Thank you once again,
Dmitry

2022-12-02 04:25:50

by patchwork-bot+netdevbpf

Subject: Re: [PATCH v6 0/5] net/tcp: Dynamically disable TCP-MD5 static key

Hello:

This series was applied to netdev/net-next.git (master)
by Jakub Kicinski <[email protected]>:

On Wed, 23 Nov 2022 17:38:54 +0000 you wrote:
> Changes from v5:
> - Corrected comment for static_key_fast_inc_not_negative() (Peter)
> - Renamed static_key_fast_inc_not_negative() =>
> static_key_fast_inc_not_disabled() (as suggested by Peter)
> - static_key_fast_inc_not_disabled() is exported and declared in the
> patch 1 that defines it, rather than in patch 3 that uses it (Peter)
>
> [...]

Here is the summary with links:
- [v6,1/5] jump_label: Prevent key->enabled int overflow
https://git.kernel.org/netdev/net-next/c/eb8c507296f6
- [v6,2/5] net/tcp: Separate tcp_md5sig_info allocation into tcp_md5sig_info_add()
https://git.kernel.org/netdev/net-next/c/f62c7517ffa1
- [v6,3/5] net/tcp: Disable TCP-MD5 static key on tcp_md5sig_info destruction
https://git.kernel.org/netdev/net-next/c/459837b522f7
- [v6,4/5] net/tcp: Do cleanup on tcp_md5_key_copy() failure
https://git.kernel.org/netdev/net-next/c/b389d1affc2c
- [v6,5/5] net/tcp: Separate initialization of twsk
https://git.kernel.org/netdev/net-next/c/c5b8b515a211

You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html


2022-12-02 05:29:55

by Eric Dumazet

Subject: Re: [PATCH v6 3/5] net/tcp: Disable TCP-MD5 static key on tcp_md5sig_info destruction

On Thu, Dec 1, 2022 at 8:38 PM Eric Dumazet <[email protected]> wrote:
>
> On Wed, Nov 23, 2022 at 6:39 PM Dmitry Safonov <[email protected]> wrote:
> >
> > To do that, separate two scenarios:
> > - where it's the first MD5 key on the system, which means that enabling
> > of the static key may need to sleep;
> > - copying of an existing key from a listening socket to the request
> > socket upon receiving a signed TCP segment, where static key was
> > already enabled (when the key was added to the listening socket).
> >
> > Now the lifetime of the static branch for TCP-MD5 is until:
> > - the last tcp_md5sig_info is destroyed
> > - the last socket in time-wait state with an MD5 key is closed.
> >
> > This means that after all sockets with TCP-MD5 keys are gone, the
> > system gets back the performance of a disabled md5-key static branch.
> >
> > While at it, provide a static_key_fast_inc() helper that does the ref
> > counter increment in atomic fashion (without grabbing cpus_read_lock()
> > on CONFIG_JUMP_LABEL=y). This is needed to add a new user for
> > a static_key when the caller controls the lifetime of another user.
> >
> > Signed-off-by: Dmitry Safonov <[email protected]>
> > Acked-by: Jakub Kicinski <[email protected]>
>
> Reviewed-by: Eric Dumazet <[email protected]>

Hmm, I missed two kfree_rcu(key) calls, I will send the following fix:

diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 7fae586405cfb10011a0674289280bf400dfa8d8..8320d0ecb13ae1e3e259f3c13a4c2797fbd984a5 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -1245,7 +1245,7 @@ int tcp_md5_do_add(struct sock *sk, const union tcp_md5_addr *addr,

 		md5sig = rcu_dereference_protected(tp->md5sig_info, lockdep_sock_is_held(sk));
 		rcu_assign_pointer(tp->md5sig_info, NULL);
-		kfree_rcu(md5sig);
+		kfree_rcu(md5sig, rcu);
 		return -EUSERS;
 	}
 }
@@ -1271,7 +1271,7 @@ int tcp_md5_key_copy(struct sock *sk, const union tcp_md5_addr *addr,
 		md5sig = rcu_dereference_protected(tp->md5sig_info, lockdep_sock_is_held(sk));
 		net_warn_ratelimited("Too many TCP-MD5 keys in the system\n");
 		rcu_assign_pointer(tp->md5sig_info, NULL);
-		kfree_rcu(md5sig);
+		kfree_rcu(md5sig, rcu);
 		return -EUSERS;
 	}
 }
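As an aside on this one-line change: the two-argument kfree_rcu(ptr, field) frees the containing object after a grace period by locating the rcu_head embedded at the named field. Below is a rough userspace model of that offset arithmetic, with hypothetical names and no real RCU involved; container_of() from the kernel is approximated here with offsetof():

```c
#include <stddef.h>
#include <stdlib.h>

/* Userspace stand-ins for the kernel types in the diff above. */
struct rcu_head { void *next; };

struct md5sig_info {
	int nkeys;
	struct rcu_head rcu;	/* field named in kfree_rcu(md5sig, rcu) */
};

/* Records which object the callback recovered, for demonstration only. */
static void *last_freed;

/*
 * Model of the callback scheduled by the two-argument kfree_rcu():
 * step back from the embedded rcu_head to the containing object,
 * then free it (the kernel proper uses container_of() for this).
 */
static void free_md5sig_rcu(struct rcu_head *head)
{
	struct md5sig_info *info = (struct md5sig_info *)
		((char *)head - offsetof(struct md5sig_info, rcu));

	last_freed = info;
	free(info);
}
```

The kernel's real machinery additionally defers the callback past an RCU grace period; this sketch only shows how the named field lets it find the enclosing allocation.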

2022-12-02 05:58:38

by Eric Dumazet

Subject: Re: [PATCH v6 3/5] net/tcp: Disable TCP-MD5 static key on tcp_md5sig_info destruction

On Fri, Dec 2, 2022 at 6:05 AM Eric Dumazet <[email protected]> wrote:
>
> Hmm, I missed two kfree_rcu(key) calls, I will send the following fix:
>

https://patchwork.kernel.org/project/netdevbpf/patch/[email protected]/